Title: Prototyping an integrated information gathering system on CORBA
Authors: Chang, YS
Liang, KC
Cheng, MC
Yuan, SM
資訊工程學系
Department of Computer Science
Keywords: information gathering and integration;search service;information retrieval;CORBA;XML
Issue Date: 1-Jul-2004
Abstract: The sheer volume of information on the Web, and the variety of sources from which it may be retrieved, make searching those sources a difficult task. Meta-search engines can usually search only Web pages or documents; other major sources such as databases, library corpora and the so-called Web databases are not covered. Given these restrictions, an effective retrieval technology for a much wider range of sources becomes increasingly important. In our previous work, we proposed Integrated Information Retrieval (IIR), which is based on the Common Object Request Broker Architecture (CORBA), to spare clients the trouble of complicated semantics when federating multiple sources. In this paper, we present an IIR-based prototype of an integrated information gathering system. It offers a unified interface for querying sources with heterogeneous interfaces or protocols and uses an SQL-compatible query language for heterogeneous backend targets. We use it to link two general search engines (Yahoo and AltaVista), a science paper explorer (IEEE), and two library corpus explorers. We also perform preliminary measurements to assess the potential of the system. The results show that the overhead spent on each source as the system queries it is within reason; that is, using IIR to construct an integrated gathering system incurs low overhead. (C) 2003 Elsevier Inc. All rights reserved.
URI: http://dx.doi.org/10.1016/S0164-1212(03)00086-4
http://hdl.handle.net/11536/26581
ISSN: 0164-1212
DOI: 10.1016/S0164-1212(03)00086-4
Journal: JOURNAL OF SYSTEMS AND SOFTWARE
Volume: 72
Issue: 2
Begin Page: 281
End Page: 294
Appears in Collections: Articles


Files in This Item:

  1. 000221574800014.pdf
