Title: | Prototyping an integrated information gathering system on CORBA |
Authors: | Chang, YS; Liang, KC; Cheng, MC; Yuan, SM; Department of Computer Science |
Keywords: | information gathering and integration;search service;information retrieval;CORBA;XML |
Issue Date: | 1-Jul-2004 |
Abstract: | The sheer volume of information and variety of sources from which it may be retrieved on the Web make searching the sources a difficult task. Usually, meta-search engines can be used only to search Web pages or documents; other major sources such as databases, library corpuses and the so-called Web databases are not involved. Faced with these restrictions, an effective retrieval technology for a much wider range of sources becomes increasingly important. In our previous work, we proposed an Integrated Information Retrieval (IIR) service, which is based on the Common Object Request Broker Architecture (CORBA), to spare clients the trouble of complicated semantics when federating multiple sources. In this paper, we present an IIR-based prototype of an integrated information gathering system. It offers a unified interface for querying sources with heterogeneous interfaces or protocols and uses an SQL-compatible query language for heterogeneous backend targets. We use it to link two general search engines (Yahoo and AltaVista), a science paper explorer (IEEE), and two library corpus explorers. We also perform preliminary measurements to assess the potential of the system. The results show that the overhead spent on each source as the system queries it is within reason; that is, using IIR to construct an integrated gathering system incurs low overhead. © 2003 Elsevier Inc. All rights reserved. |
URI: | http://dx.doi.org/10.1016/S0164-1212(03)00086-4 http://hdl.handle.net/11536/26581 |
ISSN: | 0164-1212 |
DOI: | 10.1016/S0164-1212(03)00086-4 |
Journal: | JOURNAL OF SYSTEMS AND SOFTWARE |
Volume: | 72 |
Issue: | 2 |
Start page: | 281 |
End page: | 294 |
Appears in Collections: | Articles |