Title: | 中文結構化文件之語意索引 (Semantic Indexing for Chinese Structured Documents) |
Authors: | 曾志軒 (Chih-Hsuan Tseng); 柯皓仁 (Hao-Ren Ke); 楊維邦 (Wei-Pang Yang); Institute of Computer Science and Engineering (資訊科學與工程研究所) |
Keywords: | information retrieval (資訊擷取); index (索引); semantic indexing (語意索引); structured documents (結構化文件) |
Issue date: | 2000 |
Abstract: | In this thesis, we propose an information retrieval scheme that combines indexing for structured documents with semantic indexing. Indexing for structured documents indexes documents that carry embedded structural information; it exploits the properties of K-ary trees to store element data and provide fast element access. Semantic indexing uses semantic matrices to build the keywords of the documents into a "concept space" (or "knowledge space"). The concept space of keywords is a form of knowledge representation and can be described as a semantic network. Through the concept space and the semantic network, we hope to raise traditional information retrieval to the level of knowledge retrieval. We run keyword-retrieval experiments on both the structured-document indexing method alone and the method that integrates structured-document indexing with semantic indexing. The evaluation results show that, although the integrated scheme requires more processing time than traditional structured-document indexing, it finds more information relevant to user interests. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT890394023 http://hdl.handle.net/11536/66923 |
Appears in collections: | Thesis |
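
The abstract mentions building a concept space from semantic matrices but does not say how the matrices are computed. The following is only a minimal sketch of one common approach, assumed here for illustration: an asymmetric keyword-association matrix derived from document co-occurrence counts, then used to expand a query term with related keywords. The function names, the input format (one keyword list per document), and the 0.5 threshold are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_space(docs):
    """Build an asymmetric keyword-association matrix from co-occurrence.

    docs: list of keyword lists, one per document (assumed input format).
    Returns weights[a][b] = (docs containing both a and b) / (docs containing a).
    This weighting is an illustrative assumption, not the thesis's formula.
    """
    doc_freq = defaultdict(int)
    co_freq = defaultdict(lambda: defaultdict(int))
    for keywords in docs:
        unique = sorted(set(keywords))  # count each keyword once per document
        for k in unique:
            doc_freq[k] += 1
        for a, b in combinations(unique, 2):
            co_freq[a][b] += 1
            co_freq[b][a] += 1
    return {
        a: {b: n / doc_freq[a] for b, n in related.items()}
        for a, related in co_freq.items()
    }


def expand_query(term, weights, threshold=0.5):
    """Return the term plus related keywords whose weight meets the threshold."""
    related = weights.get(term, {})
    return [term] + sorted(k for k, w in related.items() if w >= threshold)


docs = [
    ["index", "retrieval", "semantic"],
    ["index", "retrieval"],
    ["semantic", "network"],
]
weights = build_concept_space(docs)
expanded = expand_query("index", weights)
```

In this toy collection, a query for "index" would also be matched against "retrieval" and "semantic". That is the kind of broader recall the abstract attributes to the integrated scheme, and the extra lookups are one plausible source of the additional processing time it reports.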