Full metadata record
DC Field | Value | Language
dc.contributor.author | Shieh, WY | en_US
dc.contributor.author | Chen, TF | en_US
dc.contributor.author | Shann, JJJ | en_US
dc.contributor.author | Chung, CP | en_US
dc.date.accessioned | 2014-12-08T15:41:27Z | -
dc.date.available | 2014-12-08T15:41:27Z | -
dc.date.issued | 2003-01-01 | en_US
dc.identifier.issn | 0306-4573 | en_US
dc.identifier.uri | http://dx.doi.org/10.1016/S0306-4573(02)00020-1 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/28202 | -
dc.description.abstract | The inverted file is the most popular indexing mechanism for document search in an information retrieval system. Compressing an inverted file can greatly improve document search rate. Traditionally, the d-gap technique is used in the inverted file compression by replacing document identifiers with usually much smaller gap values. However, fluctuating gap values cannot be efficiently compressed by some well-known prefix-free codes. To smoothen and reduce the gap values, we propose a document-identifier reassignment algorithm. This reassignment is based on a similarity factor between documents. We generate a reassignment order for all documents according to the similarity to reassign closer identifiers to the documents having closer relationships. Simulation results show that the average gap values of sample inverted files can be reduced by 30%, and the compression rate of d-gapped inverted file with prefix-free codes can be improved by 15%. (C) 2002 Elsevier Science Ltd. All rights reserved. | en_US
dc.language.iso | en_US | en_US
dc.subject | information retrieval | en_US
dc.subject | inverted file | en_US
dc.subject | d-gap | en_US
dc.subject | document identifier reassignment | en_US
dc.subject | traveling salesman problem | en_US
dc.title | Inverted file compression through document identifier reassignment | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1016/S0306-4573(02)00020-1 | en_US
dc.identifier.journal | INFORMATION PROCESSING & MANAGEMENT | en_US
dc.citation.volume | 39 | en_US
dc.citation.issue | 1 | en_US
dc.citation.spage | 117 | en_US
dc.citation.epage | 131 | en_US
dc.contributor.department | 資訊工程學系 | zh_TW
dc.contributor.department | Department of Computer Science | en_US
dc.identifier.wosnumber | WOS:000180495500006 | -
dc.citation.woscount | 19 | -
Appears in Collections: Articles


Files in This Item:

  1. 000180495500006.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.
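
The abstract above describes d-gap coding and the effect of reassigning document identifiers. The following is a minimal Python sketch of that idea, not the authors' implementation. The function names, the sample posting list, and the identifier mapping are hypothetical. Elias gamma is used only as one example of a well-known prefix-free code, since the record does not say which codes the paper evaluates. The similarity-based ordering step is not implemented here; the mapping is simply assumed to be given.

```python
def to_dgaps(doc_ids):
    """Turn a sorted posting list of document identifiers into d-gaps."""
    gaps = []
    previous = 0
    for doc_id in doc_ids:
        gaps.append(doc_id - previous)
        previous = doc_id
    return gaps


def from_dgaps(gaps):
    """Recover the document identifiers by summing the gaps."""
    doc_ids = []
    total = 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


def gamma_bits(n):
    """Bit length of the Elias gamma code for a positive integer n."""
    return 2 * (n.bit_length() - 1) + 1


def reassign(doc_ids, new_id):
    """Apply a given identifier mapping and re-sort the posting list."""
    return sorted(new_id[d] for d in doc_ids)


# Hypothetical posting list for one term.
postings = [3, 8, 15, 21]
gaps_before = to_dgaps(postings)

# Hypothetical mapping that places these related documents side by side.
mapping = {3: 1, 8: 2, 15: 3, 21: 4}
gaps_after = to_dgaps(reassign(postings, mapping))

bits_before = sum(gamma_bits(g) for g in gaps_before)
bits_after = sum(gamma_bits(g) for g in gaps_after)
print(gaps_before, bits_before)
print(gaps_after, bits_after)
```

In this toy case the reassigned list has smaller and more uniform gaps, so it needs fewer bits under the gamma code. Real gains depend on the collection and on how the reassignment order is generated.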