Full metadata record
DC Field | Value | Language
dc.contributor.author | Suri, M. | en_US
dc.contributor.author | Rini, S. | en_US
dc.date.accessioned | 2019-09-02T07:45:42Z | -
dc.date.available | 2019-09-02T07:45:42Z | -
dc.date.issued | 2019-01-01 | en_US
dc.identifier.isbn | 978-1-7281-0584-0 | en_US
dc.identifier.issn | 2374-3212 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/152568 | -
dc.description.abstract | In the Dictionary-based String Matching (DSM) problem, an Information Retrieval (IR) system has access to a source sequence and stores the positions of a certain number of strings in a posting table. When a user queries the position of a string, the IR system, instead of searching the source sequence directly, relies on the posting table to answer the query more efficiently. In this paper, the Statistical DSM problem is proposed as a statistical and information-theoretic formulation of the classic DSM problem in which both the source and the query have a statistical description, while the strings stored in the posting sequence are described as a code. Through this formulation, we define the communication efficiency of the IR system as the average cost of retrieving the entries of the posting list from the posting table, in the limit of an infinitely long source sequence. This formulation is used to study the communication efficiency for the case in which the dictionary is composed of (i) all the strings of a given length, referred to as k-grams, and (ii) run-length codes. | en_US
dc.language.iso | en_US | en_US
dc.subject | Dictionary-based string matching | en_US
dc.subject | Content based retrieval | en_US
dc.subject | Indexing database | en_US
dc.subject | Information retrieval | en_US
dc.subject | Phrase searching | en_US
dc.title | THE STATISTICAL DICTIONARY-BASED STRING MATCHING PROBLEM | en_US
dc.type | Proceedings Paper | en_US
dc.identifier.journal | IRAN WORKSHOP ON COMMUNICATION AND INFORMATION THEORY (IWCIT 2019) | en_US
dc.citation.spage | 0 | en_US
dc.citation.epage | 0 | en_US
dc.contributor.department | 電機工程學系 | zh_TW
dc.contributor.department | Department of Electrical and Computer Engineering | en_US
dc.identifier.wosnumber | WOS:000476947400006 | en_US
dc.citation.woscount | 0 | en_US
Appears in Collections: Conference Papers
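
The abstract describes a posting table that stores the positions of dictionary strings so that queries need not scan the source sequence. A minimal Python sketch of the k-gram case follows. It is illustrative only and is not taken from the paper: the function names `build_posting_table` and `find_positions` are hypothetical, and the sketch does not model the paper's statistical source or its retrieval cost.

```python
from collections import defaultdict


def build_posting_table(source, k):
    """Map every length-k substring (k-gram) of `source` to its start positions."""
    table = defaultdict(list)
    for i in range(len(source) - k + 1):
        table[source[i:i + k]].append(i)
    return dict(table)


def find_positions(table, k, query):
    """Locate `query` using only the posting table (hypothetical helper).

    A position p is a match exactly when, for every offset j, the k-gram
    query[j:j+k] occurs in the source at position p + j. So we intersect
    the posting lists of the query's k-grams, each shifted back by its offset.
    """
    if len(query) < k:
        raise ValueError("query must contain at least one k-gram")
    candidates = set(table.get(query[:k], []))
    for offset in range(1, len(query) - k + 1):
        postings = table.get(query[offset:offset + k], [])
        candidates &= {p - offset for p in postings}
        if not candidates:
            break  # no position survives; stop retrieving posting lists
    return sorted(candidates)


table = build_posting_table("abracadabra", k=2)
print(find_positions(table, 2, "abra"))
```

Each posting list fetched in `find_positions` corresponds to one retrieval from the posting table, which is the kind of cost the abstract's communication efficiency averages over; the sketch simply fetches the lists in query order without optimizing that cost.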