最長連續共同子序列與相關字搜尋演算法

Full metadata record

DC Field	Value	Language
dc.contributor.author	周冠宇	en_US
dc.contributor.author	Kuan-Yu Chou	en_US
dc.contributor.author	杜敏文	en_US
dc.contributor.author	Min-Wen Du	en_US
dc.date.accessioned	2014-12-12T02:22:58Z	-
dc.date.available	2014-12-12T02:22:58Z	-
dc.date.issued	1999	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#NT880394037	en_US
dc.identifier.uri	http://hdl.handle.net/11536/65532	-
dc.description.abstract	本論文提出一個相關字搜尋的問題，此問題與傳統近似字比對問題類似。傳統近似字比對問題的字距定義通常為『編輯距離』，因此常用於拼字或打字上錯誤的校正；而我們提出的相關字搜尋問題則是在一個很大的字集中尋找可能具有相關涵義的字。在此篇論文中，我們以兩個字的最長連續共同子序列之長度當作它們之間相關度的衡量標準。文中提出一個『連續子序列索引絡』的結構，藉由分析此結構的特性，我們設計找出最長連續共同子序列的方法。同時我們也提出一種求得連續子序列特徵的方法，並用此特徵做索引，提出一個有效率解決相關字搜尋問題的演算法。最後我們以實驗證明：時下的一般電腦能夠以此演算法即時地搜尋出我們所定義最相關的字。	zh_TW
dc.description.abstract	In this thesis, we formulated a Resemblance Words Finding (RWF) problem. It is very similar to the traditional Approximate String Matching (ASM) problem. In a traditional ASM problem, the similarity measure between two words is usually defined by the "editing distance", making it to be more suitable for spelling or typing error correction. The RWF problem, however, is to find words containing components that most resemble a component of a given word where the similarity measure is defined as the length of the Longest Contiguous Common Subsequences (LCCSs) measure between two words. A "Contiguous Substring Index Lattice" (CISL) structure is presented. By analyzing properties of the CISL structure, we have designed algorithms for finding LCCSs of two character strings. Based on the same CISL structure, we have also defined signatures of contiguous subsequences to construct an index scheme and developed efficient RWF algorithms. Our experimental results show that the final RWF algorithm developed in this thesis can find the words with LCCS in real-time by an ordinary computer.	en_US
dc.language.iso	en_US	en_US
dc.subject	最長連續共同子序列	zh_TW
dc.subject	演算法	zh_TW
dc.subject	相關字	zh_TW
dc.subject	最長共同子序列	zh_TW
dc.subject	字串比對	zh_TW
dc.subject	特徵	zh_TW
dc.subject	絡	zh_TW
dc.subject	longest contiguous common subsequence	en_US
dc.subject	algorithm	en_US
dc.subject	resemblance word	en_US
dc.subject	longest common subsequence	en_US
dc.subject	string matching	en_US
dc.subject	signature	en_US
dc.subject	lattice	en_US
dc.title	最長連續共同子序列與相關字搜尋演算法	zh_TW
dc.title	Longest Contiguous Common Subsequences and Resemblance Words Finding Algorithms	en_US
dc.type	Thesis	en_US
dc.contributor.department	資訊科學與工程研究所	zh_TW
Appears in Collections:	Thesis