Mining Repeating Pattern with Gap Constraint (容許間距之近似重覆樣式探勘)

Full metadata record

DC Field	Value	Language
dc.contributor.author	邱欣怡	en_US
dc.contributor.author	Shin-Yi Chiu	en_US
dc.contributor.author	黃俊龍	en_US
dc.contributor.author	陳俊穎	en_US
dc.contributor.author	Huang, Jiun-Long	en_US
dc.contributor.author	Chen, Jing-Ying	en_US
dc.date.accessioned	2014-12-12T01:19:03Z	-
dc.date.available	2014-12-12T01:19:03Z	-
dc.date.issued	2008	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT009555535	en_US
dc.identifier.uri	http://hdl.handle.net/11536/39486	-
dc.description.abstract	以往對於repeating pattern mining的研究主要著重於從一個由音樂轉成較長的字串中找出經常重覆出現的子字串。舉例來說，A公司和B公司股價上漲，則C公司股價則會在4天之後上漲。然而，鄧教授所提出的問題給予太多的限制在從一長串set中找出repeating pattern，這使得許多潛在的frequent patterns會因為這個限制導致他們的support分散進而無法被找出。因此，在我們的論文中定義了一個新的pattern，它允許二個相鄰set之間有gap的存在，此外我們也提出了一個演算法，G-Apriori，找出允許gap的pattern。G-Apriori演算法產生candidates且透過掃描database來計算candidates的support。然而為了要避免掃描database太多次，GwI-Apriori被提出來解決這個問題。在GwI-Apriori中，我們設計了一個index list，它包含一個開始位置跟一串的結尾位且利用它來紀錄frequent pattern的所在位置。透過這些index lists，GwI-Apriori只需要掃瞄database一次且利用它們來進行較長pattern的support的計算。此外，在GwI-Apriori中我們也設計了pruning策略來加速support的計算。實驗的資料是以實際的資料評估，且實驗的結果顯示GwI-Apriori優於G-Apriori。	zh_TW
dc.description.abstract	Previous studies on mining repeating patterns focus on discovering substrings that appear frequently in a long string converted from music. An example of such a repeating pattern is "if the stock prices of companies A and B both go up on day one, the stock price of company C will go up exactly on the fifth day." However, the problem formulated by Tung places too many restrictions on mining repeating patterns from a set sequence, so many potentially frequent patterns cannot be found because their support is scattered. Hence, in this thesis we define a new pattern that allows a gap between two adjacent sets, and we propose an algorithm, G-Apriori, to discover repeating patterns with a gap constraint from a set sequence. G-Apriori generates candidates and counts the frequency of these candidates by scanning the database. To avoid scanning the database so many times, a second algorithm, GwI-Apriori, is proposed. GwI-Apriori uses an index list, which contains a start position (SP) and a list of end positions (EP), to record the positions of the frequent patterns. GwI-Apriori also applies an additional strategy for pruning the search space among the index lists. By using the index lists, GwI-Apriori scans the database only once and computes the frequency of patterns from the index lists. The experimental results show that GwI-Apriori performs much better than G-Apriori.	en_US
dc.language.iso	en_US	en_US
dc.subject	重覆樣式	zh_TW
dc.subject	間距	zh_TW
dc.subject	Repeating pattern	en_US
dc.subject	Gap constraint	en_US
dc.title	容許間距之近似重覆樣式探勘	zh_TW
dc.title	Mining Repeating Pattern with Gap Constraint	en_US
dc.type	Thesis	en_US
dc.contributor.department	資訊科學與工程研究所	zh_TW
Appears in Collections:	Thesis
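
The index-list idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the names `single_index`, `extend`, `support`, and `max_gap` are hypothetical, the gap is assumed to be the number of sets skipped between two adjacent matched sets, and support is assumed to be the number of distinct start positions. Candidate generation and the pruning strategy are omitted.

```python
from typing import Dict, FrozenSet, List, Sequence

# Hypothetical layout: start position -> sorted list of end positions.
IndexList = Dict[int, List[int]]


def single_index(seq: Sequence[FrozenSet[str]], itemset: FrozenSet[str]) -> IndexList:
    """One pass over the set sequence: positions whose set contains `itemset`."""
    return {i: [i] for i, s in enumerate(seq) if itemset <= s}


def extend(prefix: IndexList, last: IndexList, max_gap: int) -> IndexList:
    """Append one itemset to a pattern using index lists only (no rescan).

    A position q of the appended itemset is a valid new end position if it
    follows some existing end position ep with at most `max_gap` sets skipped.
    """
    last_positions = sorted(last)
    out: IndexList = {}
    for sp, eps in prefix.items():
        ends = [q for q in last_positions
                if any(0 < q - ep <= max_gap + 1 for ep in eps)]
        if ends:
            out[sp] = ends
    return out


def support(index: IndexList) -> int:
    """Assumed support measure: number of distinct start positions."""
    return len(index)


# Toy set sequence; each string is shorthand for a set of single-letter items.
seq = [frozenset(s) for s in ("AB", "D", "C", "AB", "C", "E", "AB", "D", "D", "C")]
ab = single_index(seq, frozenset("AB"))
c = single_index(seq, frozenset("C"))
for g in (0, 1, 2):
    print(g, support(extend(ab, c, g)))
```

On this toy sequence the pattern ({A, B}, {C}) has support 1, 2, and 3 for `max_gap` values 0, 1, and 2, which shows how a strict gap requirement can scatter the support of a pattern that a looser one would find.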

Files in This Item:

553501.pdf
