完整後設資料紀錄
DC 欄位語言
dc.contributor.author楊宗樺en_US
dc.contributor.authorYang, Zong-Huaen_US
dc.contributor.author黃俊龍en_US
dc.contributor.authorHuang, Jiun-Longen_US
dc.date.accessioned2014-12-12T01:52:03Z-
dc.date.available2014-12-12T01:52:03Z-
dc.date.issued2011en_US
dc.identifier.urihttp://140.113.39.130/cdrfb3/record/nctu/#GT079855562en_US
dc.identifier.urihttp://hdl.handle.net/11536/48296-
dc.description.abstract找尋高利潤的序列是近年來重要的議題之一,不同於傳統在資料庫內找尋高出現次數的序列,我們傾向尋找的序列可以獲取較高的利潤。這樣的序列可以幫助我們得知哪些序列可以幫助我們在市場交易上了解哪些序列是可以獲得較高的利潤而非重視交易量。找尋高利潤的序列被稱為高利潤序列探勘.近年來,大部分的研究都專注於在交易型資料庫上找尋高利潤組合。此篇論文,我們專注在序列型資料庫內找尋高利潤序列。 我們提了新的探勘框架UPrefixSpan。UPrefixSpan不像大部分的方法需要額外的資料庫掃描。使用UPrefixSpan找尋高利潤序列可以減少確認候選者序列是否為高利潤序列的時間。另外,我們也提了兩個策略使候選序列的數量減少,此處稱為DUPES和DUI策略。最後我們提出了CLexTree在記憶體內維護高利潤序列。在我們的實驗,DUEPS和DUI策略可以減少大量的候選序列,另外UPrefixSpan在執行時間上也有不錯的優勢。使用UPrefixSpan搭配DUEPS和DUI策略是目前我們所知最具擴展性的演算法在高利潤探勘上。zh_TW
dc.description.abstractThe discovery of high profitable pattern is the one of important issue recently. Instead of mining the pattern which occurs frequently in database, it can be found the pattern which has high profits. In the marketing analysis, the pattern which can be obtained great sale is more practical than the sale volume sometimes. The task of mining high profitable pattern is also called high utility pattern mining. In most recent studies, they focus on mining high utility pattern in transaction database. However, in this paper, we propose a novel mining framework UPrefixSpan to discover the high utility pattern in sequence database. Unlike the most of existing study, UPrefixSpan does not require additional database scanning to identify the candidate of high utility patterns. Moreover, we propose two discarding strategies, DUEPS and DUI strategies, to tight the estimated utility of mined patterns. Finally, we devise the novel tree structure CLexTree to maintain the candidates compactly in memory. In our experimental result, our discarding strategies make the mining algorithm can running effectively, and UPrefixSpan outperform the state-of-the-art algorithm substantially in term of running time and the number of potential high utility sequences. UPrefixSpan which mines with DUEPS and DUI strategies is the most scalable algorithm in our study.en_US
dc.language.isozh_TWen_US
dc.subject資料探勘zh_TW
dc.subject序列資料探勘zh_TW
dc.subjectUtility miningen_US
dc.subjectSequence pattern miningen_US
dc.title有效的高利潤序列探勘zh_TW
dc.titleMining High Utility Sequential Pattern Efficientlyen_US
dc.typeThesisen_US
dc.contributor.department資訊科學與工程研究所zh_TW
顯示於類別:畢業論文