Title: | Mining of Closed Frequent Itemsets and Sequential Patterns in Data Streams Using Bit-Vector Based Method |
Authors: | Chin-Chuan Ho (何錦泉); Suh-Yin Lee (李素瑛); Institute of Computer Science and Engineering |
Keywords: | data stream;sliding window;closed frequent itemset;sequential pattern |
Issue Date: | 2005 |
Abstract: | Mining meaningful patterns from data streams is an important data mining problem with broad applications, such as sensor networks and stock analysis. The constraints of the data stream environment make this mining task difficult. In the first part of this thesis, we propose New-Moment, an algorithm for mining closed frequent itemsets in data streams. New-Moment uses bit-vectors and a compact closed enumeration tree (a lexicographical tree) to substantially improve on the performance of the Moment algorithm. In the second part, we propose IncSPAM, an algorithm for mining sequential patterns in data streams under a new sliding window model. IncSPAM builds on the SPAM algorithm and uses a memory indexing technique to incrementally maintain the sequential patterns in the current sliding window. Experiments show that our approaches mine patterns in data streams efficiently. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009317507 http://hdl.handle.net/11536/78719 |
Appears in Collections: | Thesis |