Title: | Efficient mining of sequential patterns with time constraints by delimited pattern growth |
Authors: | Lin, MY; Lee, SY; Department of Computer Science |
Keywords: | data mining;pattern-growth;sequence mining;sequential patterns;time constraint |
Publication date: | 1-May-2005 |
Abstract: | An active research topic in data mining is the discovery of sequential patterns, which finds all frequent subsequences in a sequence database. The generalized sequential pattern (GSP) algorithm was proposed to mine sequential patterns with time constraints, such as time gaps and sliding time windows. Recent studies indicate that the pattern-growth methodology can speed up sequence mining. However, the capability to mine sequential patterns with time constraints was previously available only within the Apriori framework. Therefore, we propose the DELISP (delimited sequential pattern) approach to provide this capability within the pattern-growth methodology. DELISP reduces the size of projected databases through bounded and windowed projection techniques. Bounded projection keeps only time-gap valid subsequences, and windowed projection saves nonredundant subsequences satisfying the sliding time-window constraint. Furthermore, the delimited growth technique directly generates constraint-satisfying patterns and speeds up the pattern-growing process. Comprehensive experiments show that DELISP scales well and outperforms the well-known GSP algorithm in the discovery of sequential patterns with time constraints. |
URI: | http://dx.doi.org/10.1007/s10115-004-0182-5 http://hdl.handle.net/11536/13738 |
ISSN: | 0219-1377 |
DOI: | 10.1007/s10115-004-0182-5 |
Journal: | KNOWLEDGE AND INFORMATION SYSTEMS |
Volume: | 7 |
Issue: | 4 |
Start page: | 499 |
End page: | 514 |
Appears in collections: | Journal articles |
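
The time-gap constraint mentioned in the abstract can be illustrated with a small sketch. This is not the DELISP algorithm and not code from the paper; it only shows what it means for a pattern to occur in a timestamped sequence under gap constraints. Everything here is an assumption made for illustration: the function name `contains_with_gaps`, the representation of a data sequence as a time-sorted list of `(timestamp, itemset)` pairs, single-item pattern elements, and the constraint form `min_gap < t_i - t_(i-1) <= max_gap` between consecutive matched elements. Sliding time windows are left out.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical representation: a data sequence is a list of
# (timestamp, itemset) pairs sorted by ascending timestamp.
Event = Tuple[int, Set[str]]


def contains_with_gaps(
    sequence: List[Event],
    pattern: List[str],
    min_gap: int,
    max_gap: int,
) -> bool:
    """Return True if the items of `pattern` occur in order in `sequence`
    such that every pair of consecutive matched timestamps satisfies
    min_gap < (t_i - t_prev) <= max_gap (an assumed constraint form)."""

    def search(p_idx: int, start: int, prev_time: Optional[int]) -> bool:
        if p_idx == len(pattern):
            return True  # every pattern element has been matched
        for i in range(start, len(sequence)):
            t, items = sequence[i]
            if prev_time is not None:
                gap = t - prev_time
                if gap <= min_gap:
                    continue  # too close to the previous match
                if gap > max_gap:
                    break  # later events are only further away
            # Backtrack over all valid positions: a greedy earliest match
            # can miss occurrences once a maximum gap is imposed.
            if pattern[p_idx] in items and search(p_idx + 1, i + 1, t):
                return True
        return False

    return search(0, 0, None)


example = [(1, {"a"}), (3, {"b"}), (4, {"a"}), (10, {"b"}), (12, {"c"})]
found_loose = contains_with_gaps(example, ["a", "b", "c"], min_gap=0, max_gap=6)
found_tight = contains_with_gaps(example, ["a", "b", "c"], min_gap=0, max_gap=5)
```

In this example the only valid occurrence under `max_gap=6` uses the second `a` (time 4), then `b` at time 10 and `c` at time 12. Tightening `max_gap` to 5 rules it out, because the gap from 4 to 10 is 6.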