Title: | DSM-TKP: Mining Top-K Path traversal patterns over Web click-streams |
Authors: | Li, HF; Lee, SY; Shan, MK; Department of Computer Science |
Publication Date: | 2005 |
Abstract: | Online, single-pass mining of Web click streams poses some interesting computational issues, such as the unbounded length of streaming data, a possibly very fast arrival rate, and only one scan over previously arrived click-sequences. In this paper, we propose a new, single-pass algorithm, called DSM-TKP (Data Stream Mining for Top-K Path traversal patterns), for mining top-k path traversal patterns, where k is the desired number of path traversal patterns to be mined. An effective summary data structure called TKP-forest (Top-K Path forest) is used to maintain the essential information about the top-k path traversal patterns of the click-stream seen so far. Experimental studies show that the DSM-TKP algorithm maintains stable memory usage and makes only one pass over the streaming data. |
URI: | http://hdl.handle.net/11536/18054 |
ISBN: | 0-7695-2415-X |
Journal: | 2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings |
Start Page: | 326 |
End Page: | 329 |
Appears in Collections: | Conferences Paper |