| Title: | DSM-TKP: Mining Top-K Path traversal patterns over Web click-streams |
| Authors: | Li, HF; Lee, SY; Shan, MK; Department of Computer Science |
| Issue Date: | 2005 |
| Abstract: | Online, single-pass mining of Web click-streams poses some interesting computational issues, such as the unbounded length of the streaming data, a possibly very fast arrival rate, and the restriction to just one scan over previously arrived click-sequences. In this paper, we propose a new single-pass algorithm, called DSM-TKP (Data Stream Mining for Top-K Path traversal patterns), for mining top-k path traversal patterns, where k is the desired number of path traversal patterns to be mined. An effective summary data structure called TKP-forest (Top-K Path forest) is used to maintain the essential information about the top-k path traversal patterns of the click-stream seen so far. Experimental studies show that the DSM-TKP algorithm has stable memory usage and makes only one pass over the streaming data. |
| URI: | http://hdl.handle.net/11536/18054 |
| ISBN: | 0-7695-2415-X |
| Journal: | 2005 IEEE/WIC/ACM International Conference on Web Intelligence, Proceedings |
| Start Page: | 326 |
| End Page: | 329 |
| Appears in Collections: | Conferences Paper |

