Title: | Fast and memory efficient mining of high-utility itemsets from data streams: with and without negative item profits |
Authors: | Li, Hua-Fu; Huang, Hsin-Yun; Lee, Suh-Yin; Department of Computer Science |
Keywords: | Data mining;Data streams;Utility mining;High-utility itemsets;Utility itemset with positive item profits;Utility itemset with negative item profits |
Issue Date: | 1-Sep-2011 |
Abstract: | Mining utility itemsets from data streams is one of the most interesting research issues in data mining and knowledge discovery. In this paper, two efficient sliding window-based algorithms, MHUI-BIT (Mining High-Utility Itemsets based on BITvector) and MHUI-TID (Mining High-Utility Itemsets based on TIDlist), are proposed for mining high-utility itemsets from data streams. Based on the sliding window-based framework of the proposed approaches, two effective representations of item information, Bitvector and TIDlist, and a lexicographical tree-based summary data structure, LexTree-2HTU, are developed to improve the efficiency of discovering high-utility itemsets with positive profits from data streams. Experimental results show that the proposed algorithms outperform the existing approaches for discovering high-utility itemsets from data streams over sliding windows. In addition, we propose adapted versions of MHUI-BIT and MHUI-TID for mining utility itemsets with negative item profits. Experiments show that these variants are efficient for mining high-utility itemsets with negative item profits over transaction-sensitive sliding windows of data streams. |
URI: | http://dx.doi.org/10.1007/s10115-010-0330-z http://hdl.handle.net/11536/19881 |
ISSN: | 0219-1377 |
DOI: | 10.1007/s10115-010-0330-z |
Journal: | KNOWLEDGE AND INFORMATION SYSTEMS |
Volume: | 28 |
Issue: | 3 |
Start Page: | 495 |
End Page: | 522 |
Appears in Collections: | Articles |
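The abstract names a Bitvector representation of item information over a transaction-sensitive sliding window. The sketch below illustrates that general idea only; it is not the authors' MHUI-BIT algorithm, and it omits LexTree-2HTU and any pruning. The class name `BitvectorWindow`, its methods, and the sample profits and transactions are all assumptions made for illustration. Each item keeps an integer whose bit k is set when the item occurs in the k-th transaction of the window (bit 0 is the oldest), and sliding the window shifts every bitvector right by one.

```python
from collections import deque


class BitvectorWindow:
    """Hypothetical sliding window that tracks one bitvector per item."""

    def __init__(self, window_size, profits):
        self.window_size = window_size  # number of transactions kept
        self.profits = profits          # item -> unit profit (may be negative)
        self.transactions = deque()     # each transaction: item -> quantity
        self.bitvectors = {}            # item -> int, bit k = window position k

    def add(self, transaction):
        # When the window is full, drop the oldest transaction and shift
        # every bitvector right so that bit 0 is again the oldest position.
        if len(self.transactions) == self.window_size:
            self.transactions.popleft()
            shifted = {i: b >> 1 for i, b in self.bitvectors.items()}
            self.bitvectors = {i: b for i, b in shifted.items() if b}
        position = len(self.transactions)
        self.transactions.append(transaction)
        for item in transaction:
            self.bitvectors[item] = self.bitvectors.get(item, 0) | (1 << position)

    def utility(self, itemset):
        # AND the bitvectors to find transactions that contain every item.
        common = (1 << len(self.transactions)) - 1
        for item in itemset:
            common &= self.bitvectors.get(item, 0)
        total = 0
        position = 0
        while common:
            if common & 1:
                t = self.transactions[position]
                total += sum(t[i] * self.profits[i] for i in itemset)
            common >>= 1
            position += 1
        return total


# Illustrative data only (not taken from the paper).
window = BitvectorWindow(window_size=3, profits={"a": 3, "b": 2, "c": 1})
for t in ({"a": 1, "b": 2}, {"b": 1, "c": 4}, {"a": 2, "b": 1, "c": 1}):
    window.add(t)
utility_ab = window.utility({"a", "b"})
```

An itemset would then be reported as high-utility when its utility in the current window meets a user-chosen threshold. A TIDlist variant would store the matching window positions as a list per item instead of a bit pattern.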