Title: | Efficient Mining of Uncertain Data for High-Utility Itemsets |
Authors: | Lin, Jerry Chun-Wei; Gan, Wensheng; Fournier-Viger, Philippe; Hong, Tzung-Pei; Tseng, Vincent S. (Department of Computer Science) |
Keywords: | Data mining; Uncertainty; High-utility itemset; PU-list; Pruning strategies |
Issue Date: | 2016 |
Abstract: | High-utility itemset mining (HUIM) is emerging as an important research topic in data mining. Most HUIM algorithms can handle only precise data; however, uncertainty is embedded in big data collected from experimental measurements or noisy sensors in real-life applications. In this paper, an efficient algorithm, namely Mining Uncertain data for High-Utility Itemsets (MUHUI), is proposed to discover potential high-utility itemsets (PHUIs) from uncertain data. Based on the probability-utility-list (PU-list) structure, the MUHUI algorithm mines PHUIs directly, without candidate generation, and uses several pruning strategies to avoid constructing PU-lists for numerous unpromising itemsets, thus greatly improving mining performance. Extensive experiments on both real-life and synthetic datasets show that the proposed algorithm significantly outperforms the state-of-the-art PHUI-List algorithm in terms of efficiency and scalability. In particular, the MUHUI algorithm scales well when mining PHUIs from large-scale uncertain datasets. |
URI: | http://dx.doi.org/10.1007/978-3-319-39937-9_2 http://hdl.handle.net/11536/135695 |
ISBN: | 978-3-319-39937-9 978-3-319-39936-2 |
ISSN: | 0302-9743 |
DOI: | 10.1007/978-3-319-39937-9_2 |
Journal: | WEB-AGE INFORMATION MANAGEMENT, PT I |
Volume: | 9658 |
Start page: | 17 |
End page: | 30 |
Appears in Collections: | Conferences Paper |
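
Illustrative note: the abstract does not define a PHUI formally, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that each transaction carries an existence probability, that an itemset's utility is the sum of quantity times unit profit over the transactions containing it, and that its probability is the sum of those transactions' existence probabilities. An itemset is then treated as a PHUI when both values reach user-given thresholds. The toy database, the names `utility_and_probability` and `brute_force_phuis`, and the thresholds are all hypothetical. The sketch enumerates every itemset by brute force; it is not the MUHUI algorithm and does not use the PU-list structure or any pruning strategy.

```python
from itertools import combinations

# Hypothetical toy database: (existence probability, {item: purchased quantity}).
transactions = [
    (0.9, {"a": 2, "b": 1}),
    (0.6, {"a": 1, "c": 3}),
    (0.4, {"b": 2, "c": 1}),
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 3, "c": 1}


def utility_and_probability(itemset, db, profit):
    """Sum utility and existence probability over transactions containing itemset."""
    total_utility = 0
    total_probability = 0.0
    for prob, items in db:
        if all(item in items for item in itemset):
            total_utility += sum(items[item] * profit[item] for item in itemset)
            total_probability += prob
    return total_utility, total_probability


def brute_force_phuis(db, profit, min_utility, min_probability):
    """Enumerate all itemsets and keep those meeting both (assumed) thresholds."""
    all_items = sorted({item for _, items in db for item in items})
    result = {}
    for size in range(1, len(all_items) + 1):
        for itemset in combinations(all_items, size):
            u, p = utility_and_probability(itemset, db, profit)
            if u >= min_utility and p >= min_probability:
                result[itemset] = (u, p)
    return result


phuis = brute_force_phuis(transactions, unit_profit, min_utility=9, min_probability=0.9)
for itemset, (u, p) in phuis.items():
    print(itemset, u, round(p, 2))
```

The brute-force enumeration grows exponentially with the number of items, which is the cost that list-based structures and pruning strategies such as those named in the abstract are meant to reduce.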