Title: Efficiently mining uncertain high-utility itemsets
Authors: Lin, Jerry Chun-Wei
Gan, Wensheng
Fournier-Viger, Philippe
Hong, Tzung-Pei
Tseng, Vincent S.
Department of Computer Science
Keywords: Large-scale dataset; Data mining; Uncertainty; High-utility itemset; Pruning strategies
Issue Date: 1-Jun-2017
Abstract: Data mining consists of deriving implicit, potentially meaningful and useful knowledge from databases, such as information about the most profitable items. High-utility itemset mining (HUIM) has thus emerged as an important research topic in data mining. However, most HUIM algorithms can only handle precise data, although big data collected in real-life applications through experimental measurements or noisy sensors is often uncertain. In this paper, an algorithm named Mining Uncertain High-Utility Itemsets (MUHUI) is proposed to efficiently discover potential high-utility itemsets (PHUIs) in uncertain data. Based on the probability-utility-list (PU-list) structure, the MUHUI algorithm mines PHUIs directly without generating candidates, and it avoids constructing PU-lists for numerous unpromising itemsets by applying several efficient pruning strategies, which greatly improves its performance. Extensive experiments conducted on both real-life and synthetic datasets show that the proposed algorithm significantly outperforms the state-of-the-art PHUI-List algorithm in terms of efficiency and scalability, and that it scales well when mining PHUIs in large-scale uncertain datasets.
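To make the mining task in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the MUHUI algorithm and does not build PU-lists. It rests on assumptions the record does not spell out: each transaction carries a single existence probability, an itemset's utility is the sum of quantity times unit profit over the transactions that contain it, and an itemset counts as a PHUI when its total utility and its summed transaction probability both reach absolute thresholds. The data, the function names, and the thresholds are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: (existence probability, {item: quantity}).
TRANSACTIONS = [
    (0.9, {"a": 2, "b": 1}),
    (0.6, {"a": 1, "c": 3}),
    (0.8, {"a": 1, "b": 2, "c": 1}),
]
# Hypothetical unit profit of each item.
UNIT_PROFIT = {"a": 5, "b": 3, "c": 1}


def utility_and_probability(itemset, transactions, unit_profit):
    """Sum utility and probability over transactions containing the itemset."""
    total_utility = 0
    total_probability = 0.0
    for probability, quantities in transactions:
        if all(item in quantities for item in itemset):
            total_utility += sum(
                quantities[item] * unit_profit[item] for item in itemset
            )
            total_probability += probability
    return total_utility, total_probability


def brute_force_phuis(transactions, unit_profit, min_utility, min_probability):
    """Enumerate every itemset and keep those meeting both thresholds.

    Assumed PHUI test (not taken from the record): utility >= min_utility
    and summed probability >= min_probability.
    """
    items = sorted(unit_profit)
    found = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility, probability = utility_and_probability(
                itemset, transactions, unit_profit
            )
            if utility >= min_utility and probability >= min_probability:
                found[itemset] = (utility, probability)
    return found


phuis = brute_force_phuis(
    TRANSACTIONS, UNIT_PROFIT, min_utility=12, min_probability=1.5
)
for itemset, (utility, probability) in phuis.items():
    print(itemset, utility, round(probability, 2))
```

On this toy data the itemset (a, b) has a higher utility than (a) alone, so utility cannot be used directly to discard supersets. Exhaustive enumeration like this grows exponentially with the number of items, which is the cost that list-based structures and pruning strategies are meant to avoid.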
URI: http://dx.doi.org/10.1007/s00500-016-2159-1
http://hdl.handle.net/11536/145533
ISSN: 1432-7643
DOI: 10.1007/s00500-016-2159-1
Journal: SOFT COMPUTING
Volume: 21
Start page: 2801
End page: 2820
Appears in Collections: Journal Articles