Title: Efficient Mining of Uncertain Data for High-Utility Itemsets
Authors: Lin, Jerry Chun-Wei
Gan, Wensheng
Fournier-Viger, Philippe
Hong, Tzung-Pei
Tseng, Vincent S.
Department of Computer Science
Keywords: Data mining; Uncertainty; High-utility itemset; PU-list; Pruning strategies
Issue Date: 2016
Abstract: High-utility itemset mining (HUIM) is emerging as an important research topic in data mining. Most HUIM algorithms can only handle precise data; however, uncertainty is inherent in big data collected from experimental measurements or noisy sensors in real-life applications. In this paper, an efficient algorithm named Mining Uncertain data for High-Utility Itemsets (MUHUI) is proposed to discover potential high-utility itemsets (PHUIs) from uncertain data. Based on the probability-utility-list (PU-list) structure, the MUHUI algorithm mines PHUIs directly, without candidate generation, and uses several efficient pruning strategies to avoid constructing PU-lists for numerous unpromising itemsets, thus greatly improving mining performance. Extensive experiments on both real-life and synthetic datasets show that the proposed algorithm significantly outperforms the state-of-the-art PHUI-List algorithm in terms of efficiency and scalability. In particular, the MUHUI algorithm scales well when mining PHUIs from large-scale uncertain datasets.
URI: http://dx.doi.org/10.1007/978-3-319-39937-9_2
http://hdl.handle.net/11536/135695
ISBN: 978-3-319-39937-9
978-3-319-39936-2
ISSN: 0302-9743
DOI: 10.1007/978-3-319-39937-9_2
Journal: WEB-AGE INFORMATION MANAGEMENT, PT I
Volume: 9658
Start Page: 17
End Page: 30
Appears in Collections: Conferences Paper