Efficient algorithms for mining high-utility itemsets in uncertain databases

doi:10.1016/j.knosys.2015.12.019

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lin, Jerry Chun-Wei	en_US
dc.contributor.author	Gan, Wensheng	en_US
dc.contributor.author	Fournier-Viger, Philippe	en_US
dc.contributor.author	Hong, Tzung-Pei	en_US
dc.contributor.author	Tseng, Vincent S.	en_US
dc.date.accessioned	2017-04-21T06:55:50Z	-
dc.date.available	2017-04-21T06:55:50Z	-
dc.date.issued	2016-03-15	en_US
dc.identifier.issn	0950-7051	en_US
dc.identifier.uri	http://dx.doi.org/10.1016/j.knosys.2015.12.019	en_US
dc.identifier.uri	http://hdl.handle.net/11536/133991	-
dc.description.abstract	High-utility itemset mining (HUIM) is a useful set of techniques for discovering patterns in transaction databases, which considers both quantity and profit of items. However, most algorithms for mining high-utility itemsets (HUIs) assume that the information stored in databases is precise, i.e., that there is no uncertainty. But in many real-life applications, an item or itemset is not only present or absent in transactions but is also associated with an existence probability. This is especially the case for data collected experimentally or using noisy sensors. In the past, many algorithms have been proposed to effectively mine frequent itemsets in uncertain databases. But mining HUIs in an uncertain database has not yet been addressed, although uncertainty is commonly seen in real-world applications. In this paper, a novel framework, named potential high-utility itemset mining (PHUIM) in uncertain databases, is proposed to efficiently discover not only the itemsets with high utilities but also the itemsets with high existence probabilities in an uncertain database based on the tuple uncertainty model. The PHUI-UP algorithm (potential high-utility itemsets upper-bound-based mining algorithm) is first presented to mine potential high-utility itemsets (PHUIs) using a level-wise search. Since PHUI-UP adopts a generate-and-test approach to mine PHUIs, it suffers from the problem of repeatedly scanning the database. To address this issue, a second algorithm named PHUI-List (potential high-utility itemsets PU-list-based mining algorithm) is also proposed. The latter directly mines PHUIs without generating candidates, thanks to a novel probability-utility-list (PU-list) structure, thus greatly improving the scalability of PHUI mining. Substantial experiments were conducted on both real-life and synthetic datasets to assess the performance of the two designed algorithms in terms of runtime, number of patterns, memory consumption, and scalability. © 2015 Elsevier B.V. All rights reserved.	en_US
dc.language.iso	en_US	en_US
dc.subject	High-utility itemset	en_US
dc.subject	Uncertain database	en_US
dc.subject	Probabilistic-based	en_US
dc.subject	Upper-bound	en_US
dc.subject	PU-list structure	en_US
dc.title	Efficient algorithms for mining high-utility itemsets in uncertain databases	en_US
dc.identifier.doi	10.1016/j.knosys.2015.12.019	en_US
dc.identifier.journal	KNOWLEDGE-BASED SYSTEMS	en_US
dc.citation.volume	96	en_US
dc.citation.spage	171	en_US
dc.citation.epage	187	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
dc.identifier.wosnumber	WOS:000370907200014	en_US
Appears in Collections:	Journal Articles