Weighted frequent itemset mining over uncertain databases

doi:10.1007/s10489-015-0703-9

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lin, Jerry Chun-Wei	en_US
dc.contributor.author	Gan, Wensheng	en_US
dc.contributor.author	Fournier-Viger, Philippe	en_US
dc.contributor.author	Hong, Tzung-Pei	en_US
dc.contributor.author	Tseng, Vincent S.	en_US
dc.date.accessioned	2017-04-21T06:55:47Z	-
dc.date.available	2017-04-21T06:55:47Z	-
dc.date.issued	2016-01	en_US
dc.identifier.issn	0924-669X	en_US
dc.identifier.uri	http://dx.doi.org/10.1007/s10489-015-0703-9	en_US
dc.identifier.uri	http://hdl.handle.net/11536/133425	-
dc.description.abstract	Frequent itemset mining (FIM) is a fundamental research topic that consists of discovering useful and meaningful relationships between items in transaction databases. However, FIM suffers from two important limitations. First, it assumes that all items have the same importance. Second, it ignores the fact that data collected in a real-life environment is often inaccurate, imprecise, or incomplete. To address these issues and mine more useful and meaningful knowledge, the problems of weighted itemset mining and uncertain itemset mining have been proposed separately: in the former, a user assigns weights to items to specify their relative importance; in the latter, a user specifies existential probabilities to represent uncertainty in transactions. However, no work has addressed both of these issues at the same time. In this paper, we address this important research problem by designing a new type of pattern named the high expected weighted itemset (HEWI) and the HEWI-Uapriori algorithm to efficiently discover HEWIs. HEWI-Uapriori finds HEWIs using an Apriori-like two-phase approach. The algorithm introduces a property named high upper-bound expected weighted downward closure (HUBEWDC) to prune the search space and unpromising itemsets early. Substantial experiments on real-life and synthetic datasets are conducted to evaluate the performance of the proposed algorithm in terms of runtime, memory consumption, and number of patterns found. Results show that the proposed algorithm has excellent performance and scalability compared with traditional methods for weighted itemset mining and uncertain itemset mining.	en_US
dc.language.iso	en_US	en_US
dc.subject	Data mining	en_US
dc.subject	Uncertain databases	en_US
dc.subject	Weighted frequent itemsets	en_US
dc.subject	Two-phase	en_US
dc.subject	Upper-bound	en_US
dc.title	Weighted frequent itemset mining over uncertain databases	en_US
dc.identifier.doi	10.1007/s10489-015-0703-9	en_US
dc.identifier.journal	APPLIED INTELLIGENCE	en_US
dc.citation.volume	44	en_US
dc.citation.issue	1	en_US
dc.citation.spage	232	en_US
dc.citation.epage	250	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
dc.identifier.wosnumber	WOS:000368149500014	en_US
Appears in Collections:	Articles