On efficiently mining high utility sequential patterns

doi:10.1007/s10115-015-0914-8

Full metadata record

DC Field	Value	Language
dc.contributor.author	Wang, Jun-Zhe	en_US
dc.contributor.author	Huang, Jiun-Long	en_US
dc.contributor.author	Chen, Yi-Cheng	en_US
dc.date.accessioned	2017-04-21T06:56:47Z	-
dc.date.available	2017-04-21T06:56:47Z	-
dc.date.issued	2016-11	en_US
dc.identifier.issn	0219-1377	en_US
dc.identifier.uri	http://dx.doi.org/10.1007/s10115-015-0914-8	en_US
dc.identifier.uri	http://hdl.handle.net/11536/132623	-
dc.description.abstract	High utility sequential pattern mining is an emerging topic in pattern mining that aims to identify sequences with high utilities (e.g., profits) but possibly low frequencies. Because this problem lacks the downward closure property, most existing algorithms first generate candidate sequences with high sequence-weighted utilities (SWUs), where the SWU is an upper bound on the utilities of a sequence and all its supersequences, and then calculate the actual utilities of these candidates. This produces a large number of candidates, since the SWU is usually much larger than the real utilities of a sequence and all its supersequences. In view of this, we propose two tight utility upper bounds, prefix extension utility and reduced sequence utility, as well as two companion pruning strategies, and devise the HUS-Span algorithm to identify high utility sequential patterns by employing these two pruning strategies. In addition, since setting a proper utility threshold is usually difficult for users, we also propose the TKHUS-Span algorithm to identify top-k high utility sequential patterns by using these two pruning strategies. Three search strategies, guided depth-first search (GDFS), best-first search (BFS) and a hybrid of BFS and GDFS, are also proposed to improve the efficiency of TKHUS-Span. Experimental results on real and synthetic datasets show that HUS-Span and TKHUS-Span with the BFS strategy generate fewer candidate sequences and thus outperform prior algorithms in terms of mining efficiency.	en_US
dc.language.iso	en_US	en_US
dc.subject	High utility sequential pattern	en_US
dc.subject	High utility sequential pattern mining	en_US
dc.subject	Top-k high utility sequential pattern	en_US
dc.subject	Utility mining	en_US
dc.title	On efficiently mining high utility sequential patterns	en_US
dc.identifier.doi	10.1007/s10115-015-0914-8	en_US
dc.identifier.journal	KNOWLEDGE AND INFORMATION SYSTEMS	en_US
dc.citation.volume	49	en_US
dc.citation.issue	2	en_US
dc.citation.spage	597	en_US
dc.citation.epage	627	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
dc.identifier.wosnumber	WOS:000385190100007	en_US
Appears in Collections:	Journal Articles
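
The abstract describes the SWU as an upper bound on the utility of a sequence and all its supersequences. The following is a minimal sketch of that relationship, not the HUS-Span algorithm itself. It assumes the formulation commonly used in the high utility sequential pattern literature: the utility of a pattern in one quantitative sequence is the maximum over its occurrences, and the SWU is the sum of the total utilities of every sequence that contains the pattern. To stay short, each sequence element holds a single item. The names `pattern_utility`, `utility`, `swu`, and the toy data `PROFIT` and `DB` are hypothetical and do not come from the paper. The two tighter bounds named in the abstract (prefix extension utility and reduced sequence utility) are not defined in this record and are not implemented here.

```python
from typing import Dict, List, Optional, Tuple

# Assumed simplified model: a quantitative sequence is an ordered list of
# (item, quantity) pairs; real datasets usually use itemsets per element.
QSequence = List[Tuple[str, int]]


def sequence_utility(qseq: QSequence, profit: Dict[str, int]) -> int:
    """Total utility of a whole sequence: sum of quantity * unit profit."""
    return sum(profit[item] * qty for item, qty in qseq)


def pattern_utility(pattern: List[str], qseq: QSequence,
                    profit: Dict[str, int]) -> Optional[int]:
    """Maximum utility over all occurrences of `pattern` as a subsequence
    of `qseq`, or None if the pattern does not occur (assumed definition)."""
    # best[j] = best utility of matching pattern[:j] using elements seen so far
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, qty in qseq:
        # Walk j downward so one element is never matched twice
        for j in range(len(pattern), 0, -1):
            prev = best[j - 1]
            if pattern[j - 1] == item and prev is not None:
                cand = prev + profit[item] * qty
                if best[j] is None or cand > best[j]:
                    best[j] = cand
    return best[len(pattern)]


def utility(pattern: List[str], db: List[QSequence],
            profit: Dict[str, int]) -> int:
    """Actual utility of the pattern over the whole database."""
    total = 0
    for qseq in db:
        u = pattern_utility(pattern, qseq, profit)
        if u is not None:
            total += u
    return total


def swu(pattern: List[str], db: List[QSequence],
        profit: Dict[str, int]) -> int:
    """Sequence-weighted utility: total utility of every sequence that
    contains the pattern (assumed definition)."""
    return sum(sequence_utility(qseq, profit) for qseq in db
               if pattern_utility(pattern, qseq, profit) is not None)


# Hypothetical toy data for illustration only
PROFIT = {"a": 2, "b": 5, "c": 1}
DB = [
    [("a", 1), ("b", 2), ("a", 3), ("c", 1)],
    [("b", 1), ("a", 2)],
    [("a", 2), ("c", 4)],
]

if __name__ == "__main__":
    p = ["a", "c"]
    print(utility(p, DB, PROFIT), swu(p, DB, PROFIT))
```

On this toy data the SWU of the pattern `["a", "c"]` is noticeably larger than its actual utility, which illustrates why an SWU threshold alone tends to admit many candidates that later fail the real utility check.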