Full metadata record

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Wang, Jun-Zhe | en_US |
| dc.contributor.author | Huang, Jiun-Long | en_US |
| dc.contributor.author | Chen, Yi-Cheng | en_US |
| dc.date.accessioned | 2017-04-21T06:56:47Z | - |
| dc.date.available | 2017-04-21T06:56:47Z | - |
| dc.date.issued | 2016-11 | en_US |
| dc.identifier.issn | 0219-1377 | en_US |
| dc.identifier.uri | http://dx.doi.org/10.1007/s10115-015-0914-8 | en_US |
| dc.identifier.uri | http://hdl.handle.net/11536/132623 | - |
| dc.description.abstract | High utility sequential pattern mining is an emerging topic in pattern mining, which refers to identifying sequences with high utilities (e.g., profits) but probably with low frequencies. Because this problem lacks the downward closure property, most existing algorithms first generate candidate sequences with high sequence-weighted utilities (SWUs), where the SWU is an upper bound of the utilities of a sequence and all its supersequences, and then calculate the actual utilities of these candidates. This causes a large number of candidates, since the SWU is usually much larger than the real utilities of a sequence and all its supersequences. In view of this, we propose two tight utility upper bounds, prefix extension utility and reduced sequence utility, as well as two companion pruning strategies, and devise the HUS-Span algorithm to identify high utility sequential patterns by employing these two pruning strategies. In addition, since setting a proper utility threshold is usually difficult for users, we also propose the TKHUS-Span algorithm to identify top-k high utility sequential patterns by using these two pruning strategies. Three searching strategies, guided depth-first search (GDFS), best-first search (BFS) and a hybrid search of BFS and GDFS, are also proposed to improve the efficiency of TKHUS-Span. Experimental results on some real and synthetic datasets show that HUS-Span and TKHUS-Span with strategy BFS are able to generate fewer candidate sequences and thus outperform prior algorithms in terms of mining efficiency. | en_US |
| dc.language.iso | en_US | en_US |
| dc.subject | High utility sequential pattern | en_US |
| dc.subject | High utility sequential pattern mining | en_US |
| dc.subject | Top-k high utility sequential pattern | en_US |
| dc.subject | Utility mining | en_US |
| dc.title | On efficiently mining high utility sequential patterns | en_US |
| dc.identifier.doi | 10.1007/s10115-015-0914-8 | en_US |
| dc.identifier.journal | KNOWLEDGE AND INFORMATION SYSTEMS | en_US |
| dc.citation.volume | 49 | en_US |
| dc.citation.issue | 2 | en_US |
| dc.citation.spage | 597 | en_US |
| dc.citation.epage | 627 | en_US |
| dc.contributor.department | 資訊工程學系 | zh_TW |
| dc.contributor.department | Department of Computer Science | en_US |
| dc.identifier.wosnumber | WOS:000385190100007 | en_US |

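The abstract's point that SWU is a loose upper bound can be illustrated with a toy example. The sketch below is not HUS-Span and does not implement prefix extension utility or reduced sequence utility, which the abstract does not define. It assumes a simplified data model in which every element of a sequence is a single item with a precomputed utility, and it uses the commonly cited definitions: the utility of a pattern in a sequence is the maximum over its occurrences, and the SWU of a pattern is the summed total utility of all sequences that contain it. The function names and the database are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed toy model: a sequence is a list of (item, utility) pairs.
Sequence = List[Tuple[str, int]]


def pattern_utility(seq: Sequence, pattern: List[str]) -> Optional[int]:
    """Maximum utility over all occurrences of `pattern` as a subsequence
    of `seq`, or None if `seq` does not contain `pattern`."""
    # best[j] = best utility of matching the first j pattern items so far.
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, util in seq:
        # Go backwards so one element is used at most once per match.
        for j in range(len(pattern), 0, -1):
            if item == pattern[j - 1] and best[j - 1] is not None:
                cand = best[j - 1] + util
                if best[j] is None or cand > best[j]:
                    best[j] = cand
    return best[len(pattern)]


def utility(db: List[Sequence], pattern: List[str]) -> int:
    """Actual utility: sum of per-sequence maximum utilities."""
    vals = (pattern_utility(s, pattern) for s in db)
    return sum(v for v in vals if v is not None)


def swu(db: List[Sequence], pattern: List[str]) -> int:
    """Sequence-weighted utility: total utility of every sequence
    that contains the pattern."""
    return sum(
        sum(u for _, u in s)
        for s in db
        if pattern_utility(s, pattern) is not None
    )


# Hypothetical database of three sequences.
db = [
    [("a", 2), ("b", 6), ("a", 5), ("c", 1)],
    [("b", 3), ("a", 4), ("c", 8)],
    [("a", 1), ("c", 2), ("b", 9)],
]

threshold = 20
for pat in (["a", "c"], ["a", "b"]):
    print(pat, "utility =", utility(db, pat), "SWU =", swu(db, pat))
```

With a threshold of 20, the pattern `a, b` passes an SWU-based filter (SWU 26) even though its actual utility is 18, so it would be generated as a candidate and only rejected after its utility is computed. This is the kind of wasted candidate that tighter upper bounds are meant to avoid.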
Appears in Collections: Articles