Mining fuzzy frequent itemsets for hierarchical document clustering

doi:10.1016/j.ipm.2009.09.009

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chen, Chun-Ling	en_US
dc.contributor.author	Tseng, Frank S. C.	en_US
dc.contributor.author	Liang, Tyne	en_US
dc.date.accessioned	2014-12-08T15:07:19Z	-
dc.date.available	2014-12-08T15:07:19Z	-
dc.date.issued	2010-03-01	en_US
dc.identifier.issn	0306-4573	en_US
dc.identifier.uri	http://dx.doi.org/10.1016/j.ipm.2009.09.009	en_US
dc.identifier.uri	http://hdl.handle.net/11536/5765	-
dc.description.abstract	As the number of text documents on the Internet grows explosively, hierarchical document clustering has proven useful for grouping similar documents in a wide range of applications. However, most document clustering methods still struggle with high dimensionality, scalability, accuracy, and meaningful cluster labels. In this paper, we present an effective Fuzzy Frequent Itemset-Based Hierarchical Clustering (F²IHC) approach, which uses a fuzzy association rule mining algorithm to improve the clustering accuracy of the Frequent Itemset-Based Hierarchical Clustering (FIHC) method. In our approach, the key terms are extracted from the document set, and each document is pre-processed into the designated representation for the subsequent mining process. Then, a fuzzy association rule mining algorithm for text is employed to discover a set of highly related fuzzy frequent itemsets, which contain key terms to be regarded as the labels of the candidate clusters. Finally, the documents are clustered into a hierarchical cluster tree by referring to these candidate clusters. We have conducted experiments to evaluate the performance on the Classic4, Hitech, Re0, Reuters, and Wap datasets. The experimental results show that our approach not only retains the merits of FIHC, but also improves its clustering accuracy. Crown Copyright (C) 2009 Published by Elsevier Ltd. All rights reserved.	en_US
dc.language.iso	en_US	en_US
dc.subject	Fuzzy association rule mining	en_US
dc.subject	Text mining	en_US
dc.subject	Hierarchical document clustering	en_US
dc.subject	Frequent itemsets	en_US
dc.title	Mining fuzzy frequent itemsets for hierarchical document clustering	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1016/j.ipm.2009.09.009	en_US
dc.identifier.journal	INFORMATION PROCESSING & MANAGEMENT	en_US
dc.citation.volume	46	en_US
dc.citation.issue	2	en_US
dc.citation.spage	193	en_US
dc.citation.epage	211	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
dc.identifier.wosnumber	WOS:000275611200007	-
dc.citation.woscount	8	-
Appears in Collections:	Articles

Files in This Item:

000275611200007.pdf
