An integration of WordNet and fuzzy association rule mining for multi-label document clustering

doi:10.1016/j.datak.2010.08.003

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chen, Chun-Ling	en_US
dc.contributor.author	Tseng, Frank S. C.	en_US
dc.contributor.author	Liang, Tyne	en_US
dc.date.accessioned	2014-12-08T15:47:47Z	-
dc.date.available	2014-12-08T15:47:47Z	-
dc.date.issued	2010-11-01	en_US
dc.identifier.issn	0169-023X	en_US
dc.identifier.uri	http://dx.doi.org/10.1016/j.datak.2010.08.003	en_US
dc.identifier.uri	http://hdl.handle.net/11536/31957	-
dc.description.abstract	With the rapid growth of text documents, document clustering has become one of the main techniques for organizing large numbers of documents into a small number of meaningful clusters. However, document clustering still faces several challenges, such as high dimensionality, scalability, accuracy, meaningful cluster labels, overlapping clusters, and extracting semantics from texts. To improve the quality of document clustering results, we propose an effective Fuzzy-based Multi-label Document Clustering (FMDC) approach that integrates fuzzy association rule mining with an existing ontology, WordNet, to alleviate these problems. In our approach, key terms are extracted from the document set, and the initial representation of all documents is further enriched with WordNet hypernyms in order to exploit the semantic relations between terms. Then, a fuzzy association rule mining algorithm for texts is employed to discover a set of highly related fuzzy frequent itemsets, whose key terms are regarded as the labels of the candidate clusters. Finally, each document is dispatched into more than one target cluster by referring to these candidate clusters, and highly similar target clusters are then merged. We conducted experiments to evaluate performance on the Classic, Re0, R8, and WebKB datasets. The experimental results show that our approach achieves higher accuracy than influential document clustering methods. Therefore, our approach not only provides more general and meaningful labels for documents, but also effectively generates overlapping clusters. (C) 2010 Elsevier B.V. All rights reserved.	en_US
dc.language.iso	en_US	en_US
dc.subject	Fuzzy association rule mining	en_US
dc.subject	Text mining	en_US
dc.subject	Document clustering	en_US
dc.subject	WordNet	en_US
dc.subject	Frequent itemsets	en_US
dc.title	An integration of WordNet and fuzzy association rule mining for multi-label document clustering	en_US
dc.type	Editorial Material	en_US
dc.identifier.doi	10.1016/j.datak.2010.08.003	en_US
dc.identifier.journal	DATA & KNOWLEDGE ENGINEERING	en_US
dc.citation.volume	69	en_US
dc.citation.issue	11	en_US
dc.citation.spage	1208	en_US
dc.citation.epage	1226	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
Appears in Collections:	Articles
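
The pipeline outlined in the abstract (hypernym enrichment, fuzzy frequent itemsets used as cluster labels, overlapping assignment) can be illustrated with a small sketch. This is not the authors' FMDC implementation: the hypernym map is a hand-written stand-in for WordNet, the linear membership function and the min/mean fuzzy support are assumptions chosen for illustration, all function names are hypothetical, and the final cluster-merging step is omitted.

```python
from itertools import combinations

# Toy hypernym map standing in for WordNet lookups (hypothetical data).
HYPERNYMS = {"apple": "fruit", "banana": "fruit",
             "car": "vehicle", "truck": "vehicle"}

def enrich(term_counts):
    """Add each term's hypernym to the document, carrying over its count."""
    enriched = dict(term_counts)
    for term, count in term_counts.items():
        parent = HYPERNYMS.get(term)
        if parent is not None:
            enriched[parent] = enriched.get(parent, 0) + count
    return enriched

def membership(count, saturation=3):
    """Assumed fuzzy degree in [0, 1]: linear in the count, capped at 1."""
    return min(count / saturation, 1.0)

def fuzzy_support(itemset, docs):
    """Mean over documents of the minimum membership among the itemset's terms."""
    total = 0.0
    for doc in docs:
        total += min(membership(doc.get(term, 0)) for term in itemset)
    return total / len(docs)

def frequent_itemsets(docs, min_support, max_size=2):
    """Brute-force search over itemsets up to max_size (fine for toy data only)."""
    vocab = sorted({term for doc in docs for term in doc})
    found = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(vocab, size):
            support = fuzzy_support(itemset, docs)
            if support >= min_support:
                found[itemset] = support
    return found

def assign(docs, labels):
    """Overlapping assignment: a document joins every cluster whose label terms it contains."""
    return {label: [i for i, doc in enumerate(docs)
                    if all(doc.get(term, 0) > 0 for term in label)]
            for label in labels}

# Term counts per document (made-up example data).
raw_docs = [{"apple": 2, "car": 1}, {"banana": 3},
            {"truck": 3, "apple": 1}, {"car": 2}]
docs = [enrich(doc) for doc in raw_docs]
labels = frequent_itemsets(docs, min_support=0.4)
clusters = assign(docs, labels)
for label, members in clusters.items():
    print(label, members)
```

On this toy input, only the hypernyms "fruit" and "vehicle" reach the support threshold, so they become the cluster labels, and documents 0 and 2 fall into both clusters. That mirrors the abstract's two claims in miniature: hypernyms yield more general labels, and assignment by label terms produces overlapping clusters.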

Files in This Item:

000283975800009.pdf
