Title: | An Integration of Fuzzy Association Rules and WordNet for Document Clustering |
Authors: | Chen, Chun-Ling; Tseng, Frank S. C.; Liang, Tyne; Department of Computer Science |
Keywords: | Fuzzy association rule mining; Text mining; Document clustering; Frequent itemsets; WordNet |
Issue Date: | 2009 |
Abstract: | With the rapid growth of text documents, document clustering has become one of the main techniques for organizing large amounts of documents into a small number of meaningful clusters. However, several challenges remain for document clustering, such as high dimensionality, scalability, accuracy, meaningful cluster labels, and extracting semantics from texts. To improve the quality of document clustering results, we propose an effective Fuzzy Frequent Itemset-based Document Clustering (F²IDC) approach that combines fuzzy association rule mining with the background knowledge embedded in WordNet. A term hierarchy generated from WordNet is applied to discover fuzzy frequent itemsets as candidate cluster labels for grouping documents. We have conducted experiments to evaluate our approach on the Reuters-21578 dataset. The experimental results show that our proposed method outperforms FIHC, HFTC, and UPGMA in accuracy. |
URI: | http://hdl.handle.net/11536/13145 |
ISBN: | 978-3-642-01306-5 |
ISSN: | 0302-9743 |
Journal: | ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS |
Volume: | 5476 |
Start page: | 147 |
End page: | 159 |
Appears in Collections: | Conference Papers |