Co-cluster with dynamic weighting (基於動態調整權重之 co-cluster)

Full metadata record

DC Field	Value	Language
dc.contributor.author	張智愷	en_US
dc.contributor.author	Chang, Chih-Kai	en_US
dc.contributor.author	李嘉晃	en_US
dc.contributor.author	Lee, Chia-Hoang	en_US
dc.date.accessioned	2014-12-12T01:52:18Z	-
dc.date.available	2014-12-12T01:52:18Z	-
dc.date.issued	2010	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT079855595	en_US
dc.identifier.uri	http://hdl.handle.net/11536/48330	-
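dc.description.abstract	Advances in technology and the growth of the Internet have caused the volume of information to climb rapidly, yet this progress also forces users to spend more time browsing for the documents they need. With search engines now in widespread use, people want to obtain information more efficiently and effectively, and clustering techniques play an important role in this. If documents are properly clustered in advance, a search system can present more structured results, which both reduces the time spent searching and helps users find the documents they want faster. This study takes the Co-Clustering method as its basis and improves on it, discussing how to improve clustering performance and how to increase or decrease feature weights. Using the Reuters, 20newsgroup, and classic3 data sets, it extracts core keywords and assigns them appropriate weights, thereby filtering out unnecessary noise and strengthening the keywords. Using the coordinate information, keyword weights are adjusted based on the distance of each core keyword from its cluster center. The logistic function is then used to rescale the keyword weights to values between 0 and 1. Once the keywords have been assigned the adjusted weights, Co-Clustering is run again, and these steps are repeated until convergence, yielding better clustering results.	zh_TW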
dc.description.abstract	This paper proposes a weighted co-clustering algorithm and applies it to the document clustering problem. Weighted co-clustering is an extension of co-clustering that uses two properties of co-clustering to design a dynamic weighting algorithm for terms. First, co-clustering represents both documents and words in the same coordinate system using a spectral embedding technique. Second, co-clustering clusters documents and words simultaneously, so documents within the same cluster should be grouped together with their corresponding words. Based on these two properties, weighted co-clustering changes term weights iteratively. In addition, this paper proposes an outlier detection mechanism that excludes outlier documents from the clustering process. When the clustering process is complete, these outlier documents are assigned to appropriate clusters. We conduct experiments on three data sets, and the experimental results show that weighted co-clustering can effectively improve clustering performance.	en_US
dc.language.iso	zh_TW	en_US
dc.subject	文件分群	zh_TW
dc.subject	文件分析	zh_TW
dc.subject	資料探勘	zh_TW
dc.subject	合作分群	zh_TW
dc.subject	Document Clustering	en_US
dc.subject	Text Analysis	en_US
dc.subject	Information Retrieval	en_US
dc.subject	Co-Clustering	en_US
dc.title	基於動態調整權重之 co-cluster	zh_TW
dc.title	Co-cluster with dynamic weighting	en_US
dc.type	Thesis	en_US
dc.contributor.department	資訊科學與工程研究所	zh_TW
Appears in Collections:	Thesis
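
The abstracts describe adjusting each term's weight according to its distance from its cluster center in the shared embedding, then using a logistic function to keep the weight between 0 and 1. The record does not give the exact formula, so the following is only a minimal sketch under stated assumptions: the function names, the `steepness` parameter, and the choice to center distances on their mean are all hypothetical and are not taken from the thesis.

```python
import math


def logistic(x):
    """Numerically stable logistic function; maps any real x into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def reweight_terms(term_coords, centers, labels, steepness=1.0):
    """Return one weight in (0, 1) per term.

    term_coords: list of coordinate tuples, one per term (embedding space).
    centers:     list of coordinate tuples, one per cluster.
    labels:      cluster index assigned to each term.
    steepness:   ASSUMED tuning parameter, not specified in the record.

    ASSUMPTION: terms closer than the mean distance to their own cluster
    center get weights above 0.5, and farther terms get weights below 0.5.
    """
    dists = [
        math.dist(coord, centers[label])
        for coord, label in zip(term_coords, labels)
    ]
    mean_dist = sum(dists) / len(dists)
    return [logistic(steepness * (mean_dist - d)) for d in dists]


if __name__ == "__main__":
    # Toy data: three terms in a 2-D embedding, two clusters.
    coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
    centers = [(0.0, 0.0), (5.0, 5.0)]
    labels = [0, 0, 1]
    print(reweight_terms(coords, centers, labels))
```

In an iterative scheme like the one the abstracts outline, weights of this kind would be applied to the term features before co-clustering is run again, with the loop stopping once the cluster assignments no longer change.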

Files in This Item:

559501.pdf
