Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | 蘇志明 | en_US |
dc.contributor.author | Chih-Ming Su | en_US |
dc.contributor.author | 曾憲雄 | en_US |
dc.contributor.author | Shian-Shyong Tseng | en_US |
dc.date.accessioned | 2014-12-12T02:20:27Z | - |
dc.date.available | 2014-12-12T02:20:27Z | - |
dc.date.issued | 1998 | en_US |
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#NT870394013 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/64151 | - |
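dc.description.abstract | In many application domains, finding or distinguishing outliers that differ substantially from ordinary clusters is an important and fundamental step. In traditional data analysis and artificial intelligence, such outliers usually have to be excluded or given lower weights so that they do not introduce large errors into the analysis. In recent years, data mining has received growing attention, and many papers have addressed how to find implicit, useful information in large amounts of data. However, data mining has so far focused on finding frequently occurring association sets or trends in transaction records. This thesis takes the opposite view and investigates how to find, within large amounts of data, outlying records that differ from the ordinary sets. For example, finding records in e-mail logs that fall outside the normal range of reasonable use, and then tracking and analyzing them from a network-management perspective, will provide important reference data in the future. In this thesis, we propose a two-phase clustering strategy. In the first phase, we improve the traditional k-means clustering algorithm by adding a "jump" heuristic so that, during the iterative clustering phase, outliers have a higher probability of being treated as independent clusters. In the second phase, we use the concept of the minimum spanning tree to re-cluster the results of the first phase. Finally, experiments on three types of data all yielded very good results. | zh_TW |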
dc.description.abstract | Identifying outliers and remainder clusters, which designate the few patterns that differ greatly from the other clusters, is a fundamental step in many application domains. However, current outlier diagnostics are often inadequate for large amounts of data. In this thesis, we propose a two-phase clustering algorithm for outliers. In Phase 1, we modify the k-means algorithm with the heuristic "if a new input pattern is far enough away from all cluster centers, then assign it as a new cluster center", so that the number of clusters found in this phase exceeds the number originally set in the k-means algorithm. In Phase 2, we propose a cluster-merging process that merges the clusters obtained in Phase 1 into the number of clusters originally set by the user. The results of three experiments show that outliers and remainder clusters can be easily identified by our method. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | 資料探勘 | zh_TW |
dc.subject | 分群法 | zh_TW |
dc.subject | 異常節點 | zh_TW |
dc.subject | Data Mining | en_US |
dc.subject | Clustering | en_US |
dc.subject | Outliers | en_US |
dc.title | 一個找尋異常群集的快速分群演算法 | zh_TW |
dc.title | A Fast Clustering Process for Outliers and Remainder Clusters | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | 資訊科學與工程研究所 (Institute of Computer Science and Engineering) | zh_TW |
Appears in Collections: | Thesis |
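
The two-phase procedure described in the abstracts can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record does not give the distance threshold, the seeding rule, or the exact merge criterion, so `threshold`, the choice of the first `k` points as seeds, and the merge rule (joining centers along the shortest minimum-spanning-tree edges until `k` groups remain) are all assumptions made here for illustration, as are the function names.

```python
import math
from itertools import combinations


def phase1_centers(points, k, threshold, max_iter=20):
    """Modified k-means: a point farther than `threshold` (assumed parameter)
    from every current center becomes a new cluster center."""
    centers = [tuple(p) for p in points[:k]]  # assumption: first k points seed the centers
    for _ in range(max_iter):
        members = [[] for _ in centers]
        for p in points:
            d, i = min((math.dist(p, c), i) for i, c in enumerate(centers))
            if d > threshold:
                centers.append(tuple(p))  # the "far enough away" heuristic
                members.append([p])
            else:
                members[i].append(p)
        # Recompute each center as the mean of its members; drop empty clusters.
        updated = [tuple(sum(col) / len(m) for col in zip(*m)) for m in members if m]
        if updated == centers:
            break
        centers = updated
    return centers


def phase2_merge(centers, k):
    """Merge centers into k groups by adding the shortest edges between
    centers (Kruskal's MST construction) until k components remain."""
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted(
        combinations(range(len(centers)), 2),
        key=lambda e: math.dist(centers[e[0]], centers[e[1]]),
    )
    groups = len(centers)
    for a, b in edges:
        if groups <= k:
            break
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra
            groups -= 1
    return [find(i) for i in range(len(centers))]


def two_phase(points, k, threshold):
    """Return one group label per point after both phases."""
    centers = phase1_centers(points, k, threshold)
    group_of_center = phase2_merge(centers, k)
    labels = []
    for p in points:
        _, i = min((math.dist(p, c), i) for i, c in enumerate(centers))
        labels.append(group_of_center[i])
    return labels


# Two dense groups plus one distant point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = two_phase(points, k=3, threshold=5.0)
```

With this made-up data the distant point ends up alone in its own group, which is how an outlier or remainder cluster would be spotted: groups with very few members are the candidates. Stopping Kruskal's construction at `k` components is equivalent to building the full minimum spanning tree over the centers and cutting its longest edges, which is one plausible reading of the merging step, not a confirmed one.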