A Fast Clustering Process for Outliers and Remainder Clusters (一個找尋異常群集的快速分群演算法)

Full metadata record

DC Field	Value	Language
dc.contributor.author	蘇志明	en_US
dc.contributor.author	Chih-Ming Su	en_US
dc.contributor.author	曾憲雄	en_US
dc.contributor.author	Shian-Shyong Tseng	en_US
dc.date.accessioned	2014-12-12T02:20:27Z	-
dc.date.available	2014-12-12T02:20:27Z	-
dc.date.issued	1998	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#NT870394013	en_US
dc.identifier.uri	http://hdl.handle.net/11536/64151	-
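dc.description.abstract	In many application domains, finding or distinguishing outliers that differ greatly from the ordinary clusters is an important and fundamental step. In traditional data analysis and artificial intelligence, such outliers often need to be excluded or given a lower weight so that they do not introduce large errors into the analysis. In recent years, data mining has received growing attention, and many papers have been published on how to find implicit and useful information in large amounts of data. However, data mining work has so far focused on finding frequently occurring association sets or trends in transaction records. This thesis takes the opposite view and investigates how to find, in a large amount of data, anomalous data that differs from the ordinary sets. For example, in e-mail log files one can look for unusual records that fall outside the range of normal, reasonable use; from a network management perspective, tracking and analyzing such records will provide important reference material in the future. In this thesis we propose a two-phase clustering strategy. In the first phase, we improve the traditional k-means clustering algorithm by adding a "jump" heuristic so that, during the iterative clustering stage, outliers have a higher probability of being treated as independent clusters. In the second phase, we use the concept of the minimum spanning tree to re-cluster the results of the first phase. Finally, we run experiments on three kinds of data, and all of them yield very good results.	zh_TW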
dc.description.abstract	Identifying outliers and remainder clusters, which designate the few patterns that differ greatly from other clusters, is a fundamental step in many application domains. However, current outlier diagnostics are often inadequate for large amounts of data. In this thesis, we propose a two-phase clustering algorithm for outliers. In Phase 1, we modify the k-means algorithm with the heuristic "if a new input pattern is far enough away from all cluster centers, then assign it as a new cluster center", so that the number of clusters found in this phase exceeds the number originally set in the k-means algorithm. In Phase 2, we propose a cluster-merging process that merges the clusters obtained in Phase 1 down to the number of clusters originally set by the user. The results of three experiments show that outliers and remainder clusters can be easily identified by our method.	en_US
dc.language.iso	en_US	en_US
dc.subject	資料探勘	zh_TW
dc.subject	分群法	zh_TW
dc.subject	異常節點	zh_TW
dc.subject	Data Mining	en_US
dc.subject	Clustering	en_US
dc.subject	Outliers	en_US
dc.title	一個找尋異常群集的快速分群演算法	zh_TW
dc.title	A Fast Clustering Process for Outliers and Remainder Clusters	en_US
dc.type	Thesis	en_US
dc.contributor.department	Institute of Computer Science and Engineering (資訊科學與工程研究所)	zh_TW
Appears in Collections:	Thesis
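
The two-phase idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not say how "far enough" is measured or how the merge is carried out in detail, so the threshold `jump_dist`, the initialization from the first k points, the fixed iteration count, and the Kruskal-style merge over cluster centers are all assumptions made for illustration. The function names `phase1`, `phase2`, and `two_phase` are hypothetical.

```python
import math
from itertools import combinations


def phase1(points, k, jump_dist, iters=10):
    """Modified k-means (sketch). A point farther than jump_dist from
    every current center becomes a new center. jump_dist is an assumed
    parameter; the record does not define "far enough"."""
    centers = [tuple(p) for p in points[:k]]  # assumed naive initialization
    labels = [0] * len(points)
    for _ in range(iters):
        for i, p in enumerate(points):
            dists = [math.dist(p, c) for c in centers]
            j = min(range(len(centers)), key=dists.__getitem__)
            if dists[j] > jump_dist:
                # "jump": promote this point to a new cluster center
                centers.append(tuple(p))
                j = len(centers) - 1
            labels[i] = j
        # move each center to the mean of its members;
        # a center with no members is left where it is
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels


def phase2(centers, k):
    """Merge centers into at most k groups (sketch). Shortest edges are
    added Kruskal-style until k components remain, which matches cutting
    the k-1 longest edges of a minimum spanning tree over the centers."""
    parent = list(range(len(centers)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    edges = sorted(
        combinations(range(len(centers)), 2),
        key=lambda e: math.dist(centers[e[0]], centers[e[1]]),
    )
    components = len(centers)
    for a, b in edges:
        if components <= k:
            break
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    roots = sorted({find(a) for a in range(len(centers))})
    return [roots.index(find(a)) for a in range(len(centers))]


def two_phase(points, k, jump_dist):
    """Run both phases and return one merged group label per point."""
    centers, labels = phase1(points, k, jump_dist)
    merged = phase2(centers, k)
    return [merged[lab] for lab in labels]
```

In this sketch, a point that lies far from all dense regions ends up as its own center in the first phase, and because its center is joined to the others only by a long edge, it tends to survive the merge as a separate small group. How well that works depends heavily on the assumed `jump_dist`.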