Divisive K-Means Clustering Algorithm for Determining k and Positions of Cluster Centers

Title:	Divisive K-Means Clustering Algorithm for Determining k and Positions of Cluster Centers
Authors:	林佑信 Lin, You-Shin; 李程揮 Lee, Tsern-Huei; Institute of Communications Engineering (電信工程研究所)
Keywords:	Clustering; k-means; initial cluster center
Issue Date:	2008
Abstract:	Clustering is a well-known research topic that is widely applied in many fields. Among the many clustering algorithms, the k-means algorithm is one of the most popular, simplest, and fastest. However, applying it raises two major problems. First, the correct value of k is usually unknown for a real data set. Second, it is difficult to select initial cluster centers effectively, and the clustering result is sensitive to their initial positions. To address both problems, we propose a new algorithm that extends the standard k-means algorithm by adding a conflict term to the objective function, making the clustering process less sensitive to the choice of initial cluster centers. Combined with a cluster validation technique, the algorithm determines the optimal number of cluster centers and their positions. Simulation results on synthetic data sets show that the proposed algorithm is effective in determining the number and positions of the cluster centers.
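Illustration:	The second problem named in the abstract, sensitivity to the initial cluster centers, can be seen in a minimal sketch of the standard k-means (Lloyd) iteration. This is not the thesis's divisive algorithm: the abstract does not specify the conflict term, so it is not reproduced here. The six one-dimensional points, the two initializations, and the function names `sq_dist` and `kmeans` are assumptions made up for this illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, max_iter=100):
    """Standard k-means (Lloyd iterations) started from the given centers.

    Returns the final centers and the sum of squared errors (SSE),
    which is the usual k-means objective.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # Update step: each center moves to the mean of its group.
        # An empty group keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Hypothetical data: three well-separated pairs of points on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,), (20.0,), (21.0,)]

# One initial center per pair versus two initial centers in the same pair.
good_centers, good_sse = kmeans(points, [(0.0,), (10.0,), (20.0,)])
bad_centers, bad_sse = kmeans(points, [(0.0,), (1.0,), (15.0,)])

print(good_centers, good_sse)
print(bad_centers, bad_sse)
```

With k = 3 in both runs, the second initialization stalls in a local optimum that merges two of the pairs into one cluster, so its SSE is far larger than that of the first run. This is the kind of dependence on initialization that the proposed algorithm aims to reduce.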
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT079613544 http://hdl.handle.net/11536/41981
Appears in Collections:	Thesis

Files in This Item:

354401.pdf
