Title: 以區域性鄰集為基礎之相似度轉換方法應用於分群演算法
A Locality Based Similarity Transformation Method for Clustering Algorithms
Authors: 陳彥嘉
Chen, Yen-Chia
胡毓志
Hu, Yuh-Jyh
Institute of Computer Science and Engineering
Keywords: Clustering algorithms; Similarity transformation; Non-convex shaped clusters; Dimensional reduction
Publication Date: 2012
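Abstract: Choosing an appropriate similarity function is an important problem in clustering, because the similarity function directly affects the clustering result. We propose a similarity transformation method based on local neighbor sets, which adjusts the similarity between data points by observing the distribution of their local neighbor sets. Through this transformation we can adjust the distribution of the data and accentuate the boundaries within it, which helps clustering algorithms find meaningful clusters. By applying the method to unsupervised or semi-supervised clustering algorithms, we expect to discover clusters with many kinds of distribution. Our experimental results show that: 1. without the help of pairwise constraints, the similarity transformation based on local nearest neighbors effectively handles a variety of non-convex data distributions, and adding pairwise constraints further improves overall accuracy; 2. the similarity transformation based on local nearest neighbors can also be applied to dimensionality reduction. The experiments show that dimensionality reduction not only lowers the amount of computation, improving the overall speed of the clustering algorithm, but also maintains clustering correctness in many of the experiments.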
An appropriate similarity function is crucial to clustering algorithms because it directly affects the clustering result. We propose a locality based similarity transformation method, which transforms the similarity between two data points based on the distribution of their nearby neighbors. The transformation reveals blurry boundaries between clusters more clearly. By applying the locality based similarity transformation method to unsupervised or semi-supervised clustering, we can discover clusters more easily even if they have irregular contours. Our experimental results demonstrate that: (1) the proposed locality based similarity transformation method can improve clustering methods in finding arbitrarily shaped clusters without any prior knowledge, (2) prior knowledge represented as pairwise constraints can be incorporated to further improve clustering performance, and (3) a dimension reduction method based on multidimensional scaling can be combined with the transformation procedure to reduce not only the feature space but also the computation cost of the transformation.
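The abstract does not state the formula used for the transformation. As a rough illustration of the general idea of adjusting pairwise similarity from local neighborhoods, the sketch below uses shared-nearest-neighbor similarity, a different and well-known technique, not the method proposed in the thesis. The function names, the value of k, and the toy data are all assumptions made for this example.

```python
import math


def knn_sets(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def snn_similarity(points, k):
    """Shared-nearest-neighbor similarity matrix.

    The similarity of two distinct points is the fraction of their
    k nearest neighbors that they have in common. A point is never
    counted as its own neighbor in this simplified variant.
    """
    nbrs = knn_sets(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                sim[i][j] = 1.0
            else:
                sim[i][j] = len(nbrs[i] & nbrs[j]) / k
    return sim


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
sim = snn_similarity(points, k=2)
```

In this toy example, points in the same group share a neighbor and receive a positive similarity, while points in different groups share none and receive zero, so the boundary between the groups becomes sharper than under raw Euclidean distance.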
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079955591
http://hdl.handle.net/11536/50498
Appears in Collections: Thesis


Files in This Item:

  1. 559101.pdf
