Title: 基於半徑加權平均和平均半徑保留公式的分群
Clustering using Radius-Weighted Means and Analytical Radius-Preserved Formula
Authors: 葉美伶
林志青
Yeh, Mei-Ling
Lin, Ja-Chen
Institute of Multimedia Engineering
Keywords: full-dimension clustering; dimension reduction; divisive hierarchical clustering; high-dimensional data
Issue Date: 2016
Abstract: As smartphones and wearable devices become widespread, people are developing usage habits that differ from those of the past. These devices also collect data automatically, and the data are more diverse and abundant than before, which in turn has given rise to a wide variety of applications. In this setting the data often carry a large number of features; that is, they are high-dimensional. For customized embedded devices with limited hardware, analyzing such data and responding in real time is a difficult problem. This thesis proposes two clustering methods, the RWM Approach and the Analytical Approach, based respectively on the "Radius-Weighted Mean bisection method" and on a two-cluster splitting method that uses the "Analytical Radius-Preserved Formula". Both approaches can handle high-dimensional data with a relatively large number of points, and neither requires the number of clusters to be known in advance. Because their clustering results differ little from those of other algorithms that also do not require the number of clusters as input, the comparison focuses on processing speed. The reported methods used for comparison fall into two categories: full-dimension clustering, which preserves the original data dimensions, and clustering based on dimension reduction; several methods are selected from each. The experiments examine how the speed of each algorithm changes as the number of data points and the number of dimensions increase. In these experiments the proposed approaches are faster than the compared methods, and they are better suited to applications that require real-time responses.
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356624
http://hdl.handle.net/11536/139133
Appears in Collections: Thesis