Full metadata record
DC Field | Value | Language
dc.contributor.author | 魏新宇 | zh_TW
dc.contributor.author | 李嘉晃 | zh_TW
dc.contributor.author | 劉建良 | zh_TW
dc.contributor.author | 莊仁輝 | zh_TW
dc.contributor.author | WEI, XINYU | en_US
dc.contributor.author | Lee, Chia-Hoang | en_US
dc.contributor.author | Liu, Chien-Liang | en_US
dc.contributor.author | Chuang, Jen-Hui | en_US
dc.date.accessioned | 2018-01-24T07:37:05Z | -
dc.date.available | 2018-01-24T07:37:05Z | -
dc.date.issued | 2016 | en_US
dc.identifier.uri | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356146 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/138957 | -
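dc.description.abstract | As technology advances, the volume of data grows exponentially, which makes data clustering increasingly important. Data this large and complex can no longer be processed by hand, so new algorithms are needed to cluster it automatically, accurately, and efficiently. Previous studies show that ensemble clustering, which combines multiple clustering methods and aggregates their results, often improves on the performance of a single clustering method and yields more stable results. This thesis therefore proposes Feature-based Ensemble Clustering with Indian Buffet Process. The method does not need to be told the number of clusters: it learns from the data during clustering and determines the number it considers most suitable. The thesis uses quality and diversity as the criteria for building the ensemble and proposes a feature extraction method that combines the Indian Buffet Process (IBP) with a greedy algorithm. It also proposes an aggregation algorithm that merges all clustering results into the final result, which improves clustering performance. Experimental results show that the proposed method outperforms other unsupervised learning methods. | zh_TW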
dc.description.abstract | With the development of technology, the amount of data is growing exponentially. This makes data clustering increasingly important, since clustering is a key technique in data exploration. Clustering is an unsupervised learning method, so improving performance and obtaining robust clustering results are challenging tasks in machine learning. Moreover, specifying the number of clusters is another problem for a certain class of clustering algorithms. Previous studies have shown that ensemble learning, which considers many clustering methods and aggregates their results, can often yield a better and more robust result than a single method. This thesis proposes a feature-based ensemble clustering model based on the Indian Buffet Process (IBP). The proposed model does not need to know the number of clusters in advance; it obtains the most suitable number for the data during the clustering process. The proposed method uses quality and diversity as performance criteria to select feature subsets based on the IBP and the proposed greedy algorithm. Each feature subset is treated as a view of the data, and each subset yields ten clustering results. The final clustering result is the aggregation of these results using the proposed aggregation algorithm. The experimental results indicate that the proposed model generally outperforms other unsupervised methods. | en_US
dc.language.iso | en_US | en_US
dc.subject | 印度餐廳過程 | zh_TW
dc.subject | 整合分群 | zh_TW
dc.subject | 特征抽取 | zh_TW
dc.subject | Indian buffet process | en_US
dc.subject | ensemble clustering | en_US
dc.subject | feature subsets selection | en_US
dc.title | 基於印度餐廳過程的特征抽取整合分群法 | zh_TW
dc.title | Feature-Based Ensemble Clustering with Indian Buffet Process | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 | zh_TW
Appears in Collections: Thesis
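
The abstract names the Indian Buffet Process (IBP) as the basis for selecting feature subsets, but this record does not describe how the thesis applies it. As background only, the following is a minimal sketch of drawing a binary matrix from the standard IBP prior, in which the number of columns is not fixed in advance. The function names `sample_poisson` and `sample_ibp`, the parameter `alpha`, and the reading of rows and columns are illustrative assumptions, not the thesis's method; the quality and diversity criteria, the greedy algorithm, and the aggregation step are not shown.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) sample using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_rows, alpha, seed=0):
    """Sample a binary matrix from the standard IBP prior (illustrative only).

    Row i (1-based) reuses an existing column k with probability m_k / i,
    where m_k is the number of earlier rows that used column k, and then
    opens Poisson(alpha / i) new columns. The column count is therefore
    determined by the sampling process, not chosen beforehand.
    """
    rng = random.Random(seed)
    column_counts = []  # m_k for every column opened so far
    rows = []
    for i in range(1, num_rows + 1):
        row = []
        for k, m_k in enumerate(column_counts):
            take = rng.random() < m_k / i
            row.append(1 if take else 0)
            if take:
                column_counts[k] += 1
        new_columns = sample_poisson(alpha / i, rng)
        row.extend([1] * new_columns)
        column_counts.extend([1] * new_columns)
        rows.append(row)
    # Pad earlier rows with zeros so the matrix is rectangular.
    width = len(column_counts)
    return [row + [0] * (width - len(row)) for row in rows]


if __name__ == "__main__":
    for row in sample_ibp(num_rows=6, alpha=2.0, seed=1):
        print(row)
```

Larger values of `alpha` tend to open more columns. One hypothetical reading, not confirmed by this record, would treat each column as a candidate grouping and each row as an item assigned to zero or more groupings.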