Feature-Based Ensemble Clustering with Indian Buffet Process

Full metadata record

DC Field	Value	Language
dc.contributor.author	魏新宇	zh_TW
dc.contributor.author	李嘉晃	zh_TW
dc.contributor.author	劉建良	zh_TW
dc.contributor.author	莊仁輝	zh_TW
dc.contributor.author	WEI, XINYU	en_US
dc.contributor.author	Lee, Chia-Hoang	en_US
dc.contributor.author	Liu, Chien-Liang	en_US
dc.contributor.author	Chuang, Jen-Hui	en_US
dc.date.accessioned	2018-01-24T07:37:05Z	-
dc.date.available	2018-01-24T07:37:05Z	-
dc.date.issued	2016	en_US
dc.identifier.uri	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356146	en_US
dc.identifier.uri	http://hdl.handle.net/11536/138957	-
dc.description.abstract	隨著科技的進步，資料數量也以指數級在增長，因而使得資料分群問題愈發重要。龐雜的資料使得我們不能再依賴人工去完成，這時就需要開發出各式新的算法，實現機器自動化分群，達到準確高效的目的。之前的研究結果顯示，使用整合分群的方式結合多個分群方法，并整合其結果，往往可提昇單一分群法之效能，同時可以讓分群結果更穩定。因此本文提出一種基於印度餐廳過程的特征抽取整合分群法（Feature based Ensemble Clustering with Indian Buffet Process）。該方法在分群時不需要知道資料應有的群數，它會在分群的過程中，通過對資料的學習，自行得出它認為最適合的群數。本論文使用品質以及差異性作為分群整合之依據，我們提出一個以印度餐廳過程（Indian Buffet Process, IBP）為基礎結合貪婪算法的特征抽取方法。另外，我們提出一個整合算法，通過整合所有分群結果得到最終結果，使分群效果得到了提升。此外，最後實驗結果顯示本論文提出的方法表現優於其他非監督式學習法。	zh_TW
dc.description.abstract	With the development of technology, the amount of data is growing exponentially. This makes data clustering increasingly important, since clustering is a key technique in data exploration. Clustering is an unsupervised learning method, so improving performance and obtaining robust clustering results are challenging tasks in machine learning. Moreover, specifying the number of clusters is another problem for a certain class of clustering algorithms. Previous studies have shown that ensemble learning, which considers many clustering methods and aggregates their results, often yields a better and more robust result than a single method. This thesis proposes a feature-based ensemble clustering model based on the Indian Buffet Process (IBP). The proposed model does not need to know the number of clusters in advance; it obtains the most suitable number for the data during the clustering process. The proposed method uses quality and diversity as criteria to select feature subsets, based on IBP and a proposed greedy algorithm. Each feature subset is treated as a view of the data, and each subset yields ten clustering results. The final clustering result is obtained by aggregating these results with the proposed aggregation algorithm. The experimental results indicate that the proposed model generally outperforms other unsupervised methods.	en_US
dc.language.iso	en_US	en_US
dc.subject	印度餐廳過程	zh_TW
dc.subject	整合分群	zh_TW
dc.subject	特征抽取	zh_TW
dc.subject	Indian buffet process	en_US
dc.subject	ensemble clustering	en_US
dc.subject	feature subsets selection	en_US
dc.title	基於印度餐廳過程的特征抽取整合分群法	zh_TW
dc.title	Feature-Based Ensemble Clustering with Indian Buffet Process	en_US
dc.type	Thesis	en_US
dc.contributor.department	資訊科學與工程研究所	zh_TW
Appears in Collections:	Thesis