標題: 機率式模型分群法之研究與其應用
Probabilistic Model-based Clustering and Its Applications
作者: 鄭士賢
Cheng, Shih-Sian
傅心家
王新民
Fu, Hsin-Chia
Wang, Hsin-Min
資訊科學與工程研究所
關鍵字: 模型分群法;高斯混合模型;EM演算法;機率式自我組織圖;自我組織圖;音訊分段;分割且克服;model-based clustering;Gaussian mixture model;EM algorithm;probabilistic self-organizing map;self-organizing map;audio segmentation;divide-and-conquer
公開日期: 2008
摘要: 機率式模型分群法藉由學習一個有限混合模型(finite mixture model)而達成資料分群的目的,其已經有很多成功的應用; 而最常用的混合模型乃是高斯混合模型(Gaussian mixture model, i.e., GMM)。當其目的函數的最佳化無法用分析方法來達成時,我們經常用迭代式、局部最佳(local-optimal)的演算法來學習混合模型。此外,因為來自於混合模型的資料樣本有非完全(incompleteness)的特性,因此我們通常採用EM形式(EM-type)的演算法,如EM演算法與分類EM演算法(classification EM, i.e., CEM)。然而,用傳統EM形式演算法來做模型分群有幾個缺點: 一、它的效能與模型初始值有高度相關性; 二、混合元件(mixture component)的個數需要事先給定; 三、在完成分群之後,資料樣本與群聚之間的拓樸關係(topological relationships)無法被保留。在本論文中,針對上述前兩樣缺點,我們提出一個自我分裂之高斯混合模型學習演算法(self-splitting Gaussian mixture learning, i.e., SGML)。此法起始於單一元件,然後迭代式地,用貝氏資訊法則(Bayesian information criterion, i.e., BIC)來確認分裂的有效性,分裂某一個元件而成為兩個新元件直到最佳的元件個數被找到為止,最佳的元件個數亦是用BIC來決定。基於此分裂程序,SGML可為學習不同模型複雜度的GMM提供不錯的模型初始值。關於拓樸保留(topology-preserving)的議題,我們將一個機率式自我組織圖(probabilistic self-organizing map, i.e., PbSOM)的學習視為一種可保留資料樣本與群聚之間拓樸關係於一個神經網路(neural network)的模型分群法。基於此概念,我們為PbSOM發展了一個耦合概似混合模型(coupling-likelihood mixture model),其延伸Kohonen SOM的參考向量(reference vectors)至多變量高斯模型。基於最大概似法則(maximum likelihood criterion),我們亦發展了三個學習PbSOM的EM形式演算法,亦即SOCEM、 SOEM、以及SODAEM演算法。SODAEM是SOCEM與SOEM的決定性退火(deterministic annealing, i.e., DA)變形; 此外,藉由逐漸縮小鄰域大小(neighborhood size),SOCEM與SOEM可分別被解釋成: 高斯模型分群的CEM與EM演算法的DA變形。實驗結果顯示,我們所提出的PbSOM演算法與決定性EM(DAEM)演算法有相近的分群效能,並能維持拓樸保留之特性。關於應用方面,我們用SGML來訓練用於語者識別(speaker identification)的語者GMMs;實驗結果顯示,SGML可以自動地為個別語者GMM決定適當的模型複雜度,而此在文獻裡通常是用經驗法則來決定的。我們將所提出的PbSOM學習演算法應用於資料視覺化與分析;實驗結果顯示它們可用於在一個二維的網路上探索資料樣本與群聚之間的拓樸關係。此外,我們提出幾種分割且克服(divide-and-conquer)的方法來做音訊分段(audio segmentation),其用BIC來評定兩個音段之間的不相似度(dissimilarity);在廣播新聞的實驗結果顯示,相較於公認最佳的視窗成長分段法(window-growing-based segmentation),我們的方法不僅有比較低的計算量,亦有較高的分段正確性。
Probabilistic model-based clustering is achieved by learning a finite mixture model (usually a Gaussian mixture model, i.e., GMM) and has been successfully applied to many tasks. When the objective likelihood function cannot be optimized analytically, iterative, locally-optimal learning algorithms are usually applied to learn the model. Moreover, because the data samples drawn from a mixture model have the property of incompleteness, EM-type algorithms, such as EM and classification EM (CEM), are usually employed. However, model-based clustering based on conventional EM-type algorithms suffer from the following shortcomings: 1) the performance is sensitive to the model initialization; 2) the number of mixture components (data clusters) needs to be pre-defined; and 3) the topological relationships between data samples and clusters are not preserved after the learning process. In this thesis, we propose a self-splitting Gaussian mixture learning algorithm (SGML) to address the first two issues. The algorithm starts with one single component and iteratively splits a component into two new components using Bayesian information criterion (BIC) as the validity measure for the splitting until the most appropriate component number is found, which is also achieved by BIC. Based on the splitting process, SGML provides a decent model initialization for learning Gaussian mixture models with different model complexities. For the topology-preserving issue, we consider the learning process of a probabilistic self-organizing map (PbSOM) as a model-based clustering procedure that preserves the topological relationships between data samples and clusters in a neural network. Based on this concept, we develop a coupling-likelihood mixture model for the PbSOM that extends the reference vectors in Kohonen's SOM to multivariate Gaussians distributions. We also derive three EM-type algorithms, called the SOCEM, SOEM, and SODAEM algorithms, for learning the model (PbSOM) based on the maximum likelihood criterion. SODAEM is a deterministic annealing (DA) variant of SOCEM and SOEM; moreover, by shrinking the neighborhood size, SOCEM and SOEM can be interpreted, respectively, as DA variants of the CEM and EM algorithms for Gaussian model-based clustering. The experiment results show that the proposed PbSOM learning algorithms achieve comparable data clustering performance to that of the deterministic annealing EM (DAEM) approach, while maintaining the topology-preserving property. For applications, we apply SGML to the training of speaker GMMs for the speaker identification task; the experiment results show that SGML can automatically determine an appropriate model complexity for a speaker GMM, which is usually determined empirically in the literature. We apply the proposed PbSOM algorithms to data visualization and analysis; the experiment results show that they can be used to explore the topological relationships between data samples and clusters on a two-dimensional network. In addition, we propose divide-and-conquer approaches for the audio segmentation task using BIC to evaluate the dissimilarity between two audio segments; the experiment results on the broad-cast news data demonstrate that our approaches not only have a lower computation cost but also achieve higher segmentation accuracy than the leading window-growing-based segmentation approach.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009317823
http://hdl.handle.net/11536/78854
顯示於類別:畢業論文


文件中的檔案:

  1. 782301.pdf

若為 zip 檔案,請下載檔案解壓縮後,用瀏覽器開啟資料夾中的 index.html 瀏覽全文。