Full metadata record
DC Field | Value | Language
dc.contributor.author | 鄭士賢 | en_US
dc.contributor.author | Shi-Sian Cheng | en_US
dc.contributor.author | 傅心家 | en_US
dc.contributor.author | Hsin-Chia Fu | en_US
dc.date.accessioned | 2014-12-12T02:27:41Z | -
dc.date.available | 2014-12-12T02:27:41Z | -
dc.date.issued | 2001 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#NT900392076 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/68486 | -
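dc.description.abstract | This thesis studies the learning of Gaussian mixture models (GMMs) and their application to speaker identification. Previous studies have achieved good results using per-speaker GMMs for speaker identification, but they did not examine in depth the number of Gaussian components or the type of covariance matrix (full or diagonal). In this thesis we propose a self-growing learning method based on the Bayesian Information Criterion (BIC), which learns a GMM while automatically determining the number of Gaussian components. We also perform speaker identification with full-covariance and diagonal-covariance GMMs separately and compare the experimental results. We recorded TV news programs as MPEG files and extracted the news anchors' speech from them; the corpus covers 19 female anchors and 3 male anchors. On this test corpus, the full-covariance GMM speaker identifier reaches an identification rate of 95.84%, and the diagonal-covariance GMM speaker identifier reaches 97.90%. We further use the GMM-based speaker identification method to detect where news anchors appear in a news program and thereby segment news stories. With seven hours of news programs as test data, anchor detection achieves a precision rate of 90.20% and a recall rate of 92.5%. | zh_TW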
dc.description.abstract | This thesis discusses the learning of Gaussian mixture models (GMMs) and their application to speaker identification. Previous studies have shown that GMMs perform well for speaker identification, but they do not examine in depth the number of Gaussian components in a GMM or the type of covariance matrix (full or diagonal). We propose a BIC-based self-growing learning method for GMMs that determines the number of Gaussian components of each GMM automatically. We also apply full-covariance and diagonal-covariance GMMs to speaker identification separately and compare the experimental results. Our speaker database includes 19 female anchors and 3 male anchors, taken from MPEG files that we recorded from TV news with a capture card. On this database, the GMM speaker identifier attains an identification accuracy of 95.84% with full covariance matrices and 97.90% with diagonal covariance matrices. We also use the GMM-based speaker identification method for TV news anchor detection and news story segmentation. With 7 hours of TV news programs as test data, the precision rate reaches 90.20% and the recall rate reaches 92.5%. | en_US
dc.language.iso | zh_TW | en_US
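dc.subject | Gaussian mixture model | zh_TW
dc.subject | speaker identification | zh_TW
dc.subject | Bayesian information criterion | zh_TW
dc.subject | speaker segmentation | zh_TW
dc.subject | news anchor | zh_TW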
dc.subject | gaussian mixture model | en_US
dc.subject | GMM | en_US
dc.subject | BIC | en_US
dc.subject | clustering | en_US
dc.subject | speaker identification | en_US
dc.subject | speaker segmentation | en_US
dc.subject | news | en_US
dc.title | Learning of Gaussian Mixture Models and Its Application to Speaker Identification | zh_TW
dc.title | Model-based learning for Gaussian Mixture Model and its application on speaker identification | en_US
dc.type | Thesis | en_US
dc.contributor.department | Institute of Computer Science and Engineering | zh_TW
Appears in Collections: Thesis
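
The abstract above contrasts full and diagonal covariance GMMs and selects the number of Gaussian components with BIC. The record does not give the thesis's exact formulation, so the following is only a minimal sketch under stated assumptions: it uses the common convention BIC = k ln(n) - 2 ln(L), where lower is better, and all function names (`gmm_param_count`, `bic`, `pick_components`) are hypothetical helpers for illustration, not the authors' self-growing procedure.

```python
import math


def gmm_param_count(n_components, dim, covariance="full"):
    """Number of free parameters in a GMM (illustrative helper).

    Mixing weights contribute n_components - 1 (they sum to 1),
    means contribute n_components * dim, and each covariance matrix
    contributes dim * (dim + 1) / 2 entries if full, dim if diagonal.
    """
    weights = n_components - 1
    means = n_components * dim
    if covariance == "full":
        covs = n_components * dim * (dim + 1) // 2
    elif covariance == "diag":
        covs = n_components * dim
    else:
        raise ValueError("covariance must be 'full' or 'diag'")
    return weights + means + covs


def bic(log_likelihood, n_params, n_samples):
    """BIC under the assumed convention k*ln(n) - 2*ln(L); lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def pick_components(log_likelihoods, dim, n_samples, covariance="full"):
    """Pick the component count with the lowest BIC.

    log_likelihoods maps a candidate component count to the total
    log-likelihood of an already-fitted GMM with that many components.
    """
    scores = {
        k: bic(ll, gmm_param_count(k, dim, covariance), n_samples)
        for k, ll in log_likelihoods.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

In use, one would fit a GMM for each candidate component count, pass the resulting log-likelihoods to `pick_components`, and keep the count it returns. Because a diagonal model has far fewer covariance parameters per component, its BIC penalty grows more slowly as components are added.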