Title: 基於語者及方言屬性作自動分群之研究
Automatic utterance clustering based on speaker and dialect attributes
Authors: 吳政麟
Cheng-Lin Wu
張文輝
Wen-Whei Chang
Institute of Communications Engineering
Keywords: speaker clustering; dialect clustering; Gaussian mixture model (GMM); BBN; prosody; phoneme
Issue date: 2001
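Abstract: This thesis develops a vocabulary-independent automatic utterance clustering system, with particular emphasis on integrating two major acoustic properties, articulation and prosody, in order to provide a concrete and practical method for retrieving Chinese spoken-language data. The research is divided into two parts according to the clustering target. The first part investigates a speaker clustering system for an unspecified number of speakers, which must also detect the optimal number of clusters when the number of speakers is unknown. The second part performs dialect clustering on the three main dialects spoken in Taiwan: Mandarin, Hakka, and Holo. Experimental results show that inter-utterance distances computed from trained Gaussian mixture models successfully integrate articulatory and prosodic features and improve the accuracy of automatic utterance clustering.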
As part of a multilingual spoken language system, reliable techniques are needed to cluster utterances in order to classify and index large speech databases. This work aims to develop an automatic utterance clustering algorithm that takes context-independent utterances as input and outputs the identity of a speaker or a dialect. The system has been trained to cluster fifty speakers or three Chinese dialects (Mandarin, Holo, and Hakka), and could easily be extended to include more speakers or dialects. It is well known that dialects and speakers differ from one another in the typical sequential statistics of their phonemes and pitch contours. By integrating phonetic and prosodic information through Gaussian mixture models (GMMs), we report the benefits of a new hybrid clustering system that achieves higher accuracy rates.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT900435066
http://hdl.handle.net/11536/68943
Appears in collections: Theses