Title: 語者調適在台灣方言辨識之研究
A Study of Speaker Adaptation on Automatic Taiwanese Dialect Identification
Authors: 楊萬興
Wan-Hsing Yang
張文輝
Dr. Wen-Whei Chang
Institute of Communications Engineering
Keywords: HMM identification; dialect identification; speaker adaptation; codebook adaptation; Chinese-dialect identification; CMS; MAP; MLLR
Issue date: 1998
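Abstract: This thesis investigates how speaker variability affects dialect identification and proposes an adaptation framework that addresses the speaker mismatch problem for dialect identification. The study covers the three major dialects spoken in Taiwan (Mandarin, Holo, and Hakka), and the goal is to extend the multi-speaker mode to the speaker-independent mode. We first adopt a two-level HMM identification architecture that captures the phonotactic and acoustic differences among the dialects as the basis for discrimination, and we try to remove the speaker mismatch with speaker compensation techniques commonly used in speech recognition. However, applying these techniques does not solve the speaker problem in dialect identification. We therefore turn to a different dialect identification architecture based on vector quantization, which makes effective speaker adaptation easier to realize. Although this architecture essentially uses only the acoustic information in the dialects, its performance in the multi-speaker mode comes close to that of the two-level HMM architecture. In addition, a simple codebook adaptation markedly improves the identification rate for new test speakers.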
Previous work on automatic Chinese-dialect identification used an acoustic-phonotactic model that lets the system distinguish three dialects in a multi-speaker (MS) environment. However, when the task is extended to the speaker-independent (SI) mode, the well-trained identifier degrades seriously because of the mismatch between training and testing conditions. To overcome this problem, we applied several well-established solutions: cepstral mean subtraction (CMS), spectral transform, maximum a posteriori (MAP) adaptation, and maximum likelihood linear regression (MLLR). The experimental results indicate, however, that these speaker compensation schemes, which were developed for speech recognition, are less successful for dialect identification. We speculate that speaker compensation may destroy the discriminability of the acoustic-phonotactic model. We therefore developed an acoustic-based VQ-distortion identifier with codebook adaptation to alleviate the speaker mismatch problem. Simulation results indicate that the VQ-distortion identifier can be extended to an SI system with little degradation.
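To make the idea of a VQ-distortion identifier with codebook adaptation concrete, the following is a minimal sketch and not the thesis's implementation. It assumes one codebook per dialect, squared Euclidean distortion, and a deliberately simple stand-in for adaptation: each codebook is shifted by the difference between the new speaker's mean frame and the codebook's mean codeword. The function names, the two-dimensional toy feature vectors, and the codeword values are all hypothetical.

```python
# Toy VQ-distortion dialect identifier (illustrative assumptions only).

def sq_dist(x, c):
    """Squared Euclidean distance between a frame and a codeword."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def avg_distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    total = sum(min(sq_dist(x, c) for c in codebook) for x in frames)
    return total / len(frames)

def identify(frames, codebooks):
    """Pick the dialect whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda d: avg_distortion(frames, codebooks[d]))

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def adapt_codebook(codebook, adapt_frames):
    """Assumed adaptation rule: shift all codewords by a single bias vector,
    the speaker's mean frame minus the codebook's mean codeword."""
    frame_mean = mean_vector(adapt_frames)
    code_mean = mean_vector(codebook)
    bias = [f - c for f, c in zip(frame_mean, code_mean)]
    return [[c[k] + bias[k] for k in range(len(c))] for c in codebook]

# Hypothetical 2-D codebooks, one per dialect.
codebooks = {
    "Mandarin": [[0.0, 0.0], [2.0, 0.0]],
    "Holo": [[0.0, 3.0], [0.0, 5.0]],
    "Hakka": [[6.0, 0.0], [9.0, 3.0]],
}

# Frames from a hypothetical new Holo speaker: the Holo codewords
# displaced by a speaker-dependent offset of (0.5, -2.6).
new_speaker_frames = [[0.5, 0.4], [0.5, 2.4]]

# Adapt every dialect codebook to the new speaker, then identify again.
adapted = {d: adapt_codebook(cb, new_speaker_frames)
           for d, cb in codebooks.items()}

print("before adaptation:", identify(new_speaker_frames, codebooks))
print("after adaptation:", identify(new_speaker_frames, adapted))
```

In this toy setup the speaker offset pulls the frames closer to the unadapted Mandarin codebook, while after the shift only the Holo codebook matches the spread of the frames. A real system would use cepstral feature vectors, much larger codebooks, and whatever adaptation rule the thesis actually specifies.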
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT870435104
http://hdl.handle.net/11536/64564
Appears in Collections: Thesis