A First Study on Speaker Adaptation and Normalization for Continuous Mandarin Speech Recognition

Title:	A First Study on Speaker Adaptation and Normalization for Continuous Mandarin Speech Recognition (語者調適和正規化技術在語音辨認之初步研究)
Authors:	Tsai, Chung-An (蔡忠安); Sin-Horng Chen (陳信宏); Institute of Communications Engineering
Keywords:	Speaker Adaptation; Speaker Normalization; MLLR technique; speech recognition
Issue Date:	1997
Abstract:	This thesis studies speaker adaptation and speaker normalization for continuous Mandarin speech recognition. For speaker adaptation, the MLLR technique is used to estimate a transformation matrix for each test speaker from a small set of his or her utterances, and the existing speaker-independent (SI) HMM models are transformed into models better suited to that speaker. Experimental results show that adaptation raises the recognition rate well above that of the baseline system, and that the rate keeps increasing as more adaptation data are used: with adaptation sets of 5 sentential utterances (about 30 seconds), the base-syllable recognition rate rises from 53.16% to 62.30%. For speaker normalization, three schemes derived from the MLLR technique are proposed to remove speakers' personal characteristics from the input speech signals so that a set of speaker-independent HMM models can be trained. Scheme 1 subtracts the speaker bias estimated by MLLR directly from each training speaker's feature parameters and then re-estimates the HMM models. Scheme 2 smooths the training speakers' feature parameters using the adapted mean vectors and then re-estimates the HMM models. Scheme 3 uses MLLR to estimate a transformation matrix for each training speaker's feature parameters, transforms the features, and then re-estimates the HMM models. Experimental results show that the speaker-normalized models estimated by Scheme 2 are the most accurate: they raise the base-syllable recognition rate from 53.16% to 62.62%, which is higher than the rates obtained with the SBR (54.91%) and CMN (56.96%) methods.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT860435015 http://hdl.handle.net/11536/63034
Appears in Collections:	Thesis
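
The abstract describes adapting the SI HMM mean vectors with a transformation matrix estimated by MLLR. The following is a minimal sketch of that idea and not the thesis's implementation. It assumes a single global transform, hard frame-to-Gaussian alignments, and identical covariances for all Gaussians, under which the MLLR mean update reduces to an ordinary least-squares fit. The function names `estimate_mllr_transform` and `adapt_means` and the synthetic data are hypothetical, introduced only for illustration.

```python
import numpy as np


def estimate_mllr_transform(frames, assignments, means):
    """Estimate one global MLLR mean transform W of shape (d, d + 1).

    Simplifying assumptions (not taken from the thesis): each frame is
    hard-assigned to one Gaussian, and all Gaussians share the same
    covariance, so the maximum-likelihood solution is plain least squares.
    """
    # Extended mean vectors xi = [1, mu], one row per Gaussian.
    xi = np.hstack([np.ones((means.shape[0], 1)), means])
    # One extended mean per frame, chosen by the frame's assigned Gaussian.
    X = xi[assignments]
    # Solve frames ~= X @ W.T in the least-squares sense.
    W_T, *_ = np.linalg.lstsq(X, frames, rcond=None)
    return W_T.T


def adapt_means(means, W):
    """Apply mu_hat = W @ [1, mu] to every mean vector."""
    xi = np.hstack([np.ones((means.shape[0], 1)), means])
    return xi @ W.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    si_means = rng.normal(size=(10, 4))        # synthetic SI model means
    true_A = np.eye(4) + 0.05 * rng.normal(size=(4, 4))
    true_b = rng.normal(size=4)                # synthetic speaker offset
    align = np.arange(200) % 10                # synthetic hard alignment
    # Synthetic adaptation data: shifted means plus a little noise.
    data = si_means[align] @ true_A.T + true_b
    data += 0.01 * rng.normal(size=data.shape)

    W = estimate_mllr_transform(data, align, si_means)
    adapted = adapt_means(si_means, W)
    target = si_means @ true_A.T + true_b
    print("max abs error of adapted means:", np.abs(adapted - target).max())
```

In this sketch the first column of `W` plays the role of the speaker bias and the remaining columns form the linear part of the transform. A full MLLR estimate would weight each frame by its state occupation probability and by the Gaussian covariances rather than using a plain least-squares fit.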