Title: A Study on Discriminative Training for Speech Recognition (語音辨認鑑別式訓練之研究)
Author: Chang, Pao-Chung (張保忠)
Chen, Sin-Horng (陳信宏); Juang, Biing-Hwang (莊炳湟)
Institute of Electronics (電子研究所)
Keywords: discriminative training; speech recognition; generalized probabilistic descent (GPD)
Issue Date: 1992
Abstract: This thesis discusses and analyzes several discriminative training methods and uses them to improve conventional dynamic-programming (DTW-based) speech recognizers. A linear discriminant function is first introduced as the recognition score, replacing the average of the distortion sequence that a conventional DTW-based recognizer uses as the distance between patterns. Several discriminative training methods are used to train this linear discriminant function, and experimental results show that the approach significantly improves on the conventional recognizer. The generalized probabilistic descent (GPD) algorithm gave the best result, raising the recognition rate from 67.6% to 78.1% on a speaker-independent test of the English E-set. Because of this superior discriminative power, the GPD algorithm is then applied to the same system to adjust both the weighting functions and the reference templates. A new minimum recognition error formulation for applying the GPD algorithm is derived, and a series of experiments examines the characteristics of GPD discriminative training. The recognition rate was further improved to 84.4%. Most importantly, the experimental results also verify that the GPD algorithm with the new formulation converges to a solution consistent with the objective of minimum recognition error rate. Finally, GPD discriminative training is applied to a Mandarin recognition system for 408 syllables that uses sub-word (sub-syllable) units as its basic speech units. Experimental results confirm that GPD discriminative training remains effective in improving a large-vocabulary, sub-word-based speech recognition system.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT810430004
http://hdl.handle.net/11536/56860
Appears in Collections: Thesis