Title: 語音辨認鑑別式訓練之研究
A Study on Discriminative Training for Speech Recognition
Authors: 張保忠
Chang, Pao-Chung
陳信宏;莊炳湟
Chen, Sin-Horng;Juang, Biing-Hwang
Institute of Electronics
Keywords: discriminative training; speech recognition; generalized probabilistic descent (GPD)
Issue Date: 1992
Abstract: This thesis discusses and analyzes several
discriminative training methods and uses them to improve
conventional dynamic programming (DTW-based) speech recognizers.
First, a linear discriminant function is proposed as the
recognition score, replacing the simple average of the
distortion sequence that a conventional DTW-based recognizer
uses to measure the distance between patterns. Several
discriminative training methods are used to train this linear
discriminant function. As expected, experimental results show
that the proposed approach significantly improves the
recognition rate of the conventional system. The generalized
probabilistic descent (GPD) algorithm gives the best result: on
a speaker-independent test of English E-set data, the
recognition rate rises from 67.6% to 78.1%. Because of this
strong discriminative ability, the GPD algorithm is then
applied to the same DTW-based system to adjust both the
weighting functions and the reference templates, and a new
minimum recognition error formulation for applying the GPD
algorithm is derived. A series of experiments was conducted to
examine the characteristics and capability of GPD
discriminative training, and the recognition rate was further
improved to 84.4%. Most importantly, the experimental results
also verify that the GPD algorithm with the new minimum
recognition error formulation does converge to a solution
consistent with the objective of minimum recognition error
rate. Finally, GPD discriminative training is applied to a
Mandarin 408-syllable recognition system that uses sub-syllable
(sub-word) units as its basic speech units. Experimental
results confirm that GPD discriminative training remains
effective in improving the performance of a large-vocabulary,
sub-word-based speech recognition system.
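
To illustrate the kind of training the abstract describes, the following is a minimal sketch and not the formulation derived in the thesis. It assumes that DTW alignment has already produced, for each class j, a fixed-length vector D[j] of per-frame distortions between the input and that class's reference template. It also assumes a simplified misclassification measure (true-class score minus best-competitor score) with a sigmoid loss. The names score, classify, gpd_step, step and gamma are hypothetical, and the numbers are toy values invented for the example. Uniform weights reproduce the conventional average-distortion score; the update is a gradient step on the smoothed error.

```python
import math

def score(weights, distortions):
    """Linear discriminant score: weighted sum of per-frame distortions (lower is better)."""
    return sum(w * d for w, d in zip(weights, distortions))

def classify(W, D):
    """Return the index of the class with the smallest score."""
    scores = [score(W[j], D[j]) for j in range(len(W))]
    return min(range(len(W)), key=scores.__getitem__)

def gpd_step(W, D, label, step=0.02, gamma=1.0):
    """One gradient step on a sigmoid-smoothed error for a single training token.

    Assumed simplification: only the true class and its best competitor are updated.
    Returns the loss computed before the update.
    """
    scores = [score(W[j], D[j]) for j in range(len(W))]
    rival = min((j for j in range(len(W)) if j != label), key=scores.__getitem__)
    d = scores[label] - scores[rival]            # d > 0 means the token is misrecognized
    loss = 1.0 / (1.0 + math.exp(-gamma * d))    # smooth stand-in for the 0/1 error count
    slope = gamma * loss * (1.0 - loss)          # derivative of the sigmoid with respect to d
    # Lower the true-class score and raise the competitor's score.
    W[label] = [w - step * slope * x for w, x in zip(W[label], D[label])]
    W[rival] = [w + step * slope * x for w, x in zip(W[rival], D[rival])]
    return loss

# Toy token of class 0: one noisy frame inflates its average distortion.
D = [[1.0, 1.0, 4.0], [2.0, 2.0, 1.0]]
W = [[1.0 / 3.0] * 3 for _ in range(2)]          # uniform weights = plain averaging

before = classify(W, D)                          # class 1: wrong under plain averaging
losses = [gpd_step(W, D, label=0) for _ in range(10)]
after = classify(W, D)                           # class 0 once the weights have adapted
print(before, after)
```

In this toy case the update shrinks the weight on the noisy third frame of class 0 fastest, because the gradient is proportional to the frame distortion. A full system would iterate over all training tokens and, as the abstract notes, could also adjust the reference templates.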
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT810430004
http://hdl.handle.net/11536/56860
Appears in Collections: Thesis