Title: The Use of Speaking Rate and Coarticulation Classifications in Continuous Mandarin Speech Recognition (使用說話速度與耦合效應分類之華語連續音節辨識)
Authors: Zhi-Hsin Tsao (曹智欣)
Sin-Horng Chen (陳信宏)
Institute of Communications Engineering
Keywords: speaking rate; coarticulation
Issue Date: 2002
Abstract: This thesis proposes two acoustic modeling methods for continuous Mandarin speech recognition, both based on classifying the speech signal and training separate initial and final recognition models for each class. The first method classifies speech by speaking rate into fast, normal, and slow classes and builds, for each class, a set of 100 right-context-dependent (RCD) initial and 40 context-independent (CI) final HMMs. The second method classifies final-initial pairs across syllable boundaries by their degree of coarticulation into highly coarticulated, normal, and non-coarticulated classes. It builds 219 contracted, integrated final-initial HMMs for the highly coarticulated class, and a separate set of 100 RCD initial and 40 CI final HMMs for each of the other two classes. Experiments on the MAT-4500 telephone-speech database show that both methods slightly improve the overall syllable recognition rate, with larger improvements for fast and slow speech and for the highly coarticulated and non-coarticulated classes. Both are therefore promising acoustic modeling methods.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT910435063
http://hdl.handle.net/11536/70596
Appears in Collections: Thesis