Title: | 以聲韻母為基礎之國語連續音辨認之改進 An improvement of initial-final based Mandarin continuous speech recognition |
Authors: | 蔣松茂 S. M. Chiang; 陳信宏 S. H. Chen; Institute of Communications Engineering |
Keywords: | Syllable insertion penalty; Bounded state duration HMM (BSD-HMM); Finite state machine (FSM) |
Issue Date: | 1994 |
Abstract: | This thesis proposes several techniques to improve the initial-final based HMM method for continuous Mandarin speech recognition. The work has two parts. In the first part, the baseline system uses 100 right-context-dependent initial HMM models and 39 context-independent final HMM models. Bounded state duration HMMs (BSD-HMMs) are employed to model the temporal structure of speech signals, which standard HMMs capture poorly, and are incorporated into the recognition process. A syllable insertion penalty is then used to reduce the high rate of insertion errors, and signal normalization is applied to improve the system further. For speaker-independent recognition, the recognizer exploits inter-speaker differences by classifying speakers and using gender-dependent HMM models, which raises recognition accuracy. The effectiveness of these proposals was confirmed by simulations on a speaker-independent task of recognizing continuous Mandarin speech over a telephone channel: the syllable recognition rate rose from 30.86% to 42.14%. In the second part, a recurrent neural network (RNN) pre-segments the input signal into four states (initial, final, silence, and transient), and the segmentation result is used to build a finite state machine (FSM) that is merged into the continuous speech recognition framework. State-dependent constraints then restrict the search for the optimal path, reducing the computational load of the one-stage recognition process. Experimental results show that about half of the computation can be saved with only a very minor loss in recognition rate. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT830436031 http://hdl.handle.net/11536/59386 |
Appears in Collections: | Thesis |