Title: 中文詞語辨識系統之設計
On the Design of Mandarin Spoken Word Recognition
Authors: 曹維嵩
Wei-Sung Tsao
劉啟民
Chi-Min Liu
Institute of Computer Science and Engineering
Keywords: Mandarin; Spoken Word; Recognition; coarticulation
Issue Date: 1994
Abstract: Syllables have usually been adopted as the recognition unit in Mandarin speech recognition, but the confusable sets among syllables make high recognition rates hard to reach. In Mandarin, the word is the basic semantic unit, and words are markedly less confusable and less homophonous than single syllables, so using spoken words as the recognition unit is a feasible alternative. The main difficulty in spoken word recognition is coarticulation caused by continuous pronunciation. This thesis studies Mandarin spoken word recognition on a vocabulary of 500 personal names. As a baseline, context-independent INITIAL and FINAL models are built as continuous density hidden Markov models (CDHMM); with one mixture per state, the speaker-dependent system reaches a recognition rate of 94.2%. Increasing the number of mixtures per state and then applying embedded training raises the speaker-dependent recognition rate to 99.4%. We also build INITIAL models that depend on their left and right context; combined with embedded training, this gives a speaker-dependent recognition rate of 99.2%. For the speaker-independent system, context-dependent INITIALs with context-independent FINALs and embedded training give the best recognition rate, 96.6%.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT830392065
http://hdl.handle.net/11536/58990
Appears in Collections: Thesis