Title: 中文詞語辨識系統之設計
On the Design of Mandarin Spoken Word Recognition
Authors: 曹維嵩
Wei-Sung Tsao
劉啟民
Chi-Min Liu
Institute of Computer Science and Engineering
Keywords: 中文; 詞語; 辨識; 連音; Mandarin; Spoken Word; Recognition; Coarticulation
Issue Date: 1994
Abstract: Mandarin speech recognition systems have usually adopted
the syllable as the recognition unit, but the confusable sets
among syllables make a high recognition rate difficult to reach.
In Mandarin, the word is the basic semantic unit, and words are
far less confusable and homophonous than single syllables, so
using spoken words as recognition units is a feasible way to
avoid the confusable-set problem. The main difficulty in spoken
word recognition is the coarticulation caused by continuous
pronunciation. This thesis studies Mandarin spoken word
recognition on a vocabulary of 500 personal names. As a baseline,
we use context-independent INITIALs and FINALs with continuous
density hidden Markov models (CDHMM) and one mixture per state,
which gives a speaker-dependent system with a 94.2% recognition
rate. We first increase the number of mixtures per state and
then apply embedded training, which raises the
speaker-dependent recognition rate to 99.4%. We also build
INITIAL models that depend on their left and right contexts;
with embedded training, this gives a speaker-dependent
recognition rate of 99.2%. For the speaker-independent system,
context-dependent INITIALs and context-independent FINALs with
embedded training give the best recognition rate, 96.6%.
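
The difference between context-independent and context-dependent INITIAL units can be illustrated with a minimal sketch. It expands a word, given as (INITIAL, FINAL) pairs, into the sequence of model names that would be concatenated to form a word model. The abstract states only that INITIAL models depend on left and right context; the choice of the preceding FINAL as left context, the `sil` boundary symbol, the `left-unit+right` label format, and the function names are all assumptions made for illustration, not the thesis's actual scheme.

```python
from typing import List, Tuple

# A syllable is represented as an (INITIAL, FINAL) pair, e.g. ("l", "iu").
Syllable = Tuple[str, str]


def context_independent_units(word: List[Syllable]) -> List[str]:
    """One model per INITIAL and per FINAL, regardless of neighbours."""
    units: List[str] = []
    for initial, final in word:
        units.extend([initial, final])
    return units


def context_dependent_units(word: List[Syllable], boundary: str = "sil") -> List[str]:
    """INITIALs are labelled with their left and right context; FINALs stay
    context-independent.

    Assumption: the left context is the preceding FINAL (or a boundary symbol
    at the start of the word) and the right context is the following FINAL.
    """
    units: List[str] = []
    left = boundary
    for initial, final in word:
        units.append(f"{left}-{initial}+{final}")  # context-dependent INITIAL
        units.append(final)                        # context-independent FINAL
        left = final
    return units


# Hypothetical three-syllable entry from a name vocabulary.
word = [("l", "iu"), ("q", "i"), ("m", "in")]
ci_units = context_independent_units(word)
cd_units = context_dependent_units(word)
print(ci_units)
print(cd_units)
```

Under these assumptions the same INITIAL receives a different model whenever its neighbouring FINALs differ, which is one way for the models to absorb coarticulation at syllable boundaries, at the cost of a larger model inventory.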
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT830392065
http://hdl.handle.net/11536/58990
Appears in Collections: Thesis