Title: 中文詞語辨識系統之設計
On the Design of Mandarin Spoken Word Recognition
Authors: 曹維嵩
Wei-Sung Tsao
Chi-Min Liu
Keywords: 中文;詞語;辨識;連音;Mandarin;Spoken Word;Recognition;coarticulation
Issue Date: 1994
Abstract: 在中文,詞是基本的語意單位,其混淆性、同音性皆比單一音節顯著降低
,因此,以詞語(spoken word) 為辨識單位的系統不失為一種可行的方法
。詞語辨識最大的困難就是因連續發音而產生的連音 (coarticulation)
independent聲韻母建立連續型隱藏式馬可夫模型 (CDHMM),在混合數為
嵌入訓練法(embedded training) 時,在特定語者系統可提高辨識率
至99.4% 。另外我們也嘗試針對前後文不同去建立聲母模型,再加上嵌入
訓練法時,在特定語者系統可得到99.2% 的辨識率。在不限語者系統,針
為96.6% 。
For Mandarin speech recognition, the syllable has usually been
adopted as the recognition unit. However, confusable sets among
syllables make a high recognition rate difficult to achieve. In
Mandarin, the word is the basic semantic unit, so using
spoken words as recognition units is a good candidate
for avoiding the confusable-set problem. In this thesis, we
consider Mandarin spoken word recognition based on a 500-word
vocabulary. Coarticulation effects are the main issue to be
considered in spoken word recognition. For comparison, we use
one-mixture context-independent INITIALs and FINALs to create a
speaker-dependent system with a 94.2% recognition rate, based on
the continuous density hidden Markov model. We first increase
the mixture number to improve the recognition rate. Then we
employ embedded training to obtain a 99.4% recognition rate in
the speaker-dependent system. We also try to create INITIAL models
that depend on the left and right context. When applying
embedded training to these models, we obtain a speaker-dependent
system with a 99.2% recognition rate. For the speaker-independent
system, we adopt context-dependent INITIALs and context-independent
FINALs, and apply embedded training to get a recognition rate
of 96.6%.
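
As an illustration of the unit inventory described above, the following minimal sketch expands a word into context-dependent INITIAL units and context-independent FINAL units. It is not taken from the thesis: the function name `expand_units`, the `sil` boundary label, the choice of the preceding FINAL as left context, and the `left-unit+right` naming (borrowed from common triphone notation) are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# A syllable is (INITIAL, FINAL); INITIAL is None for null-initial syllables.
Syllable = Tuple[Optional[str], str]

SIL = "sil"  # hypothetical label for the silence before the first syllable


def expand_units(word: List[Syllable]) -> List[str]:
    """Map a word to a sequence of model names.

    Assumption: the left context of an INITIAL is the FINAL of the
    preceding syllable (or silence at the word start), and its right
    context is the FINAL of its own syllable. FINALs stay
    context-independent, so they keep their plain names.
    """
    units: List[str] = []
    left = SIL
    for initial, final in word:
        if initial is not None:
            # Context-dependent INITIAL, written as left-INITIAL+right.
            units.append(f"{left}-{initial}+{final}")
        # Context-independent FINAL.
        units.append(final)
        left = final
    return units


if __name__ == "__main__":
    # Two-syllable example word given as (INITIAL, FINAL) pairs.
    print(expand_units([("b", "ei"), ("j", "ing")]))
```

Under these assumptions the same INITIAL gets a separate model for each distinct pair of neighbouring FINALs, which is one way to absorb coarticulation across syllable boundaries inside a word.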
Appears in Collections: Thesis