Title: | Tone Recognition in Continuous Mandarin Speech (中文連續語音之聲調辨認) |
Authors: | Yih-Ru Wang (王逸如); Sin-Horng Chen (陳信宏); Institute of Electronics (電子研究所) |
Keywords: | Coarticulation effect; Declination effect; Prosody; Hidden Markov model; Neural network |
Issue Date: | 1994 |
Abstract: | This dissertation studies tone recognition in continuous Mandarin speech. The tone characteristics of Mandarin speech are in general not determined solely by lexical tonality; they are also affected by other factors. The main concern here is how the coarticulation, declination, and prosodic effects of continuous speech shape the pitch-period and energy contours, with the aim of raising the tone recognition rate. Three tone recognition approaches are studied. First, an HMM-based approach is discussed: a two-level context-dependent semi-continuous hidden Markov model is used to overcome the coarticulation and declination effects. Several schemes that account for the effects of coarticulation and utterance intonation on tone recognition are proposed, and their effectiveness was examined by simulation on a multi-speaker (speaker-independent) database, where a recognition rate of 86.62% was achieved. Second, neural-network-based approaches are studied. To account for coarticulation, a context-dependent neural network tone recognizer that includes features from neighboring syllables is proposed first. An upper-level Markov model is then added, and the hidden control neural network (HCNN) and hidden state multi-layer perceptron (HSMLP) are used to compensate for the declination effect. Experiments confirm that the results are better than those of the two-level context-dependent semi-continuous HMM, with a recognition rate of 86.72% on a larger speaker-independent database. Last, an approach that uses a prosodic model to assist tone recognition is studied. A simple recurrent neural network (SRNN) is employed to model the prosody of an utterance; experimental observation shows that, given suitable speech parameters as input, the SRNN can describe the prosodic state of continuous Mandarin speech. Using the outputs of the SRNN's hidden nodes to assist the context-dependent MLP tone recognizer improved the recognition rate from 91.38% to 93.10% on a single-male-speaker (speaker-dependent) database. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT830430026 http://hdl.handle.net/11536/59210 |
Appears in Collections: | Thesis |