Title: 使用基頻資訊之國語分散式語音辨識系統
The Mandarin Distributed Speech Recognition System Using Pitch Information
Authors: 魯柏暄
Bo-Xuan Lu
Yih-Ru Wang
Keywords: distributed speech recognition; pitch detector; DSR; pitch detection
Publication date: 2004
Abstract: This thesis evaluates Mandarin speech recognition under the ETSI ES 202 212 XAFE standard for distributed speech recognition (DSR). Two sets of experiments were carried out: Mandarin digit-string recognition and Mandarin large-vocabulary continuous speech recognition. The experiments first showed that the performance of the ETSI pitch detector degrades severely when the SNR of the speech signal falls below 10 dB. As a result, at low SNR a Mandarin recognizer that uses pitch information performs worse than one that does not. A minor modification to the ETSI pitch detection algorithm is therefore proposed to improve pitch detection at low SNR. In addition, integrating the recognition scores of the recognizers with and without pitch information improves Mandarin recognition in noisy environments at most SNR levels. The final system achieves a recognition rate of 86.8% on Mandarin digit strings, and syllable and character recognition rates of 65.3% and 45.4%, respectively, on Mandarin large-vocabulary continuous speech.
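The abstract does not say how the scores of the two recognizers are integrated. As a minimal sketch only, assuming each recognizer returns a log score per candidate hypothesis and that the scores are combined by linear interpolation, the idea could look like the following. The function names, the `weight` parameter, and the example scores are all hypothetical and are not taken from the thesis.

```python
def fuse_scores(scores_with_pitch, scores_without_pitch, weight=0.5):
    """Linearly interpolate per-hypothesis log scores from two recognizers.

    Assumption: both arguments map a hypothesis string to a log score.
    `weight` is the (hypothetical) trust placed in the pitch-based recognizer.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fused = {}
    # Only hypotheses scored by both recognizers can be combined.
    for hyp in scores_with_pitch.keys() & scores_without_pitch.keys():
        fused[hyp] = (weight * scores_with_pitch[hyp]
                      + (1.0 - weight) * scores_without_pitch[hyp])
    return fused


def best_hypothesis(fused):
    """Return the hypothesis with the highest fused score."""
    return max(fused, key=fused.get)


# Made-up scores for two candidate digit strings (illustration only).
with_pitch = {"1 2 3": -10.0, "1 2 4": -12.0}
without_pitch = {"1 2 3": -14.0, "1 2 4": -11.0}
result = best_hypothesis(fuse_scores(with_pitch, without_pitch, weight=0.5))
```

In a scheme like this, the weight could be lowered at low SNR, where the abstract reports that the pitch detector becomes unreliable, so that the recognizer without pitch information dominates the decision.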


  1. 362801.pdf
