Title: | Channel Bias Analysis and Telephone-Speech Recognition with Mismatch Condition (通道偏移量分析以及不匹配環境下的電話語音辨認) |
Authors: | Yu-Fen Liao (廖于棻); Sin-Horing Chen (陳信宏); Institute of Communications Engineering (電信工程研究所) |
Keywords: | Speech Recognition; Channel Bias; Mismatch |
Issue Date: | 2001 |
Abstract: | This thesis first examines channel bias from several points of view. During hidden Markov model (HMM) training, estimating the channel bias from known HMM segmentation boundaries makes each HMM state more compact. The bias estimated by SBR is then compared with this HMM-based bias in a series of experiments. In parallel, we examine how utterance length (number of syllables) and the stable portions of speech affect the bias estimate, with the aim of using less speech data and so reducing the time needed to estimate the bias. The focus then shifts to mismatch between databases: the HMM models (and the SBR codebook) are trained on the MAT telephone-line database and tested on the ATC cellular-phone database provided by ITRI. We first find that CMN copes with this database mismatch better than SBR. Further analysis shows that the difference between the two databases lies in the distance between them, that is, in the offset between the mean vectors of their HMM states. We therefore design a recursive system, based on HMM segmentation boundaries, that compensates each ATC input utterance to match the MAT database. Compensating with the distance between the ATC and MAT centroids gives a recognition rate of 59.97%, slightly higher than the 58.42% obtained with the recursive system. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT900435034 http://hdl.handle.net/11536/68909 |
Appears in Collections: | Thesis |
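
The abstract mentions two ways of removing a channel offset: CMN, which normalizes each utterance by its own mean, and compensation by the distance between the centroids of the two databases. The sketch below illustrates both ideas on toy data. It is not the thesis's implementation: it assumes the channel bias is an additive constant vector in the feature domain, all function names and numbers are hypothetical, and the recursive HMM-segmentation-based estimator described in the abstract is not shown.

```python
from typing import List

Vector = List[float]
Utterance = List[Vector]  # one feature vector per frame


def mean_vector(frames: Utterance) -> Vector:
    """Average the feature vectors over a sequence of frames."""
    dims = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dims)]


def shift(frames: Utterance, bias: Vector) -> Utterance:
    """Subtract a constant bias vector from every frame."""
    return [[x - b for x, b in zip(f, bias)] for f in frames]


def cmn(frames: Utterance) -> Utterance:
    """Mean normalization: remove the utterance's own mean vector."""
    return shift(frames, mean_vector(frames))


def centroid_offset(train: List[Utterance], test: List[Utterance]) -> Vector:
    """Offset between database centroids (test centroid minus train centroid)."""
    train_c = mean_vector([f for u in train for f in u])
    test_c = mean_vector([f for u in test for f in u])
    return [t - r for t, r in zip(test_c, train_c)]


# Toy data (made up): the "test" database is the "train" database
# shifted by a constant offset in the first dimension.
train_db = [[[1.0, 2.0], [3.0, 4.0]]]
test_db = [[[2.5, 2.0], [4.5, 4.0]]]

offset = centroid_offset(train_db, test_db)
compensated = [shift(u, offset) for u in test_db]
normalized = cmn(test_db[0])
```

In this toy case the centroid offset recovers the injected shift exactly, so the compensated test utterance coincides with the training utterance. Real channel mismatch is rarely a single constant vector, which is one reason a per-utterance or recursive estimate may be preferred.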