Title: | 語音辨識中語者調適方法之研究 (The Study of Speaker Adaptation for Speech Recognition) |
Authors: | 謝宗儒 (Zong-Ru Hsieh); 傅心家; Institute of Computer Science and Engineering |
Keywords: | Speech Recognition; Speaker Adaptation |
Issue Date: | 2004 |
Abstract: | Among speaker adaptation methods for speech recognition systems built on hidden Markov models (HMMs), Maximum Likelihood Linear Regression (MLLR) is effective and fast. To adjust the regression classes flexibly so that adaptation parameters can be shared, MLLR uses a regression class tree to define which of the HMM parameters to be adapted belong to the same regression class and therefore share the same adaptation transformation. However, the number of classes in this tree has to be chosen by hand, and a good choice requires experience. To address this, we use the Bayesian Information Criterion (BIC) and propose a top-down binary splitting algorithm for building the regression class tree. The algorithm determines the number of classes automatically, without manual intervention, and our experiments indicate that the number it selects is appropriate. We also propose a bottom-up binary merging algorithm that starts from the result of top-down binary splitting and builds a regression class tree that better represents how the data are distributed in space, which in turn improves recognition performance. Finally, we apply the proposed methods in a speech recognition system on a handheld device that recognizes the user's spoken input; the system uses a distributed computing architecture to realize large-vocabulary speech recognition. In tests with multiple users, the system with speaker adaptation reached a correct rate of 90.09% and an accuracy of 87.21%. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009217533 http://hdl.handle.net/11536/73312 |
Appears in Collections: | Thesis |