System design of a WFST-based Mandarin speech recognizer (基於加權有限狀態轉換器國語語音辨認系統之設計)

Title:	基於加權有限狀態轉換器國語語音辨認系統之設計 System design of a WFST-based Mandarin speech recognizer
Authors:	蘇仲銘 Su, Chung-Ming; 王逸如 Wang, Yih-Ru; Institute of Communications Engineering (電信工程研究所)
Keywords:	基於加權有限狀態轉換器國語語音辨認系統之設計; System design of a WFST-based Mandarin speech recognizer
Issue Date:	2013
Abstract:	This thesis focuses on improving the language model of an automatic speech recognition (ASR) system. The training corpus is normalized by merging synonyms, variant word forms, and words with alternative pronunciations. Lexicon entries are selected by part-of-speech category: the threshold for admitting open-class words into the lexicon is raised and the threshold for closed-class words is lowered, and the uniformity of each word's distribution across the training corpus is also taken into account. The resulting language model is then evaluated by syllable decoding. Experiments show that, at the same recognition performance, the weighted finite-state transducer (WFST) recognizer is nearly 20 times faster than the traditional recognition system. The thesis therefore examines how to build a Mandarin large-vocabulary continuous speech recognition system with WFSTs. It first introduces the relevant WFST algorithms, shows how the speech models at different levels are represented as finite-state machine graphs, and applies optimization to reduce the paths of the finite-state machine. It then adjusts the parameters used to build the language model, varies the size of the WFST, and varies the parameters used during recognition, and investigates how these factors relate to recognition rate and recognition speed.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT070060275 http://hdl.handle.net/11536/73686
Appears in Collections:	Thesis

Files in This Item:

027501.pdf
