Title: 基於深度信念網路之國語單音辨識
Mandarin Chinese Phoneme Recognition Based on Deep Belief Networks
Authors: 洪非凡
Hung, Fei-Fan
王聖智
簡鳳村
Wang, Sheng-Jyh
Chien, Feng-Tsun
Department of Electronics Engineering and Institute of Electronics
Keywords: Deep Learning; Deep Belief Networks; Mandarin Chinese Phoneme Recognition
Issue Date: 2015
Abstract: In this thesis, we apply Deep Belief Networks (DBNs), a deep learning architecture, to Mandarin Chinese phoneme recognition. Every Chinese character is articulated as a single syllable, and Mandarin has four main tones. Because the same syllable pronounced with different tones can carry different meanings, tones play an essential role in Mandarin speech recognition, unlike in English and other non-tonal languages. We use DBNs to learn features of Mandarin syllables and tones automatically and to perform recognition with them. DBNs have three properties that suit this task. First, a DBN stacks many non-linear hidden layers, and deeper non-linear networks can model more complex problems. Second, DBNs are generatively pre-trained in an unsupervised way, which both learns useful features from the data and helps avoid the vanishing-gradient problem encountered in conventional deep neural network training. Third, after pre-training, the model can be fine-tuned with the target labels of the training data so that the learned features are linked to the classes. With these properties, we expect DBNs to learn a suitable mapping between phonetic features and targets, to distinguish tonal variation automatically, and thereby to recognize Mandarin Chinese phonemes more accurately.
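The pre-training stage described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes binary (Bernoulli) units, one-step contrastive divergence (CD-1), and per-example updates, and the names `RBM`, `pretrain_dbn`, and `extract_features` are hypothetical. Real speech features are usually real-valued and would need a different visible-unit model, and the supervised fine-tuning stage is omitted here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Bernoulli-Bernoulli restricted Boltzmann machine (illustrative assumption)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights; W[i][j] connects visible unit i to hidden unit j.
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr):
        """One CD-1 step on a single example (no labels are used)."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)   # reconstruction
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                # Data-driven statistics minus reconstruction-driven statistics.
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, hidden_sizes, epochs=5, lr=0.1, seed=0):
    """Greedy layer-wise pre-training: each RBM is trained on the
    hidden activations produced by the layer below it."""
    rng = random.Random(seed)
    layers, inputs = [], data
    for n_hidden in hidden_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v, lr)
        layers.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return layers


def extract_features(layers, v):
    """Propagate one input vector up through the pre-trained stack."""
    for rbm in layers:
        v = rbm.hidden_probs(v)
    return v
```

In a full system, the stacked weights would initialize a feed-forward network with an added output layer, which would then be fine-tuned with backpropagation against the phoneme labels.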
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070150203
http://hdl.handle.net/11536/126152
Appears in Collections: Thesis