Title: Multi-Speaker Speech Recognition in Noise Based on the Two-Dimensional Cepstrum
GA-based Noisy Speech Recognition using Two Dimensional Cepstrum
Authors: 胡景淵
Hwu, Jiing-Yuan
林進燈
Lin Chin-Teng
Institute of Electrical and Control Engineering
Keywords: Two dimensional cepstrum; noise; neural network; Genetic Algorithm
Issue Date: 1996
Abstract: Many kinds of parameters are used for speech feature extraction, and the two-dimensional cepstrum (TDC) is one of them. The TDC captures both the local characteristics and the global variation of a speech signal in a single coefficient matrix, so it represents static and dynamic features as well as global and fine frequency structure at the same time. Analysis of clean speech indicates that the coefficients in the low-index region of the TDC matrix are the most significant. An utterance can therefore be represented by a single feature vector built from a small subset of TDC coefficients, rather than by a sequence of feature vectors, which reduces both computation and storage. However, our experiments show that the TDC is quite sensitive to background noise, and its recognition rate drops sharply in noisy conditions. To address this problem, this thesis proposes the GA-based modified TDC (M_TDC) method. In the feature extraction phase, a high-pass temporal filter is applied along the frame direction to remove noise components, and genetic algorithms (GAs) are then used to search the resulting M_TDC matrix for noise-robust coefficients. The method is evaluated on Mandarin digit recognition using speech from ten speakers and five noise types. The results show that the GA-based M_TDC achieves better recognition results than the TDC in noisy environments. In Appendix A, the GA-based M_TDC is combined with a neural network to further improve the recognition rate in noise. Experiments with five noise types show that the combined method achieves better recognition results than the GA-based M_TDC alone. The neural network used is the Self-cOnstructing Neural Fuzzy Inference Network (SONFIN).
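The feature extraction summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, frame length, hop size, and the number of retained coefficients are all hypothetical, and a first-order difference along the frame axis stands in for the high-pass temporal filter, whose actual design the abstract does not give. The GA-based coefficient search is not shown; the sketch simply keeps the low-index block of the matrix, which is the baseline selection the abstract describes for clean speech.

```python
import numpy as np


def tdc_matrix(signal, frame_len=256, hop=128, temporal_highpass=False):
    """Return a 2-D cepstrum matrix of shape (n_frames, frame_len).

    All parameter names and default values are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.ndim != 1 or len(signal) < frame_len:
        raise ValueError("signal must be 1-D and at least frame_len samples long")

    # Slice the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)

    # Per-frame log-magnitude spectrum, then an inverse FFT along the
    # frequency axis: each row becomes the real cepstrum of one frame.
    log_spec = np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-10)
    cepstra = np.fft.irfft(log_spec, n=frame_len, axis=1)

    if temporal_highpass:
        # ASSUMPTION: a first-order difference along the frame axis is used
        # here as a simple high-pass temporal filter; the filter used in the
        # thesis is not specified in the abstract.
        cepstra = np.diff(cepstra, axis=0, prepend=cepstra[:1])

    # A DFT along the frame axis turns the cepstral trajectories into the
    # second (temporal) dimension of the two-dimensional cepstrum.
    return np.fft.fft(cepstra, axis=0)


def tdc_feature_vector(signal, n_time=4, n_quef=8, **kwargs):
    """Build one fixed-length feature vector per utterance.

    Keeps the magnitudes of the low-index n_time x n_quef block of the
    matrix (an assumed selection rule, not the GA-selected set).
    """
    tdc = tdc_matrix(signal, **kwargs)
    if n_time > tdc.shape[0] or n_quef > tdc.shape[1]:
        raise ValueError("requested block is larger than the TDC matrix")
    return np.abs(tdc[:n_time, :n_quef]).ravel()
```

Because the block size is fixed, utterances of different lengths map to feature vectors of the same length, which is the storage and computation advantage the abstract points to. With `temporal_highpass=True`, any component of the log spectrum that is constant across frames (for example a fixed gain) cancels in the difference.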
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT850327049
http://hdl.handle.net/11536/61706
Appears in Collections: Thesis