Title: | Speaker-Independent English Vowel Recognition Technique Based on Acoustic-Phonetics and Fuzzy Neural Networks |
Authors: | Ying-Shih Hung (洪英士); Chi-Cheng Jou (周志成); Chin-Teng Lin (林進燈); Institute of Electrical and Control Engineering |
Keywords: | speech recognition; spectrum analysis; speaker-independent; acoustic-phonetic; fuzzy neural network |
Issue Date: | 2003 |
Abstract: | This thesis proposes a novel speaker-independent English vowel recognition technique based on acoustic-phonetics and fuzzy neural networks. First, we propose a new feature set called Acoustic-Enhanced Discrete Cosine Series Coefficients (AE-DCSC). It applies findings from acoustic-phonetic research on English vowels to enhance the spectrum, so that the features become more representative and discriminative. Spectrum-level normalization balances the amplitude differences between formants, since linguistic research indicates that formant positions matter more than formant heights. Enhancement of spectral peaks suppresses the small spectral variations in the valleys between harmonics, which makes the spectrum more robust and less sensitive to noise. To preserve how the vowel spectrum changes over time within a limited feature dimension, we adopt the discrete cosine series coefficients (DCSC) technique; its adjustable frequency and time warping scales let us select the most representative features according to the properties of the signal. A feedforward, on-line self-constructing neural fuzzy inference network (SONFIN) serves as the main classifier. SONFIN constructs and tunes its own structure and parameters automatically and, through its fuzzy neural inference process, achieves better classification results. Finally, we propose a verification procedure based on acoustic-phonetic features: for ambiguous recognition results, the acoustic-phonetic characteristics are extracted and compared with the models in a previously built knowledge base to find the most credible result.
Experiments on the TIMIT database show that the proposed system reaches a recognition accuracy of 74.75%, which is higher than other published results for the same task. This demonstrates the potential and effectiveness of the proposed system. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009112533 http://hdl.handle.net/11536/44868 |
Appears in Collections: | Thesis |
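
The abstract describes keeping the time-varying vowel spectrum in a small number of cosine-series coefficients. The sketch below illustrates only that general idea, under loud assumptions: it uses plain, unwarped cosine bases over frequency and then over time, and it omits the frequency/time warping, spectrum-level normalization, and peak enhancement that the thesis describes. The function names `cosine_basis` and `dct_series_features` and the default term counts are hypothetical, not taken from the thesis.

```python
import math


def cosine_basis(n_points, n_terms):
    """Return n_terms cosine basis rows sampled at n_points mid-points.

    Assumption: a plain DCT-II style basis with no warping.
    """
    return [
        [math.cos(math.pi * k * (n + 0.5) / n_points) for n in range(n_points)]
        for k in range(n_terms)
    ]


def dct_series_features(log_spec, n_freq_terms=3, n_time_terms=2):
    """Compress a log-magnitude spectrogram into a short feature vector.

    log_spec: list of frames, each frame a list of log-magnitude bins
              covering one vowel segment (hypothetical input layout).
    Returns n_time_terms * n_freq_terms coefficients, time-major.
    """
    n_frames = len(log_spec)
    n_bins = len(log_spec[0])
    freq_basis = cosine_basis(n_bins, n_freq_terms)
    time_basis = cosine_basis(n_frames, n_time_terms)

    # Step 1: expand each frame over frequency (spectral shape per frame).
    per_frame = [
        [sum(b * x for b, x in zip(row, frame)) / n_bins for row in freq_basis]
        for frame in log_spec
    ]

    # Step 2: expand each spectral coefficient's trajectory over time.
    features = []
    for t_row in time_basis:
        for j in range(n_freq_terms):
            total = sum(t_row[i] * per_frame[i][j] for i in range(n_frames))
            features.append(total / n_frames)
    return features
```

With the defaults, a segment of any length is reduced to six numbers: the zeroth time term averages each spectral coefficient over the segment, and the first time term captures its overall trend. A vector of this kind is what would be passed to a classifier such as the SONFIN mentioned in the abstract.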