Confidence Measure Based Dual-Microphone Robust Speech Recognition

Title:	Confidence Measure Based Dual-Microphone Robust Speech Recognition (以信心值量測為基礎之麥克風陣列強健性語音辨識技術)
Authors:	Chuang, Meng-Kuei (莊孟魁); Lin, Bao-Shuh (林寶樹); Chang, Sen-Chia (張森嘉); Institute of Multimedia Engineering (多媒體工程研究所)
Keywords:	Interaural Time Difference; Microphone Array; Speech Enhancement; Speech Recognition
Issue Date:	2010
Abstract:	In recent years, dual-microphone noise suppression has attracted growing attention because of its significant speech enhancement performance. This thesis develops a fast, automatic procedure, based on a confidence measure (CM), for selecting the interaural time difference (ITD) range used in dual-microphone robust speech recognition, with the aim of improving recognition accuracy. First, we verify the relationship between the maximum confidence measure (MCM) and the recognition result. Second, to reduce the computational load of the system, we cluster the speech models with the K-means algorithm, which simplifies the models and reduces the number of model classes. Finally, we test the method on a toy remote-control car voice command corpus from the Industrial Technology Research Institute. With the noise source located at 30 degrees or 60 degrees and an input signal-to-noise ratio (SNR) of 0 dB, the recognition accuracy improves from 10% to about 90%.
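The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the confidence measure or the recognizer interface, so everything here is an assumption: the confidence measure is taken to be the margin between the best and second-best command scores, `score_fn` is a hypothetical callback that returns per-command scores after enhancement with a given ITD threshold, and the thresholds and scores in `TOY_SCORES` are made-up numbers, not results from the thesis.

```python
from typing import Callable, Dict, Iterable, Tuple


def confidence(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return (best_command, margin).

    ASSUMPTION: the confidence measure is the gap between the top score
    and the runner-up score; the thesis may define it differently.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two candidate commands")
    (best_label, best_score), (_, second_score) = ranked[0], ranked[1]
    return best_label, best_score - second_score


def select_itd_threshold(
    candidates: Iterable[float],
    score_fn: Callable[[float], Dict[str, float]],
) -> Tuple[float, str, float]:
    """Try each candidate ITD threshold and keep the one whose recognition
    result has the maximum confidence measure.

    `score_fn(threshold)` is a hypothetical hook: it should enhance the
    utterance with that threshold and return a score for every command.
    """
    best = None
    for threshold in candidates:
        label, cm = confidence(score_fn(threshold))
        if best is None or cm > best[2]:
            best = (threshold, label, cm)
    if best is None:
        raise ValueError("no candidate thresholds given")
    return best


# Made-up scores for three hypothetical thresholds (illustration only).
TOY_SCORES = {
    0.1: {"forward": -50.0, "back": -52.0, "stop": -55.0},
    0.2: {"forward": -40.0, "back": -49.0, "stop": -51.0},
    0.4: {"forward": -47.0, "back": -46.0, "stop": -50.0},
}

threshold, command, cm = select_itd_threshold(TOY_SCORES, TOY_SCORES.__getitem__)
```

In this toy data the middle threshold gives the widest margin between the top two commands, so it is the one selected. Clustering the speech models beforehand, as the abstract describes, would reduce the number of scores each `score_fn` call has to compute.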
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT079857534 http://hdl.handle.net/11536/48456
Appears in Collections:	Thesis