Title: | 以信心值量測為基礎之麥克風陣列強健性語音辨識技術 (Confidence Measure Based Dual-Microphone Robust Speech Recognition) |
Authors: | Chuang, Meng-Kuei (莊孟魁); Lin, Bao-Shuh (林寶樹); Chang, Sen-Chia (張森嘉); Institute of Multimedia Engineering |
Keywords: | Interaural Time Difference; Microphone Array; Speech Enhancement; Speech Recognition |
Issue Date: | 2010 |
Abstract: | In recent years, dual-microphone noise suppression has attracted growing attention because of its significant speech enhancement performance. This thesis develops a fast, automatic procedure, based on a confidence measure (CM), for selecting the interaural time difference (ITD) range used in dual-microphone robust speech recognition, with the goal of improving recognition accuracy. First, we verify the relationship between the maximum confidence measure (MCM) and the recognition result. Second, to reduce the computational load of the system, we cluster the speech models with the K-means algorithm, which reduces the number of model classes. Finally, we evaluate the method on a toy remote-control car voice command corpus from the Industrial Technology Research Institute. With noise located at 30 degrees or 60 degrees and an input signal-to-noise ratio (SNR) of 0 dB, recognition accuracy improves from 10% to about 90%. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079857534 http://hdl.handle.net/11536/48456 |
Appears in Collections: | Thesis |
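
The abstract describes choosing an ITD range automatically from a confidence measure. The record gives no implementation details, so the following is only a minimal sketch of one plausible reading: try several candidate ITD thresholds, run recognition on each masked input, and keep the threshold whose result has the maximum confidence. The names `itd_mask`, `select_itd_threshold`, and `toy_recognize`, the per-bin ITD values, and the assumption that the target talker sits near broadside (ITD close to zero) are all hypothetical and are not taken from the thesis.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def itd_mask(bin_itds: Sequence[float], threshold: float) -> List[int]:
    """Binary mask: 1 keeps a time-frequency bin, 0 discards it.

    Assumption (not from the thesis): the target talker is near broadside,
    so target-dominated bins have an ITD magnitude close to zero.
    """
    return [1 if abs(itd) <= threshold else 0 for itd in bin_itds]


def select_itd_threshold(
    bin_itds: Sequence[float],
    candidates: Sequence[float],
    recognize: Callable[[List[int]], Tuple[str, float]],
) -> Tuple[float, str, float]:
    """Return (threshold, label, confidence) for the candidate threshold
    whose recognition result has the highest confidence."""
    best: Optional[Tuple[float, str, float]] = None
    for threshold in candidates:
        mask = itd_mask(bin_itds, threshold)
        label, confidence = recognize(mask)
        if best is None or confidence > best[2]:
            best = (threshold, label, confidence)
    if best is None:
        raise ValueError("candidates must not be empty")
    return best


def toy_recognize(mask: List[int]) -> Tuple[str, float]:
    """Hypothetical stand-in for a recognizer that reports a confidence.

    Confidence peaks when exactly three bins are kept; a real system would
    score the enhanced speech against its acoustic models instead.
    """
    kept = sum(mask)
    return ("forward", 1.0 / (1 + abs(kept - 3)))


# Made-up per-bin ITD estimates in milliseconds.
itds = [0.00, 0.05, -0.10, 0.40, -0.55, 0.60]
print(select_itd_threshold(itds, [0.02, 0.1, 0.5, 1.0], toy_recognize))
```

An exhaustive scan over candidates keeps the sketch simple. Because each candidate costs one recognition pass, reducing the number of model classes (as the abstract does with K-means) would directly shorten this loop.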