Title: Robust Speaker Identification Systems against Additive Noise and Reverberation
Authors: 黃紘斌
Huang, Hung-Pin
冀泰石
Chi, Tai-Shih
Institute of Communications Engineering
Keywords: Speaker Identification; Spectro-temporal Modulation; Reverberation; Additive Noise
Issue Date: 2011
Abstract: Conventional speaker identification systems usually use MFCCs as features. Because MFCCs carry only low-level speech information, identification performance degrades sharply under interference such as additive noise and reverberation. Higher-level features extracted by spectro-temporal modulation filtering, in contrast, tend to be less accurate in clean conditions but are generally considered more robust in noisy environments. In this thesis, speech is first passed through an early-stage auditory model. Then, based on how environmental noise alters speech, 9 two-dimensional spectro-temporal modulation filters are selected and applied to the auditory spectrogram, yielding nine feature vectors that are less susceptible to environmental noise. These features are applied to speaker identification under various types of noise. Experimental results show that, in reverberant conditions, the proposed features outperform MFCCs once T60 exceeds 0.6. Under additive noise, they give a much higher identification rate than MFCCs at all SNR conditions, and they also outperform ANTCCs at low SNR.
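The abstract reports results across SNR conditions but does not say how the noisy test signals were produced. As a minimal sketch of one common way to build such test material, the following scales a noise signal so that the mixture reaches a requested SNR. The function names `mix_at_snr` and `measured_snr_db`, the toy tone used as "speech", and the Gaussian noise are all assumptions for illustration and are not taken from the thesis.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus scaled noise so the mixture has the requested SNR in dB.

    Hypothetical helper for illustration; not the thesis's procedure.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10*log10(P_speech / (g^2 * P_noise)); solve for the gain g.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Compute the SNR of a mixture, given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixture)]
    p_speech = sum(s * s for s in speech)
    p_noise = sum(r * r for r in residual)
    return 10 * math.log10(p_speech / p_noise)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-ins (assumed): a 200 Hz tone at 8 kHz for speech, Gaussian noise.
    speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]
    for target in (20, 10, 0):
        noisy = mix_at_snr(speech, noise, target)
        print(target, round(measured_snr_db(speech, noisy), 6))
```

The gain is derived from average signal power, so the same helper works for any noise recording of matching length. Reverberant conditions would instead need convolution with a room impulse response, which this sketch does not cover.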
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079813560
http://hdl.handle.net/11536/47043
Appears in Collections: Theses