Title: | 在加成杂讯及回音下的强健性语者辨识系统 Robust Speaker Identification Systems against Additive Noise and Reverberation |
Authors: | 黄纮斌 Huang, Hung-Pin; 冀泰石 Chi, Tai-Shih; Institute of Communications Engineering |
Keywords: | Speaker Identification; Spectro-temporal Modulation; Reverberation; Additive Noise |
Publication date: | 2011 |
Abstract: | Conventional speaker identification systems typically use MFCCs as features. Because MFCCs capture only low-level speech information, their performance degrades severely under interference such as additive noise or reverberation (convolutional noise). In contrast, higher-level features extracted by spectro-temporal modulation filtering are generally regarded as less accurate in clean conditions but more robust in noisy environments. In this thesis, speech is first passed through an early-stage model of the human auditory system. Based on how environmental noise alters speech, nine two-dimensional spectro-temporal modulation filters are then selected and applied to the auditory spectrogram, yielding nine sets of feature vectors that are less susceptible to environmental noise. These features are used for speaker identification under various types of noise. Experimental results show that, in reverberant conditions, the proposed features outperform MFCCs once T60 exceeds 0.6; under additive noise, they achieve much higher identification rates than MFCCs across all SNR conditions; and compared with ANTCCs, they perform better at low SNR. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079813560 http://hdl.handle.net/11536/47043 |
Appears in collections: | Thesis |