Title: | A Study on Speaker Identification (语者辨别的研究) |
Authors: | Chan, Tzu-Chieh (詹子杰); Chen, Ling-Hwei (陈玲慧); Institute of Multimedia Engineering |
Keywords: | speaker identification; Gaussian mixture model; Mel-frequency cepstral coefficients (MFCC) |
Date issued: | 2011 |
Abstract: | In recent years, biometric authentication systems have come into wide use in everyday life, for example in smartphones, laptops, and access control systems. Because the voice is one of the most natural, economical, and expressive forms of human behavior, it is a suitable characteristic for such systems. However, differences in recording devices and recording environments lower the identification rate of voice-based authentication systems; we refer to these device and environment influences as channel effects. In this thesis, we propose a new method for removing channel effects. Starting from the widely used Mel-scale frequency cepstral coefficients (MFCC) features, we apply our channel effect remover to obtain new features. These new features are then used with Gaussian mixture models (GMMs) to identify the speaker. Experimental results show that our method achieves a higher identification rate than other methods. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079957506 http://hdl.handle.net/11536/50585 |
Appears in collections: | Thesis |