Title: 語者辨別的研究
A Study on Speaker Identification
Authors: 詹子杰
Chan, Tzu-Chieh
陳玲慧
Chen, Ling-Hwei
Institute of Multimedia Engineering
Keywords: speaker identification; Gaussian mixture model; Mel-scale frequency cepstral coefficients (MFCC)
Issue Date: 2011
Abstract: In recent years, biometric-based authentication systems have come into wide use in daily life, for example in smartphones, laptops, and access control systems. As the most natural, economical, and expressive human behavior, the voice is a suitable characteristic for such systems. However, speech recorded with different devices or in noisy environments lowers the identification rate of voice-based authentication; we refer to these device and environment influences as channel effects. In this thesis, we propose a new channel effect remover to improve the identification rate. Starting from the widely used Mel-scale frequency cepstral coefficient (MFCC) features, we apply the channel effect remover to obtain new features. The speaker is then identified from these new features using Gaussian mixture models (GMMs). Experimental results show that our method achieves a higher identification rate than other methods.
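The pipeline described in the abstract (channel compensation on cepstral features, then one GMM per speaker) can be illustrated with a small sketch. The abstract does not describe the thesis's own channel effect remover, so this example substitutes cepstral mean subtraction (CMS), a standard baseline that relies on a fixed linear channel appearing as an approximately constant offset in the cepstral domain. Everything here is an assumption for illustration: the synthetic 2-D frames stand in for real MFCC vectors, and the names `cms`, `fit_diag_gmm`, `identify`, and `synth_utterance` are hypothetical.

```python
import numpy as np

def cms(features):
    """Cepstral mean subtraction: remove each coefficient's per-utterance mean.
    Stand-in baseline, NOT the channel effect remover proposed in the thesis."""
    return features - features.mean(axis=0, keepdims=True)

def component_log_probs(x, weights, means, variances):
    """Log of weight * diagonal Gaussian density, shape (n_frames, n_components)."""
    diff = x[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    return np.log(weights) + log_gauss

def fit_diag_gmm(x, k, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM with plain EM; returns (weights, means, variances)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    means = x[rng.choice(n, size=k, replace=False)]
    variances = np.tile(x.var(axis=0) + 1e-3, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame
        log_p = component_log_probs(x, weights, means, variances)
        resp = np.exp(log_p - np.logaddexp.reduce(log_p, axis=1, keepdims=True))
        # M-step: re-estimate weights, means, and (floored) variances
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ x) / nk[:, None]
        variances = np.maximum((resp.T @ x ** 2) / nk[:, None] - means ** 2, 1e-3)
    return weights, means, variances

def avg_log_likelihood(x, gmm):
    """Average per-frame log-likelihood of an utterance under one speaker model."""
    return np.logaddexp.reduce(component_log_probs(x, *gmm), axis=1).mean()

def identify(x, models):
    """Return the enrolled speaker whose GMM scores the utterance highest."""
    return max(models, key=lambda name: avg_log_likelihood(x, models[name]))

def synth_utterance(rng, axis, n_frames, channel):
    """Synthetic frames: bimodal along `axis` (speaker trait) plus a constant channel offset."""
    frames = rng.normal(0.0, 0.5, size=(n_frames, 2))
    frames[:, axis] += rng.choice([-2.0, 2.0], size=n_frames)
    return frames + channel

rng = np.random.default_rng(0)
# Enrollment data recorded over a "clean" channel (zero offset)
train = {"spk_a": synth_utterance(rng, 0, 400, np.zeros(2)),
         "spk_b": synth_utterance(rng, 1, 400, np.zeros(2))}
models = {name: fit_diag_gmm(cms(x), k=2) for name, x in train.items()}

# Test data recorded over a different channel (constant cepstral offset)
channel = np.array([5.0, -3.0])
test_a = synth_utterance(rng, 0, 200, channel)
test_b = synth_utterance(rng, 1, 200, channel)
pred_a = identify(cms(test_a), models)
pred_b = identify(cms(test_b), models)
print(pred_a, pred_b)
```

In a real system the synthetic frames would be replaced by MFCC vectors extracted from audio, and CMS by whichever channel compensation is under study; the enrollment and scoring steps would keep the same shape.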
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079957506
http://hdl.handle.net/11536/50585
Appears in Collections: Thesis


Files in This Item:

  1. 750601.pdf
