Title: Voice Activity Detection Based on Harmonic Structure in Keyword Listening Application (應用於關鍵字監聽之諧波結構語音活動偵測演算法)
Authors: Chien, Yu-Hsuan (簡佑軒)
Hu, Jwu-Sheng (胡竹生)
Institute of Electrical and Control Engineering
Keywords: harmonic structure; harmonic; Voice Activity Detection; VAD
Issue Date: 2014
Abstract: This thesis proposes a voice activity detection (VAD) algorithm for continuous keyword listening that uses harmonic structure as its speech feature. Harmonic structure is the periodic distribution of energy along the frequency axis. The algorithm searches the spectrum for regions where the harmonic structure is locally prominent and treats them as the speech feature; the VAD decision then checks whether that harmonic structure is continuous along the time axis. The algorithm is tested on several types of non-stationary pure noise and on keywords at different SNRs, and is analyzed in terms of speech hit rate and non-speech hit rate. Compared with published methods such as G.729, long-term spectral divergence (LTSD), and Gaussian mixture models (GMM), the experimental results show that the proposed algorithm has advantages.
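The two stages named in the abstract, a per-frame harmonic feature followed by a temporal continuity check, can be illustrated with a minimal sketch. This is not the thesis's implementation: the comb-based score, the function names `harmonic_score` and `continuity_vad`, the 80 to 400 Hz pitch search range, the 4 kHz band limit, the threshold, and the minimum run length are all assumptions chosen only for illustration.

```python
import numpy as np


def harmonic_score(frame, sample_rate, f0_range=(80.0, 400.0), max_freq=4000.0):
    """Crude harmonicity score (an assumed stand-in for the thesis's feature).

    For each candidate fundamental, average the magnitude spectrum at the
    bins nearest its harmonics and divide by the average level of the band.
    A spectrum with periodic peaks along the frequency axis scores high.
    """
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    band = spectrum[freqs <= max_freq]
    mean_level = band.mean() + 1e-12  # avoid division by zero on silence
    bin_width = sample_rate / len(frame)

    best = 0.0
    for f0 in np.arange(f0_range[0], f0_range[1], 5.0):  # 5 Hz grid (assumed)
        harmonics = np.arange(f0, max_freq, f0)
        bins = np.round(harmonics / bin_width).astype(int)
        bins = bins[bins < len(band)]
        best = max(best, band[bins].mean() / mean_level)
    return best


def continuity_vad(scores, threshold=3.0, min_run=5):
    """Label frames as speech only inside runs of at least `min_run`
    consecutive frames whose score exceeds `threshold` (both assumed values).
    """
    flags = np.asarray(scores) > threshold
    decision = np.zeros(len(flags), dtype=bool)
    start = None
    # A trailing False closes any run that reaches the last frame.
    for i, flag in enumerate(np.append(flags, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                decision[start:i] = True
            start = None
    return decision
```

In use, one would split the signal into frames, compute `harmonic_score` per frame, and pass the resulting sequence to `continuity_vad`. Requiring a minimum run length is one simple way to reject isolated frames in which non-stationary noise happens to look harmonic.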
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070160050
http://hdl.handle.net/11536/76483
Appears in Collections: Theses