Title: 應用盲訊號分離法於語音訊號分離之研究
A Study on Applying Source Separation Algorithms to Audio Signal Separation
Authors: 黃靖雯
冀泰石
Institute of Communications Engineering
Keywords: Independent Component Analysis; Audio Signal Separation; Time-Frequency Masking
Issue Date: 2011
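Abstract: This thesis examines how well blind source separation methods separate speech, focusing on convolutive mixtures whose impulse responses are Head-Related Transfer Functions (HRTFs). This setup simulates what a listener's left and right ears receive when sound sources arrive from different directions, and the separation exploits the differences between the two binaural mixtures. Three methods are tested. The first is an image separation algorithm that treats the spectrograms of the mixtures as images; because image edge maps are sparse, it takes maximizing the sparsity of the edge map as its objective and uses FastICA to compute the initial condition of the iteration. The second method brings in sparse component analysis: a nonlinear projection transforms the edge signals of the images into sparser signals as a pre-processing step for the image separation algorithm. The third method splits the spectrogram into frequency bands and exploits the sparsity of the mixtures in each band; it uses a nonlinear function to compute each source's contribution to the mixture and extracts the sources by nonlinear masking. Experimental results show that all three methods separate the sources well when the two sources are located on the left and right sides, respectively. When both sources come from the same side, the proposed third method still separates them successfully, and comparisons with other existing algorithms show that its separation performance is good and stable.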
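The binaural convolutive mixing described above can be sketched as follows. This is a minimal illustration, not the thesis's simulation code: the function name `mix_binaural`, the array layout, and the toy impulse responses are all assumptions made here, and the toy responses are simple delay-and-attenuate filters rather than measured HRTFs.

```python
import numpy as np

def mix_binaural(sources, hrirs):
    """Convolve each source with its left/right impulse response and sum per ear.

    sources: array of shape (n_sources, n_samples)
    hrirs:   array of shape (n_sources, 2, n_taps); index 0 = left ear, 1 = right ear
    returns: array of shape (2, n_samples + n_taps - 1)
    """
    sources = np.asarray(sources, dtype=float)
    hrirs = np.asarray(hrirs, dtype=float)
    n_sources, n_samples = sources.shape
    n_taps = hrirs.shape[2]
    ears = np.zeros((2, n_samples + n_taps - 1))
    for s in range(n_sources):
        for ear in range(2):
            # Convolutive mixing: each ear hears every source through its own filter.
            ears[ear] += np.convolve(sources[s], hrirs[s, ear])
    return ears

# Toy data (hypothetical): two noise sources, one on each side of the head.
rng = np.random.default_rng(0)
sources = rng.standard_normal((2, 1000))

# Placeholder impulse responses, NOT measured HRTFs: the near ear gets the
# direct signal, the far ear gets a delayed, attenuated copy.
toy_hrirs = np.zeros((2, 2, 8))
toy_hrirs[0, 0, 0] = 1.0   # source 0 -> left ear (near)
toy_hrirs[0, 1, 3] = 0.5   # source 0 -> right ear (far)
toy_hrirs[1, 0, 3] = 0.5   # source 1 -> left ear (far)
toy_hrirs[1, 1, 0] = 1.0   # source 1 -> right ear (near)

mixtures = mix_binaural(sources, toy_hrirs)
```

Replacing `toy_hrirs` with measured head-related impulse responses for the desired source directions would give mixtures closer to the setup the abstract describes.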
In this thesis, we propose audio separation algorithms derived from blind source separation algorithms. Head-Related Transfer Functions (HRTFs) are used to simulate the convolutive mixing of audio sources and thereby model the sound mixtures perceived by listeners. Three methods are proposed and investigated. The first is an image separation method named ISBS, which treats audio spectrograms as images. Its criterion is to maximize the sparsity of the edges of the signals, and the FastICA algorithm is used to initialize its iteration. The second method adopts a nonlinear projection as pre-processing for the ISBS algorithm; the projection transforms the edges into a sparser domain. The third method considers the frequency bin-wise mixtures and exploits sparsity in the time-frequency (T-F) domain. For each T-F unit, the contribution of each source is calculated with a nonlinear function. A masking matrix is formed from these contributions and used to extract the original sounds. Simulation results show that all three methods perform well when the sound sources are located far apart from each other. When the sources are close to each other, only the proposed third method performs well. Comparison with some conventional methods also shows that the third method performs better and is more robust in most cases.
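The masking idea in the third method can be sketched as below. The abstract does not specify the nonlinear function the thesis uses, so this example substitutes an assumption: a Gaussian score on the inter-channel log level ratio, normalized across sources to form soft masks. The function name `contribution_masks`, the parameters `source_ratios` and `sigma`, and the toy instantaneous mixing with disjoint time-frequency support are all hypothetical choices for illustration.

```python
import numpy as np

def contribution_masks(left_tf, right_tf, source_ratios, sigma=0.2, eps=1e-12):
    """Soft masks from an inter-channel level cue (illustrative assumption).

    left_tf, right_tf: complex arrays (n_freq, n_frames), the two mixture spectrograms
    source_ratios:     expected |right| / |left| magnitude ratio for each source
    returns:           masks of shape (n_sources, n_freq, n_frames)
    """
    log_ratio = np.log((np.abs(right_tf) + eps) / (np.abs(left_tf) + eps))
    targets = np.log(np.asarray(source_ratios, dtype=float))[:, None, None]
    # Nonlinear (Gaussian) score: large when the observed cue matches
    # the cue expected for a given source.
    scores = np.exp(-((log_ratio[None, :, :] - targets) ** 2) / (2.0 * sigma ** 2))
    # Normalize over sources so each T-F unit is shared among the sources.
    return scores / (scores.sum(axis=0, keepdims=True) + eps)

# Toy data (hypothetical): every T-F unit is occupied by exactly one source.
rng = np.random.default_rng(0)
shape = (16, 20)
owner = rng.integers(0, 2, size=shape)

def random_tf():
    # Complex T-F values with magnitude in [0.5, 1.5) and random phase.
    return (0.5 + rng.random(shape)) * np.exp(2j * np.pi * rng.random(shape))

s0 = random_tf() * (owner == 0)
s1 = random_tf() * (owner == 1)

# Simplified per-bin mixing: source 0 is louder on the left, source 1 on the right.
left = s0 + 0.5 * s1
right = 0.5 * s0 + s1

masks = contribution_masks(left, right, source_ratios=[0.5, 2.0])
est0 = masks[0] * left    # source 0 recovered from its near-ear mixture
est1 = masks[1] * right   # source 1 recovered from its near-ear mixture
```

The toy sources are exactly disjoint in the T-F plane, which is the idealized form of the sparsity the method relies on; real speech overlaps partially, so the masks would then only approximate the sources.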
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079813520
http://hdl.handle.net/11536/47006
Appears in Collections: Thesis


Files in This Item:

  1. 352001.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.