Title: | 深層信念網路在雙耳語音分離及消除迴響上的應用 Application of Deep Belief Network on Binaural Speech Separation and Dereverberation |
Authors: | 陳奕廷 Chen, Yi-Ting; 冀泰石 Chi, Tai-Shih; Institute of Communications Engineering |
Keywords: | Binaural speech separation; Dereverberation; Deep Belief Network |
Issue Date: | 2015 |
Abstract: | Binaural speech separation and dereverberation are long-standing, active research topics. We previously attempted both tasks with an unsupervised learning (clustering) method applied in the spectral domain. In this thesis we instead adopt a supervised classification approach: a deep belief network (DBN) serves as the classifier, the ideal binary mask (IBM) of each time-frequency (T-F) unit is the training target, and the binaural spatial cues of each T-F unit, namely the interaural time difference (ITD) and the interaural level difference (ILD), are the training features. To improve dereverberation performance, interaural coherence (IC) information is incorporated when building the target IBM. Within this framework we examine three DBN architectures: side-by-side training, in which each ear is trained separately (monaural training); joint training of both ears (binaural training); and multitask learning (MTL), in which different reverberant environments are treated as separate training tasks in the hope of improving both separation and dereverberation. Finally, we compare the separation and dereverberation performance of these three architectures with that of our previously developed unsupervised clustering method, using several objective criteria. |
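The abstract names the ideal binary mask (IBM) as the training target and the ILD as one of the per-unit features. The sketch below illustrates the commonly used textbook definitions of both quantities, not the thesis's actual implementation. It assumes the per-unit energies of the target, the interference, and the left and right ear signals are already available as nested lists (frequency channel by time frame). The function names, the 0 dB local criterion `lc_db`, and the small energy floor are illustrative assumptions. How the thesis folds interaural coherence into the target IBM is not stated in this record, so that step is omitted.

```python
import math

EPS = 1e-12  # assumed floor to avoid log(0) and division by zero


def ideal_binary_mask(target_energy, interference_energy, lc_db=0.0):
    """Label each T-F unit 1 if its local SNR (in dB) exceeds lc_db, else 0.

    lc_db is a hypothetical local criterion; 0 dB is a common default,
    not a value taken from the thesis.
    """
    mask = []
    for t_row, i_row in zip(target_energy, interference_energy):
        row = []
        for t, i in zip(t_row, i_row):
            snr_db = 10.0 * math.log10((t + EPS) / (i + EPS))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def interaural_level_difference(left_energy, right_energy):
    """ILD in dB for each T-F unit: 10 * log10(left energy / right energy)."""
    return [
        [10.0 * math.log10((l + EPS) / (r + EPS)) for l, r in zip(l_row, r_row)]
        for l_row, r_row in zip(left_energy, right_energy)
    ]


if __name__ == "__main__":
    target = [[4.0, 1.0], [0.0, 9.0]]
    interference = [[1.0, 4.0], [1.0, 1.0]]
    print(ideal_binary_mask(target, interference))
    print(interaural_level_difference([[10.0, 1.0]], [[1.0, 1.0]]))
```

In a supervised setup such as the one described, masks of this kind would serve as classification labels and ILD values (together with ITD estimates) as input features for each T-F unit.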
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070260256 http://hdl.handle.net/11536/127140 |
Appears in Collections: | Thesis |