Time-Frequency Masking Based on Spatial Cues for Binaural Reverberant Source Separation and Noise Reduction

Full metadata record

DC Field	Value	Language
dc.contributor.author	蕭方濟	en_US
dc.contributor.author	Hsiao Fang-Chi	en_US
dc.contributor.author	冀泰石	en_US
dc.date.accessioned	2014-12-12T02:44:30Z	-
dc.date.available	2014-12-12T02:44:30Z	-
dc.date.issued	2014	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT070160235	en_US
dc.identifier.uri	http://hdl.handle.net/11536/75950	-
dc.description.abstract	本論文中，我們從混合聲源的音訊頻譜中萃取出空間線索，如雙耳間的能量差與時間差，並藉由分類混合音訊頻譜上的時頻單元重建回目標聲源的頻譜。然而，聲音頻率的高低會影響時間差與能量差在空間定位上的鑑別度，所以本論文根據聽覺感知，在不同的頻率範圍分別選用鑑別度較高的資訊。本論文中所使用到的空間資訊有聲源在空間的角度、能量差以及雙耳間的頻譜所構成的混合向量，並利用雙耳間的一致性以及雜訊語句功率頻譜密度與雜訊功率頻譜密度的比值來判別時頻單元的可靠性，之後將可靠時頻單元上的空間線索利用最大期望演算法將它們做分類後建構出目標聲源的遮罩，並在目標聲源遮罩中對可靠性較低的時頻單元給定一個常數，之後利用濾波器組來平滑化目標聲源遮罩。最後，我們利用訊號對失真的能量比值(Signal-to-Distortion ratio, SDR)與聲源分離的感知評分(Overall Perceptual Score, OPS)來評比分離出的目標聲源效果，主客觀的實驗結果均顯示我們提出的方法較文獻[29]上的方法有較佳的聲源分離結果。	zh_TW
dc.description.abstract	In this thesis, we extract spatial cues, such as interaural level differences (ILDs) and interaural time differences (ITDs), from the mixture spectrograms and reconstruct the spectrogram of the target source by classifying the time-frequency (T-F) units of the mixture and assigning them to the target source. Because the discriminability of ITDs and ILDs for sound localization depends on frequency, we select the more discriminative cues in each frequency range based on auditory perception. The spatial cues used in this thesis are the source angles derived from ITDs, the ILDs, and the mixing vectors formed from the binaural spectra. The interaural coherence (IC) and the ratio of the noisy-speech power spectral density to the estimated noise power spectral density are used to determine whether a T-F unit is reliable. We then apply the expectation-maximization (EM) algorithm to the spatial cues of the reliable T-F units to classify them and construct the mask of the target source, and we set the mask values of the unreliable T-F units to a constant. The resulting mask is smoothed with a gammatone filterbank. Subjective tests and objective scores, namely the signal-to-distortion ratio (SDR) and the overall perceptual score (OPS), both show that the proposed method separates sources better than the state-of-the-art method [29].	en_US
dc.language.iso	zh_TW	en_US
dc.subject	聲源分離	zh_TW
dc.subject	雙耳線索	zh_TW
dc.subject	EM 演算法	zh_TW
dc.subject	統計模型	zh_TW
dc.subject	source separation	en_US
dc.subject	binaural cues	en_US
dc.subject	EM algorithm	en_US
dc.subject	statistical modeling	en_US
dc.title	以空間線索為根據的時頻遮罩應用於雙耳回響聲源分離與雜訊消除	zh_TW
dc.title	Time-Frequency Masking Based on Spatial Cues for Binaural Reverberant Source Separation and Noise Reduction	en_US
dc.type	Thesis	en_US
dc.contributor.department	電信工程研究所	zh_TW
Appears in Collections:	Thesis