Full metadata record
DC Field | Value | Language
dc.contributor.author | 蕭方濟 | en_US
dc.contributor.author | Hsiao Fang-Chi | en_US
dc.contributor.author | 冀泰石 | en_US
dc.date.accessioned | 2014-12-12T02:44:30Z | -
dc.date.available | 2014-12-12T02:44:30Z | -
dc.date.issued | 2014 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT070160235 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/75950 | -
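dc.description.abstract | In this thesis, we extract spatial cues, such as interaural level differences and interaural time differences, from the spectrograms of mixed sound sources, and reconstruct the spectrogram of the target source by classifying the time-frequency units of the mixture spectrogram. However, sound frequency affects how well time differences and level differences discriminate spatial location, so, following auditory perception, this thesis selects the more discriminative cue in each frequency range. The spatial information used in this thesis comprises the angle of the source in space, the level difference, and the mixing vector formed from the spectra at the two ears. The reliability of each time-frequency unit is judged from the interaural coherence and from the ratio of the noisy-speech power spectral density to the noise power spectral density. The spatial cues of the reliable time-frequency units are then classified with the expectation-maximization algorithm to construct a mask for the target source; time-frequency units of low reliability are assigned a constant in the target mask, and the mask is then smoothed with a filterbank. Finally, we evaluate the separated target source with the signal-to-distortion ratio (SDR) and the overall perceptual score (OPS). Both subjective and objective experimental results show that the proposed method separates sources better than the method in [29]. | zh_TW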
dc.description.abstract | In this thesis, we extract spatial cues, such as interaural level differences (ILDs) and interaural time differences (ITDs), from the mixture spectrograms and reconstruct a spectrogram for the target source by classifying the time-frequency (T-F) units of the mixture spectrograms and assigning them to the target source. However, the frequency of a sound affects how effective ITDs and ILDs are for localizing it. Hence, we select appropriate cues within different frequency ranges based on hearing perception. The sound angles derived from ITDs, the ILDs, and the mixing vectors are used as the spatial cues in this thesis. The interaural coherence (IC) and the power ratio of noisy speech to estimated noise are used to determine whether a T-F unit is reliable. After selecting the reliable T-F units, we employ the expectation-maximization (EM) algorithm to obtain the mask of the target source. The mask values of unreliable T-F units are set to a constant. We then apply a gammatone filterbank to the derived target mask to obtain the smoothed mask. Subjective tests and objective scores, namely the signal-to-distortion ratio (SDR) and the overall perceptual score (OPS), demonstrate that the proposed method outperforms the state-of-the-art method [29] in segregating sounds. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 聲源分離 (source separation) | zh_TW
dc.subject | 雙耳線索 (binaural cues) | zh_TW
dc.subject | EM 演算法 (EM algorithm) | zh_TW
dc.subject | 統計模型 (statistical model) | zh_TW
dc.subject | source separation | en_US
dc.subject | binaural cues | en_US
dc.subject | EM algorithm | en_US
dc.subject | statistical modeling | en_US
dc.title | 以空間線索為根據的時頻遮罩應用於雙耳回響聲源分離與雜訊消除 | zh_TW
dc.title | Time-Frequency Masking Based on Spatial Cues for Binaural Reverberant Source Separation and Noise Reduction | en_US
dc.type | Thesis | en_US
dc.contributor.department | 電信工程研究所 (Institute of Communications Engineering) | zh_TW
Appears in Collections: Thesis
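
The processing chain outlined in the abstract (cue extraction per T-F unit, EM classification of the reliable units, a constant mask value for the unreliable ones) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the one-dimensional two-component Gaussian mixture, the variance floor, and the fill value of 0.5 are all assumptions made for illustration, and the reliability test and the gammatone smoothing step are left out.

```python
import cmath
import math


def spatial_cues(left, right, freq_hz):
    """Return (ILD in dB, ITD in seconds) for one T-F unit.

    left, right: complex STFT coefficients at the two ears (hypothetical inputs).
    The ITD is derived from the wrapped interaural phase difference, so it is
    only unambiguous while 2*pi*freq_hz*|ITD| stays below pi.
    """
    eps = 1e-12  # guards against log(0) and division by zero
    ild_db = 20.0 * math.log10((abs(left) + eps) / (abs(right) + eps))
    ipd = cmath.phase(left * right.conjugate())  # phase(left) - phase(right), wrapped
    itd_s = ipd / (2.0 * math.pi * freq_hz)
    return ild_db, itd_s


def gaussian(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_sources(cues, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture to the cues with EM.

    Returns (means, posteriors), where posteriors[i][k] is the probability
    that cue i belongs to component k.
    """
    means = [min(cues), max(cues)]  # assumed initialisation
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each cue
        post = []
        for x in cues:
            p = [weights[k] * gaussian(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) + 1e-300
            post.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in post) + 1e-12
            means[k] = sum(r[k] * x for r, x in zip(post, cues)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(post, cues)) / nk
            variances[k] = max(var, 1e-3)  # assumed variance floor
            weights[k] = nk / len(cues)
    return means, post


def target_mask(cues, reliable, target=0, fill=0.5):
    """Soft mask: EM posterior on reliable units, the constant `fill` elsewhere."""
    reliable_cues = [x for x, ok in zip(cues, reliable) if ok]
    means, post = em_two_sources(reliable_cues)
    it = iter(post)
    mask = [next(it)[target] if ok else fill for ok in reliable]
    return means, mask
```

In this sketch the cue passed to `target_mask` could be the ILD at high frequencies and an ITD-derived value at low frequencies, in the spirit of the frequency-dependent cue selection the abstract describes; how the thesis actually combines the cues is not stated in this record.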