Full metadata record
DC Field | Value | Language
dc.contributor.author | 黃奕鈞 | zh_TW
dc.contributor.author | 冀泰石 | zh_TW
dc.contributor.author | Huang, Yi-Chun | en_US
dc.contributor.author | Chi, Tai-Shih | en_US
dc.date.accessioned | 2018-01-24T07:37:27Z | -
dc.date.available | 2018-01-24T07:37:27Z | -
dc.date.issued | 2016 | en_US
dc.identifier.uri | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070351901 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/139110 | -
dc.description.abstract | 非負矩陣分解是一種熱門的聲音聲源分離工具,它可以從聲源頻譜中習得頻譜字典,並且利用這些習得之字典將混合訊號加以分離。然而,標準的非負矩陣分解在學習的過程中並沒有考慮聲源內的時間特性。而非負矩陣分解類似生成模型的特性,使得它無法保證具有良好代表性的頻譜字典對於聲音聲源分離有幫助。此外,字典的學習也應該被劃分為數個子區塊以處理聲音訊號的不同時頻特性,例如不同語者的語音訊號,或者音樂訊號中的不同樂器。因此,我們提出的方法結合數種非負矩陣分解的延伸方法以解決上述問題,應用於語音降噪與歌聲及背景音樂分離。在時間特性建模部分,我們使用一套向量自回歸模型的後處理方法;在子區塊劃分方面,則引進一套局部基底學習方法。我們也引進了一套修改過後的鑑別式學習程序,用以解決代表性與分離效能之問題。總而言之,我們基於非負矩陣分解的延伸方法考慮了局部的時間特性以及模型對不同聲源的鑑別能力。 | zh_TW
dc.description.abstract | Nonnegative matrix factorization (NMF), which learns dictionaries from source spectra and uses the learned dictionaries to decompose the mixture in the test phase, is a widely used tool for audio source separation. However, standard NMF does not consider the temporal properties of the signals when learning dictionaries. Standard NMF is also a generative model, which does not guarantee that a good representation model is also a good separation model. In addition, the learned dictionaries should be partitioned into subgroups to account for sources with different spectro-temporal properties, such as speech signals from different speakers or music signals from different instruments. Therefore, we propose a method that combines extensions of NMF to address these problems for speech denoising and singing voice separation. For temporal modeling, our method adopts a post-filtering technique, which derives a source-specific vector autoregressive (VAR) model to smooth the NMF coefficients in the test phase. For partitioning, we use the mixture of local dictionaries (MLD) technique to divide dictionaries into subgroups by considering intra- and inter-group distances. We also introduce a modified discriminative learning procedure to deal with the representation-separation problem. In summary, our extended NMF method additionally accounts for the temporal properties of each subgroup and the discrimination between sources. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 單聲道聲源分離 | zh_TW
dc.subject | 局部保留 | zh_TW
dc.subject | 歌聲分離 | zh_TW
dc.subject | 語音消噪 | zh_TW
dc.subject | 鑑別式學習 | zh_TW
dc.subject | 非負矩陣分解 | zh_TW
dc.subject | Monaural audio source separation | en_US
dc.subject | Local preserving | en_US
dc.subject | Singing voice separation | en_US
dc.subject | Speech denoising | en_US
dc.subject | Discriminative learning | en_US
dc.subject | Nonnegative matrix factorization | en_US
dc.title | 使用鑑別式動態非負矩陣分解之單聲道聲源分離 | zh_TW
dc.title | Discriminative and Dynamic Nonnegative Matrix Factorization on Monaural Audio Source Separation | en_US
dc.type | Thesis | en_US
dc.contributor.department | 工學院聲音與音樂創意科技碩士學位學程 | zh_TW
Appears in Collections: Thesis