Monaural Speech Separation and Vocal/Music Separation Using Viterbi based Pitch Tracking

Full metadata record

DC Field	Value	Language
dc.contributor.author	林澤恩	en_US
dc.contributor.author	Lin, Tse-En	en_US
dc.contributor.author	冀泰石	en_US
dc.contributor.author	Chi, Tai-Shih	en_US
dc.date.accessioned	2014-12-12T01:27:54Z	-
dc.date.available	2014-12-12T01:27:54Z	-
dc.date.issued	2011	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT079613521	en_US
dc.identifier.uri	http://hdl.handle.net/11536/41960	-
dc.description.abstract	隨著智慧型手機的流行，語音和多媒體的應用因而蓬勃發展。對於語音來說，消除背景的噪音是重要的應用，由於噪音的存在使得語音的品質和辨識結果變差，如何將目標語音分離出來變成必要的課題。對多媒體來說，音樂檢索被認為是重要的議題之一，例如卡拉OK的歌詞同步系統。為了達到這樣的目的，善用歌聲的資訊是重要的研究方向。然而，當唱片在錄音間製作時，歌聲往往混著音樂伴奏，純歌聲通常是不可得的，因此同樣面臨著分離的問題。在本論文中，將會利用音高擷取作為語音分離和歌聲分離的基礎。語音分離的部份，將考慮語音和語音混合的情景，本研究提出兩個音高擷取的演算法。歌聲分離的部份，將利用歌聲音高的擷取作為歌聲分離的依據。在音高擷取的過程當中，音高追蹤法藉由音框間成本函數的設定，提高音高的精準度。分離的實驗顯示，對於語音分離，與近期常見的Hu-Wang演算法相比，本研究提出的演算法在男聲混女聲下在主觀和客觀的評比下有較好的結果，但是對於男聲混男聲下，Hu-Wang的演算法比較好，本研究也提出可能的原因和改善的方向。對於歌聲分離，與三個最近提出的演算法相比，本研究提出的演算法可以改善客觀評比下的效能。	zh_TW
dc.description.abstract	With smartphones now ubiquitous, demand for speech and multimedia applications is growing rapidly. For speech applications, noise reduction is among the most sought-after techniques, because noise sharply degrades both speech quality and speech recognition performance; separating the target speech from interference in a contaminated recording is therefore a popular research topic. For multimedia, music information retrieval is needed in many applications, for example synchronizing a singing voice with its lyrics. However, the singing voice is usually mixed with background music when albums are produced in the studio, so vocal/music separation is also in demand as a post-processing step. In this thesis, pitch is used as the basic feature for both speech separation and singing voice separation. For speech separation, the scenario of speech mixed with speech is considered, and an algorithm that extracts two pitch values is proposed. For singing voice separation, a system that extracts the singing voice is proposed. During pitch extraction, temporal pitch tracking is used to improve the accuracy of the pitch values estimated in each frame. Experimental results show that the proposed speech separation algorithm outperforms the Hu-Wang system on male-female speech mixtures under both objective and subjective performance measures, while the Hu-Wang system performs better on male-male mixtures. Experimental results also show that the proposed singing voice separation algorithm outperforms three recently proposed systems under an objective performance measure.	en_US
dc.language.iso	zh_TW	en_US
dc.subject	單通道語音分離	zh_TW
dc.subject	單通道歌聲分離	zh_TW
dc.subject	音高估計和追蹤	zh_TW
dc.subject	維特比演算法	zh_TW
dc.subject	音高個數	zh_TW
dc.subject	歌聲偵測	zh_TW
dc.subject	monaural speech separation	en_US
dc.subject	monaural vocal/music separation	en_US
dc.subject	pitch estimation and tracking	en_US
dc.subject	Viterbi algorithm	en_US
dc.subject	pitch number	en_US
dc.subject	singing voice detection	en_US
dc.title	音高追蹤法應用於單通道語音和歌聲分離	zh_TW
dc.title	Monaural Speech Separation and Vocal/Music Separation Using Viterbi based Pitch Tracking	en_US
dc.type	Thesis	en_US
dc.contributor.department	電信工程研究所	zh_TW
Appears in Collections:	Thesis
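
Illustrative note: the abstract describes temporal pitch tracking that uses a cost between frames, and the subject terms name the Viterbi algorithm. The sketch below shows the general idea only; it is not the thesis's implementation. The function name, the `jump_weight` parameter, and the octave-distance transition cost are all assumptions made for illustration.

```python
import math

def viterbi_pitch_track(candidates, local_costs, jump_weight=1.0):
    """Pick one pitch candidate per frame by minimising local + transition cost.

    candidates[t][i]  : i-th pitch candidate (Hz) in frame t
    local_costs[t][i] : cost of that candidate on its own (lower is better)
    jump_weight       : hypothetical weight on the frame-to-frame pitch jump
    """
    n_frames = len(candidates)
    # best[t][i] = minimal accumulated cost of any path ending at candidate i of frame t
    best = [list(local_costs[0])]
    back = [[-1] * len(candidates[0])]
    for t in range(1, n_frames):
        row_cost, row_back = [], []
        for i, f_cur in enumerate(candidates[t]):
            # Assumed transition cost: absolute pitch change measured in octaves.
            options = [
                best[t - 1][j] + jump_weight * abs(math.log2(f_cur / f_prev))
                for j, f_prev in enumerate(candidates[t - 1])
            ]
            j_best = min(range(len(options)), key=options.__getitem__)
            row_cost.append(options[j_best] + local_costs[t][i])
            row_back.append(j_best)
        best.append(row_cost)
        back.append(row_back)
    # Backtrack from the cheapest state in the last frame.
    i = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for t in range(n_frames - 1, -1, -1):
        path.append(candidates[t][i])
        i = back[t][i]
    return path[::-1]

# Toy data (made up): frame 2 has a tempting octave error at 410 Hz.
cands = [[200.0, 100.0], [205.0, 410.0], [210.0, 105.0]]
costs = [[0.2, 0.5], [0.4, 0.3], [0.1, 0.6]]
smooth_track = viterbi_pitch_track(cands, costs, jump_weight=1.0)
framewise_track = viterbi_pitch_track(cands, costs, jump_weight=0.0)
```

With the transition cost switched off (`jump_weight=0.0`) the tracker reduces to a per-frame choice and accepts the octave jump; with it switched on, the smoother contour wins. This is the sense in which a cost between frames can improve per-frame pitch estimates.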