Title: | A Study on Model-based Prosody Conversion for Mandarin Chinese (以韻律模型為基礎之中文韻律轉換研究) |
Authors: | Sung, Po-Yi (宋柏毅); Chen, Sin-Horng (陳信宏); Institute of Communications Engineering (電信工程研究所) |
Keywords: | voice conversion; prosody conversion; prosody model |
Publication date: | 2008 |
Abstract: | This thesis presents a model-based prosody conversion method for Mandarin speech. The system consists of a training phase and a conversion phase. In the training phase, the source and target speech datasets are each analyzed by the A-PLM algorithm, which labels all utterances with prosody tags and constructs a prosodic model for each speaker; a mapping function is then built to relate the prosody tags of the two speakers. Two schemes for building the mapping function are proposed. Scheme 1 uses a linear mapping to estimate the target prosodic states from the source prosodic states and does not require parallel training data. Scheme 2 builds a probabilistic mapping between the source and target prosody tags under the minimum mean square error (MMSE) criterion and requires a set of parallel data for training. In the conversion phase, the source utterance is first labeled by the A-PLM algorithm, and the resulting prosody tags are converted to estimated target-speaker prosody tags by the mapping function. Finally, the syllable pitch contour, syllable duration, and syllable energy level are reconstructed from the estimated tags and the target prosodic model, and the converted speech is synthesized with the STRAIGHT synthesizer using the original spectral parameters of the target speech. Experiments on the Academia Sinica COSPRO corpus show that the proposed method converts prosody better than conventional methods. Among methods that use parallel data, Scheme 2 outperforms GMM-based mapping conversion across the different conversion groups; among methods that do not use parallel data, Scheme 1 outperforms mean/variance (Gaussian normalization) transformation. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079613559 http://hdl.handle.net/11536/41995 |
Appears in collections: | Thesis |
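
The abstract names mean/variance (Gaussian normalization) transformation as the conventional non-parallel baseline. As a point of reference, a minimal sketch of that baseline is given below; it is not the thesis's proposed method. It assumes the transformation is applied to log-F0 values, and the function names and all numeric values are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, pstdev


def stats(values):
    """Return (mean, population standard deviation) of a list of values."""
    return mean(values), pstdev(values)


def gaussian_normalize(source_values, src_stats, tgt_stats):
    """Shift and scale source values to match the target mean and std.

    Implements y = mu_t + (sigma_t / sigma_s) * (x - mu_s).
    """
    src_mean, src_std = src_stats
    tgt_mean, tgt_std = tgt_stats
    if src_std == 0:
        raise ValueError("source standard deviation must be non-zero")
    scale = tgt_std / src_std
    return [tgt_mean + scale * (v - src_mean) for v in source_values]


# Hypothetical, made-up F0 values in Hz (not from the COSPRO corpus).
# The two sets are non-parallel: only global statistics are used.
source_train = [math.log(f) for f in (110.0, 120.0, 130.0, 125.0, 115.0)]
target_train = [math.log(f) for f in (200.0, 230.0, 260.0, 240.0, 210.0)]

src_stats = stats(source_train)
tgt_stats = stats(target_train)

# Convert the source log-F0 values, then map back to Hz.
converted = gaussian_normalize(source_train, src_stats, tgt_stats)
converted_hz = [math.exp(v) for v in converted]
```

Because only the global mean and standard deviation of each speaker are used, this baseline needs no parallel utterances. It also ignores prosodic structure, which is the information the model-based schemes described in the abstract map explicitly.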