Title: A Modeling of Speaking Rate Influences on Mandarin Speech Prosody and its Application to TTS
考慮語速影響之漢語韻律模型建立與語音合成之應用
Authors: Hsieh, Chiao-Hua (謝喬華)
Wang, Yih-Ru (王逸如)
Institute of Communications Engineering (電信工程研究所)
Keywords: speaking rate; prosodic model; speech synthesis
Issue Date: 2011
Abstract: This thesis proposes a new approach to Mandarin speech prosody modeling that accounts for the effects of speaking rate. The approach modifies the previous unsupervised prosody labeling and modeling (PLM) method by treating speaking rate as a continuous independent variable on which the prosodic-acoustic features and some parameters of the prosodic models depend. A speaking-rate-dependent hierarchical prosodic model (SR-HPM) is constructed from a parallel corpus recorded by a single professional female announcer at four different speaking rates. An analysis shows that the effects of speaking rate on the model parameters agree well with existing linguistic knowledge, confirming that the proposed approach provides a systematic and effective way to quantify the effects of speaking rate on Mandarin speech prosody.
Finally, the model is applied to prosody generation for Mandarin text-to-speech (TTS). Using the trained SR-HPM, a speaking-rate-controlled TTS system that can generate fluent speech at any given speaking rate is implemented. Subjective test results indicate that the proposed method is significantly better than the conventional ML-based speaking-rate control method at both fast and slow speaking rates.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079913555
http://hdl.handle.net/11536/49334
Appears in Collections: Thesis


Files in This Item:

  1. 355501.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.