Hierarchical Prosody Modeling for English Speech and its Application to TTS

Title:	英語階層式韻律模型之建立與其在語音合成之應用 (Hierarchical Prosody Modeling for English Speech and its Application to TTS)
Authors:	蔡仲堯; 王逸如; 電信工程研究所 (Institute of Communications Engineering)
Keywords:	英文韻律模型; 階層式韻律模型; 字轉音; English prosody model; Hierarchical Prosody Modeling; TTS
Issue Date:	2014
Abstract:	This thesis builds a hierarchical prosody model (HPM) for read English speech and uses it to label prosody automatically. The corpus was recorded by a female native speaker of Mandarin Chinese who majored in English at university; her English pronunciation is accurate and her reading is fluent. The approach extends the HPM previously proposed for Mandarin speech. It first designs a syllable-based statistical prosodic model, adapted to the characteristics of English, that describes the relationships among the prosodic-acoustic features of the speech signal, the linguistic features of the associated text, and the prosodic tags that represent the underlying prosodic structure of the speech. The prosodic tags used are prosodic states and break tags. It then applies the prosody labeling and modeling (PLM) algorithm to estimate the model parameters and, at the same time, label the prosodic tags of all training utterances from a prosody-unlabeled speech corpus. Experimental results on a corpus of paragraph-length utterances show that the inferred model parameters are meaningful and consistent with linguistic knowledge, which confirms that HPM can also be applied effectively to English. Finally, the trained model is used to generate prosodic information for an English TTS system. An informal listening test indicates that the synthesized speech sounds quite natural.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT070160223 http://hdl.handle.net/11536/75593
Appears in Collections:	Thesis