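Title: 英語階層式韻律模型之建立與其在語音合成之應用 (Establishment of a Hierarchical Prosody Model for English and Its Application to Speech Synthesis)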
Hierarchical Prosody Modeling for English Speech and its Application to TTS
Authors: 蔡仲堯
王逸如
Institute of Communications Engineering
Keywords: English prosody model; hierarchical prosody modeling; text-to-speech (TTS)
Date issued: 2014
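Abstract: This thesis builds a hierarchical prosody model for read English speech and uses it to label prosody automatically. The corpus was recorded by a female speaker whose native language is Chinese; she majored in English at university, pronounces English accurately, and reads fluently. Building on the Mandarin HPM, the study designs a syllable-based English prosody model, adapted to the characteristics of English, that describes the relationships among the prosodic-acoustic features of the speech signal, the syntactic features of the text, and the prosodic tags. The prosodic tags are prosodic states and break tags, which describe the higher-level prosodic structure. The PLM algorithm is then used to estimate the model parameters and to label the training corpus with prosodic tags automatically. Experimental results show that the parameters produced by the model agree with linguistic knowledge, confirming that the HPM can also be applied effectively to English. Finally, the trained prosody model is applied to English text-to-speech to generate prosodic information, which makes the synthesized speech more natural.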
A hierarchical prosody modeling (HPM) approach for English speech is proposed. It extends the HPM approach previously proposed for Mandarin speech. First, a syllable-based statistical prosodic model is designed to describe the relationships among the prosodic-acoustic features of the speech signal, the linguistic features of the associated text, and the prosodic tags that represent the underlying prosodic structure of the speech. A prosody labeling and modeling algorithm is then used to estimate the model parameters and, at the same time, to label the prosodic tags of all training utterances, starting from a speech corpus with no prosodic labels. Experimental results on a corpus of many paragraph-length utterances, read by a female native Chinese speaker who majored in English, show that the inferred model parameters are all meaningful. The trained model is then used to generate prosodic information for a text-to-speech (TTS) system. An informal listening test shows that the synthetic speech sounds quite natural.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070160223
http://hdl.handle.net/11536/75593
Appears in collections: Theses