Title: | 英語階層式韻律模型之建立與其在語音合成之應用 Hierarchical Prosody Modeling for English Speech and its Application to TTS |
Authors: | 蔡仲堯 王逸如 Institute of Communications Engineering |
Keywords: | English prosody model; Hierarchical Prosody Modeling; TTS |
Issue Date: | 2014 |
Abstract: | This thesis builds a hierarchical prosody model (HPM) for read English speech and uses it to label prosody automatically. The corpus was recorded by a female native speaker of Mandarin who majored in English at university; her pronunciation is accurate and her reading is fluent. The approach extends the HPM previously proposed for Mandarin speech. It first designs a syllable-based statistical prosodic model, adapted to the characteristics of English, that describes the relationships among the prosodic-acoustic features of the speech signal, the linguistic features of the associated text, and the prosodic tags that represent the underlying higher-level prosodic structure of the speech. The prosodic tags used are prosodic states and break tags. It then applies a prosody labeling and modeling (PLM) algorithm to estimate the model parameters and, at the same time, label the prosodic tags of all training utterances from a corpus that has no prior prosodic labels. Experiments on this corpus, which contains many paragraph-length utterances, show that the inferred model parameters are consistent with linguistic knowledge, indicating that the HPM approach can also be applied effectively to English. Finally, the trained model is used to generate prosodic information for an English TTS system; an informal listening test suggests that the resulting synthetic speech sounds quite natural. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070160223 http://hdl.handle.net/11536/75593 |
Appears in Collections: | Thesis |