Title: | Hierarchical Prosody Modeling of English Speech and its Application to TTS |
Authors: | Tsai, Chung-Yao; Kuo, Chin-Kuan; Wang, Yih-Ru; Chen, Sin-Horng; Liao, I-Bin; Chiang, Chen-Yu; published under the name of National Chiao Tung University |
Keywords: | Prosody modeling; Text-to-Speech; Hierarchical prosodic model |
Issue Date: | 1-Jan-2014 |
Abstract: | In this paper, a hierarchical prosody modeling approach for English speech is proposed. It extends the HPM approach proposed previously for Mandarin speech. It first designs a syllable-based statistical prosodic model to describe the relationships among prosodic-acoustic features of the speech signal, linguistic features of the associated text, and prosodic tags representing the underlying prosodic structure of the speech. It then employs a prosody labeling and modeling algorithm to estimate the model parameters and label the prosodic tags of all training utterances simultaneously from a prosody-unlabeled speech corpus. Experimental results on a corpus containing many paragraph-length utterances from a female Chinese speaker who majored in English show that the inferred parameters of the model are all meaningful. We then use the trained model to generate prosodic information for a TTS system. An informal listening test shows that the synthetic speech sounds quite natural. |
URI: | http://hdl.handle.net/11536/146820 |
Journal: | 2014 17TH ORIENTAL CHAPTER OF THE INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDIZATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (COCOSDA) |
Appears in Collections: | Conferences Paper |