Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wang, Po-Chun | en_US |
dc.contributor.author | Liao, I-Bin | en_US |
dc.contributor.author | Chiang, Chen-Yu | en_US |
dc.contributor.author | Wang, Yih-Ru | en_US |
dc.contributor.author | Chen, Sin-Horng | en_US |
dc.date.accessioned | 2019-04-02T06:04:25Z | - |
dc.date.available | 2019-04-02T06:04:25Z | - |
dc.date.issued | 2014-01-01 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/150638 | - |
dc.description.abstract | This paper proposes a speaker adaptation method that adapts an existing speaking rate-dependent hierarchical prosodic model (SR-HPM) of an SR-controlled Mandarin TTS system to a new speaker's data in order to realize a new voice. Two main problems are addressed: data sparseness, because the few adaptation utterances cover only a small range of normal speaking rates, and the absence of adaptation data in both the fast and slow speaking-rate ranges. Following the idea of SR-HPM training, the proposed method first normalizes the prosodic-acoustic features of the new speaker's speech data, then trains an HPM with the prosody labeling and modeling algorithm, and finally refines the HPM into an SR-dependent model. MAP adaptation with model parameter extrapolation is applied to cope with these two problems. Experimental results on a male speaker's adaptation data confirmed that the resulting adaptive SR-HPM has reasonable parameters covering a wide range of speaking rates and hence can be used in the TTS system to generate prosodic-acoustic features for synthesizing the new speaker's voice at any given SR. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | speaker adaptation | en_US |
dc.subject | hierarchical prosodic model | en_US |
dc.subject | prosodic-acoustic features | en_US |
dc.subject | Mandarin TTS | en_US |
dc.title | Speaker Adaptation of Speaking Rate-dependent Hierarchical Prosodic Model for Mandarin TTS | en_US |
dc.type | Proceedings Paper | en_US |
dc.identifier.journal | 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | en_US |
dc.citation.spage | 511 | en_US |
dc.contributor.department | 電機學院 | zh_TW |
dc.contributor.department | College of Electrical and Computer Engineering | en_US |
dc.identifier.wosnumber | WOS:000349765600129 | en_US |
dc.citation.woscount | 2 | en_US |
Appears in Collections: | Conferences Paper |