Speaker Adaptation of Speaking Rate-dependent Hierarchical Prosodic Model for Mandarin TTS

Full metadata record

DC Field	Value	Language
dc.contributor.author	Wang, Po-Chun	en_US
dc.contributor.author	Liao, I-Bin	en_US
dc.contributor.author	Chiang, Chen-Yu	en_US
dc.contributor.author	Wang, Yih-Ru	en_US
dc.contributor.author	Chen, Sin-Horng	en_US
dc.date.accessioned	2019-04-02T06:04:25Z	-
dc.date.available	2019-04-02T06:04:25Z	-
dc.date.issued	2014-01-01	en_US
dc.identifier.uri	http://hdl.handle.net/11536/150638	-
dc.description.abstract	This paper proposes a speaker adaptation method that adapts an existing speaking rate-dependent hierarchical prosodic model (SR-HPM) of an SR-controlled Mandarin TTS system to a new speaker's data in order to realize a new voice. Two main problems are addressed: data sparseness, because the few adaptation utterances cover only a small range of normal speaking rates, and the absence of adaptation data in both the fast and slow speaking-rate ranges. Following the idea of SR-HPM training, the proposed method first normalizes the prosodic-acoustic features of the new speaker's speech data, then trains an HPM with the prosody labeling and modeling algorithm, and finally refines the HPM into an SR-dependent model. Maximum a posteriori (MAP) adaptation with model parameter extrapolation is applied to cope with these two problems. Experimental results on a male speaker's adaptation data confirm that the resulting adapted SR-HPM has reasonable parameters covering a wide range of speaking rates and can therefore be used in the TTS system to generate prosodic-acoustic features for synthesizing the new speaker's voice at any given SR.	en_US
dc.language.iso	en_US	en_US
dc.subject	speaker adaptation	en_US
dc.subject	hierarchical prosodic model	en_US
dc.subject	prosodic-acoustic features	en_US
dc.subject	Mandarin TTS	en_US
dc.title	Speaker Adaptation of Speaking Rate-dependent Hierarchical Prosodic Model for Mandarin TTS	en_US
dc.type	Proceedings Paper	en_US
dc.identifier.journal	2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP)	en_US
dc.citation.spage	511	en_US
dc.contributor.department	電機學院	zh_TW
dc.contributor.department	College of Electrical and Computer Engineering	en_US
dc.identifier.wosnumber	WOS:000349765600129	en_US
dc.citation.woscount	2	en_US
Appears in Collections:	Conferences Paper