Title: STRUCTURAL MAXIMUM A POSTERIORI SPEAKER ADAPTATION OF SPEAKING RATE-DEPENDENT HIERARCHICAL PROSODIC MODEL FOR MANDARIN TTS
Authors: Liao, I-Bin
Chiang, Chen-Yu
Chen, Sin-Horng
College of Electrical and Computer Engineering
Keywords: speaker adaptation; hierarchical prosodic model; prosodic-acoustic features; Mandarin TTS
Issue date: 2016
Abstract: This paper discusses a structural maximum a posteriori (MAP) speaker adaptation method that adjusts an existing speaking rate (SR)-dependent hierarchical prosodic model (SR-HPM) to a new speaker's data in order to realize a new voice at any given SR. The adaptive SR-HPM is formulated based on MAP estimation, with a reference SR-HPM serving as an informative prior. The prior information provided by the reference SR-HPM is hierarchically organized by decision trees. Objective and subjective evaluations showed that the proposed method not only performed slightly better than the maximum likelihood-based model in the observed SR range of the target speaker's data, but was also much better in the unseen SR range.
URI: http://hdl.handle.net/11536/136366
ISBN: 978-1-4799-9988-0
ISSN: 1520-6149
Journal: 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS
Start page: 5625
End page: 5629
Appears in collections: Conferences Paper