Title: A novel syllable duration modeling approach for Mandarin speech
Authors: Lai, WH
Chen, SH
Institute of Communications Engineering
Issue Date: 2001
Abstract: In this paper, a novel syllable duration modeling approach for Mandarin speech is proposed. It explicitly treats several major affecting factors as multiplicative companding parameters and estimates all model parameters with an EM algorithm. Experimental results showed that the variance of the observed syllable duration was greatly reduced, from 183.4 frame² (1 frame = 5 ms) to 18.5 frame², by eliminating the effects of these affecting factors. In addition, the estimated companding values of these affecting factors agreed well with our prior linguistic knowledge. A preliminary study applying the proposed model to syllable duration prediction for TTS was also performed. Experimental results showed that it outperformed the conventional regressive prediction method. Lastly, an extension of the approach that incorporates initial and final duration modeling is presented. This leads to a better understanding of the relation between the companding factors of the initial and final duration models and those of the syllable duration model.
URI: http://hdl.handle.net/11536/19042
ISBN: 0-7803-7041-4
ISSN: 1520-6149
Journal: 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM
Start Page: 93
End Page: 96
Appears in Collections: Conferences Paper