Title: Latent prosody model of continuous Mandarin speech
Authors: Chiang, Chen-Yu
Wang, Xiao-Dong
Liao, Yuan-Fu
Wang, Yih-Ru
Chen, Sin-Horng
Hirose, Keikichi
Institute of Communications Engineering
Keywords: speech processing; speech recognition; tone recognition
Date issued: 2007
Abstract: The major difficulty in prosody modeling and automatic tone recognition for continuous Mandarin speech is the complex interaction of tones and prosody/intonation on F0 contours. In this study, we propose a latent prosody model (LPM) that jointly models the effects of tone and prosody state on F0. Its purposes are twofold: (1) automatic prosody state labeling and (2) improving tone recognition accuracy. The basic idea is to introduce latent prosody state variables into an additive statistical model of F0 that already accounts for the affecting factors of tone and speaker. Experiments on the Tree-Bank corpus showed that LPM not only gave meaningful prosody state labeling results but also improved the average tone recognition rate from the 80.86% of a multi-layer perceptron (MLP) baseline to 82.55%.
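Illustrative note: the additive idea in the abstract can be sketched in a few lines of Python. This is only a toy reading of the abstract, not the authors' implementation. The effect values, the number of prosody states, and the function names (`predict_f0`, `label_prosody_state`) are all hypothetical, and the nearest-residual rule stands in for whatever statistical estimation the paper actually uses.

```python
# Toy additive F0 model: F0 = tone effect + speaker effect + prosody-state effect.
# All numbers below are made-up illustrative values, not results from the paper.
TONE_EFFECT = [0.30, -0.05, -0.30, 0.10]   # one entry per illustrative tone
SPEAKER_EFFECT = [5.0, 4.6]                # one entry per illustrative speaker
PROSODY_EFFECT = [-0.20, 0.0, 0.20]        # one entry per latent prosody state


def predict_f0(tone, speaker, state):
    """Sum the three additive effects to get a modeled F0 level."""
    return TONE_EFFECT[tone] + SPEAKER_EFFECT[speaker] + PROSODY_EFFECT[state]


def label_prosody_state(f0, tone, speaker):
    """Remove the known tone and speaker effects, then pick the latent
    state whose effect lies closest to the remaining residual."""
    residual = f0 - TONE_EFFECT[tone] - SPEAKER_EFFECT[speaker]
    return min(range(len(PROSODY_EFFECT)),
               key=lambda k: abs(PROSODY_EFFECT[k] - residual))
```

For example, `label_prosody_state(predict_f0(2, 1, 0), 2, 1)` recovers the state index used to generate the value. When the tone is also unknown, the same residual criterion could in principle be searched jointly over tone and state, which is one way to read the abstract's claim that modeling prosody state helps tone recognition.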
URI: http://hdl.handle.net/11536/11190
ISSN: 1520-6149
Journal: 2007 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol IV, Pts 1-3
Start page: 625
End page: 628
Appears in collections: Conference papers