Title: A statistics-based pitch contour model for Mandarin speech
Authors: Chen, SH
Lai, WH
Wang, YR
Institute of Communications Engineering
Publication date: 1-Feb-2005
Abstract: A statistics-based syllable pitch contour model for Mandarin speech is proposed. This approach takes the mean and the shape of a syllable log-pitch contour as two basic modeling units and considers several affecting factors that contribute to their variations. The affecting factors include the speaker, prosodic state (which essentially represents the high-level linguistic components of F0 and will be explained more clearly in Sec. 1), tone, and initial and final syllable classes. The parameters of the two modeling units were automatically estimated using the expectation-maximization (EM) algorithm. Experimental results showed that the root mean squared errors (RMSEs) obtained in the closed and open tests in the reconstructed pitch period were 0.362 and 0.373 ms, respectively. This model provides a way to separate the effects of several major factors. All of the inferred values of the affecting factors were in close agreement with our prior linguistic knowledge. It also gives a quantitative and more complete description of the coarticulation effect of neighboring tones rather than conventional qualitative descriptions of the tone sandhi rules. In addition, the model can provide useful cues to determine the prosodic phrase boundaries, including those occurring at intersyllable locations, with or without punctuation marks. © 2005 Acoustical Society of America.
URI: http://dx.doi.org/10.1121/1.1841572
http://hdl.handle.net/11536/23876
ISSN: 0001-4966
DOI: 10.1121/1.1841572
Journal: JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
Volume: 117
Issue: 2
Start page: 908
End page: 925
Appears in Collections: Articles


Files in This Item:

  1. 000226986900043.pdf
