Title: 中文語音停頓韻律標記預估之改進
An Improvement on Break Tag Prediction for Mandarin Speech
Authors: 陳睿詮
Chen, Jui-Chuan
陳信宏
Chen, Sin-Horng
Institute of Communications Engineering
Keywords: Break tag prediction; Speech synthesis
Issue Date: 2012
Abstract: This thesis proposes an improvement to break tag prediction for Mandarin speech synthesis. Because a text-to-speech (TTS) system has no prosodic-acoustic features to draw on, break tags must be predicted from the linguistic features produced by the text analyzer (parser) alone. These features are generally limited to the word level and the sentence level, so even with them the available syntactic and semantic information remains insufficient. Improving break prediction for Mandarin speech therefore requires richer linguistic features that describe syntactic and semantic structure. This research manually classifies and labels common and special word chunks and phrases, uses statistical distributions and decision trees to analyze the inter-syllable breaks at special positions within word chunks and phrases, and examines the prosodic break strength at these positions relative to that at word chunk and phrase boundaries. The analysis shows that word boundaries inside most word chunks carry no break, that breaks at special positions within word chunks and phrases are affected by the structure of the chunk or phrase, and that the shorter the structure, the less likely it is to violate the expected relative break strength with respect to the chunk or phrase boundary. The experimental results also show that adding word chunk and phrase features does improve break tag prediction, both when the linguistic features are used statically to predict the break at each syllable boundary and when they assist the dynamic search for prosodic unit boundaries. This indicates that word chunk and phrase information describes Mandarin syntactic structure more accurately and thereby improves break tag prediction.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079913549
http://hdl.handle.net/11536/49328
Appears in Collections: Thesis


Files in This Item:

  1. 354901.pdf
