Title: A Preliminary Study on Microstructure of Mandarin Speech Prosody (汉语韵律微结构模式的初步研究)
Authors: Lin, Yi-Hsuan (林宜宣)
陈信宏
Institute of Communications Engineering
Keywords: Mandarin Speech; Prosody; Microstructure; Phone
Issue Date: 2017
Abstract: This thesis presents a preliminary study of the microstructure of the three principal prosodic-acoustic features of Mandarin speech: pitch (F0) contour, duration, and energy contour. Taking the phone as the unit of analysis, it examines how three phone-level features vary across a large speech corpus made up of four parallel sub-corpora recorded at different speaking rates. The three features are phone duration, phone energy contour, and the probability that a phone carries pitch. First, all utterances in the corpus are segmented into phone sequences by forced alignment. An unsupervised prosody labeling and modeling method, the previously proposed PLM algorithm, then automatically labels each utterance with two kinds of prosodic tags: break type and prosodic state. The study analyzes how linguistic features and these two kinds of prosodic tags influence the three phone-level features. Regression trees are used for this purpose: each candidate factor is posed as a question, and the resulting tree gives an estimate of each feature in each context. Experimental results show that this approach can improve the synthesis of the three prosodic-acoustic features.
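The question-based regression tree mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the two questions (`is_vowel`, `break_after`), the toy durations in milliseconds, and the helper names `build_tree` and `predict` are all hypothetical, chosen only to show how yes/no questions about context partition the phone tokens and how each leaf yields a duration estimate.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(samples, questions, min_leaf=2):
    """Grow a regression tree over (context, target) pairs.

    At each node, pick the yes/no question whose split gives the
    lowest total squared error; stop when no question helps.
    """
    targets = [t for _, t in samples]
    best = None
    for name, ask in questions.items():
        yes = [s for s in samples if ask(s[0])]
        no = [s for s in samples if not ask(s[0])]
        if len(yes) < min_leaf or len(no) < min_leaf:
            continue
        cost = sse([t for _, t in yes]) + sse([t for _, t in no])
        if best is None or cost < best[0]:
            best = (cost, name, yes, no)
    if best is None or best[0] >= sse(targets):
        return {"leaf": mean(targets)}
    _, name, yes, no = best
    return {
        "question": name,
        "yes": build_tree(yes, questions, min_leaf),
        "no": build_tree(no, questions, min_leaf),
    }

def predict(tree, context, questions):
    """Follow the question answers down to a leaf estimate."""
    while "leaf" not in tree:
        branch = "yes" if questions[tree["question"]](context) else "no"
        tree = tree[branch]
    return tree["leaf"]

# Hypothetical questions about a phone's context (assumptions, not from the thesis).
questions = {
    "is_vowel": lambda c: c["is_vowel"],
    "break_after": lambda c: c["break_after"],
}

# Made-up toy phone durations in milliseconds, for illustration only.
samples = [
    ({"is_vowel": True, "break_after": True}, 180.0),
    ({"is_vowel": True, "break_after": True}, 190.0),
    ({"is_vowel": True, "break_after": False}, 100.0),
    ({"is_vowel": True, "break_after": False}, 110.0),
    ({"is_vowel": False, "break_after": True}, 80.0),
    ({"is_vowel": False, "break_after": True}, 90.0),
    ({"is_vowel": False, "break_after": False}, 50.0),
    ({"is_vowel": False, "break_after": False}, 60.0),
]

tree = build_tree(samples, questions)
estimate = predict(tree, {"is_vowel": True, "break_after": True}, questions)
print(tree["question"], estimate)
```

In this toy setup the first split falls on `is_vowel`, and the estimate for a vowel followed by a break is the mean of the two matching tokens. A real system would use a much richer question set covering linguistic features, break types, and prosodic states.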
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070460251
http://hdl.handle.net/11536/142768
Appears in Collections: Thesis