Title: | 中文單詞之韻律模擬與其應用 Prosody Modeling for Isolated Mandarin Words and Its Application |
Authors: | 施宏廣 Hung-Kuang Shih; 陳信宏 Sin-Horng Chen; Institute of Communications Engineering |
Keywords: | prosody modeling; coarticulation; isolated word |
Issue Date: | 2008 |
Abstract: | This thesis has two parts. The first part proposes syllable-based prosodic models for three prosodic parameters of isolated Mandarin words: syllable F0 contour, duration, and energy. Four affecting factors are considered: syllable tone, coarticulation with the preceding and following syllables, syllable position in the word, and base-syllable type. Assuming that these factors are mutually independent and additive, we design an iterative training procedure that optimizes one factor at a time to estimate the model parameters from real speech data. The models are trained on a speech database of 107,936 isolated Mandarin words recorded by a single professional female announcer. After convergence, the physical meaning of each affecting factor and the modeling errors are analyzed; the experimental results show that the models effectively describe the variation of all three prosodic parameters.
In the second part, having validated the prosodic model, we use it to build a Mandarin prosody learning system for non-native speakers. The user enters a Mandarin word, and the system synthesizes its speech and displays the target prosodic features (syllable F0 contour, duration, and energy) for the user to imitate. After the user records his or her own voice through a microphone, the system segments the speech, extracts the F0 contour and energy, and feeds the resulting prosodic information back to the user, who can then adjust his or her speaking style by comparing the recorded voice and prosodic features with the targets. |
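The abstract states only that the affecting factors are treated as independent and additive and that parameters are estimated by optimizing one factor at a time. The following is a minimal sketch of that general idea, not the thesis's actual procedure. It assumes a scalar target (for example, a syllable duration), a least-squares criterion, and categorical factor levels; the names `fit_additive` and `predict` and the toy data are hypothetical. An F0 contour would need a vector-valued version of the same update.

```python
from collections import defaultdict


def predict(mean, effects, sample):
    """Additive prediction: global mean plus one effect per factor."""
    return mean + sum(effects[f][sample[f]] for f in effects)


def fit_additive(samples, targets, n_iters=20):
    """Estimate additive factor effects by updating one factor at a time.

    samples: list of dicts mapping factor name -> categorical level
             (assumption: every sample carries the same factor names).
    targets: list of floats, one scalar prosodic value per sample.
    """
    factors = sorted(samples[0])
    mean = sum(targets) / len(targets)
    # Unseen levels default to an effect of 0.0.
    effects = {f: defaultdict(float) for f in factors}

    for _ in range(n_iters):
        for f in factors:
            sums, counts = defaultdict(float), defaultdict(int)
            for s, y in zip(samples, targets):
                # Prediction from everything except factor f.
                partial = predict(mean, effects, s) - effects[f][s[f]]
                sums[s[f]] += y - partial
                counts[s[f]] += 1
            # Least-squares update: mean residual for each level of f.
            for level, total in sums.items():
                effects[f][level] = total / counts[level]
    return mean, effects


# Toy data (hypothetical): two tones and two word positions, fully crossed.
true_tone = {1: 1.0, 2: -1.0}
true_pos = {0: 2.0, 1: -2.0}
samples = [{"tone": t, "position": p} for t in (1, 2) for p in (0, 1)]
targets = [10.0 + true_tone[s["tone"]] + true_pos[s["position"]] for s in samples]

mean, effects = fit_additive(samples, targets)
```

On this balanced toy design the updates recover the effects used to generate the data. On real, unbalanced data the per-factor estimates would depend on the other factors' current values, which is why the loop repeats until the values stop changing.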
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079613506 http://hdl.handle.net/11536/41947 |
Appears in Collections: | Thesis |