Title: | 音節間連音狀態自動標記模型研究 A Study on Auto-Labeling Model for Coarticulation States between Syllables |
Authors: | 陳慶安 (Ching-An Chen); 陳信宏 (Sin-Horng Chen); Institute of Communications Engineering |
Keywords: | unit selection; coarticulation |
Issue Date: | 2008 |
Abstract: | As computing speed and memory capacity have grown, corpus-based speech synthesis has become the most widely used and highest-quality approach to speech synthesis. In such a system, the input text is first analyzed to derive linguistic features; candidate synthesis units matching those features are then retrieved from a large corpus; finally, the synthesizer selects the most suitable unit sequence and concatenates it to produce the synthesized speech. During unit selection, however, the output often sounds unnatural, either because the selected units come from contexts that differ from those of the target sentence or because the units themselves are affected by coarticulation. To address these problems, we use spectral (MFCC) features to build a syllable spectral model and, at the same time, label the coarticulation states between syllables in a Chinese corpus. The model considers three factors that affect the syllable spectrum: the base syllable type, the types of the preceding and following syllables, and the degree of coarticulation between syllables. We assume these factors are independent and additive. After iterative training, the model learns the factors well, and the coarticulation states it labels can be reasonably explained in terms of prosodic and linguistic features. In the future, this method could be applied to the unit selection stage of a speech synthesis (TTS) system to help improve the quality of synthesized speech. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009513546 http://hdl.handle.net/11536/38389 |
Appears in Collections: | Thesis |
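The abstract's assumption of independent, additive factors can be illustrated with a minimal sketch. It is not the thesis's implementation: the scalar feature (standing in for an MFCC vector), the function names `fit_additive` and `predict`, the state labels "tight" and "loose", and the alternating-averaging (backfitting) estimator are all assumptions made for illustration. The relabeling of coarticulation states described in the abstract is not shown.

```python
from collections import defaultdict
from itertools import product


def fit_additive(samples, n_iter=10):
    """Estimate three additive factors by alternating averaging (backfitting).

    Each sample is (base_type, prev_context, next_context, value), where a
    context is a hypothetical (neighbor_type, coarticulation_state) pair and
    value is a scalar stand-in for a spectral feature.
    """
    factors = [defaultdict(float) for _ in range(3)]
    for _ in range(n_iter):
        for k in range(3):
            sums, counts = defaultdict(float), defaultdict(int)
            for *keys, value in samples:
                # Remove the contribution of the other two factors.
                others = sum(factors[j][keys[j]] for j in range(3) if j != k)
                sums[keys[k]] += value - others
                counts[keys[k]] += 1
            # Each factor level becomes the mean residual of its samples.
            for key, total in sums.items():
                factors[k][key] = total / counts[key]
    return factors


def predict(factors, base_type, prev_context, next_context):
    """Sum the three factor contributions; unseen levels contribute 0."""
    return (factors[0].get(base_type, 0.0)
            + factors[1].get(prev_context, 0.0)
            + factors[2].get(next_context, 0.0))


# Synthetic, made-up effects used only to exercise the sketch.
base_effect = {"ba": 1.0, "ma": 2.0}
prev_effect = {("a", "tight"): 0.5, ("a", "loose"): 0.1}
next_effect = {("m", "tight"): -0.3, ("m", "loose"): 0.0}
samples = [
    (b, p, n, base_effect[b] + prev_effect[p] + next_effect[n])
    for b, p, n in product(base_effect, prev_effect, next_effect)
]
factors = fit_additive(samples)
```

Only the sum of the three factors is identifiable in this sketch; a constant can shift between factors without changing the prediction, so the check below compares predictions rather than individual factor values.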