Full metadata record
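| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | 吳文良 | en_US |
| dc.contributor.author | 陳信宏 | en_US |
| dc.date.accessioned | 2014-12-12T01:47:06Z | - |
| dc.date.available | 2014-12-12T01:47:06Z | - |
| dc.date.issued | 2011 | en_US |
| dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT079813506 | en_US |
| dc.identifier.uri | http://hdl.handle.net/11536/46989 | - |
| dc.description.abstract | This thesis aims to introduce a hierarchical prosody model to further improve the performance of an HMM-based synthesizer. First, two types of prosodic tags associated with the prosody model, syllable-juncture break tags and syllable prosodic states, are applied to spectral model training. In the decision-tree clustering stage, the prosodic tags replace conventional linguistic information, so that clustering uses mid-level prosodic information lying between high-level syntactic information and low-level syllable information. Because the prosodic tags take acoustic information into account as well as linguistic information, they should be more closely related to the spectrum than linguistic information alone. Experiments confirm that the prosodic tags provide better clustering ability than linguistic information and yield a better spectral model. Next, the use of the prosody model at synthesis time is considered. Only text is available at synthesis time, whereas obtaining the tags requires both acoustic and linguistic information, so this thesis proposes training a conditional random field model that predicts the prosodic tags from text. Because this approach accounts for the influence of the whole observation sequence and exploits the dependency between neighboring states during model learning, it should be very helpful for predicting time-dependent parameters. The experimental results show that most of the predicted prosodic states conform to the transition characteristics associated with the syllable-juncture breaks. Finally, combining the spectral model, the prosody model, and the predicted prosodic tags yields a complete synthesis system. This system has the advantage of rich prosodic variation, but because the prediction of syllable-juncture breaks is still not good enough, some of the synthesized speech lacks naturalness; this is left for future work. | zh_TW |
| dc.description.abstract | In this thesis, we introduce a hierarchical prosody model to further improve the performance of an HMM-based synthesis system. First, we apply two types of prosodic tags, prosodic breaks and prosody states, to the spectral model training process. In decision tree clustering, we replace the high-level linguistic features with the middle-level prosodic tags to cluster the context-dependent models. Labeling the prosodic tags takes into account not only linguistic features but also acoustic features, so we expect the tags to be more closely related to the spectrum than linguistic features alone. The experiment confirms that the proposed method outperforms the conventional method, which considers only linguistic features in the clustering process. Second, in the synthesis stage, the prosody model cannot label the prosodic tags of the text because no acoustic features are available. We therefore propose a conditional random field (CRF) method to estimate the two types of prosodic tags from the input text. Because CRF training considers the entire observation sequence and the neighboring output states, it is well suited to estimating time-dependent parameters. The experimental results show that the transitions of the predicted prosody states match the corresponding prosodic breaks. Last, we build the complete proposed synthesis system by combining the trained spectral model, the prosody model, and the estimated prosodic tags; the system has the advantage of prosodic diversity. Nevertheless, prosodic break prediction is still not good enough, and its errors degrade the naturalness of some of the synthesized speech. Improving prosodic break prediction is left for future work. | en_US |
| dc.language.iso | zh_TW | en_US |
| dc.subject | speech synthesis | zh_TW |
| dc.subject | prosody model | zh_TW |
| dc.subject | synthesis | en_US |
| dc.subject | prosody model | en_US |
| dc.title | 以階層式韻律模型為基礎之中文半隱藏式馬可夫模型語音合成器 | zh_TW |
| dc.title | A HSMM-based Mandarin Speech Synthesizer Based on Hierarchical Prosody Model | en_US |
| dc.type | Thesis | en_US |
| dc.contributor.department | Institute of Communications Engineering | zh_TW |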
Appears in Collections: Thesis


Files in This Item:

  1. 350601.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.