Title: 基於隱藏式馬可夫模型之客語文句轉語音系統
An HMM-based Hakka Text-to-Speech System
Authors: 蔡依玲
王逸如
Wang, Yih-Ru
Institute of Communications Engineering
Keywords: Hakka; word segmentation; parser; text-to-speech; TTS; speech synthesizer; HTS; Markov model; HMM; pause predictor
Issue Date: 2009
Abstract: This thesis implements a Hakka text-to-speech (TTS) system. The system consists of four main parts: a parser, a pause predictor, a context analyzer, and an HMM-based speech synthesizer. The parser first segments the input text into words and tags their parts of speech. Because no Hakka text corpus large enough to train a robust parser is available, the Hakka parser is built by extending an existing CRF-based Chinese parser with an external Hakka dictionary, the internal Chinese dictionary, and a set of Hakka word-construction rules. The pause predictor then takes the parser output and estimates the inter-syllable locations at which to insert pauses, deciding for each word whether a pause follows it. Next, the context analyzer generates the synthesis units and linguistic parameters that the synthesizer expects. Finally, the HMM-based synthesizer uses the resulting text label file and the models obtained in training to predict duration, pitch, and spectral parameters, from which the output speech is generated. Experiments were designed to evaluate the parser, the pause predictor, the synthesizer, and the complete Hakka TTS system. In the subjective quality test, the Hakka TTS system obtained a good MOS score, which suggests that it is a promising system.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079713571
http://hdl.handle.net/11536/44588
Appears in Collections: Thesis


Files in This Item:

  1. 357101.pdf
