An Implementation of Taiwanese Text-to-Speech System (台語文句翻語音系統之製作)

Full metadata record

DC Field	Value	Language
dc.contributor.author	楊鈺清	en_US
dc.contributor.author	Yu-Ching Yang	en_US
dc.contributor.author	陳信宏	en_US
dc.contributor.author	Dr. Sin-Horng Chen	en_US
dc.date.accessioned	2014-12-12T02:20:57Z	-
dc.date.available	2014-12-12T02:20:57Z	-
dc.date.issued	1998	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#NT870435018	en_US
dc.identifier.uri	http://hdl.handle.net/11536/64476	-
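dc.description.abstract	This thesis implements a Taiwanese text-to-speech system. It consists of four main parts: a text analyzer, an RNN prosody generator, a speech waveform inventory, and a PSOLA speech synthesizer. The text analyzer processes the input text to produce the appropriate linguistic parameters, from which the RNN prosody generator produces the corresponding prosodic parameters. The PSOLA synthesizer then retrieves the appropriate waveform samples from the waveform inventory according to the synthesis syllable codes, adjusts them according to the prosodic parameters, and outputs the synthesized speech waveform. In this study we also tried several different methods of synthesizing speech. First, to synthesize speech at a finer granularity, we use the sample point rather than the frame as the basic synthesis unit. Second, to bring the envelope of the synthesized speech closer to that of real speech, we synthesize speech using energy contours. In addition, to overcome differences in loudness and speaking rate between recorded utterances, whether caused by differing recording conditions or by the speaker, we normalize the target values before training the recurrent neural network. Finally, we built a demonstration system for the Windows 95/NT platform by combining a single-document-interface text editor with the synthesis kernel.	zh_TW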
dc.description.abstract	In this thesis, a Taiwanese TTS system is implemented. It consists of four main parts: a text analyzer, an RNN prosody generator, a waveform inventory of synthesis units, and a PSOLA synthesizer. The input text is first tagged by the text analyzer into a word sequence. The RNN prosody generator then generates the prosodic information from linguistic features extracted from the word sequence. The waveform sequence corresponding to the word sequence is then extracted from the waveform inventory and prosodically adjusted to generate the output speech. The basic implementation follows the Mandarin TTS system developed previously at NCTU, with the following improvements. First, sample-based duration information is used rather than frame-based information. Second, the syllable energy contour is generated as prosodic information instead of using the static patterns given by the corresponding basic waveform. Third, both duration and energy features are normalized up to the utterance level. Finally, a demo system running on the Windows 95/NT platform was built by combining an SDI (Single Document Interface) text editor with the synthesis kernel. Informal listening tests show that most of the synthesized speech sounds fair.	en_US
dc.language.iso	zh_TW	en_US
dc.subject	Text-to-speech (文句翻語音)	zh_TW
dc.subject	Taiwanese (台語)	zh_TW
dc.subject	Mandarin (國語)	zh_TW
dc.subject	Recurrent neural network (遞迴類神經)	zh_TW
dc.subject	Pitch-synchronous overlap-add (基頻同步疊加)	zh_TW
dc.subject	TTS	en_US
dc.subject	Taiwanese	en_US
dc.subject	Mandarin	en_US
dc.subject	RNN	en_US
dc.subject	PSOLA	en_US
dc.title	台語文句翻語音系統之製作	zh_TW
dc.title	An Implementation of Taiwanese Text-to-Speech System	en_US
dc.type	Thesis	en_US
dc.contributor.department	Institute of Communications Engineering (電信工程研究所)	zh_TW
Appears in Collections:	Thesis
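
Illustrative sketch (not from the thesis): the abstract states that duration and energy targets are normalized per utterance before the RNN is trained, to offset differences in loudness and speaking rate between recordings. The record does not give the formula, so the code below assumes a simple per-utterance z-score. The function names normalize_utterance and denormalize are hypothetical.

```python
from statistics import mean, pstdev


def normalize_utterance(values):
    """Z-score the per-syllable values of one utterance.

    Assumption: mean/standard-deviation normalization; the thesis
    record does not state which normalization was actually used.
    Returns the normalized values plus the statistics needed to undo it.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        sigma = 1.0  # constant utterance: avoid division by zero
    return [(v - mu) / sigma for v in values], mu, sigma


def denormalize(normalized, mu, sigma):
    """Map normalized predictions back to physical units."""
    return [z * sigma + mu for z in normalized]


# Hypothetical syllable durations (ms) for the same sentence read
# at two different speaking rates.
fast = [100.0, 150.0, 125.0]
slow = [2.0 * d for d in fast]

fast_norm, fast_mu, fast_sigma = normalize_utterance(fast)
slow_norm, _, _ = normalize_utterance(slow)
restored = denormalize(fast_norm, fast_mu, fast_sigma)
```

Under this assumed scheme a uniformly slower reading yields the same normalized targets as the faster one, so the network sees only the relative pattern within the utterance. The stored mean and deviation restore physical units at synthesis time.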