中文語音合成技術之實作與分析 (An Implementation and Analysis of Mandarin Speech Synthesis Technologies)

Full metadata record

DC Field	Value	Language
dc.contributor.author	魯弘茂	en_US
dc.contributor.author	Hong-Mao Lu	en_US
dc.contributor.author	陳信宏	en_US
dc.contributor.author	Dr. Sin-Horng Chen	en_US
dc.date.accessioned	2014-12-12T02:28:30Z	-
dc.date.available	2014-12-12T02:28:30Z	-
dc.date.issued	2001	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#NT900435041	en_US
dc.identifier.uri	http://hdl.handle.net/11536/68916	-
dc.description.abstract	本論文對國立交通大學電信研究所過去發展的國語文句翻語音系統提出一些改進。首先，對於由411個基本音節波型過長所引起的合成聲音品質不佳的效應加以改進，採用由連續語音中抽取長度適中且沒有連音效應的基本音節波型以及去除過長的基本音節波型的鼻音部分，可大幅度改善合成聲音品質；接著，我們比較TD-PSOLA的三種不同能量補償方式—簡單重疊相加、最小平均方相加、及簡化後的簡單重疊相加，以及LP-PSOLA，最後決定採用簡化後的簡單重疊相加方式的TD-PSOLA；然後，我們使用類神經網路來產生音節的能量軌跡，將音節依照聲母分成四大類，個別使用一個MLP，由適當的語言輸入參數，可產生相當好的能量軌跡；最後，我們改進過去所提出，以 RNN-MLP方式，對夾雜在中文文句中的英文專有名詞，產生逐字發音所需的韻律信息，獲得較佳的字母長度及中英文詞間停頓長度。	zh_TW
dc.description.abstract	This thesis presents several improvements to the Mandarin TTS system previously developed in the Department of Communication Engineering of National Chiao Tung University. First, the acoustic inventory of 411 isolated base-syllable waveforms contained waveforms that were too long, which degraded the synthesized speech. We address this by selecting waveforms of moderate duration, free of coarticulation effects, from continuous speech and by compressing the nasal parts of overly long waveforms. Experimental results showed that the quality of the synthesized speech was greatly improved. Second, we implement and compare LP-PSOLA and TD-PSOLA under three energy compensation schemes (simple overlap-add, least-mean-square addition, and a simplified simple overlap-add), and adopt TD-PSOLA with the simplified simple overlap-add. Third, a neural-network-based method is proposed to generate syllable energy contours: all syllables are divided into four classes according to their initials, and a separate MLP generates the energy contours for the syllables in each class. With properly chosen linguistic input features, very good synthesized energy contours are obtained. Finally, we refine the previously proposed RNN-MLP method for generating the prosodic information needed to pronounce, letter by letter, English proper nouns embedded in Chinese text. Experimental results showed that better letter durations and better pause durations between Chinese and English words were obtained.	en_US
dc.language.iso	zh_TW	en_US
dc.subject	語音合成	zh_TW
dc.subject	文句翻語音系統	zh_TW
dc.subject	基頻同步疊加	zh_TW
dc.subject	遞迴式類神經網路	zh_TW
dc.subject	speech synthesis	en_US
dc.subject	TTS	en_US
dc.subject	Text To Speech	en_US
dc.subject	PSOLA	en_US
dc.subject	RNN	en_US
dc.title	中文語音合成技術之實作與分析	zh_TW
dc.title	An Implementation and Analysis of Mandarin Speech Synthesis Technologies	en_US
dc.type	Thesis	en_US
dc.contributor.department	電信工程研究所	zh_TW
Appears in Collections:	Thesis