Title: 使用類神經網路機制產生中英文夾雜文句之韻律訊息
A Neural Network Based Prosodic Information Generator in Bi-Lingual TTS System
Authors: 陳啟仁
Chi-Jen Chen
陳信宏
Sin-Horng Chen
Institute of Communications Engineering
Keywords: mixed Chinese-English text; prosodic information parameters; text-to-speech; Bi-Lingual; Prosody Information; TTS
Issue Date: 2000
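Abstract: This thesis addresses text that is primarily Chinese but contains embedded English, and uses a neural-network-based method to derive suitable prosodic information parameters for such text. We treat the English portions in two categories: text pronounced letter by letter, such as "TTS", and text pronounced as an English word built from syllables, such as "Windows". Taking as our basis the RNN (recurrent neural network) prosodic information generator that produces the prosodic parameters in an existing Chinese synthesizer, we attempt to develop a prosodic information generator suited to a synthesizer for mixed Chinese-English text. In our approach, we first treat both categories of English words as Chinese characters and feed them, together with the Chinese text, into the RNN prosodic information generator of the Chinese synthesizer, which yields the first-stage prosodic information parameters. We then extract only the English portion of the first-stage parameters and pass it to a second neural network, an MLP (multi-layer perceptron), for correction, obtaining the final prosodic information parameters. The main purpose of the MLP is to compensate for the mismatch that arises in the first stage when English is treated as Chinese, since the syllable structures of Chinese and English are not entirely the same. After this correction, we expect the prosody of the synthesized English portions to be as fluent as that of the synthesized Chinese text, and the experimental data show that the RNN-MLP scheme does perform well. In addition, all synthesized prosodic parameters are quite smooth at the boundaries between Chinese and English, with no excessively large discontinuities. In other words, this is a promising approach for synthesizers of mixed Chinese-English text.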
In this paper, a neural network-based approach to generating proper prosodic information for spelling and reading English words embedded in background Chinese texts is discussed. It extends an existing RNN-based prosodic information generator for Mandarin TTS to one suitable for Mandarin-English mixed-lingual TTS. In the RNN prosodic information generator, an English input character is first replaced by a similar Chinese character, and the original RNN, trained for Mandarin TTS, generates the initial prosodic information for the English word. The approach then refines the initial prosodic information using additional neural networks (MLPs), which map the initial prosodic information, generated by the original RNN under input settings that do not fit English, to the phonetic structures of English. The resulting prosodic information is expected to be appropriate for English-word synthesis and to match well with that of the background Mandarin speech. Experimental results showed that the proposed RNN-MLP scheme performed very well. All synthesized prosodic parameters are smooth across Mandarin-English and English-Mandarin word boundaries.
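The two-stage data flow described in the abstract can be sketched roughly as follows. This is only an illustration of the control flow, not the thesis implementation: the `Unit` type, the `generate_prosody` function, the two-element parameter vector, and the toy stand-ins for the RNN and the MLP are all hypothetical names and values introduced here. Applying the refiner to one unit at a time is also an assumption, since the abstract does not say what context the MLP sees.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical prosodic parameter vector for one unit, e.g. [pitch, duration].
Prosody = List[float]


@dataclass
class Unit:
    """One input unit of the mixed text (hypothetical representation)."""
    text: str
    is_english: bool


def generate_prosody(
    units: Sequence[Unit],
    base_generator: Callable[[Sequence[Unit]], List[Prosody]],
    english_refiner: Callable[[Prosody], Prosody],
) -> List[Prosody]:
    """Run the two stages: a Mandarin-trained generator, then an English-only fix-up."""
    # Stage 1: every unit, English included, is treated as Chinese and passed
    # through the generator trained for Mandarin (the RNN in the abstract).
    initial = base_generator(units)
    if len(initial) != len(units):
        raise ValueError("base generator must return one vector per unit")

    # Stage 2: only the English units are corrected (the MLP in the abstract);
    # Chinese units keep their first-stage parameters unchanged.
    return [
        english_refiner(params) if unit.is_english else params
        for unit, params in zip(units, initial)
    ]


# Toy stand-ins so the sketch runs; real models would be trained networks.
def toy_base_generator(units: Sequence[Unit]) -> List[Prosody]:
    return [[1.0, 1.0] for _ in units]


def toy_english_refiner(params: Prosody) -> Prosody:
    # Arbitrary illustrative correction: lengthen the second parameter.
    return [params[0], params[1] * 1.5]
```

Passing the two models in as callables keeps the first stage untouched, which mirrors the abstract's point that the existing Mandarin generator is reused as is and only the English portion is post-corrected.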
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT890435030
http://hdl.handle.net/11536/67308
Appears in Collections: Thesis