Title: 國語韻律訊息之偵測及應用
An Initial Study on Mandarin Prosodic Information Detection and Its Application
Authors: Lee, Shu-Ling
陳信宏
Sin-Horng Chen
Institute of Communications Engineering
Keywords: Prosodic States; Recurrent Neural Networks; Vector Quantization
Issue Date: 1996
Abstract: This thesis proposes a method for detecting prosodic states in Mandarin speech signals. A recurrent neural network (RNN) first classifies each frame of an input utterance into one of three broad classes: syllable initial, syllable final, or silence. The RNN outputs then drive a finite-state machine (FSM) that divides the utterance into segments of four states: three stable states, I (initial), F (final), and S (silence), and one transient state, T (transition). Acoustic cues are extracted from the vicinity of each final segment and used to model the prosodic state of each inter-final-segment period. Two prosodic-state modeling schemes are studied. The first uses vector quantization (VQ) to classify the acoustic cues of two contiguous final segments directly into 8 or 16 prosodic states. The second trains an RNN with linguistic features as target outputs and obtains prosodic states by vector-quantizing the outputs of its hidden layer. The prosodic states obtained by both schemes admit linguistically meaningful interpretations. Finally, two outputs of the RNN, which provide word-boundary cues, are integrated into a modular, MRNN-based continuous Mandarin word recognizer. Experimental results show that these cues improve word recognition performance.
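The segmentation and quantization steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the FSM transition rules, the acoustic cues, or the codebook, so everything here is an assumption. The function names (`label_frames`, `merge_runs`, `quantize`), the confidence threshold of 0.6, and the toy two-entry codebook are all hypothetical. A simple thresholding rule stands in for the FSM, and nearest-codeword lookup stands in for the trained 8- or 16-state vector quantizer.

```python
from typing import List, Sequence, Tuple

CLASSES = ("I", "F", "S")  # initial, final, silence (the three RNN classes)


def label_frames(scores: Sequence[Sequence[float]], threshold: float = 0.6) -> List[str]:
    """Label each frame I, F, or S when its top score is confident, else T.

    The threshold rule is an assumed stand-in for the thesis's FSM.
    """
    labels = []
    for frame in scores:
        best = max(range(len(CLASSES)), key=lambda k: frame[k])
        labels.append(CLASSES[best] if frame[best] >= threshold else "T")
    return labels


def merge_runs(labels: Sequence[str]) -> List[Tuple[str, int, int]]:
    """Merge runs of equal labels into (state, start, end_exclusive) segments."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


def quantize(cues: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the nearest codeword (squared Euclidean distance).

    The index plays the role of a prosodic state; a real codebook would
    have 8 or 16 entries and would be trained on data.
    """
    def dist(codeword: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(cues, codeword))

    return min(range(len(codebook)), key=lambda k: dist(codebook[k]))


if __name__ == "__main__":
    # Made-up per-frame scores in the order (initial, final, silence).
    demo_scores = [
        [0.1, 0.1, 0.8],
        [0.1, 0.1, 0.8],
        [0.5, 0.3, 0.2],
        [0.9, 0.05, 0.05],
        [0.1, 0.85, 0.05],
        [0.1, 0.8, 0.1],
    ]
    demo_segments = merge_runs(label_frames(demo_scores))
    print(demo_segments)
    # Made-up 2-dimensional cue vector and toy codebook.
    print(quantize([1.0, 0.2], [[0.0, 0.0], [1.0, 0.0]]))
```

In this sketch, the final (F) segments returned by `merge_runs` would be the places from which acoustic cues are taken, and `quantize` would map the cue vector for two contiguous final segments to a prosodic-state index.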
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT850436023
http://hdl.handle.net/11536/62097
Appears in Collections: Thesis