Title: A Study on Mandarin Speech Recognition Using Connectionist Networks (以類神經網路做國語語音辨認之研究)
Authors: Wen-Yuan Chen (陳文源)
Sin-Horng Chen (陳信宏)
Institute of Electronics
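Keywords: speech recognition, artificial neural networks, plosives, minimal distortion segmentation, multilayer perceptron (MLP), recurrent neural networks; Speech Recognition, Mandarin, Artificial Neural Networks,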
Publication date: 1994
Abstract: This dissertation studies the application of artificial neural networks (ANNs) to Mandarin speech recognition and proposes several ANN-based methods for recognizing isolated Mandarin speech. First, a feature extraction method for recognizing six plosives in isolated Mandarin syllables is proposed. It automatically segments the plosive portion of the input speech and uses an orthogonal polynomial expansion to compute a fixed number of parameters that describe the spectral and temporal characteristics of that portion; these parameters can be fed directly to a neural network for plosive recognition. Next, an MLP-based method is proposed for isolated word recognition. The speech signal is first pre-processed by a generalized minimal distortion segmentation (GMDS) algorithm, which also builds on the orthogonal polynomial expansion and finds a set of segment boundaries that minimizes the accumulated distortion of the orthonormal polynomial expansions of all segments. The resulting parameters are fed directly to a multilayer perceptron (MLP) without any further transformation. Experimental results show that, for the same number of parameters, GMDS outperforms conventional segmentation in both distortion and recognition rate, indicating that it captures the dynamics of the speech signal more accurately and thereby improves the performance of the MLP recognizer that follows. A generalized MLP, referred to as the time-weighting MLP (TWMLP), is then proposed for isolated word recognition. In the TWMLP, the weights connecting hidden nodes to output nodes vary with time in order to memorize the temporal information of the training utterances, so the network resolves the time alignment between the input speech and the network without a separate dynamic time warping step. Finally, two methods that exploit the characteristics of Mandarin speech are proposed for large-vocabulary isolated Mandarin speech recognition: a sequential (concatenated) MLP method and a hierarchical recurrent neural network (RNN) method. Both use initials, finals, or phonemes as the basic recognition units, with one neural network per unit. The sequential MLP method uses dynamic time alignment to map the frames of the input speech onto the concatenated MLPs. The hierarchical RNN method uses an additional RNN to segment and weight the initials and finals, so it avoids the time-consuming dynamic time alignment and greatly reduces the amount of computation.
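The segmentation idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how the boundaries are searched for, so the dynamic-programming search, the squared-error distortion measure, the Gram-Schmidt construction of the basis, and all function names (`orthonormal_basis`, `fit_segment`, `segment`, `gmds_features`) are assumptions made for illustration.

```python
from math import sqrt


def orthonormal_basis(n, order):
    """Orthonormal polynomials of degree 0..order on n equally spaced points.

    Built by Gram-Schmidt (an assumption; the thesis may use a closed form).
    Returns fewer than order+1 vectors when n is too small for the degree.
    """
    ts = [(2.0 * i / (n - 1) - 1.0) if n > 1 else 0.0 for i in range(n)]
    basis = []
    for d in range(order + 1):
        v = [t ** d for t in ts]
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b))
            v = [x - proj * y for x, y in zip(v, b)]
        norm = sqrt(sum(x * x for x in v))
        if norm < 1e-12:
            break
        basis.append([x / norm for x in v])
    return basis


def fit_segment(frames, order):
    """Expand each feature dimension of a segment on the orthonormal basis.

    Returns (coefficients per dimension, squared-error distortion).
    """
    basis = orthonormal_basis(len(frames), order)
    coeffs, distortion = [], 0.0
    for k in range(len(frames[0])):
        xs = [f[k] for f in frames]
        cs = [sum(b * x for b, x in zip(bv, xs)) for bv in basis]
        # With an orthonormal basis, residual energy = signal energy - coefficient energy.
        distortion += sum(x * x for x in xs) - sum(c * c for c in cs)
        coeffs.append(cs + [0.0] * (order + 1 - len(cs)))  # pad short segments
    return coeffs, max(distortion, 0.0)


def segment(frames, n_segments, order=1):
    """Find boundaries minimizing the accumulated distortion (dynamic programming).

    Requires len(frames) >= n_segments. Returns (boundaries, total distortion),
    where segment i covers frames[boundaries[i]:boundaries[i + 1]].
    """
    T, INF = len(frames), float("inf")
    memo = {}

    def dist(s, e):
        if (s, e) not in memo:
            memo[(s, e)] = fit_segment(frames[s:e], order)[1]
        return memo[(s, e)]

    best = [[INF] * (T + 1) for _ in range(n_segments + 1)]
    back = [[0] * (T + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for e in range(k, T + 1):
            for s in range(k - 1, e):
                c = best[k - 1][s] + dist(s, e)
                if c < best[k][e]:
                    best[k][e], back[k][e] = c, s
    bounds = [T]
    for k in range(n_segments, 0, -1):
        bounds.append(back[k][bounds[-1]])
    bounds.reverse()
    return bounds, best[n_segments][T]


def gmds_features(frames, n_segments, order=1):
    """Fixed-length vector: n_segments * dims * (order + 1) coefficients."""
    bounds, _ = segment(frames, n_segments, order)
    feats = []
    for s, e in zip(bounds, bounds[1:]):
        for cs in fit_segment(frames[s:e], order)[0]:
            feats.extend(cs)
    return feats


# Toy input: ten one-dimensional frames with a step in the middle.
toy = [[0.0]] * 5 + [[10.0]] * 5
print(segment(toy, 2, order=0)[0])
print(len(gmds_features(toy, 2, order=1)))
```

On the toy input the search places the single interior boundary at the step, giving boundaries `[0, 5, 10]`. The feature vector has a length that depends only on the number of segments, the feature dimension, and the polynomial order, not on the utterance length, which is what lets such parameters be fed to a fixed-size MLP input layer without time warping.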
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT830430016
http://hdl.handle.net/11536/59199
Appears in collections: Theses