Title: | A Neural Network based Mandarin Speech Attribute Detection (基於類神經網路之中文語音屬性偵測器) |
Authors: | Yio-Jun Zhang (張友駿); Yi-Ru Wang (王逸如); Institute of Communications Engineering (電信工程研究所) |
Keywords: | Detector; Speech Attribute |
Publication date: | 2007 |
Abstract: | The next generation of automatic speech recognition (ASR) follows a paradigm that is knowledge-based as well as data-driven. Its front end is a bank of speech attribute and event detectors, which extract different speech feature parameters to detect the attributes and events present in each time span of the speech signal and to collect any cues useful for recognition. The detector outputs are passed to later stages, which integrate the speech events with knowledge and then perform evidence verification and decision making. This architecture is expected to surpass current state-of-the-art HMM-based ASR.
This thesis builds on that concept. Because the Mandarin corpus has no precise, manually labeled phone boundaries, we start from the syllable boundaries and automatically segment the corpus by forced alignment to obtain initial phone boundaries. We then refine the phone boundaries automatically with the Segmental Kmeans Segmentation Algorithm and use the resulting segmentation to build detectors for Mandarin manner of articulation. We first train linear Gaussian mixture model (GMM) detectors and then nonlinear multilayer perceptron (MLP) detectors. Next, following a segment-based approach, we add state transition probabilities to the detection process and run detection experiments on Mandarin manner of articulation. We then introduce a confidence measure that quantifies how reliable the detection results are, so that the speech information passed to the downstream recognizer can serve as a reference. Finally, we analyze the performance and errors of each detector architecture and of the confidence measure. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009413623 http://hdl.handle.net/11536/80884 |
Appears in Collections: | Thesis |
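
The abstract mentions adding state transition probabilities to the frame-level detector outputs. One common way to do this is Viterbi decoding over per-frame class posteriors; the sketch below illustrates that idea only and is not taken from the thesis. The function name `viterbi_smooth`, the two-class setup, and all numbers are hypothetical, and using MLP posteriors directly as emission scores is a simplifying assumption.

```python
import math


def viterbi_smooth(posteriors, trans, prior):
    """Return the most likely class sequence for a non-empty list of frames.

    posteriors[t][j]: detector score (e.g. an MLP posterior) for class j at frame t.
    trans[i][j]:      assumed probability of moving from class i to class j.
    prior[j]:         assumed probability of starting in class j.
    """
    n_states = len(prior)

    def log(p):
        # Floor probabilities to avoid log(0).
        return math.log(max(p, 1e-12))

    # score[j]: best log score of any path ending in class j at the current frame.
    score = [log(prior[j]) + log(posteriors[0][j]) for j in range(n_states)]
    back = []  # back[t][j]: best predecessor of class j at frame t + 1

    for frame in posteriors[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log(trans[i][j]))
            new_score.append(score[best_i] + log(trans[best_i][j]) + log(frame[j]))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace back from the best final class.
    state = max(range(n_states), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-class example: frame 2 is a noisy outlier.
frames = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.8, 0.2]]
sticky = [[0.9, 0.1], [0.1, 0.9]]  # classes tend to persist across frames
print(viterbi_smooth(frames, sticky, [0.5, 0.5]))  # → [0, 0, 0, 0, 0]
```

A frame-by-frame argmax on the same input would give `[0, 0, 1, 0, 0]`; the transition probabilities penalize the isolated one-frame switch, which is the kind of smoothing a segment-based decoder is meant to provide.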