A Study on DNN-based Mandarin-speech Recognition (使用深層類神經網路之中文語音辨認)

Title:	A Study on DNN-based Mandarin-speech Recognition (使用深層類神經網路之中文語音辨認)
Authors:	Ou, Li-An (歐立安); Chen, Sin-Horng (陳信宏); Institute of Communications Engineering (電信工程研究所)
Keywords:	Deep neural network; Mandarin-speech recognition; real-time
Issue Date:	2015
Abstract:	Deep neural networks (DNNs) have become a popular research topic in automatic speech recognition. This thesis builds a real-time, large-vocabulary Mandarin speech recognition system with the Kaldi speech recognition toolkit. A DNN acoustic model replaces the Gaussian mixture model (GMM) used in conventional acoustic models and is compared against it, and the recognizer is realized with weighted finite-state transducers (WFSTs). We analyze the factors that affect recognition rate and recognition time. The choice between a GMM and a DNN acoustic model and the size of the language model both change the size of the overall recognition system, and many decoding parameters affect its performance. By tuning these parameters we find the best operating point, which yields a system that is both real-time and accurate. The experiments use the TCC300 speech corpus for training and testing, together with an 80,000-word pronunciation lexicon.
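The operating-point search mentioned in the abstract can be illustrated with a minimal sketch. It assumes each decoding configuration has already been measured for decoding time and error rate, and it picks the most accurate configuration whose real-time factor (decoding time divided by audio duration) stays at or below 1. The names `DecodeRun` and `best_operating_point`, the use of beam width as the tuned parameter, and every number below are hypothetical placeholders for illustration, not values or code from the thesis.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DecodeRun:
    """One measured decoding configuration (hypothetical structure)."""
    beam: float            # decoder beam width being tuned (assumed parameter)
    decode_seconds: float  # wall-clock time spent decoding the test set
    audio_seconds: float   # total duration of the decoded audio
    error_rate: float      # recognition error rate in percent

    @property
    def rtf(self) -> float:
        # Real-time factor: values <= 1.0 mean decoding keeps up with the audio.
        return self.decode_seconds / self.audio_seconds


def best_operating_point(runs: Iterable[DecodeRun],
                         max_rtf: float = 1.0) -> Optional[DecodeRun]:
    """Return the lowest-error run that meets the real-time constraint."""
    feasible = [r for r in runs if r.rtf <= max_rtf]
    if not feasible:
        return None
    # Prefer lower error; break ties by the faster configuration.
    return min(feasible, key=lambda r: (r.error_rate, r.rtf))


# Placeholder measurements, NOT results reported in the thesis.
RUNS = [
    DecodeRun(beam=8.0, decode_seconds=30.0, audio_seconds=100.0, error_rate=20.0),
    DecodeRun(beam=12.0, decode_seconds=70.0, audio_seconds=100.0, error_rate=15.0),
    DecodeRun(beam=16.0, decode_seconds=150.0, audio_seconds=100.0, error_rate=14.0),
]

if __name__ == "__main__":
    print(best_operating_point(RUNS))
```

With these placeholder numbers the widest beam has the lowest error but is slower than real time, so the sketch selects the middle configuration. The same selection rule would apply to any other tuned decoding parameter.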
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT070260242 http://hdl.handle.net/11536/127264
Appears in Collections:	Thesis