Title: | A Study on DNN-based Mandarin-speech Recognition (使用深层类神经网路之中文语音辨认) |
Authors: | Ou, Li-An (欧立安); Chen, Sin-Horng (陈信宏); Institute of Communications Engineering |
Keywords: | Deep neural network; Mandarin-speech recognition; real-time |
Publication date: | 2015 |
Abstract: | Deep neural networks (DNNs) have become a popular research topic in automatic speech recognition. In this thesis, we build a real-time large-vocabulary Mandarin speech recognition system with the Kaldi speech recognition toolkit. A DNN replaces the Gaussian mixture model (GMM) used in the conventional acoustic model, and the recognizer is implemented with weighted finite-state transducers (WFSTs). We analyze the factors that affect recognition rate and recognition time. The choice of acoustic model (GMM or DNN) and the size of the language model both affect the size of the overall recognition system, and many decoding parameters affect its performance. By tuning these parameters, we look for the best operating point, one that yields a recognizer that is both real-time and accurate. The experiments use the TCC300 speech corpus for training and testing, together with an 80,000-word pronunciation lexicon. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070260242 http://hdl.handle.net/11536/127264 |
Appears in collections: | Thesis |