Full metadata record
DC Field | Value | Language
dc.contributor.author | 林建廷 | zh_TW
dc.contributor.author | 王逸如 | zh_TW
dc.contributor.author | Lin, Chien-Ting | en_US
dc.contributor.author | Wang, Yih-Ru | en_US
dc.date.accessioned | 2018-01-24T07:37:34Z | -
dc.date.available | 2018-01-24T07:37:34Z | -
dc.date.issued | 2016 | en_US
dc.identifier.uri | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070360313 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/139182 | -
dc.description.abstract | Deep neural networks (DNNs) have become a popular research topic in speech recognition. This thesis implements shared-hidden-layer DNN (SHL-DNN) acoustic model training based on the Kaldi speech recognition toolkit. The hidden layers of a neural network can be viewed as a cascade of complex feature transformations that capture the articulatory characteristics of speech, and can therefore be shared across languages. Because Mandarin training data are scarce, a larger English corpus is used, and cross-lingual model transfer is applied to improve Mandarin recognition accuracy and to reduce the variation in recognition rates across speakers. A language model is then added to build a complete speech recognition system, and the decoding parameters are tuned with the real-time factor (RTF) taken into account to find the best operating point, yielding a system that balances latency and recognition accuracy. For training, the 960-hour Librispeech English corpus serves as the source language and the roughly 24-hour Mandarin corpus TCC300 as the target language; to assess robustness, the test data include 1.9 hours of the Sinica COSPRO02 corpus in addition to TCC300. | zh_TW
dc.description.abstract | Deep neural networks (DNNs) have become a popular research topic in automatic speech recognition. In this dissertation, we focus on implementing a shared-hidden-layer DNN acoustic model. A DNN can be viewed as a model that learns a complicated nonlinear feature transformation and describes the pronunciation of different phonemes; its hidden layers can therefore be shared across languages. Unfortunately, only a few small Taiwanese Mandarin speech corpora are available. Consequently, we evaluate a cross-lingual model transfer approach to improve the performance of Mandarin speech recognition and to reduce the standard deviation of the phone error rate across speakers. Moreover, we introduce a trigram language model to build a speech recognition system and tune the parameters used in decoding to find the best operating point, yielding a real-time, high-efficiency speech recognition system. In this experiment, a large 960-hour English corpus, Librispeech, was treated as the source language, and a small 24-hour Mandarin corpus, TCC300, as the target language. For testing, in addition to TCC300, we add a small 1.9-hour corpus, COSPRO02, to evaluate the robustness of the system. | en_US
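The shared-hidden-layer idea described in the abstract can be sketched in a few lines of NumPy: the hidden layers form one feature transformation reused by every language, while each language keeps its own softmax output layer. This is only an illustrative toy (the layer sizes, senone counts, and random weights are invented for the example, not taken from the thesis, which uses Kaldi):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Shared hidden layers: the cross-lingual feature transformation.
# Dimensions (40 -> 64 -> 64) are arbitrary for this sketch.
shared = [(rng.standard_normal((40, 64)) * 0.1, np.zeros(64)),
          (rng.standard_normal((64, 64)) * 0.1, np.zeros(64))]

# Language-specific output layers; the output sizes stand in for the
# (hypothetical) senone counts of each language's acoustic model.
head_en = (rng.standard_normal((64, 120)) * 0.1, np.zeros(120))  # source: English
head_zh = (rng.standard_normal((64, 80)) * 0.1, np.zeros(80))    # target: Mandarin

def forward(x, head):
    h = x
    for W, b in shared:        # hidden layers are shared between languages
        h = relu(h @ W + b)
    W, b = head                # only the output layer is language-specific
    return softmax(h @ W + b)

# One 40-dimensional acoustic feature frame (e.g., filter-bank features).
frame = rng.standard_normal((1, 40))
p_en = forward(frame, head_en)  # posterior over English output units
p_zh = forward(frame, head_zh)  # posterior over Mandarin output units
```

Cross-lingual transfer then amounts to training `shared` (and `head_en`) on the large source-language corpus, freezing or fine-tuning `shared`, and training only the target-language head on the small Mandarin corpus.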
dc.language.iso | zh_TW | en_US
dc.subject | Deep neural network | zh_TW
dc.subject | Mandarin speech recognition | zh_TW
dc.subject | Cross-lingual | zh_TW
dc.subject | Deep neural network | en_US
dc.subject | Mandarin speech recognition | en_US
dc.subject | Cross-lingual | en_US
dc.title | Using cross-lingual corpora to improve the performance of a DNN-based Mandarin speech recognizer | zh_TW
dc.title | Using cross-lingual data to improve the performance of DNN-based Mandarin speech recognition | en_US
dc.type | Thesis | en_US
dc.contributor.department | Institute of Communications Engineering | zh_TW
Appears in Collections: Thesis