Continuous Mandarin Speech Recognition on Telephone Environment

Title:	Continuous Mandarin Speech Recognition on Telephone Environment (電話環境之中文連續語音辨認)
Authors:	Gong-Long Lee (李供龍); Sin-Horng Chen (陳信宏); Institute of Communication Engineering (電信工程研究所)
Keywords:	Robust Training Algorithm; Likelihood Compensation; Robust SBR
Issue Date:	1998
Abstract:	This thesis focuses on recognizing the 411 base syllables of continuous Mandarin speech over telephone channels. Using initial/final-based models, it addresses robust, speaker-independent, large-vocabulary recognition of the 411 Mandarin syllables in continuous speech. A robust training method accounts for channel and noise effects during the training stage so as to obtain nearly clean speech models. In the recognition stage, the corresponding compensation and likelihood compensation (LC) are applied. Tests on a real telephone-speech corpus gave good results, with a 411-syllable recognition rate of 54.26% on the TL corpus. In addition, the RRSBR method is adopted to overcome the shortcomings of conventional SBR; it not only raises the recognition rate but also reduces the computation and recognition time of the system, giving a 411-syllable recognition rate of 49.94% on the MAT5 corpus. In this thesis, a robust training algorithm is proposed for generating a set of bias-removed, noise-suppressed reference speech HMM models from a telephone-speech database collected over a public telephone switching system. It incorporates a signal bias-compensation operation and a PMC noise-compensation operation into its iterative training process, so that the resulting speech HMM models are better suited to a robust speech recognition method that uses the same bias- and noise-compensation operations in its recognition process. The effectiveness of the proposed training algorithm was examined by simulation using a 227-speaker database provided by Chunghwa Telecommunication Laboratories (TL). Experimental results confirmed that it significantly outperformed the conventional segmental k-means training algorithm. An RNN-based robust signal bias removal (RRSBR) method is also proposed to improve on the SBR method in both recognition performance and computational efficiency. The base-syllable recognition rates increased from 49.06% to 54.26% for the TL database and from 45.34% to 49.94% for a 649-speaker Mandarin-Across-Taiwan (MAT) database.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT870435043 http://hdl.handle.net/11536/64503
Appears in Collections:	Thesis