Title: 電話環境之中文連續語音辨認
Continuous Mandarin Speech Recognition on Telephone Environment
Authors: 李供龍
Gong-Long Lee
Sin-Horng Chen
Keywords: 強健式訓練法;相似度補償;強健式訊號偏移補償;Robust Training Algorithm;Likelihood Compensation;Robust SBR
Issue Date: 1998
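Abstract: This thesis focuses on recognizing the 411 base syllables of continuous Mandarin speech over telephone channels. Using initial/final models as the basic units, it addresses robust, speaker-independent, large-vocabulary recognition of the 411 syllables in continuous Mandarin. A robust training algorithm accounts for channel and noise effects during the training stage so as to obtain nearly clean speech models. In the recognition stage, the corresponding compensation and likelihood compensation (LC) are applied. Tests on a real telephone-speech corpus gave good results, with a 411-syllable recognition rate of 54.26% on the TL corpus. In addition, the RRSBR method is adopted to remedy the shortcomings of conventional SBR; it not only raises the recognition rate but also effectively reduces the computation and recognition time of the recognition system. The 411-syllable recognition rate on the MAT5 corpus is 49.94%.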
In this thesis, we propose a robust training algorithm that generates a set of bias-removed, noise-suppressed reference speech HMMs from a telephone-speech database collected over a public telephone switching system. The algorithm incorporates a signal bias-compensation operation and a PMC noise-compensation operation into its iterative training process, so that the resulting HMMs are better matched to a robust recognition method that applies the same bias- and noise-compensation operations during recognition. The effectiveness of the proposed training algorithm was examined by simulation on a 227-speaker database provided by Chunghwa Telecommunication Laboratories (TL). Experimental results confirmed that it significantly outperformed the conventional segmental k-means training algorithm. An RNN-based robust signal bias removal (RRSBR) method is also proposed to improve on the SBR method in both recognition performance and computational efficiency. The base-syllable recognition rates increased from 49.06% to 54.26% for the TL database and from 45.34% to 49.94% for a 649-speaker Mandarin-Across-Taiwan (MAT) database.
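For readers unfamiliar with signal bias removal, the following is a minimal sketch of the conventional codebook-based idea that the abstract refers to as SBR. It is not the thesis's RRSBR method and not its training algorithm. It assumes that the channel adds a constant bias vector to every feature frame (as a convolutional channel does in the cepstral domain) and that a small codebook of clean-speech centroids is available. The function names, the codebook, and the toy data are all hypothetical and chosen only for illustration.

```python
def nearest(codebook, x):
    """Return the codeword closest to x in squared Euclidean distance."""
    return min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, c)))


def estimate_bias(frames, codebook, iterations=5):
    """Iteratively estimate a constant additive bias (illustrative sketch).

    Each pass compensates the frames with the current bias estimate,
    assigns every compensated frame to its nearest codeword, and then
    re-estimates the bias as the mean of (observed frame - codeword).
    """
    dim = len(frames[0])
    bias = [0.0] * dim
    for _ in range(iterations):
        residual_sum = [0.0] * dim
        for y in frames:
            x = [yi - bi for yi, bi in zip(y, bias)]  # compensated frame
            c = nearest(codebook, x)
            for d in range(dim):
                residual_sum[d] += y[d] - c[d]
        bias = [s / len(frames) for s in residual_sum]
    return bias


def remove_bias(frames, bias):
    """Subtract the estimated bias from every frame."""
    return [[yi - bi for yi, bi in zip(y, bias)] for y in frames]


# Toy data (hypothetical): two clean centroids and frames shifted by a
# constant channel bias of [1.0, -0.5].
codebook = [[0.0, 0.0], [4.0, 4.0]]
frames = [[1.0, -0.5], [5.0, 3.5], [1.0, -0.5], [5.0, 3.5]]

bias = estimate_bias(frames, codebook)
cleaned = remove_bias(frames, bias)
```

On this toy data the compensated frames fall back onto the codebook centroids. With real telephone speech the estimate depends on how well the codebook represents clean speech, and the loop above makes one nearest-codeword search per frame per iteration, which is the kind of cost the abstract says RRSBR aims to reduce.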
Appears in Collections: Thesis