Title: Traditional Chinese Parser and Language Modeling for Mandarin ASR
Authors: Lin, Ang-Hsing
Wang, Yih-Ru
Chen, Sin-Horng
Institute of Communication Studies
Keywords: Chinese word segmentation;Conditional random field;Language model;weighted finite state transducer;automatic speech recognition
Issue Date: 1-Jan-2013
Abstract: This paper proposes a new traditional Chinese parser for improving the language modeling of Mandarin speech recognition. The parser first applies a preprocessing step to correct some word segmentation inconsistencies in the text corpus. It then employs a conditional random field (CRF)-based word segmentation method and a CRF-based part-of-speech (POS) tagger to resegment the texts, generating better word strings for training an n-gram language model (LM) for ASR. Experimental results on the TCC-300 corpus show that the proposed method achieves a word error rate (WER) of 13.4%, a relative WER reduction of about 45% compared with the previous system.
URI: http://hdl.handle.net/11536/125033
ISSN: 
Journal: 2013 INTERNATIONAL CONFERENCE ORIENTAL COCOSDA HELD JOINTLY WITH 2013 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE)
Appears in Collections: Conferences Paper
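
Note on the reported metrics: the abstract gives a WER of 13.4% and a relative WER reduction of about 45%, but it does not state the WER of the previous system. The Python sketch below shows how WER is conventionally computed (word-level edit distance divided by reference length) and how a relative reduction relates two WER values. The function names and the toy token lists are illustrative assumptions, not taken from the paper, and the baseline figure it derives (roughly 24%) is only what the two reported numbers imply arithmetically, since "about 45%" is itself approximate.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    if not reference:
        raise ValueError("reference must not be empty")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


def relative_reduction(baseline_wer, new_wer):
    """Fraction of the baseline error removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


def implied_baseline(new_wer, rel_reduction):
    """Baseline WER implied by a new WER and a relative reduction."""
    return new_wer / (1.0 - rel_reduction)


# Toy example (hypothetical tokens): one substitution and one deletion
# against a four-word reference.
toy_wer = word_error_rate(["a", "b", "c", "d"], ["a", "x", "c"])

# Assumption: treat "about 45%" as exactly 0.45 to estimate the baseline.
baseline = implied_baseline(0.134, 0.45)
```

Under that assumption, `baseline` comes out near 0.244, so the previous system's WER would have been roughly 24%. The exact figure should be taken from the paper itself.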