Title: | Traditional Chinese Parser and Language Modeling for Mandarin ASR |
Authors: | Lin, Ang-Hsing; Wang, Yih-Ru; Chen, Sin-Horng; 傳播研究所 (Institute of Communication Studies) |
Keywords: | Chinese word segmentation; Conditional random field; Language model; Weighted finite state transducer; Automatic speech recognition |
Issue Date: | 1-Jan-2013 |
Abstract: | This paper proposes a new traditional Chinese parser for improving language modeling in Mandarin speech recognition. The parser first applies a preprocessing step to correct some word segmentation inconsistencies in the text corpus. It then employs a CRF-based word segmentation method and a CRF-based POS tagger to resegment the texts, generating better word strings for training an n-gram language model (LM) for ASR. Experimental results on the TCC-300 corpus show that the proposed method achieves a word error rate (WER) of 13.4%, a relative WER reduction of about 45% compared with the previous system. |
URI: | http://hdl.handle.net/11536/125033 |
Journal: | 2013 INTERNATIONAL CONFERENCE ORIENTAL COCOSDA HELD JOINTLY WITH 2013 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE) |
Appears in Collections: | Conferences Paper |
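
Note on the reported figures: the abstract quotes a WER of 13.4% and a relative WER reduction of about 45%, but this record does not state the WER of the previous system. The short Python sketch below only illustrates how the two quantities relate, using the standard definition of relative reduction. The function names are hypothetical, and the baseline it derives is what the two quoted numbers would imply, not a value reported in the record.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative reduction = (baseline - new) / baseline, as a fraction."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


def implied_baseline_wer(new_wer: float, relative_reduction: float) -> float:
    """Invert the definition above to recover the baseline WER."""
    if not 0 <= relative_reduction < 1:
        raise ValueError("relative_reduction must be in [0, 1)")
    return new_wer / (1.0 - relative_reduction)


if __name__ == "__main__":
    # Figures quoted in the abstract; "about 45%" is approximate,
    # so the derived baseline is only a rough estimate.
    new_wer = 13.4
    reduction = 0.45
    baseline = implied_baseline_wer(new_wer, reduction)
    print(f"implied baseline WER: {baseline:.1f}%")
    print(f"check: {relative_wer_reduction(baseline, new_wer):.2f}")
```

Because the reduction is quoted only approximately, the derived baseline should be read as an order-of-magnitude check rather than a reported result.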