Title: | A language model based on semantically clustered words in a Chinese character recognition system |
Authors: | Lee, HJ; Tung, CH; published under the NCTU name; Department of Computer Science, National Chiao Tung University |
Keywords: | contextual postprocessing; language model; semantics; word group |
Issue date: | 1-Aug-1997 |
Abstract: | This paper presents a new method for clustering the words in a dictionary into word groups. A Chinese character recognition system can then use these groups in a language model to improve the recognition accuracy. In the language model, the number of parameters that must be trained beforehand can be kept to a reasonable value. The Chinese synonym dictionary Tong2yi4ci2 ci2lin2, which provides the semantic features, is used to calculate the weights of the semantic attributes of the character-based word classes. The weights of the semantic attributes are then updated according to the words of the Behavior dictionary, which has a rather complete word set. Next, the word classes are clustered into m groups according to the semantic measurement by a greedy method. Finally, the words in the Behavior dictionary are assigned to the m groups. The parameter space for the bigram contextual information of the character recognition system is m^2. In the experiments, the recognition system with the proposed model performed better than one using a character-based bigram language model. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd. |
URI: | http://hdl.handle.net/11536/385 |
ISSN: | 0031-3203 |
Journal: | PATTERN RECOGNITION |
Volume: | 30 |
Issue: | 8 |
Start page: | 1339 |
End page: | 1346 |
Appears in collections: | Journal articles |
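
The abstract states that once every word is assigned to one of m groups, the bigram contextual information needs only m^2 parameters. The following is a minimal sketch of that idea, not the authors' implementation: the word-to-group mapping, the toy sentences, the add-one smoothing, and all function names are assumptions made for illustration. The paper's semantic clustering step is not reproduced here; the groups are simply given by hand.

```python
import math

def train_group_bigram(sentences, word_to_group, m):
    """Count group-to-group transitions in an m x m table (m^2 parameters)."""
    counts = [[0] * m for _ in range(m)]
    for words in sentences:
        groups = [word_to_group[w] for w in words]
        for prev, cur in zip(groups, groups[1:]):
            counts[prev][cur] += 1
    return counts

def bigram_prob(counts, prev, cur):
    """Add-one smoothed P(cur group | prev group); smoothing is an assumption."""
    m = len(counts)
    return (counts[prev][cur] + 1) / (sum(counts[prev]) + m)

def score(words, word_to_group, counts):
    """Log-probability of a word sequence under the group bigram model."""
    groups = [word_to_group[w] for w in words]
    return sum(math.log(bigram_prob(counts, p, c))
               for p, c in zip(groups, groups[1:]))

# Hypothetical hand-made groups: 0 = pronouns, 1 = verbs, 2 = nouns.
word_to_group = {"我": 0, "你": 0, "吃": 1, "喝": 1, "飯": 2, "茶": 2}
m = 3
sentences = [["我", "吃", "飯"], ["你", "喝", "茶"]]
counts = train_group_bigram(sentences, word_to_group, m)

# A candidate never seen as a word sequence still scores well because its
# group sequence (pronoun, verb, noun) was observed in training.
plausible = score(["我", "喝", "飯"], word_to_group, counts)
implausible = score(["飯", "喝", "我"], word_to_group, counts)
```

In a recognition postprocessor, scores of this kind could be used to rank competing candidate sequences; because the table is indexed by group rather than by character or word, its size depends only on m.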