Title: A language model based on semantically clustered words in a Chinese character recognition system
Authors: Lee, HJ
Tung, CH
Department of Computer Science
Keywords: contextual postprocessing; language model; semantics; word group
Issue Date: 1-Aug-1997
Abstract: This paper presents a new method for clustering the words in a dictionary into word groups. A Chinese character recognition system can then use these groups in a language model to improve recognition accuracy, while the number of parameters that must be trained beforehand is kept to a reasonable value. The Chinese synonym dictionary Tong2yi4ci2 ci2lin2, which provides the semantic features, is used to calculate the weights of the semantic attributes of the character-based word classes. These weights are then updated according to the words of the Behavior dictionary, which has a rather complete word set. Next, the word classes are clustered into m groups by a greedy method according to the semantic measurement. Finally, the words in the Behavior dictionary are assigned to the m groups. The parameter space for the bigram contextual information of the character recognition system is m^2. In the experimental results, the recognition system with the proposed model performs better than one using a character-based bigram language model. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.
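The m^2 parameter space mentioned in the abstract can be illustrated with a minimal sketch of a group-based bigram model. This is not the authors' implementation: the function names, the toy English vocabulary, the hand-made word-to-group assignment, and the add-alpha smoothing are all assumptions made for illustration, and the semantic clustering step itself is not reproduced here.

```python
import math

def train_group_bigram(sentences, word_to_group, m, alpha=1.0):
    """Estimate P(next group | previous group) over m word groups.

    Hypothetical helper: add-alpha smoothing is an assumption, not taken
    from the paper. The returned table has m * m entries.
    """
    counts = [[0.0] * m for _ in range(m)]
    for words in sentences:
        groups = [word_to_group[w] for w in words]
        for prev, nxt in zip(groups, groups[1:]):
            counts[prev][nxt] += 1
    probs = []
    for row in counts:
        total = sum(row) + alpha * m
        probs.append([(c + alpha) / total for c in row])
    return probs

def log_score(words, word_to_group, probs):
    """Log-probability of a word sequence under the group bigram table."""
    groups = [word_to_group[w] for w in words]
    return sum(math.log(probs[a][b]) for a, b in zip(groups, groups[1:]))

# Toy data (assumed): three groups standing in for the m semantic groups.
word_to_group = {"cat": 0, "dog": 0, "eats": 1, "runs": 1, "fish": 2}
m = 3
sentences = [["cat", "eats", "fish"], ["dog", "eats", "fish"], ["dog", "runs"]]

probs = train_group_bigram(sentences, word_to_group, m)
num_parameters = sum(len(row) for row in probs)  # m * m table entries

# "cat runs" never occurs in the toy corpus, yet it scores higher than
# "cat fish" because the group pair (0, 1) was observed for other words.
seen_via_groups = log_score(["cat", "runs"], word_to_group, probs)
unseen_groups = log_score(["cat", "fish"], word_to_group, probs)
```

The table size depends only on the number of groups m, not on the vocabulary size, which is why grouping keeps the number of trained parameters manageable.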
URI: http://dx.doi.org/10.1016/S0031-3203(96)00154-9
http://hdl.handle.net/11536/149566
ISSN: 0031-3203
DOI: 10.1016/S0031-3203(96)00154-9
Journal: PATTERN RECOGNITION
Volume: 30
Start page: 1339
End page: 1346
Appears in Collections: Articles