A language model based on semantically clustered words in a Chinese character recognition system

Title:	A language model based on semantically clustered words in a Chinese character recognition system
Authors:	Lee, HJ; Tung, CH; National Chiao Tung University, Department of Computer Science
Keywords:	contextual postprocessing;language model;semantics;word group
Issue Date:	1-Aug-1997
Abstract:	This paper presents a new method for clustering the words in a dictionary into word groups. A Chinese character recognition system can then use these groups in a language model to improve recognition accuracy, while the number of parameters that must be trained beforehand is kept to a reasonable value. The Chinese synonym dictionary Tong2yi4ci2 ci2lin2, which provides the semantic features, is used to calculate the weights of the semantic attributes of the character-based word classes. These weights are then updated according to the words of the Behavior dictionary, which has a rather complete word set. Next, the word classes are clustered into m groups according to the semantic measurement by a greedy method, and the words in the Behavior dictionary are finally assigned to the m groups. The parameter space for the bigram contextual information of the character recognition system is m^2. In the experiments, the recognition system with the proposed model performed better than one using a character-based bigram language model. (C) 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.
URI:	http://hdl.handle.net/11536/385
ISSN:	0031-3203
Journal:	PATTERN RECOGNITION
Volume:	30
Issue:	8
Start Page:	1339
End Page:	1346
Appears in Collections:	Journal Articles
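
The abstract describes clustering word classes into m groups with a greedy method, which gives a bigram parameter space of m^2. The following is a minimal sketch of that idea and not the authors' algorithm: the abstract does not state the semantic measurement or the merge rule, so cosine similarity between weight vectors and pairwise merging of the most similar groups are assumptions made here for illustration. The names `cosine`, `greedy_cluster`, and `bigram_parameter_count`, as well as the toy weight vectors, are hypothetical.

```python
from itertools import combinations
import math


def cosine(u, v):
    """Cosine similarity between two weight vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def greedy_cluster(class_weights, m):
    """Merge word classes greedily until only m groups remain.

    class_weights maps a word-class name to its vector of
    semantic-attribute weights (a hypothetical representation).
    """
    # Each group starts as a single word class: (member names, weight vector).
    groups = [([name], list(vec)) for name, vec in class_weights.items()]
    while len(groups) > m:
        # Greedy step: choose the currently most similar pair of groups.
        i, j = max(
            combinations(range(len(groups)), 2),
            key=lambda p: cosine(groups[p[0]][1], groups[p[1]][1]),
        )
        names = groups[i][0] + groups[j][0]
        vec = [a + b for a, b in zip(groups[i][1], groups[j][1])]
        # combinations() yields i < j, so delete j first to keep i valid.
        del groups[j]
        del groups[i]
        groups.append((names, vec))
    return [sorted(names) for names, _ in groups]


def bigram_parameter_count(m):
    """A bigram table over m groups has one entry per ordered pair."""
    return m * m


# Toy input: four word classes with three semantic attributes each.
toy_weights = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.9, 0.1, 0.0],
    "C": [0.0, 1.0, 0.0],
    "D": [0.0, 0.9, 0.1],
}
toy_groups = greedy_cluster(toy_weights, m=2)
toy_params = bigram_parameter_count(len(toy_groups))
```

The point of grouping is the size of the bigram table: a bigram model over V distinct characters needs V^2 parameters, whereas a model over m groups needs only m^2, which is what keeps the number of trained parameters manageable when m is much smaller than V.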

Files in This Item:

A1997XH88300010.pdf
