Title: 利用具有属性的图形表示法及编码比对方式作多字形与大小之印刷中文字辨认
Recognition of Multi-font and Multi-size Printed Chinese Characters Using Attributed Graph Representations and Code Matching
Authors: 陈建铭
Jian-Ming Chen
蔡文祥
Wen-Hsiang Tsai
Institute of Computer Science and Engineering
Keywords: multi-font, multi-size, printed Chinese character recognition, morphology
Issue Date: 1994
Abstract: Chinese printed materials increasingly mix several fonts
and character sizes on a single page. This thesis therefore
proposes a system for recognizing multi-font and multi-size
printed Chinese characters based on attributed graph
representations and a code matching method. First, a modified
dilation operation from mathematical morphology is proposed to
improve and transform the shapes of input Chinese characters
without destroying the structural information within the
characters. Next, a method is presented for constructing graph
representations from thinned character shapes, which includes
extracting feature points, tracing strokes, and removing noise;
the method used to build the character models is also described.
The adjacency matrices of the graph representations, together
with some attributes, are then encoded. Finally, a fast
classifier based on these codes is developed, consisting of
preclassification followed by detailed matching. Experimental
results demonstrate the feasibility and practicality of the
proposed approach, which achieved an average recognition rate of
96.2% over the five fonts tested. The proposed system also
recognizes similarly structured Chinese characters well, without
extra work.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT830394009
http://hdl.handle.net/11536/59028
Appears in Collections: Thesis