Title: Recognition of Multi-font and Multi-size Printed Chinese Characters Using Attributed Graph Representations and Code Matching (利用具有屬性的圖形表示法及編碼比對方式作多字形與大小之印刷中文字辨認)
Authors: Jian-Ming Chen (陳建銘)
Wen-Hsiang Tsai (蔡文祥)
Institute of Computer Science and Engineering (資訊科學與工程研究所)
Keywords: multi-font, multi-size, printed Chinese character recognition, morphology
Publication date: 1994
Abstract: As printed information grows more diverse, more and more Chinese printed material varies its fonts and character sizes, often within a single page. This thesis therefore proposes a system that recognizes multi-font, multi-size printed Chinese characters using attributed graph representations and a code matching method. First, a modified dilation operation from mathematical morphology is proposed to improve the quality of the input character shapes without destroying the structural information within the characters. Next, a method is proposed for constructing graph representations from thinned character shapes, which includes extracting feature points, tracing strokes, and removing noise; the method used to build the character models is also described. The adjacency matrices of the graph representations are then encoded together with some attributes. Finally, a fast classifier is built on these codes, consisting of a preclassification (coarse) stage and a detailed matching (fine) stage. Experimental results demonstrate the feasibility and practicality of the approach: it achieved an average recognition rate of 96.2% over the five fonts tested, and it also recognizes similarly structured Chinese characters well without extra work.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT830394009
http://hdl.handle.net/11536/59028
Appears in Collections: Thesis
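
Illustrative sketch: the abstract outlines a pipeline of feature-point extraction on a thinned shape, encoding of an adjacency matrix with attributes, and two-stage matching on the codes. The Python below is a minimal sketch of that general idea and is not the thesis's algorithm, which this record does not describe in detail. Everything in it is an assumption made for illustration: the crossing-number test for feature points, the fixed node ordering, the integer attributes, the (node count, edge count) coarse key, the Hamming distance used for fine matching, and all function names.

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[int]]  # 1 = skeleton pixel, 0 = background

# 8-neighbour offsets in clockwise order, starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def feature_points(skel: Grid) -> Dict[Tuple[int, int], str]:
    """Label skeleton pixels by crossing number: 1 = end, 3 or more = junction."""
    h, w = len(skel), len(skel[0])

    def at(r: int, c: int) -> int:
        return skel[r][c] if 0 <= r < h and 0 <= c < w else 0

    points = {}
    for r in range(h):
        for c in range(w):
            if not skel[r][c]:
                continue
            ring = [at(r + dr, c + dc) for dr, dc in RING]
            # Count 0 -> 1 transitions while walking once around the pixel.
            cn = sum(1 for i in range(8) if ring[i] == 0 and ring[(i + 1) % 8] == 1)
            if cn == 1:
                points[(r, c)] = "end"
            elif cn >= 3:
                points[(r, c)] = "junction"
    return points


def encode(adj: List[List[int]], attrs: List[int]) -> Tuple[int, ...]:
    """Flatten the upper triangle of a symmetric adjacency matrix, then append
    one attribute per node (assumed: nodes are already in a fixed order)."""
    n = len(adj)
    bits = [adj[i][j] for i in range(n) for j in range(i + 1, n)]
    return tuple(bits) + tuple(attrs)


def coarse_key(adj: List[List[int]]) -> Tuple[int, int]:
    """Hypothetical preclassification key: (node count, edge count)."""
    n = len(adj)
    return n, sum(adj[i][j] for i in range(n) for j in range(i + 1, n))


def classify(adj, attrs, models) -> Optional[str]:
    """models maps label -> (adj, attrs). Coarse stage keeps models with the
    same key; fine stage picks the smallest Hamming distance between codes."""
    key, code = coarse_key(adj), encode(adj, attrs)
    best, best_d = None, None
    for label, (m_adj, m_attrs) in models.items():
        if coarse_key(m_adj) != key:
            continue  # rejected by preclassification
        d = sum(a != b for a, b in zip(code, encode(m_adj, m_attrs)))
        if best_d is None or d < best_d:
            best, best_d = label, d
    return best
```

The coarse key is deliberately cheap so that most models are rejected before any full code comparison; a real system would need a key that tolerates the noise and font variation the abstract mentions, which this sketch does not attempt.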