Title: 線上手寫中文字以空間關係及結構為基礎之表示法及辨識法之研究
Spatial Relationship and Structure-based Representation and Recognition of On-line Chinese Characters
Authors: 陳如薇
Ju-Wei Chen
李素瑛
Suh-Yin Lee
資訊科學與工程研究所 (Institute of Computer Science and Engineering)
Keywords: 線上手寫中文字辨認; 筆畫空間關係; 結構辨認; 乏晰表示法; 階層表示法; OLCCR; Structural recognition; Stroke spatial relationship; Fuzzy attribute; Hierarchical repre.
Issue Date: 1994
Abstract: On-line handwriting input is an intelligent human-computer interface that is especially suitable for ideographic Chinese characters. Chinese characters are composed of basic strokes arranged according to character-specific structural configurations. This dissertation studies the representation and recognition of on-line handwritten Chinese characters through structural analysis. The feasibility of representing Chinese characters with 2D strings is investigated first. Experiments show that the projection relationships of strokes on the horizontal and vertical axes alone are not enough for recognition, because handwritten stroke positions are unstable and many characters are similar. The attributes of stroke spatial relationships are therefore analyzed further, and a stroke spatial relationship representation (SSRR) is proposed that supports recognition without constraints on stroke order; its effectiveness is verified and analyzed experimentally. Because handwritten scripts exhibit shape deformations, the value of a structural attribute is a fuzzy quantity rather than a binary one. A fuzzy attribute representation (FAR) is therefore proposed for describing the structural knowledge of handwritten Chinese characters. Using line segments as primitives together with FAR, each character is described by a fuzzy attribute graph, so recognition becomes a problem of matching fuzzy attribute graphs, with no constraints on either stroke number or stroke order.
Personal digital assistants (PDAs) are becoming widespread, and their human-computer interface software must be small in both code and data size. To meet this requirement, a rule-based approach to on-line Chinese character recognition (OLCCR) is proposed, in which stroke correspondence during matching is established by rules. Finding the stroke correspondence between an input script and a template character takes O(n) time in the best case and O(n log n) in the worst case, where n is the number of strokes in the template, so the approach is suitable for large character sets. The reference database is organized hierarchically by character components (radicals) and character structures, which reduces storage to about 1/4 of that required without the hierarchical representation, a saving of about three quarters. The proposed methods are verified experimentally.
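The idea that a structural attribute is a fuzzy quantity rather than a binary one can be illustrated with a minimal sketch. This is not the FAR definition used in the dissertation, which the abstract does not spell out; the four direction classes, the triangular membership function, the similarity measure, and the names `direction_memberships` and `fuzzy_similarity` are all assumptions made for illustration.

```python
import math

# Hypothetical prototype orientations (degrees) for four stroke-direction classes.
PROTOTYPES = {"horizontal": 0.0, "diag_45": 45.0, "vertical": 90.0, "diag_135": 135.0}


def direction_memberships(dx, dy):
    """Return a fuzzy membership grade in [0, 1] for each direction class.

    (dx, dy) is the displacement of one line segment. Orientation is treated
    as undirected, so a segment drawn right-to-left is still "horizontal".
    """
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    grades = {}
    for name, proto in PROTOTYPES.items():
        diff = abs(angle - proto)
        diff = min(diff, 180.0 - diff)  # orientations wrap around at 180 degrees
        # Assumed triangular membership: 1 at the prototype, 0 at 45 degrees away.
        grades[name] = max(0.0, 1.0 - diff / 45.0)
    return grades


def fuzzy_similarity(a, b):
    """Compare two membership dicts: sum of minima over sum of maxima."""
    keys = a.keys() | b.keys()
    num = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    den = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return num / den if den else 1.0


if __name__ == "__main__":
    template = direction_memberships(1, 0)    # an ideal horizontal segment
    written = direction_memberships(10, 1)    # a slightly tilted handwritten one
    print(written)
    print(fuzzy_similarity(template, written))
```

With a binary attribute, the tilted segment would either count as horizontal or not; with graded memberships it matches the horizontal template strongly but not perfectly, which is the kind of tolerance to shape deformation the abstract describes.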
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT830392002
http://hdl.handle.net/11536/58920
Appears in Collections: Thesis