Title: 利用筆劃階層式結構關係進行手寫中文字辨認
Handwritten Chinese character recognition based on structural relations of character strokes
Authors: 鄭瑞□
Rei-Heng Cheng
陳 稔
Zen Chen
Institute of Computer Science and Engineering
Keywords: stroke substructure; handwritten Chinese character; radical recognition; character preclassification
Issue Date: 1994
Abstract: This dissertation addresses handwritten Chinese character recognition. Working from strokes and their hierarchical structural relations, and drawing on knowledge built in advance, it aims to reduce the complexity of the recognition problem. Because a Chinese character can be decomposed into radicals, and there are far fewer radicals than characters, character recognition can be reduced to identifying the radicals in a character and the spatial relations among them.
In the first part, we propose a problem reduction technique for radical recognition. Top-down, recognizing a radical is reduced to recognizing the stable stroke substructure(s) it contains, and recognizing a stroke substructure is in turn reduced to identifying a salient stroke within it. The actual recognition process runs in the reverse order, bottom-up: starting from a salient stroke, a knowledge-guided search follows stable relations to find the related strokes and assemble them into stroke substructures, and radical knowledge then guides the search for candidate radicals. Each subproblem deals with a simpler and more stable structure than its parent problem, so it admits an easier and more reliable solution. Because the method does not rely on gaps between strokes to segment radicals, how densely the strokes are written does not affect the accuracy of radical recognition.
In the second part, we propose a preclassification technique for handwritten Chinese characters. It uses the stable stroke substructures contained in a character, together with their spatial relations, as features to quickly identify the classes to which an input character may belong, which narrows the set of candidate characters and lowers the complexity of recognition. These features give good classification results. Under the radical recognition strategy above, the same stroke substructures can later be expanded into radicals under the guidance of the knowledge base, so the preclassification stage should add no extra overhead.
Handwriting variation can produce incorrect stroke features: two strokes may intersect or connect when they should not, strokes that should intersect may fail to do so, and an excessively curved stroke may be mistakenly detected as a feature. Such errors alter some of the substructures in a character and cause stroke substructures and radicals to be misrecognized. In the third part, we propose a graph-based approach to deal with these variations. Knowledge about stroke substructures and radicals is represented as graphs, and subgraphs of a character are matched against the graphs of predefined stroke substructures and radicals, tolerating some erroneous features, to find all possible stroke substructures and radicals for each stroke cluster in the character. As a result, the correct character class can still be obtained when a few simple substructures are missing from or added to a character.
Examples are included to illustrate the main ideas. The performance of the proposed methods is also evaluated and compared with that of other existing methods.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT830392008
http://hdl.handle.net/11536/58927
Appears in Collections: Thesis