Title: 線上中文字的拆解與字根辨識
On-line Chinese Character Decomposition and Radical Recognition
Author: 胡文傑
Hu, Wen-Jia
劉振漢
Liu Jenn-Hann
Institute of Computer Science and Engineering
Keywords: Radical Recognition; Radical Decomposition; Chinese Recognition; On-Line Hand-Written; Radical Set
Issue Date: 1995
Abstract: The goal of this thesis is to develop a scheme for recognizing the radicals of Chinese characters entered with a pen. Many researchers share the conviction that radicals are the key to the problems associated with Chinese characters. We found that if a character can be split 'correctly' into sub-characters, it is relatively easy to determine which radical each sub-character is. We developed three methods to split characters into sub-characters, and sub-characters in turn into smaller sub-characters. The first point of every stroke is a candidate cut point. We try to cut a character (or sub-character) with a vertical or a horizontal line, and we score every candidate using a number of global and local weighting rules. The candidate with the highest score is selected as the cut point: all strokes preceding it form one sub-character, and all strokes following it form the other. We tested the scheme on two sets of handwritten characters, each containing 5401 characters. One set is more cursive and the other is neater. The radical recognition rates are 67.83% and 89.10%, respectively.
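The stroke-order split described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the names `extent`, `gap_score`, and `best_split` are hypothetical, and the score used here (the gap between the bounding extents of the two stroke groups) is only a placeholder for the global and local weighting rules, which the abstract does not spell out.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # (x, y) pen sample
Stroke = List[Point]          # one pen-down to pen-up trace


def extent(strokes: List[Stroke], axis: int) -> Tuple[float, float]:
    """Return (min, max) of all points along axis 0 (x) or axis 1 (y)."""
    values = [p[axis] for s in strokes for p in s]
    return min(values), max(values)


def gap_score(before: List[Stroke], after: List[Stroke], axis: int) -> float:
    """Placeholder score (an assumption, not the thesis's rules).

    Positive when a straight line perpendicular to `axis` separates the
    two stroke groups; larger gaps score higher.
    """
    b_lo, b_hi = extent(before, axis)
    a_lo, a_hi = extent(after, axis)
    return max(a_lo - b_hi, b_lo - a_hi)


def best_split(strokes: List[Stroke]) -> Tuple[float, int, str]:
    """Try every stroke start as a cut point and keep the best-scoring one.

    Returns (score, k, direction): strokes[:k] form one sub-character and
    strokes[k:] form the other.
    """
    if len(strokes) < 2:
        raise ValueError("need at least two strokes to split")
    best = None
    for k in range(1, len(strokes)):
        before, after = strokes[:k], strokes[k:]
        # axis 0 separates left/right (vertical cut line),
        # axis 1 separates top/bottom (horizontal cut line).
        for axis, direction in ((0, "vertical"), (1, "horizontal")):
            score = gap_score(before, after, axis)
            if best is None or score > best[0]:
                best = (score, k, direction)
    return best
```

Applying `best_split` recursively to each resulting group would give the smaller sub-characters the abstract mentions; a stopping rule (for example, a minimum score) would also be needed and is left out of this sketch.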
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT840392015
http://hdl.handle.net/11536/60356
Appears in Collections: Thesis