Full metadata record
DC field | Value | Language
dc.contributor.author | 高國忠 | en_US
dc.contributor.author | Kuo-Chung Kao | en_US
dc.contributor.author | 李錫堅 | en_US
dc.contributor.author | Hsi-Jian Lee | en_US
dc.date.accessioned | 2014-12-12T02:20:22Z | -
dc.date.available | 2014-12-12T02:20:22Z | -
dc.date.issued | 1998 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#NT870392086 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/64111 | -
dc.description.abstract | 在這篇論文裡我們提出兩種方法來處理有關於在已知表格中與線重疊之文字辨識,這些受干擾的文字(interfered-characters)不能正確的被抽取出來,而且對於傳統的光學文字辨識(optical character recognition, OCR)核心會導致辨識錯誤。第一種方法是移除表格線然後重建文字:當線移除後,文字就會破碎而且筆劃會分成兩群筆劃端(stroke-ends),我們利用筆劃端的共線性(colinearity)和位置來找出正確的連接對應,同時,破碎筆劃之間的縫隙填補能儘量重建成原來的文字。第二種方法是修改OCR來辨識受干擾的文字:我們依據投影資訊來均勻分割含有表格線的印刷字體,我們找出表格線在受干擾字的位置並且計算表格線的兩種特徵值:CDFs(contour-direction features)和CCFs(crossing-count features),所以我們根據表格線的特徵值修正OCR的特徵值來辨識這些受干擾的文字。 在第一個實驗中,我們先用938個受干擾的手寫字做測試,辨識率為23.7%,經過使用第一種方法之後,辨識率提升到78.3%。在第二個實驗中,我們測試了695個與線重疊的印刷字,辨識率為64.3%,當使用第二種方法之後,辨識率增加到77.3%。 | zh_TW
dc.description.abstract | This thesis presents two methods for recognizing characters that overlap with lines in known forms. Such interfered characters cannot be extracted cleanly from text lines, and traditional optical character recognition (OCR) engines fail to recognize them. The first method removes the form lines and reconstructs the characters. Removing a line breaks the characters it crosses and separates their strokes into two sets of stroke-ends. The collinearity and positions of the stroke-ends are used to find the correct connecting correspondences, and the gaps between broken strokes are then filled to reconstruct the original characters. The second method modifies the OCR model to fit interfered characters. Printed characters with form lines are uniformly segmented according to projection profiles. The locations of form lines within the interfered characters are extracted, and both the contour-direction features (CDFs) and crossing-count features (CCFs) of the form lines are calculated. The trained features of the OCR engine are then modified by the form-line features to match the interfered characters. In the first experiment, 938 handwritten characters overlapping form lines were tested; the recognition rate was 23.7% before and 78.3% after applying the first method. In the second experiment, 695 printed characters overlapping form lines were tested; the recognition rate rose from 64.3% to 77.3% after applying the second method. | en_US
dc.language.iso | en_US | en_US
dc.subject | 受干擾的字 | zh_TW
dc.subject | 中文辨識 | zh_TW
dc.subject | 文字重建 | zh_TW
dc.subject | interfered-characters | en_US
dc.subject | chinese recognition | en_US
dc.subject | characters reconstruction | en_US
dc.title | 已知表格中與線重疊之中文字辨識 | zh_TW
dc.title | Recognition of Chinese Characters with Overlapped Lines in Known Forms | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 | zh_TW
Appears in Collections: Thesis