Full metadata record
DC Field | Value | Language
dc.contributor.author | 吳志宏 | en_US
dc.contributor.author | Wu, Chih-Hung | en_US
dc.contributor.author | 李錫堅 | en_US
dc.contributor.author | Hsi-Jian Lee | en_US
dc.date.accessioned | 2014-12-12T02:17:13Z | -
dc.date.available | 2014-12-12T02:17:13Z | -
dc.date.issued | 1996 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#NT850392012 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/61759 | -
dc.description.abstract | This thesis presents a method for segmenting handwritten Chinese characters in form documents with known structure. The input form image is first preprocessed: binarization using the method proposed by Niblack, noise removal, and extraction of the text line in each field. Projection profile analysis then divides each text-line image into individual sub-images. The resulting projection blocks are classified into four types: "mark", "half-word", "one-word", and "two-word". Large projection blocks are split into two or more blocks whose heights are close to the average character height of the text line, and small projection blocks are then merged according to rules based on common handwriting habits. To reduce errors caused by the ambiguity between Chinese numeric characters and components of other Chinese characters, each "half-word" block is sent to a statistical character recognition (OCR) module, and the recognition result determines whether the block should be merged with other projection blocks. The system also lets users manually correct the segmentation results online. The test document images contain 1319 Chinese characters in total. The correct segmentation rate is 91.76% without OCR and rises to 92.34% with the OCR system. | zh_TW
dc.language.iso | zh_TW | en_US
dc.subject | 手寫中文字切割 | zh_TW
dc.subject | 去雜訊 | zh_TW
dc.subject | 投影 | zh_TW
dc.subject | 切字程序修改 | zh_TW
dc.subject | Handwritten Chinese Character Segmentation | en_US
dc.subject | Noise Removal | en_US
dc.subject | Projection | en_US
dc.subject | Character Segmentation Revision | en_US
dc.title | 公文表格之手寫中文字切割 | zh_TW
dc.title | Chinese Handwritten Character Segmentation in Form Documents | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 (Institute of Computer Science and Engineering) | zh_TW
Appears in collections: Theses
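
The projection-based steps named in the abstract (projection profile analysis, the four block types, and splitting of large blocks) can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes a vertically written text line stored as a binary image (a list of rows, 1 = ink), an average character height that is already known, and height-ratio thresholds (0.25, 0.75, 1.5) chosen purely for illustration. All function names are hypothetical, and the merging rules and OCR check are not modeled.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (start_row, end_row_exclusive)


def projection_profile(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each row of a binary text-line image."""
    return [sum(row) for row in image]


def find_blocks(profile: List[int]) -> List[Block]:
    """Return maximal runs of consecutive rows whose projection is non-zero."""
    blocks: List[Block] = []
    start = None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                      # a new block begins
        elif count == 0 and start is not None:
            blocks.append((start, y))      # a blank row closes the block
            start = None
    if start is not None:                  # block that runs to the last row
        blocks.append((start, len(profile)))
    return blocks


def classify(block: Block, avg_height: float) -> str:
    """Label a block by its height relative to the average character height.

    The ratio thresholds are illustrative assumptions, not thesis values.
    """
    ratio = (block[1] - block[0]) / avg_height
    if ratio < 0.25:
        return "mark"
    if ratio < 0.75:
        return "half-word"
    if ratio < 1.5:
        return "one-word"
    return "two-word"


def split_block(block: Block, avg_height: float) -> List[Block]:
    """Cut an oversized block into equal pieces close to the average height."""
    start, end = block
    pieces = max(1, round((end - start) / avg_height))
    step = (end - start) / pieces
    cuts = [start + round(i * step) for i in range(pieces)] + [end]
    return list(zip(cuts[:-1], cuts[1:]))
```

A caller would chain these as `find_blocks(projection_profile(image))`, classify each block, and pass any "two-word" block to `split_block`. Equal-height splitting is the simplest possible choice here. A more careful version would presumably place each cut at a local minimum of the projection profile near the expected boundary.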