Title: 病歷表影像的分割與內容元件的分類
Form Segmentation and Component Classification for Clinic Data Image Analysis
Authors: 曾詠超
Tseng, Yung-Chao
蔡文祥
Wen-Hsiang Tsai
Institute of Computer Science and Engineering
Keywords: clinic data; form; segmentation; classification; broken stroke
Issue Date: 1997
Abstract: An integrated approach to form segmentation and component classification for clinic data image analysis is proposed. First, a modified Hough algorithm is proposed to detect continuous line segments in clinic data images in which the form lines are disturbed by text noise. Based on this algorithm, a top-down recursive form frame extraction algorithm is proposed, by which the form frames in a clinic data image are extracted structurally and arranged in a tree structure. After form frame extraction, several restoration masks are proposed to restore the strokes broken by frame extraction. The form contents are then segmented by finding connected components and applying several merging techniques. Finally, several features that distinguish preprinted components from filled-in handwritten components are proposed and used with a back-propagation neural network to classify the content components. Experimental results demonstrate the feasibility of the proposed approach.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT860394051
http://hdl.handle.net/11536/62881
Appears in Collections: Thesis
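
The abstract's first step, detecting form frame lines with a Hough-style algorithm, can be illustrated with a minimal sketch. This is the standard Hough voting scheme restricted to horizontal and vertical lines, not the modified algorithm of the thesis, whose details are not given in this record. The function names `hough_lines` and `make_demo_image`, the binary list-of-lists image format, and the `min_votes` threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def hough_lines(image, thetas_deg=(0, 90), min_votes=5):
    """Standard Hough voting over a binary image (1 = ink, 0 = background).

    Each ink pixel (x, y) votes for the line rho = x*cos(theta) + y*sin(theta)
    at every candidate angle. theta = 0 gives vertical lines (rho is the column)
    and theta = 90 gives horizontal lines (rho is the row).
    Returns (rho, theta_deg, votes) tuples, strongest first.
    """
    votes = Counter()
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            for theta in thetas_deg:
                t = math.radians(theta)
                rho = round(x * math.cos(t) + y * math.sin(t))
                votes[(rho, theta)] += 1
    lines = [(rho, theta, n) for (rho, theta), n in votes.items() if n >= min_votes]
    return sorted(lines, key=lambda item: -item[2])


def make_demo_image():
    """Hypothetical 8x6 test image: one horizontal line with a gap,
    one vertical line, and a single noise pixel."""
    image = [[0] * 8 for _ in range(6)]
    for x in range(8):
        if x != 3:              # one-pixel gap in the horizontal line
            image[2][x] = 1
    for y in range(6):
        image[y][5] = 1         # vertical line at column 5
    image[4][1] = 1             # isolated noise pixel
    return image


if __name__ == "__main__":
    for rho, theta, n in hough_lines(make_demo_image()):
        print(f"rho={rho} theta={theta} votes={n}")
```

Because every ink pixel votes independently, a line with a small gap still accumulates enough votes to be detected, which is the property that makes Hough voting a natural starting point for frame lines in noisy forms.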