Title: 病历表影像的分割与内容元件的分类
Form Segmentation and Component Classification for Clinic Data Image Analysis
Authors: 曾咏超
Tseng, Yung-Chao
蔡文祥
Wen-Hsiang Tsai
Institute of Computer Science and Engineering
Keywords: clinic data; form; segmentation; classification; broken stroke
Issue Date: 1997
Abstract: This thesis proposes an integrated approach to form
segmentation and component classification for clinic data image
analysis. First, a modified Hough algorithm is proposed to detect
continuous line segments in clinic data images in which the lines
are disturbed by text noise. Based on this algorithm, a top-down
recursive form frame extraction algorithm is proposed, which
extracts the form frames structurally and arranges them in a tree
structure. After form frame extraction, several restoration masks
are proposed to repair the broken strokes caused by frame
extraction. The content components of the image are then segmented
by finding connected components and applying several merging
techniques. Finally, several features that distinguish preprinted
components from filled-in handwritten components are proposed, and
a back-propagation neural network is used to classify the content
components. Experimental results demonstrate the feasibility of
the proposed approach.
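
The component extraction step named in the abstract can be illustrated with a generic connected-component search. The sketch below is not the thesis's method: it assumes a binary image given as a list of rows of 0/1 values, uses plain 8-connectivity, and returns one bounding box per component. The function name `connected_components` and the box layout are hypothetical choices made for illustration, and the merging techniques mentioned in the abstract are not reproduced.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the 8-connected
    foreground regions of a binary image (list of rows of 0/1 values).

    Illustrative only; this is an assumed, generic labeling routine."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                # Grow the bounding box to include the current pixel.
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                # Visit all 8 neighbours (the centre is already marked seen).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Boxes are returned in raster-scan order of each component's first pixel. A later merging pass, such as joining boxes that lie close together on the same text line, would operate on this list.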
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT860394051
http://hdl.handle.net/11536/62881
Appears in Collections: Thesis