Title: | Form Segmentation and Component Classification for Clinic Data Image Analysis (病历表影像的分割与内容元件的分类) |
Author: | Tseng, Yung-Chao (曾咏超); Tsai, Wen-Hsiang (蔡文祥); Institute of Computer Science and Engineering |
Keywords: | clinic data; form; segmentation; classification; broken stroke |
Issue Date: | 1997 |
Abstract: | This thesis proposes an integrated approach to form segmentation and component classification for clinic data image analysis. First, a modified Hough algorithm is proposed that detects continuous line segments in clinic data form images even when text noise interferes with them. Based on this algorithm, a top-down recursive form frame extraction algorithm is proposed, which extracts the form frame lines structurally and arranges them in a tree structure. After frame extraction, several restoration masks are proposed to repair the broken strokes that frame line extraction causes. The content components of the image are then segmented by finding connected components and applying several merging techniques. Finally, several features that distinguish preprinted components from filled-in handwritten components are proposed, and a back-propagation neural network uses them to classify the content components. Experimental results demonstrate the feasibility of the proposed approach. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT860394051 http://hdl.handle.net/11536/62881 |
Appears in Collections: | Thesis |