Title: 表格文件自動描述與分類系統
Automatic Form Description and Classification System
Authors: 黃煒生
Wei-Sheng Huang
陳稔
Dr. Zen Chen
Keywords: 文件處理; 結構辨認; 表格描述; 表格分類; document processing; structure recognition; form description; form classification
Publication date: 1993
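Abstract: Form classification is one of the important steps in form document processing. The task of form classification comprises two ...
... the type of the form. This work proposes a form structure code to serve as the main data for building a form database ...
... is reconstructible. This work also takes into account problems that arise in
practical applications, such as unstable line labeling and unstable line merging, so the form structure code of the proposed method
is very robust. This is of great help to form recognition. Finally, this work ...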
Form classification is an important task in the field of form
processing. The proposed method is based on primitive
line features and can describe both simple and complex forms.
There are three components in the form description: a structure
code, a normalized horizontal-line vertical distance vector,
and a normalized vertical-line horizontal distance vector. Here
the structure code is a string code that is a robust feature. The
proposed form classification process is based on the structure
code and is very simple. On the other hand, the normalized
horizontal-line vertical distance vector and the normalized
vertical-line horizontal distance vector provide a
similarity measure between forms. It can be shown that
the original form can be reconstructed up to a scale factor
from its form description. Experimental results are given to
show the feasibility of the proposed method.
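
The three-part form description can be illustrated with a minimal sketch. The abstract does not define how the structure code is built or how the distance vectors are normalized, so everything below is an assumption made for illustration: the structure code is treated as an opaque string, each distance vector is taken to be the gaps between consecutive lines divided by the total span, and the similarity measure is a plain Euclidean distance. The names `FormDescription`, `normalized_gaps`, `describe`, `classify`, and `distance` are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class FormDescription:
    structure_code: str  # opaque string; its encoding is not given in the abstract
    h_gaps: tuple        # normalized vertical distances between horizontal lines
    v_gaps: tuple        # normalized horizontal distances between vertical lines


def normalized_gaps(positions):
    """Gaps between consecutive sorted positions, divided by the total span.

    Assumed normalization: dividing by the span makes the vector scale-free.
    """
    pts = sorted(positions)
    if len(pts) < 2 or pts[-1] == pts[0]:
        return ()
    span = pts[-1] - pts[0]
    return tuple((b - a) / span for a, b in zip(pts, pts[1:]))


def describe(structure_code, h_line_ys, v_line_xs):
    """Bundle the three components into one description."""
    return FormDescription(
        structure_code, normalized_gaps(h_line_ys), normalized_gaps(v_line_xs)
    )


def classify(query, database):
    """Return the names of stored forms whose structure code matches exactly."""
    return [
        name
        for name, desc in database.items()
        if desc.structure_code == query.structure_code
    ]


def distance(a, b):
    """Euclidean distance over both gap vectors (assumed similarity measure)."""
    if len(a.h_gaps) != len(b.h_gaps) or len(a.v_gaps) != len(b.v_gaps):
        return float("inf")
    pairs = list(zip(a.h_gaps, b.h_gaps)) + list(zip(a.v_gaps, b.v_gaps))
    return sqrt(sum((x - y) ** 2 for x, y in pairs))


# "CODE-A" and "CODE-B" are placeholder structure codes, not real encodings.
database = {
    "form_a": describe("CODE-A", [0, 50, 100], [0, 30, 120]),
    "form_b": describe("CODE-B", [0, 20, 100], [0, 60, 120]),
}
# The same layout as form_a, scanned at twice the size.
query = describe("CODE-A", [0, 100, 200], [0, 60, 240])
matches = classify(query, database)
```

Under these assumptions, a form scanned at a different size yields the same description, which mirrors the statement that the original form can be reconstructed only up to a scale factor.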