Title: Automatic Form Description and Classification System (表格文件自動描述與分類系統)
Authors: 黃煒生
Wei-Sheng Huang
陳稔
Dr. Zen Chen
Institute of Computer Science and Engineering
Keywords: document processing; structure recognition; form description; form classification
Issue date: 1993
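Abstract: Form classification is one of the important steps in form document processing. It consists of two stages: (1) building the form database, which includes entering the forms; and (2) matching forms in order to distinguish their types. This thesis proposes a form structure code as the main data structure for building the form database. It also defines a normalized horizontal-line vertical-distance vector and a normalized vertical-line horizontal-distance vector, which supplement the structure code and are used to compute the similarity between forms. The proposed form description is reconstructible. The thesis also considers problems that arise in practical applications, such as unstable line labeling and unstable line merging, so the resulting structure code is robust, which greatly helps form recognition. Finally, experiments verify that the proposed method outperforms other approaches in storage space, the range of form types it can represent, recognition speed, reconstructibility, and the stability of the description.
Form classification is an important task in the field of form processing. The proposed method is based on primitive line features and can describe both simple and complex forms. The form description has three components: a structure code, a normalized horizontal-line vertical-distance vector, and a normalized vertical-line horizontal-distance vector. The structure code is a string code and is a robust feature. The proposed form classification process is based on the structure code and is very simple, while the two normalized distance vectors provide a similarity measure between forms. It can be shown that the original form can be reconstructed, up to a scale factor, from its form description. Experimental results are given to show the feasibility of the proposed method.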
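The abstract does not give the exact definition of the normalized distance vectors or of the similarity measure, so the following is only a minimal sketch of one plausible reading. It assumes that a vector holds the gaps between consecutive parallel lines divided by the total span, and that similarity is derived from the L1 distance between two such vectors. The function names `normalized_distance_vector` and `vector_similarity` are hypothetical and do not come from the thesis.

```python
from typing import List, Sequence


def normalized_distance_vector(positions: Sequence[float]) -> List[float]:
    """Gaps between consecutive parallel lines, divided by the total span.

    `positions` holds the y-coordinates of horizontal lines (or the
    x-coordinates of vertical lines). This normalization is an assumption;
    the thesis may define the vector differently.
    """
    pts = sorted(positions)
    if len(pts) < 2:
        return []
    span = pts[-1] - pts[0]
    if span <= 0:
        raise ValueError("all lines coincide; the span is zero")
    return [(b - a) / span for a, b in zip(pts, pts[1:])]


def vector_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Similarity in [0, 1] based on the L1 distance (assumed measure).

    Each vector is non-negative and sums to 1, so the L1 distance is at
    most 2. Vectors of different lengths are treated as dissimilar.
    """
    if len(u) != len(v):
        return 0.0
    if not u:
        return 1.0
    l1 = sum(abs(a - b) for a, b in zip(u, v))
    return 1.0 - l1 / 2.0


# The same form scanned at two resolutions gives the same vector,
# which is consistent with reconstruction "up to a scale factor".
small = normalized_distance_vector([10, 30, 50, 90])
large = normalized_distance_vector([20, 60, 100, 180])
print(small, vector_similarity(small, large))
```

Under this reading, dividing by the span is what makes the description independent of scanning resolution, while the line ordering and relative spacing are preserved.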
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT820392001
http://hdl.handle.net/11536/57803
Appears in Collections: Thesis