Title: Text extraction from complex document images using the multi-plane segmentation technique
Authors: Chen, Yen-Lin
Wu, Bing-Fei
電控工程研究所
Institute of Electrical and Control Engineering
Publication date: 2006
Abstract: This study presents a new method for extracting characters from real-life complex document images. The proposed method applies a multi-plane segmentation technique to separate homogeneous objects, including text blocks, non-text graphical objects, and background textures, into individual object planes. The technique consists of two stages: automatic localized multilevel thresholding, and multi-plane region matching and assembling. A text extraction process can then be performed on the resultant planes to detect and extract characters with different characteristics in their respective planes. The proposed method processes document images regionally and adaptively according to their local features. This preserves the detailed characteristics of extracted characters, especially small characters with thin strokes, as well as gradational illumination of characters. It also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled effectively. Experimental results on real-life complex document images demonstrate that the proposed method is effective in extracting characters with various illuminations, sizes, and font styles from various types of complex document images.
URI: http://hdl.handle.net/11536/17235
http://dx.doi.org/10.1109/ICSMC.2006.384668
ISBN: 978-1-4244-0099-7
ISSN: 1062-922X
DOI: 10.1109/ICSMC.2006.384668
Published in: 2006 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-6, PROCEEDINGS
Start page: 3540
End page: 3547
Appears in collections: Conference papers


Files in this item:

  1. 000248078503145.pdf
