Title: | Multi-layer segmentation of complex document images |
Authors: | Wu, BF; Chen, YL; Chiu, CC (Institute of Electrical and Control Engineering) |
Keywords: | complex compound document;image segmentation;document analysis;text extraction |
Issue Date: | 1-Dec-2005 |
Abstract: | Text is often printed on a complex background, and segmenting it is an important part of document analysis. Several methods have been proposed for segmenting text from images, but previous studies have not sufficiently addressed complex compound documents. This investigation presents an algorithm for segmenting text in various document images. The proposed algorithm applies a new multilayer segmentation method to separate text from various compound document images, regardless of whether the text and background overlap. This method solves various problems associated with the complexity of background images. Experimental results on document images scanned from book covers, advertisements, brochures and magazines show that the proposed algorithm can successfully segment Chinese and English text strings from various backgrounds, regardless of whether the text lies over a simple, slowly varying or rapidly varying background texture. |
URI: | http://dx.doi.org/10.1142/S0218001405004435 ; http://hdl.handle.net/11536/13047 |
ISSN: | 0218-0014 |
DOI: | 10.1142/S0218001405004435 |
Journal: | INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE |
Volume: | 19 |
Issue: | 8 |
Begin Page: | 997 |
End Page: | 1025 |
Appears in Collections: | Articles |