Text string segmentation from colored mixed-mode covers (彩色混合模式封面文字字串之分割)

Full metadata record

DC Field	Value	Language
dc.contributor.author	陳貴青	en_US
dc.contributor.author	Guey-Ching Chen	en_US
dc.contributor.author	林錫寬	en_US
dc.contributor.author	Shir-Kuan Lin	en_US
dc.date.accessioned	2014-12-12T02:26:26Z	-
dc.date.available	2014-12-12T02:26:26Z	-
dc.date.issued	2000	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#NT890591016	en_US
dc.identifier.uri	http://hdl.handle.net/11536/67782	-
dc.description.abstract	圖文分割是文件分析中一個相當重要的過程，好的圖文分割演算法可以使一份數位文件中的圖形與文字分隔結果更加正確，或者是能節省下更多的時間。而依欲分割的數位文件色彩特性，可以區分成單色文件及彩色文件圖文分割兩大類，一般來說由於彩色文件擁有不確定的背景、圖形及文字色彩，有的甚至有圖文的交疊作用，因此在處理上將比單色文件的圖文分割來的複雜。本論文利用七個處理程序，針對彩色數位文件做圖文分割，對圖文交疊及字串傾斜的情況亦可有效處理，此七個步驟依序為：1.以色彩群聚(Color Clustering)將相近的色彩聚成數個標準類別；2.以邊緣檢測的結果標示區塊；3.以區域成長法則補償小區塊；4.以聚類分析做色彩分類； 5.以Run Length Smoothing做區塊結合；6.利用過濾條件擷取文字字；7.以投影法校正傾斜的文字字串。本論文將以 Borland C++ Builder 程式語言建立圖文分割演算法及使用者操作介面，配合個人電腦及掃瞄器來取得欲處理的彩色數位文件，以這些彩色數位文件來驗證本處理方法的可行性，最後則將以市售OCR軟體對處理前後的結果做比較與討論。	zh_TW
dc.description.abstract	Segmentation of pictures and text is an important phase of document analysis; a good algorithm can make the result more accurate or reduce processing time. Depending on the color characteristics of the digital document, the task falls into two types: monochrome document segmentation and color document segmentation. In color documents, the components (text, pictures, background) generally have unpredictable colors, and text strings are sometimes overlaid on color images. For these reasons, separating text from color documents is much more difficult than separating it from monochrome documents. We present a text segmentation scheme that processes digital color documents in seven phases. The scheme also handles complicated documents, for example those in which text overlaps color images or text strings are skewed. The seven phases are: 1. color clustering: group similar colors into several standard classes; 2. edge detection and block labeling: label blocks using the result of edge detection; 3. region growing: use a region-growing rule to compensate for small blocks; 4. color classification: classify the blocks by color using cluster analysis; 5. run length smoothing: merge nearby blocks; 6. filtering: extract the text strings using filter conditions; 7. projection profile: correct skewed text strings. We use Borland C++ Builder to implement the algorithm and the user interface, and acquire the color digital documents with a personal computer and a scanner to verify the feasibility of the method. Finally, we use commercial OCR software to compare and discuss the results before and after processing.	en_US
dc.language.iso	zh_TW	en_US
dc.subject	圖文分割	zh_TW
dc.subject	圖文分離	zh_TW
dc.subject	文件分析	zh_TW
dc.subject	字串擷取	zh_TW
dc.subject	文字分割	zh_TW
dc.subject	segmentation	en_US
dc.subject	document analysis	en_US
dc.subject	text extraction	en_US
dc.subject	text segmentation	en_US
dc.title	彩色混合模式封面文字字串之分割	zh_TW
dc.title	Text string segmentation from colored mixed-mode covers	en_US
dc.type	Thesis	en_US
dc.contributor.department	電控工程研究所	zh_TW
Appears in Collections:	Theses
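
The abstract lists run length smoothing as the phase that merges nearby blocks. The record does not give the thesis's actual implementation (which was written in Borland C++ Builder), so the following is only a minimal Python sketch of the horizontal pass of the classic run-length smoothing idea, assuming a binary image stored as lists of 0 (background) and 1 (ink). The function names `rlsa_row` and `rlsa_horizontal` and the `threshold` parameter are hypothetical, chosen for illustration.

```python
def rlsa_row(row, threshold):
    """Smooth one binary row (1 = ink, 0 = background).

    A run of background pixels that lies between two ink pixels and is
    no longer than `threshold` is filled with ink, so nearby characters
    merge into one block. Leading and trailing background is untouched.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, value in enumerate(row):
        if value == 1:
            if last_ink is not None:
                gap = i - last_ink - 1
                if 0 < gap <= threshold:
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, threshold):
    """Apply the horizontal smoothing pass to every row of a binary image."""
    return [rlsa_row(row, threshold) for row in image]
```

With `threshold=2`, a gap of two background pixels between ink pixels is closed while a gap of four is left open, which is how closely spaced characters join into a text-line block while distant regions stay separate. A full implementation would typically also run a vertical pass and combine the two results; that step is omitted here.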