Full metadata record
DC Field | Value | Language
dc.contributor.author | 鄭紹余 | en_US
dc.contributor.author | Shau-Yu Cheng | en_US
dc.contributor.author | 李錫堅 | en_US
dc.contributor.author | Hsi-Jian Lee | en_US
dc.date.accessioned | 2014-12-12T02:20:17Z | -
dc.date.available | 2014-12-12T02:20:17Z | -
dc.date.issued | 1998 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#NT870392024 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/64044 | -
dc.description.abstract | A typical document processing system has two parts: character segmentation and character recognition. This thesis proposes an efficient character segmentation system. The system has two modules: document analysis and character segmentation. In document analysis, we first reduce the image and extract connected components, then classify each connected component as an image component or a text component. After extracting the text components from the document, we merge them into text blocks and check whether any image component contains text components; if so, those are extracted and merged into the text blocks. Finally, every text block is segmented into individual text lines. Once the text lines have been separated, we check each text block for an enlarged initial character (drop cap) and extract it if present. Finally, we segment each text line into Chinese characters, English letters, and numerals. In our experiments, the character segmentation accuracy is about 98.9%, and a document containing 1158 characters takes 5 seconds to process. This demonstrates the efficiency of our system. | zh_TW
dc.description.abstract | A general document processing system usually includes two major modules: a character segmentation module and a character recognition module. In this thesis, we present an automatic system that segments characters efficiently. Our character segmentation system contains two modules: document layout analysis and character segmentation. In the document layout analysis module, we first perform image reduction and connected-component extraction. In the component classification procedure, the connected components are classified as image components or text components. In the block segmentation procedure, we merge all text components into text blocks. Text components found inside image components are also extracted, so that all text components are grouped into text blocks. Finally, we perform text line segmentation to segment all text lines in the text blocks. After all text lines have been segmented, we find and extract the initial caps if they exist in the text blocks. Finally, we segment the Chinese characters, English letters, and numerals in the character segmentation module. In our experiments, the character segmentation rate of our system is about 98.9%, and the processing time is about 5 seconds per page with 1158 characters. This demonstrates the effectiveness of our proposed system. | en_US
dc.language.iso | en_US | en_US
dc.subject | 切字 (character segmentation) | zh_TW
dc.subject | character segmentation | en_US
dc.title | 中文雜誌內對中英文字與圖混合之切字 | zh_TW
dc.title | Character Segmentation in Chinese Magazines with Mixed Alphabets, Numerals and Figures | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 (Institute of Computer Science and Engineering) | zh_TW
Appears in Collections: Thesis
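
The abstract names three layout-analysis steps: connected-component extraction, classification of components as text or image, and text line segmentation. The following is a minimal Python sketch of those steps on a tiny binary image. It is not the thesis's implementation. The function names, the 8-connectivity choice, the size threshold `max_text_size`, the use of a horizontal projection profile for line splitting, and the sample `PAGE` are all assumptions made for illustration.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected
    foreground regions in a binary image given as a list of rows (1 = ink)."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify(box, max_text_size=4):
    """Hypothetical rule: a box larger than max_text_size pixels in either
    dimension is treated as an image component, otherwise as text."""
    top, left, bottom, right = box
    box_height = bottom - top + 1
    box_width = right - left + 1
    return "text" if max(box_height, box_width) <= max_text_size else "image"


def split_lines(image):
    """Return (start_row, end_row) for each run of consecutive rows that
    contain ink, i.e. a horizontal projection profile cut at empty rows."""
    lines = []
    start = None
    for r, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = r
        elif not has_ink and start is not None:
            lines.append((start, r - 1))
            start = None
    if start is not None:
        lines.append((start, len(image) - 1))
    return lines


# Illustrative 5x6 page: two text lines separated by one blank row.
PAGE = [
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 1],
]

if __name__ == "__main__":
    for box in connected_components(PAGE):
        print(box, classify(box))
    print(split_lines(PAGE))
```

A real page would need the image reduction step, a classifier tuned to actual character sizes, and handling of text embedded in figures and of drop caps, all of which the abstract mentions but this sketch leaves out.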