Full metadata record
DC Field | Value | Language
dc.contributor.author | 鄭泰銘 | en_US
dc.contributor.author | CHENG, TAI-MING | en_US
dc.contributor.author | 李錫堅 | en_US
dc.contributor.author | Hsi-Jian Lee | en_US
dc.date.accessioned | 2014-12-12T02:20:20Z | -
dc.date.available | 2014-12-12T02:20:20Z | -
dc.date.issued | 1998 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#NT870392064 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/64087 | -
dc.description.abstract | 在某些應用領域中, 例如名片辨識, 我們必須在沒有整篇文件資訊的情況下, 辨識一行文字. 這篇論文中, 我們提供了一種方法, 針對任何一行獨立文字, 做出正確的辨識. 此方法主要包括三部份, 即前處理、文字辨識核心和後處理. 首先將一行單行的二值化影像做水平校正, 控制傾斜角度於 0.3 度以內, 然後偵測此行文字是否為斜體字. 在抽取所有的連通元件 (connected components) 之後, 經過適當的合併與去雜訊處理, 根據先前偵測到的傾斜角度做垂直方向的平移, 然後平滑化. 抽取出來的元件, 由一個「雙核心」架構的核心程式辨識, 視其為斜體或正體而定, 由這兩個核心其中之一做辨識, 並且, 嘗試切割辨識結果較差之元件, 因為某些元件可能包含不止一個字元, 而是多個字元相連而成. 切割的方法是利用搜尋樹的 branch-and-bound 先深 (depth-first) 搜尋. 最後, 元件的垂直位置與字元高度可用來檢查辨識結果. 將一些不可能的字元排除之後, 正確的字元就可以提升到第一名. 此外, 我們提出了一個決定空白字元的方法. 對於某些大小寫外型相同的字元, 我們也可以由其垂直位置與字元高度來判斷其為大寫或小寫. 我們從 107 張英文名片上剪取 646 行的單行文字, 作為測試樣本. 水平校正的正確率為 99.23%; 斜體字判斷的正確率為 100%, 相連文字有 93.18% 被正確地切割出來. 核心方面, 正體與斜體的正確率分別達到了 99.07% 與 98.53%. | zh_TW
dc.description.abstract | In this thesis, we design a procedure for recognizing single text lines. In certain applications, single text lines must be recognized without any whole-document information. The procedure consists of three parts: pre-processing, a character recognition kernel, and post-processing. In the first phase, the skew angle and italicness of the binarized image of a single text line are detected. After all connected components have been extracted and properly combined or deleted, the vertical positions of the components are shifted and the images are then smoothed. The components are recognized and, if necessary, segmented, using a dual-kernel architecture that selects a kernel according to whether the text line is italic or roman. Touching characters are segmented using branch-and-bound tree traversal. Finally, vertical position information is used to post-process the recognition results. Impossible candidates are rejected, and the correct class is eventually promoted to the first candidate. An approach to determining space characters using the profile is introduced. Characters that have the same shape in upper and lower case are distinguished according to their heights. In our experiments, we tested 646 text lines cut from English business cards. The accuracy of skew-angle detection was 99.23%. The accuracy of italicness detection was 100%. 93.18% of touching characters were correctly segmented. The character recognition rates for correctly segmented or non-touching roman and italic characters were 99.07% and 98.53%, respectively. | en_US
dc.language.iso | en_US | en_US
dc.subject | 文字辨識 | zh_TW
dc.subject | 文件分析 | zh_TW
dc.subject | 英文字母 | zh_TW
dc.subject | 統計式圖形辨識 | zh_TW
dc.subject | 雙核心架構 | zh_TW
dc.subject | 水平校正 | zh_TW
dc.subject | 連字切割 | zh_TW
dc.subject | 斜體字偵測 | zh_TW
dc.subject | Character Recognition | en_US
dc.subject | Document Analysis | en_US
dc.subject | English Alphabets | en_US
dc.subject | Statistical Pattern Recognition | en_US
dc.subject | Dual-kernel Architecture | en_US
dc.subject | De-skewing | en_US
dc.subject | Touching Character Segmentation | en_US
dc.subject | Detection of Italic Text Lines | en_US
dc.title | 英文字母與數字之辨識 | zh_TW
dc.title | Character Recognition of English Alphabets and Numerals | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 | zh_TW
Appears in Collections: Thesis
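
Illustrative note on the abstract: it states that touching characters are segmented by a depth-first branch-and-bound search over a search tree, but this record gives no further detail. The sketch below is only one way such a search might look, not the thesis's implementation. The function name `segment_touching`, the list of candidate cut positions, and the width-based `toy_cost` are all hypothetical; a real system would presumably score each slice with its recognition kernel.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[int, int]


def segment_touching(
    cuts: Sequence[int],
    cost: Callable[[int, int], float],
) -> Tuple[float, List[Segment]]:
    """Depth-first branch-and-bound over candidate cut positions.

    cuts: sorted candidate x-positions, including both borders of the component.
    cost: hypothetical score for the slice [left, right); lower is better.
          It must be non-negative so that pruning on the partial sum is safe.
    """
    best_cost = float("inf")
    best_path: List[Segment] = []

    def dfs(i: int, so_far: float, path: List[Segment]) -> None:
        nonlocal best_cost, best_path
        if so_far >= best_cost:
            return  # bound: this branch cannot beat the best full segmentation
        if i == len(cuts) - 1:
            best_cost, best_path = so_far, list(path)  # reached the right border
            return
        for j in range(i + 1, len(cuts)):  # branch: choose the next cut
            path.append((cuts[i], cuts[j]))
            dfs(j, so_far + cost(cuts[i], cuts[j]), path)
            path.pop()

    dfs(0, 0.0, [])
    return best_cost, best_path


def toy_cost(left: int, right: int) -> float:
    # Assumption for the demo only: characters are about 10 pixels wide.
    return abs((right - left) - 10)


total, segments = segment_touching([0, 4, 10, 21, 30], toy_cost)
print(total, segments)
```

With the toy cost, the search keeps the three slices whose widths are closest to 10 pixels and prunes any partial segmentation whose accumulated cost already matches or exceeds the best complete one found so far.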