Chinese Mail Address Recognition

Title:	中文郵件地址辨識 Chinese Mail Address Recognition
Authors:	黃百綱 Bai-Kuang Hwang; 蔡文祥 Wen-Hsiang Tsai; Institute of Computer Science and Engineering
Keywords:	mail; envelope; address; character recognition
Issue Date:	1994
Abstract:	An approach to the recognition of handprinted Chinese mail addresses on envelopes is proposed. It focuses on the city (city/county) and district (township/district) names of addresses on commonly used standard-size envelopes. The approach consists of five main stages: preprocessing, address block location, character segmentation, character recognition, and postprocessing. First, the preprocessing stage filters out noise and removes unwanted components, and features of the components are used to identify the envelope format. Next, the address block location stage groups the remaining components into blocks; features and positional relationships of the blocks are computed to decide which block is the destination address block. The components in that block are then passed to the character segmentation stage, which merges them into individual Chinese characters. In the character recognition stage, the characters are recognized using a statistical character recognition algorithm and converted into the corresponding Chinese character codes. Because the recognition results still contain errors, the correct address (city and district) sometimes cannot be found at this stage. A final postprocessing stage therefore corrects the erroneous characters according to known contextual relationships among address words, and the postal code is then found from the corrected address. The proposed approach has been tested on real envelopes; experimental results for each stage are reported in the thesis and show its feasibility.
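The postprocessing idea in the abstract (correcting recognized characters against known address words and then finding the postal code) can be illustrated with a minimal sketch. This is not the thesis's algorithm, which the record does not detail. It assumes the recognizer returns a short list of candidate characters for each position, and it uses a hypothetical three-entry toy lexicon. The names `LEXICON`, `match_score`, and `correct_address` are invented for illustration.

```python
# Toy lexicon (illustrative only): (city, district) -> postal code.
LEXICON = {
    ("台北市", "中正區"): "100",
    ("台北市", "大安區"): "106",
    ("台北市", "信義區"): "110",
}


def match_score(word, candidates):
    """Count positions where the lexicon character is among the
    recognizer's candidate characters for that position."""
    return sum(1 for ch, cands in zip(word, candidates) if ch in cands)


def correct_address(candidates):
    """Return (city, district, postal_code, score) for the lexicon entry
    that agrees with the most candidate lists, or None if no entry fits.

    `candidates` is a list with one entry per character position; each
    entry is the list of characters the recognizer proposed there.
    Ties are resolved in favor of the entry seen first.
    """
    best = None
    for (city, district), code in LEXICON.items():
        word = city + district
        if len(word) > len(candidates):
            continue  # not enough recognized characters to compare
        score = match_score(word, candidates)
        if best is None or score > best[3]:
            best = (city, district, code, score)
    return best


# The fourth character was misrecognized: the true character is not
# among its candidates, yet the lexicon still recovers the district.
recognized = [
    ["台", "合"], ["北", "比"], ["市", "布"],
    ["太", "犬"], ["安", "女"], ["區", "匾"],
]
result = correct_address(recognized)
```

In this sketch the lexicon entry that agrees with the most positions wins, so a single misrecognized character does not prevent the district and postal code from being recovered. A practical system would more likely weight candidates by recognition confidence instead of counting matches.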
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT830394073 http://hdl.handle.net/11536/59098
Appears in Collections:	Thesis