Title: 類神經網路於高辨識率手寫中文字辨識之研究
Neural Networks for High Performance Handwritten Chinese Character Recognition
Authors: 張宏淵
Hung-Yuan Chang
傅心家
Hsin-Chia Fu
Institute of Computer Science and Engineering
Keywords: Self-growing Probabilistic Decision-based Neural Networks (SPDNN); Handwritten Chinese Character Recognition; similar character recognition; feature reduction; user adaptation
Issue Date: 2000
Abstract:
In this dissertation, we propose the Self-growing Probabilistic Decision-based Neural Networks (SPDNN) for handwritten Chinese character recognition (HCCR). The SPDNN is a probabilistic variant of the decision-based modular neural network for classification. It adopts a hierarchical network structure with nonlinear basis functions and a competitive credit-assignment scheme. Based on the SPDNN model, we have constructed a two-stage recognition system. In the first stage, a coarse classifier assigns the input image to one of the predefined subclasses. In the second stage, a character recognizer matches the input image against the reference characters in that subclass and selects the best match.
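The two-stage flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the SPDNN itself: the coarse classifier is replaced by a nearest-centroid rule, each character is modeled by a single diagonal-covariance Gaussian, and the names `recognize`, `SUBCLASSES`, and the toy labels are hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def recognize(x, subclasses):
    """Two-stage lookup: pick a subclass, then the best character inside it.

    `subclasses` maps a subclass name to a dict holding a "centroid" and a
    "chars" table of {label: (mean, var)}. This layout is an assumption.
    """
    # Stage 1 (coarse classifier stand-in): nearest subclass centroid.
    sub = min(subclasses, key=lambda s: sq_dist(x, subclasses[s]["centroid"]))
    # Stage 2 (character recognizer stand-in): highest log-likelihood
    # among the reference characters of the chosen subclass only.
    chars = subclasses[sub]["chars"]
    label = max(chars, key=lambda c: log_gaussian(x, *chars[c]))
    return sub, label

# Toy 2-D data; every value and label here is made up for illustration.
SUBCLASSES = {
    "A": {"centroid": [0.0, 0.0],
          "chars": {"da": ([0.2, 0.0], [0.1, 0.1]),
                    "tai": ([-0.2, 0.0], [0.1, 0.1])}},
    "B": {"centroid": [5.0, 5.0],
          "chars": {"ri": ([5.0, 5.3], [0.1, 0.1]),
                    "mu": ([5.0, 4.7], [0.1, 0.1])}},
}

result = recognize([4.9, 5.2], SUBCLASSES)
```

The point of the split is that the second stage only scores the characters of one subclass, so the cost of the fine comparison does not grow with the full character set.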
To further improve recognition accuracy and speed, three schemes are proposed. The first addresses the problem of similar, easily confused characters. Because similar characters have nearly identical stroke structures or pixel distributions, the features used to recognize them must capture their minor differences precisely. Based on this observation, several high-dimensional local features are selected to enhance the separability of similar characters.
The second scheme reduces the dimensionality of the data. Because a recognizer with high-dimensional features requires more computation and data storage, we eliminate redundant features and retain the most effective ones, so that recognition performance is preserved or even improved. Three methods were examined: (1) feature selection using the F-Ratio as a criterion, (2) principal component analysis (PCA), and (3) linear discriminant analysis (LDA). The experimental results show that LDA gives the best performance and efficiency.
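As an illustration of the first of these methods, the sketch below ranks features by a common form of the F-Ratio: the variance of the class means divided by the average within-class variance. The exact definition used in the thesis is not given here, so this formula, the function names, and the toy data are assumptions.

```python
from statistics import mean, pvariance

def f_ratio(samples_by_class, j):
    """F-Ratio of feature j: variance of class means / mean within-class variance."""
    columns = [[x[j] for x in xs] for xs in samples_by_class.values()]
    between = pvariance([mean(col) for col in columns])
    within = mean(pvariance(col) for col in columns)
    return between / within

def select_features(samples_by_class, k):
    """Return the indices of the k features with the largest F-Ratio."""
    n_features = len(next(iter(samples_by_class.values()))[0])
    ranked = sorted(
        range(n_features),
        key=lambda j: f_ratio(samples_by_class, j),
        reverse=True,
    )
    return sorted(ranked[:k])

# Toy data: feature 0 separates the two classes, feature 1 does not.
DATA = {
    "a": [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0]],
    "b": [[3.0, 4.5], [3.2, 3.5], [2.8, 4.0]],
}

kept = select_features(DATA, 1)
```

Unlike PCA and LDA, which build new features as linear combinations of the originals, this kind of selection simply keeps a subset of the existing features.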
Finally, we propose a personal adaptation module that fine-tunes the parameters of the SPDNN character recognizer to adapt to a user's own writing style. The parameters, and thus the decision boundaries, of the corresponding subnets in the SPDNN are modified according to the proposed incremental learning algorithm. As more of the user's handwritten characters are presented to the system, the SPDNN recognizer gradually learns the user's personal writing style.
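The incremental learning algorithm itself is not spelled out in this abstract. As a stand-in that conveys the general idea, the sketch below nudges a character's stored mean vector toward each new sample from the user with a fixed learning rate; `adapt_mean` and the rate of 0.2 are hypothetical choices.

```python
def adapt_mean(mean, sample, rate=0.2):
    """Move a stored mean a fraction `rate` of the way toward a new sample."""
    return [m + rate * (s - m) for m, s in zip(mean, sample)]

# A writer-independent mean, and a user whose samples sit at [1.0, 1.0].
model_mean = [0.0, 0.0]
user_sample = [1.0, 1.0]

for _ in range(10):
    model_mean = adapt_mean(model_mean, user_sample)
```

Each update shrinks the remaining gap by the factor `1 - rate`, so the model drifts toward the user's style gradually rather than jumping on a single sample.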
We showed that the SPDNN outperforms conventional statistical methods for HCCR for two reasons: (1) the reinforced and antireinforced learning rules learn the character decision boundaries precisely, and (2) the self-growing rules determine an appropriate number of Gaussian clusters to represent the distribution of each character's images. We also showed that high-dimensional local features can precisely represent the minor differences among similar characters, although they require more computation and data storage. The results suggest that LDA is a very promising way to reduce the dimensionality of the data for HCCR. Finally, we showed that the proposed incremental learning algorithm can learn a user's personal writing style quickly.
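A minimal sketch of reinforced and antireinforced learning, in the general style of decision-based neural networks, is given below. It assumes a simple discriminant (negative half squared distance to a class prototype) rather than the SPDNN's Gaussian mixtures, and the names `discriminant`, `train_step`, and `eta` are illustrative. When a sample is misclassified, the correct class is reinforced (its prototype moves toward the sample) and the wrongly winning class is antireinforced (its prototype moves away).

```python
def discriminant(x, w):
    """Assumed discriminant: negative half squared distance to prototype w."""
    return -0.5 * sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def train_step(x, label, weights, eta=0.1):
    """Apply one decision-based update; return the class that won before it."""
    winner = max(weights, key=lambda c: discriminant(x, weights[c]))
    if winner != label:
        # Reinforced learning: step along the gradient of the correct
        # class's discriminant, which is (x - w) for this discriminant.
        weights[label] = [w + eta * (xi - w) for xi, w in zip(x, weights[label])]
        # Antireinforced learning: step against the gradient of the
        # wrongly winning class's discriminant.
        weights[winner] = [w - eta * (xi - w) for xi, w in zip(x, weights[winner])]
    return winner

# Toy prototypes; the sample belongs to "a" but starts closer to "b".
weights = {"a": [0.0, 0.0], "b": [1.0, 0.0]}
sample, target = [0.6, 0.0], "a"

first_winner = train_step(sample, target, weights)
for _ in range(10):
    last_winner = train_step(sample, target, weights)
```

No update is made when the sample is already classified correctly, so training effort concentrates on samples near the decision boundary.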
We are therefore confident that the proposed SPDNN is well suited to HCCR. It not only outperforms conventional statistical methods but can also be adapted easily into a personal system. In addition, adopting high-dimensional local features and using LDA to reduce the dimensionality both improves recognition accuracy and speeds up recognition. We hope this study will contribute to the development of HCCR.
We suggest the following directions for future research: (1) investigate the relation between the number of subnets and similar characters; (2) develop methods to merge some of the Gaussian distributions in each class; (3) develop fast search algorithms for use after LDA is adopted; and (4) find more efficient adaptation schemes.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT890392103
http://hdl.handle.net/11536/66896
Appears in Collections: Thesis