Title: | The Study of Mixture Gaussian Neural Networks |
Author: | Yeong-Yuh Xu; Hsin-Chia Fu; Institute of Computer Science and Engineering |
Keywords: | Neural Networks; Mixture Gaussian density function; Self-growing Probability Decision based Neural Network; Generalized Probability Decision based Neural Network; Visual Keyword; Visual String; content-based image retrieval; handwritten character recognition |
Date of Issue: | 2003 |
Abstract: | This dissertation studies mixture Gaussian neural network models for pattern recognition. Based on the One Class in One Network (OCON) architecture, the {\it Self-growing Probabilistic Decision based Neural Network} (SPDNN) is proposed to classify numerical data, together with an {\it Iteratively Supervised Learning and Unsupervised Growing} (ISLUG) training scheme that automatically adjusts the number of Gaussian clusters and their parameters (means and covariance matrices) according to the complexity of the data, thereby improving classification accuracy. A generalized version of SPDNN, called the {\it Generalized Probabilistic Decision based Neural Network} (GPDNN), is further proposed to handle the case where the inputs are generalized from numerical values to probability distributions. To verify the proposed models, they are applied to handwritten character recognition and content-based image retrieval (CBIR).
For handwritten character recognition, an SPDNN-based handwritten Chinese character recognition system is developed, with all major processing modules implemented: pre-processing and feature selection, a coarse classifier, a character recognizer, and a personal adaptation module. On the CCL/HCCR1 database, the SPDNN character recognizer achieves 86.12\% recognition accuracy, which is comparable to the results reported by Li and Yu (88.65\%) \cite{li-yu} and Tseng {\em et al.} (88.55\%) \cite{Tseng98}, while using only 92 features, far fewer than the 400 features of \cite{li-yu} and the 256 features of \cite{Tseng98}. In addition, a personalized handwriting recognition system is built: through the ISLUG training scheme, the SPDNN gradually learns the distinctive characteristics of an individual's handwriting, raising the recognition accuracy for personal handwriting from $44.09\%$ to $90.03\%$ over the course of learning.
For content-based image retrieval, a {\it Neural Networks based Image Retrieval System} (NNIRS) is developed based on GPDNN and deployed at http://140.113.216.78/imagequerysystem. Two novel image representation concepts are proposed as image indices: (1) the {\it visual keyword}, which describes the visual characteristics (color, texture, and shape) of a homogeneous region in an image, and (2) the {\it visual string}, which represents the spatial relations among regions. On the Corel database, retrieval by {\it visual keyword} achieves $49.4\%$ accuracy, comparable to the $46.8\%$ of the IRM method \cite{wang01simplicity} and the $47.7\%$ of the UFM method \cite{chen02regionbased}. Moreover, when the spatial relations among {\it visual keywords} are taken into account, querying by {\it visual string} significantly improves the ranking of the images of interest toward the front of the retrieved list. These applications show that the proposed mixture Gaussian neural network models can adapt to the characteristics of the data and learn to recognize them accordingly. |
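The decision rule shared by the models above, one Gaussian mixture ("subnet") per class with the most likely class winning, can be sketched minimally in plain Python. This is only an illustration of the mixture-Gaussian likelihood comparison: the parameter values below are hypothetical, diagonal covariances are assumed for simplicity, and SPDNN's actual self-growing/ISLUG training of the cluster number, means, and covariances is not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def class_log_likelihood(x, clusters):
    """Log of the mixture density sum_k w_k N(x; mu_k, var_k), via log-sum-exp."""
    logs = [math.log(w) + log_gauss_diag(x, mu, var) for (w, mu, var) in clusters]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))

def classify(x, class_models):
    """One mixture per class (OCON style); the class with the highest likelihood wins."""
    return max(class_models, key=lambda c: class_log_likelihood(x, class_models[c]))

# Hypothetical hand-set parameters, one (weight, mean, diagonal variance) tuple per
# cluster; in SPDNN these would be grown and tuned automatically during training.
class_models = {
    "A": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "B": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [-5.0, 5.0], [1.0, 1.0])],
}
```

A point near the origin is then assigned to class "A", while a point near either of class "B"'s two cluster centers is assigned to "B", illustrating how a multi-cluster mixture covers a multi-modal class.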
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT008617810 http://hdl.handle.net/11536/81124 |
Appears in Collections: | Thesis |