Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | 李得盛 | en_US |
dc.contributor.author | Li, Te-Sheng | en_US |
dc.contributor.author | 蘇朝墩 | en_US |
dc.contributor.author | Su, Chao-Ton | en_US |
dc.date.accessioned | 2014-12-12T02:27:09Z | - |
dc.date.available | 2014-12-12T02:27:09Z | - |
dc.date.issued | 2001 | en_US |
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#NT900031066 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/68185 | - |
dc.description.abstract | Statistical and Neural Network Models for Supervised Classification. Student: Li, Te-Sheng. Advisor: Dr. Su, Chao-Ton. Department of Industrial Engineering and Management, National Chiao Tung University. Abstract: Multi-dimensional data are full of ambiguity and variation, and the relationships among the data are hard to untangle. Building a data classification system traditionally relies on rules formed from the input variables, and with large amounts of data such rules are especially difficult to formulate. Over the past decades many supervised classification algorithms have been implemented successfully on multi-dimensional information systems; they fall into two broad families, statistical classifiers and neural network classifiers. This dissertation first compares different statistical and neural network classifiers applied to the original input variables: the statistical classifiers are k-nearest neighbor (KNN), linear discriminant analysis (LDA), and Mahalanobis distance (MD); the neural classifiers are the back-propagation network (BP), the radial basis function network (RBF), and the learning vector quantization network (LVQ). Classification accuracy serves as the evaluation criterion. Next, to remove redundant variables from the multi-dimensional information system and improve classification efficiency and accuracy, the dissertation proposes several variable reduction methodologies that screen out the important factors, so the classification task can be completed with fewer variables without degrading the original accuracy: the Mahalanobis-Taguchi system (MTS), input node selection (INS), BP combined with a genetic algorithm (BP-GA), RBF combined with a genetic algorithm (RBF-GA), and LVQ combined with a genetic algorithm (LVQ-GA). These variable reduction methods are compared against the results of statistical stepwise discriminant analysis. The dissertation presents the full-variable classifiers as a baseline, the variable reduction methodologies, and the classification results of applying them to real cases. On the medical examination data, MD gives the best results with the original variables, while MTS, INS, and LVQ-GA give the best results among the reduced-variable models. In example two, LDA and RBF give the best full-variable results, while INS gives the best reduced-variable result, followed by MTS and LVQ-GA. The proposed methods therefore do remove redundant variables and screen out the important ones for future application. Finally, the advantages and disadvantages of each method and practical notes on their application are compared and explained. | zh_TW |
dc.description.abstract | The relationships among multi-dimensional data with ambiguity and variation are difficult to explore. The traditional approach to building a data classification system requires the formulation of rules by which the input data can be analyzed, and formulating such rules is very difficult for large sets of input data. Various algorithms for the supervised classification of multi-dimensional data have been implemented over the past decades; statistical and neural classifiers are the two major methodologies used in the literature. This dissertation first presents a comparison of different statistical and neural network algorithms that use all the original input variables for classification. Three statistical classifiers are considered: k-nearest neighbor (KNN), linear discriminant analysis (LDA), and Mahalanobis distance (MD). Three types of neural classifiers, the back-propagation (BP) neural network, the radial basis function (RBF) network, and learning vector quantization (LVQ), are also discussed so that their classification accuracy can be compared with that of the statistical classifiers. Next, to eliminate redundant variables in the multi-dimensional data set and increase classification efficiency and accuracy, variable reduction techniques are proposed: the Mahalanobis-Taguchi system (MTS), input node selection (INS), BP combined with a GA procedure (BP-GA), RBF combined with a GA (RBF-GA), and LVQ combined with a GA (LVQ-GA). A benchmark method, stepwise discriminant analysis, is employed to compare its accuracy with those of the reduced models. The dissertation includes an introduction to the theoretical background of the classifiers, their implementation procedures, and two case studies that evaluate their performance. Whether with the full models or the reduced models, both the neural networks and the statistical models are demonstrated to be efficient and effective methods for multi-dimensional data classification. In example one, MD outperforms the other statistical and neural classifiers among the full models; compared with the full models, the MTS, INS, and LVQ-GA reduced models yield higher classification accuracy. In example two, LDA and RBF outperform the other full models; compared with the full models, the INS, MTS, and LVQ-GA reduced models yield higher classification accuracy. It is also shown that the proposed variable reduction techniques indeed eliminate redundant variables in the multi-dimensional system. Once a subset of variables is selected, the more important variables can be used for rule extraction or future application. In conclusion, these approaches are compared and discussed from both practical and theoretical considerations. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | 分類 | zh_TW |
dc.subject | k個最鄰近法 | zh_TW |
dc.subject | 線性辨別分析法 | zh_TW |
dc.subject | 馬氏距離 | zh_TW |
dc.subject | 基因演算法 | zh_TW |
dc.subject | 倒傳遞網路 | zh_TW |
dc.subject | 放射基準機能網路 | zh_TW |
dc.subject | 學習向量量化網路 | zh_TW |
dc.subject | classification | en_US |
dc.subject | k nearest neighbor | en_US |
dc.subject | linear discriminant analysis | en_US |
dc.subject | Mahalanobis distance | en_US |
dc.subject | genetic algorithm | en_US |
dc.subject | backpropagation network | en_US |
dc.subject | radial basis function network | en_US |
dc.subject | learning vector quantization | en_US |
dc.title | 應用統計與類神經網路模式於監督式分類問題 | zh_TW |
dc.title | Statistical and Neural Models for Supervised Classification | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Department of Industrial Engineering and Management | zh_TW |
Appears in Collections: | Thesis |
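The abstract compares statistical classifiers such as k-nearest neighbor (KNN) against neural classifiers, using classification accuracy as the criterion. As a minimal illustration of the KNN idea only, here is a sketch in Python; the toy data, labels, and the choice k=3 are invented for illustration and are not taken from the dissertation's case studies.

```python
import math
from collections import Counter

def knn_classify(train, labels, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training point, smallest first
    dists = sorted((math.dist(p, x), y) for p, y in zip(train, labels))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Toy two-class data (illustrative only)
train = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_classify(train, labels, (0.15, 0.1)))   # → A
print(knn_classify(train, labels, (1.05, 0.95)))  # → B
```

A larger k smooths the decision boundary at the cost of blurring small classes; the dissertation evaluates such choices by comparing accuracy across classifiers.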
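The abstract also describes variable reduction methods that pair a classifier with a genetic algorithm (BP-GA, RBF-GA, LVQ-GA) so that classification can be done with fewer variables. The dissertation's own implementations are not reproduced here; the following is a generic sketch of GA-based variable subset selection, using leave-one-out accuracy of a simple 1-nearest-neighbor classifier as the fitness function. The wrapper classifier, population size, selection scheme, and toy data are illustrative assumptions, not the dissertation's settings.

```python
import random

def loo_accuracy(data, labels, mask):
    """Leave-one-out accuracy of a 1-NN classifier restricted to the
    variables where mask[j] == 1 (the fitness of a candidate subset)."""
    if not any(mask):
        return 0.0
    correct = 0
    for i, x in enumerate(data):
        best = None
        for j, p in enumerate(data):
            if i == j:
                continue
            d = sum((a - b) ** 2 for a, b, m in zip(x, p, mask) if m)
            if best is None or d < best[0]:
                best = (d, labels[j])
        correct += best[1] == labels[i]
    return correct / len(data)

def ga_select(data, labels, pop=12, gens=20, p_mut=0.1, seed=0):
    """Evolve bit-masks over the variables; return the best-scoring subset."""
    rng = random.Random(seed)
    n = len(data[0])
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(population,
                        key=lambda m: loo_accuracy(data, labels, m),
                        reverse=True)
        parents = scored[: pop // 2]          # truncation selection
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # mutation
            children.append(child)
        population = parents + children
    return max(population, key=lambda m: loo_accuracy(data, labels, m))

# Toy data: only variable 0 is informative, variables 1-3 are noise
rng = random.Random(1)
data = [(i % 2 * 1.0, rng.random(), rng.random(), rng.random())
        for i in range(20)]
labels = ["A" if x[0] > 0.5 else "B" for x in data]
mask = ga_select(data, labels)
print(mask)  # the informative variable 0 should be retained
```

This mirrors the abstract's point that a reduced subset can classify as well as the full set: masks that drop the noise variables score at least as high as the full mask, so the GA tends to keep the informative variable and discard redundant ones.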