Title: | 螞蟻分類技術之研究 A Study of the Ant Colony System for the Discovery of Classification Rules |
Authors: | 陳惠琪 劉敦仁 林妙聰 Dr. Duen-Ren Liu Dr. B.M.T. Lin College of Management, Information Management Program |
Keywords: | Data Mining; Knowledge Classification; Ant Colony Algorithm; Bayesian Classification; Decision Tree; Knowledge Management; ACO; NaiveBayes |
Publication Date: | 2005 |
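Abstract: | Classification in data mining is the most basic and indispensable part of knowledge management: only once knowledge has been classified and coded can it be stored in a database and disseminated quickly. The ant colony algorithm, proposed by Colorni et al. in 1991, is a relatively recent algorithm for finding approximate solutions. It was originally applied mostly to combinatorial optimization problems such as the traveling salesman problem and the quadratic assignment problem. In recent years, many researchers have found that the ant colony algorithm also performs well in data mining. This thesis therefore explores how classification techniques based on the ant colony algorithm can improve the processing efficiency and consistency of knowledge managers.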
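The classification in this study focuses on nominal rather than numeric attributes, and compares commonly used classification methods with the recently emerged ant-based classification technique, Ant-Miner. Two software tools are used in the comparison. One is Weka, a data mining package written in Java that implements a variety of machine learning algorithms. The other is Ant-Miner, developed in 2002 by Rafael Parpinelli et al., who adapted the ant system, previously used for optimization problems, to data mining.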
Earlier work on ant-based classification techniques tested only on data from the UCI Machine Learning Repository, the usual source of benchmark data. In addition to UCI data, this study feeds real data collected by questionnaire survey in 2005 into Ant-Miner to test its effectiveness, and compares the results with the naive Bayes and decision tree classifiers. The results show that, whichever classifier is used, the size of the training set used to build the classification model affects accuracy, and that classification accuracy and execution efficiency also show a certain contrasting relationship. This comparison and analysis can serve as a reference for practical adoption.
Ant Colony Optimization (ACO) was proposed by Colorni et al. in 1991, inspired by the collaborative behavior of ant colonies. It has been applied to combinatorial optimization problems such as the traveling salesman problem and the quadratic assignment problem. In recent years, the ACO approach has been deployed in data mining, where algorithmic and statistical techniques are used to discover or extract useful information and knowledge from large volumes of data. This thesis studies the efficiency and effectiveness of Ant-Miner, a well-known classifier developed using ACO. The major function of Ant-Miner is to extract classification rules from the examined data sets. The terms, or conditions, of a rule are added or removed by the ant colony through collaboration or pheromone sharing. The research focuses on comparing the performance of Ant-Miner with that of Weka, a data mining tool incorporating machine learning mechanisms. We use two data sets with nominal attributes. The first is selected from the UCI Machine Learning Repository, and the second is a real data set collected in 2005 by a local research institute. We use the two data sets to compare Naive Bayes and Decision Tree with Ant-Miner. Experimental results and analysis show that different classification tools demonstrate different levels of efficiency and effectiveness. We also examine how the performance of Ant-Miner varies with different parameter settings, including colony size, evaporation rate, and diversification level. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009364518 http://hdl.handle.net/11536/80004 |
Appears in Collections: | Thesis |