Title: | 含類別屬性相依性最大化之基因演算模糊ID3方法 Genetic Algorithm Based Fuzzy ID3 Method with Class-Attribute Interdependence Maximization |
Authors: | 林克勤 Ke-Chin Lin 張志永 Jyh-Yeong Chang Institute of Electrical and Control Engineering |
Keywords: | Genetic Algorithm; Fuzzy; Class-Attribute; Interdependence; Maximization; Fuzzy ID3 |
Issue Date: | 2004 |
Abstract: | Many approaches to automatic knowledge acquisition have been developed in recent years. A popular and efficient method for inducing decision trees from symbolic data is the ID3 algorithm. The fuzzy ID3 algorithm, which is closely tied to the characteristic features of ID3, extends it to data sets that contain continuous attribute values. However, fuzzy ID3 can handle only continuous data, and it is often criticized for poor learning accuracy. In this thesis, we propose a genetic algorithm based fuzzy ID3 method for constructing a fuzzy classification system; it accepts continuous, discrete, or mixed-mode data sets and uses a genetic algorithm to tune the fuzzy sets. Furthermore, we propose using the class-attribute interdependence maximization (CAIM) algorithm to find the best partitions of the features in a data set. We then formulate a pruning method to obtain a more compact rule base. We tested our method on several well-known data sets and compared the two-fold cross-validation results with those of C5.0. The experiments show that our method performs better in practice. In addition, the method with CAIM achieves better testing accuracy on average than the same method without CAIM. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009212620 http://hdl.handle.net/11536/69168 |
Appears in Collections: | Thesis |
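
The abstract refers to class-attribute interdependence maximization (CAIM) for partitioning features. As a rough illustration only, the sketch below computes the CAIM criterion as it is commonly defined in the discretization literature: for each interval, square the count of the majority class, divide by the number of samples in the interval, then average over the intervals. This record does not give the thesis's own formulation, so the function name `caim_score`, the half-open interval convention, and the handling of empty intervals are all assumptions made for this example.

```python
from bisect import bisect_left
from collections import Counter


def caim_score(values, labels, cut_points):
    """CAIM criterion of one attribute under a given discretization.

    Assumed convention: sorted cut points c1 < ... < ck define the
    intervals (-inf, c1], (c1, c2], ..., (ck, +inf).
    """
    cuts = sorted(cut_points)
    n_intervals = len(cuts) + 1

    # One class-count table per interval (a column of the quanta matrix).
    columns = [Counter() for _ in range(n_intervals)]
    for v, y in zip(values, labels):
        columns[bisect_left(cuts, v)][y] += 1

    total = 0.0
    for col in columns:
        size = sum(col.values())
        if size:  # assumption: an empty interval contributes nothing
            total += max(col.values()) ** 2 / size
    return total / n_intervals


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6]
    ys = ["a", "a", "a", "b", "b", "b"]
    # A cut that separates the classes cleanly scores higher.
    print(caim_score(xs, ys, [3.5]))
    print(caim_score(xs, ys, [2.5]))
```

A greedy discretizer in this style would repeatedly try candidate cut points and keep the one that raises this score the most; whether the thesis follows exactly that search is not stated in this record.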