Title: 根據基因演算法之Fuzzy ID3方法於混合特徵資料學習
Genetic Algorithm Based Fuzzy ID3 Method for Data Learning with Mixed-Mode Attributes
Authors: 謝書桓
Su-Hwang Hsieh
張志永
Jyh-Yeong Chang
Institute of Electrical and Control Engineering
Keywords: fuzzy ID3; mixed-mode attributes; fuzzy decision tree
Issue Date: 2003
Abstract: Many learning approaches to knowledge acquisition have been developed. A popular and efficient method for decision tree induction from discrete data is the ID3 algorithm. However, most knowledge associated with human thinking and perception is imprecise and uncertain. To handle such knowledge, decision tree induction has been extended to the fuzzy case. Several fuzzy ID3 schemes have been proposed, but they can only deal with continuous data and are often criticized for poor learning accuracy. In this thesis, we propose a method for generating a fuzzy decision tree that accepts continuous, discrete, or mixed-mode data and uses a genetic algorithm to tune the fuzzy sets. Next, we formulate a pruning method for our algorithm to obtain a more compact rule base. We tested our method on ten data sets from the UCI Repository and compared the results of two-fold cross validation with those of C5.0. The experiments show that our method gives better results. Finally, we analyze a web log-file data set using our fuzzy ID3 method; the rule base extracted from the resulting fuzzy decision tree can provide webmasters with guidance for improving the contents of the website.
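The abstract does not spell out how a fuzzy decision tree can rank a continuous attribute against a discrete one. The following is a minimal sketch of one common approach, not the thesis's actual algorithm: a continuous attribute is split by triangular fuzzy sets, a discrete attribute by crisp (0/1) memberships, and the attribute with the lowest membership-weighted entropy is chosen. All function names, the toy data, and the hand-picked fuzzy-set parameters are assumptions made for illustration; the thesis tunes its fuzzy sets with a genetic algorithm, which this sketch does not attempt.

```python
from math import log2


def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def crisp(value, category):
    """A discrete value belongs fully (1.0) or not at all (0.0) to a branch."""
    return 1.0 if value == category else 0.0


def fuzzy_entropy(memberships, labels):
    """Entropy of the class labels, weighting each sample by its membership."""
    total = sum(memberships)
    if total == 0:
        return 0.0
    entropy = 0.0
    for cls in set(labels):
        mass = sum(m for m, y in zip(memberships, labels) if y == cls)
        if mass > 0:
            p = mass / total
            entropy -= p * log2(p)
    return entropy


def weighted_fuzzy_entropy(branches, labels):
    """Average branch entropy, weighted by each branch's total membership."""
    masses = [sum(m) for m in branches]
    grand = sum(masses)
    if grand == 0:
        return 0.0
    return sum(
        mass / grand * fuzzy_entropy(m, labels)
        for mass, m in zip(masses, branches)
    )


# Hypothetical toy data: one continuous and one discrete attribute.
temperature = [15.0, 18.0, 25.0, 30.0]
outlook = ["sunny", "rain", "sunny", "rain"]
labels = ["no", "no", "yes", "yes"]

# Hand-picked fuzzy sets (assumed); a GA could tune these (a, b, c) triples.
fuzzy_sets = {"low": (0.0, 15.0, 25.0), "high": (15.0, 30.0, 45.0)}

temperature_branches = [
    [triangular(x, *params) for x in temperature]
    for params in fuzzy_sets.values()
]
outlook_branches = [
    [crisp(v, category) for v in outlook]
    for category in sorted(set(outlook))
]

scores = {
    "temperature": weighted_fuzzy_entropy(temperature_branches, labels),
    "outlook": weighted_fuzzy_entropy(outlook_branches, labels),
}

# The attribute with the lowest weighted entropy becomes the split.
best_attribute = min(scores, key=scores.get)
print(best_attribute, scores)
```

Because crisp membership is a special case of fuzzy membership, both attribute types go through the same entropy computation, which is one simple way to handle mixed-mode data in a single tree.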
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009112597
http://hdl.handle.net/11536/45534
Appears in Collections: Thesis


Files in This Item:

  1. 259701.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.