Full metadata record
DC Field | Value | Language
dc.contributor.author | 吳秉怡 | en_US
dc.contributor.author | Wu, Ping-Yi | en_US
dc.contributor.author | 唐麗英 | en_US
dc.contributor.author | Tong, Lee-Ing | en_US
dc.date.accessioned | 2014-12-12T02:32:37Z | -
dc.date.available | 2014-12-12T02:32:37Z | -
dc.date.issued | 2012 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT070053313 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/71472 | -
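dc.description.abstract | Building a classification model from data of different classes in order to predict the class of new data is important in many fields: in marketing, for example, predicting the brand a customer will buy from the customer's personal data, or in banking, judging from loan customers' data whether they will default. Constructing an accurate classification model is therefore an important issue. In practice, the data for the classes are usually imbalanced, that is, one class has significantly more or fewer observations than another. If a classification model is built directly from imbalanced data, then whichever classification method is used (e.g. discriminant analysis or neural networks), the model's overall classification accuracy is usually quite high while the accuracy for the minority class is too low. In practical applications the minority-class accuracy is usually far more important than the majority-class accuracy, so improving it matters a great deal. Most of the existing literature addresses only how to improve the classification accuracy of models for two-class imbalanced data, and studies on improving accuracy for imbalanced data with three or more classes are rare. This study therefore uses design of experiments (DOE) and dual response surface methodology (DRS) to propose an optimal re-sampling strategy for imbalanced data with multiple classes, so as to effectively improve the classification accuracy of the minority classes. Finally, multi-class imbalanced data provided by the KEEL database are used to verify that the proposed method is indeed effective. | zh_TW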
dc.description.abstract | In many fields, developing an effective classification model to predict the category of incoming data is an important problem. For example, a classification model can be used to predict which type of goods customers will purchase or to determine whether a loan customer will default. However, real-world categorical data are often imbalanced; that is, the sample size of one class is significantly greater than that of the others. In this case, most classification methods fail to construct an accurate model for the imbalanced data. Several studies have focused on developing binary classification models, but these models are not appropriate for data involving three or more categories. Therefore, this study introduces an optimal re-sampling strategy using design of experiments (DOE) and dual response surface methodology (DRS) to improve the accuracy of classification models for multi-class imbalanced data. Real cases from KEEL-dataset are used to demonstrate the effectiveness of the proposed procedure. | en_US
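dc.language.iso | zh_TW | en_US
dc.subject | Multi-class imbalanced data | zh_TW
dc.subject | Re-sampling strategy | zh_TW
dc.subject | Design of experiments | zh_TW
dc.subject | Dual response surface methodology | zh_TW
dc.subject | Multi-Class Imbalanced Data | en_US
dc.subject | Re-sampling Strategy | en_US
dc.subject | DOE | en_US
dc.subject | DRS | en_US
dc.title | 多類別不平衡資料之最適重新取樣策略 | zh_TW
dc.title | Optimal Re-sampling Strategy for Multi-Class Imbalanced Data | en_US
dc.type | Thesis | en_US
dc.contributor.department | Department of Industrial Engineering and Management | zh_TW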
Appears in Collections: Thesis
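
The imbalance problem described in the abstract can be illustrated with a small sketch: on imbalanced data, overall accuracy can look high while minority-class accuracy is zero, and re-sampling changes the class counts a model is trained on. This is not the thesis's DOE/DRS procedure. The function names `per_class_recall` and `random_oversample`, the toy 90/8/2 class split, and the target counts are all assumptions made for illustration, and plain random oversampling is swapped in as one simple re-sampling technique.

```python
import random
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {class: fraction of that class's samples predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def random_oversample(X, y, targets, seed=0):
    """Duplicate randomly chosen samples until each class reaches targets[class].

    Classes already at or above their target, or absent from `targets`,
    are left unchanged. (Hypothetical helper, not taken from the thesis.)
    """
    rng = random.Random(seed)
    X_out, y_out = list(X), list(y)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    for c, target in targets.items():
        pool = by_class[c]
        for _ in range(max(0, target - len(pool))):
            X_out.append(rng.choice(pool))
            y_out.append(c)
    return X_out, y_out


# Toy three-class labels with an assumed 90 / 8 / 2 split.
y_true = ["A"] * 90 + ["B"] * 8 + ["C"] * 2
y_pred = ["A"] * 100  # a classifier that always predicts the majority class

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)

# Oversample the two minority classes to assumed target counts.
X = list(range(100))  # placeholder feature values
X_bal, y_bal = random_oversample(X, y_true, {"B": 45, "C": 45})
balanced_counts = Counter(y_bal)

print(overall)          # overall accuracy
print(recall)           # accuracy within each class
print(balanced_counts)  # class sizes after oversampling
```

In a design-of-experiments setting, target counts like those passed to `random_oversample` could plausibly be treated as factor levels varied across runs; how the thesis actually defines its factors and responses is not stated in this record.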