Knowledge acquisition through information granulation for imbalanced data

doi:10.1016/j.eswa.2005.09.082

Title:	Knowledge acquisition through information granulation for imbalanced data
Authors:	Su, CT; Chen, LS; Yih, YW; Department of Industrial Engineering and Management
Keywords:	information granulation;fuzzy ART;granular computing;knowledge acquisition;imbalanced data
Issue Date:	1-Oct-2006
Abstract:	When learning from imbalanced (skewed) data, in which almost all instances are labeled as one class while far fewer are labeled as the other, traditional machine learning algorithms tend to achieve high accuracy on the majority class but poor predictive accuracy on the minority class. This paper proposes a novel model called 'knowledge acquisition via information granulation' (KAIG), which not only removes some unnecessary details and provides better insight into the essence of the data but also effectively solves 'class imbalance' problems. In this model, the homogeneity index (H-index) and the undistinguishable ratio (U-ratio) are introduced to determine a suitable level of granularity. We also develop the concept of sub-attributes to describe granules and to handle overlap among granules. Seven data sets from the UCI data bank, including one imbalanced diagnosis data set (pima-Indians-diabetes), are used to evaluate the effectiveness of the KAIG model. Using several performance indexes (overall accuracy, G-mean, and the Receiver Operating Characteristic (ROC) curve), the experimental results, compared with C4.5 and Support Vector Machine (SVM), demonstrate the superiority of our method. (c) 2005 Elsevier Ltd. All rights reserved.
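
The abstract names G-mean alongside overall accuracy but does not define it. The sketch below assumes the common definition, the geometric mean of the true positive rate and the true negative rate, and uses made-up toy labels (not data from the paper) to illustrate why overall accuracy alone can mislead on imbalanced data. The function names are illustrative only and are not taken from the KAIG model.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for a binary problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def overall_accuracy(y_true, y_pred, positive=1):
    """Fraction of all instances classified correctly."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fn + tn + fp)


def g_mean(y_true, y_pred, positive=1):
    """Assumed definition: sqrt(true positive rate * true negative rate)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


# Toy labels: 95 majority-class (0) and 5 minority-class (1) instances.
y_true = [0] * 95 + [1] * 5

# Classifier A always predicts the majority class.
pred_a = [0] * 100

# Classifier B misclassifies 10 majority instances but finds 4 of 5 minority ones.
pred_b = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

for name, pred in (("A", pred_a), ("B", pred_b)):
    acc = overall_accuracy(y_true, pred)
    gm = g_mean(y_true, pred)
    print(f"classifier {name}: accuracy={acc:.3f}, g_mean={gm:.3f}")
```

On these toy labels, classifier A has the higher overall accuracy yet a G-mean of zero because it never detects the minority class, while classifier B gives up some accuracy for a much higher G-mean.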
URI:	http://dx.doi.org/10.1016/j.eswa.2005.09.082 http://hdl.handle.net/11536/11716
ISSN:	0957-4174
DOI:	10.1016/j.eswa.2005.09.082
Journal:	EXPERT SYSTEMS WITH APPLICATIONS
Volume:	31
Issue:	3
Start Page:	531
End Page:	541
Appears in Collections:	Articles

Files in This Item:

000238750200009.pdf