Title: Information granulation based data mining approach for classifying imbalanced data
Authors: Chen, Mu-Chen
Chen, Long-Sheng
Hsu, Chun-Chin
Zeng, Wei-Rong
Department of Transportation and Logistics Management (note: formerly 交通所 + 運管所)
Keywords: information granulation; granular computing; data mining; latent semantic indexing; imbalanced data; feed-forward neural network
Issue Date: 15-Aug-2008
Abstract: Recently, the class imbalance problem has attracted much attention from researchers in the field of data mining. When learning from imbalanced data, in which most examples are labeled as one class and only a few belong to another class, traditional data mining approaches predict the crucial minority instances poorly. Unfortunately, many real-world data sets, such as those from health examination, inspection, credit fraud detection, spam identification and text mining, face this situation. In this study, we present a novel model called the "Information Granulation Based Data Mining Approach" to tackle this problem. The proposed methodology, which imitates the human ability to process information, acquires knowledge from Information Granules rather than from numerical data. This method also introduces a Latent Semantic Indexing based feature extraction tool that uses Singular Value Decomposition to dramatically reduce the data dimensions. In addition, several data sets from the UCI Machine Learning Repository are employed to demonstrate the effectiveness of our method. Experimental results show that our method can significantly increase the ability of classifying imbalanced data. (c) 2008 Elsevier Inc. All rights reserved.
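The abstract mentions feature extraction based on Latent Semantic Indexing via Singular Value Decomposition. As a rough illustration of that general idea only, and not the authors' implementation, the following minimal sketch projects a small data matrix onto its top-k singular directions with NumPy. The function name `lsi_reduce`, the toy data, and the choice of `k` are all assumptions made for this example.

```python
import numpy as np

def lsi_reduce(X, k):
    """Project the rows of X onto the top-k latent dimensions (truncated SVD).

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Thin SVD: X = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep the k strongest latent dimensions; result has shape (n_samples, k).
    return U[:, :k] * s[:k]

# Toy data (assumed): 6 samples, 4 features, where the last two features are
# linear combinations of the first two, so the matrix has rank 2.
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 2))
X = np.hstack([base, base @ np.array([[1.0, 0.5], [0.5, 1.0]])])

# Reduce 4 features to 2 latent features.
Z = lsi_reduce(X, k=2)
```

Because the toy matrix has rank 2, the two retained dimensions carry all of its information; on real data, `k` trades off compression against reconstruction error.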
URI: http://dx.doi.org/10.1016/j.ins.2008.03.018
http://hdl.handle.net/11536/29065
ISSN: 0020-0255
DOI: 10.1016/j.ins.2008.03.018
Journal: INFORMATION SCIENCES
Volume: 178
Issue: 16
Start page: 3214
End page: 3227
Appears in Collections: Conferences Paper


Files in This Item:

  1. 000258052400006.pdf
