Full metadata record
DC Field | Value | Language
dc.contributor.author | Tong, Lee-Ing | en_US
dc.contributor.author | Chang, Yung-Chia | en_US
dc.contributor.author | Lin, Shan-Hui | en_US
dc.date.accessioned | 2014-12-08T15:18:00Z | -
dc.date.available | 2014-12-08T15:18:00Z | -
dc.date.issued | 2009 | en_US
dc.identifier.issn | 1539-2023 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/13012 | -
dc.description.abstract | Imbalanced data are often found in real-world machine learning applications. In an imbalanced data set, the number of instances in at least one class is significantly greater or smaller than that in the other classes. Consequently, when developing a classification model with imbalanced data, most classifiers are affected by the unequal number of instances in each class and thereby fail to construct an accurate classification model. Balancing the sample sizes of the different classes using a re-sampling strategy is a common approach to enhancing the accuracy of a classification model for imbalanced data. Many studies have used a trial-and-error method to determine the appropriate sampling proportion for each class. The trial-and-error method may not classify the imbalanced data effectively if the candidate sampling strategies it examines do not include the optimal one. The conventional under-sampling or over-sampling strategy determines only one specific sampling strategy. If the optimal sampling proportion for each class does not coincide with that specific strategy, the classifiers cannot develop an effective classification model either. This study proposes a procedure to determine the optimal re-sampling strategy using design of experiments (D.O.E.). The proposed procedure can be used with any classifier. Finally, the classification model based on the training data obtained from the proposed procedure is verified to be more accurate than those obtained using the trial-and-error method, the over-sampling approach or the under-sampling approach. | en_US
dc.language.iso | en_US | en_US
dc.subject | re-sampling strategy | en_US
dc.subject | imbalanced data | en_US
dc.subject | classifier | en_US
dc.subject | machine learning | en_US
dc.title | Using Experimental Design to Determine the Re-Sampling Strategy for Developing a Classification Model for Imbalanced Data | en_US
dc.type | Proceedings Paper | en_US
dc.identifier.journal | PROCEEDINGS OF THE EIGHTH INTERNATIONAL CONFERENCE ON INFORMATION AND MANAGEMENT SCIENCES | en_US
dc.citation.volume | 8 | en_US
dc.citation.spage | 646 | en_US
dc.citation.epage | 648 | en_US
dc.contributor.department | 工業工程與管理學系 | zh_TW
dc.contributor.department | Department of Industrial Engineering and Management | en_US
dc.identifier.wosnumber | WOS:000270433200122 | -
Appears in Collections: Conference Papers