SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary
Metadata
Author
Fernández Hilario, Alberto Luis; García López, Salvador; Herrera Triguero, Francisco; Chawla, Nitesh V.
Publisher
AI Access Foundation
Date
2018
Bibliographic reference
Fernández Hilario, A.L. [et al.]. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary. Journal of Artificial Intelligence Research 61 (2018) 863-905. [http://hdl.handle.net/10481/56411]
Sponsor
This work has been partially supported by the Spanish Ministry of Science and Technology under projects TIN2014-57251-P, TIN2015-68454-R and TIN2017-89517-P; the Project BigDaP-TOOLS - Ayudas Fundación BBVA a Equipos de Investigación Científica 2016; and the National Science Foundation (NSF) Grant IIS-1447795.
Abstract
The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered the "de facto" standard in the framework of learning from imbalanced data. This is due to its simplicity in the design of the procedure, as well as its robustness when applied to different types of problems. Since its publication in 2002, SMOTE has proven successful in a variety of applications from several different domains. SMOTE has also inspired several approaches to counter the issue of class imbalance, and has also significantly contributed to new supervised learning paradigms, including multilabel classification, incremental learning, semi-supervised learning, and multi-instance learning, among others. It is a standard benchmark for learning from imbalanced data. It is also featured in a number of different software packages, from open source to commercial. In this paper, marking the fifteen year anniversary of SMOTE, we reflect on the SMOTE journey, discuss the current state of affairs with SMOTE, its applications, and also identify the next set of challenges to extend SMOTE for Big Data problems.
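To make the abstract's description concrete, the following is a minimal pure-Python sketch of SMOTE's core idea: generate a synthetic minority sample by interpolating between a random minority point and one of its k nearest minority-class neighbours. The function name `smote_sketch` and its parameters are illustrative, not the authors' original implementation, which includes further details such as per-feature handling and oversampling amounts expressed as percentages.

```python
import random

def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Illustrative SMOTE sketch: each synthetic point lies on the line
    segment between a random minority sample and one of its k nearest
    minority-class neighbours (Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # k nearest neighbours of x within the minority class, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nn = rng.choice(neighbours)
        gap = rng.random()  # random interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nn)))
    return synthetic

# Toy minority class in 2-D; oversample it with 5 synthetic points.
minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (3.0, 3.0)]
new_points = smote_sketch(minority, n_synthetic=5)
print(new_points)
```

Because each synthetic point is a convex combination of two real minority samples, it always falls inside the region spanned by the minority class, which is what distinguishes SMOTE from simple random oversampling by duplication.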