SOUL: Scala Oversampling and Undersampling Library for imbalance classification
Metadata
Author
Rodríguez, Néstor; López Pretel, David; Fernández Hilario, Alberto Luis; García López, Salvador; Herrera Triguero, Francisco
Publisher
Elsevier
Subject
Oversampling; Undersampling; Scala; Imbalanced classification
Date
2021-07-18
Bibliographic reference
Néstor Rodríguez... [et al.]. SOUL: Scala Oversampling and Undersampling Library for imbalance classification, SoftwareX, Volume 15, 2021, 100767, ISSN 2352-7110, [https://doi.org/10.1016/j.softx.2021.100767]
Sponsor
UGR research contract OTRI 3940; University of Granada, Spain; TIN2017-89517-P
Abstract
Improvements in technology and computation have promoted the global adoption of Data Science,
which is devoted to extracting significant knowledge from large amounts of information by means of the
application of Artificial Intelligence and Machine Learning tools. Among the different tasks within Data
Science, classification is probably the most widespread.
Focusing on the classification scenario, we often face datasets in which the number of
instances for one of the classes is much lower than that of the remaining ones. This issue is known as
the imbalanced classification problem, and it is mainly related to the need to boost the recognition
of the minority class examples.
Despite the large number of solutions proposed in the specialized literature to address
imbalanced classification, there is a lack of open-source software that compiles the most relevant ones
in an easy-to-use and scalable way. In this paper, we present a novel software approach named
SOUL, which stands for Scala Oversampling and Undersampling Library for imbalanced classification.
The main capabilities of this new library include a large number of different data preprocessing
techniques, efficient execution of these approaches, and a graphical environment to contrast the output
for the different preprocessing solutions.





