Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets

[PDF] 2016-PR-Saez.pdf (1.881 MB)
Identifiers
URI: https://hdl.handle.net/10481/100234
DOI: 10.1016/j.patcog.2016.03.012
Metadata
Authors
Sáez Muñoz, José Antonio; Krawczyk, Bartosz; Wozniak, Michal
Publisher
Elsevier
Subject
machine learning
imbalanced classification
multi-class imbalance
oversampling
minority class types
Date
2016
Citation
José A. Sáez; Bartosz Krawczyk; Michal Wozniak. Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets. Pattern Recognition, 57, 164-178. 2016. doi: 10.1016/j.patcog.2016.03.012
Abstract
Canonical machine learning algorithms assume that the numbers of objects in the considered classes are roughly similar. However, in many real-life situations the distribution of examples is skewed, since examples of some classes appear much more frequently than others. This poses a difficulty for learning algorithms, as they will be biased towards the majority classes. In recent years many solutions have been proposed to tackle imbalanced classification, yet they mainly concentrate on binary scenarios. Multi-class imbalanced problems are far more difficult, as the relationships between the classes are no longer straightforward. Additionally, one should analyze not only the imbalance ratio but also the characteristics of the objects within each class. In this paper we present a study on oversampling for multi-class imbalanced datasets that focuses on the analysis of the class characteristics. We detect subsets of specific examples in each class and set the oversampling for each of them independently. Thus, we are able to use information about the class structure and boost the more difficult and important objects. We carry out an extensive experimental analysis, backed up by statistical analysis, to check when preprocessing some types of examples within a class may improve on the indiscriminate preprocessing of all the examples in all the classes. The results show that oversampling specific types of examples may lead to a significant improvement over standard multi-class preprocessing that does not consider the importance of example types.
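The abstract does not say how the subsets of examples are detected or how they are oversampled. As a rough illustration only, the sketch below assumes a neighbourhood-based categorization that appears in the imbalanced-learning literature: each example is labelled safe, borderline, rare, or outlier according to how many of its k = 5 nearest neighbours share its class. It also assumes plain random duplication as the oversampler. The function names, the thresholds, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def label_example_types(X, y, k=5):
    """Label each example by the number of same-class points among its k nearest neighbours.

    Assumed thresholds (for k=5, not from the paper):
    4-5 -> safe, 2-3 -> borderline, 1 -> rare, 0 -> outlier.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances; mask the diagonal so a point is never its own neighbour.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]
    same = (y[neighbours] == y[:, None]).sum(axis=1)

    types = np.full(len(y), "outlier", dtype="<U10")
    types[same == 1] = "rare"
    types[(same >= 2) & (same <= 3)] = "borderline"
    types[same >= 4] = "safe"
    return types


def oversample_types(X, y, types, target_class, selected_types, n_new, seed=0):
    """Randomly duplicate n_new examples of target_class whose type is in selected_types."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    candidates = np.flatnonzero((y == target_class) & np.isin(types, list(selected_types)))
    if candidates.size == 0:
        # Nothing of the requested type in this class: return the data unchanged.
        return X, y
    picks = rng.choice(candidates, size=n_new, replace=True)
    return np.vstack([X, X[picks]]), np.concatenate([y, y[picks]])


# Toy data (hypothetical): a class-0 cluster, one class-1 point inside it,
# and a separate class-1 cluster far away.
X_demo = np.array([
    [0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0, 0.5],        # class 0
    [0.6, 0.2],                                                   # class 1, inside class 0
    [10, 10], [10, 11], [11, 10], [11, 11], [10.5, 10.5], [10, 10.5],  # class 1
])
y_demo = np.array([0] * 6 + [1] * 7)

types_demo = label_example_types(X_demo, y_demo, k=5)
X_aug, y_aug = oversample_types(X_demo, y_demo, types_demo,
                                target_class=1, selected_types={"outlier"}, n_new=3)
```

Treating each (class, type) pair as a separate oversampling target is what allows the amount of oversampling to be chosen per subset rather than uniformly per class; a synthetic-example generator could replace the random duplication step without changing the rest of the sketch.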
Collections
  • DEIO - Artículos
