
dc.contributor.author: Durán López, Alberto
dc.contributor.author: Bolaños Martinez, Daniel
dc.contributor.author: Bermúdez Edo, María del Campo
dc.date.accessioned: 2026-03-23T07:47:24Z
dc.date.available: 2026-03-23T07:47:24Z
dc.date.issued: 2026-03-19
dc.identifier.citation: Durán-López, A., Bolaños-Martinez, D., & Bermudez-Edo, M. (2026). EDRS: Extremity-density representative selection for semi-supervised learning on imbalanced data. Information Sciences, 744(123390), 1-18. https://doi.org/10.1016/j.ins.2026.123390 [es_ES]
dc.identifier.uri: https://hdl.handle.net/10481/112360
dc.description: This work was supported by Grant C-SEJ-128-UGR23, funded by Consejería de Universidad, Investigación e Innovación and by ERDF Andalusia Program 2021-2027; project PID2023-149185OBI00 funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU. Funding for open access charge: Universidad de Granada/CBUA. [es_ES]
dc.description.abstract: Representative sample selection improves training in semi-supervised learning (SSL) where labeled data are limited and must reflect the original dataset. Recent SSL methods ignore class imbalance and lack tabular data case studies. To fill this gap, we propose Extremity-Density Representative Selection (EDRS), a preprocessing point selection method for imbalanced tabular datasets. EDRS ranks unlabeled candidates by combining two scores: density, which favors regions with many individuals, and extremity, which ensures inclusion of extreme cases likely belonging to minority classes. We first cluster the data to ensure diverse and representative coverage of the space, and then select samples with the highest density and extremity values, balancing outlier avoidance with coverage of extreme values. EDRS is used to select samples for labeling in an SSL framework and is compared with Random Sampling, Stratified Sampling, K-Means–derived methods, USL, Hybrid-CEAL, FDMat, Gaussian Mapping, and ESC-FFS. We validate EDRS on twelve synthetic and six real-world imbalanced datasets using SSL VIME, Manifold Mixup and Contrastive Mixup. EDRS achieves a class imbalance ratio (IR) close to 1 and is 99% faster than other algorithms with similar IR, improves F1-score by 3–5% in well-separated classes, and includes an ablation test evaluating the impact of density and extremity. [es_ES]
dc.description.sponsorship: Consejería de Universidad, Investigación e Innovación/ERDF Andalusia C-SEJ-128-UGR23 [es_ES]
dc.description.sponsorship: MICIU/AEI/10.13039/501100011033/ERDF/EU PID2023-149185OBI00 [es_ES]
dc.description.sponsorship: Universidad de Granada/CBUA [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Elsevier [es_ES]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 Internacional [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Semi-supervised learning [es_ES]
dc.subject: Internet of things (IoT) [es_ES]
dc.subject: Imbalanced classes [es_ES]
dc.subject: Representative sample selection [es_ES]
dc.subject: Tabular data [es_ES]
dc.title: EDRS: Extremity-density representative selection for semi-supervised learning on imbalanced data [es_ES]
dc.type: journal article [es_ES]
dc.rights.accessRights: open access [es_ES]
dc.identifier.doi: 10.1016/j.ins.2026.123390
dc.type.hasVersion: VoR [es_ES]



Except where otherwise noted, this document is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International.