dc.contributor.author: García Gil, Diego Jesús
dc.contributor.author: Ramírez-Gallego, Sergio
dc.contributor.author: García López, Salvador
dc.contributor.author: Herrera Triguero, Francisco
dc.date.accessioned: 2025-01-16T10:30:23Z
dc.date.available: 2025-01-16T10:30:23Z
dc.date.issued: 2018-06
dc.identifier.citation: García-Gil, D., Ramírez-Gallego, S., García, S., & Herrera, F. (2018). Principal components analysis random discretization ensemble for big data. Knowledge-Based Systems, 150, 166-174. [es_ES]
dc.identifier.uri: https://hdl.handle.net/10481/99369
dc.description.abstract: Humongous amounts of data have created a lot of challenges in terms of data computation and analysis. Classic data mining techniques are not prepared for the new space and time requirements. Discretization and dimensionality reduction are two of the data reduction tasks in knowledge discovery. Random Projection Random Discretization is a novel and recently proposed ensemble method by Ahmad and Brown in 2014 that performs discretization and dimensionality reduction to create more informative data. Despite the good efficiency of random projections in dimensionality reduction, more robust methods like Principal Components Analysis (PCA) can improve the performance. We propose a new ensemble method to overcome this drawback using the Apache Spark platform and PCA for dimension reduction, named Principal Components Analysis Random Discretization Ensemble. Experimental results on five large-scale datasets show that our solution outperforms both the original algorithm and Random Forest in terms of prediction performance. Results also show that high dimensionality data can affect the runtime of the algorithm. [es_ES]
dc.description.sponsorship: This work is supported by FEDER, the Spanish National Research Projects TIN2014-57251-P and TIN2017-89517-P, and the Project BigDaP-TOOLS - Ayudas Fundación BBVA a Equipos de Investigación Científica 2016. [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Knowledge-Based Systems [es_ES]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: big data [es_ES]
dc.subject: Discretization [es_ES]
dc.subject: Spark [es_ES]
dc.subject: Decision Tree [es_ES]
dc.subject: PCA [es_ES]
dc.subject: Data reduction [es_ES]
dc.title: Principal Components Analysis Random Discretization Ensemble for Big Data [es_ES]
dc.type: journal article [es_ES]
dc.rights.accessRights: open access [es_ES]
dc.identifier.doi: https://doi.org/10.1016/j.knosys.2018.03.012
dc.type.hasVersion: AM [es_ES]


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International