dc.contributor.author: Torres Martos, Álvaro
dc.contributor.author: Bustos Aibar, Mireia
dc.contributor.author: Ramírez Mena, Alberto
dc.contributor.author: Cámara Sánchez, Sofía
dc.contributor.author: Anguita Ruiz, Augusto
dc.contributor.author: Alcalá Fernández, Rafael
dc.contributor.author: Aguilera García, Concepción María
dc.contributor.author: Alcalá Fernández, Jesús
dc.date.accessioned: 2023-03-28T06:40:24Z
dc.date.available: 2023-03-28T06:40:24Z
dc.date.issued: 2023-01-18
dc.identifier.citation: Torres-Martos, Á... [et al.]. Omics Data Preprocessing for Machine Learning: A Case Study in Childhood Obesity. Genes 2023, 14, 248. [https://doi.org/10.3390/genes14020248] (es_ES)
dc.identifier.uri: https://hdl.handle.net/10481/80881
dc.description.abstract: The use of machine learning techniques for the construction of predictive models of disease outcomes (based on omics and other types of molecular data) has gained enormous relevance in the last few years in the biomedical field. Nonetheless, the virtuosity of omics studies and machine learning tools are subject to the proper application of algorithms as well as the appropriate preprocessing and management of input omics and molecular data. Currently, many of the available approaches that use machine learning on omics data for predictive purposes make mistakes in several of the following key steps: experimental design, feature selection, data pre-processing, and algorithm selection. For this reason, we propose the current work as a guideline on how to confront the main challenges inherent to multi-omics human data. As such, a series of best practices and recommendations are also presented for each of the steps defined. In particular, the main particularities of each omics data layer, the most suitable preprocessing approaches for each source, and a compilation of best practices and tips for the study of disease development prediction using machine learning are described. Using examples of real data, we show how to address the key problems mentioned in multi-omics research (e.g., biological heterogeneity, technical noise, high dimensionality, presence of missing values, and class imbalance). Finally, we define the proposals for model improvement based on the results found, which serve as the bases for future work. (es_ES)
dc.description.sponsorship: ERDF/Regional Government of Andalusia/Ministry of Economic Transformation, Industry, Knowledge, and Universities P18-RT-2248 B-CTS-536-UGR20 (es_ES)
dc.description.sponsorship: ERDF/Health Institute Carlos III/Spanish Ministry of Science, Innovation PI20/00711 (es_ES)
dc.language.iso: eng (es_ES)
dc.publisher: MDPI (es_ES)
dc.rights: Attribution 4.0 International (*)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ (*)
dc.subject: Machine learning (es_ES)
dc.subject: Omics (es_ES)
dc.subject: Data pre-processing (es_ES)
dc.title: Omics Data Preprocessing for Machine Learning: A Case Study in Childhood Obesity (es_ES)
dc.type: journal article (es_ES)
dc.rights.accessRights: open access (es_ES)
dc.identifier.doi: 10.3390/genes14020248
dc.type.hasVersion: VoR (es_ES)


Files in this item

[PDF]

This item appears in the following collection(s)

Except where otherwise noted, this item's license is described as Attribution 4.0 International