<rdf:RDF xmlns:rdf="http://www.openarchives.org/OAI/2.0/rdf/" xmlns:ow="http://www.ontoweb.org/ontology/1#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:ds="http://dspace.org/ds/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/rdf/ http://www.openarchives.org/OAI/2.0/rdf.xsd">
   <ow:Publication rdf:about="oai:digibug.ugr.es:10481/99405">
      <dc:title>Big data preprocessing: enabling smart data</dc:title>
      <dc:creator>Luengo Martín, Julián</dc:creator>
      <dc:creator>García Gil, Diego Jesús</dc:creator>
      <dc:creator>Ramírez-Gallego, Sergio</dc:creator>
      <dc:creator>García López, Salvador</dc:creator>
      <dc:creator>Herrera Triguero, Francisco</dc:creator>
      <dc:subject>Big Data</dc:subject>
      <dc:subject>Machine Learning</dc:subject>
      <dc:subject>Information Systems and Communication Service</dc:subject>
      <dc:description>Recent years have seen massive growth in the scale of data, which is&#xd;
a key factor of the Big Data scenario. Big Data can be defined as data of high volume,&#xd;
velocity, and variety that requires new forms of high-performance processing.&#xd;
Addressing Big Data is a challenging and time-demanding task that requires a&#xd;
large computational infrastructure to ensure successful data processing and analysis.&#xd;
Because this scenario is very common in real-life applications, the interest of researchers&#xd;
and practitioners in the topic has grown significantly in recent years. Among Big&#xd;
Data disciplines, data mining is a key topic, enabling the user to extract knowledge&#xd;
from enormous amounts of raw data. However, this raw data is not always in the best&#xd;
condition to be treated, analyzed, and surveyed. Applying preprocessing&#xd;
techniques is a must in real-world applications to ensure quality data (Smart Data)&#xd;
for proper treatment and analysis. The term Smart Data refers to the challenge of&#xd;
transforming raw data into quality data that can be appropriately exploited to obtain&#xd;
valuable insights.&#xd;
This book aims to offer a general and comprehensible overview of data&#xd;
preprocessing in Big Data, enabling Smart Data. It contains a comprehensive&#xd;
description of the topic and focuses on its main features and the most relevant&#xd;
proposed solutions. Additionally, it considers the different scenarios in Big Data for&#xd;
which the application of data preprocessing techniques can pose a real challenge.&#xd;
Data preprocessing is a multifaceted discipline that includes data preparation,&#xd;
composed of integration, cleaning, normalization, and transformation of data;&#xd;
data reduction tasks such as feature selection, instance selection, and discretization;&#xd;
and resampling techniques to deal with imbalanced data.&#xd;
This book stresses the gap between standard data preprocessing techniques and their&#xd;
Big Data equivalents, showing the difficulties involved in developing&#xd;
the latter. It also covers the different approaches that have been traditionally&#xd;
applied and the latest proposals in Big Data preprocessing. Specifically, it reviews&#xd;
data reduction methods, imperfect data approaches, discretization techniques, and imbalanced data preprocessing solutions. Finally, this book describes the most popular&#xd;
Big Data libraries for machine learning, focusing on their data preprocessing&#xd;
algorithms and utilities.</dc:description>
      <dc:date>2025-01-16T11:22:54Z</dc:date>
      <dc:date>2020-03-16</dc:date>
      <dc:type>book</dc:type>
      <dc:identifier>Luengo, J., García-Gil, D., Ramírez-Gallego, S., García, S., &amp; Herrera, F. (2020). Big data preprocessing. Cham: Springer.</dc:identifier>
      <dc:identifier>https://hdl.handle.net/10481/99405</dc:identifier>
      <dc:identifier>https://doi.org/10.1007/978-3-030-39105-8</dc:identifier>
      <dc:language>eng</dc:language>
      <dc:rights>http://creativecommons.org/licenses/by-nc-nd/4.0/</dc:rights>
      <dc:rights>open access</dc:rights>
      <dc:rights>Attribution-NonCommercial-NoDerivatives 4.0 International</dc:rights>
      <dc:publisher>Springer Cham</dc:publisher>
   </ow:Publication>
</rdf:RDF>