Noise simulation in classification with the noisemodel R package: Applications analyzing the impact of errors with chemical data
Metadata
Author
Sáez Muñoz, José Antonio
Publisher
Wiley
Subject
Attribute noise; Chemical data; Classification; Label noise; Noise models
Date
2023-05
Bibliographic reference
Sáez JA. Noise simulation in classification with the noisemodel R package: Applications analyzing the impact of errors with chemical data. Journal of Chemometrics. 2023;37(5):e3472. [doi:10.1002/cem.3472]
Sponsor
University of Granada/CBUA
Abstract
Classification datasets created from chemical processes can be affected by
errors, which impair the accuracy of the models built. This fact highlights the
importance of analyzing the robustness of classifiers against different types
and levels of noise to know their behavior against potential errors. In this
context, noise models have been proposed to study noise-related phenomenology
in a controlled environment, allowing errors to be introduced into the data in
a supervised manner. This paper introduces the noisemodel R package, which
contains the first extensive implementation of noise models for classification
datasets, proposing it as a support tool to analyze the impact of errors related to
chemical data. It provides 72 noise models found in the specialized literature
that allow errors to be introduced in different ways in classes and attributes.
Each of them is properly documented and referenced, unifying their results
through a specific S3 class, which benefits from customized print, summary
and plot methods. The usage of the package is illustrated through four
application examples considering real-world chemical datasets, where errors are
prone to occur. The software presented will help to deepen the understanding
of the problem of noisy chemical data, as well as to develop new robust
algorithms and noise preprocessing methods properly adapted to different types of
errors in this scenario.
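
As a rough illustration of what introducing errors into class labels at a controlled level can look like, the following Python sketch implements one simple scheme, usually called symmetric uniform label noise: a chosen fraction of samples is selected at random and each selected label is replaced by a different class drawn uniformly from the remaining ones. It is not the noisemodel package, which is written in R, and it does not reproduce that package's API. The function name, its parameters and the class labels in the usage example are assumptions made for illustration only.

```python
import random
from collections import Counter


def symmetric_uniform_label_noise(labels, level, seed=0):
    """Return a noisy copy of `labels` and the indices that were changed.

    Hypothetical helper for illustration (not part of any package).
    A fraction `level` of the samples is chosen at random, and each chosen
    label is replaced by a different class drawn uniformly from the others.
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be in [0, 1]")
    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("need at least two classes to flip labels")

    rng = random.Random(seed)  # fixed seed so the corruption is reproducible
    n_noisy = round(level * len(labels))
    noisy_idx = sorted(rng.sample(range(len(labels)), n_noisy))

    noisy = list(labels)
    for i in noisy_idx:
        # Exclude the original class so every selected label really changes.
        alternatives = [c for c in classes if c != labels[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy, noisy_idx


if __name__ == "__main__":
    # Made-up class labels standing in for a chemical classification task.
    clean = ["acid"] * 10 + ["base"] * 10 + ["neutral"] * 10
    noisy, idx = symmetric_uniform_label_noise(clean, level=0.2, seed=42)
    print("changed samples:", len(idx))
    print("class counts before:", Counter(clean))
    print("class counts after: ", Counter(noisy))
```

Returning the indices of the corrupted samples alongside the noisy labels keeps the experiment controlled: a classifier can be trained on the noisy labels and evaluated against the known clean ones, and the same level can be repeated with different seeds.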