On the use of m-probability-estimation and imprecise probabilities in the naive Bayes classifier

García Castellano, Francisco Javier; Moral García, Serafín; Mantas Ruiz, Carlos Javier; Abellán Mulero, Joaquín

doi:10.1142/S0218488520500282

Article, accepted version (306.5 KB)

Identifiers

URI: https://hdl.handle.net/10481/88573

DOI: 10.1142/S0218488520500282

ISSN: 0218-4885

Publisher

World Scientific

Subject

Supervised learning

Naive Bayes

m-estimate

m-probability-estimation

Imprecise probabilities

Noisy data

Date

2020-08

Bibliographic reference

Castellano, J. G., Moral-García, S., Mantas, C. J., & Abellán, J. (2020). On the use of m-probability-estimation and imprecise probabilities in the naive Bayes classifier. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 28(4). doi:10.1142/S0218488520500282

Sponsor

This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by the “Fondo Europeo de Desarrollo Regional” (FEDER) under Project TEC2015-69496-R.

Abstract

Within the field of supervised classification, the naïve Bayes (NB) classifier is a very simple and fast method that obtains good results, comparable even with those of much more complex models. It has been proved that the NB model depends strongly on the estimation of conditional probabilities. Earlier work showed that the classical and Laplace estimations of probabilities have some drawbacks, and proposed an NB model, called m-probability-estimation, that takes the a priori probabilities into account when estimating the conditional probabilities. With very limited experimentation, this approach based on m-probability-estimation was shown to provide better results than NB with classical and Laplace estimations of probabilities. In this research, a new naïve Bayes variant is proposed, which builds on the m-probability-estimation version and uses imprecise probabilities to calculate the a priori probabilities. An exhaustive experimental study is carried out with a large number of data sets and different levels of class noise. From these experiments, we conclude that the proposed NB model and the m-probability-estimation approach provide better results than NB with classical and Laplace estimations of probabilities. It is also shown that the proposed NB model improves on the m-probability-estimation model, especially when there is some class noise.
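The abstract names three ways of estimating a conditional probability P(x | c) but gives no formulas. The following is a minimal sketch in Python using the standard textbook forms of the classical (relative frequency), Laplace, and m-estimate formulas. The toy data, the function names, the default m = 2, and the use of the marginal frequency of the attribute value as the prior are all assumptions made for illustration. The paper's exact definitions, and its imprecise-probability estimate of the prior, are not reproduced here.

```python
from collections import Counter


def classical_estimate(n_xc, n_c):
    """Relative frequency; 0 when the value never co-occurs with the class."""
    return n_xc / n_c if n_c > 0 else 0.0


def laplace_estimate(n_xc, n_c, k):
    """Laplace (add-one) smoothing over k possible attribute values."""
    return (n_xc + 1) / (n_c + k)


def m_estimate(n_xc, n_c, prior, m=2.0):
    """Standard m-estimate: shrink the relative frequency toward a prior."""
    return (n_xc + m * prior) / (n_c + m)


# Hypothetical toy data: (attribute value, class label) pairs.
data = [("sunny", "yes"), ("sunny", "yes"), ("rainy", "yes"),
        ("rainy", "no"), ("rainy", "no")]

values = sorted({x for x, _ in data})
value_counts = Counter(x for x, _ in data)
class_counts = Counter(c for _, c in data)
joint_counts = Counter(data)


def estimates(x, c, m=2.0):
    """Return the three estimates of P(x | c) for the toy data."""
    n_xc = joint_counts[(x, c)]
    n_c = class_counts[c]
    # Assumption: the prior is the marginal frequency of x in the data.
    prior = value_counts[x] / len(data)
    return {
        "classical": classical_estimate(n_xc, n_c),
        "laplace": laplace_estimate(n_xc, n_c, len(values)),
        "m": m_estimate(n_xc, n_c, prior, m),
    }


result = estimates("sunny", "no")
print(result)
```

In this toy data, "sunny" never occurs with class "no", so the classical estimate is zero and would cancel the whole NB product for that class. The Laplace estimate pulls the value toward a uniform distribution, while the m-estimate pulls it toward the chosen prior, with m controlling how strongly.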

Collections

DCCIA - Articles