Multiple instance classification: Bag noise filtering for negative instance noise cleaning

Luengo Martín, Julián; Herrera Triguero, Francisco

Multiple instance classification; Data Preprocessing; Noisy Data; Instance noise; Noise filtering

This work was supported by project PID2020-119478GB-I00 granted by Ministerio de Ciencia, Innovación y Universidades, project P18-FR-4961 by Proyectos I+D+i Junta de Andalucía 2018 and process no. 2015/20606-6, Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP).

Data in the real world is far from perfect. Noise is a common issue that arises from the limitations of data acquisition mechanisms and human knowledge. In classification, label noise hinders the performance of almost all classifiers by biasing the built model. While label noise has recently attracted researchers' attention in standard classification, its study in multiple instance classification has only just begun. In this work, we propose the use of filtering algorithms for multiple instance classification that reduce the impact of negative instances within the bags. To do so, we decompose the bags to form a standard classification problem that can be treated efficiently by a specialized noise filter. The decomposition is carried out in different ways, with the aim of exploiting the knowledge offered by the examples from opposite bags. The bags are then rebuilt without the instances identified as noise. Our experiments show that this approach diminishes the impact of noise and, for several classifiers, even yields better results at the 0% noise level. The proposal is a promising way of dealing with noise in the bags of multiple instance datasets and of further improving the classification rate of the built models.

2021-10-22T07:53:31Z 2021-10-22T07:53:31Z 2021-07-27

info:eu-repo/semantics/article

Julián Luengo... [et al.]. Multiple instance classification: Bag noise filtering for negative instance noise cleaning, Information Sciences, Volume 579, 2021, Pages 388-400, ISSN 0020-0255, [https://doi.org/10.1016/j.ins.2021.07.076]

http://hdl.handle.net/10481/71037
10.1016/j.ins.2021.07.076
eng
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Atribución-NoComercial-SinDerivadas 3.0 España
Elsevier
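
The pipeline summarized in the abstract (decompose the bags into a standard classification problem, filter the noisy instances, rebuild the bags) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which noise filter or which decomposition strategies the paper uses, so the sketch assumes a simple nearest-neighbour filter in the style of Edited Nearest Neighbours as a stand-in. The function names (decompose, knn_noise_mask, filter_bags), the parameter k, and the rule of never emptying a bag are all hypothetical choices made for illustration.

```python
import numpy as np

def decompose(bags, bag_labels):
    """Flatten bags into one instance-level dataset.

    Each instance inherits the label of its bag, and bag_id records
    which bag it came from so the bags can be rebuilt afterwards.
    """
    X = np.vstack(bags)
    y = np.concatenate([np.full(len(b), lab) for b, lab in zip(bags, bag_labels)])
    bag_id = np.concatenate([np.full(len(b), i) for i, b in enumerate(bags)])
    return X, y, bag_id

def knn_noise_mask(X, y, bag_id, k=3):
    """Assumed stand-in filter (ENN-style), not the paper's filter.

    An instance from a positive bag (label 1) is flagged as noise when
    most of its k nearest neighbours taken from OTHER bags come from
    negative bags (label 0). Instances of negative bags are never flagged.
    """
    noisy = np.zeros(len(X), dtype=bool)
    for i in np.flatnonzero(y == 1):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[bag_id == bag_id[i]] = np.inf  # push same-bag instances to the end
        nearest = np.argsort(dist)[:k]
        noisy[i] = np.mean(y[nearest] == 0) > 0.5
    return noisy

def filter_bags(bags, bag_labels, k=3):
    """Decompose, filter, and rebuild the bags without the flagged instances."""
    X, y, bag_id = decompose(bags, bag_labels)
    noisy = knn_noise_mask(X, y, bag_id, k)
    cleaned = []
    for i, bag in enumerate(bags):
        keep = ~noisy[bag_id == i]
        # Hypothetical safeguard: if every instance was flagged, keep the bag as is.
        cleaned.append(bag[keep] if keep.any() else bag)
    return cleaned
```

Each bag is expected to be a 2-D NumPy array of shape (instances, features), with bag labels 0 (negative) or 1 (positive). Excluding same-bag neighbours is one way to lean on "the examples from opposite bags" mentioned in the abstract; other decompositions are equally plausible.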