Show simple item record

dc.contributor.author: Del Ser, Javier
dc.contributor.author: Barredo Arrieta, Alejandro
dc.contributor.author: Díaz Rodríguez, Natalia Ana
dc.contributor.author: Herrera Triguero, Francisco
dc.contributor.author: Saranti, Anna
dc.contributor.author: Holzinger, Andreas
dc.date.accessioned: 2024-05-06T10:16:53Z
dc.date.available: 2024-05-06T10:16:53Z
dc.date.issued: 2023-11-17
dc.identifier.citation: J. Del Ser, A. Barredo-Arrieta, N. Díaz-Rodríguez et al. Information Sciences 655 (2024) 119898 [https://doi.org/10.1016/j.ins.2023.119898]
dc.identifier.uri: https://hdl.handle.net/10481/91428
dc.description.abstract: Deep learning models like ChatGPT exemplify AI success but necessitate a deeper understanding of trust in critical sectors. Trust can be achieved using counterfactual explanations, which is how humans become familiar with unknown processes: by understanding the hypothetical input circumstances under which the output changes. We argue that the generation of counterfactual explanations requires several aspects of the generated counterfactual instances, not just their counterfactual ability. We present a framework for generating counterfactual explanations that formulates its goal as a multiobjective optimization problem balancing three objectives: plausibility, the intensity of changes, and adversarial power. We use a generative adversarial network to model the distribution of the input, along with a multiobjective counterfactual discovery solver balancing these objectives. We demonstrate the usefulness of the framework on six classification tasks with image and 3D data, confirming with evidence the existence of a trade-off between the objectives, the consistency of the produced counterfactual explanations with human knowledge, and the capability of the framework to unveil the existence of concept-based biases and misrepresented attributes in the input domain of the audited model. Our pioneering effort shall inspire further work on the generation of plausible counterfactual explanations in real-world scenarios where attribute-/concept-based annotations are available for the domain under analysis.
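The abstract describes counterfactual generation as a multiobjective optimization balancing plausibility, intensity of changes, and adversarial power. The following is a minimal illustrative sketch of that trade-off, not the authors' implementation: all function names are hypothetical stand-ins, a simple L2 norm stands in for change intensity, any callable density estimate stands in for the GAN the paper uses, and a weighted scalarization stands in for the paper's multiobjective solver.

```python
import numpy as np

def intensity_of_changes(x, x_prime):
    # Smaller perturbations are preferred; L2 distance used here for simplicity.
    return np.linalg.norm(x - x_prime)

def plausibility(x_prime, density_model):
    # Likelihood of the counterfactual under a model of the input distribution
    # (the paper uses a GAN; here any callable density estimate stands in).
    return density_model(x_prime)

def adversarial_power(x_prime, classifier, target_class):
    # Probability mass the audited classifier assigns to the target class.
    return classifier(x_prime)[target_class]

def scalarized_score(x, x_prime, density_model, classifier, target_class,
                     weights=(1.0, 1.0, 1.0)):
    # Hypothetical weighted scalarization of the three objectives; the paper
    # instead searches the Pareto front with a multiobjective solver.
    w_p, w_i, w_a = weights
    return (w_p * plausibility(x_prime, density_model)
            - w_i * intensity_of_changes(x, x_prime)
            + w_a * adversarial_power(x_prime, classifier, target_class))
```

A scalarization like this collapses the trade-off into a single number; the trade-off evidenced in the paper is precisely why a Pareto-based multiobjective search is preferable to fixing the weights in advance.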
dc.description.sponsorship: Basque Government (Eusko Jaurlaritza) through the Consolidated Research Group MATHMODE (IT1256-22)
dc.description.sponsorship: Centro para el Desarrollo Tecnológico Industrial (CDTI)
dc.description.sponsorship: European Union (AI4ES project, grant no. CER-20211030)
dc.description.sponsorship: Austrian Science Fund (FWF), Project: P-32554
dc.description.sponsorship: European Union's Horizon 2020 research and innovation programme under grant agreement No. 826078 (Feature Cloud)
dc.description.sponsorship: Juan de la Cierva Incorporación contract (IJC2019-039152-I)
dc.description.sponsorship: Google Research Scholar Programme 2021
dc.description.sponsorship: Marie Skłodowska-Curie Actions (MSCA) Postdoctoral Fellowship with agreement ID: 101059332
dc.language.iso: eng
dc.publisher: Elsevier
dc.rights: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Explainable artificial intelligence
dc.subject: Deep learning
dc.subject: Counterfactual explanations
dc.title: On generating trustworthy counterfactual explanations
dc.type: journal article
dc.relation.projectID: info:eu-repo/grantAgreement/EC/H2020/826078
dc.rights.accessRights: open access
dc.identifier.doi: 10.1016/j.ins.2023.119898
dc.type.hasVersion: VoR


Files in this item

[PDF]
