IV-Nlp: A Methodology to Understand the Behavior of DL Models and Its Application from a Causal Approach
dc.contributor.author | Guzman-Monteza, Yudi | |
dc.contributor.author | Fernández Luna, Juan Manuel | |
dc.contributor.author | Ribadas Pena, Francisco J. | |
dc.date.accessioned | 2025-05-21T07:59:29Z | |
dc.date.available | 2025-05-21T07:59:29Z | |
dc.date.issued | 2025-04-21 | |
dc.identifier.citation | Guzman-Monteza, Y.; Fernandez-Luna, J.M.; Ribadas-Pena, F.J. IV-Nlp: A Methodology to Understand the Behavior of DL Models and Its Application from a Causal Approach. Electronics 2025, 14, 1676. [https://doi.org/10.3390/electronics14081676] | es_ES |
dc.identifier.uri | https://hdl.handle.net/10481/104164 | |
dc.description.abstract | Integrating causal inference and estimation methods, especially in Natural Language Processing (NLP), is essential to improve interpretability and robustness in deep learning (DL) models. The objectives are to present the IV-NLP methodology and its application. IV-NLP integrates two approaches. The first defines the process for inferring and estimating the causal effect in original, predicted, and synthetic data. The second includes a method for validating the results obtained by the selected Large Language Model (LLM). IV-NLP proposes using synthetic data in predictive tasks only if the causal effect pattern of the synthetic data is aligned with the causal effect pattern of the original data. DL models, the Instrumental Variable (IV) method, statistical methods, and GPT-3.5-turbo-0125 were used for its application, including an intervention method using a variation of the Retrieval-Augmented Generation (RAG) technique. Our findings reveal notable discrepancies between the original and synthetic data: the synthetic data do not fully capture the underlying causal effect patterns of the original data and show homogeneity and low diversity. Interestingly, when evaluating the causal effect in the predictions made by our three best DL models, it was verified that the model with the lowest accuracy (84.50%) was fully aligned with the overall causal effect pattern. These results demonstrate the potential of integrating DL and LLM models with causal inference methods. | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | MDPI | es_ES |
dc.rights | Attribution 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.subject | Natural language processing (NLP) | es_ES |
dc.subject | Synthetic data generation | es_ES |
dc.subject | Causal inference | es_ES |
dc.title | IV-Nlp: A Methodology to Understand the Behavior of DL Models and Its Application from a Causal Approach | es_ES |
dc.type | journal article | es_ES |
dc.rights.accessRights | open access | es_ES |
dc.identifier.doi | 10.3390/electronics14081676 | |
dc.type.hasVersion | VoR | es_ES |