The blueprint of a new fact-checking system: A methodology to enrich RAG systems with new generated datasets

Díaz García, José Ángel; López-Joya, Salvador; Martín Bautista, María José; Ruiz Jiménez, María Dolores

fact checking; RAG; NLP; language models; datasets

In an era where digital misinformation spreads rapidly, Artificial Intelligence (AI) has become a crucial tool for fact-checking. However, the effectiveness of AI in this domain is often limited by the availability of high-quality and scalable datasets to train and guide algorithms. In this paper, we introduce VERIFAID (VERIfication FAISS-based framework for fake news Detection), a novel framework that improves fact-checking through a Retrieval-Augmented Generation (RAG) system based on automatically generated and dynamically growing datasets. Our approach improves evidence retrieval by building a scalable knowledge base, reducing the reliance on manually annotated data. The system consists of three key modules: two dedicated to dataset creation and one inference module that integrates advanced language models, such as LLaMA, within the RAG paradigm. To validate our methodology, we provide technical specifications for both the system and the dataset, together with comprehensive evaluations in zero-shot fact-checking scenarios. The results demonstrate the efficiency and adaptability of our approach and its potential to improve AI-driven fact verification at scale.

2025-10-29T10:20:01Z
2025-10-09
journal article

Lopez-Joya, S., Diaz-Garcia, J. A., Ruiz, M. D., & Martin-Bautista, M. J. (2025). The blueprint of a new fact-checking system: A methodology to enrich RAG systems with new generated datasets. Computers and Electrical Engineering, 128, 110746.

https://hdl.handle.net/10481/107557
https://doi.org/10.1016/j.compeleceng.2025.110746
eng
http://creativecommons.org/licenses/by-nc-nd/4.0/
open access
Attribution-NonCommercial-NoDerivatives 4.0 International
Pergamon