Signal and Neural Processing against Spoofing Attacks and Deepfakes for Secure Voice Interaction (ASASVI)

Gómez García, Ángel Manuel; Peinado Herreros, Antonio Miguel; Sánchez Calle, Victoria Eugenia; López Espejo, Iván; Roselló Casado, Eros; Sanchez Valera, Jose Carlos; Martín Doñas, Juan Manuel

Speech-based interaction; Security; Voice biometrics systems; Voice impersonation; Anti-spoofing; Deepfakes; Artificial intelligence

The increasing sophistication of multimodal interaction systems, which enable human-like communication, raises concerns regarding the authenticity of exchanged speech data. Our research addresses the challenges posed by the malicious misuse of speech technologies such as voice conversion (VC) and text-to-speech (TTS), which can be exploited to impersonate speakers, manipulate public opinion, or compromise voice biometric systems. Existing countermeasures, known as anti-spoofing techniques, face significant limitations in combating these threats. To address this, our project proposes three research directions: (1) improving deep neural network (DNN)-based anti-spoofing techniques through robust feature extractors, novel architectures, and enhanced training methodologies to bridge the gap between laboratory performance and real-world application; (2) generating more realistic and diverse training data to better reflect real-world conditions and attacks; and (3) developing advanced, imperceptible watermarking techniques for synthesized speech to prevent misuse, even in the presence of deep learning-based removal attempts. This research aims to significantly enhance the security and reliability of computer-mediated speech interactions.

2024-11-18T11:21:12Z
2024-11
conference output
https://hdl.handle.net/10481/97005
10.21437/IberSPEECH.2024-49
eng
http://creativecommons.org/licenses/by-nc-nd/4.0/
open access
Attribution-NonCommercial-NoDerivatives 4.0 International