Signal and Neural Processing against Spoofing Attacks and Deepfakes for Secure Voice Interaction (ASASVI)
Metadata
Author
Gómez García, Ángel Manuel; Peinado Herreros, Antonio Miguel; Sánchez Calle, Victoria Eugenia; López Espejo, Iván; Roselló Casado, Eros; Sanchez Valera, Jose Carlos; Martín Doñas, Juan Manuel
Subject
Speech-based interaction; Security; Voice biometrics systems; Voice impersonation; Anti-spoofing; Deepfakes; Artificial intelligence
Date
2024-11
Funding
Project PID2022-138711OB-I00 funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU.
Abstract
The increasing sophistication of multimodal interaction systems, which enable human-like communication, raises concerns regarding the authenticity of exchanged speech data. Our research addresses the challenges posed by the malicious misuse of speech technologies, such as voice conversion (VC) and text-to-speech (TTS), which can be exploited to impersonate speakers, manipulate public opinion, or compromise voice biometric systems. Existing countermeasures, known as anti-spoofing techniques, face significant limitations in effectively combating these threats. To tackle this, our project proposes three research directions: (1) improving deep neural network (DNN)-based anti-spoofing techniques through robust feature extractors, novel architectures, and enhanced training methodologies to bridge the gap between laboratory performance and real-world application, (2) generating more realistic and diverse training data to better reflect real-world conditions and attacks, and (3) developing advanced, imperceptible watermarking techniques for synthesized speech to prevent misuse, even in the presence of deep learning-based removal attempts. This research aims to significantly enhance the security and reliability of computer-mediated speech interactions.
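The record itself gives no implementation details. Purely as an illustration of research direction (1), the sketch below shows what a minimal DNN-based anti-spoofing classifier could look like: it scores an utterance's log-mel features as bonafide or spoofed (VC/TTS). The architecture, layer sizes, and the name AntiSpoofingCNN are hypothetical placeholders chosen for this example, not the project's actual system.

```python
# Illustrative sketch only: a minimal DNN anti-spoofing classifier that maps an
# utterance's log-mel spectrogram to bonafide/spoof scores. All layer choices
# are placeholder assumptions, not the ASASVI project's architecture.
import torch
import torch.nn as nn


class AntiSpoofingCNN(nn.Module):
    def __init__(self, n_mels: int = 80, n_classes: int = 2):
        super().__init__()
        # Small convolutional front end acting as a learned feature extractor.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool over time and frequency
        )
        # Binary head: bonafide vs. spoofed speech.
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, log_mel: torch.Tensor) -> torch.Tensor:
        # log_mel: (batch, n_mels, frames); add a channel dim for Conv2d.
        x = self.encoder(log_mel.unsqueeze(1))
        return self.classifier(x.flatten(1))


if __name__ == "__main__":
    model = AntiSpoofingCNN()
    dummy_batch = torch.randn(4, 80, 300)   # 4 utterances, 80 mel bins, 300 frames
    logits = model(dummy_batch)             # shape (4, 2): bonafide/spoof logits
    print(logits.softmax(dim=-1))
```

In practice, such a baseline would be trained on labeled bonafide and spoofed speech and evaluated with metrics such as equal error rate; the project's stated goal is precisely to improve on this kind of laboratory-grade pipeline so it holds up under real-world conditions.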