Show simple item record

dc.contributor.author: Gómez García, Ángel Manuel
dc.contributor.author: Peinado Herreros, Antonio Miguel
dc.contributor.author: Sánchez Calle, Victoria Eugenia
dc.contributor.author: López Espejo, Iván
dc.contributor.author: Roselló Casado, Eros
dc.contributor.author: Sanchez Valera, Jose Carlos
dc.contributor.author: Martín Doñas, Juan Manuel
dc.date.accessioned: 2024-11-18T11:21:12Z
dc.date.available: 2024-11-18T11:21:12Z
dc.date.issued: 2024-11
dc.identifier.uri: https://hdl.handle.net/10481/97005
dc.description.abstract: The increasing sophistication of multimodal interaction systems, which enable human-like communication, raises concerns regarding the authenticity of exchanged speech data. Our research addresses the challenges posed by the malicious misuse of speech technologies, such as voice conversion (VC) and text-to-speech (TTS), which can be exploited to impersonate speakers, manipulate public opinion, or compromise voice biometric systems. Existing countermeasures, known as anti-spoofing techniques, face significant limitations in effectively combating these threats. To tackle this, our project proposes three research directions: (1) improving deep neural network (DNN)-based anti-spoofing techniques through robust feature extractors, novel architectures, and enhanced training methodologies to bridge the gap between laboratory performance and real-world application; (2) generating more realistic and diverse training data to better reflect real-world conditions and attacks; and (3) developing advanced, imperceptible watermarking techniques for synthesized speech to prevent misuse, even in the presence of deep learning-based removal attempts. This research aims to significantly enhance the security and reliability of computer-mediated speech interactions.
dc.description.sponsorship: Project PID2022-138711OB-I00 funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU.
dc.language.iso: eng
dc.rights: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Speech-based interaction
dc.subject: Security
dc.subject: Voice biometrics systems
dc.subject: Voice impersonation
dc.subject: Anti-spoofing
dc.subject: Deepfakes
dc.subject: Artificial intelligence
dc.title: Signal and Neural Processing against Spoofing Attacks and Deepfakes for Secure Voice Interaction (ASASVI)
dc.type: conference output
dc.rights.accessRights: open access
dc.identifier.doi: 10.21437/IberSPEECH.2024-49
dc.type.hasVersion: VoR


Files in this item

[PDF]

This item appears in the following collection(s)


Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License
Except where otherwise noted, this item's license is described as Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License