Show simple item record

dc.contributor.author: Martín Doñas, Juan Manuel
dc.contributor.author: Álvarez, Aitor
dc.contributor.author: Roselló Casado, Eros
dc.contributor.author: Gómez García, Ángel Manuel
dc.contributor.author: Peinado Herreros, Antonio Miguel
dc.date.accessioned: 2024-11-18T12:36:13Z
dc.date.available: 2024-11-18T12:36:13Z
dc.date.issued: 2024-08
dc.identifier.uri: https://hdl.handle.net/10481/97026
dc.description.abstract: This work explores the performance of large speech self-supervised models as robust audio deepfake detectors. Despite the current trend of fine-tuning the upstream network, in this paper we revisit the use of pre-trained models as feature extractors to adapt specialized downstream audio deepfake classifiers. The goal is to keep the general knowledge of the audio foundation model and extract discriminative features to feed a simplified deepfake classifier. In addition, the generalization capabilities of the system are improved by augmenting the training corpora with additional synthetic data from different vocoder algorithms. This strategy is complemented by various data augmentations covering challenging acoustic conditions. Our proposal is evaluated on different benchmark datasets for audio deepfake and anti-spoofing tasks, showing state-of-the-art performance. Furthermore, we analyze which parts of the downstream classifier are key to achieving a robust system. [es_ES]
dc.description.sponsorship: Project EITHOS under Grant Agreement No. 101073928 [es_ES]
dc.description.sponsorship: Project PID2022-138711OB-I00 funded by the MICIU/AEI/10.13039/501100011033 and by ERDF/EU [es_ES]
dc.description.sponsorship: FPI grant PRE2022-000363 [es_ES]
dc.language.iso: eng [es_ES]
dc.rights: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License [en_EN]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: audio deepfake detection [es_ES]
dc.subject: anti-spoofing [es_ES]
dc.subject: self-supervised models [es_ES]
dc.subject: data augmentation [es_ES]
dc.subject: vocoders [es_ES]
dc.title: Exploring Self-supervised Embeddings and Synthetic Data Augmentation for Robust Audio Deepfake Detection [es_ES]
dc.type: conference output [es_ES]
dc.rights.accessRights: open access [es_ES]
dc.identifier.doi: 10.21437/Interspeech.2024-942
dc.type.hasVersion: VoR [es_ES]
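
The abstract above describes keeping a pre-trained speech self-supervised model frozen as a feature extractor and adapting a simplified downstream classifier on its embeddings. The following is a minimal sketch of that general setup in PyTorch; the upstream checkpoint (facebook/wav2vec2-xls-r-300m), the mean pooling, and the two-layer head are illustrative assumptions, not the authors' exact configuration.

# Minimal sketch (not the authors' exact architecture): a frozen pre-trained
# speech SSL model used as a feature extractor feeding a small downstream
# bona fide vs. spoof classifier. Checkpoint name and head design are
# illustrative assumptions.
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

class DeepfakeDetector(nn.Module):
    def __init__(self, upstream_name: str = "facebook/wav2vec2-xls-r-300m"):
        super().__init__()
        # Pre-trained upstream kept frozen: only its general-purpose speech
        # representations are used, as described in the abstract.
        self.upstream = Wav2Vec2Model.from_pretrained(upstream_name)
        for p in self.upstream.parameters():
            p.requires_grad = False
        hidden = self.upstream.config.hidden_size
        # Simplified downstream classifier adapted on top of the embeddings.
        self.head = nn.Sequential(
            nn.Linear(hidden, 256),
            nn.ReLU(),
            nn.Linear(256, 2),  # two classes: bona fide, spoofed
        )

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, samples), raw audio at 16 kHz
        with torch.no_grad():
            feats = self.upstream(waveform).last_hidden_state  # (B, T, H)
        pooled = feats.mean(dim=1)  # simple temporal average pooling
        return self.head(pooled)    # logits over {bona fide, spoof}

if __name__ == "__main__":
    model = DeepfakeDetector()
    dummy = torch.randn(1, 16000)  # 1 s of placeholder audio
    print(model(dummy).shape)      # torch.Size([1, 2])

In this arrangement only the small head is trained, so the general speech knowledge of the upstream model is preserved while the classifier specializes in separating bona fide from spoofed audio.
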


Files in this item

[PDF]

This item appears in the following Collection(s)


Except where otherwise noted, this item's license is described as Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License