Speech Watermarking removal by DNN-based Speech Enhancement Attacks
Metadata
Subject
Watermarking; Tampering and removal; Speech enhancement; Deep learning
Date
2024-11
Sponsor
Project PID2022-138711OB-I00 funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU.
Abstract
Audio watermarking allows for embedding a bitstream in an audio file while ensuring the introduced alteration remains imperceptible. Although its primary application has been in copyright protection, it has recently been proposed as a proactive and robust method to detect synthetic speech. By marking artificial speech, voice deepfakes can be easily identified, preventing their misuse regardless of the naturalness or realism achieved by sophisticated generative speech models. However, watermarks could be subjected to tampering and removal attacks, which aim to disguise deepfakes as genuine speech. Recently, deep learning approaches have been applied to watermarking, resulting in highly reliable and resilient speech watermarkers. Nonetheless, the very same approaches can be followed by attackers. In this paper, we propose a novel DNN-based removal attack and evaluate its effectiveness against both classical and deep learning-based watermarking methods. This attack leverages a well-known speech enhancement architecture, the DCCRN model, and, as we demonstrate, it achieves remarkable success in removing watermarks while maintaining excellent speech quality and intelligibility.