Data augmentation techniques for Physical Access in voice anti-spoofing

Sanchez Valera, Jose Carlos; Peinado Herreros, Antonio Miguel; Gómez García, Ángel Manuel

doi:10.21437/IberSPEECH.2024-1

Identifiers

URI: https://hdl.handle.net/10481/98124

Publisher

IberSPEECH 2024

Subject

Anti-spoofing

Data augmentation

Physical access

Date

2024-11-11

Sponsorship

This paper is part of the project PID2022-138711OB-I00 funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU.

Abstract

In this paper, we explore how data augmentation (DA) techniques can improve spoofed audio detection. Specifically, we focus on replay attacks, in which a genuine voice is surreptitiously captured and then played back through a loudspeaker to the voice biometric system. We propose several approaches to handle reverberation variability, different types of additive noise, and unseen spoofing attacks, all of which have been shown to reduce the performance of countermeasure systems. To test the effectiveness and generalization capability of these DA techniques, out-of-domain experiments are carried out on the PA ASVspoof 2021 dataset as well as on the ASVspoof 2019 Real corpus, employing an LCNN classifier fed with STFT features and trained on an augmented version of the ASVspoof 2019 corpus. Four DA methodologies are explored: time masking, noise addition, Room Impulse Response (RIR) filtering, and data mixup. The experimental results show that meaningful improvements can be achieved when the DA procedures are suitably selected.
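
For readers unfamiliar with the four DA methodologies named in the abstract, the following is a minimal illustrative sketch in Python with NumPy. It is not the authors' implementation: the function names (`time_mask`, `add_noise`, `apply_rir`, `mixup`), the parameters (`max_width`, `snr_db`, `alpha`), and the choice to mask a (frequency, time) spectrogram with zeros are all assumptions made here for illustration; the paper itself should be consulted for the actual configuration.

```python
import numpy as np


def time_mask(spec, max_width, rng):
    """Zero a random block of consecutive frames in a (freq, time) array.

    Assumption: the mask value is 0 and its width is uniform in [0, max_width].
    """
    out = spec.copy()
    n_frames = out.shape[1]
    width = min(int(rng.integers(0, max_width + 1)), n_frames)
    start = int(rng.integers(0, n_frames - width + 1))
    out[:, start:start + width] = 0.0
    return out


def add_noise(wave, noise, snr_db):
    """Add noise to a waveform, scaled to a target signal-to-noise ratio in dB."""
    noise = np.resize(noise, wave.shape)  # repeat or truncate to match length
    sig_pow = np.mean(wave ** 2)
    noise_pow = np.mean(noise ** 2) + 1e-12  # guard against silent noise
    scale = np.sqrt(sig_pow / (noise_pow * 10.0 ** (snr_db / 10.0)))
    return wave + scale * noise


def apply_rir(wave, rir):
    """Simulate reverberation by convolving with a room impulse response.

    The output is truncated to the input length (an assumption of this sketch).
    """
    return np.convolve(wave, rir)[: len(wave)]


def mixup(x1, y1, x2, y2, alpha, rng):
    """Blend two examples and their labels with a Beta(alpha, alpha) weight."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2
```

In a pipeline like the one the abstract describes, waveform-level operations such as noise addition and RIR filtering would plausibly be applied before STFT feature extraction, while time masking and mixup could act on the features; that ordering is a guess on our part and is not stated in the abstract.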

Collections

DTSTC - Comunicaciones congresos, conferencias, ...

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International