Recreating Neural Activity During Speech Production with Language and Speech Model Embeddings
Metadata
Author
Khanday, Owais Mujtaba; Rodríguez San Esteban, Pablo; Ahmad, Zubair; Ouellet, Marc; González López, José Andrés
Publisher
ISCA
Date
2025-08-17
Bibliographic reference
Khanday, O.M., Esteban, P.R.S., Lone, Z.A., Ouellet, M., Gonzalez-Lopez, J.A. (2025) Recreating Neural Activity During Speech Production with Language and Speech Model Embeddings. Proc. Interspeech 2025, 5553-5557, doi: 10.21437/Interspeech.2025-1400
Sponsor
This work was supported by the grant PID2022-141378OB-C22 funded by MICIU/AEI/10.13039/501100011033 and ERDF/EU.
Abstract
Understanding how neural activity encodes speech and language production is a fundamental challenge in neuroscience and artificial intelligence. This study investigates whether embeddings from large-scale, self-supervised language and speech models can effectively reconstruct high-gamma neural activity, a key indicator of cortical processing, recorded during speech production. We use embeddings from deep learning models pre-trained on linguistic and acoustic data to map high-level speech features onto high-gamma signals, and we analyze the extent to which these embeddings preserve the spatio-temporal dynamics of brain activity. Reconstructed neural signals are evaluated against ground-truth high-gamma activity using correlation metrics and signal reconstruction quality assessments. The results indicate that high-gamma activity was effectively reconstructed from language and speech model embeddings, yielding Pearson correlation coefficients of 0.79–0.99 across all participants.
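As a rough illustration of this kind of evaluation pipeline, the sketch below fits a linear (ridge) regression from embedding features to per-electrode high-gamma envelopes and scores the reconstruction with per-channel Pearson correlation. The choice of ridge regression, the array shapes, and the variable names (embeddings, high_gamma) are assumptions for illustration only, not the authors' implementation; random data stands in for real recordings.

import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Hypothetical dimensions: time frames x embedding dim x electrode channels
rng = np.random.default_rng(0)
n_frames, emb_dim, n_channels = 2000, 768, 64
embeddings = rng.standard_normal((n_frames, emb_dim))     # speech/language model features (placeholder)
high_gamma = rng.standard_normal((n_frames, n_channels))  # recorded high-gamma envelopes (placeholder)

# Hold out the last 20% of frames, preserving temporal order
X_train, X_test, y_train, y_test = train_test_split(
    embeddings, high_gamma, test_size=0.2, shuffle=False)

# One linear map from embedding space to all electrodes (multi-output ridge)
model = Ridge(alpha=1.0).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Per-channel Pearson correlation between reconstructed and recorded high-gamma
r_per_channel = np.array(
    [pearsonr(y_test[:, c], y_pred[:, c])[0] for c in range(n_channels)])
print(f"mean r = {r_per_channel.mean():.3f}, "
      f"range = [{r_per_channel.min():.3f}, {r_per_channel.max():.3f}]")

With real aligned embedding/neural data, the same loop would yield the kind of per-participant correlation summary reported in the abstract.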