UGR-MINDVOICE: A multimodal EEG-audio dataset for overt and covert Iberian Spanish speech production
Metadata
Date: 2025-10
Sponsor: This work was supported by grant PID2022-141378OB-C22, funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU.
Abstract
We present UGR-MINDVOICE, the University of Granada (UGR) multimodal electroencephalography (EEG) and audio dataset for overt and covert speech in Iberian Spanish, intended for basic neuroscience and brain-computer interface (BCI) research. The dataset comprises EEG and audio recordings from 15 native Spanish speakers engaged in both overt and covert speech production tasks. It is unique in covering all Spanish phonemes and a diverse set of words spanning several semantic categories and usage frequencies. Validation confirmed the presence of robust sensory event-related potentials, including the visual P100 and the auditory N1 (N100), indicating reliable early perceptual processing and sustained participant attention to both visual and auditory stimuli. Additionally, the EEG data were classified into rest, covert speech, and overt speech conditions with an accuracy of 81.40%, demonstrating active participant engagement in the tasks. By providing synchronised EEG and audio data for overt speech, along with EEG data for the same stimuli during covert speech, UGR-MINDVOICE constitutes a valuable resource for advancing research in basic neuroscience and brain-computer interfaces, particularly in the domain of silent speech communication. The full dataset is openly available on the Open Science Framework (OSF) (https://osf.io/6sh5d), and all accompanying code and analysis scripts are provided in a public GitHub repository (https://github.com/owaismujtaba/mind-voice).
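The three-way decoding result reported above (rest vs. covert speech vs. overt speech) can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' pipeline: the data are synthetic noise epochs, the feature is per-channel log-variance (a crude band-power proxy), and the classifier is a simple nearest-centroid rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 60 epochs per class, 8 EEG channels, 256 samples.
N_PER_CLASS, N_CH, N_SAMP = 60, 8, 256
CLASSES = ["rest", "covert", "overt"]

def make_epochs(scale):
    # Synthetic stand-in for EEG epochs: Gaussian noise whose per-channel
    # amplitude profile differs by condition (purely illustrative).
    return rng.normal(0.0, scale[:, None], size=(N_PER_CLASS, N_CH, N_SAMP))

# Give each hypothetical condition a distinct channel-amplitude profile.
scales = {
    "rest": np.full(N_CH, 1.0),
    "covert": np.linspace(1.0, 2.0, N_CH),
    "overt": np.linspace(2.0, 1.0, N_CH),
}
X = np.concatenate([make_epochs(scales[c]) for c in CLASSES])
y = np.repeat(np.arange(3), N_PER_CLASS)

# Feature: log-variance of each channel over the epoch.
feats = np.log(X.var(axis=2))

# Shuffle, then split into train/test halves.
idx = rng.permutation(len(y))
train, test = idx[: len(y) // 2], idx[len(y) // 2:]

# Nearest-centroid classifier: one mean feature vector per class,
# predictions by smallest squared Euclidean distance.
centroids = np.stack(
    [feats[train][y[train] == k].mean(axis=0) for k in range(3)]
)
dists = ((feats[test][:, None, :] - centroids[None]) ** 2).sum(axis=2)
pred = np.argmin(dists, axis=1)
acc = (pred == y[test]).mean()
print(f"3-class accuracy on synthetic data: {acc:.2%}")
```

Because the synthetic classes are well separated, this toy decoder scores far above the 33% chance level; the dataset's reported 81.40% was obtained on real EEG with the authors' own methods.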





