TY  - JOUR
AU  - Gutiérrez Fandiño, Asier
AU  - Pérez Fernández, David
AU  - Armengol-Estapé, Jordi
AU  - Griol Barres, David
AU  - Kharitonova, Ksenia
AU  - Callejas Carrión, Zoraida
PY  - 2023
UR  - https://hdl.handle.net/10481/88558
AB  - In recent years, transformer-based models have played a significant role in advancing language modeling for natural language processing. However, they require substantial amounts of data and there is a shortage of high-quality non-English corpora....
LA  - eng
PB  - MDPI
KW  - Corpus
KW  - Dataset
KW  - Massive
TI  - esCorpius-m: A massive multilingual crawling corpus with a focus on Spanish
DO  - 10.3390/app132212155
ER  - 