Annotation protocol and crowdsourcing multiple instance learning classification of skin histological images: The CR-AI4SkIN dataset
Metadata
Author
del Amor, Rocío; Pérez-Cano, José; López Pérez, Miguel; Terradez, Liria; Aneiros Fernández, José; Morales, Sandra; Mateos Delgado, Javier; Molina Soriano, Rafael; Naranjo, Valery
Publisher
Elsevier
Subject
Histopathology; Skin cancer; Gaussian processes; Multiple instance learning; Crowdsourcing
Date
2023-11
Bibliographic reference
R. del Amor, J. Pérez-Cano, M. López-Pérez, L. Terradez, J. Aneiros-Fernandez, S. Morales, J. Mateos, R. Molina, and V. Naranjo, “Annotation Protocol and Crowdsourcing Multiple Instance Learning Classification of Skin Histological Images: the CR-AI4SkIN Dataset”, Artificial Intelligence in Medicine, vol. 145, 102686, November 2023. https://doi.org/10.1016/j.artmed.2023.102686
Sponsor
Spanish Ministry of Economy and Competitiveness through projects PID2019-105142RB-C21 and PID2019-105142RB-C22 (AI4SKIN); Spanish Ministry of Science and Innovation through project PID2022-140189OB-C22; European Union’s Framework Programme for Research and Innovation, under grant agreement No. 860627 (CLARIFY); grant B-TIC-324-UGR20 funded by Consejería de Universidad, Investigación e Innovación (Junta de Andalucía) and by “ERDF A way of making Europe”; GVA through the project INNEST/2021/321 (SAMUEL). The work of Rocío del Amor has been supported by the Spanish Ministry of Universities (FPU20/05263). The work of Miguel López Pérez has been supported by the University of Granada postdoctoral program “Contrato Puente”. The work of Sandra Morales has been co-funded by the Universitat Politècnica de València through the program PAID-10-20.
Abstract
Digital Pathology (DP) has experienced significant growth in recent years and has become an essential tool for the diagnosis and prognosis of tumors. The availability of Whole Slide Images (WSIs) and the implementation of Deep Learning (DL) algorithms have paved the way for the appearance of Artificial Intelligence (AI) systems that support the diagnosis process. These systems require extensive and varied data for their training to be successful. However, creating labeled datasets in histopathology is laborious and time-consuming. We have developed a crowdsourcing-multiple instance labeling/learning protocol that is applied to the creation and use of the CR-AI4SkIN dataset. CR-AI4SkIN contains 271 WSIs of 7 Cutaneous Spindle Cell (CSC) neoplasms with expert and non-expert labels at region and WSI levels. It is the first dataset of these types of neoplasms made available. The regions selected by the experts are used to learn an automatic extractor of Regions of Interest (ROIs) from WSIs. To produce the embedding of each WSI, the representations of patches within the ROIs are obtained using a contrastive learning method, and then combined. Finally, the WSI embeddings are fed to a Gaussian process-based crowdsourcing classifier, which utilizes the noisy non-expert WSI labels. We validate our crowdsourcing-multiple instance learning method on the CR-AI4SkIN dataset, addressing a binary classification problem (malignant vs. benign). The proposed method obtains an F1 score of 0.7911 on the test set, outperforming three widely used aggregation methods for crowdsourcing tasks. Furthermore, our crowdsourcing method also outperforms the supervised model with expert labels on the test set (F1 score = 0.6035). The promising results support the proposed crowdsourcing multiple instance learning annotation protocol. They also validate the automatic extraction of regions of interest and the use of contrastive embedding and Gaussian process classification to perform crowdsourcing classification tasks.
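As a rough illustration of the quantities the abstract refers to, the sketch below pools patch embeddings into one slide-level vector, aggregates noisy non-expert labels, and computes an F1 score. It is not the authors' implementation: mean pooling and majority voting are assumptions chosen for simplicity (the abstract does not say how patch representations are combined or which three aggregation baselines were used), the Gaussian process classifier is not reproduced, and all function names and data values are hypothetical.

```python
from collections import Counter
from statistics import fmean


def pool_patch_embeddings(patch_embeddings):
    """Average patch vectors into one slide-level vector (assumed mean pooling)."""
    return [fmean(dim) for dim in zip(*patch_embeddings)]


def majority_vote(annotator_labels):
    """Aggregate one slide's noisy labels; ties go to the malignant class (1)."""
    counts = Counter(annotator_labels)
    return 1 if counts[1] >= counts[0] else 0


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data invented for illustration only (not taken from CR-AI4SkIN).
slide = [[0.0, 2.0], [1.0, 4.0], [2.0, 6.0]]          # three 2-D patch embeddings
noisy = [[1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 0, 0]]  # 4 slides x 3 annotators
expert = [1, 0, 0, 0]                                 # 1 = malignant, 0 = benign

slide_embedding = pool_patch_embeddings(slide)
aggregated = [majority_vote(votes) for votes in noisy]
score = f1_score(expert, aggregated)
print(slide_embedding, aggregated, round(score, 4))
```

On this toy input, majority voting mislabels one benign slide as malignant, which lowers precision and therefore the F1 score. Errors of this kind are what a crowdsourcing model that accounts for annotator noise is meant to reduce.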