Self-Supervised Learning on Small In-Domain Datasets Can Overcome Supervised Learning in Remote Sensing
Sánchez Fernández, Andrés J.; Moreno Álvarez, Sergio; Rico Gallego, Juan Antonio; Tabik, Siham
Deep learning; Fraction estimation; Land use and land cover (LULC)
The availability of high-resolution satellite images has accelerated the creation of new datasets designed to tackle broader remote sensing (RS) problems. Although popular tasks, such as scene classification, have received significant attention, the recent release of the Land-1.0 RS dataset marks the initiation of endeavors to estimate land-use and land-cover (LULC) fraction values per RGB satellite image. This challenging problem involves estimating LULC composition, i.e., the proportion of different LULC classes from satellite imagery, with major applications in environmental monitoring, agricultural/urban planning, and climate change studies. Currently, supervised deep learning models, the state of the art in image classification, require large volumes of labeled training data to provide good generalization. To face the challenges posed by the scarcity of labeled RS data, self-supervised learning (SSL) models have recently emerged, learning directly from unlabeled data by leveraging the underlying structure. This is the first article to investigate the performance of SSL in LULC fraction estimation on RGB satellite patches using in-domain knowledge. We also performed a complementary analysis on LULC scene classification. Specifically, we pretrained Barlow Twins, MoCov2, SimCLR, and SimSiam SSL models with ResNet-18 using the Sentinel2GlobalLULC small RS dataset and then performed transfer learning to downstream tasks on Land-1.0. Our experiments demonstrate that SSL achieves competitive or slightly better results when trained on a smaller high-quality in-domain dataset of 194 877 samples compared to the supervised model trained on ImageNet-1k with 1 281 167 samples.
This outcome highlights the effectiveness of SSL using in-distribution datasets, demonstrating efficient learning with fewer but more relevant data.
2024-09-05T10:55:37Z
2024-09-05T10:55:37Z
2024-07-02
journal article
A. J. Sanchez-Fernandez, S. Moreno-Álvarez, J. A. Rico-Gallego and S. Tabik, "Self-Supervised Learning on Small In-Domain Datasets Can Overcome Supervised Learning in Remote Sensing," in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 12797-12810, 2024, doi: 10.1109/JSTARS.2024.3421622
https://hdl.handle.net/10481/94007
10.1109/JSTARS.2024.3421622
eng
info:eu-repo/grantAgreement/EC/NextGenerationEU/TED2021-129690B-I00
http://creativecommons.org/licenses/by/4.0/
open access
Attribution 4.0 International
Institute of Electrical and Electronics Engineers
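The abstract describes LULC fraction estimation as predicting the proportion of each LULC class in a satellite patch. As a minimal sketch of what such a target vector could look like, the snippet below computes class proportions from a per-pixel label map. The function name `lulc_fractions`, the class names, and the tiny 2x4 patch are all hypothetical illustrations; they are not taken from the article or from the Land-1.0 dataset, and the article's models predict fractions from RGB imagery rather than from a label map.

```python
from collections import Counter


def lulc_fractions(label_map, classes):
    """Return the share of pixels assigned to each class in one patch.

    label_map: 2D list of class labels (a hypothetical per-pixel annotation).
    classes:   list of class names to report, including absent ones.
    """
    flat = [label for row in label_map for label in row]
    counts = Counter(flat)
    total = len(flat)
    return {c: counts.get(c, 0) / total for c in classes}


# Hypothetical 2x4 patch; the class names are illustrative only.
patch = [
    ["forest", "forest", "water", "urban"],
    ["forest", "forest", "water", "urban"],
]
fractions = lulc_fractions(patch, ["forest", "water", "urban", "cropland"])
print(fractions)
```

Under these assumptions the fractions sum to 1 over the listed classes, which is the property a fraction-estimation model's output would typically be expected to satisfy.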