Active learning with RESSPECT: Resource allocation for extragalactic astronomical transients
Identifiers
URI: http://hdl.handle.net/10481/70887
Author
Kennamer, Noble; Galbany González, Lluis; LSST Dark Energy Science Collaboration; COIN Collaboration
Publisher
IEEE
Subject
Active learning; Machine learning; Astrostatistics
Date
2020-10-26
Bibliographic reference
Published version: N. Kennamer... [et al.], "Active learning with RESSPECT: Resource allocation for extragalactic astronomical transients," 2020 IEEE Symposium Series on Computational Intelligence (SSCI), 2020, pp. 3115-3124, doi: 10.1109/SSCI47803.2020.9308300
Sponsorship
HPI Research Center in Machine Learning and Data Science at UC Irvine; CNRS 2017 MOMENTUM grant under the project Active Learning for Large Scale Sky Surveys; FCT under Project CRISP PTDC/FIS-AST-31546/2017; Hewlett Packard Enterprise Data Science Institute (HPE DSI) at the University of Houston; Gordon and Betty Moore Foundation postdoctoral fellowship at the University of California, Santa Cruz; Space Telescope Science Institute; National Aeronautics & Space Administration (NASA) HF2-51462.001 NAS5-26555; International Gemini Observatory, a program of NSF's NOIRLab; National Science Foundation (NSF); Max Planck Society; Foundation CELLEX; Alexander von Humboldt Foundation; European Commission 839090; Spanish grant within the European Funds for Regional Development (FEDER) PGC2018-095317-B-C21
Abstract
The recent increase in the volume and complexity of available astronomical data has led to the wide use of supervised machine learning techniques. Active learning strategies have been proposed as an alternative to optimize the distribution of scarce labeling resources. However, due to the specific conditions under which labels can be acquired, fundamental assumptions, such as sample representativeness and labeling cost stability, cannot be fulfilled. The Recommendation System for Spectroscopic Follow-up (RESSPECT) project aims to enable the construction of optimized training samples for the Rubin Observatory Legacy Survey of Space and Time (LSST), taking into account a realistic description of the astronomical data environment. In this work, we test the robustness of active learning techniques in a realistic simulated astronomical data scenario. Our experiment takes into account the evolution of training and pool samples, different costs per object, and two different sources of budget. Results show that traditional active learning strategies significantly outperform random sampling. Nevertheless, more complex batch strategies do not significantly outperform simple uncertainty sampling techniques. Our findings illustrate three important points: 1) active learning strategies are a powerful tool to optimize the label-acquisition task in astronomy, 2) for upcoming large surveys like LSST, such techniques allow us to tailor the construction of the training sample for the first day of the survey, and 3) the peculiar data environment related to the detection of astronomical transients is fertile ground that calls for the development of tailored machine learning algorithms.
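
To make the idea of uncertainty sampling under per-object costs concrete, the following is a minimal sketch, not the RESSPECT implementation. It assumes a binary classifier has already assigned each pool object a class probability, that each object has a known labeling cost, and that there is a single budget (the abstract mentions two budget sources; only one is modeled here). The names `uncertainty`, `select_queries`, `probs`, `costs`, and `budget` are hypothetical and chosen for illustration only.

```python
def uncertainty(prob):
    """Binary uncertainty score: 1 at prob = 0.5, 0 at prob = 0 or 1."""
    return 1.0 - 2.0 * abs(prob - 0.5)


def select_queries(probs, costs, budget):
    """Greedily pick the most uncertain objects whose total cost fits the budget.

    probs  : predicted probability of the positive class for each pool object
    costs  : labeling cost of each pool object (illustrative units)
    budget : total cost available in this round (assumed single source)
    """
    # Rank pool indices from most to least uncertain.
    ranked = sorted(range(len(probs)),
                    key=lambda i: uncertainty(probs[i]),
                    reverse=True)
    chosen, spent = [], 0.0
    for i in ranked:
        # Skip objects that would exceed the remaining budget.
        if spent + costs[i] <= budget:
            chosen.append(i)
            spent += costs[i]
    return chosen, spent


# Toy pool of five objects with made-up probabilities and costs.
probs = [0.95, 0.52, 0.10, 0.45, 0.70]
costs = [10.0, 40.0, 5.0, 30.0, 20.0]
chosen, spent = select_queries(probs, costs, budget=60.0)
print(chosen, spent)
```

In this toy pool the two most uncertain objects (indices 1 and 3) cannot both be afforded, so the greedy rule takes index 1 and then the next most uncertain object that still fits. Random sampling, the baseline named in the abstract, would instead draw indices without looking at `probs`.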