LaANIL: ANIL with Look-Ahead Meta-Optimization and Data Parallelism
Metadata
Author
Tammisetti, Vasu; Bierzynski, Kay; Stettinger, Georg; Morales Santos, Diego Pedro; Pegalajar Cuéllar, Manuel; Molina Solana, Miguel José
Publisher
MDPI
Subject
meta-learning; MAML (Model-Agnostic Meta-Learning); ANIL (Almost No Inner Loop)
Date
2024-04-22
Bibliographic reference
Tammisetti, V. et al. Electronics 2024, 13, 1585. [https://doi.org/10.3390/electronics13081585]
Sponsor
Infineon Technologies AG (Germany) and the University of Granada (Spain); European Union’s Horizon Europe Research and Innovation Program through Grant Agreement No. 101076754 (AIthena project); Spanish Ministry of Economic Affairs and Digital Transformation (NextGenerationEU funds) through project IA4TES MIA.2021.M04.0008.
Abstract
Meta-few-shot learning algorithms, such as Model-Agnostic Meta-Learning (MAML) and
Almost No Inner Loop (ANIL), enable machines to learn complex tasks quickly from limited data
and previous experience. By retaining only the network head in the inner loop, ANIL simplifies
computation and reduces the complexity of MAML. Despite these benefits, ANIL suffers from
accuracy variance, slow initial learning, and overfitting, which hinder its
adaptation and generalization. This work proposes “Look-Ahead ANIL” (LaANIL), an enhancement
to ANIL for better learning. LaANIL reorganizes ANIL’s internal architecture, integrating parallel
computing techniques (to process multiple training examples simultaneously across computing units)
and incorporating Nesterov momentum (which accelerates convergence by computing gradients
at a look-ahead point derived from past gradient information, extracting informative features for
look-ahead gradient computation). These additions make the model more competitive with the
state of the art and better suited to edge deployment, thus improving few-shot learning by
enabling models to quickly adapt to new
information and tasks. LaANIL’s effectiveness is validated on established meta-few-shot learning
datasets, including FC100, CIFAR-FS, Mini-ImageNet, CUBirds-200-2011, and Tiered-ImageNet. The
proposed model improved validation accuracy by 7 ± 0.7% and reduced variance by 44 ± 4% in
two-way two-shot classification, and improved validation accuracy by 5 ± 0.4% with a variance
reduction of 18 ± 2% in five-way five-shot classification on the FC100 dataset; it performed
similarly well on the other datasets.
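The look-ahead update that the abstract attributes to Nesterov momentum can be sketched as follows. This is a minimal illustrative sketch of the generic Nesterov scheme, not code from the paper: the function name, hyperparameters, and the toy quadratic objective are all assumptions for demonstration.

```python
import numpy as np

def nesterov_lookahead_step(params, velocity, grad_fn, lr=0.01, momentum=0.9):
    """One Nesterov momentum update.

    The gradient is evaluated at a look-ahead point (params + momentum * velocity)
    rather than at the current parameters, which is what gives the method its
    anticipatory, convergence-accelerating behavior.
    """
    lookahead = params + momentum * velocity   # peek ahead along the momentum direction
    grad = grad_fn(lookahead)                  # gradient at the look-ahead point
    velocity = momentum * velocity - lr * grad # accumulate momentum with the new gradient
    return params + velocity, velocity

# Usage: minimize the toy quadratic f(w) = ||w||^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
for _ in range(200):
    w, v = nesterov_lookahead_step(w, v, lambda p: 2.0 * p)
# w is now driven close to the minimizer at the origin.
```

In a meta-learning outer loop, `grad_fn` would be replaced by the meta-gradient of the validation loss with respect to the initialization, with the inner-loop adaptation unchanged.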