Few-Shot User-Definable Radar-Based Hand Gesture Recognition at the Edge
Keywords: Artificial neural networks; Edge computing; FMCW; Intel Neural Compute Stick; Knowledge transfer; Meta-learning; Human-computer interaction; Radar; Variational autoencoder
G. Mauro et al., "Few-Shot User-Definable Radar-Based Hand Gesture Recognition at the Edge," IEEE Access, vol. 10, pp. 29741-29759, 2022, doi: 10.1109/ACCESS.2022.3155124
Sponsorship: Federal Ministry of Education & Research (BMBF) 19006; Austrian Research Promotion Agency (FFG); Rijksdienst voor Ondernemend Nederland (Rvo); Innovation Fund Denmark (IFD)
Technological advances and scalability are driving Human-Computer Interaction (HCI) toward more intuitive forms, such as gesture recognition. Among the various interaction strategies, radar-based recognition is emerging as a touchless, privacy-preserving solution that remains versatile across environmental conditions. Classical radar-based gesture HCI solutions rely on deep learning but require training on large and varied datasets to achieve robust prediction. Innovative self-learning algorithms can help tackle this problem by recognizing patterns and adapting from similar contexts. Yet such approaches are often computationally expensive and difficult to integrate into hardware-constrained solutions. In this paper, we present a gesture recognition algorithm that is easily adaptable to new users and contexts. We exploit an optimization-based meta-learning approach to enable gesture recognition from short learning sequences. This method aims to learn the best possible initialization of the model parameters, simplifying training on new contexts when only small amounts of data are available. The reduction in computational cost is achieved by processing the radar-sensed gesture data in the form of time maps, which minimizes the input data size. This approach enables a simple convolutional neural network (CNN) to adapt to new hand poses, easing the integration of the model into a hardware-constrained platform. Moreover, using a Variational Autoencoder (VAE) to reduce the gestures' dimensionality shrinks the model size by an order of magnitude and halves the required adaptation time. The proposed framework, deployed on the Intel(R) Neural Compute Stick 2 (NCS 2), achieves an average accuracy of around 84% on unseen gestures when only one example per class is used at training time. The accuracy increases to 92.6% and 94.2% when three and five samples per class are used, respectively.
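The core idea of the optimization-based meta-learning described above (learning a parameter initialization that adapts quickly from a few labeled examples) can be sketched with a first-order MAML-style loop. This is a minimal toy illustration on 1-D regression with NumPy, not the paper's CNN, radar data, or exact algorithm; all function names and hyperparameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    # Hypothetical toy "task": 1-D linear regression y = w * x,
    # where each task draws its own slope w (stand-in for a new user/gesture context).
    w = rng.uniform(-2.0, 2.0)
    xs = rng.uniform(-1.0, 1.0, size=10)
    return xs, w * xs

def mse(theta, xs, ys):
    # Mean squared error of the one-parameter model y_hat = theta * x.
    return np.mean((theta * xs - ys) ** 2)

def loss_grad(theta, xs, ys):
    # Analytic gradient of the MSE above with respect to theta.
    return 2.0 * np.mean((theta * xs - ys) * xs)

def meta_train(meta_steps=500, inner_lr=0.1, meta_lr=0.05, k_shot=3):
    # Learn an initialization theta that adapts well after ONE inner
    # gradient step on a k-shot support set (first-order MAML flavor).
    theta = 0.0
    for _ in range(meta_steps):
        xs, ys = make_task()
        sup_x, sup_y = xs[:k_shot], ys[:k_shot]      # few-shot support set
        qry_x, qry_y = xs[k_shot:], ys[k_shot:]      # query set
        # Inner loop: task-specific adaptation from the shared initialization.
        theta_task = theta - inner_lr * loss_grad(theta, sup_x, sup_y)
        # Outer loop (first-order): improve the initialization using the
        # adapted parameters' loss on the query set.
        theta = theta - meta_lr * loss_grad(theta_task, qry_x, qry_y)
    return theta

theta0 = meta_train()
# Adapting to an unseen task with one gradient step should reduce its loss.
new_x = np.linspace(-1.0, 1.0, 8)
new_y = 1.5 * new_x
adapted = theta0 - 0.1 * loss_grad(theta0, new_x, new_y)
assert mse(adapted, new_x, new_y) < mse(theta0, new_x, new_y)
```

In the paper's setting, the single parameter `theta` would correspond to the weights of the small CNN operating on radar time maps, and each "task" would be a new user's gesture set with one, three, or five examples per class.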