Monocular visual SLAM, visual odometry, and structure from motion methods applied to 3D reconstruction: A comprehensive survey
Metadata
Publisher
Elsevier
Subject
Monocular SLAM; Monocular visual odometry; Monocular structure from motion
Date
2024-09-10
Bibliographic reference
Herrera Granda, E.P. et al. Heliyon 10 (2024) e37356. [https://doi.org/10.1016/j.heliyon.2024.e37356]
Abstract
Monocular Simultaneous Localization and Mapping (SLAM), Visual Odometry (VO), and Structure
from Motion (SFM) are techniques that have emerged recently to address the problem of
reconstructing objects or environments using monocular cameras. Monocular pure visual techniques
have become attractive solutions for 3D reconstruction tasks due to their affordability,
low weight, ease of deployment, good outdoor performance, and availability in most handheld
devices without requiring additional input devices. In this work, we comprehensively overview
the SLAM, VO, and SFM solutions for the 3D reconstruction problem that use a monocular RGB
camera as the only source of information, to gather basic knowledge of this ill-posed problem and
classify the existing techniques following a taxonomy. To achieve this goal, we extended the
existing taxonomy to cover all the current classifications in the literature, comprising classic,
machine learning, direct, indirect, dense, and sparse methods. We performed a detailed overview
of 42 methods, considering 18 classic and 24 machine learning methods according to the ten
categories defined in our extended taxonomy, comprehensively systematizing their algorithms
and providing their basic formulations. Relevant information about each algorithm was summarized
in nine criteria for classic methods and eleven criteria for machine learning methods to
provide the reader with decision components to implement, select or design a 3D reconstruction
system. Finally, an analysis of the temporal evolution of each category was performed, which
determined that the classical-sparse-indirect and classical-dense-indirect categories have been the
most accepted solutions to the monocular 3D reconstruction problem over the last 18 years.