Liu et al. Intell Robot 2024;4(3):256-75 Intelligence & Robotics
DOI: 10.20517/ir.2024.17
Research Article Open Access
Parallel implementation for real-time visual SLAM systems based on heterogeneous computing
Han Liu 1,#, Yanchao Dong 2,3,4,#, Chengbin Hou 1, Yuhao Liu 2, Zhanyi Shu 1, Sixiong Xu 2,3, Tingting Lv 2
1 CRRC Qingdao Sifang Co., Ltd, R&D Center, Qingdao 266000, Shandong, China.
2 College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China.
3 National Key Laboratory of Autonomous Intelligent Unmanned Systems, Shanghai 201210, China.
4 Frontiers Science Center for Intelligent Autonomous Systems, Ministry of Education, Shanghai 200000, China.
# Authors contributed equally.
Correspondence to: Dr. Yuhao Liu, College of Electronics and Information Engineering, Tongji University, No. 4800, Cao’an Highway, Jiading District, Shanghai 201804, China. E-mail: 2211271@tongji.edu.cn
How to cite this article: Liu H, Dong Y, Hou C, Liu Y, Shu Z, Xu S, Lv T. Parallel implementation for real-time visual SLAM systems
based on heterogeneous computing. Intell Robot 2024;4(3):256-75. http://dx.doi.org/10.20517/ir.2024.17
Received: 24 May 2024 First Decision: 9 Jul 2024 Revised: 20 Aug 2024 Accepted: 26 Aug 2024 Published: 31 Aug 2024
Academic Editor: Simon X. Yang Copy Editor: Dong-Li Li Production Editor: Dong-Li Li
Abstract
Simultaneous localization and mapping has developed rapidly and plays an indispensable role in intelligent vehicles. However, many state-of-the-art visual simultaneous localization and mapping (VSLAM) frameworks are very time-consuming in both the front-end and the back-end, especially for large-scale scenes. The increasingly widespread use of graphics processors for general-purpose computing, together with the maturing high-performance programming theory based on compute unified device architecture (CUDA), makes it possible for large-scale VSLAM to resolve the conflict between limited computing power and excessive computing tasks. This paper proposes a full-flow optimal parallelization scheme based on heterogeneous computing to speed up the time-consuming modules in VSLAM. Firstly, a parallel strategy for feature extraction and matching is designed to reduce the time consumed by multiple data transfers between devices. Secondly, a bundle adjustment method based solely on CUDA is developed. By fully optimizing memory scheduling and task allocation, a large increase in speed is achieved while maintaining accuracy. In addition, CUDA heterogeneous acceleration is fully utilized for tasks such as error computation and linear system construction in the VSLAM back-end to improve operating speed. The proposed method is tested on numerous public datasets on both computer and embedded platforms. A number of qualitative and quantitative experiments are performed to verify its superiority in speed over other state-of-the-art methods.
Keywords: VSLAM, feature extraction and matching, heterogeneous computing, bundle adjustment
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0
International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you
give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate
if changes were made.
www.oaepublish.com/ir

