Page 124 | Wu et al. Intell Robot 2022;2(2):10529 | http://dx.doi.org/10.20517/ir.2021.20
Table 8. Fusion stage and fusion methods of LiDAR-fusion tasks. Here, "I" represents image, "L" represents LiDAR point cloud, and "R" represents Radar point cloud. Duplicate articles between classification and detection are merged into the detection part.

Task           | Model             | Input | Fusion stage | Details of the fusion method
Classification | ImVoteNet [48]    | I&L   | Late fusion  | Lifts 2D image votes and semantic and texture cues to the 3D seed points
Detection      | 3D-CVF [64]       | I&L   | Early fusion | Adaptive gated fusion: spatial attention maps mix features according to the region
               | Roarnet [65]      | I&L   | Late fusion  | 3D detection performs in-depth inference recursively on candidate regions from 2D detection
               | MV3D [12]         | I&L   | Early fusion | Region-based fusion via ROI pooling
               | SCANet [46]       | I&L   | Early fusion | A multi-level fusion module fuses the region-based features
               | MMF [47]          | I&L   | Multi fusion | Region-wise features from multiple views are fused by a deep fusion scheme
               | PointPainting [66]| I&L   | Early fusion | Sequential fusion: project the point cloud into the output of an image semantic segmentation network
               | CM3D [67]         | I&L   | Early fusion | Two stages: point-wise feature fusion and ROI-wise feature fusion
               | MVDNet [28]       | R&L   | Early fusion | Region-wise features from the two sensors are fused to improve the final detection results
               | CLOCs [69]        | I&L   | Late fusion  | Output candidates of the image and the LiDAR point cloud are fused before NMS
Tracking       | MSRT [85]         | I&L   | Late fusion  | 2D bboxes are converted to 3D bboxes, which are fused to associate data between sensors
               | MS3DT [86]        | I&L   | Early fusion | Object proposals generated by MV3D are input to the match network to link detections
               | Compl.-YOLO [87]  | I&L   | Late fusion  | Semantic voxel grid: project all relevant voxelized points into the semantic image
               | F-Siamese [88]    | I&L   | Late fusion  | 2D region proposals are extruded into 3D viewing frustums
Semantic Seg.  | 3DMV [107]        | I&L   | Early fusion | 3D geometry and per-voxel max-pooled image features are fed into two 3D conv. streams
               | SuperSensor [95]  | I&L   | Late fusion  | Segmentation results from the image space are transferred onto the 3D points
               | FuseSeg [3]       | I&L   | Early fusion | Fuses RGB and range-image features via point correspondences and feeds them to the network
               | PMF [109]         | I&L   | Early fusion | Residual-based fusion modules fuse image features into the LiDAR stream network
Instance Seg.  | Pano.Fusion [116] | I&L   | Late fusion  | 2D panoptic segmentation outputs are fused with depth to output a volumetric map
               | F-PointNets [117] | I&L   | Late fusion  | Frustum proposal: extrude each 2D region proposal into a 3D viewing frustum
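To make the sequential-fusion idea in the table concrete, the following is a minimal NumPy sketch of PointPainting-style fusion: LiDAR points are projected into the image plane with the camera intrinsics, and each point is "painted" with the per-pixel class scores of the image segmentation network. The function name, argument layout, and intrinsics handling are illustrative assumptions, not code from any of the surveyed papers.

```python
import numpy as np

def paint_points(points_xyz, sem_scores, K):
    """Append image semantic scores to LiDAR points (illustrative sketch).

    points_xyz: (N, 3) points in the camera frame (z forward)
    sem_scores: (H, W, C) softmax output of an image segmentation network
    K:          (3, 3) camera intrinsic matrix
    Returns an (N, 3 + C) array of "painted" points; points projecting
    outside the image (or behind the camera) keep zero scores.
    """
    H, W, C = sem_scores.shape
    painted = np.zeros((points_xyz.shape[0], 3 + C))
    painted[:, :3] = points_xyz

    # Perspective projection: u = fx*x/z + cx, v = fy*y/z + cy
    z = points_xyz[:, 2]
    valid = z > 1e-6                       # in front of the camera
    uv = (K @ points_xyz.T).T              # (N, 3) homogeneous pixels
    safe_z = np.where(valid, z, 1.0)       # avoid division by zero
    u = np.round(uv[:, 0] / safe_z).astype(int)
    v = np.round(uv[:, 1] / safe_z).astype(int)
    inside = valid & (u >= 0) & (u < W) & (v >= 0) & (v < H)

    # Gather per-pixel class scores for the points that hit the image
    painted[inside, 3:] = sem_scores[v[inside], u[inside]]
    return painted
```

The painted points can then be fed to any LiDAR-only detector in place of the raw point cloud, which is what makes this "sequential" fusion: the image network runs first, and its output becomes extra point features.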
environment. We hope that this introductory survey serves as a step in the pursuit of a robust, precise, and
efficient 3D perception system and guides the direction of its future development.
DECLARATIONS
Authors’ contributions
Made substantial contributions to the conception and design of the study and performed data analysis and interpretation: Wu D, Liang Z
Performed data acquisition, as well as provided administrative, technical, and material support: Chen G
Availability of data and materials
Not applicable.
Financial support and sponsorship
None.
Conflicts of interest
All authors declared that there are no conflicts of interest.
Ethical approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Copyright
© The Author(s) 2022.