Page 122 Wu et al. Intell Robot 2022;2(2):105-29 | http://dx.doi.org/10.20517/ir.2021.20
Table 6. Summary of 3D semantic segmentation. "I", "mvPC", "vPC", "pPC" and "rm" stand for image, point cloud in multi-view-based
representation, point cloud in voxel-based representation, point cloud in point-based representation and range map, respectively

| Category | Model | Modality & Representation | Architecture |
|---|---|---|---|
| LiDAR-Only | PointNet [1] | pPC | Point-wise MLP + T-Net + global max pooling |
| LiDAR-Only | PointNet++ [2] | pPC | Set abstraction (sampling, grouping, feature learning) + interpolation + skip link concatenation |
| LiDAR-Only | KWYND [92] | pPC | Feature network + neighbors definition + regional descriptors |
| LiDAR-Only | MPC [93] | pPC | PointNet++-like network + Gumbel subset sampling |
| LiDAR-Only | 3D-MiniNet [97] | pPC | Fast 3D point neighbor search + 3D MiniNet + post-processing |
| LiDAR-Only | LU-Net [100] | pPC & vPC | U-Net for point cloud |
| LiDAR-Only | SceneEncoder [101] | pPC | Multi-hot scene descriptor + region similarity loss |
| LiDAR-Only | RPVNet [13] | rPC & pPC & vPC | Range-point-voxel fusion network (deep fusion + gated fusion module) |
| LiDAR-Only | SqueezeSeg [102] | mvPC | SqueezeNet + conditional random field |
| LiDAR-Only | PointSeg [103] | mvPC | SqueezeNet + new feature extract layers |
| LiDAR-Only | Pointwise [105] | pPC | Pointwise convolution operator |
| LiDAR-Only | Dilated [106] | pPC | Dilated point convolutions |
| LiDAR-Fusion | 3DMV [107] | I & vPC | A novel end-to-end network (back propagation layer) |
| LiDAR-Fusion | SuperSensor [95] | I & mvPC | Associate architecture + 360 degree sensor configuration |
| LiDAR-Fusion | MVPNet [108] | I & mvPC | Multi-view point regression network + geometric loss |
| LiDAR-Fusion | FuseSeg [3] | I & rPC | Point correspondence + feature level fusion |
| LiDAR-Fusion | PMF [109] | I & mvPC | Perspective projection + a two-stream network (fusion part) + perception-aware loss |
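As a rough illustration of the "point-wise MLP + global max pooling" entry for PointNet in Table 6, the sketch below applies one shared linear layer with ReLU to every point and then takes a channel-wise maximum over all points. It is a minimal pure-Python sketch, not the authors' implementation: the layer sizes, random weights and function names are assumptions made for illustration, and the T-Net and the per-point segmentation head are omitted.

```python
import random


def shared_mlp(point, weights, biases):
    """Apply one linear layer + ReLU to a single point (weights are shared by all points)."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, point)) + b)
        for row, b in zip(weights, biases)
    ]


def global_feature(points, weights, biases):
    """Point-wise MLP followed by channel-wise max pooling over the whole point set."""
    per_point = [shared_mlp(p, weights, biases) for p in points]
    return [max(channel) for channel in zip(*per_point)]


rng = random.Random(0)
# Hypothetical sizes: 3 input coordinates, 8 feature channels, 16 points.
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
biases = [rng.uniform(-1, 1) for _ in range(8)]
points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(16)]

# Reorder the same points to check that the pooled feature ignores point order.
shuffled = points[:]
rng.shuffle(shuffled)

feat = global_feature(points, weights, biases)
feat_shuffled = global_feature(shuffled, weights, biases)
print(feat == feat_shuffled)
```

Because the maximum is taken over the set of per-point features, the pooled vector does not depend on the order in which the points are listed, which is the property that lets this design consume unordered point clouds directly.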
Table 7. Summary of 3D instance segmentation. "I", "mvPC", "vPC", "pPC", "FPC" and "rm" stand for image, point cloud in multi-view-based
representation, point cloud in voxel-based representation, point cloud in point-based representation, point cloud in frustum
representation and range map, respectively

| Category | Model | Modality & Representation | Architecture |
|---|---|---|---|
| LiDAR-Only | GSPN [111] | pPC | Region-based PointNet (generative shape proposal network + Point RoIAlign) |
| LiDAR-Only | 3D-BoNet [112] | pPC | Instance-level bounding box prediction + point-level mask prediction |
| LiDAR-Only | Joint [113] | pPC | Spatial embedding object proposal + local bounding box refinement |
| LiDAR-Only | SqueezeSeg [102] | mvPC | SqueezeNet + conditional random field |
| LiDAR-Only | SqueezeSegV2 [114] | mvPC | SqueezeSeg-like + context aggregation module |
| LiDAR-Only | 3D-BEVIS [118] | mvPC | 2D-3D deep model (2D instance feature + 3D feature propagation) |
| LiDAR-Fusion | PanopticFusion [116] | I & vPC | Pixel-wise panoptic labels + a fully connected conditional random field |
| LiDAR-Fusion | Frustum PointNets [117] | I & FPC | Frustum proposal + 3D instance segmentation (PointNet) |
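The "frustum proposal" step listed for Frustum PointNets in Table 7 can be pictured as keeping only the LiDAR points whose image projection falls inside a 2D detection box. The sketch below is a simplified illustration under stated assumptions, not the published implementation: the points are assumed to be already expressed in the camera frame, a plain pinhole model is used, and the intrinsics (fx, fy, cx, cy), the box and the function name are hypothetical values chosen for the example.

```python
def frustum_points(points, box, fx, fy, cx, cy):
    """Keep camera-frame points (x, y, z) whose pinhole projection lies inside a 2D box."""
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:  # point is behind the camera, so it cannot be in the frustum
            continue
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


# Hypothetical camera-frame points (metres) and a hypothetical 2D box (pixels).
points = [(0.0, 0.0, 10.0), (5.0, 0.0, 10.0), (0.5, 0.2, 20.0), (0.0, 0.0, -5.0)]
box = (300.0, 220.0, 340.0, 260.0)
selected = frustum_points(points, box, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(selected)
```

Only the points inside the frustum would then be passed to the point-based segmentation network, which keeps the 3D search space small compared with processing the full scan.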
segmentation results through clustering.
7. DISCUSSION
As the upstream and key module of an autonomous vehicle, the perception system passes its results to downstream
modules (e.g., decision and planning modules). The performance and reliability of the perception system therefore
determine how well downstream tasks can be carried out, and thus affect the performance of the whole autonomous
system. Although sensor fusion (Table 8 summarizes the LiDAR fusion architectures covered in this paper) can
compensate for the shortcomings of a single LiDAR in bad weather and in other respects, a large gap remains between
algorithm design and practical application in the real world. It is therefore necessary to be aware of the existing
open challenges and to identify possible directions toward solutions. This section discusses the challenges and
possible solutions for LiDAR-based 3D perception.
• Dealing with large-scale point clouds and high-resolution images. The need for higher accuracy has
prompted researchers to consider larger-scale point clouds and higher-resolution images. Most of the existing
algorithms [2,29,36,119] are designed for small 3D point clouds (e.g., 4k points or 1 m × 1 m blocks) and do not
extend well to larger point clouds (e.g., millions of points and up to 200 m × 200 m). However,
larger point clouds come with a higher computational cost that is hard to afford for self-driving cars with
limited computational processing ability. Several recent studies have focused on this problem and proposed
some solutions. A deep learning framework for large-scale point clouds named SPG [120] partitions point