Ji et al. Intell Robot 2021;1(2):151-75 https://dx.doi.org/10.20517/ir.2021.14 Page 165

Table 3. Adoption of deep learning models
Deep learning model Ref.
AlexNet, ResNet [76]
Autoencoder [89,101]
CNN [61,62,65,68,69,74,75,79,85,94,95,102,108,109]
CNN (object detection), RNN (distance estimation) [93]
CNN + YOLOv3 [117]
CNN based self-proposed DFF-Net [118]
CNN based self-proposed FR-Net [72]
CNN to extract features, only 1 class [82]
CNN, transfer learning [64]
CNN, transfer learning, Bayesian optimization to tune hyperparameters [100]
CNN-LSTM [87,103,107]
Faster R-CNN [78,88]
Faster R-CNN + CNN [73]
FastNet, convolutional network-based [120]
Fine-grained bilinear CNNs model [70]
FCN [119]
GAN for CNN [115]
Inception-ResNet-v2 & CNN [113]
LSTM-RNN [63,71,99]
Mask R-CNN [121]
ML Tree based methods [80]
MobileNetV2, YOLOv3 [84]
Multilayer feedforward neural networks based on multi-valued neurons (MLMVN) [60]
Neural network [96]
Point Cloud deep learning [92]
ResNet classifier, DenseNet classifier [81]
ResNet, FCN [83]
ResNet50, transfer learning, Inception, Faster R-CNN [67]
Self-proposed, 2 stage FaultyNet, CNN based [97]
Self-proposed, segment U-Net (CNN based) then detect, progressive [116]
Self-proposed, ShuffleNet-v2 extracts features from the track image, RPN predicts [86]
Siamese convolutional neural network [66,91]
Single Shot multibox Detector (SSD) [90]
SqueezeNet, MobileNetV2 [111]
U-Net to segment [105]
Variational autoencoder [98]
YOLOv3 [77,104,106,110,114]
YOLOv5 detects objects; Mask R-CNN detects surface defects; ResNet classifies fastener state [112]

CNN: convolutional neural network; RNN: recurrent neural network; YOLOv3: You Only Look Once, Version 3; LSTM: long short-term memory;
DFF-Net: differential feature fusion convolutional neural network; FR-Net: feature fusion refine neural network; FCN: fully convolutional network;
RPN: region proposal network.
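
The workflow described in the surrounding text (random assignment of samples to training and testing sets, then training on one set and validating on the other) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data, the names `train_test_split`, `fit_threshold` and `accuracy`, the 80/20 ratio, and the threshold rule that stands in for a deep learning model. It is not the procedure of any study listed in Table 3.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Randomly assign samples to a training set and a testing set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train_set):
    """Toy stand-in for model training: midpoint between the two class means."""
    values = {"normal": [], "defect": []}
    for (x,), label in train_set:
        values[label].append(x)
    mean_normal = sum(values["normal"]) / len(values["normal"])
    mean_defect = sum(values["defect"]) / len(values["defect"])
    threshold = (mean_normal + mean_defect) / 2
    return lambda features: "defect" if features[0] >= threshold else "normal"


def accuracy(predict, test_set):
    """Fraction of held-out (features, label) pairs labelled correctly."""
    correct = sum(1 for features, label in test_set if predict(features) == label)
    return correct / len(test_set)


# Hypothetical data: one scalar feature per sample, two classes.
data = [((i,), "defect" if i >= 50 else "normal") for i in range(100)]

train, test = train_test_split(data, test_fraction=0.2, seed=42)
model = fit_threshold(train)   # trained only on the training data
acc = accuracy(model, test)    # validated only on the testing data
print(f"train={len(train)} test={len(test)} accuracy={acc:.2f}")
```

The point of the sketch is the separation of roles: the model never sees the testing samples during fitting, so the reported accuracy reflects held-out data. A real study would replace `fit_threshold` with one of the architectures in Table 3.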

randomly assigned for training and testing purposes. Fourth, the selected deep learning model is trained
on the training data and validated on the testing data. Depending on the purpose, the deep learning
model may perform classification or localization. Performing classification and localization
concurrently is also possible, and is the most common type of task. Feature representations of the input
data are always extracted; what differs is how the resulting feature vectors are processed afterwards.
Researchers have also proposed their own neural network architectures to replace or complement the existing
