

2006. Methods and models such as decision trees[18], support vector machines (SVM)[19], AdaBoost[20], kernel SVM[21], and random forests[22] were proposed. Graphical models were proposed in 2001 to provide a description framework for various machine learning methods such as SVM, naïve Bayes, and the hidden Markov model[23].
Complementary priors were introduced in 2006 to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers[24]. The rectified linear activation function (ReLU) was introduced in 2011 and became the default activation function of many neural networks. ReLU outputs the input directly if the input is positive; otherwise, it outputs zero. ReLU is effective in tackling the vanishing gradient problem[25].
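For reference, ReLU admits the simple piecewise-linear definition below (a standard formulation stated here for completeness, not quoted from the cited works):

f(x) = \max(0, x) =
\begin{cases}
x, & x > 0 \\
0, & x \le 0
\end{cases}

Because the derivative is exactly 1 for all positive inputs, gradients pass through active units unattenuated during backpropagation, which is why ReLU mitigates the vanishing gradient problem.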

In 2012, AlexNet, a large deep convolutional neural network, was developed and trained to participate in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) for the first time and delivered state-of-the-art results that drew attention from researchers[26]. AlexNet pioneered the use of the graphics processing unit (GPU) to train neural networks. In the following years, deep learning became increasingly popular: architectures and training methods improved rapidly, and hardware became markedly more powerful. Deep learning has since been adopted by more industries, including the railway industry, and has delivered more meaningful results. This paper focuses on reviewing research on deep learning applications to rail track condition monitoring since 2013.

2.2. Common deep learning models
Artificial intelligence, machine learning, and deep learning have developed rapidly in recent years, and there are now far more deep learning networks than one can practically keep track of. Because resources for learning any particular deep learning method are abundant, this section only lists selected methods, with brief introductions, to give practitioners and researchers an overview of the techniques available to choose from. More detailed implementation guides can be found in the many references available.

The most commonly encountered neural network names are probably the convolutional neural network (CNN) and the recurrent neural network (RNN). CNN is sometimes written as ConvNet. The architecture of a CNN was inspired by the organization of the visual cortex and is analogous to the connectivity pattern of neurons in the human brain. Individual neurons respond to stimuli only in a restricted region of the visual field known as the receptive field, and a collection of such fields overlaps to cover the entire visual area. A CNN takes in an input image, assigns importance (learnable weights and biases) to various aspects or objects in the image, and can differentiate one from another. An RNN, which was derived from feedforward neural networks, connects nodes to form a directed graph along a temporal sequence so as to exhibit temporal dynamic behavior. An RNN's internal states (memory) are utilized to process variable-length sequences of inputs. A typical RNN architecture is the long short-term memory (LSTM) network, which has feedback connections and can process both single data points (such as images) and entire sequences of data (such as speech or video). Applications of CNN and RNN to rail maintenance operations are commonly available, but CNN has been more widely adopted.
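To make these two architectures concrete, the sketch below (our illustration in PyTorch, not code from any of the reviewed works; all layer sizes and the 32 x 32 input resolution are arbitrary assumptions) defines a minimal CNN image classifier and a minimal LSTM sequence classifier:

import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Minimal CNN: stacked convolution + ReLU + pooling, then a linear classifier."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # local receptive fields
            nn.ReLU(),
            nn.MaxPool2d(2),                             # downsample 2x
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # assumes 32x32 input

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

class SmallLSTM(nn.Module):
    """Minimal LSTM: reads a sequence, classifies from the final hidden state."""
    def __init__(self, input_size=8, hidden_size=32, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                # x: (batch, time, features)
        _, (h_n, _) = self.lstm(x)       # h_n: (1, batch, hidden)
        return self.classifier(h_n[-1])

cnn_logits = SmallCNN()(torch.randn(4, 3, 32, 32))  # 4 RGB images, 32x32
rnn_logits = SmallLSTM()(torch.randn(4, 20, 8))     # 4 sequences of 20 time steps
print(cnn_logits.shape, rnn_logits.shape)           # torch.Size([4, 2]) for both

The CNN's stacked convolution-and-pooling stages mirror the overlapping receptive fields described above, while the LSTM consumes one time step at a time and carries its hidden state (memory) forward across the sequence.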


There are also some neural network architectures, based on CNN but with novel configurations supporting specific functions and tasks, that might provide inspiration for rail maintenance operations. A Siamese neural network, also called a twin neural network, is an artificial neural network that uses the same weights while working in tandem on two different input vectors to calculate similarity scores between the output vectors[27]. Figure 2 shows how the CNN layers are positioned to form the architecture of the Siamese neural network.
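A minimal sketch of this weight-sharing idea is given below (our illustration; the encoder layout and the cosine-similarity score are assumptions made for the example, not the specific configuration shown in Figure 2):

import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    """Twin network: one shared CNN encoder applied to both inputs."""
    def __init__(self, embed_dim=64):
        super().__init__()
        # The encoder weights are shared: the same module embeds both branches.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(32, embed_dim),
        )

    def forward(self, x1, x2):
        e1, e2 = self.encoder(x1), self.encoder(x2)
        # Cosine similarity of the two embeddings serves as the similarity score.
        return F.cosine_similarity(e1, e2, dim=1)

net = SiameseNet()
score = net(torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32))
print(score.shape)  # torch.Size([4]); each score lies in [-1, 1]

Because both branches call the same encoder module, a gradient update from either branch changes the single shared set of weights, which is what makes the two towers "twins".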