

2006. Methods and models such as decision trees[18], support vector machines (SVM)[19], AdaBoost[20], kernel SVM[21], and random forests[22] were proposed. Graphical models were proposed in 2001 to provide a description framework for various machine learning methods such as SVM, naïve Bayes, and the hidden Markov model[23].
Complementary priors were introduced in 2006 to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers[24]. The rectified linear activation function (ReLU) was introduced in 2011 and became the default activation function of many neural networks. ReLU outputs the input directly if the input is positive; otherwise, it outputs zero. ReLU is effective in tackling the vanishing gradient problem[25].
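For reference, ReLU admits the simple piecewise-linear definition below (a standard formulation stated here for completeness, not quoted from the cited works):

f(x) = \max(0, x) =
\begin{cases}
x, & x > 0 \\
0, & x \le 0
\end{cases}

Because the derivative is exactly 1 for all positive inputs, gradients pass through active units unattenuated during backpropagation, which is why ReLU mitigates the vanishing gradient problem.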

In 2012, AlexNet, a large deep convolutional neural network, was developed and trained to participate in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) for the first time and delivered state-of-the-art results that drew attention from researchers[26]. AlexNet pioneered the use of the graphics processing unit (GPU) to train neural networks. In the following years, deep learning became increasingly popular: architectures and training methods improved rapidly, and hardware became markedly more powerful. Deep learning has since been adopted by more industries, including the railway industry, and has delivered more meaningful results. This paper focuses on reviewing research on deep learning applications to rail track condition monitoring since 2013.

2.2. Common deep learning models
Artificial intelligence, machine learning, and deep learning have developed rapidly in recent years, and there are now far more deep learning networks than one can practically keep track of. Because resources for learning any particular deep learning method are abundant, this section only lists selected methods, with brief introductions, to give practitioners and researchers an overview of the techniques available to choose from. More detailed implementation guides can be found in the many references available.

The most commonly encountered neural network names are probably the convolutional neural network (CNN) and the recurrent neural network (RNN). CNN is sometimes written as ConvNet. The architecture of a CNN was inspired by the organization of the visual cortex and is analogous to the connectivity pattern of neurons in the human brain. Individual neurons respond to stimuli only in a restricted region of the visual field known as the receptive field, and a collection of such fields overlaps to cover the entire visual area. A CNN takes in an input image, assigns importance (learnable weights and biases) to various aspects or objects in the image, and can differentiate one from another. An RNN, which was derived from feedforward neural networks, connects nodes to form a directed graph along a temporal sequence so as to exhibit temporal dynamic behavior. An RNN's internal states (memory) are utilized to process variable-length sequences of inputs. A typical RNN architecture is the long short-term memory (LSTM) network, which has feedback connections and can process both single data points (such as images) and entire sequences of data (such as speech or video). Applications of CNN and RNN to rail maintenance operations are commonly available, but CNN has been more widely adopted.
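To make these two architectures concrete, the sketch below (our illustration in PyTorch, not code from any of the reviewed works; all layer sizes and the 32 x 32 input resolution are arbitrary assumptions) defines a minimal CNN image classifier and a minimal LSTM sequence classifier:

import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Minimal CNN: stacked convolution + ReLU + pooling, then a linear classifier."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # local receptive fields
            nn.ReLU(),
            nn.MaxPool2d(2),                             # downsample 2x
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # assumes 32x32 input

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

class SmallLSTM(nn.Module):
    """Minimal LSTM: reads a sequence, classifies from the final hidden state."""
    def __init__(self, input_size=8, hidden_size=32, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                # x: (batch, time, features)
        _, (h_n, _) = self.lstm(x)       # h_n: (1, batch, hidden)
        return self.classifier(h_n[-1])

cnn_logits = SmallCNN()(torch.randn(4, 3, 32, 32))  # 4 RGB images, 32x32
rnn_logits = SmallLSTM()(torch.randn(4, 20, 8))     # 4 sequences of 20 time steps
print(cnn_logits.shape, rnn_logits.shape)           # torch.Size([4, 2]) for both

The CNN's stacked convolution-and-pooling stages mirror the overlapping receptive fields described above, while the LSTM consumes one time step at a time and carries its hidden state (memory) forward across the sequence.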


There are also some neural network architectures, based on CNN but with novel configurations supporting specific functions and tasks, that might provide inspiration for rail maintenance operations. A Siamese neural network, also called a twin neural network, is an artificial neural network that uses the same weights while working in tandem on two different input vectors to calculate similarity scores between the output vectors[27]. Figure 2 shows how the CNN layers are positioned to form the architecture of the Siamese neural network.
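A minimal sketch of this weight-sharing idea is given below (our illustration; the encoder layout and the cosine-similarity score are assumptions made for the example, not the specific configuration shown in Figure 2):

import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    """Twin network: one shared CNN encoder applied to both inputs."""
    def __init__(self, embed_dim=64):
        super().__init__()
        # The encoder weights are shared: the same module embeds both branches.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(32, embed_dim),
        )

    def forward(self, x1, x2):
        e1, e2 = self.encoder(x1), self.encoder(x2)
        # Cosine similarity of the two embeddings serves as the similarity score.
        return F.cosine_similarity(e1, e2, dim=1)

net = SiameseNet()
score = net(torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32))
print(score.shape)  # torch.Size([4]); each score lies in [-1, 1]

Because both branches call the same encoder module, a gradient update from either branch changes the single shared set of weights, which is what makes the two towers "twins".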