

               2. METHODS
               2.1. Dataset
The WaDaBa dataset is a curated collection of images of plastics commonly used in society. The dataset covers seven distinct varieties of plastic. The images show several forms of plastic placed on a platform under two lighting conditions, an LED bulb and a fluorescent lamp, as displayed in Figure 2. Table 1 shows the distribution of the 4000 images in the dataset according to their classes. As there are no images in the PVC and PE-LD classes, both classes have been excluded from the deep learning models. In the current work, the deep learning models are trained on the five classes that contain images, i.e., PETE, PE-HD, PP, PS, and Other. The deep learning models are set up so that each output corresponds to one of the five classes. When images for PVC and PE-LD are released, these classes can be included in the models. The dataset's classes are imbalanced, with the Other class holding just 40 images and the PETE class consisting of 2000 images. The dataset is freely accessible to the public[15].
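As an illustration only, the sketch below shows one possible way to load the five usable classes and counter the imbalance with weighted sampling; the folder layout (wadaba/<class>/*.jpg), the sampler-based weighting, and the batch size are assumptions for illustration and are not prescribed by the WaDaBa release or by this work.

# Minimal sketch (assumed layout: wadaba/<class>/*.jpg for the five usable classes).
import torch
from collections import Counter
from torchvision import datasets, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),                     # input size expected by the pre-trained models
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

dataset = datasets.ImageFolder("wadaba", transform=tfm)   # hypothetical local path

# Weighted sampling to counter the imbalance (e.g., 2000 PETE images vs. 40 Other images).
counts = Counter(dataset.targets)
weights = [1.0 / counts[t] for t in dataset.targets]
sampler = torch.utils.data.WeightedRandomSampler(weights, num_samples=len(weights))
loader = torch.utils.data.DataLoader(dataset, batch_size=32, sampler=sampler)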
               2.2. Transfer learning
A large amount of data is needed to obtain optimum accuracy from a neural network, and the model must be trained for hours on a powerful Graphical Processing Unit (GPU) to obtain results. With the advent of transfer learning[26], there has been a significant change in the learning process of deep neural networks. A model that has already been trained on a large dataset such as ImageNet[27], known as a pre-trained model, enhances the transfer learning process. Transfer learning works by freezing[28] the initial hidden layers of the model and fine-tuning its final layers. A frozen layer is not trained, so its weights remain unchanged. As the dataset used in this research is relatively small, with a limited number of images in each class, transfer learning best suits this research. The pre-trained models used in the research are further explained in the following subsections.
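The freeze-and-fine-tune procedure described above can be sketched in PyTorch as follows; the choice of ResNet-50 as the backbone, the optimiser, and the learning rate are illustrative assumptions rather than the exact configuration used in this work.

# Sketch of transfer learning: freeze the pre-trained layers, fine-tune a new 5-class head.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5                      # PETE, PE-HD, PP, PS, Other

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)  # pre-trained on ImageNet

# Freeze the initial (pre-trained) layers: their weights will not be updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer so each output matches one of the five classes.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)   # the new layer is trainable by default

# Only the parameters of the new head are passed to the optimiser.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)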

               2.2.1. AlexNet
AlexNet is a neural network with five convolutional layers and three fully connected layers, introduced in 2012 by Alex Krizhevsky. AlexNet increases learning capacity by increasing network depth and using multi-parameter tuning techniques. It uses ReLU to add non-linearity and dropout to reduce overfitting. CNN-based applications gained popularity following AlexNet's excellent performance on the ImageNet dataset in 2012[23]. The architecture of AlexNet is shown in Figure 3.
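For illustration, a minimal sketch of adapting a pre-trained AlexNet to the five plastic classes is given below; the classifier index follows the torchvision implementation, and the five-class head and frozen feature extractor are assumptions based on the transfer learning setup of Section 2.2, not the authors' exact configuration.

# Sketch: pre-trained AlexNet with its classifier adapted to the five plastic classes.
import torch.nn as nn
from torchvision import models

alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

for param in alexnet.features.parameters():   # freeze the convolutional feature extractor
    param.requires_grad = False

# The torchvision classifier ends with a Linear(4096, 1000) layer; swap it for 5 outputs.
alexnet.classifier[6] = nn.Linear(4096, 5)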
               2.2.2. Resnet-50
Residual networks (Resnet-50) are convolutional neural networks with skip connections, an extremely deep stack of convolutions, and roughly 25 million parameters. A skip connection after each block addresses the vanishing gradient problem by allowing the signal to bypass some layers in the network. Each block combines its convolutions with batch normalization and ReLU activation to achieve the desired result[21]. The architecture of Resnet-50 is displayed in Figure 4.
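A minimal sketch of the skip-connection idea is shown below. For brevity it uses the simpler two-convolution residual block rather than the bottleneck blocks of the full Resnet-50; it is an illustration of the mechanism, not the authors' implementation.

# Simplified residual block illustrating the skip connection (identity shortcut).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                                   # the skip connection
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)               # gradients can flow through the shortcut

block = ResidualBlock(64)
y = block(torch.randn(1, 64, 56, 56))                  # output keeps the input shape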

               2.2.3. ResNeXt
Proposed by Facebook and ranking second in ILSVRC 2016, ResNeXt uses the repeating-layer strategy of Resnet-50 and appends the split-transform-merge method[22]. The size of the set of transformations is known as cardinality. Cardinality provides a novel approach to modifying model capacity by increasing the number of separate routes. In addition to width and depth as critical characteristics, ResNeXt adds cardinality as a new dimension. Increasing cardinality is a practical approach to enhancing the accuracy of the model[22]. The architecture of ResNeXt is shown in Figure 5.
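In practice, cardinality is realised with grouped convolutions: the 3 × 3 convolution inside each block is split into parallel paths whose outputs are aggregated. A minimal sketch, assuming the standard resnext50_32x4d variant from torchvision and an illustrative five-class head, is given below; it is not the authors' exact configuration.

# Sketch: cardinality via grouped convolution, and a pre-trained ResNeXt with a 5-class head.
import torch.nn as nn
from torchvision import models

# A 3x3 convolution with groups=32 splits the transformation into 32 parallel paths (cardinality 32).
grouped_conv = nn.Conv2d(128, 128, kernel_size=3, padding=1, groups=32, bias=False)

# Pre-trained ResNeXt-50 (cardinality 32, bottleneck width 4) adapted to the five plastic classes.
resnext = models.resnext50_32x4d(weights=models.ResNeXt50_32X4D_Weights.IMAGENET1K_V1)
resnext.fc = nn.Linear(resnext.fc.in_features, 5)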