

Table 2. Basic model classification performance test results on FERGIT

                 Precision   Recall   F1-score   Support
    Angry          0.64       0.49      0.56        260
    Disgust        0.75       0.57      0.65         37
    Fear           0.60       0.44      0.51        257
    Happy          0.89       0.93      0.91        735
    Sad            0.54       0.62      0.57        304
    Surprise       0.76       0.73      0.75        218
    Neutral        0.75       0.82      0.78        654
    Accuracy                            0.74       2465
    Run time                          44 min


Table 3. ResNet-based model classification performance test results on FERGIT

                 Precision   Recall   F1-score   Support
    Angry          0.62       0.54      0.58        260
    Disgust        0.71       0.54      0.62         37
    Fear           0.62       0.44      0.51        257
    Happy          0.89       0.92      0.90        735
    Sad            0.55       0.56      0.55        304
    Surprise       0.76       0.79      0.77        218
    Neutral        0.75       0.84      0.79        654
    Accuracy                            0.75       2465
    Run time                          48 min
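
Per-class precision, recall, and F1 scores of the kind reported in Tables 2 and 3 are the standard outputs of a classification report. The following is a minimal sketch using scikit-learn; this tooling and the dummy labels are assumptions for illustration, as the paper does not state how its tables were produced.

    # Sketch: per-class precision/recall/F1 as in Tables 2 and 3,
    # using scikit-learn (assumed tooling; not confirmed by the paper).
    import numpy as np
    from sklearn.metrics import classification_report

    LABELS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

    def report(y_true: np.ndarray, y_pred: np.ndarray) -> str:
        """y_true/y_pred are integer class ids in [0, 6]."""
        return classification_report(y_true, y_pred, target_names=LABELS, digits=2)

    # Example with random stand-in predictions (2465 = test-set size above):
    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 7, size=2465)
    y_pred = rng.integers(0, 7, size=2465)
    print(report(y_true, y_pred))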


To improve the success rate while reducing the loss, we used residual blocks, which proved effective: accuracy increased to 86% on the training data, and we finally obtained an accuracy of 75% on the test set, as shown in Table 3. This training took longer, running for 48 minutes.
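
As a sketch of the kind of block this refers to, a residual block under a Keras/TensorFlow setup (an assumption; the paper does not name its framework, and the filter counts here are illustrative, not the authors' exact architecture) might look like:

    # Sketch of a residual block; two convolutions plus a skip connection,
    # so gradients can flow directly back to earlier layers.
    from tensorflow.keras import layers

    def residual_block(x, filters: int = 64):
        shortcut = x
        y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        y = layers.Conv2D(filters, 3, padding="same")(y)
        # Match channel counts on the skip path if needed
        # (a 1x1 convolution; an assumed design choice).
        if shortcut.shape[-1] != filters:
            shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
        y = layers.Add()([y, shortcut])
        return layers.Activation("relu")(y)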


The model does well on the disgust, happy, surprise, and neutral (or contempt) expressions during the two phases. Despite the highly imbalanced training data (the happy label accounts for around 30% of the test split), which we alleviated with a class-weighted loss, our model's overall performance was quite good, as presented in the confusion matrix (see Figure 5). It can be seen that the residual-based network balanced the performance compared with the basic network, which was biased more toward the neutral and happy classes. In both cases, 93% of the images labelled happy were correctly predicted, while the prediction of fear was poor, barely 50% for the residual-based model. This is due to the mislabelling of most of the fear images.
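
The class-weighted loss mentioned above can be set up as in the sketch below, with weights inversely proportional to class frequency via scikit-learn; this particular weighting scheme is an assumption, as the paper does not spell out its formula.

    # Sketch: class weights for imbalanced labels (assumed scheme:
    # "balanced" weights, inversely proportional to class frequency).
    import numpy as np
    from sklearn.utils.class_weight import compute_class_weight

    def make_class_weights(y_train: np.ndarray) -> dict:
        classes = np.unique(y_train)
        weights = compute_class_weight("balanced", classes=classes, y=y_train)
        return dict(zip(classes.tolist(), weights.tolist()))

    # In Keras this would be passed as:
    #   model.fit(..., class_weight=make_class_weights(y_train))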


               3.2. Accuracy and loss during training
For the first attempt with the basic network, we observe that the model learns very well on the training data and generalizes to the validation data of the FERGIT database. There was no overfitting during the 100 training epochs, but the overall accuracy did not increase much, and the network did not stop training. We increased the number of epochs to seek better performance, but the network plateaued at 75%, the best the model could achieve. When evaluated on the test set, it achieved an accuracy of 74%. The loss was 0.48 on the training set, 0.79 on the validation set, and 0.70 on the test set. See Figure 6.
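
Since the network was left to run without stopping on its own, one common remedy is an early-stopping criterion on the validation loss. A Keras sketch follows purely as an illustration; whether the authors used such a callback is not stated.

    # Sketch: stop training once validation loss stops improving
    # (illustrative; not confirmed as part of the authors' setup).
    from tensorflow.keras.callbacks import EarlyStopping

    early_stop = EarlyStopping(
        monitor="val_loss",           # watch validation loss
        patience=10,                  # tolerate 10 stagnant epochs
        restore_best_weights=True,    # roll back to the best epoch
    )
    # history = model.fit(x_train, y_train,
    #                     validation_data=(x_val, y_val),
    #                     epochs=100, callbacks=[early_stop])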



In the second experiment, the accuracy increased slightly with the help of the residual blocks. Using them allowed gradients to propagate to the early layers and adjust all the weights, yielding a better result. We observe that after 35 epochs, the model was no longer generalizing well. Nonetheless, we did not discontinue training, because the training loss kept decreasing rapidly while the validation loss remained stable and the accuracy was still increasing. The model achieved a training accuracy of 86%, a validation accuracy of 74%, and a test accuracy of 75%. The loss was 0.20 on the training set, 0.82 on the validation set, and 0.80 on the test set. See Figure 7.
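
Curves like those in Figures 6 and 7 can be reproduced from the Keras training history; a minimal sketch, assuming `history` is the object returned by `model.fit`:

    # Sketch: accuracy/loss curves as in Figures 6 and 7,
    # assuming `history` comes from model.fit(...).
    import matplotlib.pyplot as plt

    def plot_history(history):
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
        ax1.plot(history.history["accuracy"], label="train")
        ax1.plot(history.history["val_accuracy"], label="validation")
        ax1.set_title("Accuracy"); ax1.set_xlabel("epoch"); ax1.legend()
        ax2.plot(history.history["loss"], label="train")
        ax2.plot(history.history["val_loss"], label="validation")
        ax2.set_title("Loss"); ax2.set_xlabel("epoch"); ax2.legend()
        plt.tight_layout()
        plt.show()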