

Table 4. ResNet-based model classification performance test results on CK+

              Precision  Recall  F1-score  Support
    Anger        1.00     1.00     1.00      14
    Contempt     1.00     0.60     0.75       5
    Disgust      1.00     0.94     0.97      18
    Fear         0.88     1.00     0.93       7
    Happy        0.91     1.00     0.95      21
    Sadness      1.00     1.00     1.00       9
    Surprise     1.00     1.00     1.00      25
    Accuracy                       0.97      99
    Run time                     29 min


Table 5. Comparison with other CNN-based models' results on CK+

    Methodology              Accuracy (%)
    Diff ResNet [19]            95.74
    Improved VGG-19 [44]        96
    Ours                        97

                                            Figure 9. Framework testing on an individual pose


much in terms of accuracy. The goal is to build a robust and accurate model. Looking at Table 2
and Table 3, we observe that for both models the precision of every label is above 50%; in other words, for
each emotion the model gives a correct prediction at least 50% of the time. The harmonic mean of precision
and recall, hereafter referred to as the F1-score, is used to assess how well the model performs at facial
emotion detection. The F1-score is 65% in both cases, which is acceptable given how complex the dataset is.
Finally, since the residual-based model has roughly a 1% advantage, accuracy was the metric that allowed us
to identify the optimal model.
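As a quick check of the metric itself, here is a minimal Python sketch of the F1-score as the harmonic mean of precision and recall; the helper is ours, and the example values are taken from the Contempt row of Table 4.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Contempt row of Table 4: precision 1.00, recall 0.60
print(round(f1_score(1.00, 0.60), 2))  # -> 0.75
```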


We experimented on posed images of an individual to test how well our system recognizes the facial
expressions it was trained on. The first step is to apply pre-processing, such as face detection and face
cropping, to these images. Each image is then reshaped to (48, 48, 1) to match the input shape the model was
trained on, and the model is used for the prediction, as shown in Figure 9.
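The sketch below illustrates this test pipeline under stated assumptions: OpenCV's Haar cascade stands in for the face detector, and the saved Keras model file ("fer_resnet.h5"), the input image name, and the emotion label order are all illustrative, not the paper's exact setup.

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

# Assumed FER-style label order; the trained model file name is hypothetical.
LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

model = load_model("fer_resnet.h5")
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("posed_image.jpg", cv2.IMREAD_GRAYSCALE)
faces = detector.detectMultiScale(img, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    face = img[y:y + h, x:x + w]                           # face cropping
    face = cv2.resize(face, (48, 48)).astype("float32") / 255.0
    face = face.reshape(1, 48, 48, 1)                      # (batch, 48, 48, 1)
    probs = model.predict(face)[0]                         # softmax over the labels
    print(LABELS[int(np.argmax(probs))], float(np.max(probs)))
```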


On this prediction, the model did very well, with a high confidence (86%), which shows how effective our
framework is. However, the model gave wrong predictions on some labels, though with low confidence (40%);
this is due to the resemblance between emotions, see Figure 10. For example, here the individual was asked to
pose disgust, but the model predicted neutral. As already mentioned, the FERGIT dataset contains many
mis-labeled emotions.
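For illustration only, a small sketch of how the reported confidence can be read from the softmax output and how the runner-up label exposes such ambiguous poses; the helper and the probability vector are our own, chosen to mirror the low-confidence mis-prediction described above.

```python
import numpy as np

LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def top_two(probs, labels=LABELS):
    """Return the winning label and the runner-up with their probabilities."""
    order = np.argsort(probs)[::-1]
    return [(labels[i], float(probs[i])) for i in order[:2]]

# Hypothetical output mirroring the case above: neutral narrowly beats disgust.
probs = np.array([0.05, 0.35, 0.05, 0.05, 0.05, 0.05, 0.40])
print(top_two(probs))  # [('neutral', 0.4), ('disgust', 0.35)]
```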