
Page 287                       Zhuang et al. Intell Robot 2024;4(3):276-92 | http://dx.doi.org/10.20517/ir.2024.18

Figure 8. Examples of detection results on the FLIR dataset. The first column shows the original images, and the second column shows the results of MobileNetV3-YOLOv4.


The detector proposed in this experiment outperforms the previously mentioned detection models in both detection accuracy and computational resource efficiency. We also compared our model with YOLOv5, YOLOv8s, and YOLOv3-MobileNetV3. In the IR target detection task, our model significantly outperforms these previous models and more closely meets the requirements of our real-time monitoring task. YOLO-IR demonstrates outstanding performance on the FLIR dataset: despite some performance degradation in individual cases, it achieves higher accuracy in this task with fewer parameters, improving by 5%. Additionally, we compared Source Model Guidance based on YOLOv3 (SMG-Y) and PMBW (a Paced Multi-Stage Block-Wise approach to object detection in thermal images), both based on vision transformers. Our method holds a clear advantage in both detection speed and accuracy. Meanwhile, our model size is only 110 MB, smaller than that of many competing methods. This balanced improvement across the three evaluation criteria makes the proposed method suitable for deployment on resource-constrained edge devices. Examples of the detection results are displayed in Figure 8.
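As a rough illustration of how frames-per-second figures like those compared here can be obtained, the sketch below times a detector callable over a batch of frames. The `model` argument and the warmup count are hypothetical placeholders for illustration, not part of the paper's implementation:

```python
import time

def measure_fps(model, frames, warmup=5):
    """Estimate detector throughput in frames per second.

    `model` is any callable mapping an image to detections; the first
    `warmup` calls are excluded so one-off startup cost (weight loading,
    kernel compilation) does not distort the estimate.
    """
    for frame in frames[:warmup]:
        model(frame)
    start = time.perf_counter()
    for frame in frames[warmup:]:
        model(frame)
    elapsed = time.perf_counter() - start
    return (len(frames) - warmup) / elapsed
```

In practice the frames would be drawn from the benchmark dataset itself, and timing should be repeated several times so the reported FPS reflects steady-state throughput rather than a single run.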



To further demonstrate the performance of this model, it was compared with other models not only on the FLIR dataset but also on the KAIST dataset, where competitive results were again achieved. Table 2 presents the comparison of our model with other models on the KAIST dataset, including recent strong single-stage detectors and several lightweight detectors. The results indicate that our model is the smallest, and that its mAP exceeds that of the other detection models. Compared to YOLOv3, YOLOv4, and other benchmark models, our model achieves higher mAP; relative to YOLOv4 in particular, mAP improves from 81.0% to 86.8%, and processing speed increases from 42 to 64.2 frames per second. Compared with the pixel-wise contextual attention network (PiCA-Net), Multimodal Feature Embedding (MuFEm) + Spatio-Contextual Feature Aggregation (ScoFA), and the multispectral fusion and double-stream detector with YOLO-based information (MFDs-YOLO), our model demonstrates notable gains in detection accuracy. Although it shows a slight decrease in mAP compared to YOLO-ACN, it offers significant improvements in processing speed and model size. Overall, our model achieves substantial advances in accuracy, speed, and size, making it more practical and competitive.
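For reference, the mAP values compared throughout this section are means of per-class average precision. The following is a minimal sketch of the standard all-point-interpolation AP computation (VOC2010 style); the function name and its inputs are illustrative assumptions, not the authors' evaluation code:

```python
import numpy as np

def average_precision(scores, matched, n_gt):
    """AP for one class, all-point interpolation.

    scores  : confidence of each detection
    matched : 1 if the detection matches a ground-truth box
              (IoU above the threshold), else 0
    n_gt    : number of ground-truth boxes for the class
    """
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(matched, dtype=float)[order]
    fp = 1.0 - tp
    tp_cum, fp_cum = np.cumsum(tp), np.cumsum(fp)
    recall = tp_cum / n_gt
    precision = tp_cum / (tp_cum + fp_cum)
    # Precision envelope: make precision monotonically non-increasing.
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    # Integrate precision over recall (area under the PR envelope).
    r = np.concatenate([[0.0], recall])
    p = np.concatenate([[precision[0] if len(precision) else 0.0], precision])
    return float(np.sum((r[1:] - r[:-1]) * p[1:]))
```

mAP is then the mean of this quantity over all classes; benchmarks differ in the IoU threshold used for `matched`, which is why reported mAP values are only comparable under the same evaluation protocol.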