
Zhuang et al. Intell Robot 2024;4(3):276-92  I http://dx.doi.org/10.20517/ir.2024.18  Page 286
                                       Figure 7. Detection results of different pre-processing methods.


               where r_1 to r_n are the Recall values at the first interpolation point of each precision interpolation
               segment. In simple terms, AP is the area under the curve in the Precision-Recall (PR) plot, and mAP is the
               average of AP over all categories. The Intersection over Union (IoU) threshold in our experiments was set to
               0.5. In addition, we used frames per second (FPS) to evaluate the real-time detection ability of the system.
               However, although state-of-the-art vehicle and pedestrian detection methods perform well in mAP or FPS,
               their large model sizes make them unsuitable for deployment on edge devices. Because our ultimate goal is
               to deploy the model on resource-constrained edge devices, model size must also be part of the evaluation.
               We therefore conducted a comprehensive comparison of mAP, FPS, and model size to highlight the advantages
               of our study.
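For reference, the two quantities underlying these metrics can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function names and toy inputs are our own. `iou` computes the overlap criterion against which the 0.5 threshold is applied, and `average_precision` computes the area under the interpolated PR curve, which mAP then averages over categories.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def average_precision(recall, precision):
    """All-point interpolated AP: area under the PR curve after making
    precision monotonically non-increasing in recall."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Interpolation envelope: precision at each recall level is the
    # maximum precision attained at any recall to its right.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas at the points where recall changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```

A detection is counted as a true positive only if its `iou` with a ground-truth box meets the chosen threshold (0.5 here); sweeping the confidence threshold then yields the (recall, precision) points fed to `average_precision`.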


               4.3. Experimental results and analysis
               We applied the centralized IR image pre-processing method described in Section 3 to our improved
               model [Figure 7]. Compared with the results on the original images, our improved IE-CGAN pre-processing
               method increased mAP by 2.5%, the best detection result among the tested techniques. By incorporating this
               pre-processing method, our improved model achieves higher detection accuracy and demonstrates more robust
               performance across different IR target scenarios.



               We selected a more lightweight model for improvement because of the necessity of hardware deployment.
               While other advanced models offer higher accuracy, they often have larger architectures that are less
               compatible with our hardware constraints. Consequently, we chose a YOLO-series model, which balances
               moderate accuracy with a manageable model size, making it one of our optimal solutions. We adopted YOLOv4
               as the baseline for evaluating the effectiveness of our proposed model, given its superior accuracy and
               enhanced detection performance, as indicated in Table 1. Furthermore, we comprehensively compared the
               baseline model and other leading models on the FLIR IR vehicle detection dataset to ensure a diverse range
               of experimental evaluations; the detailed comparative data can be found in Table 1. In addition to the
               baseline detector YOLOv4, we introduced a new object detector in the experiment and evaluated several
               advanced two-stage detectors, including the classical Faster R-CNN detection model. The experimental
               findings reveal that, in practical scenarios, these two-stage detectors perform suboptimally compared to
               our model. Our model has fewer parameters, lower computational cost, and superior performance, and it can
               be deployed on hardware devices to achieve real-time object detection. Among one-stage detectors, our
               model improved by nearly 25% over YOLOv3. Moreover, we achieved a 2% improvement over the widely used
               YOLOv4 model. The detec-