
Liu et al. Intell Robot 2024;4(4):503-23 | http://dx.doi.org/10.20517/ir.2024.29   Page 519

               Table 4. Ablation study on the proposed SANet components

                                        DUTS-TE        DUT-OMRON      ECSSD          PASCAL-S       HKU-IS         SOD
               Ver.  Methods            maxF↑  MAE↓    maxF↑  MAE↓    maxF↑  MAE↓    maxF↑  MAE↓    maxF↑  MAE↓    maxF↑  MAE↓
               0     Basic              0.819  0.068   0.779  0.076   0.911  0.069   0.825  0.112   0.898  0.056   0.821  0.133
               1     Basic+MI           0.828  0.063   0.790  0.067   0.920  0.060   0.832  0.094   0.910  0.048   0.833  0.125
               2     Basic+MI+DS        0.830  0.060   0.792  0.065   0.922  0.058   0.834  0.093   0.912  0.047   0.836  0.124
               3     Basic+MI+DS+MF     0.834  0.059   0.794  0.064   0.924  0.055   0.836  0.088   0.913  0.048   0.842  0.121
               4     Basic+MI+DS+MF+PR  0.845  0.054   0.804  0.061   0.934  0.047   0.847  0.084   0.919  0.044   0.845  0.117

               We use the vanilla single branch module as the base model (Ver.0). Here, “MI”, “DS”, “MF”, and “PR” refer to
               multi-scale feature interaction, dynamic selection, the MFA module, and ImageNet pre-training, respectively.


               learning of lightweight networks.

               4.4. Ablation study
               In this section, we conduct ablation studies on the proposed module components, the backbone network,
               and the configuration of the SAFE module to demonstrate the effectiveness of the proposed model. The
               experimental settings are consistent with those outlined in Section 4.1.

               4.5. Proposed module components
               Table 4 shows the results of the ablation study on the model components. As components are added, the
               model performance improves progressively. Compared with Ver.0, Ver.3 increases the average maxF over the
               six datasets by 0.015 and decreases the average MAE by 0.014. None of Ver.0 through Ver.3 uses ImageNet
               pre-training, so the difference between their results is attributable to the proposed components and
               shows that the proposed model is effective.
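               The averages quoted above can be rechecked directly from Table 4. The following is a minimal sketch,
               not part of the original experiments: the dictionary layout and the helper names (`TABLE4`, `mean`,
               `average_gap`) are our own illustrative choices, and the inputs are the rounded table entries.
               From these rounded entries the average maxF gain comes out at about 0.015 and the average MAE
               reduction at about 0.013; the small gap to the quoted 0.014 may reflect rounding of the published entries.

```python
# Table 4 entries per dataset, in the order
# DUTS-TE, DUT-OMRON, ECSSD, PASCAL-S, HKU-IS, SOD.
# Keys are the "Ver." indices used in the table.
TABLE4 = {
    0: {"maxF": [0.819, 0.779, 0.911, 0.825, 0.898, 0.821],
        "MAE":  [0.068, 0.076, 0.069, 0.112, 0.056, 0.133]},
    1: {"maxF": [0.828, 0.790, 0.920, 0.832, 0.910, 0.833],
        "MAE":  [0.063, 0.067, 0.060, 0.094, 0.048, 0.125]},
    2: {"maxF": [0.830, 0.792, 0.922, 0.834, 0.912, 0.836],
        "MAE":  [0.060, 0.065, 0.058, 0.093, 0.047, 0.124]},
    3: {"maxF": [0.834, 0.794, 0.924, 0.836, 0.913, 0.842],
        "MAE":  [0.059, 0.064, 0.055, 0.088, 0.048, 0.121]},
    4: {"maxF": [0.845, 0.804, 0.934, 0.847, 0.919, 0.845],
        "MAE":  [0.054, 0.061, 0.047, 0.084, 0.044, 0.117]},
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def average_gap(table, base, variant):
    """Return (maxF gain, MAE reduction) of `variant` over `base`,
    each averaged over the six datasets."""
    maxf_gain = mean(table[variant]["maxF"]) - mean(table[base]["maxF"])
    mae_drop = mean(table[base]["MAE"]) - mean(table[variant]["MAE"])
    return maxf_gain, mae_drop


maxf_gain, mae_drop = average_gap(TABLE4, base=0, variant=3)
print(f"Ver.3 vs Ver.0: maxF gain = {maxf_gain:.4f}, MAE reduction = {mae_drop:.4f}")
```

               The same helper can be called with `variant=4` to see the additional effect of ImageNet pre-training.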


               4.6. The effectiveness of the backbone network
               In addition to existing SOD methods, we also compare against several widely used lightweight backbone
               networks, including MobileNet, MobileNetV2, ShuffleNetV2, and EfficientNet. To use these lightweight
               backbone networks for the SOD task, we attach the same decoder as SANet to each of them for the ablation study.

               Table 5 shows that directly applying existing lightweight backbone networks to the SOD task does not
               yield satisfactory accuracy. Taking EfficientNet as an example, we average maxF, avgF, and MAE over the
               six datasets. Compared with EfficientNet, SANet achieves a 13.20% improvement in maxF, an 11.14%
               improvement in avgF, and a 44.72% reduction in MAE. This further supports the soundness of our redesign
               of the backbone network structure for SOD.
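               Percentages of this kind are most naturally read as relative changes with respect to the baseline
               average. The sketch below illustrates that reading only; it assumes the figures are computed as
               (difference / baseline) × 100, and the function name `relative_change` and the numeric inputs are
               hypothetical placeholders, not values taken from Table 5.

```python
def relative_change(baseline, ours, lower_is_better=False):
    """Percentage improvement of `ours` over `baseline`.

    For metrics where higher is better (e.g. maxF, avgF) this is the
    relative increase; for metrics where lower is better (e.g. MAE) it
    is the relative reduction.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    delta = (baseline - ours) if lower_is_better else (ours - baseline)
    return 100.0 * delta / baseline


# Hypothetical dataset-averaged scores, for illustration only.
baseline_maxf, ours_maxf = 0.75, 0.84
baseline_mae, ours_mae = 0.120, 0.066

print(f"maxF improvement: {relative_change(baseline_maxf, ours_maxf):.2f}%")
print(f"MAE reduction:    {relative_change(baseline_mae, ours_mae, lower_is_better=True):.2f}%")
```

               Averaging the metric over the six datasets first and then taking the relative change, as done here,
               is not the same as averaging per-dataset percentages, so the order of the two steps should be stated.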

               4.7. Configuration of the SAFE module
               Table 6 presents the ablation study results of the SAFE module with varying branch numbers and dilation rates.
               Increasing the number of branches in the E1-E4 stages improves some metrics but also significantly increases
               computational complexity, which contradicts our goal of maintaining a lightweight model. The default settings
               of the SAFE module were selected by weighing model accuracy against complexity.
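               The trade-off between branch count and complexity can be made concrete with a rough cost model. This
               is a sketch under stated assumptions, not the SAFE module itself: it assumes each branch is a single
               depthwise 3×3 dilated convolution with stride 1, "same" padding, and no bias, and all function names
               and input sizes are hypothetical. Under that assumption, dilation enlarges the effective kernel
               extent without adding parameters, while each extra branch adds its full cost.

```python
def effective_kernel(kernel_size, dilation):
    """Effective spatial extent of a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def depthwise_branch_cost(channels, height, width, kernel_size=3):
    """Parameters and multiply-accumulates (MACs) of one depthwise
    convolution branch (assumed: stride 1, same padding, no bias).
    Dilation does not appear because it changes neither quantity."""
    params = channels * kernel_size * kernel_size
    macs = params * height * width
    return params, macs


def multi_branch_cost(channels, height, width, dilations, kernel_size=3):
    """Total cost of a hypothetical block with one depthwise branch per
    dilation rate; the cost grows linearly with the number of branches."""
    params, macs = depthwise_branch_cost(channels, height, width, kernel_size)
    n = len(dilations)
    return n * params, n * macs


# Illustrative comparison on a 64-channel, 32x32 feature map.
for dilations in ([1], [1, 2], [1, 2, 3]):
    extents = [effective_kernel(3, d) for d in dilations]
    params, macs = multi_branch_cost(64, 32, 32, dilations)
    print(f"dilations={dilations} extents={extents} params={params} MACs={macs}")
```

               Any fusion or selection layers that combine the branches would add further cost, which this model
               deliberately leaves out.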


               5. CONCLUSION
               This paper reviews existing research on SOD and analyzes the challenges in current approaches. Heavyweight
               SOD models face difficulties in scenarios with low computing power and high real-time requirements due to
               issues such as large model size and poor real-time performance. In contrast, lightweight SOD models have
               poor detection performance and struggle to handle complex scenarios. To address these problems, we pro-
               pose SANet, a scale-adaptive lightweight SOD model that achieves a trade-off between lightweight design
               and detection effectiveness. We first implement the SAFE module, a component unit of the backbone net-