
Page 506                          Liu et al. Intell Robot 2024;4(4):503-23  | http://dx.doi.org/10.20517/ir.2024.29

                                        Figure 2. Comparison of detection effect of different methods.


               performance.


               (3) We quantitatively and qualitatively compare SANet with fifteen heavyweight methods and three lightweight
               methods on six typical SOD datasets. We also verify our model on the traffic salient object detection (TSOD)
               dataset introduced in the work on TSOD using a feature deep interaction and guidance fusion network (TFGNet) [22].
               SANet demonstrates excellent detection performance and fast inference speed with few parameters and low
               model complexity.


               2. RELATED WORKS

               2.1. SOD
               SOD simulates the human visual attention mechanism, enabling machines to automatically discover and
               filter important information. Since Professor Itti pioneered the research field of SOD in 1999, many
               researchers have worked in this field and produced a large body of results. Technical solutions for SOD
               have shifted from traditional statistical methods, frequency-domain transform methods, and machine
               learning methods to deep learning, which is currently the most active direction. Traditional methods [23]
               are mainly based on manually designed features. Although they are very efficient, manually designed
               features inherently lack high-level representation ability, which limits model performance. Deep
               learning-based methods have shown clear advantages over traditional methods in characterizing salient
               objects; they quickly moved to the forefront of SOD and raised its performance to a new level.
               Early deep learning-based methods did not solve the attenuation of features during longitudinal transmission,
               so the models suffered from false positives or false negatives [24]. To this end, the encoder network is applied to