Page 32 - Read Online
P. 32
He et al. Intell. Robot. 2025, 5(2), 313-32 I http://dx.doi.org/10.20517/ir.2025.16 Page 331
23. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. Available
online: http://arxiv.org/abs/1409.1556. (accessed on 24 Mar 2025)
24. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), Las Vegas, USA. Jun 27-30, 2016. IEEE, 2016; pp. 770–8. DOI
25. Wu, X.; He, J.; Huang, Q.; et al. FER-CHC: facial expression recognition with cross-hierarchy contrast. Appl. Soft Comput. 2023, 145,
110530. DOI
26. Teng, J.; Zhang, D.; Zou, W.; Li, M.; Lee, D. Typical facial expression network using a facial feature decoupler and spatial-temporal
learning. IEEE Trans. Affect. Comput. 2023, 14, 1125–37. DOI
27. Zhao, R.; Liu, T.; Huang, Z.; Lun, D. P.; Lam, K. M. Geometry-aware facial expression recognition via attentive graph convolutional
networks. IEEE Trans. Affect. Comput. 2023, 14, 1159–74. DOI
28. Cai, J.; Meng, Z.; Khan, A.; Li, Z.; O’Reilly, J.; Tong, Y. Probabilistic attribute tree structured convolutional neural networks for facial
expression recognition in the wild. IEEE Trans. Affect. Comput. 2023, 14, 1927–41. DOI
29. Liu, T.; Li, J.; Wu, J.; Du, B.; Chang, J.; Liu, Y. Facial expression recognition on the high aggregation subgraphs. IEEE Trans. Image.
Process. 2023, 32, 3732–45. DOI
30. Zhang, F.; Chen, G.; Wang, H.; Zhang, C. CF-DAN: facial-expression recognition based on cross-fusion dual-attention network. Comput.
Vis. Media 2024, 10, 593–608. DOI
31. Li, Y.; Lu, G.; Li, J.; Zhang, Z.; Zhang, D. Facial expression recognition in the wild using multi-level features and attention mechanisms.
IEEE Trans. Affect. Comput. 2023, 14, 451–62. DOI
32. Zhang, X.; Zhu, J.; Wang, D.; et al. A gradual self distillation network with adaptive channel attention for facial expression recognition.
Appl. Soft Comput. 2024, 161, 111762. DOI
33. Chen, D.; Wen, G.; Li, H.; Chen, R.; Li, C. Multi-relations aware network for in-the-wild facial expression recognition. IEEE Trans.
Circuits Syst. Video Technol. 2023, 33, 3848–59. DOI
34. Vaswani, A.; Shazeer, N.; Parmar, N.; et al. Attention is all you need. In: Proceedings of the 31st International Conference on Neural
Information Processing Systems. NIPS’17, Red Hook, USA. Curran Associates Inc., 2017; pp. 6000–10. https://proceedings.neurips.cc/p
aper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html. (accessed 2025-03-24)
35. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; et al. An image is worth 16x16 words: transformers for image recognition at scale. arXiv
2020, arXiv:2010.11929. Available online: https://arxiv.org/abs/2010.11929. (accessed on 24 Mar 2025)
36. Li, Y.; Miao, N.; Ma, L.; Shuang, F.; Huang, X. Transformer for object detection: review and benchmark. Eng. Appl. Artif. Intell. 2023,
126, 107021. DOI
37. Chen, X.; Yan, B.; Zhu, J.; Wang, D.; Yang, X.; Lu, H. Transformer tracking. In: 2021 IEEE/CVF Conference on Computer Vision and
Pattern Recognition (CVPR), Nashville, USA. Jun 20-25, 2021. IEEE, 2021; pp. 8122-31. DOI
38. Wang, Y.; Xu, Z.; Wang, X.; et al. End-to-end video instance segmentation with transformers. In: 2021 IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR), Nashville, USA. Jun 20-25, 2021. IEEE, 2021; pp. 8737-46. DOI
39. Ma, F.; Sun, B.; Li, S. Transformer-augmented network with online label correction for facial expression recognition. IEEE Trans. Affect.
Comput. 2024, 15, 593–605. DOI
40. Zhang, X.; Li, M.; Lin, S.; Xu, H.; Xiao, G. Transformer-based multimodal emotional perception for dynamic facial expression recognition
in the wild. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 3192–203. DOI
41. Liu, C.; Hirota, K.; Dai, Y. Patch attention convolutional vision transformer for facial expression recognition with occlusion. Inf. Sci.
2023, 619, 781–94. DOI
42. Gao, S. H.; Cheng, M. M.; Zhao, K.; Zhang, X. Y.; Yang, M.H.; Torr, P. Res2Net: a new multi-scale backbone architecture. IEEE Trans.
Pattern Anal. Mach. Intell. 2021, 43, 652–62. DOI
43. Chen, Q.; Wu, Q.; Wang, J.; Hu, Q.; Hu, T.; Ding, E. MixFormer: mixing features across windows and dimensions. In: 2022 IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, USA. Jun 18-24, 2022. IEEE, 2022; pp. 5249–59. DOI
44. Goodfellow, I. J.; Erhan, D.; Carrier, P. L.; et al. Challenges in representation learning: a report on three machine learning contests. In:
Neural Information Processing. ICONIP 2013, Berlin, Heidelberg. Springer Berlin Heidelberg. 2013; pp. 117–24. DOI
45. Wang, K.; Peng, X.; Yang, J.; Meng, D.; Qiao, Y. Region attention networks for pose and occlusion robust facial expression recognition.
IEEE Trans. Image Process. 2020, 29, 4057–69. DOI
46. Zhang, K.; Zhang, Z.; Li, Z.; Qiao, Y. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal
Process. Lett. 2016, 23, 1499–503. DOI
47. Guo, Y.; Zhang, L.; Hu, Y.; He, X.; Gao, J. MS-Celeb-1M: a dataset and benchmark for large-scale face recognition. In: Leibe B, Matas
J, Sebe N, Welling M, editors. Computer Vision ECCV. Cham: Springer International Publishing, 2016; pp. 87–102. DOI
48. Wang, K.; Peng, X.; Yang, J.; Lu, S.; Qiao, Y. Suppressing uncertainties for large-scale facial expression recognition. In: 2020 IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA. Jun 13-19, 2020. IEEE, 2020; pp. 6897–906. DOI
49. Li, Y.; Zeng, J.; Shan, S.; Chen, X. Occlusion aware facial expression recognition using CNN with attention mechanism. IEEE Trans.
Image Process. 2019, 28, 2439–50. DOI
50. Ding, H.; Zhou, P.; Chellappa, R. Occlusion-adaptive deep network for robust facial expression recognition. In: 2020 IEEE International
Joint Conference on Biometrics (IJCB), Houston, USA. Sep 28 - Oct 01, 2020. IEEE, 2020; p. 1-9. DOI
51. Farzaneh, A. H.; Qi, X. Facial expression recognition in the wild via deep attentive center loss. In: 2021 IEEE Winter Conference on
Applications of Computer Vision (WACV), Waikoloa, USA. Jan 03-08, 2021. IEEE, 2021; pp. 2401-10. DOI
52. Zhao, Z.; Liu, Q.; Wang, S. Learning deep global multi-scale and local attention features for facial expression recognition in the wild.

