
广西师范大学学报(自然科学版), 2025, 43(2): 70-82. DOI: 10.16088/j.issn.1001-6600.2024021601

• 智能信息处理 •

抗噪声双约束网络的面部表情识别

苏春海1,2, 夏海英1,2*   

  1. 广西类脑计算与智能芯片重点实验室(广西师范大学), 广西 桂林 541004;
    2. 广西师范大学 电子与信息工程学院/集成电路学院, 广西 桂林 541004
  • 收稿日期:2024-02-16 修回日期:2024-04-15 出版日期:2025-03-05 发布日期:2025-04-02
  • 通讯作者: 夏海英(1983—), 女, 山东聊城人, 广西师范大学教授, 博士。E-mail: xhy22@mailbox.gxnu.edu.cn
  • 基金资助:
    国家自然科学基金(62106054, 62366006, 62366005); 广西创新驱动重大专项(桂科AA20302003); 桂林市科技计划项目(20222C243986)

Facial Expression Recognition Based on Noise-Resistant Dual Constraint Network

SU Chunhai1,2, XIA Haiying1,2*   

  1. Guangxi Key Laboratory of Brain-inspired Computing and Intelligent Chips (Guangxi Normal University), Guilin, Guangxi 541004, China;
    2. School of Electronic and Information Engineering/School of Integrated Circuits, Guangxi Normal University, Guilin, Guangxi 541004, China
  • Received:2024-02-16 Revised:2024-04-15 Online:2025-03-05 Published:2025-04-02

摘要: 由于标注主观性、图像模糊等因素,数据集中不可避免地存在噪声标签,使表情识别更具挑战性。现有面部表情识别方法在处理噪声标签时往往会部分过度拟合噪声标签。对此,本文提出一种新颖的抗噪声双约束网络(NDC-Net)来自动抑制噪声样本。NDC-Net主要包含2个约束机制:类激活映射注意一致性(CAC)和通道空间特征一致性(CSC)。CAC使模型集中于局部重要特征信息,减少对噪声标签的过度关注;CSC则促使模型在提取特征时从通道和空间2个维度更加关注与任务相关的信息,忽略不相关信息,降低对噪声标签的依赖。此外,为增强NDC-Net的性能,对输入样本采用旋转、缩放等策略进行数据增强。在RAF-DB、FERPlus和AffectNet数据集30%标签噪声下,NDC-Net的识别准确率分别为86.57%、88.22%和59.78%,显著优于EAC、NCCTFER等先进的噪声标签处理方法;在计算机视觉领域广泛用于评估算法性能和泛化能力的CIFAR100和Tiny-ImageNet数据集上,NDC-Net也取得了不错的效果。

关键词: 噪声标签, 面部表情识别, 深度学习, 监督学习, 注意力机制

Abstract: Noise is inevitably present in datasets due to labeling subjectivity, image blurring, and other factors, making expression recognition more challenging. When trained with noisy labels, existing facial expression recognition models tend to partially overfit to them. To address this, a novel Noise-Resistant Dual Constraint Network (NDC-Net) is proposed to automatically suppress noisy samples. NDC-Net consists of two constraint mechanisms: class activation mapping attention consistency (CAC) and channel-spatial feature consistency (CSC). CAC makes the model focus on locally important feature information and reduces overfitting to noisy labels, while CSC encourages the model to emphasize task-relevant information in both the channel and spatial dimensions during feature extraction, ignoring irrelevant information and reducing reliance on noisy labels. In addition, to enhance the performance of NDC-Net, input samples are augmented with strategies such as rotation and scaling. Under 30% label noise, NDC-Net achieves recognition accuracies of 86.57%, 88.22%, and 59.78% on the RAF-DB, FERPlus, and AffectNet datasets, respectively, significantly outperforming state-of-the-art noisy-label methods such as EAC and NCCTFER. Moreover, NDC-Net also shows strong generalization capability on CIFAR100 and Tiny-ImageNet, general classification datasets widely used in computer vision to evaluate algorithm performance and generalization ability.
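The abstract gives no implementation details, so the short PyTorch sketch below only illustrates the general idea of the two constraints under explicitly assumed choices: a ResNet-18 backbone [25], a simplified CBAM-style channel/spatial attention [12], class activation maps computed from the classifier weights [11], and consistency losses between an image and a flipped view standing in for the rotation/scaling augmentations. All names (NDCNetSketch, consistency_losses) and the loss weights are hypothetical; this is not the authors' NDC-Net code.

```python
# Minimal sketch, assuming PyTorch/torchvision; names and loss weights are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class NDCNetSketch(nn.Module):
    """Backbone + channel/spatial attention + classifier, exposing CAMs."""

    def __init__(self, num_classes=7):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # Keep convolutional stages only: output is B x 512 x 7 x 7 for 224x224 input.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        # Simplified channel attention (squeeze-and-excitation style).
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(512, 32, 1), nn.ReLU(),
            nn.Conv2d(32, 512, 1), nn.Sigmoid())
        # Simplified spatial attention (single 7x7 convolution).
        self.spatial_att = nn.Sequential(
            nn.Conv2d(512, 1, kernel_size=7, padding=3), nn.Sigmoid())
        self.fc = nn.Linear(512, num_classes)

    def forward(self, x):
        feat = self.features(x)                      # B x 512 x H x W
        ca = self.channel_att(feat)                  # B x 512 x 1 x 1
        sa = self.spatial_att(feat * ca)             # B x 1 x H x W
        att_feat = feat * ca * sa
        logits = self.fc(att_feat.mean(dim=(2, 3)))  # global average pooling
        # Class activation maps (Zhou et al. [11]): weight the feature maps that
        # feed the classifier by the classifier weights of each class.
        cam = torch.einsum("kc,bchw->bkhw", self.fc.weight, att_feat)
        return logits, cam, ca.flatten(1), sa


def consistency_losses(model, x, x_flip):
    """Hypothetical CAC/CSC terms: CAMs and channel/spatial attention of an image
    and its horizontally flipped view should agree after undoing the flip, which
    discourages the network from memorising label noise."""
    logits, cam, ca, sa = model(x)
    _, cam_f, ca_f, sa_f = model(x_flip)
    cac = F.mse_loss(cam, torch.flip(cam_f, dims=[3]))                       # CAM attention consistency
    csc = F.mse_loss(ca, ca_f) + F.mse_loss(sa, torch.flip(sa_f, dims=[3]))  # channel + spatial consistency
    return logits, cac, csc


if __name__ == "__main__":
    model = NDCNetSketch(num_classes=7)
    x = torch.randn(4, 3, 224, 224)
    x_flip = torch.flip(x, dims=[3])   # a flip stands in for rotation/scaling augmentation here
    logits, cac, csc = consistency_losses(model, x, x_flip)
    labels = torch.randint(0, 7, (4,))
    # Loss weights of 1.0 are placeholders, not values from the paper.
    loss = F.cross_entropy(logits, labels) + 1.0 * cac + 1.0 * csc
    loss.backward()
```

In this reading, the CAC term penalizes disagreement between class activation maps of the two views, while the CSC term does the same for the channel and spatial attention maps; how NDC-Net actually defines and combines these constraints (loss weights, schedules, and the exact augmentations) is described in the full paper, not in this sketch.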

Key words: noisy label learning, facial expression recognition, deep learning, supervised learning, attention mechanisms

中图分类号:  TP391.41

[1] 李俊侠,张秦,郑桂妹.超宽带雷达人体姿态识别综述[J].计算机工程与应用,2021,57(3):14-23.DOI: 10.3778/j.issn.1002-8331.2009-0444.
[2] CHEN S K, WANG J F, CHEN Y D, et al. Label distribution learning on auxiliary label space graphs for facial expression recognition[C] // 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2020: 13981-13990. DOI: 10.1109/CVPR42600.2020.01400.
[3] FAN X Y, DENG Z Y, WANG K, et al. Learning discriminative representation for facial expression recognition from uncertainties[C] // 2020 IEEE International Conference on Image Processing (ICIP). Los Alamitos, CA: IEEE Computer Society, 2020: 903-907. DOI: 10.1109/ICIP40778.2020.9190643.
[4] 王晓峰,王昆,刘轩,等.自适应重加权池化深度多任务学习的表情识别[J].计算机工程与设计,2022,43(4): 1111-1120.DOI: 10.16208/j.issn1000-7024.2022.04.029.
[5] SHE J H, HU Y B, SHI H L, et al. Dive into ambiguity: latent distribution mining and pairwise uncertainty estimation for facial expression recognition[C] // 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2021: 6244-6253. DOI: 10.1109/CVPR46437.2021.00618.
[6] WANG K, PENG X J, YANG J F, et al. Suppressing uncertainties for large-scale facial expression recognition[C] // 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2020: 6896-6905. DOI: 10.1109/CVPR42600.2020.00693.
[7] ZHANG Y H, WANG C R, DENG W H. Relative uncertainty learning for facial expression recognition[C] // Advances in Neural Information Processing Systems 34 (NeurIPS 2021). Red Hook, NY: Curran Associates Inc., 2021: 17616-17627.
[8] MA F Y, SUN B, LI S T. Transformer-augmented network with online label correction for facial expression recognition[J]. IEEE Transactions on Affective Computing, 2024, 15(2): 593-605. DOI: 10.1109/TAFFC.2023.3285231.
[9] GERA D, KUMAR B N S, KUMAR B V R, et al. Class adaptive threshold and negative class guided noisy annotation robust Facial Expression Recognition[EB/OL]. (2023-05-03)[2024-02-16]. https://arxiv.org/abs/2305.01884. DOI: 10.48550/arXiv.2305.01884.
[10] ZHANG Y H, WANG C R, LING X, et al. Learn from all: erasing attention consistency for noisy label facial expression recognition[C] // Computer Vision-ECCV 2022. Cham: Springer, 2022: 418-434. DOI: 10.1007/978-3-031-19809-0_24.
[11] ZHOU B L, KHOSLA A, LAPEDRIZA A, et al. Learning deep features for discriminative localization[C] // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2016: 2921-2929. DOI: 10.1109/CVPR.2016.319.
[12] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C] // Computer Vision-ECCV 2018. Cham: Springer, 2018: 3-19. DOI: 10.1007/978-3-030-01234-2_1.
[13] 高丹,陈建英,谢盈.A-PSPNet:一种融合注意力机制的PSPNet图像语义分割模型[J].中国电子科学研究院学报,2020,15(6):518-523.DOI: 10.3969/j.issn.1673-5692.2020.06.005.
[14] 杨军奇,冯全,王书志,等.基于改进YOLOv4的田间密集小目标检测方法[J].东北农业大学学报,2022,53(5):69-79.DOI: 10.3969/j.issn.1005-9369.2022.05.008.
[15] 张珂,冯晓晗,郭玉荣,等.图像分类的深度卷积神经网络模型综述[J].中国图象图形学报,2021,26(10):2305-2325.DOI: 10.11834/jig.200302.
[16] 马龙祥,杨浩,宋婷婷,等.基于高分辨率特征的舌象分割算法研究[J].计算机工程,2020,46(10):248-252.DOI: 10.19678/j.issn.1000-3428.0056685.
[17] LI S, DENG W H, DU J P. Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild[C] // 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2017: 2584-2593. DOI: 10.1109/CVPR.2017.277.
[18] BARSOUM E, ZHANG C, FERRER C C, et al. Training deep networks for facial expression recognition with crowd-sourced label distribution[C] // Proceedings of the 18th ACM International Conference on Multimodal Interaction. New York, NY: Association for Computing Machinery, 2016: 279-283. DOI: 10.1145/2993148.2993165.
[19] MOLLAHOSSEINI A, HASANI B, MAHOOR M H. AffectNet: a database for facial expression, valence, and arousal computing in the wild[J]. IEEE Transactions on Affective Computing, 2019, 10(1): 18-31. DOI: 10.1109/TAFFC.2017.2740923.
[20] KRIZHEVSKY A. Learning multiple layers of features from tiny images[D]. Toronto: University of Toronto, 2009.
[21] RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211-252. DOI: 10.1007/s11263-015-0816-y.
[22] 于婉莹,梁美玉,王笑笑,等.基于深度注意力网络的课堂教学视频中学生表情识别与智能教学评估[J].计算机应用,2022,42(3):743-749.DOI: 10.11772/j.issn.1001-9081.2021040846.
[23] 麻永田,齐晶,张秋实,等.基于二阶统计量的小样本学习算法研究[J].北京联合大学学报(自然科学版),2021,35(4):73-78.DOI: 10.16255/j.cnki.ldxbz.2021.04.012.
[24] 李成范,胡子荣,刘岚,等.一种基于“中国视云”平台的CNN核函数可视化方法[J].实验室研究与探索,2021,40(5):57-61.DOI: 10.19927/j.cnki.syyt.2021.05.014.
[25] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C] // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2016: 770-778. DOI: 10.1109/CVPR.2016.90.
[26] GUO Y D, ZHANG L, HU Y X, et al. MS-Celeb-1M: a dataset and benchmark for large-scale face recognition[C] // Computer Vision-ECCV 2016. Cham: Springer, 2016: 87-102. DOI: 10.1007/978-3-319-46487-9_6.
[27] KINGMA D, BA J. Adam: a method for stochastic optimization[EB/OL]. (2014-12-22)[2024-02-16]. https://arxiv.org/abs/1412.6980v1. DOI: 10.48550/arXiv.1412.6980.
[28] LI Z Y, ARORA S. An exponential learning rate schedule for deep learning[EB/OL]. (2019-11-21)[2024-02-16]. https://arxiv.org/abs/1910.07454. DOI: 10.48550/arXiv.1910.07454.
[29] LE N, NGUYEN K, TRAN Q, et al. Uncertainty-aware label distribution learning for facial expression recognition[C] // 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Los Alamitos, CA: IEEE Computer Society, 2023: 6077-6086. DOI: 10.1109/WACV56688.2023.00603.
[30] ZENG J B, SHAN S G, CHEN X L. Facial expression recognition with inconsistently annotated datasets[C] // Computer Vision-ECCV 2018. Cham: Springer, 2018: 227-243. DOI: 10.1007/978-3-030-01261-8_14.
[31] WANG K, PENG X J, YANG J F, et al. Region attention networks for pose and occlusion robust facial expression recognition[J]. IEEE Transactions on Image Processing, 2020, 29: 4057-4069. DOI: 10.1109/TIP.2019.2956143.
[32] FARZANEH A H, QI X J. Facial expression recognition in the wild via deep attentive center loss [C] // 2021 IEEE Winter Conference on Applications of Computer Vision (WACV). Los Alamitos, CA: IEEE Computer Society, 2021: 2401-2410. DOI: 10.1109/WACV48630.2021.00245.
[33] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C] // 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos, CA: IEEE Computer Society, 2018: 7132-7141. DOI: 10.1109/CVPR.2018.00745.
[34] WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C] // 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2020: 11531-11539. DOI: 10.1109/CVPR42600.2020.01155.
[35] GU Y, YAN H, ZHANG X, et al. Toward facial expression recognition in the wild via noise-tolerant network[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(5): 2033-2047. DOI: 10.1109/TCSVT.2022.3220669.