
Journal of Guangxi Normal University (Natural Science Edition), 2024, Vol. 42, Issue (6): 126-137. doi: 10.16088/j.issn.1001-6600.2023121801


Head Pose-Robust Facial Expression Recognition

HOU Haiyan1, TAN Yumei2, SONG Shuxiang1, XIA Haiying1*

  1. School of Electronic and Information Engineering/School of Integrated Circuits, Guangxi Normal University, Guilin, Guangxi 541004, China;
  2. School of Computer Science and Engineering, Guangxi Normal University, Guilin, Guangxi 541004, China
  • Received: 2023-12-18 Revised: 2024-03-10 Online: 2024-12-30 Published: 2024-12-30
  • Corresponding author: XIA Haiying (born 1983), female, from Liaocheng, Shandong; professor at Guangxi Normal University, Ph.D. E-mail: xhy22@gxnu.edu.cn
  • Funding:
    National Natural Science Foundation of China (62106054); Guangxi Innovation-Driven Development Major Project (Guike AA20302003)

Abstract: Head pose variation degrades facial expression recognition performance. To address this, this paper proposes a dual-branch feature fusion (DFF) method that improves the robustness of facial expression recognition to head pose. In the expression branch, a ResNet18 backbone first extracts coarse high-dimensional expression features; a spatial feature enhancement (SFE) module then strengthens information interaction among these high-level features at the spatial level, improving the branch's ability to extract expression features. In parallel, in the head pose branch, a head pose feature extraction (HPFE) module, pre-trained on the 300W_LP head pose dataset and kept with fixed weights, extracts head pose features from the facial expression image. Finally, the expression features and the head pose features are fused by element-wise multiplication, so that the two kinds of features complement each other and yield a pose-robust emotion representation. The method is evaluated on the RAF-DB and FERPlus datasets under two head pose conditions, Pose (>30°) and Pose (>45°): recognition accuracy is 89.98% and 89.96%, respectively, on RAF-DB, and 89.20% and 87.94%, respectively, on FERPlus. The results show that the proposed method improves expression recognition accuracy on facial images affected by head pose, contributing to research on facial expression recognition in natural environments.

Key words: expression recognition, head pose, feature extraction, robustness, deep learning

CLC number: TP391.41
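
The fusion step described in the abstract, combining the expression-branch features with the head-pose-branch features by element-wise multiplication, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_elementwise`, the 4-dimensional vectors, and their values are hypothetical stand-ins, and the sketch assumes both branches have already produced features of the same length. How the two branches are brought to a common shape, and what follows the fusion, are not specified on this page.

```python
from typing import List, Sequence


def fuse_elementwise(expr_feat: Sequence[float], pose_feat: Sequence[float]) -> List[float]:
    """Fuse two equal-length feature vectors by element-wise multiplication."""
    if len(expr_feat) != len(pose_feat):
        raise ValueError("expression and pose features must have the same length")
    return [e * p for e, p in zip(expr_feat, pose_feat)]


# Hypothetical 4-dimensional features; real feature maps would be far larger.
expr_feat = [0.5, -1.0, 2.0, 0.0]  # stand-in for the expression-branch output (after SFE)
pose_feat = [2.0, 0.5, 1.0, 3.0]   # stand-in for the fixed-weight HPFE-branch output

fused = fuse_elementwise(expr_feat, pose_feat)
print(fused)  # [1.0, -0.5, 2.0, 0.0]
```

In this toy example each pose feature acts as a per-element scale on the corresponding expression feature, which is one way to read the "complementary information" the abstract attributes to the fusion.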
