Journal of Guangxi Normal University (Natural Science Edition) ›› 2022, Vol. 40 ›› Issue (3): 104-111. DOI: 10.16088/j.issn.1001-6600.2021071406

• Research Article •

Emotion Recognition Based on Multi-gait Feature Fusion

PENG Tao1,2,3, TANG Jing1,3, HE Kai1,2,3*, HU Xinrong1,2,3, LIU Junping1,2,3*, HE Ruhan1,2,3   

  1. Hubei Engineering Research Center of Textile and Garment Intellectualization (Wuhan Textile University), Wuhan Hubei 430200, China;
    2. Hubei Engineering Research Center of Garment Information Technology (Wuhan Textile University), Wuhan Hubei 430200, China;
    3. School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan Hubei 430200, China
  • Received: 2021-07-14  Revised: 2021-10-15  Online: 2022-05-25  Published: 2022-05-27
  • Corresponding authors: HE Kai (b. 1987), male, of Wuhan, Hubei, lecturer, Ph.D., Wuhan Textile University. E-mail: khe@wtu.edu.cn; LIU Junping (b. 1979), male, of Wuhan, Hubei, associate professor, Ph.D., Wuhan Textile University. E-mail: jpliu@wtu.edu.cn
  • Supported by: National Natural Science Foundation of China (61901308)


Abstract: Emotion recognition based on gait features has broad application prospects in affective computing, psychotherapy, robotics, surveillance, and audience understanding. Existing methods show that incorporating contextual information such as gesture position can significantly improve emotion recognition performance, and that spatial-temporal information can significantly improve recognition accuracy. However, skeletal spatial information alone cannot fully express the emotional information in gait. To make full use of gait features, this paper proposes an adaptive fusion method that combines the spatial-temporal information of the skeleton with skeletal rotation angles, improving the emotion recognition accuracy of existing models. The model uses an autoencoder to learn the skeletal rotation information of human walking, uses a spatial-temporal graph convolutional network to extract the spatial-temporal information of the skeleton joints, and feeds both into an adaptive fusion network to obtain the final features for classification. The model is evaluated on the Emotion-Gait dataset. Experimental results show that the AP values for sadness, anger, and neutral emotions increase by 5, 8, and 5 percentage points, respectively, over the latest HAP method, and the mean AP (mAP) of the overall classification increases by 5 percentage points.

Key words: gait feature, spatial-temporal graph convolutional network, feature fusion, emotion recognition, autoencoder
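
To make the pipeline in the abstract concrete, the following is a minimal PyTorch sketch of the described architecture: an autoencoder branch supplies skeletal rotation features, an ST-GCN branch supplies spatial-temporal features, and a learned gate fuses the two streams before classification. The layer sizes, module names, and the gating form of the "adaptive fusion" are illustrative assumptions rather than the authors' implementation, and the ST-GCN branch is stubbed with a placeholder feature.

import torch
import torch.nn as nn

class RotationEncoder(nn.Module):
    # Encoder half of the autoencoder branch: maps the joint rotation
    # angles of a walking sequence to a fixed-length feature vector
    # (the reconstruction decoder used for pretraining is omitted).
    def __init__(self, in_dim=48, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, feat_dim))

    def forward(self, x):  # x: (batch, in_dim)
        return self.net(x)

class AdaptiveFusion(nn.Module):
    # One plausible reading of "adaptive fusion": a per-sample,
    # per-channel gate g in (0, 1) mixes the rotation feature with the
    # spatial-temporal feature, then a linear head classifies the mix.
    def __init__(self, feat_dim=128, num_classes=4):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * feat_dim, feat_dim), nn.Sigmoid())
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, f_rot, f_st):
        g = self.gate(torch.cat([f_rot, f_st], dim=-1))
        fused = g * f_rot + (1 - g) * f_st  # input-dependent convex mix
        return self.classifier(fused)

# Toy forward pass. In the full model f_st would come from an ST-GCN run
# over the skeleton-joint sequence; here it is random, for shape-checking.
rotations = torch.randn(8, 48)  # e.g. 16 joints x 3 rotation angles
f_st = torch.randn(8, 128)      # placeholder for the ST-GCN feature
logits = AdaptiveFusion()(RotationEncoder()(rotations), f_st)
print(logits.shape)  # torch.Size([8, 4]): happy/sad/angry/neutral classes

In a full training setup, the rotation branch would first be pretrained with a reconstruction loss, and the gate and classifier would then be trained end-to-end with cross-entropy on the four Emotion-Gait labels.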

CLC number: TP183

References
[1]SCHURGIN M W, NELSON J, LIDA S, et al. Eye movements during emotion recognition in faces[J]. Journal of Vision, 2014, 14(13): 14. DOI: 10.1167/14.13.14.
[2]KLEINSMITH A, BIANCHI-BERTHOUZE N. Affective body expression perception and recognition: a survey[J]. IEEE Transactions on Affective Computing, 2013, 4(1): 15-33. DOI: 10.1109/T-AFFC.2012.16.
[3]SU W Y. Emotion recognition based on 3D human motion data[D]. Tianjin: Tianjin University, 2012. DOI: 10.7666/d.D323840.
[4]BHATTACHARYA U, MITTAL T, CHANDRA R, et al. STEP: spatial temporal graph convolutional networks for emotion perception from gaits[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(2): 1342-1350. DOI: 10.1609/aaai.v34i02.5490.
[5]MONTEPARE J M, GOLDSTEIN S B, CLAUSEN A. The identification of emotions from gait information[J]. Journal of Nonverbal Behavior, 1987, 11(1): 33-42. DOI: 10.1007/BF00999605.
[6]MEEREN H K M, VAN HEIJNSBERGEN C C R J, DE GELDER B. Rapid perceptual integration of facial expression and emotional body language[J]. Proceedings of the National Academy of Sciences of the United States of America, 2005, 102(45): 16518-16523. DOI: 10.1073/pnas.0507650102.
[7]MICHALAK J, TROJE N F, FISCHER J, et al. Embodiment of sadness and depression - gait patterns associated with dysphoric mood[J]. Psychosomatic Medicine, 2009, 71(5): 580-587. DOI: 10.1097/PSY.0b013e3181a2515c.
[8]DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]// 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE Press, 2009: 248-255. DOI: 10.1109/CVPR.2009.5206848.
[9]CHANG L, DENG X M, ZHOU M Q, et al. Convolutional neural networks in image understanding[J]. Acta Automatica Sinica, 2016, 42(9): 1300-1312. DOI: 10.16383/j.aas.2016.c150800.
[10]RANDHAVANE T, BHATTACHARYA U, KAPSASKIS K, et al. Identifying emotions from walking using affective and deep features[EB/OL].(2020-01-09)[2021-07-14]. https://arxiv.org/pdf/1906.11884.
[11]BHATTACHARYA U, RONCAL C, MITTAL T, et al. Take an emotion walk: perceiving emotions from gaits using hierarchical attention pooling and affective mapping[C]// Computer Vision-ECCV 2020: Lecture Notes in Computer Science Vol 12355. Cham: Springer, 2020: 145-163. DOI: 10.1007/978-3-030-58607-2_9.
[12]YAN S J, XIONG Y J, LIN D H. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 7444-7452.
[13]LI W. Research and application of deep learning in image recognition[D]. Wuhan: Wuhan University of Technology, 2014. DOI: 10.7666/d.D617675.
[14]LI Y D, HAO Z B, LEI H. Survey of convolutional neural network[J]. Journal of Computer Applications, 2016, 36(9): 2508-2515, 2565. DOI: 10.11772/j.issn.1001-9081.2016.09.2508.
[15]XU K. Research on the application of convolutional neural networks in image recognition[D]. Hangzhou: Zhejiang University, 2012.
[16]GAO J J, ZHANG J, SHEN L S. Improved algorithm of visual attention model[J]. Electronic Measurement Technology, 2008, 31(3): 1-3, 10. DOI: 10.19651/j.cnki.emt.2008.03.001.
[17]ZHANG J, WEI W. Saliency extraction based on visual attention model[J]. Computer Technology and Development, 2010, 20(11): 109-113. DOI: 10.3969/j.issn.1673-629X.2010.11.027.
[18]LEE J Y, KIM S Y, KIM S N, et al. Context-aware emotion recognition networks[C]// 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Los Alamitos, CA: IEEE Computer Society, 2019: 10142-10151. DOI: 10.1109/ICCV.2019.01024.
[19]CHEN C, WU Z X, JIANG Y G. Emotion in context: deep semantic feature fusion for video emotion recognition[C]// Proceedings of the 24th ACM International Conference on Multimedia. New York, NY: ACM Press, 2016: 127-131. DOI: 10.1145/2964284.2967196.
[20]ZHANG K, ZHANG D J, SHENG Y H, et al. Two methods of 3D coordinate transformation and their comparison[J]. Mathematics in Practice and Theory, 2008, 38(23): 121-128.
[21]AO D G. Unsupervised feature learning combined with neural networks for image recognition[D]. Guangzhou: South China University of Technology, 2014.
[22]MA Y L, PATERSON H M, POLLICK F E. A motion capture library for the study of identity, gender, and emotion perception from biological motion[J]. Behavior Research Methods, 2006, 38(1): 134-141. DOI: 10.3758/BF03192758.
[23]IONESCU C, PAPAVA D, OLARU V, et al. Human3.6M: large scale datasets and predictive methods for 3D human sensing in natural environments[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(7): 1325-1339. DOI: 10.1109/TPAMI.2013.248.
[24]NARANG S, BEST A, FENG A, et al. Motion recognition of self and others on realistic 3D avatars[J]. Computer Animation and Virtual Worlds, 2017, 28(3/4): e1762. DOI: 10.1002/cav.1762.
[25]CMU Graphics Lab. CMU graphics lab motion capture database[DS/OL]. Pittsburgh, PA: CMU Graphics Lab, 2018[2021-07-14]. http://mocap.cs.cmu.edu.
[26]SHI L, ZHANG Y F, CHENG J, et al. Skeleton-based action recognition with directed graph neural networks[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2019: 7904-7913. DOI: 10.1109/CVPR.2019.00810.
[27]LIU Z Y, ZHANG H W, CHEN Z H, et al. Disentangling and unifying graph convolutions for skeleton-based action recognition[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2020: 140-149. DOI: 10.1109/CVPR42600.2020.00022.
[28]MEHRABIAN A, RUSSELL J A. An approach to environmental psychology[M]. Cambridge, MA: MIT Press, 1974.
[29]CAO T Y. Research on emotion recognition based on multimodal fusion[D]. Tianjin: Tianjin University, 2012. DOI: 10.7666/d.D323808.
[30]CHEN W B, GUAN Z X, CHEN Y J. Data augmentation method based on conditional generative adversarial networks[J]. Journal of Computer Applications, 2018, 38(11): 3305-3311. DOI: 10.11772/j.issn.1001-9081.2018051008.