Journal of Guangxi Normal University (Natural Science Edition), 2026, Vol. 44, Issue (1): 91-101. DOI: 10.16088/j.issn.1001-6600.2025040903
• Intelligence Information Processing •
WANG Xuyang*, MA Jin
| [1] | SHI Zihao, MENG Zuqiang, TAN Chaohong. A Detection Model for Multimodal Fake News Based on Attention Mechanism and Multiscale Fusion [J]. Journal of Guangxi Normal University(Natural Science Edition), 2026, 44(1): 68-79. |
| [2] | WANG Xuyang, ZHANG Jiayu. Temporal Multimodal Sentiment Analysis with Cross-Modal Augmentation Networks [J]. Journal of Guangxi Normal University(Natural Science Edition), 2025, 43(4): 97-107. |
| [3] | LI Zhixin, LIU Mingqi. A Dissimilarity Feature-Driven Decoupled Multimodal Sentiment Analysis [J]. Journal of Guangxi Normal University(Natural Science Edition), 2025, 43(3): 57-71. |
| [4] | WANG Xuyang, WANG Changrui, ZHANG Jinfeng, XING Mengyi. Multimodal Sentiment Analysis Based on Cross-Modal Cross-Attention Network [J]. Journal of Guangxi Normal University(Natural Science Edition), 2024, 42(2): 84-93. |
| [5] | GUO Jialiang, JIN Ting. Semantic Enhancement-Based Multimodal Sentiment Analysis [J]. Journal of Guangxi Normal University(Natural Science Edition), 2023, 41(5): 14-25. |
| [6] | SUN Yansong, YANG Liang, LIN Hongfei. Humor Recognition of Sitcom Based on Multi-granularity of Segmentation Enhancement and Semantic Enhancement [J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(3): 57-65. |