Journal of Guangxi Normal University (Natural Science Edition), 2026, Vol. 44, Issue (1): 68-79. doi: 10.16088/j.issn.1001-6600.2024122004

• Intelligence Information Processing •

A Detection Model for Multimodal Fake News Based on Attention Mechanism and Multiscale Fusion

SHI Zihao1, MENG Zuqiang1*, TAN Chaohong2   

  1. College of Computer, Electronics and Information, Guangxi University, Nanning Guangxi 530004, China;
  2. Guangxi Key Laboratory of Digital Infrastructure (Guangxi Zhuang Autonomous Region Information Center), Nanning Guangxi 530000, China
  • Received: 2024-12-20 Revised: 2025-04-26 Online: 2026-01-05 Published: 2026-01-26

Abstract: Fake news can cause serious harm if it is not handled promptly. Existing multimodal fake news detection methods mainly rely on attention mechanisms to fuse unimodal features; they neither account for the semantic gaps that may exist between features of different modalities nor fully exploit the potential of multimodal pre-trained models. To address this, this paper proposes a multimodal fake news detection model that fuses features in multiple stages. The model first uses a pre-trained multimodal model to extract aligned features. The features of each modality are then enhanced by those of the other through an attention mechanism, and the enhanced features are concatenated to achieve early fusion. Finally, a multiscale fusion module captures the interaction information between the modal features and learns fusion weights to realize late fusion. Experimental results show that the proposed model outperforms comparable models, and they also verify the effectiveness of the attention mechanism and the multiscale fusion module.

Key words: multimodality, fake news detection, attention mechanism, multiscale fusion, multistage fusion

CLC Number:  TP391.1
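
The three stages named in the abstract (mutual enhancement by attention, early fusion by concatenation, and weighted late fusion over multiscale branches) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the residual connection, the moving-average "scales", and the fixed `fusion_logits` are all assumptions made for illustration, and the aligned text and image features are assumed to have already been extracted by a pre-trained multimodal model. In a trained model the fusion logits would be learned parameters.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, context):
    # Enhance each query vector with an attention-weighted sum of the
    # other modality's vectors (scaled dot-product, residual added).
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, c)) / math.sqrt(d) for c in context]
        w = softmax(scores)
        attended = [sum(wj * c[i] for wj, c in zip(w, context)) for i in range(d)]
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out

def mean_pool(tokens):
    # Average a list of token vectors into a single vector.
    return [sum(col) / len(tokens) for col in zip(*tokens)]

def smooth(vec, k):
    # Hypothetical stand-in for one "scale": a moving average of width k
    # (k = 1 leaves the vector unchanged).
    half = k // 2
    out = []
    for i in range(len(vec)):
        window = vec[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def fuse(text_tokens, image_tokens, fusion_logits, kernels=(1, 3, 5)):
    # Stage 1: each modality is enhanced by the other through attention.
    text_enh = cross_attend(text_tokens, image_tokens)
    image_enh = cross_attend(image_tokens, text_tokens)
    # Stage 2: early fusion by concatenating the pooled, enhanced features.
    early = mean_pool(text_enh) + mean_pool(image_enh)
    # Stage 3: multiscale branches combined with normalized fusion weights.
    branches = [smooth(early, k) for k in kernels]
    weights = softmax(fusion_logits)
    fused = [sum(w * b[i] for w, b in zip(weights, branches))
             for i in range(len(early))]
    return fused, weights

# Toy aligned features: two text tokens and one image token, dimension 2.
text = [[1.0, 0.0], [0.0, 1.0]]
image = [[1.0, 1.0]]
fused, weights = fuse(text, image, fusion_logits=[0.0, 0.0, 0.0])
print(len(fused), [round(w, 3) for w in weights])
```

The fused vector would normally be passed to a classifier that outputs a real/fake label; that step is omitted here because the abstract does not describe it.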