Ransomware Classification Based on Entropy Image Static Analysis Technology

doi:10.16088/j.issn.1001-6600.2022100805

Abstract

Abstract: With the rapid development of artificial intelligence, 5G, the Internet of Things, and related technologies, China has faced increasingly serious cyber attacks from abroad. Ransomware attacks in particular have increased significantly, causing heavy data and economic losses to the country, enterprises, and individuals. To classify ransomware families effectively, this paper proposes a ransomware classification method based on entropy image static analysis, which classifies samples directly from entropy features extracted from ransomware binary files. In addition, a data augmentation method named Ran-GAN is proposed to address the data imbalance among ransomware families. The method introduces an attention mechanism into the VGG16 neural network architecture to improve the network's feature extraction ability. Experimental results show that the method achieves 97.16% accuracy and a 97.12% weighted average F1-score on 14 ransomware families. It clearly outperforms traditional visualization methods on all four evaluation metrics, and it also significantly improves ransomware detection performance compared with other neural network methods.

Key words: ransomware, ransomware visualization, entropy features, static analysis, attention mechanism
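
The abstract does not spell out how entropy features are turned into an image, so the following is only a minimal sketch of one common approach: compute the Shannon entropy of fixed-size byte windows and map each value to a grayscale pixel. The function names, the 256-byte window, the 64-pixel row width, and the linear scaling to 0-255 are all illustrative assumptions, not the parameters used in the paper.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    # H = sum over byte values of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_image(data: bytes, window: int = 256, width: int = 64) -> list:
    """Turn a binary into a grayscale grid of per-window entropy values.

    Hypothetical helper: window size, row width, and the 0-255 scaling
    are assumptions made for illustration only.
    """
    pixels = [
        round(shannon_entropy(data[i:i + window]) / 8.0 * 255)
        for i in range(0, len(data), window)
    ]
    # Pad the last row with zeros so every row has the same width.
    if len(pixels) % width:
        pixels += [0] * (width - len(pixels) % width)
    return [pixels[r:r + width] for r in range(0, len(pixels), width)]


if __name__ == "__main__":
    # Synthetic input: four maximally varied windows, then one constant window.
    sample = bytes(range(256)) * 4 + b"\x00" * 256
    for row in entropy_image(sample, window=256, width=4):
        print(row)
```

High-entropy regions (for example packed or encrypted sections) come out bright and low-entropy regions dark. A grid like this could then be resized and fed to an image classifier such as the attention-augmented VGG16 the abstract describes.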

CLC number: TP309

DENG Xizhen, JIANG Ming, CEN Mingcan, LUO Yuling. Ransomware Classification Based on Entropy Image Static Analysis Technology[J]. Journal of Guangxi Normal University (Natural Science Edition), 2023, 41(3): 91-104.

