A Web Attack Detection Method Based on Convolutional Gated Recurrent Neural Network

doi:10.16088/j.issn.1001-6600.2023022203

Abstract

Abstract: Attacks on Web applications have long been a central concern in cyberspace security. As Web attack techniques continue to evolve, traditional intrusion detection systems and Web application firewalls are increasingly unable to meet protection needs. To counter attackers who construct Web attacks by embedding executable code or injecting malicious code into Web requests, this paper proposes a feature-fusion-based Convolutional Gated Recurrent Unit (CGRU) neural network for detecting malicious Web requests. The network uses a CNN to capture local and high-order features of network events; it abandons traditional pooling and uses a GRU in place of the pooling layer to collect features along the time dimension. To improve detection performance, nine statistical features that have proved effective for Web attack detection in traditional machine learning are selected to augment the original features. In addition, a Word2Vec model is used to pretrain the word embedding matrix that provides the input to the CGRU model, and the final output is then classified, which effectively improves multi-class accuracy. In comparative experiments against current typical methods on the public HTTP CSIC 2010 dataset, the proposed method achieves an accuracy of 99.81%, a recall of 99.78%, an F1-score of 98.80%, and a precision of 99.81%, all higher than those of the compared methods.

Key words: network attack, Web attack detection, neural network, gated recurrent unit, feature fusion
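
The data flow described in the abstract (embedding, convolution, a GRU in place of pooling, then fusion with nine statistical features before classification) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the layer sizes, the random weights standing in for Word2Vec-pretrained embeddings and trained parameters, the two-class output, and every function and parameter name below are assumptions chosen for illustration.

```python
import math
import random

def affine(W, x, b):
    """Return W @ x + b for a list-of-rows matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def conv1d_relu(seq, kernels, bias):
    """Valid 1-D convolution over time, then ReLU.
    seq is T x D; each kernel is k x D; the result is (T - k + 1) x F."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        out.append([
            max(0.0, bias[f] + sum(kern[i][d] * window[i][d]
                                   for i in range(k)
                                   for d in range(len(window[i]))))
            for f, kern in enumerate(kernels)
        ])
    return out

def gru_last_state(seq, p, hidden):
    """Run a GRU over the convolution outputs and return the final hidden
    state. This replaces a pooling layer: the sequence is summarized along
    the time dimension instead of being max- or average-pooled."""
    h = [0.0] * hidden
    for x in seq:
        z = [sigmoid(v) for v in affine(p["Wz"], x + h, p["bz"])]  # update gate
        r = [sigmoid(v) for v in affine(p["Wr"], x + h, p["br"])]  # reset gate
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in affine(p["Wh"], x + gated, p["bh"])]
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
    return h

def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def init_params(rng, vocab, emb_dim, n_filters, kernel, hidden, n_stats, n_classes):
    """Random weights only; in the paper the embedding comes from Word2Vec."""
    gru_in = n_filters + hidden
    return {
        "embedding": rand_matrix(rng, vocab, emb_dim),
        "kernels": [rand_matrix(rng, kernel, emb_dim) for _ in range(n_filters)],
        "conv_bias": [0.0] * n_filters,
        "hidden": hidden,
        "gru": {"Wz": rand_matrix(rng, hidden, gru_in), "bz": [0.0] * hidden,
                "Wr": rand_matrix(rng, hidden, gru_in), "br": [0.0] * hidden,
                "Wh": rand_matrix(rng, hidden, gru_in), "bh": [0.0] * hidden},
        "W_out": rand_matrix(rng, n_classes, hidden + n_stats),
        "b_out": [0.0] * n_classes,
    }

def cgru_forward(token_ids, stat_features, params):
    embedded = [params["embedding"][i] for i in token_ids]
    conv_out = conv1d_relu(embedded, params["kernels"], params["conv_bias"])
    h = gru_last_state(conv_out, params["gru"], params["hidden"])
    fused = h + list(stat_features)  # feature fusion by concatenation
    return softmax(affine(params["W_out"], fused, params["b_out"]))

rng = random.Random(0)
params = init_params(rng, vocab=50, emb_dim=8, n_filters=6, kernel=3,
                     hidden=5, n_stats=9, n_classes=2)
token_ids = [rng.randrange(50) for _ in range(12)]  # a tokenized request (made up)
stat_features = [rng.random() for _ in range(9)]    # nine placeholder statistics
probs = cgru_forward(token_ids, stat_features, params)
print(probs)
```

Concatenating the GRU state with the statistical features is only one plausible reading of "feature fusion"; the abstract does not specify where or how the nine features are merged, so the fusion point here is a guess.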

CLC number: TP393.08; TP183

ZHOU Qiao, ZHAI Jiangtao, JIA Dongsheng, SUN Haoxiang. A Web Attack Detection Method Based on Convolutional Gated Recurrent Neural Network[J]. Journal of Guangxi Normal University (Natural Science Edition), 2023, 41(6): 51-61.
