Journal of Guangxi Normal University (Natural Science Edition), 2023, Vol. 41, Issue 6: 51-61. doi: 10.16088/j.issn.1001-6600.2023022203


A Web Attack Detection Method Based on Convolutional Gated Recurrent Neural Network

ZHOU Qiao, ZHAI Jiangtao*, JIA Dongsheng, SUN Haoxiang

  1. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044, China
  • Received: 2023-02-22 Revised: 2023-04-14 Published: 2023-12-04
  • Corresponding author: ZHAI Jiangtao (b. 1983), male, from Sanmenxia, Henan; associate professor at Nanjing University of Information Science and Technology, Ph.D. E-mail: jiangtaozhai@gmail.com
  • Funding:
    National Natural Science Foundation of China (61931004, 62072250); National Key Research and Development Program of China (2021QY0700)

Abstract: Attacks on Web applications have long been a focal point of cyberspace confrontation. As Web attack techniques continue to evolve, traditional intrusion detection systems and Web application firewalls are increasingly unable to meet security protection needs. To counter attackers who embed executable code or inject malicious code into Web requests to construct various Web attacks, this paper proposes a Convolutional Gated Recurrent Unit (CGRU) neural network, based on feature fusion, for detecting malicious Web requests. The network uses a CNN to capture the local and high-order features of network events; it abandons traditional pooling and instead employs a GRU in place of the pooling layer to collect features along the time dimension. To improve detection performance, nine statistical features that classify well in traditional machine-learning approaches to Web attack detection are selected to augment the original features. In addition, a Word2Vec model is used to pretrain the word embedding matrix that provides the input to the CGRU model, whose final output is then classified, effectively improving multi-class accuracy. In comparative experiments against current typical methods on the public HTTP CSIC 2010 dataset, the proposed method achieves an accuracy of 99.81%, a recall of 99.78%, an F1-score of 98.80%, and a precision of 99.81%, all higher than those of the compared methods.
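The data flow described in the abstract (convolution, a GRU in place of pooling, then fusion with statistical features) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the random weights, the use of the last GRU hidden state as the sequence summary, the two-class output, and the zero-valued placeholder for the nine statistical features are all assumptions made for illustration, and the random `embedded` matrix merely stands in for Word2Vec token embeddings. Biases are omitted for brevity.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rng, rows, cols):
    """Small random weights; stand-ins for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def conv1d_relu(seq, kernels, k):
    """Valid 1-D convolution over the token axis, followed by ReLU.
    seq is T x D, kernels is F x (k*D); the result is (T-k+1) x F."""
    out = []
    for t in range(len(seq) - k + 1):
        window = [x for row in seq[t:t + k] for x in row]  # flatten k x D
        out.append([max(0.0, s) for s in matvec(kernels, window)])
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_last_state(seq, p, hidden):
    """Run a bias-free GRU over seq and return the final hidden state.
    Here the GRU takes the place a pooling layer would normally occupy."""
    h = [0.0] * hidden
    for x in seq:
        z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated))]
        # One common convention; some libraries swap the roles of z and 1 - z.
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
    return h

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def cgru_forward(embedded, stats, params):
    """CNN -> GRU -> concatenate statistical features -> softmax classifier."""
    feats = conv1d_relu(embedded, params["conv"], params["k"])
    h = gru_last_state(feats, params["gru"], params["hidden"])
    fused = h + stats  # feature fusion by concatenation (an assumption)
    return softmax(matvec(params["out"], fused))

def make_params(rng, dim=8, k=3, filters=6, hidden=5, n_stats=9, n_classes=2):
    """Hypothetical sizes chosen only to keep the example small."""
    gru = {name: rand_matrix(rng, hidden, filters) for name in ("Wz", "Wr", "Wh")}
    gru.update({name: rand_matrix(rng, hidden, hidden) for name in ("Uz", "Ur", "Uh")})
    return {"k": k, "hidden": hidden,
            "conv": rand_matrix(rng, filters, k * dim),
            "gru": gru,
            "out": rand_matrix(rng, n_classes, hidden + n_stats)}

rng = random.Random(0)
params = make_params(rng)
embedded = rand_matrix(rng, 12, 8)  # 12 tokens, 8-dim stand-in embeddings
stats = [0.0] * 9                   # placeholder for nine statistical features
probs = cgru_forward(embedded, stats, params)
print(probs)
```

Because the GRU consumes the convolution output step by step, the order of the local features is preserved in the summary vector, which is the property a pooling layer would discard.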

Key words: network attack, Web attack detection, neural network, gated recurrent unit, feature fusion

CLC number: TP393.08; TP183
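The four evaluation metrics reported in the abstract (accuracy, precision, recall, F1-score) can be computed from predicted and true labels as sketched below. The function name `binary_metrics` and the toy label lists are hypothetical and are not taken from the paper or from the HTTP CSIC 2010 dataset. For a single positive class, F1 is the harmonic mean of precision and recall; scores averaged across several classes need not satisfy that identity exactly.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels for illustration only: 1 = malicious request, 0 = normal request.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```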
