Journal of Guangxi Normal University (Natural Science Edition) ›› 2026, Vol. 44 ›› Issue (3): 60-74. doi: 10.16088/j.issn.1001-6600.2025073101

• Intelligent Information Processing •

Steel Surface Defect Detection Algorithm Based on MHTD-YOLO11n

QIAN Junlei1,2, WANG Xizhi1, ZENG Kai1,3*, DU Xueqiang2, LIU He2, ZHU Liguang3

  1. College of Electrical Engineering, North China University of Science and Technology, Tangshan Hebei 063210, China;
  2. Tangshan Iron and Steel Enterprise Process Control and Optimization Technology Innovation Center, Tangshan Hebei 063000, China;
  3. College of Metallurgy and Energy, North China University of Science and Technology, Tangshan Hebei 063210, China
  • Received: 2025-07-31 Revised: 2025-09-30 Online: 2026-05-05 Published: 2026-05-13
  • Corresponding author: ZENG Kai (born 1990), male, from Tangshan, Hebei; associate professor at North China University of Science and Technology, Ph.D. E-mail: kevinzengkai@126.com
  • Funding:
    National Natural Science Foundation of China (51904107); Central Government Guided Local Science and Technology Development Fund Project (236Z1017G)

Abstract: Steel surface defect detection is complicated by diverse defect morphologies, complex structures, a high proportion of small targets, and interference from complex environments, while existing defect detection models tend to have complex structures, large parameter counts, and poor detection accuracy and real-time performance. To address these issues, this study proposes MHTD-YOLO11n, a lightweight and efficient steel defect detection algorithm based on YOLO11n. First, a multi-scale grouped dilated convolution (MSGDC) module is introduced, which integrates grouped convolutions with different dilation rates to achieve multi-scale feature fusion and improve detection across defect types. Second, a hierarchical reciprocal attention mixer (H-RAMi) module is incorporated to compensate for the pixel-level information lost through feature downsampling. Third, a C2PSA_TPA module is designed that uses tensor product attention (TPA) to significantly compress the KV cache during inference. Finally, the feature interaction module is restructured as C3K2_DFF, enabling the network to combine multi-scale information effectively under a larger receptive field and improving both detection accuracy and speed. Experimental results show that, compared with YOLO11n, MHTD-YOLO11n increases mAP and recall by 4.3 and 9.1 percentage points, respectively, reaches a detection speed of 258.3 frame/s, and reduces the parameter count and computational cost by 1.42×10^6 and 3.4×10^9, respectively, meeting the dual requirements of high accuracy and real-time performance in industrial quality inspection.

Key words: computer image processing, steel surface defects, defect detection, object detection, YOLO11n, attention mechanism
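
The abstract attributes the multi-scale behaviour of MSGDC to grouped convolutions with different dilation rates. The arithmetic behind that idea can be sketched as follows. This is an illustration only: the kernel size, dilation rates, channel counts, and group count are hypothetical and are not taken from the paper, and the helper names are invented for this example.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(in_channels: int, out_channels: int,
                      kernel_size: int, groups: int = 1) -> int:
    """Number of weights in a 2D convolution, bias excluded."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees in_channels / groups input channels.
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


# Hypothetical settings, chosen only to make the numbers easy to follow.
dilations = (1, 2, 3)
extents = [effective_kernel_size(3, d) for d in dilations]

dense_weights = conv_weight_count(64, 64, 3)               # ordinary 3x3 conv
grouped_weights = conv_weight_count(64, 64, 3, groups=4)   # same conv, 4 groups

print(extents)                         # [3, 5, 7]
print(dense_weights, grouped_weights)  # 36864 9216
```

Under these assumed settings, three parallel 3×3 branches cover 3×3, 5×5, and 7×7 neighbourhoods with the same number of weights per branch, and grouping by 4 cuts the weight count of each branch by a factor of 4. That is one plausible reading of how such a module can widen the receptive field while staying lightweight.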

CLC Number: TP391.41
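
The abstract states that tensor product attention (TPA) compresses the KV cache during inference. A rough sketch of the per-token cache sizes is given below, assuming the usual accounting in which standard multi-head attention caches full keys and values (2·h·d_h numbers per token) while TPA caches rank-R factors for keys and values ((R_K + R_V)·(h + d_h) numbers per token). The head count, head dimension, and ranks are hypothetical and do not come from the paper, and the function names are invented for this example.

```python
def mha_cache_per_token(num_heads: int, head_dim: int) -> int:
    """Cached values per token for standard multi-head attention (keys + values)."""
    return 2 * num_heads * head_dim


def tpa_cache_per_token(num_heads: int, head_dim: int,
                        rank_k: int, rank_v: int) -> int:
    """Cached values per token when keys/values are stored as low-rank factors.

    Each rank-1 term keeps one head-axis factor (num_heads values) and one
    feature-axis factor (head_dim values).
    """
    return (rank_k + rank_v) * (num_heads + head_dim)


# Hypothetical configuration for illustration only.
num_heads, head_dim = 8, 32
rank_k, rank_v = 2, 2

mha_size = mha_cache_per_token(num_heads, head_dim)
tpa_size = tpa_cache_per_token(num_heads, head_dim, rank_k, rank_v)
ratio = mha_size / tpa_size

print(mha_size, tpa_size, ratio)  # 512 160 3.2
```

With these assumed values the factorized cache is 3.2 times smaller per token. The actual saving in C2PSA_TPA depends on the ranks and head sizes the authors chose, which the abstract does not report.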

Copyright © Editorial Office of Journal of Guangxi Normal University (Natural Science Edition)