Journal of Guangxi Normal University (Natural Science Edition) ›› 2026, Vol. 44 ›› Issue (3): 60-74. doi: 10.16088/j.issn.1001-6600.2025073101
钱俊磊1,2, 王熹之1, 曾凯1,3*, 杜学强2, 刘贺2, 朱立光3
QIAN Junlei1,2, WANG Xizhi1, ZENG Kai1,3*, DU Xueqiang2, LIU He2, ZHU Liguang3
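Abstract: Steel surface defect detection is complicated by diverse defect morphologies, complex structures, a high proportion of small targets, and interference from complex environments, while existing defect detection models tend to be structurally complex and parameter-heavy, with poor detection accuracy and real-time performance. To address these problems, this paper proposes MHTD-YOLO11n, a lightweight and efficient steel defect detection algorithm based on YOLO11n. First, a multi-scale grouped dilated convolution (MSGDC) module is introduced, which integrates grouped convolutions with different dilation rates to fuse multi-scale features and improve detection across defect types. Second, a hierarchical reciprocal attention mixer (H-RAMi) is introduced to compensate for the pixel-level information lost through downsampled features. Third, a C2PSA_TPA module is designed that uses tensor product attention (TPA) to significantly reduce the size of the KV cache at inference. Finally, the feature interaction module is reconstructed (C3K2_DFF) so that the network can effectively combine multi-scale information under a larger receptive field, improving both detection accuracy and speed. Experimental results show that, compared with YOLO11n, MHTD-YOLO11n raises mAP and recall by 4.3 and 9.1 percentage points, respectively, reaches a detection speed of 258.3 frame/s, and reduces the parameter count and computational cost by 1.42×10^6 and 3.4×10^9, respectively, meeting the dual demands of industrial quality inspection for high accuracy and real-time performance.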
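The idea of combining grouped convolutions with different dilation rates can be illustrated with two standard formulas: a k×k kernel with dilation d spans k + (k − 1)(d − 1) pixels per axis, and a grouped convolution holds (C_in / g) · C_out · k² weights. The sketch below is only a back-of-the-envelope illustration; the channel count, kernel size, dilation rates, group count, and all function names are hypothetical placeholders and are not taken from the MSGDC module itself.

```python
"""Rough look at grouped, dilated convolution branches.

All sizes used here are hypothetical placeholders, not values from the paper.
"""


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Pixels spanned along one axis by a kernel with the given dilation."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(in_channels: int, out_channels: int,
                      kernel_size: int, groups: int = 1) -> int:
    """Number of weights in a 2-D convolution, bias excluded."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


def summarize_branches(channels: int, kernel_size: int,
                       dilations: tuple, groups: int):
    """Return (span of each branch, total grouped weights, dense single-layer weights)."""
    spans = [effective_kernel_size(kernel_size, d) for d in dilations]
    per_branch = conv_weight_count(channels, channels, kernel_size, groups)
    grouped_total = per_branch * len(dilations)
    dense = conv_weight_count(channels, channels, kernel_size)
    return spans, grouped_total, dense


if __name__ == "__main__":
    # Assumed configuration: 64 channels, 3x3 kernels, dilations 1/2/3, 4 groups.
    spans, grouped_total, dense = summarize_branches(64, 3, (1, 2, 3), 4)
    print("per-axis span of each branch:", spans)
    print("weights in all grouped branches:", grouped_total)
    print("weights in one dense 3x3 layer:", dense)
```

With these placeholder values, the three grouped branches together hold 27,648 weights against 36,864 for a single dense 3×3 layer, while spanning 3, 5, and 7 pixels per axis. This is one way to see how such a design could widen the receptive field without adding parameters.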
CLC number: TP391.41