Journal of Guangxi Normal University (Natural Science Edition), 2025, Vol. 43, Issue (4): 83-96. doi: 10.16088/j.issn.1001-6600.2024072303

• Intelligent Information Processing •

Improved ConvNeXt-based Algorithm for Apple Leaf Disease Classification

SHI Tianyi, NAN Xinyuan*, GUO Xiangyu, ZHAO Pu, CAI Xin

  1. School of Electrical Engineering, Xinjiang University, Urumqi, Xinjiang 830017, China
  • Received: 2024-07-23 Revised: 2024-12-23 Online: 2025-07-05 Published: 2025-07-14
  • Corresponding author: NAN Xinyuan (born 1967), male, from Urumqi, Xinjiang; professor at Xinjiang University. E-mail: xynan@xju.edu.cn
  • Funding:
    National Natural Science Foundation of China (62303394); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C693)

Abstract: To address the poor accuracy of traditional apple leaf disease classification methods, this paper proposes CALDNet (ConvNeXt apple leaf disease enhance network), an apple leaf disease classification algorithm based on an improved ConvNeXt. CALDNet adjusts the model architecture with a 3223 network structure and introduces skip connections and positional encoding to strengthen the model's ability to capture spatial information and to stabilize training. Spatial pyramid pooling (SPP) is introduced to capture spatial features at different scales, which improves the model's adaptability to lesions of different sizes. Building on the ConvNeXtblock, a G-ConvNeXtblock is designed that uses Gabor filters as convolution kernels to improve the depthwise convolution and better capture texture information in the image. To improve recognition of apple leaf diseases that cover only a small area, an enhanced channel and spatial attention mechanism (enhanced convolutional block attention module, enhanced CBAM) is designed. The experiments take six common apple leaf diseases (scab, black rot, brown spot, mosaic, rust, and gray spot) together with healthy leaves as the main research objects, and compare CALDNet with mainstream algorithms. The results show that CALDNet reaches a precision, recall, and F1 score of 97.58%, 97.54%, and 97.54%, respectively, which are 4.63, 4.56, and 4.60 percentage points higher than those of the original ConvNeXt model, while the number of parameters decreases by 23.97%. This addresses the poor accuracy of traditional apple leaf disease classification.

Key words: apple leaf disease, ConvNeXt, CNN, attention mechanism, deep learning
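
The abstract describes using Gabor filters as the kernels of the depthwise convolution. As a minimal sketch of that idea, and not the authors' implementation, the pure-Python code below builds one fixed Gabor kernel per channel and applies it depthwise (kernel c sees only channel c). The function names, the 7×7 kernel size (the size used by the depthwise convolution in the original ConvNeXt), the evenly spaced orientations, and the values of `sigma`, `lambd`, `gamma`, and `psi` are all assumptions chosen for illustration; this page does not give the paper's actual parameterization.

```python
import math


def gabor_kernel(size, theta, sigma=2.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Real part of a Gabor filter sampled on a size x size grid (size odd).

    theta: orientation, sigma: Gaussian width, lambd: carrier wavelength,
    gamma: aspect ratio, psi: phase offset (all illustrative defaults).
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(channels, size=7):
    """One kernel per channel, orientations spread evenly over [0, pi)."""
    return [gabor_kernel(size, math.pi * c / channels) for c in range(channels)]


def depthwise_conv(feature_map, bank):
    """Cross-correlate channel c with kernel c only (zero padding, stride 1)."""
    out = []
    for channel, kernel in zip(feature_map, bank):
        h, w, half = len(channel), len(channel[0]), len(kernel) // 2
        result = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for u, krow in enumerate(kernel):
                    for v, kval in enumerate(krow):
                        y, x = i + u - half, j + v - half
                        if 0 <= y < h and 0 <= x < w:
                            acc += kval * channel[y][x]
                result[i][j] = acc
        out.append(result)
    return out


# Usage: a 4-channel, 7x7 input that is a single impulse in every channel.
bank = gabor_bank(channels=4)
impulse = [[0.0] * 7 for _ in range(7)]
impulse[3][3] = 1.0
response = depthwise_conv([impulse] * 4, bank)
```

Because each kernel here is point-symmetric (`psi = 0`), the response to a centered impulse reproduces the kernel itself, which is a quick way to inspect the oriented texture pattern each channel responds to. In a deep learning framework the same structure would typically be a grouped convolution with one group per channel, with weights initialized or constrained by such a bank; whether the paper keeps the Gabor parameters fixed or learns them is not stated here.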

CLC number: TP391.41
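
The reported metrics can be made more concrete with a little arithmetic. The sketch below first shows how precision, recall, and F1 could be macro-averaged from a confusion matrix, then back-computes the baseline ConvNeXt scores implied by the abstract by subtracting the stated percentage-point gains. Macro averaging is an assumption, since this page does not state the averaging scheme, and the 3-class matrix is made-up toy data rather than a result from the paper; only the `caldnet` and `gain_pp` values are taken from the abstract.

```python
def macro_prf(confusion):
    """Macro-averaged precision, recall, and F1 from a square confusion matrix.

    confusion[t][p] counts samples of true class t predicted as class p.
    """
    n = len(confusion)
    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = confusion[c][c]
        predicted = sum(confusion[t][c] for t in range(n))  # column sum
        actual = sum(confusion[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy confusion matrix (NOT data from the paper).
toy = [[8, 1, 1],
       [0, 9, 1],
       [2, 0, 8]]
toy_precision, toy_recall, toy_f1 = macro_prf(toy)

# Values stated in the abstract, in percent and percentage points.
caldnet = {"precision": 97.58, "recall": 97.54, "f1": 97.54}
gain_pp = {"precision": 4.63, "recall": 4.56, "f1": 4.60}

# Implied baseline ConvNeXt scores: 92.95, 92.98, and 92.94 percent.
baseline = {k: round(caldnet[k] - gain_pp[k], 2) for k in caldnet}

# A percentage-point gain is not a relative gain: 4.63 points on a
# 92.95% baseline is a relative improvement of roughly 4.98%.
relative_precision_gain = 100 * gain_pp["precision"] / baseline["precision"]
```

The distinction matters when comparing against other papers: the abstract reports absolute differences in percentage points, so the gains should be added to or subtracted from the scores directly rather than applied as multipliers.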
