Remote Sensing Image Classification with Cascade Attention Based on ResNet-50

doi:10.16088/j.issn.1001-6600.2023031702

Abstract

Knowledge distillation can improve the generalization ability of neural networks and helps address the shortage of labeled data in remote sensing scene classification. However, the high inter-class similarity of remote sensing images can cause intermediate knowledge features to be lost. To address this problem, this paper proposes a feature extraction method based on a self-distillation cascaded attention mechanism (SDCASA). First, a teacher network and a student network with shared weights are constructed. Then, a cascaded attention module refines the features extracted by the deep teacher network while retaining the intermediate edge information that is filtered out by the shallow neural network. Next, the refined features guide the learning of the student network. Finally, a linear classifier is trained downstream to classify the features. On three public datasets, AID, MLRSNet, and EuroSAT, the method reaches classification accuracies of 85.17%, 90.10%, and 91.13% when trained on 20% of the samples, and 85.50%, 92.13%, and 91.17% when trained on 50% of the samples. The method effectively improves the accuracy of remote sensing scene classification, outperforms the mainstream self-supervised image classification methods SimSiam, SwAV, MoCo v2, and DeepCluster, and has good application value.

Key words: self-distillation, attention mechanism, remote sensing images, self-supervised learning, image classification
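The step in which the teacher's refined features guide the student can be illustrated with a small sketch. The abstract does not give the loss function, so the code below assumes a generic formulation: the cross-entropy between temperature-softened teacher and student outputs, a common choice in distillation. The function names, the temperature values, and the toy vectors are all hypothetical, and plain Python lists stand in for network outputs; the weight sharing and the cascaded attention module are not modeled here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits,
                      teacher_temp=0.5, student_temp=1.0):
    """Cross-entropy between softened teacher and student outputs.

    Assumption: the teacher distribution is a fixed target, and a lower
    teacher temperature sharpens it. Neither detail comes from the abstract.
    """
    target = softmax(teacher_logits, teacher_temp)
    predicted = softmax(student_logits, student_temp)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Toy vectors standing in for projected network outputs (hypothetical values).
teacher = [2.0, 0.5, -1.0]
aligned_student = [1.9, 0.6, -0.9]      # close to the teacher
misaligned_student = [-1.0, 0.5, 2.0]   # ranks the entries in reverse

loss_aligned = distillation_loss(teacher, aligned_student)
loss_misaligned = distillation_loss(teacher, misaligned_student)

# The loss is smaller when the student agrees with the teacher.
print(loss_aligned < loss_misaligned)
```

Minimizing a loss of this kind pulls the student's output distribution toward the teacher's. In a full pipeline the inputs would be the outputs of the two networks on the same image, and the frozen features would afterwards be passed to the downstream linear classifier.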

CLC number: TP751

SONG Guanwu, CHEN Zhiming, LI Jianjun. Remote Sensing Image Classification with Cascade Attention Based on ResNet-50[J]. Journal of Guangxi Normal University (Natural Science Edition), 2023, 41(6): 80-91.

