Adaptive Discrepancy Masked Distillation for remote sensing object detection
ISPRS Journal of Photogrammetry and Remote Sensing (IF 10.6). Pub Date: 2025-02-26. DOI: 10.1016/j.isprsjprs.2025.02.006
Cong Li, Gong Cheng, Junwei Han

Knowledge distillation (KD) has become a promising technique for obtaining a performant student detector for remote sensing images by inheriting knowledge from a heavy teacher detector. Unfortunately, not every pixel contributes equally to the final KD performance, and some are even detrimental. To address this problem, existing methods usually derive a distillation mask to emphasize the valuable regions during KD. In this paper, we put forth Adaptive Discrepancy Masked Distillation (ADMD), a novel KD framework that explicitly localizes the beneficial pixels. Our approach stems from the observation that the feature discrepancy between the teacher and the student is the essential cause of their performance gap. In this regard, we use the feature discrepancy to determine which locations cause the student to lag behind the teacher, and then regulate the student to assign those locations higher learning priority. Furthermore, we empirically observe that discrepancy-masked distillation leads to loss vanishing in later KD stages. To combat this issue, we introduce a simple yet practical weight-increasing module, in which the magnitude of the KD loss is adaptively adjusted to ensure that KD steadily contributes to student optimization. Comprehensive experiments on DIOR and DOTA across various dense detectors show that our ADMD consistently yields remarkable performance gains, particularly under a prolonged distillation schedule, and outperforms state-of-the-art counterparts. Code and trained checkpoints will be made available at https://github.com/swift1988.
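The abstract describes two mechanisms: a discrepancy-derived mask that prioritizes pixels where the student lags the teacher, and a weight-increasing module that keeps the KD loss from vanishing in later stages. The PyTorch sketch below illustrates one plausible reading of that recipe; the function name `admd_loss`, the mean-one normalization of the mask, and the linear weight ramp are illustrative assumptions, not the paper's actual implementation.

```python
import torch

def admd_loss(feat_s, feat_t, step, total_steps, alpha=1.0):
    """Illustrative discrepancy-masked distillation loss (assumed form).

    feat_s, feat_t: student/teacher feature maps of shape (B, C, H, W).
    step, total_steps: training progress driving the assumed
        weight-increasing schedule.
    """
    feat_t = feat_t.detach()  # the teacher only supplies targets

    # Per-pixel teacher-student discrepancy, averaged over channels.
    discrepancy = (feat_t - feat_s).pow(2).mean(dim=1, keepdim=True)  # (B,1,H,W)

    # Normalize into a soft mask with spatial mean 1, so locations where the
    # student lags most receive the highest learning priority.
    h, w = discrepancy.shape[-2:]
    mask = discrepancy / (discrepancy.sum(dim=(2, 3), keepdim=True) + 1e-6)
    mask = (mask * h * w).detach()

    # Masked feature-imitation loss: high-discrepancy locations dominate.
    loss = (mask * (feat_t - feat_s).pow(2)).mean()

    # Assumed weight-increasing module: ramp the loss weight up so KD keeps
    # contributing as the discrepancy (and hence the raw loss) shrinks.
    weight = alpha * (1.0 + step / total_steps)
    return weight * loss
```

Masking this way concentrates the gradient on the locations that actually separate student from teacher, while the ramp counteracts the natural decay of the masked loss as the two feature maps converge over a prolonged schedule.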
