Bounding box versus point annotation: The impact on deep learning performance for animal detection in aerial images
ISPRS Journal of Photogrammetry and Remote Sensing (IF 10.6) Pub Date: 2025-02-27, DOI: 10.1016/j.isprsjprs.2025.02.017
Zeyu Xu, Tiejun Wang, Andrew K. Skidmore, Richard Lamprey, Shadrack Ngene
Bounding box and point annotations are widely used in deep learning-based animal detection from remote sensing imagery, yet their impact on model performance and training efficiency remains insufficiently explored. This study systematically evaluates the two annotation methods using aerial survey datasets of African elephants and antelopes across three commonly employed deep learning networks: YOLO, CenterNet, and U-Net. In addition, we assess the effect of image spatial resolution and the training efficiency associated with each annotation method. Our findings indicate that with YOLO there is no statistically significant difference in accuracy between bounding box and point annotations, whereas for CenterNet and U-Net, bounding box annotations consistently yield significantly higher accuracy than point-based annotations, and these trends hold across the spatial resolution ranges tested. Training efficiency also varies with the network and annotation method: YOLO converges at a similar speed for both annotation types, while U-Net and, to a lesser extent, CenterNet converge significantly faster when trained with bounding box annotations. These findings demonstrate that the choice of annotation method should be guided by the specific deep learning architecture employed. Although point-based annotations are cheaper to produce, their lower training efficiency with U-Net and CenterNet makes bounding box annotations preferable when both accuracy and computational efficiency are priorities. Researchers selecting annotation strategies for animal detection in remote sensing applications should therefore balance detection accuracy, annotation cost, and training efficiency against the requirements of the specific task.
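For readers unfamiliar with the two label formats being compared, the minimal sketch below contrasts a point annotation (class plus a single normalized center coordinate) with a bounding box annotation in the common YOLO label convention (class, center, width, height), and shows the usual workaround for training a box-based detector on point labels: expanding each point into a fixed-size pseudo-box. This is an illustrative assumption, not the authors' code; the 0.05 pseudo-box size and the helper name `point_to_pseudo_box` are hypothetical choices, not values or functions taken from the paper.

```python
# Illustrative sketch of the two annotation formats compared in the paper.
# All coordinates are normalized to [0, 1], as in YOLO-style label files.
from dataclasses import dataclass

@dataclass
class PointLabel:
    cls: int   # class index, e.g. 0 = elephant, 1 = antelope
    x: float   # normalized center x
    y: float   # normalized center y

@dataclass
class BoxLabel:
    cls: int
    x: float   # normalized box center x
    y: float   # normalized box center y
    w: float   # normalized box width
    h: float   # normalized box height

def point_to_pseudo_box(p: PointLabel, size: float = 0.05) -> BoxLabel:
    """Expand a point annotation into a fixed-size pseudo bounding box,
    clamping the center so the box stays inside the image."""
    half = size / 2
    x = min(max(p.x, half), 1 - half)
    y = min(max(p.y, half), 1 - half)
    return BoxLabel(p.cls, x, y, size, size)

if __name__ == "__main__":
    elephant = PointLabel(cls=0, x=0.62, y=0.31)
    print(point_to_pseudo_box(elephant))
    # -> BoxLabel(cls=0, x=0.62, y=0.31, w=0.05, h=0.05)
```

One design trade-off this makes visible: a pseudo-box carries no information about the animal's true extent, which is one plausible reason box-supervised training converges faster for architectures such as CenterNet and U-Net that regress or segment object extent.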
Updated: 2025-02-27