Bridging real and simulated data for cross-spatial-resolution vegetation segmentation with application to rice crops
ISPRS Journal of Photogrammetry and Remote Sensing (IF 10.6). Pub Date: 2024-10-28. DOI: 10.1016/j.isprsjprs.2024.10.007. Yangmingrui Gao, Linyuan Li, Marie Weiss, Wei Guo, Ming Shi, Hao Lu, Ruibo Jiang, Yanfeng Ding, Tejasri Nampally, P. Rajalakshmi, Frédéric Baret, Shouyang Liu
Accurate image segmentation is essential for image-based estimation of vegetation canopy traits, as it minimizes background interference. However, existing segmentation models often lack the generalization ability to handle both ground-based and aerial images across a wide range of spatial resolutions. To address this limitation, a cross-spatial-resolution image segmentation model for rice crops was trained by integrating in-situ and in silico multi-resolution images. We collected more than 3,000 RGB images (real set) covering 17 different resolutions, reflecting diverse canopy structures, illumination conditions, and backgrounds in rice fields, with vegetation pixels annotated manually. Using the previously developed Digital Plant Phenotyping Platform, we created a simulated dataset (sim set) of 10,000 RGB images with resolutions ranging from 0.5 to 3.5 mm/pixel, accompanied by corresponding mask labels. By employing a domain adaptation technique, the simulated images were further transformed into visually realistic images while preserving the original labels, yielding a simulated-to-realistic dataset (sim2real set). Building upon a SegFormer deep learning model, we demonstrated that training with multi-resolution samples led to more generalized segmentation results than single-resolution training on the real dataset. Our exploration of various integration strategies revealed that a training set of 9,600 sim2real images combined with only 60 real images achieved the same segmentation accuracy as 2,400 real images (IoU = 0.819, F1 = 0.901). Moreover, combining 2,400 real images with 1,200 sim2real images produced the best-performing model, which remained effective in six challenging situations, such as specular reflections and shadows. Compared with models trained on single-resolution samples and an established model (i.e., VegANN), our model effectively improved the estimation of both green fraction and green area index across spatial resolutions. The strategy of bridging real and simulated data for cross-resolution deep learning models is expected to be applicable to other crops. The best trained model is available at https://github.com/PheniX-Lab/crossGSD-seg.
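For context on the reported numbers: IoU and F1 here score the vegetation class of a binary mask, and the green fraction is the share of vegetation pixels in an image. The sketch below is an illustrative reconstruction of how these quantities are conventionally computed from a predicted and a reference mask; it is not code from the crossGSD-seg repository, and the function names are hypothetical.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, ref: np.ndarray):
    """Binary vegetation-segmentation metrics for one image.

    pred, ref: boolean arrays of the same shape, True = vegetation pixel.
    Returns (IoU, F1) for the vegetation class.
    """
    tp = np.logical_and(pred, ref).sum()    # vegetation in both masks
    fp = np.logical_and(pred, ~ref).sum()   # predicted vegetation, actually background
    fn = np.logical_and(~pred, ref).sum()   # missed vegetation
    iou = tp / (tp + fp + fn)               # intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn)        # Dice / F1 of the positive class
    return iou, f1

def green_fraction(mask: np.ndarray) -> float:
    """Green fraction (GF): share of vegetation pixels in the image."""
    return float(mask.mean())

# Toy 4x4 example: prediction marks two columns, reference marks one.
pred = np.array([[1, 1, 0, 0]] * 4, dtype=bool)
ref = np.array([[1, 0, 0, 0]] * 4, dtype=bool)
iou, f1 = segmentation_metrics(pred, ref)
print(f"IoU={iou:.3f}, F1={f1:.3f}, GF={green_fraction(pred):.3f}")
```

With the toy masks above, the sketch prints IoU=0.500, F1=0.667, GF=0.500; on the paper's test data, these per-image scores would be averaged across the evaluation set.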
Updated: 2024-10-28