A SAM-adapted weakly-supervised semantic segmentation method constrained by uncertainty and transformation consistency
International Journal of Applied Earth Observation and Geoinformation (IF 7.6), Pub Date: 2025-02-25, DOI: 10.1016/j.jag.2025.104440
Yinxia Cao, Xin Huang, Qihao Weng

Semantic segmentation of remote sensing imagery is a fundamental task for generating pixel-wise category maps. Existing deep learning networks rely heavily on dense pixel-wise labels, which are costly to acquire. Given this challenge, this study introduces sparse point labels, a cost-effective type of weak label, for semantic segmentation. Existing weakly-supervised methods often leverage low-level visual or high-level semantic features from networks to generate supervision for unlabeled pixels, which can easily introduce label noise. Furthermore, these methods rarely explore general-purpose foundation models such as the Segment Anything Model (SAM), which has strong zero-shot generalization capacity in image segmentation. In this paper, we propose a SAM-adapted weakly-supervised method with three components: 1) an adapted EfficientViT-SAM network (AESAM) for semantic segmentation guided by point labels, 2) an uncertainty-based pseudo-label generation module that selects reliable pseudo-labels for supervising unlabeled pixels, and 3) a transformation consistency constraint that enhances AESAM's robustness to data perturbations. The proposed method was tested on the ISPRS Vaihingen dataset (airborne), the Zurich Summer dataset (satellite), and the UAVid dataset (drone). Results demonstrated a significant improvement in mean F1 (by 5.89%–10.56%) and mean IoU (by 5.95%–11.13%) over the baseline method. Compared to the closest competitors, mean F1 increased by 0.83%–5.29% and mean IoU by 1.04%–6.54%. Furthermore, our approach requires fine-tuning only a small number of parameters (0.9 M) using cheap point labels, making it promising for scenarios with limited labeling budgets. The code is available at https://github.com/lauraset/SAM-UTC-WSSS.
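The abstract does not give the exact formulation of components 2) and 3), so the following is only a minimal NumPy sketch of the two ideas under stated assumptions: pseudo-label reliability is measured by predictive entropy (one common choice of uncertainty), the entropy threshold `max_entropy` is a hypothetical parameter, and the consistency constraint is illustrated with a horizontal flip and an L2 penalty. The paper's actual module may differ in all of these details.

```python
import numpy as np

def pseudo_labels_by_uncertainty(probs, max_entropy=0.5):
    """Select reliable pseudo-labels for unlabeled pixels.

    probs: (C, H, W) per-pixel class probabilities from the network.
    Pixels whose predictive entropy falls below `max_entropy` are kept
    as pseudo-labels; the remaining pixels are left unsupervised.
    """
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(probs * np.log(probs + eps), axis=0)  # (H, W)
    labels = probs.argmax(axis=0)                           # (H, W)
    reliable = entropy < max_entropy                        # boolean mask
    return labels, reliable

def transformation_consistency(probs, probs_from_flipped):
    """L2 consistency between predictions on the original image and
    predictions on a horizontally flipped copy, mapped back by
    undoing the flip. A robust model should yield a small value."""
    aligned = probs_from_flipped[:, :, ::-1]  # undo the horizontal flip
    return float(np.mean((probs - aligned) ** 2))

# Illustration: a confident pixel passes the entropy gate, a 50/50
# pixel does not; a perfectly flip-equivariant prediction has zero
# consistency loss.
probs = np.full((2, 2, 2), 0.5)
probs[:, 0, 0] = [0.99, 0.01]  # confident pixel
labels, reliable = pseudo_labels_by_uncertainty(probs)
loss = transformation_consistency(probs, probs[:, :, ::-1])
```

In a training loop, the reliable mask would gate a cross-entropy term on pseudo-labeled pixels (alongside the supervised loss on point labels), and the consistency term would be added as an unsupervised regularizer.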
