LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation
Medical Image Analysis (IF 10.7), Pub Date: 2024-11-12, DOI: 10.1016/j.media.2024.103387
Qiyuan Wang, Shang Zhao, Zikang Xu, S. Kevin Zhou

Surgical instrument segmentation is instrumental to minimally invasive surgeries and related applications. Most previous methods formulate this task as single-frame instance segmentation and ignore the inherent temporal and stereo attributes of surgical video. As a result, these methods are less robust to appearance variation caused by temporal motion and view change. In this work, we propose LACOSTE, a novel model that exploits Location-Agnostic COntexts in Stereo and TEmporal images for improved surgical instrument segmentation. With a query-based segmentation model as the core, we design three performance-enhancing modules. First, we design a disparity-guided feature propagation module to explicitly enhance depth-aware features. To generalize even to monocular video, we apply a pseudo-stereo scheme that generates complementary right images. Second, we propose a stereo-temporal set classifier, which aggregates stereo-temporal contexts in a universal way to make a consolidated prediction and mitigate transient failures. Finally, we propose a location-agnostic classifier that decouples location bias from mask prediction and enhances feature semantics. We extensively validate our approach on three public surgical video datasets: two benchmarks from the EndoVis Challenges and GraSP, a real radical prostatectomy surgery dataset. Experimental results demonstrate the promising performance of our method, which consistently achieves results comparable or favorable to those of previous state-of-the-art approaches.
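
To make the idea of a consolidated set-level prediction concrete, the sketch below pools class scores for one instrument across several frames and stereo views, so that a single transient misclassification is outvoted. It is an illustration only and not the authors' implementation: the function names, the use of mean pooling over softmax probabilities, and the toy logits are all assumptions introduced here.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_set_prediction(logits_per_view: Sequence[Sequence[float]]) -> int:
    """Return the class index with the highest mean probability across views.

    Each inner sequence holds hypothetical class logits for the same
    instrument in one frame or stereo view. Mean pooling is an assumption
    made for illustration, not the aggregation used in the paper.
    """
    if not logits_per_view:
        raise ValueError("need at least one view")
    probs = [softmax(view) for view in logits_per_view]
    num_classes = len(probs[0])
    mean_probs = [
        sum(p[c] for p in probs) / len(probs) for c in range(num_classes)
    ]
    return max(range(num_classes), key=mean_probs.__getitem__)


# Toy logits: three views favor class 0, one transient failure favors class 2.
views = [
    [2.0, 0.1, 0.3],
    [1.8, 0.2, 0.4],
    [0.1, 0.2, 1.5],  # transient misclassification in a single view
    [2.2, 0.0, 0.1],
]
predicted = aggregate_set_prediction(views)
```

Classifying the third view on its own would select class 2, whereas pooling over the whole set selects class 0, which is the kind of transient failure the abstract says the set classifier is meant to mitigate.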

Updated: 2024-11-12