ModeT: Learning Deformable Image Registration via Motion Decomposition Transformer
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2023-06-09. DOI: arxiv-2306.05688. Haiqiao Wang, Dong Ni, Yi Wang
Transformer structures have been widely used in computer vision and have
recently made an impact in medical image registration. However, most
registration networks employ the Transformer in a straightforward way: they
merely use the attention mechanism to boost feature learning, as segmentation
networks do, without a design sufficiently adapted to the registration task.
In this paper, we propose a novel motion
decomposition Transformer (ModeT) to explicitly model multiple motion
modalities by fully exploiting the intrinsic capability of the Transformer
structure for deformation estimation. The proposed ModeT naturally transforms
the multi-head neighborhood attention relationship into the multi-coordinate
relationship to model multiple motion modes. Then the competitive weighting
module (CWM) fuses multiple deformation sub-fields to generate the resulting
deformation field. Extensive experiments on two public brain magnetic resonance
imaging (MRI) datasets show that our method outperforms current
state-of-the-art registration networks and Transformers, demonstrating the
potential of our ModeT for the challenging non-rigid deformation estimation
problem. The benchmarks and our code are publicly available at
https://github.com/ZAX130/SmileCode.
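The core decomposition idea described in the abstract can be sketched in a few lines: each attention head's neighborhood attention weights, multiplied by the coordinate offsets of the attended neighbors, yield one expected-displacement sub-field per head, and a competitive (softmax) weighting over heads fuses the sub-fields into a single deformation field. The following is an illustrative numpy sketch, not the authors' implementation (which is at the GitHub link above); the function names `head_subfield` and `competitive_fuse` and the per-head confidence scores are assumptions for illustration.

```python
import numpy as np

def head_subfield(attn, offsets):
    """One deformation sub-field from one attention head.

    attn:    (N, K) attention weights over K neighborhood positions,
             each row summing to 1 (one row per voxel).
    offsets: (K, D) coordinate displacement of each neighbor.
    Returns the expected displacement per voxel, shape (N, D).
    """
    return attn @ offsets

def competitive_fuse(subfields, scores):
    """Fuse per-head sub-fields with a competitive (softmax) weighting.

    subfields: (n_heads, N, D) one sub-field per head.
    scores:    (n_heads, N) per-voxel confidence logits (assumed here;
               the paper's CWM learns its weighting from features).
    """
    w = np.exp(scores - scores.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)          # softmax over heads
    return (w[..., None] * subfields).sum(axis=0)

# Toy 2-D example: a 3x3 neighborhood around each voxel.
offsets = np.array([[dy, dx] for dy in (-1, 0, 1)
                             for dx in (-1, 0, 1)], dtype=float)  # (9, 2)
attn = np.zeros((4, 9))
attn[:, 8] = 1.0                    # every voxel attends to neighbor (+1, +1)
sf = head_subfield(attn, offsets)   # each row is the displacement [1., 1.]

# Two heads proposing opposite motions; head 0 wins the competition.
fused = competitive_fuse(np.stack([sf, -sf]),
                         np.array([[10.0] * 4, [0.0] * 4]))
```

Because each row of `attn` is a convex combination, every sub-field displacement lies inside the convex hull of the neighborhood offsets, which is what bounds the per-level deformation in this kind of scheme.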
Updated: 2023-06-12