Transforming Image Super-Resolution: A ConvFormer-Based Efficient Approach
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2024-10-18, DOI: 10.1109/tip.2024.3477350
Gang Wu, Junjun Jiang, Junpeng Jiang, Xianming Liu
Recent progress in single-image super-resolution (SISR) has achieved remarkable performance, yet the computational cost of these methods remains a challenge for deployment on resource-constrained devices. In particular, transformer-based methods, which leverage self-attention mechanisms, have produced significant breakthroughs but also introduce substantial computational overhead. To tackle this issue, we introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR), an effective and efficient solution for lightweight image super-resolution. The proposed method inherits the advantages of both convolution-based and transformer-based approaches. Specifically, CFSR uses large-kernel convolutions as a feature mixer in place of the self-attention module, efficiently modeling long-range dependencies and large receptive fields with minimal computational overhead. Furthermore, we propose an edge-preserving feed-forward network (EFN) that performs local feature aggregation while effectively preserving high-frequency information. Extensive experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance compared to existing lightweight SR methods. Benchmarked against state-of-the-art methods such as ShuffleMixer, the proposed CFSR achieves a gain of 0.39 dB on the Urban100 dataset for the ×2 super-resolution task while requiring 26% fewer parameters and 31% fewer FLOPs. The code and pre-trained models are available at https://github.com/Aitical/CFSR.
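As a rough illustration of the architecture the abstract describes, the following is a minimal PyTorch sketch of a ConvFormer-style block: a large-kernel depthwise convolution serves as the feature mixer in place of self-attention, followed by a convolutional feed-forward stage standing in for the edge-preserving feed-forward network (EFN). All module names, kernel sizes, and normalization choices here are illustrative assumptions rather than the authors' implementation; refer to the official repository above for the exact design.

```python
# Minimal sketch of a ConvFormer-style block (illustrative only, not the
# authors' code; see https://github.com/Aitical/CFSR for the official release).
import torch
import torch.nn as nn


class LargeKernelMixer(nn.Module):
    """Large-kernel depthwise convolution used as the feature mixer
    replacing self-attention. Kernel size 11 is an assumed value."""
    def __init__(self, dim: int, kernel_size: int = 11):
        super().__init__()
        self.dw = nn.Conv2d(dim, dim, kernel_size,
                            padding=kernel_size // 2, groups=dim)
        self.pw = nn.Conv2d(dim, dim, 1)  # pointwise channel mixing

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pw(self.dw(x))


class ConvFFN(nn.Module):
    """Convolutional feed-forward stage; a simple stand-in for the paper's
    edge-preserving feed-forward network (EFN)."""
    def __init__(self, dim: int, expansion: int = 2):
        super().__init__()
        hidden = dim * expansion
        self.net = nn.Sequential(
            nn.Conv2d(dim, hidden, 1),
            nn.GELU(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden),  # local aggregation
            nn.Conv2d(hidden, dim, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class ConvFormerBlock(nn.Module):
    """Transformer-style block: norm -> conv mixer -> norm -> FFN,
    each with a residual connection."""
    def __init__(self, dim: int, kernel_size: int = 11):
        super().__init__()
        self.norm1 = nn.GroupNorm(1, dim)  # channel-wise LayerNorm for 2D maps
        self.mixer = LargeKernelMixer(dim, kernel_size)
        self.norm2 = nn.GroupNorm(1, dim)
        self.ffn = ConvFFN(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.mixer(self.norm1(x))  # long-range feature mixing
        x = x + self.ffn(self.norm2(x))    # local aggregation
        return x


if __name__ == "__main__":
    block = ConvFormerBlock(dim=48)
    out = block(torch.randn(1, 48, 64, 64))
    print(out.shape)  # torch.Size([1, 48, 64, 64])
```

In this sketch the depthwise large-kernel convolution provides the wide receptive field that self-attention would otherwise supply, at a cost that grows linearly with spatial size rather than quadratically, which is the efficiency argument the abstract makes.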
Updated: 2024-10-18