Exploiting Diffusion Prior for Real-World Image Super-Resolution, International Journal of Computer Vision


International Journal of Computer Vision (IF 11.6), Pub Date: 2024-07-11, DOI: 10.1007/s11263-024-02168-7
Jianyi Wang, Zongsheng Yue, Shangchen Zhou, Kelvin C. K. Chan, Chen Change Loy

We present a novel approach to leverage prior knowledge encapsulated in pre-trained text-to-image diffusion models for blind super-resolution. Specifically, by employing our time-aware encoder, we can achieve promising restoration results without altering the pre-trained synthesis model, thereby preserving the generative prior and minimizing training cost. To remedy the loss of fidelity caused by the inherent stochasticity of diffusion models, we employ a controllable feature wrapping module that allows users to balance quality and fidelity by simply adjusting a scalar value during the inference process. Moreover, we develop a progressive aggregation sampling strategy to overcome the fixed-size constraints of pre-trained diffusion models, enabling adaptation to resolutions of any size. A comprehensive evaluation of our method using both synthetic and real-world benchmarks demonstrates its superiority over current state-of-the-art approaches. Code and models are available at https://github.com/IceClear/StableSR.
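The abstract does not spell out how the sampling strategy handles arbitrary resolutions. One common way to apply a fixed-size model to a larger input is to run it on overlapping tiles and blend the overlaps with center-weighted averaging. The sketch below illustrates only that general idea and is not the authors' implementation: `process_tile` is a hypothetical stand-in for one pass of a fixed-size model, and the tile size, stride, and Gaussian width are assumed values chosen for illustration.

```python
import math


def tile_starts(length, tile, stride):
    """Start offsets of overlapping tiles that together cover [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the border
    return starts


def gaussian_weights(tile, sigma_frac=0.25):
    """2D weight map peaking at the tile center (sigma_frac is an assumed value)."""
    center = (tile - 1) / 2.0
    sigma = max(tile * sigma_frac, 1e-6)
    g = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(tile)]
    return [[gy * gx for gx in g] for gy in g]


def aggregate_tiles(image, process_tile, tile=4, stride=2):
    """Apply a fixed-size function to overlapping tiles and blend the results.

    image: 2D list of floats; process_tile: maps a tile x tile patch to a
    tile x tile patch (hypothetical stand-in for the fixed-size model).
    """
    h, w = len(image), len(image[0])
    if h < tile or w < tile:
        raise ValueError("image must be at least one tile in each dimension")
    if stride > tile:
        raise ValueError("stride must not exceed tile size, or gaps appear")
    weights = gaussian_weights(tile)
    acc = [[0.0] * w for _ in range(h)]   # weighted sum of tile outputs
    norm = [[0.0] * w for _ in range(h)]  # sum of weights per pixel
    for y0 in tile_starts(h, tile, stride):
        for x0 in tile_starts(w, tile, stride):
            patch = [row[x0:x0 + tile] for row in image[y0:y0 + tile]]
            out = process_tile(patch)
            for dy in range(tile):
                for dx in range(tile):
                    wgt = weights[dy][dx]
                    acc[y0 + dy][x0 + dx] += wgt * out[dy][dx]
                    norm[y0 + dy][x0 + dx] += wgt
    # Normalize so overlapping regions become a weighted average.
    return [[acc[y][x] / norm[y][x] for x in range(w)] for y in range(h)]
```

In a diffusion setting, a blend of this kind would presumably be repeated at every denoising step rather than once on the final image, so that neighboring tiles stay consistent throughout sampling. The released code at the repository linked above is the authoritative reference for the actual procedure.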


Updated: 2024-07-12
