UPR-Net: A Unified Pyramid Recurrent Network for Video Frame Interpolation
International Journal of Computer Vision (IF 11.6). Pub Date: 2024-07-17. DOI: 10.1007/s11263-024-02164-x
Xin Jin, Longhai Wu, Jie Chen, Youxin Chen, Jayoon Koo, Cheul-Hee Hahm, Zhao-Min Chen

Flow-guided synthesis provides a popular framework for video frame interpolation: optical flow is first estimated to warp the input frames, and the intermediate frame is then synthesized from the warped representations. Within this framework, optical flow is typically estimated from coarse to fine by a pyramid network, but the intermediate frame is commonly synthesized in a single pass, missing the opportunity to refine possibly imperfect synthesis in high-resolution and large-motion cases. While cascading several synthesis networks is a natural idea, it is nontrivial to unify iterative estimation of both optical flow and the intermediate frame into a compact, flexible, and general framework. In this paper, we present UPR-Net, a novel Unified Pyramid Recurrent Network for frame interpolation. Cast in a flexible pyramid framework, UPR-Net exploits lightweight recurrent modules for both bi-directional flow estimation and intermediate frame synthesis. At each pyramid level, it leverages the estimated bi-directional flow to generate forward-warped representations for frame synthesis; across pyramid levels, it enables iterative refinement of both the optical flow and the intermediate frame. We show that our iterative synthesis significantly improves interpolation robustness on large-motion cases, and that the recurrent module design enables flexible resolution-aware adaptation at test time. When trained on low-resolution data, UPR-Net achieves excellent performance on both low- and high-resolution benchmarks. Despite being extremely lightweight (1.7M parameters), the base version of UPR-Net competes favorably with many methods that rely on much heavier architectures. Code and trained models are publicly available at: https://github.com/srcn-ivl/UPR-Net.
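The coarse-to-fine recurrence the abstract describes can be sketched as a loop that reuses the same (weight-shared) flow-estimation and synthesis modules at every pyramid level, upsampling and rescaling the flow between levels and re-synthesizing the intermediate frame each time. The sketch below is a minimal structural illustration on 1D "frames", not the actual UPR-Net: `estimate_flow` and `synthesize` are hypothetical placeholder stubs standing in for the paper's recurrent bi-directional flow module and forward-warping synthesis module.

```python
def downsample2x(x):
    # naive 2x average pooling over a 1D "frame" (illustrative stand-in)
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - len(x) % 2, 2)]

def upsample2x(x):
    # nearest-neighbour 2x upsampling
    return [v for v in x for _ in range(2)]

def estimate_flow(flow, f0, f1):
    # placeholder for the shared recurrent flow estimator; a real module
    # would refine the upsampled coarse flow from features of f0 and f1
    return flow

def synthesize(f0, f1, flow):
    # placeholder for the shared synthesis module; a real module would
    # blend forward-warped representations guided by the flow
    return [(a + b) / 2 for a, b in zip(f0, f1)]

def pyramid_recurrent_interpolate(frame0, frame1, num_levels=3):
    # build image pyramids, finest level first
    pyr0, pyr1 = [frame0], [frame1]
    for _ in range(num_levels - 1):
        pyr0.append(downsample2x(pyr0[-1]))
        pyr1.append(downsample2x(pyr1[-1]))
    flow = [0.0] * len(pyr0[-1])  # zero-initialized flow at the coarsest level
    mid = None
    for lvl in reversed(range(num_levels)):  # coarse-to-fine recurrence
        f0, f1 = pyr0[lvl], pyr1[lvl]
        if len(flow) != len(f0):
            # upsample the coarse flow and rescale its magnitudes
            flow = [2.0 * v for v in upsample2x(flow)]
        flow = estimate_flow(flow, f0, f1)  # refine flow at this level
        mid = synthesize(f0, f1, flow)      # re-synthesize the intermediate frame
    return mid
```

Because the two modules are shared across levels, the number of pyramid levels can be increased at test time for higher resolutions or larger motions without retraining, which is the resolution-aware adaptation the abstract refers to.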



