A Survey on Video Diffusion Models
ACM Computing Surveys (IF 23.8). Pub Date: 2024-09-18. DOI: 10.1145/3696415. Zhen Xing, Qijun Feng, Haoran Chen, Qi Dai, Han Hu, Hang Xu, Zuxuan Wu, Yu-Gang Jiang
The recent wave of AI-generated content (AIGC) has witnessed substantial success in computer vision, with the diffusion model playing a crucial role in this achievement. Due to their impressive generative capabilities, diffusion models are gradually superseding methods based on GANs and auto-regressive Transformers, demonstrating exceptional performance not only in image generation and editing, but also in the realm of video-related research. However, existing surveys mainly focus on diffusion models in the context of image generation, with few up-to-date reviews on their application in the video domain. To address this gap, this paper presents a comprehensive review of video diffusion models in the AIGC era. Specifically, we begin with a concise introduction to the fundamentals and evolution of diffusion models. Subsequently, we present an overview of research on diffusion models in the video domain, categorizing the work into three key areas: video generation, video editing, and other video understanding tasks. We conduct a thorough review of the literature in these three key areas, including further categorization and practical contributions in the field. Finally, we discuss the challenges faced by research in this domain and outline potential future developmental trends. A comprehensive list of video diffusion models studied in this survey is available at https://github.com/ChenHsing/Awesome-Video-Diffusion-Models.
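The "fundamentals of diffusion models" that the survey opens with boil down to a forward noising process and a learned reverse denoising process. As a minimal, illustrative sketch (not taken from the survey itself), the DDPM forward process admits a closed form for sampling a noised sample x_t directly from a clean sample x_0; the function name, the toy beta schedule, and the 4x4 "frame" below are assumptions for illustration:

```python
import numpy as np

def ddpm_forward(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) in one shot using the DDPM closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I).
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]      # cumulative signal retention up to step t
    eps = rng.standard_normal(x0.shape)    # Gaussian noise added at this step
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps

# Toy usage: a linear beta schedule over 1000 steps (as in the original DDPM paper).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4))           # stand-in for an image or video frame
xt, eps = ddpm_forward(x0, t=T - 1, betas=betas, rng=rng)
# At the final step alpha_bar is near zero, so x_t is close to pure noise;
# the generative model is trained to invert this corruption step by step.
```

Video diffusion models extend this same recipe to a temporal stack of frames, which is where the architectural questions the survey categorizes (generation, editing, understanding) come in.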
Updated: 2024-09-18