Image colorization: A survey and dataset
Information Fusion (IF 14.7), Pub Date: 2024-09-30, DOI: 10.1016/j.inffus.2024.102720
Saeed Anwar, Muhammad Tahir, Chongyi Li, Ajmal Mian, Fahad Shahbaz Khan, Abdul Wahab Muzaffar
Image colorization estimates RGB colors for grayscale images or video frames to improve their aesthetic and perceptual quality. Over the last decade, deep learning techniques for image colorization have significantly progressed, necessitating a systematic survey and benchmarking of these techniques. This article presents a comprehensive survey of recent state-of-the-art deep learning-based image colorization techniques, describing their fundamental block architectures, inputs, optimizers, loss functions, training protocols, training data, etc. It categorizes the existing colorization techniques into seven classes and discusses important factors governing their performance, such as benchmark datasets and evaluation metrics. We highlight the limitations of existing datasets and introduce a new dataset specific to colorization. We perform an extensive experimental evaluation of existing image colorization methods using both existing datasets and our proposed one. Finally, we discuss the limitations of existing methods and recommend possible solutions and future research directions for this rapidly evolving topic of deep image colorization. The dataset and codes for evaluation are publicly available at https://github.com/saeed-anwar/ColorSurvey .
Updated: 2024-09-30