Learning Content-Weighted Pseudocylindrical Representation for 360° Image Compression
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2024-10-17, DOI: 10.1109/tip.2024.3477356
Mu Li, Youneng Bao, Xiaohang Sui, Jinxing Li, Guangming Lu, Yong Xu
Learned 360° image compression methods using equirectangular projection (ERP) often confront a non-uniform sampling issue, inherent to sphere-to-rectangle projection. While uniformly or nearly uniformly sampled representations, along with their corresponding convolution operations, have been proposed to mitigate this issue, these methods often concentrate solely on uniform sampling rates and thus neglect the content of the image. In this paper, we argue that different contents within 360° images have varying significance and advocate for the adoption of a content-adaptive parametric representation in 360° image compression, which takes into account both the content and the sampling rate. We first introduce the parametric pseudocylindrical representation and the corresponding convolution operation, upon which we build a learned 360° image codec. Then, we model the hyperparameters of the representation as the output of a network that takes the image's content and its spherical coordinates as input. We treat the optimization of hyperparameters for different 360° images as distinct compression tasks and propose a meta-learning algorithm to jointly optimize the codec and the metaknowledge, i.e., the hyperparameter estimation network. A significant challenge is the lack of a direct derivative from the compression loss to the hyperparameter network. To address this, we present a novel method to relax the rate-distortion loss as a function of the hyperparameters, enabling gradient-based optimization of the metaknowledge. Experimental results on omnidirectional images demonstrate that our method achieves state-of-the-art performance and superior visual quality.
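To make the non-uniform sampling issue concrete, the sketch below computes how strongly each ERP row is oversampled relative to the equator, and the row widths of a fixed sinusoidal tiling, which is one simple pseudocylindrical layout. This is an illustration only: the abstract does not specify the paper's parametric, content-weighted representation, so the sinusoidal rule and the function names `erp_row_oversampling` and `sinusoidal_row_widths` are assumptions made here, not the authors' method.

```python
import math


def row_latitude(i, height):
    """Latitude (radians) of the centre of ERP row i, from +pi/2 (top) to -pi/2 (bottom)."""
    return (0.5 - (i + 0.5) / height) * math.pi


def erp_row_oversampling(height):
    """Horizontal oversampling factor of each ERP row relative to the equator.

    Every ERP row has the same number of pixels, but the circle of latitude it
    covers shrinks by cos(latitude), so the factor is 1 / cos(latitude).
    """
    return [1.0 / math.cos(row_latitude(i, height)) for i in range(height)]


def sinusoidal_row_widths(height, width):
    """Row widths of a fixed sinusoidal tiling (hypothetical, not content-adaptive).

    Each row keeps a number of pixels proportional to cos(latitude), with at
    least one pixel per row, so the sampling density is roughly uniform.
    """
    return [
        max(1, round(width * math.cos(row_latitude(i, height))))
        for i in range(height)
    ]


if __name__ == "__main__":
    h, w = 4, 8
    print(erp_row_oversampling(h))
    print(sinusoidal_row_widths(h, w))
```

A content-adaptive scheme in the spirit of the abstract would replace the fixed cos(latitude) rule with per-row widths predicted from the image content and spherical coordinates; how that prediction is made is not described in the abstract and is not sketched here.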
Updated: 2024-10-17