-
Exploring Multi-modal Spatial-Temporal Contexts for High-performance RGB-T Tracking IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-19 Tianlu Zhang, Qiang Jiao, Qiang Zhang, Jungong Han
-
Unsupervised Domain Adaptation via Domain-Adaptive Diffusion IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-15 Duo Peng, Qiuhong Ke, ArulMurugan Ambikapathi, Yasin Yazici, Yinjie Lei, Jun Liu
-
OTAMatch: Optimal Transport Assignment with PseudoNCE for Semi-supervised Learning IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-15 Jinjin Zhang, Junjie Liu, Debang Li, Qiuyu Huang, Jiaxin Chen, Di Huang
-
Enhanced Long-Tailed Recognition with Contrastive CutMix Augmentation IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-15 Haolin Pan, Yong Guo, Mianjie Yu, Jian Chen
-
HAFormer: Unleashing the Power of Hierarchy-Aware Features for Lightweight Semantic Segmentation IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-15 Guoan Xu, Wenjing Jia, Tao Wu, Ligeng Chen, Guangwei Gao
-
Remote Sensing Change Detection With Bitemporal and Differential Feature Interactive Perception IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-15 Hao Chang, Peijin Wang, Wenhui Diao, Guangluan Xu, Xian Sun
-
Cayley Rotation Averaging: Multiple Camera Averaging Under the Cayley Framework IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-11 Qiulei Dong, Shuang Deng, Yuzhen Liu
Rotation averaging, which aims to calculate the absolute rotations of a set of cameras from a redundant set of their relative rotations, is an important and challenging topic arising in the study of structure from motion. A central problem in rotation averaging is how to alleviate the influence of noise and outliers. Addressing this problem, we investigate rotation averaging under the Cayley framework
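For readers unfamiliar with the Cayley framework mentioned in this abstract, the following is a minimal sketch of the standard Cayley transform, which maps a 3-vector (through its skew-symmetric matrix) to a rotation matrix and back. The abstract does not state the exact parameterization or convention the paper uses, so the function names and the sign convention here are assumptions for illustration only.

```python
import numpy as np

def skew(w):
    """Return the 3x3 skew-symmetric matrix of a 3-vector w."""
    x, y, z = w
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def cayley_to_rotation(w):
    """Standard Cayley transform: R = (I - S)^{-1} (I + S), with S = skew(w).

    Assumed convention; the paper's own definition may differ.
    """
    S = skew(w)
    I = np.eye(3)
    return np.linalg.solve(I - S, I + S)

def rotation_to_cayley(R):
    """Inverse map: S = (R + I)^{-1} (R - I); undefined for 180-degree rotations."""
    I = np.eye(3)
    S = np.linalg.solve(R + I, R - I)
    return np.array([S[2, 1], S[0, 2], S[1, 0]])
```

The inverse map breaks down when R + I is singular, i.e. for rotations by 180 degrees, which is a known limitation of this parameterization.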
-
Scalable Deep Color Quantization: a Cluster Imitation Approach IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-11 Yunzhong Hou, Stephen Gould, Liang Zheng
-
Learning Virtual View Selection for 3D Scene Semantic Segmentation IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-10 Tai-Jiang Mu, Ming-Yuan Shen, Yu-Kun Lai, Shi-Min Hu
2D-3D joint learning is essential and effective for fundamental 3D vision tasks, such as 3D semantic segmentation, due to the complementary information these two visual modalities contain. Most current 3D scene semantic segmentation methods process 2D images “as they are”, i.e., only real captured 2D images are used. However, such captured 2D images may be redundant, with abundant occlusion and/or
-
Enhancing Low-Light Light Field Images With a Deep Compensation Unfolding Network IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-04 Xianqiang Lyu, Junhui Hou
This paper presents a novel and interpretable end-to-end learning framework, called the deep compensation unfolding network (DCUNet), for restoring light field (LF) images captured under low-light conditions. DCUNet is designed with a multi-stage architecture that mimics the optimization process of solving an inverse imaging problem in a data-driven fashion. The framework uses the intermediate enhanced
-
Spectral Embedding Fusion for Incomplete Multiview Clustering IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-04 Jie Chen, Yingke Chen, Zhu Wang, Haixian Zhang, Xi Peng
Incomplete multiview clustering (IMVC) aims to reveal the underlying structure of incomplete multiview data by partitioning data samples into clusters. Several graph-based methods exhibit a strong ability to explore high-order information among multiple views using low-rank tensor learning. However, spectral embedding fusion of multiple views is ignored in low-rank tensor learning. In addition, addressing
-
One-Shot Any-Scene Crowd Counting With Local-to-Global Guidance IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-04 Jiwei Chen, Zengfu Wang
-
Feature Mixture on Pre-Trained Model for Few-Shot Learning IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-02 Shuo Wang, Jinda Lu, Haiyang Xu, Yanbin Hao, Xiangnan He
Few-shot learning (FSL) aims at recognizing a novel object under limited training samples. A robust feature extractor (backbone) can significantly improve the recognition performance of the FSL model. However, training an effective backbone is a challenging issue since 1) designing and validating structures of backbones are time-consuming and expensive processes, and 2) a backbone trained on the known
-
Dynamic Spatio-Temporal Graph Reasoning for VideoQA With Self-Supervised Event Recognition IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-02 Jie Nie, Xin Wang, Runze Hou, Guohao Li, Hong Chen, Wenwu Zhu
Video question answering (VideoQA) requires the ability of comprehensively understanding visual contents in videos. Existing VideoQA models mainly focus on scenarios involving a single event with simple object interactions and leave event-centric scenarios involving multiple events with dynamically complex object interactions largely unexplored. These conventional VideoQA models are usually based on
-
Multiple Riemannian Kernel Hashing for Large-scale Image Set Classification and Retrieval IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-02 Xiaobo Shen, Wei Wu, Xiaxin Wang, Yuhui Zheng
-
Learning Kernel-Modulated Neural Representation for Efficient Light Field Compression IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-01 Jinglei Shi, Yihong Xu, Christine Guillemot
Light fields capture 3D scene information by recording light rays emitted from a scene at various orientations. They offer a more immersive perception, compared with classic 2D images, but at the cost of huge data volumes. In this paper, we design a compact neural network representation for the light field compression task. In the same vein as the deep image prior, the neural network takes randomly
-
Learning to Discover Knowledge: A Weakly-Supervised Partial Domain Adaptation Approach IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-07-01 Mengcheng Lan, Min Meng, Jun Yu, Jigang Wu
Domain adaptation has shown appealing performance by leveraging knowledge from a source domain with rich annotations. However, for a specific target task, it is cumbersome to collect related and high-quality source domains. In real-world scenarios, large-scale datasets corrupted with noisy labels are easy to collect, stimulating a great demand for automatic recognition in a generalized setting, i.e
-
5-D Epanechnikov Mixture-of-Experts in Light Field Image Compression IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-28 Boning Liu, Yan Zhao, Xiaomeng Jiang, Xingguang Ji, Shigang Wang, Yebin Liu, Jian Wei
In this study, we propose a modeling-based compression approach for dense/lenslet light field images captured by Plenoptic 2.0 with square microlenses. This method employs the 5-D Epanechnikov Kernel (5-D EK) and its associated theories. Owing to the limitations of modeling larger image blocks using the Epanechnikov Mixture Regression (EMR), a 5-D Epanechnikov Mixture-of-Experts using Gaussian Initialization
-
Single-Subject Deep-Learning Image Reconstruction With a Neural Optimization Transfer Algorithm for PET-Enabled Dual-Energy CT Imaging IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-28 Siqi Li, Yansong Zhu, Benjamin A. Spencer, Guobao Wang
Combining dual-energy computed tomography (DECT) with positron emission tomography (PET) offers many potential clinical applications but typically requires expensive hardware upgrades or increases radiation doses on PET/CT scanners due to an extra X-ray CT scan. The recent PET-enabled DECT method allows DECT imaging on PET/CT without requiring a second X-ray CT scan. It combines the already existing
-
Image Quality Assessment: Measuring Perceptual Degradation via Distribution Measures in Deep Feature Spaces IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-28 Xingran Liao, Xuekai Wei, Mingliang Zhou, Zhengguo Li, Sam Kwong
This study aims to develop advanced and training-free full-reference image quality assessment (FR-IQA) models based on deep neural networks. Specifically, we investigate measures that allow us to perceptually compare deep network features and reveal their underlying factors. We find that distribution measures enjoy advanced perceptual awareness and test the Wasserstein distance (WSD), Jensen-Shannon
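As a rough illustration of what a distribution measure between two sets of feature values looks like, here is a sketch of the 1-D empirical Wasserstein-1 distance for two equal-size samples. The abstract does not say how the paper applies WSD to deep features, so treating each feature map as a flat sample of scalars, and the function name `wasserstein_1d`, are assumptions.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical 1-D Wasserstein-1 distance between two equal-size samples.

    For equal-size samples with uniform weights, W1 equals the mean absolute
    difference between the sorted values.
    """
    a = np.sort(np.asarray(a, dtype=float).ravel())
    b = np.sort(np.asarray(b, dtype=float).ravel())
    if a.size != b.size:
        raise ValueError("this sketch only handles equal-size samples")
    return float(np.mean(np.abs(a - b)))
```

Unlike a pointwise difference, this measure ignores where each value sits in the feature map and compares only the two value distributions.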
-
Siamese-DETR for Generic Multi-Object Tracking IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-25 Qiankun Liu, Yichen Li, Yuqi Jiang, Ying Fu
The ability to detect and track dynamic objects in different scenes is fundamental to real-world applications, e.g., autonomous driving and robot navigation. However, traditional Multi-Object Tracking (MOT) is limited to tracking objects belonging to pre-defined closed-set categories. Recently, Generic MOT (GMOT) has been proposed to track objects of interest beyond pre-defined categories and it can
-
Self-Supervised Representation Learning With Spatial-Temporal Consistency for Sign Language Recognition IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-25 Weichao Zhao, Wengang Zhou, Hezhen Hu, Min Wang, Houqiang Li
Recently, there have been efforts to improve the performance in sign language recognition by designing self-supervised learning methods. However, these methods capture limited information from sign pose data in a frame-wise learning manner, leading to sub-optimal solutions. To this end, we propose a simple yet effective self-supervised contrastive learning framework to excavate rich context via spatial-temporal
-
PIG: Prompt Images Guidance for Night-Time Scene Parsing IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-24 Zhifeng Xie, Rui Qiu, Sen Wang, Xin Tan, Yuan Xie, Lizhuang Ma
Night-time scene parsing aims to extract pixel-level semantic information in night images, aiding downstream tasks in understanding scene object distribution. Due to limited labeled night image datasets, unsupervised domain adaptation (UDA) has become the predominant method for studying night scenes. UDA typically relies on paired day-night images to guide adaptation, but this approach hampers
-
Causality-Enhanced Multiple Instance Learning With Graph Convolutional Networks for Parkinsonian Freezing-of-Gait Assessment IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-24 Rui Guo, Zheng Xie, Chencheng Zhang, Xiaohua Qian
Freezing of gait (FoG) is a common disabling symptom of Parkinson’s disease (PD). It is clinically characterized by sudden and transient walking interruptions for specific human body parts, and it presents the localization in time and space. Due to the difficulty in extracting global fine-grained features from lengthy videos, developing an automated five-point FoG scoring system is quite challenging
-
Generalization Beyond Feature Alignment: Concept Activation-Guided Contrastive Learning IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-24 Yibing Liu, Chris Xing Tian, Haoliang Li, Shiqi Wang
-
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-24 Zhenyu Li, Xuyang Wang, Xianming Liu, Junjun Jiang
Monocular depth estimation (MDE) is a fundamental task in computer vision and has drawn increasing attention. Recently, some methods reformulate it as a classification-regression task to boost the model performance, where continuous depth is estimated via a linear combination of predicted probability distributions and discrete bins. In this paper, we present a novel framework called BinsFormer, tailored
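The classification-regression formulation described in this abstract, in which depth is a linear combination of predicted probabilities and discrete bins, can be sketched as follows. This is only an illustration of that general idea, not BinsFormer itself: the softmax over per-pixel logits, the use of bin centers, and the names `depth_from_bins`, `logits`, and `bin_edges` are all assumptions.

```python
import numpy as np

def depth_from_bins(logits, bin_edges):
    """Per-pixel depth as the probability-weighted sum of bin centers.

    logits:    array of shape (H, W, N), one score per depth bin (assumed input)
    bin_edges: array of shape (N + 1,), increasing depth-bin boundaries
    """
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    # Numerically stable softmax over the bin axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Linear combination of probabilities and bin centers -> (H, W).
    return probs @ centers
```

Because the output is an expectation over bins, it stays continuous even though the bins themselves are discrete.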
-
Incrementally Adapting Pretrained Model Using Network Prior for Multi-Focus Image Fusion IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-21 Xingyu Hu, Junjun Jiang, Chenyang Wang, Xianming Liu, Jiayi Ma
Multi-focus image fusion can fuse the clear parts of two or more source images captured at the same scene with different focal lengths into an all-in-focus image. On the one hand, previous supervised learning-based multi-focus image fusion methods relying on synthetic datasets have a clear distribution shift with real scenarios. On the other hand, unsupervised learning-based multi-focus image fusion
-
L₀ Gradient-Regularization and Scale Space Representation Model for Cartoon and Texture Decomposition IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-20 Huan Pan, You-Wei Wen, Ya Huang
In this paper, we consider decomposing an image into its cartoon and texture components. Traditional methods, which mainly rely on the gradient amplitude of images to distinguish between these components, often show limitations in decomposing small-scale, high-contrast texture patterns and large-scale, low-contrast structural components. Specifically, these methods tend to decompose the former to the
-
Multi-Condition Latent Diffusion Network for Scene-Aware Neural Human Motion Prediction IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-20 Xuehao Gao, Yang Yang, Yang Wu, Shaoyi Du, Guo-Jun Qi
Inferring 3D human motion is fundamental in many applications, including understanding human activity and analyzing one’s intention. While many fruitful efforts have been made in human motion prediction, most approaches focus on pose-driven prediction and infer human motion in isolation from the contextual environment, thus leaving the body location movement in the scene behind. However, real-world
-
Semi-Supervised Learning With Heterogeneous Distribution Consistency for Visible Infrared Person Re-Identification IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-20 Ziyu Wei, Xi Yang, Nannan Wang, Xinbo Gao
Visible infrared person re-identification (VI-ReID) poses considerable challenges because of the modality gaps between the person images captured by daytime visible cameras and nighttime infrared cameras. Several fully-supervised VI-ReID methods have improved the performance with extensive labeled heterogeneous images. However, the identity of the person is difficult to obtain in real-world situations
-
SemiRS-COC: Semi-Supervised Classification for Complex Remote Sensing Scenes With Cross-Object Consistency IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-19 Qiang Liu, Jun Yue, Yang Kuang, Weiying Xie, Leyuan Fang
Semi-supervised learning (SSL), which aims to learn with limited labeled data and massive amounts of unlabeled data, offers a promising approach to exploit the massive amounts of satellite Earth observation images. The fundamental concept underlying most state-of-the-art SSL methods involves generating pseudo-labels for unlabeled data based on image-level predictions. However, complex remote sensing
-
A Single-Frame Deflectometry Method for Online Inspection of Light-Transmitting Components IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-19 Ning Yan, Dongxue Wang, Lei Liu, Zhuotong Li, Shuaipeng Yuan, Xiaodong Zhang
Transparent materials are widely used in industrial applications, such as construction, transportation, and optics. However, the complex optical properties of these materials make it difficult to achieve precise surface form measurements, especially for bulk surface form inspection in industrial environments. Traditional structured light-based measurement methods often struggle with suboptimal signal-to-noise
-
FF-LPD: A Real-Time Frame-by-Frame License Plate Detector With Knowledge Distillation and Feature Propagation IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-19 Haoxuan Ding, Junyu Gao, Yuan Yuan, Qi Wang
With the increasing availability of cameras in vehicles, obtaining license plate (LP) information via on-board cameras has become feasible in traffic scenarios. LPs play a pivotal role in vehicle identification, making automatic LP detection (ALPD) a crucial area within traffic analysis. Recent advancements in deep learning have spurred a surge of studies in ALPD. However, the computational limitations
-
Learnable Feature Augmentation Framework for Temporal Action Localization IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-18 Yepeng Tang, Weining Wang, Chunjie Zhang, Jing Liu, Yao Zhao
Temporal action localization (TAL) has drawn much attention in recent years; however, the performance of previous methods is still far from satisfactory due to the lack of annotated untrimmed video data. To deal with this issue, we propose to improve the utilization of current data through feature augmentation. Given an input video, we first extract video features with pre-trained video encoders, and
-
Learning Spherical Radiance Field for Efficient 360° Unbounded Novel View Synthesis IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-10 Minglin Chen, Longguang Wang, Yinjie Lei, Zilong Dong, Yulan Guo
Novel view synthesis aims at rendering any posed images from sparse observations of the scene. Recently, neural radiance fields (NeRF) have demonstrated their effectiveness in synthesizing novel views of a bounded scene. However, most existing methods cannot be directly extended to 360° unbounded scenes where the camera orientations and scene depths are unconstrained with large variations. In this
-
REACT: Remainder Adaptive Compensation for Domain Adaptive Object Detection IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-10 Haochen Li, Rui Zhang, Hantao Yao, Xin Zhang, Yifan Hao, Xinkai Song, Ling Li
Domain adaptive object detection (DAOD) aims to infer a robust detector on the target domain with the labelled source datasets. Recent studies utilize a feature extractor shared on the source and target domains to capture the domain-invariant features and the task-relevant information with both feature-alignment constraint and source annotations. However, the feature extractor shared across domains
-
Multi-Label Adversarial Attack With New Measures and Self-Paced Constraint Weighting IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-14 Fengguang Su, Ou Wu, Weiyao Zhu
An adversarial attack is typically implemented by solving a constrained optimization problem. In top-k adversarial attacks implementation for multi-label learning, the attack failure degree (AFD) and attack cost (AC) of a possible attack are major concerns. According to our experimental and theoretical analysis, existing methods are negatively impacted by the coarse measures for AFD/AC and the indiscriminate
-
IMU-Assisted Accurate Blur Kernel Re-Estimation in Non-Uniform Camera Shake Deblurring IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-14 Jianxiang Rong, Hua Huang, Jia Li
Image deblurring for camera shake is a highly regarded problem in the field of computer vision. A promising solution is the patch-wise non-uniform image deblurring algorithms, where a linear transformation model is typically established between different blur kernels to re-estimate poorly estimated blur kernels. However, the linear model struggles to effectively describe the nonlinear transformation
-
E-Calib: A Fast, Robust, and Accurate Calibration Toolbox for Event Cameras IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-13 Mohammed Salah, Abdulla Ayyad, Muhammad Humais, Daniel Gehrig, Abdelqader Abusafieh, Lakmal Seneviratne, Davide Scaramuzza, Yahya Zweiri
Event cameras triggered a paradigm shift in the computer vision community delineated by their asynchronous nature, low latency, and high dynamic range. Calibration of event cameras is always essential to account for the sensor intrinsic parameters and for 3D perception. However, conventional image-based calibration techniques are not applicable due to the asynchronous, binary output of the sensor.
-
Angular Isotonic Loss Guided Multi-Layer Integration for Few-Shot Fine-Grained Image Classification IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-13 Li-Jun Zhao, Zhen-Duo Chen, Zhen-Xiang Ma, Xin Luo, Xin-Shun Xu
Recent research on few-shot fine-grained image classification (FSFG) has predominantly focused on extracting discriminative features. The limited attention paid to the role of loss functions has resulted in weaker preservation of similarity relationships between query and support instances, thereby potentially limiting the performance of FSFG. In this regard, we analyze the limitations of widely adopted
-
Expanding and Refining Hybrid Compressors for Efficient Object Re-Identification IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-12 Yi Xie, Hanxiao Wu, Jianqing Zhu, Huanqiang Zeng, Jing Zhang
Recent object re-identification (Re-ID) methods gain high efficiency via lightweight student models trained by knowledge distillation (KD). However, the huge architectural difference between lightweight students and heavy teachers causes students to have difficulties in receiving and understanding teachers’ knowledge, thus losing certain accuracy. To this end, we propose a refiner-expander-refiner
-
Analysis of Coding Gain Due to In-Loop Reshaping IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-10 Chau-Wai Wong, Chang-Hong Fu, Mengting Xu, Guan-Ming Su
Reshaping, a point operation that alters the characteristics of signals, has been shown to be capable of improving the compression ratio in video coding practices. Out-of-loop reshaping that directly modifies the input video signal was first adopted as the supplemental enhancement information (SEI) for HEVC/H.265 without the need to alter the core design of the video codec. VVC/H.266 further improves
-
Learning Discriminative Features for Crowd Counting IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-07 Yuehai Chen, Qingzhong Wang, Jing Yang, Badong Chen, Haoyi Xiong, Shaoyi Du
Crowd counting models in highly congested areas confront two main challenges: weak localization ability and difficulty in differentiating between foreground and background, leading to inaccurate estimations. The reason is that objects in highly congested areas are normally small and high-level features extracted by convolutional neural networks are less discriminative to represent small objects. To
-
Complete Region of Interest for Unconstrained Palmprint Recognition IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-05 Le Su, Lunke Fei, Bob Zhang, Shuping Zhao, Jie Wen, Yong Xu
Unconstrained palmprint images have shown great potential for recognition applications due to their lower restrictions regarding hand poses and backgrounds during contactless image acquisition. However, they face two challenges: 1) unclear palm contours and finger-valley points of unconstrained palmprint images make it difficult to locate landmarks to crop the palmprint region of interest (ROI); and
-
Bi-Fusion of Structure and Deformation at Multi-Scale for Joint Segmentation and Registration IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-05 Jiaju Zhang, Tianyu Fu, Deqiang Xiao, Jingfan Fan, Hong Song, Danni Ai, Jian Yang
Medical image segmentation and registration are two fundamental and highly related tasks. However, current works focus on the mutual promotion between the two at the loss function level, ignoring the feature information generated by the encoder-decoder network during the task-specific feature mapping process and the potential inter-task feature relationship. This paper proposes a unified multi-task
-
StructLane: Leveraging Structural Relations for Lane Detection IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-05 Linqing Zhao, Wenzhao Zheng, Yunpeng Zhang, Jie Zhou, Jiwen Lu
Accurately detecting the lanes plays a significant role in various autonomous and assistant driving scenarios. It is a highly structured task as lanes in the 3D world are continuous and parallel to each other. While most existing methods focus on how to inject structural priors into the representation of each lane, we propose a StructLane method to further leverage the structural relations among lanes
-
FATE: Learning Effective Binary Descriptors With Group Fairness IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-06-04 Fan Zhang, Chong Chen, Xian-Sheng Hua, Xiao Luo
Hashing has received significant interest in large-scale data retrieval due to its outstanding computational efficiency. Of late, numerous deep hashing approaches have emerged, which have obtained impressive performance. However, these approaches can contain ethical risks during image retrieval. To address this, we are the first to study the problem of group fairness within learning to hash and introduce
-
Multi-Person Pose Tracking With Sparse Key-Point Flow Estimation and Hierarchical Graph Distance Minimization IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-31 Yalong Jiang, Wenrui Ding, Hongguang Li, Zheru Chi
In this paper, we propose a novel framework for multi-person pose estimation and tracking in challenging scenarios. In view of occlusions and motion blur, which hinder the performance of pose tracking, we propose to model humans as graphs and perform pose estimation and tracking by concentrating on the visible parts of human bodies, which are informative about complete skeletons under incomplete observations
-
HDR or SDR? A Subjective and Objective Study of Scaled and Compressed Videos IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-30 Joshua P. Ebenezer, Zaixi Shang, Yixu Chen, Yongjun Wu, Hai Wei, Sriram Sethuraman, Alan C. Bovik
We conducted a large-scale study of human perceptual quality judgments of High Dynamic Range (HDR) and Standard Dynamic Range (SDR) videos subjected to scaling and compression levels and viewed on three different display devices. While conventional expectations are that HDR quality is better than SDR quality, we have found subject preference of HDR versus SDR depends heavily on the display device,
-
Disentangled Generation With Information Bottleneck for Enhanced Few-Shot Learning IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-30 Zhuohang Dang, Minnan Luo, Jihong Wang, Chengyou Jia, Caixia Yan, Guang Dai, Xiaojun Chang, Qinghua Zheng
Few-shot learning (FSL) poses a significant challenge in classifying unseen classes with limited samples, primarily stemming from the scarcity of data. Although numerous generative approaches have been investigated for FSL, their generation process often results in entangled outputs, exacerbating the distribution shift inherent in FSL. Consequently, this considerably hampers the overall quality of
-
High-Quality Fusion and Visualization for MR-PET Brain Tumor Images via Multi-Dimensional Features IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-30 Jinyu Wen, Asad Khan, Amei Chen, Weilong Peng, Meie Fang, C. L. Philip Chen, Ping Li
The fusion of magnetic resonance imaging and positron emission tomography can combine biological anatomical information and physiological metabolic information, which is of great significance for the clinical diagnosis and localization of lesions. In this paper, we propose a novel adaptive linear fusion method for multi-dimensional features of brain magnetic resonance and positron emission tomography
-
Learning Compact Hyperbolic Representations of Latent Space for Old Photo Restoration IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-30 Rui Chen, Tao Guo, Yang Mu, Li Shen
Recent restoration methods for handling real old photos have achieved significant improvements using generative networks. However, the restoration quality under the usual generative architectures is greatly affected by the encoded properties of latent space, which reflect pivotal semantic information in the recovery process. Therefore, how to find the suitable latent space and identify its semantic
-
Perception-Aware Texture Similarity Prediction IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-30 Weibo Wang, Xinghui Dong
Texture similarity plays important roles in texture analysis and material recognition. However, perceptually-consistent fine-grained texture similarity prediction is still challenging. The discrepancy between the texture similarity data obtained using algorithms and human visual perception has been demonstrated. This dilemma is normally attributed to the texture representation and similarity metric
-
Gloss Prior Guided Visual Feature Learning for Continuous Sign Language Recognition IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-30 Leming Guo, Wanli Xue, Bo Liu, Kaihua Zhang, Tiantian Yuan, Dimitris Metaxas
Continuous sign language recognition (CSLR) aims to recognize the glosses in a sign language video. Enhancing the generalization ability of CSLR’s visual feature extractor is a worthy area of investigation. In this paper, we model glosses as priors that help to learn more generalizable visual features. Specifically, the signer-invariant gloss feature is extracted by a pre-trained gloss BERT model. Then
-
Shared Latent Membership Enables Joint Shape Abstraction and Segmentation With Deformable Superquadrics IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-29 Jiaxin Li, Hongxing Wang, Jiawei Tan, Junsong Yuan
Part-level 3D shape representations are crucial to shape reasoning and understanding. Two key sub-tasks are: 1) shape abstraction, creating primitive-based object parts; and 2) shape segmentation, finding partition-based object parts. However, for 3D object point clouds, most advanced methods produce parts relying on task-specific priors, such as similarity metrics and primitive geometries, resulting
-
Sparsely-Supervised Object Tracking IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-29 Jilai Zheng, Wenxi Li, Chao Ma, Xiaokang Yang
Recent years have witnessed the incredible performance boost of data-driven deep visual object trackers. Despite the success, these trackers require millions of sequential manual labels on videos for supervised training, implying the heavy burden of human annotating. This raises a crucial question: how to train a powerful tracker from abundant videos using limited manual annotations? In this paper
-
Plug-and-Play Split Gibbs Sampler: Embedding Deep Generative Priors in Bayesian Inference IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-29 Florentin Coeurdoux, Nicolas Dobigeon, Pierre Chainais
This paper introduces a stochastic plug-and-play (PnP) sampling algorithm that leverages variable splitting to efficiently sample from a posterior distribution. The algorithm based on split Gibbs sampling (SGS) draws inspiration from the half quadratic splitting method (HQS) and the alternating direction method of multipliers (ADMM). It divides the challenging task of posterior sampling into two simpler
-
Learning Attention in the Frequency Domain for Flexible Real Photograph Denoising IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-29 Ruijun Ma, Yaoxuan Zhang, Bob Zhang, Leyuan Fang, Dong Huang, Long Qi
Recent advancements in deep learning techniques have pushed forward the frontiers of real photograph denoising. However, due to the inherent pooling operations in the spatial domain, current CNN-based denoisers are biased towards focusing on low-frequency representations, while discarding the high-frequency components. This leads to suboptimal visual quality, as the image denoising
-
Learning a Deep Demosaicing Network for Spike Camera With Color Filter Array IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-29 Yanchen Dong, Ruiqin Xiong, Jing Zhao, Jian Zhang, Xiaopeng Fan, Shuyuan Zhu, Tiejun Huang
For capturing dynamic scenes with ultra-fast motion, neuromorphic cameras with extremely high temporal resolution have demonstrated their great capability and potential. Different from event cameras, which only record relative changes in light intensity, a spike camera fires a stream of spikes according to a full-time accumulation of photons so that it can recover the texture details for both static
-
INSURE: An Information Theory iNspired diSentanglement and pURification modEl for Domain Generalization IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-05-29 Xi Yu, Huan-Hsin Tseng, Shinjae Yoo, Haibin Ling, Yuewei Lin
Domain Generalization (DG) aims to learn a generalizable model on the unseen target domain by only training on the multiple observed source domains. Although a variety of DG methods have focused on extracting domain-invariant features, the domain-specific class-relevant features have attracted attention and been argued to benefit generalization to the unseen target domain. To take into account the