-
Learning Discriminative Features for Visual Tracking via Scenario Decoupling Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-19 Yinchao Ma, Qianjin Yu, Wenfei Yang, Tianzhu Zhang, Jinpeng Zhang
Visual tracking aims to estimate object state automatically in a video sequence, which is challenging especially in complex scenarios. Recent Transformer-based trackers enable the interaction between the target template and search region in the feature extraction phase for target-aware feature learning, which have achieved superior performance. However, visual tracking is essentially a task to discriminate
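The template–search interaction that Transformer trackers perform during feature extraction can be sketched as a cross-attention step, where search-region tokens aggregate template features. This is a minimal NumPy illustration (single head, no learned projections; function and variable names are mine, not the paper's):

```python
import numpy as np

def template_search_attention(search, template):
    """Search-region tokens attend to template tokens.

    search:   (Ns, d) search-region token features
    template: (Nt, d) target template token features
    Returns (Ns, d) target-aware search features.
    """
    d = search.shape[-1]
    scores = search @ template.T / np.sqrt(d)    # (Ns, Nt) scaled similarities
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)        # softmax over template tokens
    return attn @ template                               # blend template features
```

In a real tracker the queries, keys, and values would come from learned projections and the interaction would be repeated across layers; this sketch only shows the information flow.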
-
Polynomial Implicit Neural Framework for Promoting Shape Awareness in Generative Models Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-20 Utkarsh Nath, Rajhans Singh, Ankita Shukla, Kuldeep Kulkarni, Pavan Turaga
Polynomial functions have been employed to represent shape-related information in 2D and 3D computer vision, even from the very early days of the field. In this paper, we present a framework using polynomial-type basis functions to promote shape awareness in contemporary generative architectures. The benefits of using a learnable form of polynomial basis functions as drop-in modules into generative
-
Hard-Normal Example-Aware Template Mutual Matching for Industrial Anomaly Detection Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-18 Zixuan Chen, Xiaohua Xie, Lingxiao Yang, Jian-Huang Lai
Anomaly detectors are widely used in industrial manufacturing to detect and localize unknown defects in query images. These detectors are trained on anomaly-free samples and have successfully distinguished anomalies from most normal samples. However, hard-normal examples are scattered and far apart from most normal samples, and thus they are often mistaken for anomalies by existing methods. To address
-
Beyond Talking – Generating Holistic 3D Human Dyadic Motion for Communication Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-17 Mingze Sun, Chao Xu, Xinyu Jiang, Yang Liu, Baigui Sun, Ruqi Huang
In this paper, we introduce an innovative task focused on human communication, aiming to generate 3D holistic human motions for both speakers and listeners. Central to our approach is the incorporation of factorization to decouple audio features and the combination of textual semantic information, thereby facilitating the creation of more realistic and coordinated movements. We separately train VQ-VAEs
-
Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-16 Donglin Di, Jiahui Yang, Chaofan Luo, Zhou Xue, Wei Chen, Xun Yang, Yue Gao
Text-to-3D generation represents an exciting field that has seen rapid advancements, facilitating the transformation of textual descriptions into detailed 3D models. However, current progress often neglects the intricate high-order correlation of geometry and texture within 3D objects, leading to challenges such as over-smoothness, over-saturation and the Janus problem. In this work, we propose a method
-
Relation-Guided Adversarial Learning for Data-Free Knowledge Transfer Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-13 Yingping Liang, Ying Fu
Data-free knowledge distillation transfers knowledge by recovering training data from a pre-trained model. Despite the recent success of seeking global data diversity, the diversity within each class and the similarity among different classes are largely overlooked, resulting in data homogeneity and limited performance. In this paper, we introduce a novel Relation-Guided Adversarial Learning method
-
CMAE-3D: Contrastive Masked AutoEncoders for Self-Supervised 3D Object Detection Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-11 Yanan Zhang, Jiaxin Chen, Di Huang
LiDAR-based 3D object detection is a crucial task for autonomous driving, owing to its accurate object recognition and localization capabilities in the 3D real-world space. However, existing methods heavily rely on time-consuming and laborious large-scale labeled LiDAR data, posing a bottleneck for both performance improvement and practical applications. In this paper, we propose Contrastive Masked
-
Structured Generative Models for Scene Understanding Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-12 Christopher K. I. Williams
This position paper argues for the use of structured generative models (SGMs) for the understanding of static scenes. This requires the reconstruction of a 3D scene from an input image (or a set of multi-view images), whereby the contents of the image(s) are causally explained in terms of models of instantiated objects, each with their own type, shape, appearance and pose, along with global variables
-
MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-12 Yupeng Zhou, Daquan Zhou, Yaxing Wang, Jiashi Feng, Qibin Hou
Recent advancements in diffusion models have showcased their impressive capacity to generate visually striking images. However, ensuring a close match between the generated image and the given prompt remains a persistent challenge. In this work, we identify that a crucial factor leading to the erroneous generation of objects and their attributes is the inadequate cross-modality relation learning between
-
MoDA: Modeling Deformable 3D Objects from Casual Videos Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-12 Chaoyue Song, Jiacheng Wei, Tianyi Chen, Yiwen Chen, Chuan-Sheng Foo, Fayao Liu, Guosheng Lin
In this paper, we focus on the challenges of modeling deformable 3D objects from casual videos. With the popularity of NeRF, many works extend it to dynamic scenes with a canonical NeRF and a deformation model that achieves 3D point transformation between the observation space and the canonical space. Recent works rely on linear blend skinning (LBS) to achieve the canonical-observation transformation
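The linear blend skinning (LBS) transformation mentioned in the abstract maps each canonical-space point through a weighted blend of per-bone rigid transforms. A generic NumPy sketch of standard LBS (not the paper's implementation; names are illustrative):

```python
import numpy as np

def linear_blend_skinning(points, weights, transforms):
    """Transform canonical points to observation space via LBS.

    points:     (N, 3) canonical-space points
    weights:    (N, B) skinning weights over B bones (each row sums to 1)
    transforms: (B, 4, 4) rigid transformation matrix per bone
    """
    homo = np.concatenate([points, np.ones((points.shape[0], 1))], axis=1)  # (N, 4)
    blended = np.einsum("nb,bij->nij", weights, transforms)  # per-point blended matrix
    out = np.einsum("nij,nj->ni", blended, homo)             # apply to each point
    return out[:, :3]
```

The canonical-observation mapping in such methods is this forward transform; its inverse (observation to canonical) is what makes dynamic-NeRF formulations challenging.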
-
Language-Guided Hierarchical Fine-Grained Image Forgery Detection and Localization Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-10 Xiao Guo, Xiaohong Liu, Iacopo Masi, Xiaoming Liu
The forgery attributes of images from the CNN-synthesis and image-editing domains differ greatly, and such differences make unified image forgery detection and localization (IFDL) challenging. To this end, we present a hierarchical fine-grained formulation for IFDL representation learning. Specifically, we first represent the forgery attributes of a manipulated image with multiple labels

-
Image-Based Virtual Try-On: A Survey Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-10 Dan Song, Xuanpu Zhang, Juan Zhou, Weizhi Nie, Ruofeng Tong, Mohan Kankanhalli, An-An Liu
Image-based virtual try-on aims to synthesize a naturally dressed person image from a clothing image, which revolutionizes online shopping and inspires related topics within image generation, showing both research significance and commercial potential. However, there is a gap between current research progress and commercial applications, and an absence of a comprehensive overview of this field to accelerate
-
InfoPro: Locally Supervised Deep Learning by Maximizing Information Propagation Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-11 Yulin Wang, Zanlin Ni, Yifan Pu, Cai Zhou, Jixuan Ying, Shiji Song, Gao Huang
End-to-end (E2E) training has become the de-facto standard for training modern deep networks, e.g., ConvNets and vision Transformers (ViTs). Typically, a global error signal is generated at the end of a model and back-propagated layer-by-layer to update the parameters. This paper shows that the reliance on back-propagating global errors may not be necessary for deep learning. More precisely, deep networks
-
Occlusion-Preserved Surveillance Video Synopsis with Flexible Object Graph Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-09 Yongwei Nie, Wei Ge, Siming Zeng, Qing Zhang, Guiqing Li, Ping Li, Hongmin Cai
Video synopsis is a technique that condenses a long surveillance video into a short summary. It is challenging to process objects that occlude each other in the source video. Previous approaches either treat occluding objects as a single object, which reduces the compression ratio, or separate the occluding objects individually, which destroys the interactions between them and yields visual
-
An Evaluation of Zero-Cost Proxies - from Neural Architecture Performance Prediction to Model Robustness Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-09 Jovita Lukasik, Michael Moeller, Margret Keuper
Zero-cost proxies are nowadays frequently studied and used to search for neural architectures. They show an impressive ability to predict the performance of architectures by making use of their untrained weights. These techniques allow for immense search speed-ups. So far the joint search for well performing and robust architectures has received much less attention in the field of NAS. Therefore, the
-
On Mitigating Stability-Plasticity Dilemma in CLIP-guided Image Morphing via Geodesic Distillation Loss Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-10 Yeongtak Oh, Saehyung Lee, Uiwon Hwang, Sungroh Yoon
Large-scale language-vision pre-training models, such as CLIP, have achieved remarkable results in text-guided image morphing by leveraging several unconditional generative models. However, existing CLIP-guided methods face challenges in achieving photorealistic morphing when adapting the generator from the source to the target domain. Specifically, current guidance methods fail to provide detailed
-
Modality-missing RGBT Tracking: Invertible Prompt Learning and High-quality Benchmarks Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-07 Andong Lu, Chenglong Li, Jiacong Zhao, Jin Tang, Bin Luo
Current RGBT tracking research relies on complete multi-modality input, but modal information might be missing due to factors such as thermal sensor self-calibration and data transmission errors, which we call the modality-missing challenge in this work. To address this challenge, we propose a novel invertible prompt learning approach, which integrates content-preserving prompts into a well-trained tracking
-
Object Pose Estimation Based on Multi-precision Vectors and Seg-Driven PnP Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-07 Yulin Wang, Hongli Li, Chen Luo
Object pose estimation based on a single RGB image has wide application potential but is difficult to achieve. Existing pose estimation involves various inference pipelines. One popular pipeline is to first use Convolutional Neural Networks (CNN) to predict 2D projections of 3D keypoints in a single RGB image and then calculate the 6D pose via a Perspective-n-Point (PnP) solver. Due to the gap between
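The pipeline described above predicts 2D projections of 3D keypoints and then recovers the 6D pose with a Perspective-n-Point (PnP) solver. The forward model that PnP inverts is ordinary pinhole projection; a minimal NumPy sketch (illustrative only, not the paper's method):

```python
import numpy as np

def project_keypoints(K, R, t, pts3d):
    """Project 3D keypoints into the image with intrinsics K and pose (R, t).

    K:     (3, 3) camera intrinsic matrix
    R, t:  rotation (3, 3) and translation (3,) of the object pose
    pts3d: (N, 3) 3D keypoints in object coordinates
    A PnP solver performs the inverse: given these 2D-3D correspondences
    and K, it estimates (R, t).
    """
    cam = pts3d @ R.T + t          # object -> camera coordinates
    uv = cam @ K.T                 # camera -> homogeneous pixel coordinates
    return uv[:, :2] / uv[:, 2:3]  # perspective divide
```

In practice one would call a robust solver (e.g. an EPnP/RANSAC implementation) on the CNN-predicted 2D keypoints rather than assume exact correspondences.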
-
CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-05 Yuanyuan Jiang, Jianqin Yin
While vision-language pretrained models (VLMs) excel in various multimodal understanding tasks, their potential in fine-grained audio-visual reasoning, particularly for audio-visual question answering (AVQA), remains largely unexplored. AVQA presents specific challenges for VLMs due to the requirement of visual understanding at the region level and seamless integration with audio modality. Previous
-
Instance-dependent Label Distribution Estimation for Learning with Label Noise Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-02 Zehui Liao, Shishuai Hu, Yutong Xie, Yong Xia
Noise transition matrix estimation is a promising approach for learning with label noise. It can infer clean posterior probabilities, known as Label Distribution (LD), based on noisy ones and reduce the impact of noisy labels. However, this estimation is challenging, since the ground truth labels are not always available. Most existing methods estimate a global noise transition matrix using either
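The noise transition matrix relates clean and noisy label distributions: each entry gives the probability that a clean class is observed as a (possibly different) noisy class. A minimal NumPy sketch of the forward relation (a generic illustration; the paper estimates instance-dependent versions of this matrix):

```python
import numpy as np

def noisy_posterior(clean_posterior, T):
    """Map a clean label distribution to the expected noisy one.

    clean_posterior: (C,) distribution over C clean classes
    T:               (C, C) transition matrix, T[i, j] = P(noisy = j | clean = i)
    """
    return clean_posterior @ T
```

Given a well-estimated T, the clean posterior (the Label Distribution in the abstract) can be recovered from noisy predictions by inverting this relation.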
-
ReFusion: Learning Image Fusion from Reconstruction with Learnable Loss Via Meta-Learning Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-02 Haowen Bai, Zixiang Zhao, Jiangshe Zhang, Yichen Wu, Lilun Deng, Yukun Cui, Baisong Jiang, Shuang Xu
Image fusion aims to combine information from multiple source images into a single one with more comprehensive informational content. Deep learning-based image fusion algorithms face significant challenges, including the lack of a definitive ground truth and the corresponding distance measurement. Additionally, current manually defined loss functions limit the model’s flexibility and generalizability
-
Draw Sketch, Draw Flesh: Whole-Body Computed Tomography from Any X-Ray Views Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-27 Yongsheng Pan, Yiwen Ye, Yanning Zhang, Yong Xia, Dinggang Shen
Stereoscopic observation is a common foundation of medical image analysis and is generally achieved by 3D medical imaging based on fixed scanners, such as CT and MRI, which are not as convenient as X-ray machines in some flexible scenarios. However, X-ray images can only provide a perspective 2D observation and lack a view in the third dimension. If 3D information can be deduced from X-ray images, it
-
DiffLLE: Diffusion-based Domain Calibration for Weak Supervised Low-light Image Enhancement Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-27 Shuzhou Yang, Xuanyu Zhang, Yinhuai Wang, Jiwen Yu, Yuhan Wang, Jian Zhang
Existing weakly supervised low-light image enhancement methods lack effectiveness and generalization in practical applications. We suppose this is because of the absence of explicit supervision and the inherent gap between the real-world low-light domain and the training low-light domain. For example, low-light datasets are well-designed, but real-world night scenes are plagued with sophisticated
-
ICEv2: Interpretability, Comprehensiveness, and Explainability in Vision Transformer Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-26 Hoyoung Choi, Seungwan Jin, Kyungsik Han
Vision transformers use the [CLS] token to predict image classes. Their explainability visualization has been studied using relevant information from the [CLS] token or by focusing on attention scores during self-attention. However, such visualization is challenging because the interpretability of a vision transformer depends on skip connections and attention operators, the instability of non-linearities
-
IEBins: Iterative Elastic Bins for Monocular Depth Estimation and Completion Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-25 Shuwei Shao, Zhongcai Pei, Weihai Chen, Peter C. Y. Chen, Zhengguo Li
Monocular depth estimation and completion are fundamental aspects of geometric computer vision, serving as essential techniques for various downstream applications. In recent developments, several methods have reformulated these two tasks as a classification-regression problem, deriving depth with a linear combination of predicted probabilistic distribution and bin centers. In this paper, we introduce
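The classification-regression formulation described above derives depth as a linear combination of a predicted probability distribution and bin centers, i.e. the expected value over the bins. A minimal NumPy sketch (variable names are illustrative):

```python
import numpy as np

def depth_from_bins(probs, bin_centers):
    """Expected depth from a per-pixel distribution over depth bins.

    probs:       (H, W, K) softmax output over K depth bins
    bin_centers: (K,) depth value at the center of each bin
    Returns an (H, W) depth map.
    """
    return np.einsum("hwk,k->hw", probs, bin_centers)
```

The "elastic bins" in the paper's title refer to iteratively refining where the bin centers sit; the final readout remains this expectation.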
-
Globally Correlation-Aware Hard Negative Generation Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-25 Wenjie Peng, Hongxiang Huang, Tianshui Chen, Quhui Ke, Gang Dai, Shuangping Huang
Hard negative generation aims to generate informative negative samples that help to determine the decision boundaries and thus facilitate advancing deep metric learning. Current works select pair/triplet samples, learn their correlations, and fuse them to generate hard negatives. However, these works merely consider the local correlations of selected samples, ignoring global sample correlations that
-
NAFT and SynthStab: A RAFT-Based Network and a Synthetic Dataset for Digital Video Stabilization Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-22 Marcos Roberto e Souza, Helena de Almeida Maia, Helio Pedrini
Multiple deep learning-based stabilization methods have been proposed recently. Some of them directly predict the optical flow to warp each unstable frame into its stabilized version, which we call direct warping. These methods primarily perform online or semi-online stabilization, prioritizing lower computational cost while achieving satisfactory results in certain scenarios. However, they fail
-
Reliable Evaluation of Attribution Maps in CNNs: A Perturbation-Based Approach Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-23 Lars Nieradzik, Henrike Stephani, Janis Keuper
In this paper, we present an approach for evaluating attribution maps, which play a central role in interpreting the predictions of convolutional neural networks (CNNs). We show that the widely used insertion/deletion metrics are susceptible to distribution shifts that affect the reliability of the ranking. Our method proposes to replace pixel modifications with adversarial perturbations, which provides
-
Transformer for Object Re-identification: A Survey Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-23 Mang Ye, Shuoyi Chen, Chenyue Li, Wei-Shi Zheng, David Crandall, Bo Du
Object Re-identification (Re-ID) aims to identify specific objects across different times and scenes, which is a widely researched task in computer vision. For a prolonged period, this field has been predominantly driven by deep learning technology based on convolutional neural networks. In recent years, the emergence of Vision Transformers has spurred a growing number of studies delving deeper into
-
One-Shot Generative Domain Adaptation in 3D GANs Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-22 Ziqiang Li, Yi Wu, Chaoyue Wang, Xue Rui, Bin Li
3D-aware image generation necessitates extensive training data to ensure stable training and mitigate the risk of overfitting. This paper first considers a novel task known as One-shot 3D Generative Domain Adaptation (GDA), aimed at transferring a pre-trained 3D generator from one domain to a new one, relying solely on a single reference image. One-shot 3D GDA is characterized by the pursuit of specific
-
CS-CoLBP: Cross-Scale Co-occurrence Local Binary Pattern for Image Classification Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-19 Bin Xiao, Danyu Shi, Xiuli Bi, Weisheng Li, Xinbo Gao
The local binary pattern (LBP) is an effective feature, describing the size relationship between the neighboring pixels and the current pixel. While individual LBP-based methods yield good results, co-occurrence LBP-based methods exhibit a better ability to extract structural information. However, most of the co-occurrence LBP-based methods excel mainly in dealing with rotated images, exhibiting limitations
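The basic LBP descriptor mentioned above encodes, for each pixel, which of its neighbors are at least as bright as it is, yielding an 8-bit code for a 3x3 neighborhood. A minimal NumPy sketch of the classic operator (one common bit ordering; the paper's cross-scale co-occurrence variant builds on this):

```python
import numpy as np

def lbp_code(patch):
    """8-bit LBP code of the center pixel of a 3x3 patch.

    Each of the 8 neighbors contributes a 1-bit if it is >= the center.
    """
    c = patch[1, 1]
    # Clockwise neighbor order starting at the top-left corner.
    idx = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [int(patch[i, j] >= c) for i, j in idx]
    return sum(b << k for k, b in enumerate(bits))
```

Co-occurrence LBP methods then count joint occurrences of such codes at spatially offset pixel pairs, which is what captures the structural information the abstract refers to.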
-
Warping the Residuals for Image Editing with StyleGAN Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-18 Ahmet Burak Yildirim, Hamza Pehlivan, Aysegul Dundar
StyleGAN models show editing capabilities via their semantically interpretable latent organizations which require successful GAN inversion methods to edit real images. Many works have been proposed for inverting images into StyleGAN’s latent space. However, their results either suffer from low fidelity to the input image or poor editing qualities, especially for edits that require large transformations
-
Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-16 Haochen Wang, Yujun Shen, Jingjing Fei, Wei Li, Liwei Wu, Yuxi Wang, Zhaoxiang Zhang
Domain-adaptive semantic segmentation aims to transfer knowledge from a labeled source domain to an unlabeled target domain. However, existing methods primarily focus on directly learning categorically discriminative target features for segmenting target images, which is challenging in the absence of target labels. This work provides a new perspective. We observe that the features learned with source
-
Feature Matching via Graph Clustering with Local Affine Consensus Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-15 Yifan Lu, Jiayi Ma
This paper studies graph clustering with application to feature matching and proposes an effective method, termed GC-LAC, that can establish reliable feature correspondences and simultaneously discover all potential visual patterns. In particular, we regard each putative match as a node and encode the geometric relationships into edges, where a visual pattern sharing similar motion behaviors corresponds
-
Learning to Detect Novel Species with SAM in the Wild Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-13 Garvita Allabadi, Ana Lucic, Yu-Xiong Wang, Vikram Adve
This paper tackles the limitation of a closed-world object detection model that was trained on one species. The expectation for this model is that it will not generalize well to recognize the instances of new species if they were present in the incoming data stream. We propose a novel object detection framework for this open-world setting that is suitable for applications that monitor wildlife, ocean
-
MVTN: Learning Multi-view Transformations for 3D Understanding Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-11 Abdullah Hamdi, Faisal AlZahrani, Silvio Giancola, Bernard Ghanem
-
Adaptive Middle Modality Alignment Learning for Visible-Infrared Person Re-identification Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-09 Yukang Zhang, Yan Yan, Yang Lu, Hanzi Wang
-
Rethinking Contemporary Deep Learning Techniques for Error Correction in Biometric Data Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-06 YenLung Lai, XingBo Dong, Zhe Jin, Wei Jia, Massimo Tistarelli, XueJun Li
-
Day2Dark: Pseudo-Supervised Activity Recognition Beyond Silent Daylight Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-06 Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek
-
Few Annotated Pixels and Point Cloud Based Weakly Supervised Semantic Segmentation of Driving Scenes Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-04 Huimin Ma, Sheng Yi, Shijie Chen, Jiansheng Chen, Yu Wang
-
Achieving Procedure-Aware Instructional Video Correlation Learning Under Weak Supervision from a Collaborative Perspective Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-04 Tianyao He, Huabin Liu, Zelin Ni, Yuxi Li, Xiao Ma, Cheng Zhong, Yang Zhang, Yingxue Wang, Weiyao Lin
-
EfficientDeRain+: Learning Uncertainty-Aware Filtering via RainMix Augmentation for High-Efficiency Deraining Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-04 Qing Guo, Hua Qi, Jingyang Sun, Felix Juefei-Xu, Lei Ma, Di Lin, Wei Feng, Song Wang
-
APPTracker+: Displacement Uncertainty for Occlusion Handling in Low-Frame-Rate Multiple Object Tracking Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-03 Tao Zhou, Qi Ye, Wenhan Luo, Haizhou Ran, Zhiguo Shi, Jiming Chen
-
Anti-Fake Vaccine: Safeguarding Privacy Against Face Swapping via Visual-Semantic Dual Degradation Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-11-01 Jingzhi Li, Changjiang Luo, Hua Zhang, Yang Cao, Xin Liao, Xiaochun Cao
-
Basis Restricted Elastic Shape Analysis on the Space of Unregistered Surfaces Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-30 Emmanuel Hartman, Emery Pierson, Martin Bauer, Mohamed Daoudi, Nicolas Charon
-
Improving 3D Finger Traits Recognition via Generalizable Neural Rendering Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-30 Hongbin Xu, Junduan Huang, Yuer Ma, Zifeng Li, Wenxiong Kang
-
A Memory-Assisted Knowledge Transferring Framework with Curriculum Anticipation for Weakly Supervised Online Activity Detection Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-28 Tianshan Liu, Kin-Man Lam, Bing-Kun Bao
-
Sample Correlation for Fingerprinting Deep Face Recognition Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-25 Jiyang Guan, Jian Liang, Yanbo Wang, Ran He
-
Dynamic Attention Vision-Language Transformer Network for Person Re-identification Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-26 Guifang Zhang, Shijun Tan, Zhe Ji, Yuming Fang
-
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-24 David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao, Mike Zheng Shou
-
StyleAdapter: A Unified Stylized Image Generation Model Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-25 Zhouxia Wang, Xintao Wang, Liangbin Xie, Zhongang Qi, Ying Shan, Wenping Wang, Ping Luo
-
Learning Text-to-Video Retrieval from Image Captioning Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-22 Lucas Ventura, Cordelia Schmid, Gül Varol
-
Neural Vector Fields for Implicit Surface Representation and Inference Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-22 Edoardo Mello Rella, Ajad Chhatkuli, Ender Konukoglu, Luc Van Gool
-
AgMTR: Agent Mining Transformer for Few-Shot Segmentation in Remote Sensing Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-21 Hanbo Bi, Yingchao Feng, Yongqiang Mao, Jianning Pei, Wenhui Diao, Hongqi Wang, Xian Sun
-
CogCartoon: Towards Practical Story Visualization Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-21 Zhongyang Zhu, Jie Tang
-
On the Generalization and Causal Explanation in Self-Supervised Learning Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-19 Wenwen Qiang, Zeen Song, Ziyin Gu, Jiangmeng Li, Changwen Zheng, Fuchun Sun, Hui Xiong
-
Interweaving Insights: High-Order Feature Interaction for Fine-Grained Visual Recognition Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-20 Arindam Sikdar, Yonghuai Liu, Siddhardha Kedarisetty, Yitian Zhao, Amr Ahmed, Ardhendu Behera
-
Towards Data-Centric Face Anti-spoofing: Improving Cross-Domain Generalization via Physics-Based Data Synthesis Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-17 Rizhao Cai, Cecelia Soh, Zitong Yu, Haoliang Li, Wenhan Yang, Alex C. Kot
-
Facial Action Unit Detection by Adaptively Constraining Self-Attention and Causally Deconfounding Sample Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-17 Zhiwen Shao, Hancheng Zhu, Yong Zhou, Xiang Xiang, Bing Liu, Rui Yao, Lizhuang Ma
-
Blind Multimodal Quality Assessment of Low-Light Images Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-10-16 Miaohui Wang, Zhuowei Xu, Mai Xu, Weisi Lin