-
Uncertainty-informed regional deformation diagnosis of arch dams Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-20 Xudong Chen, Wenhao Sun, Shaowei Hu, Liuyang Li, Chongshi Gu, Jinjun Guo, Bowen Wei, Bo Xu
Accurately predicting dam deformation is crucial for understanding its operational status. However, existing models struggle to effectively capture the spatiotemporal correlations in monitoring data and quantify uncertainty within dam systems. This paper presents an innovative uncertainty quantification model for evaluating regional deformation in arch dams. First, a method to extract the spatiotemporal
-
Learning Discriminative Features for Visual Tracking via Scenario Decoupling Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-19 Yinchao Ma, Qianjin Yu, Wenfei Yang, Tianzhu Zhang, Jinpeng Zhang
Visual tracking aims to estimate object state automatically in a video sequence, which is challenging especially in complex scenarios. Recent Transformer-based trackers enable interaction between the target template and search region in the feature extraction phase for target-aware feature learning, and have achieved superior performance. However, visual tracking is essentially a task to discriminate
-
Polynomial Implicit Neural Framework for Promoting Shape Awareness in Generative Models Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-20 Utkarsh Nath, Rajhans Singh, Ankita Shukla, Kuldeep Kulkarni, Pavan Turaga
Polynomial functions have been employed to represent shape-related information in 2D and 3D computer vision, even from the very early days of the field. In this paper, we present a framework using polynomial-type basis functions to promote shape awareness in contemporary generative architectures. The benefits of using a learnable form of polynomial basis functions as drop-in modules into generative
-
Hard-Normal Example-Aware Template Mutual Matching for Industrial Anomaly Detection Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-18 Zixuan Chen, Xiaohua Xie, Lingxiao Yang, Jian-Huang Lai
Anomaly detectors are widely used in industrial manufacturing to detect and localize unknown defects in query images. These detectors are trained on anomaly-free samples and have successfully distinguished anomalies from most normal samples. However, hard-normal examples are scattered and far apart from most normal samples, and thus they are often mistaken for anomalies by existing methods. To address
-
Beyond Talking – Generating Holistic 3D Human Dyadic Motion for Communication Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-17 Mingze Sun, Chao Xu, Xinyu Jiang, Yang Liu, Baigui Sun, Ruqi Huang
In this paper, we introduce an innovative task focused on human communication, aiming to generate 3D holistic human motions for both speakers and listeners. Central to our approach is the incorporation of factorization to decouple audio features and the combination of textual semantic information, thereby facilitating the creation of more realistic and coordinated movements. We separately train VQ-VAEs
-
Key-Axis-based Localization of Symmetry Axes in 3D Objects Utilizing Geometry and Texture IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-17 Yulin Wang, Chen Luo
-
Real-World Low-Dose CT Image Denoising by Patch Similarity Purification IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-17 Zeya Song, Liqi Xue, Jun Xu, Baoping Zhang, Chao Jin, Jian Yang, Changliang Zou
-
Two‐step rapid inspection of underwater concrete bridge structures combining sonar, camera, and deep learning Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-17 Weihao Sun, Shitong Hou, Gang Wu, Yujie Zhang, Luchang Zhao
Underwater defects in piers pose potential hazards to the safety and durability of river‐crossing bridges. The concealment and difficulty in detecting underwater defects often result in their oversight. Acoustic methods face challenges in directly achieving accurate measurements of underwater defects, while optical methods are time‐consuming. This study proposes a two‐step rapid inspection method for
-
Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-16 Donglin Di, Jiahui Yang, Chaofan Luo, Zhou Xue, Wei Chen, Xun Yang, Yue Gao
Text-to-3D generation represents an exciting field that has seen rapid advancements, facilitating the transformation of textual descriptions into detailed 3D models. However, current progress often neglects the intricate high-order correlation of geometry and texture within 3D objects, leading to challenges such as over-smoothness, over-saturation and the Janus problem. In this work, we propose a method
-
Diffusion Models as Strong Adversaries IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-16 Xuelong Dai, Yanjie Li, Mingxing Duan, Bin Xiao
-
Exploiting Unlabeled Videos for Video-Text Retrieval via Pseudo-Supervised Learning IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-16 Yu Lu, Ruijie Quan, Linchao Zhu, Yi Yang
-
Unsupervised Learning of Intrinsic Semantics with Diffusion Model for Person Re-Identification IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-16 Xuefeng Tao, Jun Kong, Min Jiang, Ming Lu, Ajmal Mian
-
A semi‐supervised approach for building wall layout segmentation based on transformers and limited data Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-14 Hao Xie, Xiao Ma, Qipei Mei, Ying Hei Chui
In structural design, accurately extracting information from floor plan drawings of buildings is essential for building 3D models and facilitating design automation. However, deep learning models often face challenges due to their dependence on large labeled datasets, which are labor‐ and time‐intensive to generate. Floor plan drawings also present challenges, such as overlapping elements and similar
-
Integration of industry 4.0 technologies for agri-food supply chain resilience Comput. Ind. (IF 8.2) Pub Date : 2024-12-14 Rohit Sharma, Balan Sundarakani, Ioannis Manikas
The agri-food supply chain (AFSC) operations are becoming challenging due to globalization, constantly shifting consumer demands, and intensive disruptions leading to inefficient production and distribution of safe and high-quality food. Technological advancements are the most promising ways to ensure firms’ survival and supply chains. To enhance the resilience of AFSCs, the present study aims to identify
-
Relation-Guided Adversarial Learning for Data-Free Knowledge Transfer Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-13 Yingping Liang, Ying Fu
Data-free knowledge distillation transfers knowledge by recovering training data from a pre-trained model. Despite the recent success of seeking global data diversity, the diversity within each class and the similarity among different classes are largely overlooked, resulting in data homogeneity and limited performance. In this paper, we introduce a novel Relation-Guided Adversarial Learning method
-
Training of construction robots using imitation learning and environmental rewards Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-13 Kangkang Duan, Zhengbo Zou, T. Y. Yang
Construction robots are challenging the paradigm of labor‐intensive construction tasks. Imitation learning (IL) offers a promising approach, enabling robots to mimic expert actions. However, obtaining high‐quality expert demonstrations is a major bottleneck in this process as teleoperated robot motions may not align with optimal kinematic behavior. In this paper, two innovations have been proposed
-
Genetic algorithm optimized frequency‐domain convolutional blind source separation for multiple leakage locations in water supply pipeline Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-13 Hongjin Liu, Hongyuan Fang, Xiang Yu, Yangyang Xia
In the realm of using acoustic methods for locating leakages in water supply pipelines, existing research predominantly focuses on single leak localization, with limited exploration into the challenges posed by multiple leak scenarios. To address this gap, a genetic algorithm‐optimized frequency‐domain convolutional blind source separation algorithm is proposed for the precise localization of multiple
-
CMAE-3D: Contrastive Masked AutoEncoders for Self-Supervised 3D Object Detection Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-11 Yanan Zhang, Jiaxin Chen, Di Huang
LiDAR-based 3D object detection is a crucial task for autonomous driving, owing to its accurate object recognition and localization capabilities in the 3D real-world space. However, existing methods heavily rely on time-consuming and laborious large-scale labeled LiDAR data, posing a bottleneck for both performance improvement and practical applications. In this paper, we propose Contrastive Masked
-
Structured Generative Models for Scene Understanding Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-12 Christopher K. I. Williams
This position paper argues for the use of structured generative models (SGMs) for the understanding of static scenes. This requires the reconstruction of a 3D scene from an input image (or a set of multi-view images), whereby the contents of the image(s) are causally explained in terms of models of instantiated objects, each with their own type, shape, appearance and pose, along with global variables
-
MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-12 Yupeng Zhou, Daquan Zhou, Yaxing Wang, Jiashi Feng, Qibin Hou
Recent advancements in diffusion models have showcased their impressive capacity to generate visually striking images. However, ensuring a close match between the generated image and the given prompt remains a persistent challenge. In this work, we identify that a crucial factor leading to the erroneous generation of objects and their attributes is the inadequate cross-modality relation learning between
-
MoDA: Modeling Deformable 3D Objects from Casual Videos Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-12 Chaoyue Song, Jiacheng Wei, Tianyi Chen, Yiwen Chen, Chuan-Sheng Foo, Fayao Liu, Guosheng Lin
In this paper, we focus on the challenges of modeling deformable 3D objects from casual videos. With the popularity of NeRF, many works extend it to dynamic scenes with a canonical NeRF and a deformation model that achieves 3D point transformation between the observation space and the canonical space. Recent works rely on linear blend skinning (LBS) to achieve the canonical-observation transformation
-
VDMUFusion: A Versatile Diffusion Model-Based Unsupervised Framework for Image Fusion IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-12 Yu Shi, Yu Liu, Juan Cheng, Z. Jane Wang, Xun Chen
-
A Self-Adaptive Feature Extraction Method for Aerial-view Geo-localization IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-12 Jinliang Lin, Zhiming Luo, Dazhen Lin, Shaozi Li, Zhun Zhong
-
Learning Lossless Compression for High Bit-Depth Volumetric Medical Image IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-12 Kai Wang, Yuanchao Bai, Daxin Li, Deming Zhai, Junjun Jiang, Xianming Liu
-
Language-Guided Hierarchical Fine-Grained Image Forgery Detection and Localization Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-10 Xiao Guo, Xiaohong Liu, Iacopo Masi, Xiaoming Liu
Differences in forgery attributes of images generated in CNN-synthesized and image-editing domains are large, and such differences make a unified image forgery detection and localization (IFDL) challenging. To this end, we present a hierarchical fine-grained formulation for IFDL representation learning. Specifically, we first represent forgery attributes of a manipulated image with multiple labels
-
Image-Based Virtual Try-On: A Survey Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-10 Dan Song, Xuanpu Zhang, Juan Zhou, Weizhi Nie, Ruofeng Tong, Mohan Kankanhalli, An-An Liu
Image-based virtual try-on aims to synthesize a naturally dressed person image with a clothing image, which revolutionizes online shopping and inspires related topics within image generation, showing both research significance and commercial potential. However, there is a gap between current research progress and commercial applications and an absence of comprehensive overview of this field to accelerate
-
InfoPro: Locally Supervised Deep Learning by Maximizing Information Propagation Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-11 Yulin Wang, Zanlin Ni, Yifan Pu, Cai Zhou, Jixuan Ying, Shiji Song, Gao Huang
End-to-end (E2E) training has become the de-facto standard for training modern deep networks, e.g., ConvNets and vision Transformers (ViTs). Typically, a global error signal is generated at the end of a model and back-propagated layer-by-layer to update the parameters. This paper shows that the reliance on back-propagating global errors may not be necessary for deep learning. More precisely, deep networks
-
DREAM-PCD: Deep Reconstruction and Enhancement of mmWave Radar Pointcloud IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-11 Ruixu Geng, Yadong Li, Dongheng Zhang, Jincheng Wu, Yating Gao, Yang Hu, Yan Chen
-
Subjective and Objective Analysis of Indian Social Media Video Quality IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-11 Sandeep Mishra, Mukul Jha, Alan C. Bovik
-
Advancing Video Anomaly Detection: A Bi-Directional Hybrid Framework for Enhanced Single- and Multi-Task Approaches IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-11 Guodong Shen, Yuqi Ouyang, Junru Lu, Yixuan Yang, Victor Sanchez
-
HEOI: Human Attention Prediction in Natural Daily Life with Fine-Grained Human-Environment-Object Interaction Model IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-11 Zhixiong Nan, Leiyu Jia, Bin Xiao
-
SALENet: Structure-Aware Lighting Estimations from a Single Image for Indoor Environments IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-11 Junhong Zhao, Bing Xue, Mengjie Zhang
-
SEGSID: A Semantic-Guided Framework for Sonar Image Despeckling IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-11 Shaohua Liu, Junzhe Lu, Hongkun Dou, Jiajun Li, Yue Deng
-
Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-11 Zizhuo Li, Jiayi Ma
-
Learning Frame-Event Fusion for Motion Deblurring IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-11 Wen Yang, Jinjian Wu, Jupo Ma, Leida Li, Weisheng Dong, Guangming Shi
-
Intelligent prediction and soft-sensing of comprehensive production indicators for iron ore sintering: A review Comput. Ind. (IF 8.2) Pub Date : 2024-12-11 Sheng Du, Xian Ma, Haipeng Fan, Jie Hu, Weihua Cao, Min Wu, Witold Pedrycz
Iron ore sintering is a critical process in iron and steel production, with a substantial impact on overall energy consumption and the emission of various environmental pollutants. Enhancing the efficiency of this process is crucial for achieving sustainability in the iron and steel industry. Accurate prediction and real-time monitoring of comprehensive production indicators are essential for optimizing
-
Occlusion-Preserved Surveillance Video Synopsis with Flexible Object Graph Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-09 Yongwei Nie, Wei Ge, Siming Zeng, Qing Zhang, Guiqing Li, Ping Li, Hongmin Cai
Video synopsis is a technique that condenses a long surveillance video into a short summary. It has difficulty processing objects that occlude each other in the source video. Previous approaches either treat occluding objects as a single object, which reduces the compression ratio, or separate occluding objects individually, which destroys the interactions between them and yields visual
-
An Evaluation of Zero-Cost Proxies - from Neural Architecture Performance Prediction to Model Robustness Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-09 Jovita Lukasik, Michael Moeller, Margret Keuper
Zero-cost proxies are nowadays frequently studied and used to search for neural architectures. They show an impressive ability to predict the performance of architectures by making use of their untrained weights. These techniques allow for immense search speed-ups. So far the joint search for well performing and robust architectures has received much less attention in the field of NAS. Therefore, the
-
On Mitigating Stability-Plasticity Dilemma in CLIP-guided Image Morphing via Geodesic Distillation Loss Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-10 Yeongtak Oh, Saehyung Lee, Uiwon Hwang, Sungroh Yoon
Large-scale language-vision pre-training models, such as CLIP, have achieved remarkable results in text-guided image morphing by leveraging several unconditional generative models. However, existing CLIP-guided methods face challenges in achieving photorealistic morphing when adapting the generator from the source to the target domain. Specifically, current guidance methods fail to provide detailed
-
Integrating spatial and channel attention mechanisms with domain knowledge in convolutional neural networks for friction coefficient prediction Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-10 Zihang Weng, Chenglong Liu, Yuchuan Du, Difei Wu, Zhen Leng
The pavement skid resistance is crucial for ensuring driving safety. However, the reproducibility and comparability of field measurements are constrained by various influencing factors. One solution to these constraints is utilizing laser‐based 3D pavement data, which are notably stable and can be employed to estimate pavement skid resistance indirectly. However, the integration of tire–road friction
-
A K‐Net‐based deep learning framework for automatic rock quality designation estimation Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-10 Sihao Yu, Louis Ngai Yuen Wong
Rock quality designation (RQD) plays a crucial role in the design and analysis of rock engineering. The traditional method of measuring RQD relies on manual logging by geologists, which is often labor‐intensive and time‐consuming. Thus, this study presents an autonomous framework for expeditious RQD estimation based on two‐dimensional corebox photographs. The scale‐invariant feature transform (SIFT)
-
Event‐based supervisor control for a cyber‐physical waterway lock system Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-09 D. G. Fragkoulis, F. N. Koumboulis, M. P. Tzamtzi, P. G. Totomis
An event‐based supervisory control scheme, in the Ramadge–Wonham framework, is proposed for the cyber‐physical Waterway Lock system, known as Lock III, in Tilburg, the Netherlands. The proposed control scheme imposes desired behavior, by appropriately disabling controllable events, so as to avoid activation of actuator commands that may lead to undesired and potentially hazardous operating states
-
Uncertainty‐guided U‐Net for soil boundary segmentation using Monte Carlo dropout Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-09 X. Zhou, B. Sheil, S. Suryasentana, P. Shi
Accurate soil stratification is essential for geotechnical engineering design. Owing to its effectiveness and efficiency, the cone penetration test (CPT) has been widely applied for subsurface stratigraphy, which relies heavily on empiricism for correlations to soil type. Recently, deep learning techniques have shown great promise in learning the relationship between CPT data and soil boundaries automatically
-
Maize precision seeding scheme based on multi-sensor information fusion J. Ind. Inf. Integr. (IF 10.4) Pub Date : 2024-12-08 Chunji Xie, Li Yang, Xiantao He, Tao Cui, Dongxing Zhang, Hongsheng Li, Tianpu Xiao, Haoyu Wang
Seeding plays a crucial role in agricultural production. Traditional mechanized seeding suffers from inefficiency, low precision, and lack of control, which makes it inadequate for the demands of modern precision agriculture, such as high speed, high precision, and real-time control. Therefore, this study proposes a precision seeding scheme based on multi-sensor information fusion
-
Modality-missing RGBT Tracking: Invertible Prompt Learning and High-quality Benchmarks Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-07 Andong Lu, Chenglong Li, Jiacong Zhao, Jin Tang, Bin Luo
Current RGBT tracking research relies on complete multi-modality input, but modal information may be missing due to factors such as thermal sensor self-calibration and data transmission errors, which this work calls the modality-missing challenge. To address this challenge, we propose a novel invertible prompt learning approach, which integrates the content-preserving prompts into a well-trained tracking
-
Object Pose Estimation Based on Multi-precision Vectors and Seg-Driven PnP Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-07 Yulin Wang, Hongli Li, Chen Luo
Object pose estimation based on a single RGB image has wide application potential but is difficult to achieve. Existing pose estimation involves various inference pipelines. One popular pipeline is to first use Convolutional Neural Networks (CNN) to predict 2D projections of 3D keypoints in a single RGB image and then calculate the 6D pose via a Perspective-n-Point (PnP) solver. Due to the gap between
-
Multimodal-information-based optimized agricultural prescription recommendation system of crop electronic medical records J. Ind. Inf. Integr. (IF 10.4) Pub Date : 2024-12-07 Chang Xu, Junqi Ding, Bo Wang, Yan Qiao, Lingxian Zhang, Yiding Zhang
Multimodal Crop Electronic Medical Records (CEMRs) contain complex information, including disease symptoms, crop conditions, environmental factors, and diagnostic prescriptions, making them crucial for intelligent prescription recommendations. However, effectively integrating complementary features from different CEMRs modalities has remained a key challenge. Current CEMRs research primarily focuses
-
Integrated end-to-end multilingual method for low-resource agglutinative languages using Cyrillic scripts J. Ind. Inf. Integr. (IF 10.4) Pub Date : 2024-12-06 Akbayan Bekarystankyzy, Abdul Razaque, Orken Mamyrbayev
Millions of individuals across the world use automatic speech recognition (ASR) systems every day to dictate messages, operate gadgets, begin searches, and enable data entry in tiny devices. The engagement in these circumstances is determined by the accuracy of the voice transcriptions and the system's response. A second barrier to natural engagement for multilingual users is the monolingual nature
-
CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-05 Yuanyuan Jiang, Jianqin Yin
While vision-language pretrained models (VLMs) excel in various multimodal understanding tasks, their potential in fine-grained audio-visual reasoning, particularly for audio-visual question answering (AVQA), remains largely unexplored. AVQA presents specific challenges for VLMs due to the requirement of visual understanding at the region level and seamless integration with audio modality. Previous
-
Portrait Shadow Removal Using Context-aware Illumination Restoration Network IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-05 Jiangjian Yu, Ling Zhang, Qing Zhang, Qifei Zhang, Daiguo Zhou, Chao Liang, Chunxia Xiao
-
Computational modeling of reinforced concrete dapped‐end beams Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-04 Danilo D'Angela, Gennaro Magliulo, Chiara Di Salvatore, Edoardo Cosenza
The structural response of reinforced concrete dapped‐end beams is simulated through finite element analysis. The case study consists in experimental tests performed in the framework of an Italian research project on bridges. The study assesses both the local and global behavior of the beam and characterizes the damage patterns. A blind prediction is initially performed inputting the main basic material
-
Cover Image, Volume 39, Issue 24 Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-04
-
Issue Information Comput. Aided Civ. Infrastruct. Eng. (IF 8.5) Pub Date : 2024-12-04
-
Instance-dependent Label Distribution Estimation for Learning with Label Noise Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-02 Zehui Liao, Shishuai Hu, Yutong Xie, Yong Xia
Noise transition matrix estimation is a promising approach for learning with label noise. It can infer clean posterior probabilities, known as Label Distribution (LD), based on noisy ones and reduce the impact of noisy labels. However, this estimation is challenging, since the ground truth labels are not always available. Most existing methods estimate a global noise transition matrix using either
-
CrossEI: Boosting Motion-oriented Object Tracking with An Event Camera IEEE Trans. Image Process. (IF 10.8) Pub Date : 2024-12-03 Zhiwen Chen, Jinjian Wu, Weisheng Dong, Leida Li, Guangming Shi
-
Empowering robotic training with kinesthetic learning and digital twins in human–centric industrial systems J. Ind. Inf. Integr. (IF 10.4) Pub Date : 2024-12-03 Thien Tran, Quang Nguyen, Toan Luu, Minh Tran, Jonathan Kua, Thuong Hoang, Man Dien
This paper presents a human-centric mixed reality (MR) collaborative training platform that employs a kinesthetic learning technique in industrial robotic training, specifically focusing on robot pick–and–place (RPP) operations. Collaborating with ABB Robotics Vietnam, we conducted a user study to investigate the user experiences and practical perceptions of university students and novice trainees
-
A robotic skill transfer learning framework of dynamic manipulation for fabric placement Comput. Ind. (IF 8.2) Pub Date : 2024-12-03 Tianyu Fu, Cheng Li, Yunfeng Bai, Fengming Li, Jiang Wu, Chaoqun Wang, Rui Song
Placing fabric poses a challenge to robots since fabric with high dimensional configuration space can deform during manipulation. Existing methods for placing fabric mostly rely on static operations, which are inefficient and require a large workspace. Therefore, this study applies dynamic manipulation (manipulating uncontrollable parts of the fabric by swinging) to fabric placement, proposing a novel
-
ReFusion: Learning Image Fusion from Reconstruction with Learnable Loss Via Meta-Learning Int. J. Comput. Vis. (IF 11.6) Pub Date : 2024-12-02 Haowen Bai, Zixiang Zhao, Jiangshe Zhang, Yichen Wu, Lilun Deng, Yukun Cui, Baisong Jiang, Shuang Xu
Image fusion aims to combine information from multiple source images into a single one with more comprehensive informational content. Deep learning-based image fusion algorithms face significant challenges, including the lack of a definitive ground truth and the corresponding distance measurement. Additionally, current manually defined loss functions limit the model’s flexibility and generalizability