IEEE Transactions on Image Processing

Paper

Active Style-Content Dual-Branch Domain Adaptation for Semi-Supervised SAR Object Detection

Xi Yang·...·Quantao Xie

1/1/2026

Synthetic Aperture Radar (SAR) images offer unique advantages in all-weather, all-day remote sensing, but the high acquisition costs and time-consuming annotation processes limit their widespread implementation. Semi-supervised domain adaptation leverages abundant annotated optical images and a small number of labeled SAR images to achieve great performance on SAR images. However, existing semi-s…

Advanced SAR Imaging Techniques · Aerospace Engineering · Engineering · Physical Sciences

Paper

Extending the field of view in modulation-based X-ray phase microtomography

Dominik John·+11 more

1/1/2026

Recent advances in propagation-based phase-contrast imaging, such as hierarchical imaging, have enabled the visualization of internal structures in large biological specimens and material samples. However, modulation-based techniques, which provide quantitative electron density information, face challenges when imaging larger objects due to stringent beam stability requirements and detector disto…

Advanced X-ray Imaging Techniques · Physical Sciences · Physics and Astronomy · Radiation

Paper

Unfolding High-order Correlations for Interpretable Multi-contrast MRI Super-resolution

Qiangqiang Shen·+5 more

1/1/2026

Deep unfolding networks have gained significant attention for magnetic resonance imaging super-resolution (MRI SR) due to their performance and interpretability. However, 1) existing methods predominantly focus on cross-contrast correlations while neglecting high-order correlations embedded within spatially adjacent slices in volumetric MRI data. 2) Their degradation models are optimized via the prox…

Advanced Image Processing Techniques · Computer Science · Computer Vision and Pattern Recognition · Physical Sciences

Paper

Com-PCQA: No-Reference Point Cloud Quality Assessment via Complex-valued Feature Learning

Jingxuan Su·+5 more

1/1/2026

The visual quality of point clouds is critical for perception-centric immersive media. Point Cloud Quality Assessment (PCQA) is crucial for reducing costs associated with human evaluation, optimizing compression pipelines, and enhancing human visual perception. However, real-valued PCQA methods often struggle to capture the coupled geometric and perceptual cues that govern quality. Com-PCQA, a nove…

Artificial Intelligence · Computer Science · Physical Sciences · Stochastic Gradient Optimization Techniques

Paper

POSITION: Open World 3D Scene CAD Recomposition

Rongkun Yang·+8 more

1/1/2026

3D scene CAD recomposition aims to reconstruct a given scene by retrieving and assembling CAD models from a database, so as to accurately simulate the geometric properties and spatial arrangement of the original environment. Recent methods learn this task through training on limited scan-to-CAD annotation data, which hinders their generalization to diverse real-world scenes. In this paper, we pro…

Advanced Vision and Imaging · Computer Science · Computer Vision and Pattern Recognition · Physical Sciences

Paper

Multi-granularity Facial Emotional Representation with Unlabeled Data and Textual Supervision

Kaishen Yuan·+7 more

1/1/2026

Facial expressions (FEs) and action units (AUs) are facial emotional representations at different levels of granularity. In the past, recognizing them has often been treated as two separate tasks. There are also some methods that use the knowledge of one to aid in recognizing the other, but currently, unified models capable of recognizing both FEs and AUs simultaneously remain rare. In this paper…

Emotion and Mood Recognition · Experimental and Cognitive Psychology · Psychology · Social Sciences

Paper

Toward Generalizable Forgery Detection and Reasoning

Yang Gao·+7 more

1/1/2026

Accurate and interpretable detection of AI-generated images is essential for mitigating risks associated with AI misuse. However, the substantial domain gap among generative models makes it challenging to develop a generalizable forgery detection model. Moreover, since every pixel in an AI-generated image is synthesized, traditional saliency-based forgery explanation methods are not well suited f…

Adversarial Robustness in Machine Learning · Artificial Intelligence · Computer Science · Physical Sciences

Paper

Less is More: Infrared and Visible Images Fusion via Semantic-Guided Mixture of Multi-Feature Experts

Yinghui Xing·+4 more

1/1/2026

Infrared (IR) and visible image fusion (IVIF) has become prevalent in recent years. By leveraging the complementary characteristics of infrared and visible images, we can obtain visually-appealing fused images, which further facilitate subsequent scene understanding and object detection from day to night. Integrating complementary information while simultaneously eliminating redundancy is a cruci…

Advanced Image Fusion Techniques · Engineering · Media Technology · Physical Sciences

Paper

High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model

Wuning Zhong·+5 more

1/1/2026

Audio-driven talking face video generation has attracted increasing attention due to its huge industrial potential. Some previous methods focus on learning a direct mapping from audio to visual content. Despite progress, they often struggle with the ambiguity of the mapping process, leading to flawed results. An alternative strategy involves facial structural representations (e.g., facial landmar…

Computer Science · Computer Vision and Pattern Recognition · Face recognition and analysis · Physical Sciences

Paper

Diff-MEF: Cross-Modal Diffusion Framework With Text Prompts and Semantic Perception for Multi-Exposure Image Fusion

X. Xu·+4 more

1/1/2026

The absence of real-world ground truth (GT) remains a challenge in multi-exposure image fusion (MEF). Benchmarks synthesize pseudo GT through algorithm ensembles. Existing methods, hampered by inherent imperfections of pseudo GT and fixed mapping relationships, show limited performance and robustness. To address the limitations, we propose a novel cross-modal diffusion framework that synergizes…

Advanced Image Fusion Techniques · Engineering · Media Technology · Physical Sciences

Paper

Symmetric Image-Text Tuning With Entropy-Guided Fusion for Online Continual Learning in Non-Stationary Visual Streams

Leyuan Wang·+5 more

1/1/2026

Online continual learning studies how models learn from continuous and non-stationary data streams. In this paper, we observe that CLIP models exhibit an asymmetric image-text interaction under online continual learning. Specifically, text features of previously seen classes may introduce unfavorable supervision when paired with visual features of newly observed data, leading to catastrophic forg…

Artificial Intelligence · Computer Science · Domain Adaptation and Few-Shot Learning · Physical Sciences

Paper

Causality-Based Modality- and Platform-Invariant Representation Learning for Dynamic RGBT Tracking and a Benchmark

Zhaodong Ding·...·Shun Miao

1/1/2026

Each sequence in existing RGBT tracking datasets is typically captured from a single platform equipped with both RGB (visible light) and TIR (thermal infrared) sensors. In real-world applications, tracking some objects requires cross-platform collaboration and these platforms might be equipped with different sensors. However, changes in modalities and platforms may cause significant variations in…

Adversarial Robustness in Machine Learning · Artificial Intelligence · Computer Science · Physical Sciences

Paper

Decoupling Target Semantics via Text-Anchored Visual Contrast for Semi-Supervised Medical Image Segmentation

Qingjie Zeng·+7 more

1/1/2026

Semi-supervised learning (SSL) provides an effective means of reducing reliance on large-scale annotated datasets by leveraging unlabeled data. However, existing SSL methods often struggle with semantic ambiguity, especially under limited supervision. Recent studies have incorporated textual information to provide contextual guidance, yet most focus on feature fusion rather than emphasizing targe…

Computer Science · Computer Vision and Pattern Recognition · Multimodal Machine Learning Applications · Physical Sciences

Paper

Second-Order Robust Iterative Pose Optimization for Fine-Grained Cross-View Localization

Mingtao Feng·+5 more

1/1/2026

Fine-grained cross-view localization seeks to estimate precise camera poses by matching ground images with GPS-tagged aerial imagery. Existing methods typically employ first-order iterative optimization to progressively update the camera pose based on cross-view feature correspondences. However, they rely on local features and neglect global and complementary contextual information, making them p…

Control and Systems Engineering · Engineering · Physical Sciences · Robotic Mechanisms and Dynamics

Paper

Dark-EvGS: Event Camera as an Eye for Radiance Field in the Dark

Jingqian Wu·+5 more

1/1/2026

In low-light environments, conventional cameras often struggle to capture clear multi-view images of objects due to dynamic range limitations and motion blur caused by long exposure. Event cameras, with their high-dynamic range and high-speed properties, have the potential to mitigate these issues. Additionally, 3D Gaussian Splatting (GS) enables radiance field reconstruction, facilitating bright…

Nuclear and High Energy Physics · Particle Detector Development and Performance · Physical Sciences · Physics and Astronomy

Paper

View-Adaptive Multi-Granularity Anchor Learning for Multi-View Clustering

Xiaohui Wei·+5 more

1/1/2026

Multi-view clustering (MVC) based on anchor learning has been proven to be effective in improving clustering accuracy and efficiency. Existing MVC methods are mainly based on single-granularity anchor learning, that is, the number of anchors corresponding to different views is constant and consistent, which will lead to information redundancy or insufficient mining. In addition, aggregating ancho…

Advanced Clustering Algorithms Research · Artificial Intelligence · Computer Science · Physical Sciences

Paper

MCIB: Multi-Modal Complementary Information Bottleneck for Hyperspectral and LiDAR Classification

Xiao Pan·+5 more

1/1/2026

The effective fusion of multi-modal remote sensing images, particularly hyperspectral imagery (HSI) and light detection and ranging (LiDAR) data, is pivotal for accurate land use and land cover (LULC) classification. However, this process is hindered by two inherent challenges: pervasive data redundancy and the underutilization of cross-modal complementarity, largely due to the lack of a unifying…

Engineering · Media Technology · Physical Sciences · Remote-Sensing Image Classification

Paper

SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing

Rong-Cheng Tu·+5 more

1/1/2026

Although video generation and editing models have advanced significantly, individual models remain restricted to specific tasks, often failing to meet diverse user needs. Effectively coordinating these models in pipelines can unlock a wide range of video generation and editing capabilities. However, manual orchestration is complex, time-consuming, and requires deep expertise in model performance …

Computer Science · Computer Vision and Pattern Recognition · Generative Adversarial Networks and Image Synthesis · Physical Sciences

Paper

DCL: Dynamic Causal Learning for Cross-Modality Cardiac Image Segmentation

Saidi Guo·+7 more

1/1/2026

Accurate cross-modality cardiac image segmentation is essential for effectively diagnosing and treating heart disease. Different imaging modalities help to determine suitable pre-procedure planning. However, most methods face the difficulty of spatial-temporal confounding, where the anatomy element and modality element of cardiac images are intertwined across both spatial and temporal dimensions.…

Computer Science · Computer Vision and Pattern Recognition · Medical Image Segmentation Techniques · Physical Sciences

Paper

Underdetermined Blind Source Separation via Weighted Simplex Shrinkage Regularization and Quantum Deep Image Prior

Chia-Hsiang Lin·Si-Sheng Young

1/1/2026

As most optical satellites remotely acquire multispectral images (MSIs) with limited spatial resolution, multispectral unmixing (MU) becomes a critical signal processing technology for analyzing the pure material spectra for high-precision classification and identification. Unlike the widely investigated hyperspectral unmixing (HU) problem, MU is much more challenging as it corresponds to the und…

Engineering · Media Technology · Physical Sciences · Remote-Sensing Image Classification
