IEEE Transactions on Pattern Analysis and Machine Intelligence

Paper

Armor: Shielding Unlearnable Examples Against Data Augmentation

Xueluan Gong·+7 more

1/12/2026

Private data, when published online, may be collected by unauthorized parties to train deep neural networks (DNNs). To protect privacy, defensive noise can be added to original samples to degrade their learnability by DNNs. Recently, unlearnable examples (Huang et al., 2021) were proposed to minimize the training loss such that the model learns almost nothing. However, raw data are often pre-proc…

Adversarial Robustness in Machine Learning · Artificial Intelligence · Computer Science · Physical Sciences

Paper

CFSM: A Novel Causal Feature Selection Module for Two-Dimensional Out-of-Distribution Generalization

Lin Zhu·+7 more

1/12/2026

In real-world scenarios, training and test data are often collected in diverse settings, leading to domain shifts arising from evolving environments and selection bias. While causality-inspired methods have shown promising results in tackling the out-of-distribution (OOD) generalization issue, prior methods treat the discovered differences across domains as confounding variables. While effective …

Artificial Intelligence · Bayesian Modeling and Causal Inference · Computer Science · Physical Sciences

Paper

Learning Physics-Informed Noise Models from Dark Frames for Low-Light Raw Image Denoising

Hansen Feng·+5 more

1/12/2026

Recently, the mainstream practice for training low-light raw image denoising methods has shifted towards employing synthetic data. Noise modeling, which focuses on characterizing the noise distribution of real-world sensors, profoundly influences the effectiveness and practicality of synthetic data. Currently, physics-based noise modeling struggles to characterize the entire real noise distributi…

Computer Science · Computer Vision and Pattern Recognition · Image and Signal Denoising Methods · Physical Sciences

Paper

Prompt is All You Need: Prompting Foundation Models for Large-scale Self-supervised Semantic Segmentation

Jiaojiao Su·+7 more

1/1/2026

This paper addresses the important and challenging task of large-scale unsupervised semantic segmentation (LUSS). We present the first attempt to unleash the power of foundation models (FMs) for the challenging, dense prediction task LUSS, and our main objective is to present simple, effective yet efficient solutions for LUSS, namely Prompting foundation models for LUSS (PLUSS). Firstly, we propo…

Artificial Intelligence · Computer Science · Physical Sciences · Topic Modeling

Paper

Stability and Generalization for Distributed SGDA

Miaoxi Zhu·+4 more

1/1/2026

Minimax optimization is gaining increasing attention in modern machine learning applications. Driven by large-scale models and massive volumes of data collected from edge devices, as well as the need to preserve client privacy, distributed minimax optimization algorithms have become popular, such as Local Stochastic Gradient Descent Ascent (Local-SGDA) and Local Decentralized SGDA (Local-DSGDA). W…

Artificial Intelligence · Computer Science · Physical Sciences · Stochastic Gradient Optimization Techniques

Paper

Out-of-Sight Embodied Agents: Multimodal Tracking, Sensor Fusion, and Trajectory Forecasting

J. Zhang·...·Yi Xu

1/1/2026

Trajectory prediction is a fundamental problem in computer vision, vision-language-action models, world models, and autonomous systems, with broad impact on applications including autonomous driving, robotics, and surveillance. Most existing approaches assume observations are complete and relatively clean, and thus do not adequately address out-of-sight agents or the intrinsic noise in sensing mod…

Artificial Intelligence · Computer Science · Physical Sciences · Target Tracking and Data Fusion in Sensor Networks

Paper

Modality Equilibrium Matters: Minor-Modality-Aware Adaptive Alternating for Cross-Modal Memory Enhancement

Xiang Shi·+6 more

1/1/2026

Multimodal fusion is susceptible to modality imbalance, where dominant modalities overshadow weak ones, easily leading to biased learning and suboptimal fusion, especially for incomplete modality conditions. To address this problem, we introduce an Equilibrium Deviation Metric (EDM) to quantify this imbalance and verify, in both theoretical and empirical terms, that the optimization order of moda…

Artificial Intelligence · Computer Science · Neural Networks and Reservoir Computing · Physical Sciences

Paper

One-Step Diffusion and Flow Distillation through Implicit Generator Matching

Zemin Huang·...·Zhengyang Geng

1/1/2026

Despite strong performances on many generative tasks, diffusion and flow matching models require a large number of sampling steps to generate high-quality images. This has motivated the community to develop effective methods to distill pre-trained models into more efficient models. In this paper, we present Implicit Generator Matching (IGM), a systematic approach to distill both pre-trained diffu…

Control and Systems Engineering · Engineering · Physical Sciences · Process Optimization and Integration

Paper

Fast and Scalable Hashing-Based Universal Graph Coarsening

Mohit Kataria·...·Nikita Malik

1/1/2026

Large graphs are becoming ubiquitous, presenting significant computational hurdles in data processing and analysis. Graph Coarsening algorithms are frequently employed to condense large graphs while preserving key graph properties. Real-world graphs also have features or contexts associated with each node. However, existing coarsening methods often overlook simultaneity across node features and s…

Computer Science · Computer Vision and Pattern Recognition · Graph Theory and Algorithms · Physical Sciences

Paper

Graph Condensation via Homophily Node Refining and Fine-Grained Distribution Matching

Ruiwen Yuan·...·Y. A. Tang

1/1/2026

The remarkable success of GNNs has provoked the challenge of high computational and memory overhead when training with large-scale graphs. As a promising solution, graph condensation is committed to constructing synthetic graphs with significantly smaller size, which are expected to preserve the essential characteristics of the original ones. During this process, a core problem is how to accurate…

Computer Science · Computer Vision and Pattern Recognition · Graph Theory and Algorithms · Physical Sciences

Paper

Principled Multimodal Representation Learning

Xiaohao Liu·...·Xiaobo Xia

1/1/2026

Multimodal representation learning seeks to create a unified representation space by integrating diverse data modalities to improve multimodal understanding. Traditional methods often depend on pairwise contrastive learning, which relies on a predefined anchor modality, restricting alignment across all modalities. Recent advances have investigated the simultaneous alignment of multiple modalities…

Advanced Graph Neural Networks · Artificial Intelligence · Computer Science · Physical Sciences

Paper

Track-On2: Enhancing Online Point Tracking with Memory

Görkay Aydemir·...·Weidi Xie

1/1/2026

In this paper, we consider the problem of long-term point tracking, which requires consistent identification of points across video frames under significant appearance changes, motion, and occlusion. We target the online setting, i.e., tracking points frame-by-frame, making it suitable for real-time and streaming applications. We extend our prior model Track-On into Track-On2, a simple and effici…

Computer Science · Computer Vision and Pattern Recognition · Physical Sciences · Video Surveillance and Tracking Methods

Paper

Class-Distribution-Aware Pseudo-Labeling for Semi-Supervised Multi-Label Learning

Min Xie·+5 more

1/1/2026

Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data. However, in the context of semi-supervised multi-label learning (SSMLL), conventional pseudo-labeling methods encounter difficulties when dealing with instances associated with multiple labels and an unknown label count. These limitations often result in the introduction of false positive labels or the n…

Artificial Intelligence · Computer Science · Physical Sciences · Text and Document Classification Technologies

Paper

Improved and Accelerated Text-to-Image Generation with Collect, Reflect, and Refine

Shitong Shao·+7 more

1/1/2026

Recently, enhancing the generative capability of text-to-image (T2I) models has become a promising direction in both academia and industry. Prior studies often focused on either improving generative quality or reducing inference latency, but typically failed to improve both quality and speed simultaneously. Moreover, existing inference-enhancement methods do not achieve significant improvements s…

Computer Science · Computer Vision and Pattern Recognition · Generative Adversarial Networks and Image Synthesis · Physical Sciences

Paper

Bridging Datasets and Hyperparameters: GCN-Based Link Prediction for Recommendation

Liping Deng·Mingqing Xiao

1/1/2026

Hyperparameter recommendation through meta-learning (HPR-MtL) has proven effective in a wide range of studies. At its core, HPR-MtL constructs a recommendation model using metadata extracted from historical learning tasks, such as dataset characteristics and the empirical performance of hyperparameter configurations. Existing approaches, typically based on k-nearest neighbors (KNN), linear regress…

Advanced Graph Neural Networks · Artificial Intelligence · Computer Science · Physical Sciences

Paper

Concept Drift and Long-Tailed Distribution in Fine-Grained Visual Categorization: Benchmark and Method

Shuo Ye·+6 more

1/1/2026

Data is the foundation for the development of computer vision, and the establishment of datasets plays an important role in advancing the techniques of fine-grained visual categorization (FGVC). In the existing FGVC datasets used in computer vision, it is generally assumed that each collected instance has fixed characteristics and the distribution of different categories is relatively balanced. I…

Artificial Intelligence · Computer Science · Data Stream Mining Techniques · Physical Sciences

Paper

Sparse Variational Information Bottleneck Gaussian Processes for Uncertainty Estimation

Liang Mao·Shiliang Sun

1/1/2026

Inducing-point-based sparse variational approximation scales Gaussian process models to large datasets but tends to overestimate observation noise and underestimate posterior variance. The parametric predictive Gaussian process regressor (PPGPR) improves point-wise uncertainty estimates, especially for heteroskedastic data, by repairing a mismatch between the training loss and the predictive met…

Artificial Intelligence · Computer Science · Gaussian Processes and Bayesian Inference · Physical Sciences

Paper

Mirror Descent Safe Policy Optimization for Reinforcement Learning Agents

Renzhi Lu·+7 more

1/1/2026

Embodied intelligence and related disciplines have identified several mechanisms that help embodied agents learn how to solve complex problems. Reinforcement learning (RL) is one of the most promising computational approaches toward enhancement of the learning-based problem-solving abilities of such agents. Given the recent rapid evolution of artificial intelligence, RL has become a keystone tech…

Artificial Intelligence · Computer Science · Physical Sciences · Reinforcement Learning in Robotics

Paper

HIMO: Cross-Arbitrary-Modality Image Invariant Feature Transform with Hierarchical Intrinsic Major Orientation

Chenzhong Gao·+5 more

1/1/2026

Invariant feature extraction is a critical challenge in intelligent image processing, particularly with the rapid advancement of multi-source/modal imaging. Cross-modal matching has attracted considerable attention, yet current studies primarily focus on targeted modalities rather than realizing a general approach. In this paper, cross-arbitrary-modal image invariant feature extraction and matchi…

Computer Science · Computer Vision and Pattern Recognition · Medical Image Segmentation Techniques · Physical Sciences

Paper

Revisiting Face Forgery Detection: From Facial Representation to Forgery Detection

Zonghui Guo·+4 more

1/1/2026

Face Forgery Detection (FFD), or Deepfake detection, aims to determine whether a digital face is real or fake. Due to different face synthesis algorithms with diverse forgery patterns, FFD models often overfit specific patterns in training datasets, resulting in poor generalization to other unseen forgeries. Existing FFD methods primarily leverage pre-trained backbones with general image represen…

Computer Science · Computer Vision and Pattern Recognition · Digital Media Forensic Detection · Physical Sciences
