Apple Machine Learning Research
This paper was accepted at the Fifth Workshop on Natural Language Generation, Evaluation, and Metrics at ACL 2026. Tool-calling agents are evaluated on tool selection, parameter accuracy, and scope recognition, yet LLM trajectory assessments remain inherently post-hoc. Disconnected from the active execution loop, such assessments identify errors that are usually addressed through prompt-tuning or…
Normalizing flows (NFs) are end-to-end likelihood-based generative models for continuous data, and have recently regained attention with encouraging progress on image generation. Yet in the video generation domain, where spatiotemporal complexity and computational cost are substantially higher, state-of-the-art systems almost exclusively rely on diffusion-based models. In this work, we revisit th…
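As background on the likelihood objective (standard normalizing-flow machinery, not a detail of this paper's video model), the change-of-variables formula gives an exact log-likelihood under an invertible map. A minimal numpy sketch with an elementwise affine flow, where all parameter names are illustrative stand-ins for learned values:

import numpy as np

# Toy invertible map: z = (x - shift) * exp(-log_scale), applied elementwise.
rng = np.random.default_rng(0)
log_scale, shift = rng.normal(size=4), rng.normal(size=4)

def log_likelihood(x):
    z = (x - shift) * np.exp(-log_scale)                  # inverse map x -> z
    log_prior = -0.5 * np.sum(z**2 + np.log(2 * np.pi))  # standard normal prior on z
    log_det = -np.sum(log_scale)                          # log |det dz/dx| of the map
    return log_prior + log_det                            # change-of-variables formula

print(log_likelihood(rng.normal(size=4)))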
AI-driven sign language interpretation is limited by a lack of high-quality annotated data. New datasets, including ASL STEM Wiki and FLEURS-ASL, feature professional interpreters and hundreds of hours of data but remain only partially annotated and thus underutilized, in part due to the prohibitive cost of annotating at this scale. In this work, we develop a pseudo-annotation pipeline that takes sign…
Apple is presenting new research at the annual International Conference on Acoustics, Speech and Signal Processing (ICASSP), which takes place in person in Barcelona, Spain, from May 4 to 8. We are proud to again sponsor the conference, which brings together the scientific and industrial research communities focused on signal processing and its applications. Below is an overview of Apple’s parti…
Recent advances in test-time compute for large language models (LLMs) have introduced the capability to perform intermediate chain-of-thought (CoT) reasoning (thinking) before generating answers. While increasing the thinking budget yields smooth performance improvements at inference time, the relationship between LLM capability, query complexity, and optimal budget allocation remains poorly unders…
Generative models are often deployed to make decisions on behalf of users, such as vision-language models (VLMs) identifying which person in a room is a doctor to help visually impaired individuals. Yet, VLM decisions are influenced by the perceived demographic attributes of people in the input, which can lead to biased outcomes like failing to identify women as doctors. Moreover, when reducing b…
Conditional diffusion models appear capable of compositional generalization, i.e., generating convincing samples for out-of-distribution combinations of conditioners, but the mechanisms underlying this ability remain unclear. To make this concrete, we study length generalization, the ability to generate images with more objects than seen during training. In a controlled CLEVR setting (Johnson et …
We present StereoFoley, a video-to-audio generation framework that produces semantically aligned, temporally synchronized, and spatially accurate stereo sound at 48 kHz. While recent generative video-to-audio models achieve strong semantic and temporal fidelity, they largely remain limited to mono or fail to deliver object-aware stereo imaging, constrained by the lack of professionally mixed, spa…
Large Language Models (LLMs) demonstrate their reasoning ability through chain-of-thought (CoT) generation. However, autoregressive decoding may limit an LLM's ability to revisit and refine earlier tokens in a holistic manner, which can also lead to inefficient exploration of diverse solutions. In this paper, we propose LaDiR (Latent Diffusion Reasoner), a novel reasoning framework that unifies…
Understanding and predicting motion is a fundamental component of visual intelligence. Although modern video models exhibit strong comprehension of scene dynamics, exploring multiple possible futures through full video synthesis remains prohibitively inefficient. We model scene dynamics orders of magnitude more efficiently by directly operating on a long-term motion embedding that is learned from…
Recurrent Neural Networks (RNNs) are naturally suited to efficient inference, requiring far less memory and compute than attention-based architectures, but the sequential nature of their computation has historically made it impractical to scale RNNs to billions of parameters. A new advance from Apple researchers makes RNN training dramatically more efficient, enabling large-scale training…
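The efficiency gap this post refers to comes from the shape of the computation. As a generic illustration (not Apple's specific method): a recurrent model carries a fixed-size state across tokens, while attention must retain a cache that grows with sequence length:

import numpy as np

d, T = 8, 1000
rng = np.random.default_rng(0)
W_h, W_x = 0.1 * rng.normal(size=(d, d)), 0.1 * rng.normal(size=(d, d))

h = np.zeros(d)                       # RNN state: O(d) memory, independent of T
kv_cache = []                         # attention analogue: O(T * d) memory
for _ in range(T):
    x = rng.normal(size=d)
    h = np.tanh(W_h @ h + W_x @ x)    # constant memory and compute per token
    kv_cache.append(x)                # attention keeps every past token around

print(h.shape, len(kv_cache))         # (8,) versus 1000 cached entries

The same sequential dependence of h on the previous step is what makes RNN training hard to parallelize, which is the bottleneck the post says this work addresses.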
Apple is advancing AI and ML with fundamental research, much of which is shared through publications and engagement at conferences in order to accelerate progress in this important field and support the broader community. This week, the Fourteenth International Conference on Learning Representations (ICLR) will be held in Rio de Janeiro, Brazil, and Apple is proud to again participate in this imp…
Understanding context is key to understanding human language, an ability that Large Language Models (LLMs) have increasingly been shown to demonstrate to an impressive extent. However, although LLM evaluation spans many domains within Natural Language Processing, limited attention has been paid to probing their linguistic capability to understand contextual features…
Recent work has shown that probing model internals can reveal a wealth of information not apparent from the model generations. This poses the risk of unintentional or malicious information leakage, where model users are able to learn information that the model owner assumed was inaccessible. Using vision-language models as a testbed, we present the first systematic comparison of information retai…
Apple is presenting new research at the annual International Conference on Learning Representations (ICLR), which takes place in person in Rio de Janeiro, Brazil, from April 23 to 27. We are proud to again sponsor the conference, which brings together the scientific and industrial research communities focused on deep learning. Below is an overview of Apple’s participation at ICLR 2026.
This paper was accepted at the Workshop on Navigating and Addressing Data Problems for Foundation Models (NADPFM) at ICLR 2026. Principled domain reweighting can substantially improve sample efficiency and downstream generalization; however, data-mixture optimization for multimodal pretraining remains underexplored. Current multimodal training recipes tune mixtures from only a single perspective …
This paper was accepted at the Workshop on Navigating and Addressing Data Problems for Foundation Models at ICLR 2026. Large language models (LLMs) can struggle to memorize factual knowledge in their parameters, often leading to hallucinations and poor performance on knowledge-intensive tasks. In this paper, we formalize fact memorization from an information-theoretic perspective and study how tr…
We consider the privacy amplification properties of a sampling scheme in which a user’s data is used in k steps chosen randomly and uniformly from a sequence (or set) of t steps. This sampling scheme has been recently applied in the context of differentially private optimization (Chua et al., 2024a; Choquette-Choo et al., 2025) and communication-efficient high-dimensional private aggregation (Asi…
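Concretely, the scheme assigns each user's data to k of the t steps uniformly at random. A minimal simulation of the sampling (illustrative only; variable names are mine, and the privacy analysis itself is in the paper):

import numpy as np

rng = np.random.default_rng(0)
t, k, n_users = 100, 3, 5

# Each user participates in exactly k steps drawn uniformly, without
# replacement, from the t available steps.
participation = {
    user: sorted(rng.choice(t, size=k, replace=False)) for user in range(n_users)
}
print(participation)

# A given step then touches only the users assigned to it; this random,
# partial participation is what drives the amplification argument.
step = participation[0][0]
print([u for u, steps in participation.items() if step in steps])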
Apple is presenting new research at the annual ACM (Association for Computing Machinery) CHI Conference on Human Factors in Computing Systems, which takes place in person in Barcelona, Spain, from April 13 to 17. We are proud to again sponsor the conference, which brings together the scientific and industrial research communities focused on human-computer interaction. Below is an overview of Appl…