Out-of-Sight Embodied Agents: Multimodal Tracking, Sensor Fusion, and Trajectory Forecasting

Trajectory prediction is a fundamental problem in computer vision, vision-language-action models, world models, and autonomous systems, with broad impact on applications including autonomous driving, robotics, and surveillance. Most existing approaches assume observations are complete and relatively clean, and thus do not adequately address out-ofsight agents or the intrinsic noise in sensing modalities (e.g., sensor measurements) caused by restricted camera coverage, occlusions, and the lack of