ORACLE-Grasp: Zero-Shot Affordance-Aligned Robotic Grasping Using Large Multimodal Models

Abstract

Grasping unknown objects in unstructured environments is a critical challenge for service robots, which must operate in dynamic, real-world settings such as homes, hospitals, and warehouses. Success in these environments requires both semantic understanding and spatial reasoning. Traditional methods often rely on dense training datasets or detailed geometric modeling, which demand extensive data collection and do not generalize well to novel objects or affordances. We present ORACLE-Gra