Abstract Artificial intelligence (AI) is accelerating the evolution of robotics from task-specific automation to general-purpose autonomy, enabling robots to perform high-level tasks in unstructured and dynamic environments. A key enabler of this evolution is the integration of AI with robotic vision systems, which provide accurate perception and contextual interpretation of complex surroundings. An important challenge in reaching this goal is to ensure computational efficiency while robust in
