We’re introducing an efficient, on-device robotics model with general-purpose dexterity and fast task adaptation.
In March, we introduced Gemini Robotics, our most advanced vision-language-action (VLA) model, bringing Gemini 2.0’s multimodal reasoning and real-world understanding into the physical world.
Today, we’re introducing Gemini Robotics On-Device, our most powerful VLA model optimized to run locally on robotic devices. It shows strong general-purpose dexterity and task generalization while running efficiently on the robot itself.
Since the model operates independently of a data network, it’s helpful for latency-sensitive applications and remains robust in environments with intermittent or no connectivity.
We’re also sharing a Gemini Robotics SDK to help developers easily evaluate Gemini Robotics On-Device on their own tasks and environments, test the model in our MuJoCo physics simulator, and quickly adapt it to new domains with as few as 50 to 100 demonstrations. Developers can access the SDK by signing up for our trusted tester program.
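The SDK’s data formats aren’t described here, so as a purely hypothetical sketch, here is what collecting the 50–100 demonstrations for domain adaptation might look like as a data structure. The `Demonstration` class and `build_adaptation_set` helper are illustrative assumptions, not part of the Gemini Robotics SDK:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch only: the real SDK's demonstration format is not
# shown in this post. This illustrates the scale of data involved.

@dataclass
class Demonstration:
    """One teleoperated episode: paired observations and bi-arm actions."""
    instruction: str                 # natural-language task description
    observations: List[list] = field(default_factory=list)  # e.g. camera frames, proprioception
    actions: List[list] = field(default_factory=list)       # e.g. joint or end-effector targets

def build_adaptation_set(demos: List[Demonstration],
                         minimum: int = 50) -> List[Demonstration]:
    """Check that enough demonstrations were collected before fine-tuning."""
    if len(demos) < minimum:
        raise ValueError(f"need at least {minimum} demos, got {len(demos)}")
    return demos

# Usage: 50 placeholder demos for a hypothetical towel-folding task.
demos = [Demonstration(instruction="fold the towel") for _ in range(50)]
dataset = build_adaptation_set(demos)
print(len(dataset))  # 50
```

The point of the check is simply that adaptation in this regime needs on the order of tens of episodes, not the large corpora typical of training from scratch.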
Model capabilities and performance
Gemini Robotics On-Device is a robotics foundation model for bi-arm robots, engineered to require minimal computational resources. It builds on the task generalization and dexterity capabilities of Gemini Robotics and is:
Designed for rapid experimentation with dexterous manipulation.
Adaptable to new tasks through fine-tuning.
Optimized to run locally with low-latency inference.
Gemini Robotics On-Device achieves strong visual, semantic and behavioral generalization across a wide range of testing scenarios, follows natural language instructions, and completes highly dexterous tasks like unzipping bags or folding clothes — all while operating directly on the robot.