Cloud Run GPUs, now GA, makes running AI workloads easier for everyone

Published on: 2025-06-11 10:28:46

Developers love Cloud Run, Google Cloud’s serverless runtime, for its simplicity, flexibility, and scalability. And today, we’re thrilled to announce that NVIDIA GPU support for Cloud Run is now generally available, offering a powerful runtime for a variety of use cases that’s also remarkably cost-efficient. Now, you can enjoy the following benefits across both GPUs and CPUs:

Pay-per-second billing: You are only charged for the GPU resources you consume, down to the second.

Scale to zero: Cloud Run automatically scales your GPU instances down to zero when no requests are received, eliminating idle costs. This is a game-changer for sporadic or unpredictable workloads.

Rapid startup and scaling: Go from zero to an instance with a GPU and drivers installed in under 5 seconds, allowing your applications to respond to demand very quickly. For example, when scaling from zero (cold start), we achieved an impressive Time-to-First-Token of approximately 19 seconds for a gemma3:4b model (th ...
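
To see why pay-per-second billing combined with scale to zero matters for sporadic workloads, here is a minimal Python sketch that compares an always-on GPU instance with one billed only for the seconds it is up. The per-second rate, the function names, and the traffic pattern are all hypothetical placeholders, not Cloud Run prices or APIs, and the model is simplified: it ignores startup time and any idle period before an instance scales down.

```python
from typing import Iterable

# Hypothetical placeholder, NOT an actual Cloud Run price.
HYPOTHETICAL_GPU_RATE_PER_SECOND = 0.0002  # currency units per GPU-second

SECONDS_PER_DAY = 24 * 60 * 60


def always_on_cost(rate_per_second: float, days: int) -> float:
    """Cost of keeping one GPU instance running for every second of every day."""
    return rate_per_second * SECONDS_PER_DAY * days


def scale_to_zero_cost(rate_per_second: float,
                       busy_seconds_per_day: Iterable[int]) -> float:
    """Cost when only the seconds an instance is actually up are billed.

    Simplification: each entry is the total billed seconds for one day;
    time with no instances running costs nothing.
    """
    return rate_per_second * sum(busy_seconds_per_day)


if __name__ == "__main__":
    # Assumed traffic: 30 minutes of GPU time per day for 30 days.
    busy = [30 * 60] * 30
    on = always_on_cost(HYPOTHETICAL_GPU_RATE_PER_SECOND, days=30)
    stz = scale_to_zero_cost(HYPOTHETICAL_GPU_RATE_PER_SECOND, busy)
    print(f"always on:     {on:.2f}")
    print(f"scale to zero: {stz:.2f}")
```

Under these assumptions the gap depends only on the ratio of busy seconds to total seconds, so the actual rate cancels out of the comparison; substitute real pricing and measured usage before drawing conclusions.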