Gemini models are now available in Batch Mode

Today, we're excited to introduce Batch Mode in the Gemini API, a new asynchronous endpoint designed specifically for high-throughput, non-latency-critical workloads. The Gemini API Batch Mode allows you to submit large jobs, offload the scheduling and processing, and retrieve your results within 24 hours, all at a 50% discount compared to our synchronous APIs.

Process more for less

Batch Mode is the perfect tool for any task where you have your data ready upfront and don't need an immediate response. By separating these large jobs from your real-time traffic, you unlock three key benefits:

- Cost savings: Batch jobs are priced at 50% less than the standard rate for a given model.
- Higher throughput: Batch Mode has higher rate limits than our synchronous APIs.
- Easy API calls: No need to manage complex client-side queuing or retry logic; results are returned within a 24-hour window.

A simple workflow for large jobs

We've designed the API to be simple and intuitive. You package all your requests into a single file, submit it, and retrieve your results once the job is complete (see the sketch below). Here are some ways developers are leveraging Batch Mode for tasks today:
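As a point of reference, here is a minimal sketch of that submit, poll, and retrieve flow, assuming the google-genai Python SDK's batch interface (client.files.upload, client.batches.create, client.batches.get, client.files.download). The JSONL request schema, the "jsonl" mime type, the model ID, and the display names shown here are illustrative and should be checked against the current Gemini API documentation.

```python
# Rough sketch of the Batch Mode workflow: write requests to a JSONL file,
# submit the file as a batch job, poll until it finishes, then download results.
# Method names and the request schema follow the google-genai SDK docs as of
# writing; verify against the current API reference before relying on them.
import json
import time

from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

# 1. Package all requests into a single JSONL file (one request per line).
#    Each line carries a "key" so outputs can be matched back to inputs.
prompts = [
    "Summarize the history of the telescope.",
    "Explain beam search in two sentences.",
]
with open("batch_requests.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        line = {
            "key": f"request-{i}",
            "request": {"contents": [{"parts": [{"text": prompt}]}]},
        }
        f.write(json.dumps(line) + "\n")

# 2. Upload the file and submit the batch job.
uploaded = client.files.upload(
    file="batch_requests.jsonl",
    config=types.UploadFileConfig(
        display_name="batch-requests",  # hypothetical display name
        mime_type="jsonl",
    ),
)
job = client.batches.create(
    model="gemini-2.5-flash",  # hypothetical model choice
    src=uploaded.name,
    config={"display_name": "my-first-batch-job"},
)

# 3. Poll until the job reaches a terminal state (results arrive within 24 hours).
terminal_states = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}
while job.state.name not in terminal_states:
    time.sleep(60)
    job = client.batches.get(name=job.name)

# 4. Retrieve the results file; each output line includes the matching "key".
if job.state.name == "JOB_STATE_SUCCEEDED":
    results = client.files.download(file=job.dest.file_name)
    print(results.decode("utf-8"))
```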