Gemini models are now available in Batch Mode

Today, we're excited to introduce Batch Mode in the Gemini API, a new asynchronous endpoint designed specifically for high-throughput, non-latency-critical workloads. The Gemini API Batch Mode allows you to submit large jobs, offload the scheduling and processing, and retrieve your results within 24 hours, all at a 50% discount compared to our synchronous APIs.

Process more for less

Batch Mode is the perfect tool for any task where you have your data ready upfront and don't need an immediate response. By separating these large jobs from your real-time traffic, you unlock three key benefits:

- Cost savings: Batch jobs are priced at 50% less than the standard rate for a given model.
- Higher throughput: Batch Mode has higher rate limits than our synchronous APIs.
- Easy API calls: No need to manage complex client-side queuing or retry logic; results are returned within a 24-hour window.

A simple workflow for large jobs

We've designed the API to be simple and intuitive. You package all your requests into a single file, submit it, and retrieve your results once the job is complete (see the sketch below). Here are some ways developers are leveraging Batch Mode for tasks today:
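As a point of reference, here is a minimal sketch of that submit, poll, and retrieve flow, assuming the google-genai Python SDK's batch interface (client.files.upload, client.batches.create, client.batches.get, client.files.download). The JSONL request schema, the "jsonl" mime type, the model ID, and the display names shown here are illustrative and should be checked against the current Gemini API documentation.

```python
# Rough sketch of the Batch Mode workflow: write requests to a JSONL file,
# submit the file as a batch job, poll until it finishes, then download results.
# Method names and the request schema follow the google-genai SDK docs as of
# writing; verify against the current API reference before relying on them.
import json
import time

from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

# 1. Package all requests into a single JSONL file (one request per line).
#    Each line carries a "key" so outputs can be matched back to inputs.
prompts = [
    "Summarize the history of the telescope.",
    "Explain beam search in two sentences.",
]
with open("batch_requests.jsonl", "w") as f:
    for i, prompt in enumerate(prompts):
        line = {
            "key": f"request-{i}",
            "request": {"contents": [{"parts": [{"text": prompt}]}]},
        }
        f.write(json.dumps(line) + "\n")

# 2. Upload the file and submit the batch job.
uploaded = client.files.upload(
    file="batch_requests.jsonl",
    config=types.UploadFileConfig(
        display_name="batch-requests",  # hypothetical display name
        mime_type="jsonl",
    ),
)
job = client.batches.create(
    model="gemini-2.5-flash",  # hypothetical model choice
    src=uploaded.name,
    config={"display_name": "my-first-batch-job"},
)

# 3. Poll until the job reaches a terminal state (results arrive within 24 hours).
terminal_states = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}
while job.state.name not in terminal_states:
    time.sleep(60)
    job = client.batches.get(name=job.name)

# 4. Retrieve the results file; each output line includes the matching "key".
if job.state.name == "JOB_STATE_SUCCEEDED":
    results = client.files.download(file=job.dest.file_name)
    print(results.decode("utf-8"))
```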