Our eighth-generation TPUs: two chips for the agentic era

Today at Google Cloud Next, we are introducing the eighth generation of Google's custom Tensor Processing Unit (TPU), coming soon with two distinct, purpose-built architectures for training and inference: TPU 8t and TPU 8i. These two chips are designed to power our custom-built supercomputers and to drive everything from cutting-edge model training and agent development to massive inference workloads. TPUs have been powering leading foundation models, including Gemini, for years. Together, these eighth-generation TPUs will deliver scale, efficiency and capabilities across training, serving and agentic workloads.

In this age of AI agents, models must reason through problems, execute multi-step workflows and learn from their own actions in continuous loops. This places a new set of demands on infrastructure, and TPU 8t and TPU 8i were designed in partnership with Google DeepMind to take on the most demanding AI workloads and adapt to evolving model architectures at scale.

TPUs set the standard for a number of ML supercomputing components, including custom numerics, liquid cooling, custom interconnects and more, and our eighth-generation TPUs are the culmination of more than a decade of development. The key insight behind the original TPU design continues to hold today: by customizing and co-designing silicon with hardware, networking and software, including model architecture and application requirements, we can deliver dramatically better power efficiency and absolute performance.

We are thrilled to see how a decade of innovation translates into real-world breakthroughs. Today, pioneering organizations like Citadel Securities are pushing the boundaries of what's possible, choosing TPUs to power their cutting-edge AI workloads: