Most companies that have fully committed to building AI models are gobbling up every Nvidia AI accelerator they can get, but Google has taken a different approach: the bulk of its cloud AI infrastructure is built on its line of custom Tensor Processing Units (TPUs). After announcing the seventh-gen Ironwood TPU in 2025, the company has moved on to the eighth-gen version, but it’s not just a faster iteration of the same chip.
The new TPUs come in two flavors, providing Google and its customers with an AI platform that the company says is faster and more efficient. Google is pushing the idea that the “agent era” is fundamentally different from the AI systems that came before, necessitating a new approach to the hardware. So its engineers have devised the TPU 8t (for training) and the TPU 8i (for inference).
Before AI models become something you can use to analyze data or make silly memes, they need to be trained. The TPU 8t was designed specifically for this part of the AI lifecycle, with the aim of cutting training time for frontier AI models from months to weeks.
Updated TPU 8t server clusters, which Google calls “pods,” now house 9,600 chips with two petabytes of shared high-bandwidth memory. Google claims the TPU 8t can even scale linearly, with up to a million chips in a single logical cluster. It’s innovations like this that are making super-sized AI models much faster while also driving up RAM prices for everyone else. But if you’re involved in building those giant AI models, all this hardware saves time, with an impressive 121 exaflops of FP4 compute per pod. That’s almost three times Ironwood’s training compute ceiling.
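For a rough sense of scale, those pod figures break down like this. This is a back-of-envelope sketch in Python using only the numbers above; the even per-chip split is our assumption, not a Google-published spec:

```python
# Back-of-envelope math from the pod figures above. The per-chip
# numbers are simple division, not published specifications.

CHIPS_PER_POD = 9_600
POD_HBM_PB = 2               # shared high-bandwidth memory per pod (PB)
POD_COMPUTE_EFLOPS = 121     # FP4 exaflops per pod

# Per-chip shares, assuming resources are split evenly across the pod.
hbm_per_chip_gb = POD_HBM_PB * 1_000_000 / CHIPS_PER_POD              # ~208 GB
compute_per_chip_pflops = POD_COMPUTE_EFLOPS * 1_000 / CHIPS_PER_POD  # ~12.6 PFLOPS

print(f"HBM per chip:     ~{hbm_per_chip_gb:.0f} GB")
print(f"FP4 compute/chip: ~{compute_per_chip_pflops:.1f} PFLOPS")

# If the claimed linear scaling holds, a million-chip logical cluster would reach:
cluster_eflops = compute_per_chip_pflops * 1_000_000 / 1_000          # ~12,600 EFLOPS
print(f"1M-chip cluster:  ~{cluster_eflops:,.0f} FP4 exaflops")
```

In other words, each chip gets roughly 208 GB of shared HBM and about 12.6 FP4 petaflops, and the million-chip cluster claim would put a single logical machine around 12,600 exaflops, if the linear-scaling claim holds up in practice.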