Today, Google Cloud introduced new AI-oriented instances, powered by its own Axion CPUs and Ironwood TPUs. The new instances target both training and low-latency inference of large-scale AI models; their key feature is efficient scaling, enabled by the very large scale-up world size of Google's Ironwood-based systems.
Millions of Ironwood TPUs for training and inference
Ironwood is Google's 7th Generation tensor processing unit (TPU), delivering 4,614 FP8 TFLOPS of performance and equipped with 192 GB of HBM3E memory offering up to 7.37 TB/s of bandwidth. Ironwood pods scale up to 9,216 AI accelerators, delivering a total of 42.5 FP8 ExaFLOPS for training and inference, far exceeding the FP8 capability of Nvidia's GB300 NVL72 system, which stands at 0.36 ExaFLOPS. The pod is interconnected using a proprietary 9.6 Tb/s Inter-Chip Interconnect network and carries roughly 1.77 PB of HBM3E memory in total, once again exceeding what Nvidia's competing platform can offer.
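The pod-level figures follow directly from the per-chip specs quoted above; a quick back-of-the-envelope check in Python (the variable names are illustrative, not Google's):

```python
# Per-chip Ironwood specs as quoted in the article
chips_per_pod = 9_216
fp8_tflops_per_chip = 4_614    # FP8 TFLOPS per TPU
hbm_per_chip_gb = 192          # HBM3E capacity per TPU, in GB

# Aggregate pod-level figures
pod_exaflops = chips_per_pod * fp8_tflops_per_chip / 1e6   # TFLOPS -> ExaFLOPS
pod_hbm_pb = chips_per_pod * hbm_per_chip_gb / 1e6          # GB -> PB (decimal)

print(f"{pod_exaflops:.1f} FP8 ExaFLOPS per pod")   # ~42.5
print(f"{pod_hbm_pb:.2f} PB of HBM3E per pod")      # ~1.77
```

Both aggregates line up with the quoted 42.5 ExaFLOPS and 1.77 PB pod totals.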
Ironwood pods — based on Axion CPUs and Ironwood TPUs — can be joined into clusters running hundreds of thousands of TPUs, which form part of Google's aptly dubbed AI Hypercomputer, an integrated supercomputing platform uniting compute, storage, and networking under one management layer. To boost the reliability of both ultra-large pods and the AI Hypercomputer, Google uses its reconfigurable fabric, called Optical Circuit Switching, which instantly routes around any hardware interruption to sustain continuous operation.
IDC data credits the AI Hypercomputer model with an average 353% three-year ROI, 28% lower IT spending, and 55% higher operational efficiency for enterprise customers.
Several companies are already adopting Google's Ironwood-based platform. Anthropic plans to use as many as one million TPUs to operate and expand its Claude model family, citing major cost-to-performance gains. Lightricks has also begun deploying Ironwood to train and serve its LTX-2 multimodal system.
Axion CPUs: Google finally deploys in-house designed processors
Although AI accelerators like Google's Ironwood tend to steal all the thunder in the AI era of computing, CPUs remain crucial for application logic and service hosting, as well as for running some AI workloads, such as data ingestion. So, along with its 7th Generation TPUs, Google is also deploying its first Armv9-based general-purpose processors, named Axion.