Microsoft CTO says he wants to swap most AMD and Nvidia GPUs for homemade chips

Microsoft buys a lot of GPUs from both Nvidia and AMD. But moving forward, Redmond's leaders want to shift the majority of the company's AI workloads from GPUs to its own homegrown accelerators.

The software titan is rather late to the custom silicon party. While Amazon and Google have been building custom CPUs and AI accelerators for years, Microsoft only revealed its Maia AI accelerators in late 2023.

Driving the transition is a focus on performance per dollar, which for a hyperscale cloud provider is arguably the only metric that really matters. Speaking during a fireside chat moderated by CNBC on Wednesday, Microsoft CTO Kevin Scott said that up to this point, Nvidia has offered the best price-performance, but he's willing to entertain anything in order to meet demand.

Going forward, Scott suggested Microsoft hopes to use its homegrown chips for the majority of its datacenter workloads. When asked, "Is the longer term idea to have mainly Microsoft silicon in the data center?" Scott responded, "Yeah, absolutely."

Later, he told CNBC, "It's about the entire system design. It's the networks and cooling, and you want to be able to have the freedom to make decisions that you need to make in order to really optimize your compute for the workload."

With its first in-house AI accelerator, the Maia 100, Microsoft was able to free up GPU capacity by shifting OpenAI's GPT-3.5 to its own silicon back in 2023. However, with just 800 teraFLOPS of BF16 performance, 64GB of HBM2e, and 1.8TB/s of memory bandwidth, the chip fell well short of competing GPUs from Nvidia and AMD.

Microsoft is reportedly in the process of bringing a second-generation Maia accelerator to market next year that will no doubt offer more competitive compute, memory, and interconnect performance.

But while we may see the mix in Microsoft datacenters shift from GPUs toward AI ASICs moving forward, the custom parts are unlikely to replace Nvidia's and AMD's chips entirely.

Over the past few years, Google and Amazon have deployed tens of thousands of their TPUs and Trainium accelerators. While these chips have helped them secure some high-profile customer wins, Anthropic for example, they are more often used to accelerate the companies' own in-house workloads. As such, we continue to see large-scale Nvidia and AMD GPU deployments on these cloud platforms, in part because customers still want them.

It should be noted that AI accelerators aren't the only custom chips Microsoft has been working on. Redmond also has its own CPU called Cobalt and a whole host of platform security silicon designed to accelerate cryptography and safeguard key exchanges across its vast datacenter domains. ®
