Google is spending billions to turn its TPU chips into a real challenger to Nvidia

TL;DR: Google is pushing its in-house AI chips much more aggressively, turning years of tensor processing unit (TPU) development into a direct challenge to Nvidia's hold on the AI hardware market. For years, the company built the chips mostly for its own use: TPUs sat behind products like search and speech recognition, handling some of its heavier AI workloads. Now, Google is trying to turn that in-house advantage into a business that can stand up to Nvidia.

One clear example of that shift is in western New York at an AI data-center cluster called Lake Mariner, on Lake Ontario's southern shore near Niagara Falls. Alphabet's Google has provided a $3.2 billion financial guarantee for the project, whose developers plan to rent out computing power from thousands of Google's chips to Anthropic, according to people familiar with the matter who spoke to The Wall Street Journal.

The basic playbook is similar to Nvidia's: support data-center financing and then benefit when those sites buy your chips.

That kind of financing has become more important as the market for AI compute has tightened. Over the past year, the AI race has become less about models and more about sheer access to computing power. "You have all these very well-capitalized companies who are big believers that this market around compute is going to have tremendous value," said Nazar Khan, co-founder and chief technology officer of TeraWulf, which is developing Lake Mariner with FluidStack, a Google-backed cloud provider. "They want to be in the game, they don't want to be left behind," Khan told The WSJ.

The story behind Google's push goes back to 2013. Jeff Dean, now chief scientist at Google's DeepMind lab, recalled working on speech recognition systems built on the neural-network techniques that later evolved into today's large language models. "I said, 'OK, if we want to have this speech model that we roll out to 100 million users, and they use it a few minutes a day, that would require doubling the number of computers Google had,'" he said. "We need to build specialized hardware." That conclusion helped spur Google's TPU program, which has since produced multiple generations of the chips.

Google kept those chips to itself at first, then started offering them through Google Cloud as demand for AI computing exploded. That step helped drive growth in the cloud business and set the stage for more direct competition with Nvidia. Research firm SemiAnalysis asked in a November note whether the release of Google's seventh-generation TPU – which Anthropic uses to train its models – marked "the end of Nvidia's dominance."

The company's latest moves suggest it is willing to test that question. Google recently struck a $5 billion deal with Blackstone to create a new cloud-services business designed to compete with Nvidia-aligned providers such as CoreWeave and Nebius. It has also decided to sell chips directly to customers rather than only through its cloud and has rolled out its first TPU designed specifically for inference.

Mark Lohmeyer, vice president of AI and computing infrastructure for Google Cloud, said the new inference chip and improvements in how TPUs work across different systems have generated new interest in using them. "We're seeing a set of customers that might not have considered it in the past," he said.

Citadel Securities, a longtime Google Cloud client, recently began using TPUs for some of its research software. Josh Woods, the firm's chief technology officer, said the company can run key workloads at 30% lower cost and up to four times faster with TPUs.

Nvidia, for its part, is not treating TPUs as an existential threat. The company is still estimated to control more than 90% of the AI chip market, helped by its CUDA software stack and a hardware ecosystem that many AI labs already rely on.
