Has the hunt for AI compute uncovered the next Cerebras?

The raging demand for computers to run AI models has only accelerated, but there are two major obstacles that anyone in the business needs to overcome: getting the right chips, and getting them into data centers where they can start generating revenue.

General Compute, a new inference neocloud (a company that rents out AI processing power, specializing in the phase when models are running and responding to users rather than being trained), has answers to both problems, and they illuminate where the AI ecosystem is headed. Those answers helped it raise a $15 million seed round at a $60 million post-money valuation, led by FUSE VC with participation from Carya Venture Partners and Village Global Ventures.

First, what is the right chip? The demand for GPUs has gone through the roof, but it’s becoming conventional wisdom that they aren’t the best-suited chips for running AI models once they have been trained. The phase of AI where a model is actively generating responses has different computational requirements than training, and a new class of chips is being designed specifically for it. Nvidia’s $20 billion Groq transaction in December and Cerebras’ $57 billion IPO last week point the way.

With capacity strained at both those companies, the co-founders of General Compute, CEO Finn Puklowski and CTO Jason Goodison, found another option. They’re turning to specialized chips built by SambaNova, an Intel-backed chipmaker focused on inference that has fallen a bit out of the Silicon Valley conversation.

That may change when SambaNova releases its new chips this year. The architecture is more flexible and uses more memory to store context during inference calculations, and SambaNova claims that it outperforms not just GPUs but also other specialized chips built by the likes of Groq or Cerebras. Puklowski says the new chips will generate 600 to 700 tokens per second, versus about 250 tokens per second for GPUs.

General Compute has $300 million of the company’s SN50 chips on order and says it will be the first neocloud deploying them.

These chips also help solve the second big problem — where to put them — for General Compute: They are air-cooled, not water-cooled, and consume less power, so they can be installed in existing data center facilities without new infrastructure investments.

Puklowski is pursuing colocation deals — arrangements where General Compute installs its hardware in someone else’s facility — not just with data center providers, but also with crypto miners looking to repurpose their infrastructure as the cost of producing a bitcoin has often exceeded its price.

General Compute launched its cloud offering last week, claiming it is already the fastest at running MiniMax 2.7, a powerful open-source LLM.

Joe Hasselmann is a venture investor who got in on the ground floor of the inference boom when he invested in Groq in 2021. This year, he launched a new fund, Evercrest Capital Partners, focused on the AI space, and made General Compute his first investment. Hasselmann sees in SambaNova’s partnership with General Compute parallels to CoreWeave’s relationship with Nvidia, and to the pairing of Groq’s chip-making with its former cloud offering.
