Silicon Valley bets big on ‘environments’ to train AI agents

For years, Big Tech CEOs have touted visions of AI agents that can autonomously use software applications to complete tasks for people. But take today’s consumer AI agents out for a spin, whether it’s OpenAI’s ChatGPT Agent or Perplexity’s Comet, and you’ll quickly realize how limited the technology still is. Making AI agents more robust may take a new set of techniques that the industry is still discovering.

One of those techniques is carefully simulating workspaces, known as reinforcement learning (RL) environments, where agents can be trained on multi-step tasks. Just as labeled datasets powered the last wave of AI, RL environments are starting to look like a critical element in the development of agents.

AI researchers, founders, and investors tell TechCrunch that leading AI labs are now demanding more RL environments, and there’s no shortage of startups hoping to supply them.

“All the big AI labs are building RL environments in-house,” said Jennifer Li, general partner at Andreessen Horowitz, in an interview with TechCrunch. “But as you can imagine, creating these datasets is very complex, so AI labs are also looking at third party vendors that can create high quality environments and evaluations. Everyone is looking at this space.”

The push for RL environments has minted a new class of well-funded startups, such as Mechanize Work and Prime Intellect, that aim to lead the space. Meanwhile, large data-labeling companies like Mercor and Surge say they’re investing more in RL environments to keep pace with the industry’s shift from static datasets to interactive simulations. The major labs are considering investing heavily too: according to The Information, leaders at Anthropic have discussed spending more than $1 billion on RL environments over the next year.

The hope for investors and founders is that one of these startups emerges as the “Scale AI for environments,” referring to the $29 billion data-labeling powerhouse that powered the chatbot era.

The question is whether RL environments will truly push the frontier of AI progress.


What is an RL environment?

At their core, RL environments are training grounds that simulate what an AI agent would be doing in a real software application. One founder, in a recent interview, described building them as “like creating a very boring video game.”
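To make the idea concrete, here is a minimal sketch of what such a training ground could look like in code. It is purely illustrative and not drawn from any lab's or startup's actual product: the `FormFillEnv` class, its action format, and the `run_episode` helper are hypothetical names invented for this example. The simulated "application" is a two-field form, the agent acts by typing into fields or submitting, and the environment returns a reward only if the submitted form matches the target.

```python
from dataclasses import dataclass, field


@dataclass
class FormFillEnv:
    """Hypothetical toy environment: a simulated form the agent must fill and submit."""

    # Assumed task definition: the values a correct submission must contain.
    target: dict = field(
        default_factory=lambda: {"name": "Ada", "email": "ada@example.com"}
    )
    max_steps: int = 10
    form: dict = field(default_factory=dict)
    steps: int = 0

    def reset(self) -> dict:
        """Start a new episode with an empty form and return the first observation."""
        self.form = {key: "" for key in self.target}
        self.steps = 0
        return dict(self.form)

    def step(self, action: tuple) -> tuple:
        """Apply one action and return (observation, reward, done).

        Assumed action format: ("type", field_name, text) or ("submit",).
        """
        self.steps += 1
        reward, done = 0.0, False
        if action[0] == "type" and len(action) == 3 and action[1] in self.form:
            self.form[action[1]] = action[2]
        elif action[0] == "submit":
            done = True
            # Reward is given only when the whole form is correct at submission.
            reward = 1.0 if self.form == self.target else 0.0
        if self.steps >= self.max_steps:
            done = True  # Cut the episode off if the agent takes too long.
        return dict(self.form), reward, done


def run_episode(env: FormFillEnv, actions: list) -> float:
    """Replay a fixed list of actions and return the total reward collected."""
    env.reset()
    total = 0.0
    for action in actions:
        _, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total
```

In a real training setup, the fixed action list would be replaced by an agent choosing each action from the observation, and the reward would feed a learning algorithm. The `reset`/`step` shape used here follows a common convention for RL environments, but the details above are assumptions made for illustration.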
