The current wave of generative AI animation often feels like a magic trick that only works once. You type in a prompt, a video appears, and if you don't like the result -- maybe the feet are all wonky, which is a regular issue with AI generations -- your only real option is to try a different prompt. This "black box" approach is exactly what Cartwheel, a new 3D animation startup, is trying to dismantle.
Cartwheel was founded by Andrew Carr and Jonathan Jarvis, two veterans with roots at OpenAI and Google, respectively. The company is working to build a future where AI handles the technical drudgery of animation while leaving the creative soul to the artist.
I spoke with Carr and Jarvis about launching their company, defining "taste" with AI, and the technical and creative difficulties of animation in 2026.
What sets Cartwheel apart
According to the founders, one of the biggest hurdles in this space is that 3D motion data is remarkably scarce compared with the endless oceans of online text and images that AI models are trained on.
"If you look at all the big tech companies, they've built their models on written language, audio, image, [and] video because there's just so much of it, so finding those patterns is much easier," Jarvis said. "We knew it was going to be hard, but it turns out to be harder than we thought by probably a factor of 10 or 100 to get that data."
While tech giants focus on generating final pixels, Cartwheel has spent years mapping how humans actually move. Its models are built to understand the nuances of a performance so that a simple 2D video of someone dancing in their backyard can be translated into a precise, realistic 3D skeleton.
This shift from flat images to 3D assets is what gives animators the control they have been missing in the AI era.