AI company Runway has announced what it calls its first world model, GWM-1. It’s a significant step in a new direction for a company that has made its name primarily on video generation, and part of a wider gold rush to build the next frontier of models as large language models and image and video generation move into a refinement phase.
GWM-1 is a blanket term for a trio of autoregressive models, each built on top of Runway’s Gen-4.5 text-to-video generation model and then post-trained with domain-specific data for different kinds of applications. Here’s what each does.
Runway’s world model announcement livestream video.
GWM Worlds
GWM Worlds offers an interface for exploring digital environments with real-time user input that affects how upcoming frames are generated, which Runway says can remain consistent and coherent “across long sequences of movement.”
Users can define the nature of the world—what it contains and how it appears—as well as rules like physics. They can then issue actions or changes that are reflected in real time, such as camera movements or descriptions of changes to the environment or the objects in it. As the methodology here is basically an advanced form of frame prediction, it might be a stretch to say these are full-on world simulations, but the claim is that they’re reliable enough to be usable as such.
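To make the frame-prediction framing concrete, here is a minimal sketch of what an action-conditioned autoregressive loop like this looks like in general. The `WorldModel` class, its methods, and the action format are hypothetical illustrations of the pattern, not Runway’s actual API, which it has not published in this form.

```python
from dataclasses import dataclass

# Hypothetical types for illustration only; not Runway's API.

@dataclass
class Frame:
    """A single generated video frame (e.g., encoded RGB data)."""
    pixels: bytes

class WorldModel:
    """Sketch of an action-conditioned autoregressive frame predictor.

    Each new frame is predicted from the world description, the frames
    generated so far, and the user's latest action (camera move, text edit).
    """

    def __init__(self, world_prompt: str, rules: str):
        self.world_prompt = world_prompt      # what the world contains and how it appears
        self.rules = rules                    # e.g., physics constraints
        self.history: list[Frame] = []        # frames generated so far

    def step(self, action: str) -> Frame:
        # A real system would run the model here; this placeholder only
        # records the call and appends an empty frame to the history.
        frame = Frame(pixels=b"")
        self.history.append(frame)
        return frame

# Interactive loop: user actions steer generation frame by frame.
world = WorldModel(
    world_prompt="a rainy medieval market square at dusk",
    rules="gravity applies; objects persist when off-screen",
)
for action in ["pan camera left", "walk toward the fountain", "add a passing cart"]:
    frame = world.step(action)
```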
Potential applications include pre-visualization and early iteration for game design and development, generation of virtual reality environments, and educational explorations of historical spaces.
There’s also a major use case that takes this outside Runway’s usual area of focus: World models like this can be used to train AI agents of various types, including robots.
GWM Robotics
The second model, GWM Robotics, does just that. It can be used “to generate synthetic training data that augments your existing robotics datasets across multiple dimensions, including novel objects, task instructions, and environmental variations.”
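As a rough illustration of what augmenting a robotics dataset “across multiple dimensions” can mean in practice, the sketch below enumerates variations of a base demonstration along the three axes Runway names. Every name here is hypothetical; a world model would then render each enumerated episode as synthetic video, which this sketch does not do.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class Episode:
    """One hypothetical synthetic training episode derived from a base demo."""
    base_demo: str          # identifier of the recorded robot demonstration
    novel_object: str       # object swapped into the scene
    instruction: str        # rephrased or new task instruction
    environment: str        # lighting / background / clutter variation

def augment(base_demo: str,
            objects: list[str],
            instructions: list[str],
            environments: list[str]) -> list[Episode]:
    """Cross the variation axes to build a list of episode specifications."""
    return [
        Episode(base_demo, obj, instr, env)
        for obj, instr, env in product(objects, instructions, environments)
    ]

episodes = augment(
    base_demo="pick_place_mug_001",
    objects=["ceramic mug", "steel tumbler", "paper cup"],
    instructions=["pick up the cup", "move the cup to the tray"],
    environments=["bright lab lighting", "dim warehouse", "cluttered kitchen counter"],
)
print(len(episodes))  # 3 * 2 * 3 = 18 episode specifications
```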