
Hollywood’s pivot to AI video has a prompting problem



It has become almost impossible to browse the internet without having an AI-generated video thrust upon you. Open basically any social media platform, and it won’t be long until an uncanny-looking clip of a fake natural disaster or animals doing impossible things slides across your screen. Most of the videos look absolutely terrible. But they’re almost always accompanied by hundreds, if not thousands, of likes and comments from people insisting that AI-generated content is a new art form that’s going to change the world.

That has been especially true of AI clips that are meant to appear realistic. No matter how strange or aesthetically inconsistent the footage may be, there is usually someone proclaiming that it’s something the entertainment industry should be afraid of. The idea that AI-generated video is both the future of filmmaking and an existential threat to Hollywood has spread like wildfire among boosters of the relatively new technology.

The thought of major studios embracing this technology as-is feels dubious when you consider that AI models’ output often simply isn’t the kind of material that could be fashioned into a quality movie or series. That’s an impression that filmmaker Bryn Mooser wants to change with Asteria, a new production house he launched last year, as well as a forthcoming AI-generated feature film from Natasha Lyonne (also Mooser’s partner and an advisor at Late Night Labs, a studio focused on generative AI that Mooser’s film and TV company XTR acquired last year).

Asteria’s big selling point is that, unlike most other AI outfits, the generative model it built with research company Moonvalley is “ethical,” meaning it has only been trained on properly licensed material. Especially in the wake of Disney and Universal suing Midjourney for copyright infringement, the concept of ethical generative AI may become an important part of how AI is more widely adopted throughout the entertainment industry. But during a recent chat, Mooser stressed to me that what really sets Asteria apart from other players in the AI space is the company’s clear understanding of what generative AI is and what it isn’t.

“As we started to think about building Asteria, it was obvious to us as filmmakers that there were big problems with the way that AI was being presented to Hollywood,” Mooser says. “It was obvious that the tools weren’t being built by anybody who’d ever made a film before. The text-to-video form factor, where you say ‘make me a new Star Wars movie’ and out it comes, is a thing that Silicon Valley thought people wanted and actually believed was possible.”

In Mooser’s view, part of the reason some enthusiasts have been quick to call generative video models a threat to traditional film workflows boils down to people assuming that footage created from prompts can replicate the real thing as effectively as what we’ve seen with imitative, AI-generated music. It has been easy for people to replicate singers’ voices with generative AI and produce passable songs. But Mooser thinks that, in its rush to normalize gen AI, the tech industry conflated audio and visual output in a way that’s at odds with what actually makes for good films.

“You can’t go and say to Christopher Nolan, ‘Use this tool and text your way to The Odyssey,’” Mooser says. “As people in Hollywood got access to these tools, there were a couple things that were really clear — one being that the form factor can’t work because the amount of control that a filmmaker needs comes down to the pixel level in a lot of cases.”

To give its filmmaking partners more of that granular control, Asteria uses its core generative model, Marey, to create new, project-specific models trained on original visual material. This would, for example, allow an artist to build a model that could generate a variety of assets in their distinct style, and then use it to populate a world full of different characters and objects that adhere to a unique aesthetic. That was the workflow Asteria used in its production of musician Cuco’s animated short “A Love Letter to LA.” By training Asteria’s model on 60 original illustrations drawn by artist Paul Flores, the studio could generate new 2D assets and convert them into 3D models used to build the video’s fictional town. The short is impressive, but its heavy stylization speaks to the way projects with generative AI at their core often have to work within the technology’s visual limitations. It doesn’t feel like this workflow offers control down to the pixel level just yet.
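The article doesn’t detail how these project-specific models are built on top of Marey, but the general recipe Mooser describes (adapting a licensed base model to a small set of original illustrations so it can generate new assets in that style) resembles a standard style fine-tune. Below is a minimal sketch of that idea using an open diffusion model with LoRA adapters; the base model ID, trigger phrase, folder name, and every hyperparameter are illustrative assumptions, not Asteria’s actual pipeline.

```python
# Illustrative sketch only: fine-tuning an open diffusion model on a small set of
# licensed style images via LoRA adapters. Model IDs, the trigger phrase, paths,
# and hyperparameters are assumptions for demonstration, not Asteria's pipeline.
from pathlib import Path

import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, DDPMScheduler, UNet2DConditionModel
from peft import LoraConfig, get_peft_model
from PIL import Image
from torchvision import transforms
from transformers import CLIPTextModel, CLIPTokenizer

BASE = "runwayml/stable-diffusion-v1-5"        # stand-in for a licensed base model
STYLE_DIR = Path("style_illustrations")        # e.g. a few dozen source drawings
PROMPT = "an illustration in <project-style>"  # hypothetical trigger phrase
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the frozen components of the base pipeline.
tokenizer = CLIPTokenizer.from_pretrained(BASE, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(BASE, subfolder="text_encoder").to(device)
vae = AutoencoderKL.from_pretrained(BASE, subfolder="vae").to(device)
unet = UNet2DConditionModel.from_pretrained(BASE, subfolder="unet").to(device)
noise_scheduler = DDPMScheduler.from_pretrained(BASE, subfolder="scheduler")
for frozen in (text_encoder, vae):
    frozen.requires_grad_(False)

# Wrap only the UNet's attention projections with small LoRA adapters, so the
# per-project "model" is a few megabytes of weights on top of the shared base.
lora_cfg = LoraConfig(r=8, lora_alpha=8, target_modules=["to_q", "to_k", "to_v", "to_out.0"])
unet = get_peft_model(unet, lora_cfg)
optimizer = torch.optim.AdamW((p for p in unet.parameters() if p.requires_grad), lr=1e-4)

to_tensor = transforms.Compose([
    transforms.Resize(512), transforms.CenterCrop(512),
    transforms.ToTensor(), transforms.Normalize([0.5], [0.5]),
])
images = [to_tensor(Image.open(p).convert("RGB")) for p in sorted(STYLE_DIR.glob("*.png"))]
ids = tokenizer(PROMPT, padding="max_length", max_length=tokenizer.model_max_length,
                truncation=True, return_tensors="pt").input_ids.to(device)

unet.train()
for step in range(1000):
    pixels = images[step % len(images)].unsqueeze(0).to(device)
    # Encode the image to latents, add noise, and train the adapters to predict it.
    latents = vae.encode(pixels).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    t = torch.randint(0, noise_scheduler.config.num_train_timesteps, (1,), device=device)
    noisy = noise_scheduler.add_noise(latents, noise, t)
    cond = text_encoder(ids)[0]
    pred = unet(noisy, t, encoder_hidden_states=cond).sample
    loss = F.mse_loss(pred, noise)  # standard noise-prediction objective
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

unet.save_pretrained("project_style_lora")  # saves only the adapter weights
```

The appeal of adapter-style training for this kind of work is that each project “model” is a small file of extra weights layered on a shared base, which lines up with the ownership and licensing arrangements Mooser describes next.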

Mooser says that, depending on the financial arrangement between Asteria and its clients, filmmakers can retain partial ownership of the models after they’re completed. In addition to the original licensing fees Asteria pays the creators of the material its core model is trained on, the studio is “exploring” the possibility of a revenue-sharing system as well. But for now, Mooser is more focused on winning artists over with the promise of lower initial development and production costs.
