Google’s new anything-to-anything AI model is wild

is a senior reviewer with over a decade of experience writing about consumer tech. She has a special interest in mobile photography and telecom. Previously, she worked at DPReview.

Last year I deepfaked my kid’s stuffed animal to make it look like his plush deer was on vacation.

It was an experiment to see if I could re-create the events depicted in a Gemini ad Google was running, and I never showed the videos of Buddy the deer on his adventures to my four-year-old. But it was a revealing exercise that made me think a lot about the difference between some harmless fun with generative AI and full-on slop. Maybe that Venn diagram is a perfect circle! Maybe not. But what I know for sure is that the tools to make realistic videos are surprisingly good, requiring surprisingly little effort and know-how. And that trend is continuing hot into Gemini’s Omni era.

Omni is a new family of generative models that will allegedly one day be able to turn any kind of input — photo, video, text — into anything else. But for starters, it’s just creating video. Omni Flash is the first of these models Google has released, now available in the company’s AI video generation and editing platform, Flow. You can still use the previous model, Veo, if you want, but Omni improves on Veo in a few ways.

With Omni, you can upload a video and use that along with a text prompt as the starting point for your AI-generated creation. Google also claims Omni incorporates more real-world knowledge when producing videos and can do a better job of keeping characters consistent throughout a video as a result. There was only one way to really know if those claims are true: I brought back AI Buddy to pack his little AI-generated bags for another adventure.

The results are such a mixed bag they’re baffling. Some were very good — much more consistent and true to my prompt than when I was testing out Veo five months ago. But even the best clips Omni cooked up for me still have certain AI jump scares, like when Buddy suddenly switches orientation while he’s skydiving.

For another video, I gave Omni some artistic freedom. “Create a montage of Buddy packing for a vacation and embarking on a cruise ship for a tropical vacation. The mood is cute and playful. Buddy packs something funny in his suitcase that comes into play later in the clip.” It had Buddy pack a jar of honey; later in the clip he reaches for it as if it’s a bottle of sunscreen. “Uh oh,” the character says as he squirts honey onto his hoof.

Honestly, not a bad bit. Except that the bottle of honey constantly changes throughout the video, from a jar, to a clear squirt bottle filled with water, then back to a squeeze bottle filled with honey. And I can’t even begin to describe how the model came up with the final frame of the video — almost as if it just barfed up a bunch of elements of the sequence it just made.

You can use text-based prompts to suggest edits to your videos, and I’ll give Google credit: This works better with Omni than it did when I tested Veo 3. But the results were bad with Veo — so bad that I found it way easier to just prompt a new video from scratch every time I wanted something changed. Omni will actually take your edits on board, but the results don’t always hit.

I had it emphasize Buddy’s facial reactions in his vacation clips, and the results just wound up looking strange. It would also give Buddy antlers from time to time, which he does not have. Buddy is a baby, thank you very much. When I prompted it to remove the antlers that appeared in one scene, it obliged — and then added antlers in all the other ones.

... continue reading