My daughter Kate (7 years old) really loves Minecraft! Together, we used several generative AI tools to create a 1-minute animation from just a single input photo of her. The whole project took around 20 hours of work, and I learned several lessons that I want to share here.
Context
I am still trying to get used to the enormous speed at which generative AI is progressing. Six months ago, I was blogging about my experiments with Tencent's Hunyuan Video, which was an absolute breakthrough at the time. A lot has changed since then! The open-weights generative AI community has fully embraced Alibaba's Wan Video as a superior replacement for Hunyuan. What makes Wan so powerful is that several extensions have been shared openly:
Alibaba themselves have released a zoo of base models with different capabilities: text-to-video, image-to-video, first-and-last-frame-to-video, and Wan-Fun-Control, which accepts a variety of conditioning inputs
LoRAs: a wide range have been trained, though compatibility between LoRAs and the various base models is complicated
VACE: powerful control over generated videos
CausVid and its successor SelfForcing: incredible speed gains
…and this is just the tip of the iceberg!
Experiment Summary
My goals for this experiment were: