An interactive, visual guide to the magic behind how AIs generate images from text.
[Image: Butterfly]
How many possible images are there?
Our universe has about 10^80 atoms. Now imagine that each atom contained its own universe, with 10^80 atoms inside. Even that barely scratches the surface. You'd need about 5,000 layers of atom-universes, nested one inside the next, before you reached the number of possible images the size of the one above: about 10^400,000. That's a 1 with 400,000 zeroes after it.
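If you'd like to check that arithmetic yourself, here is a rough sketch in Python. The image dimensions are not given here, so the 235 x 235 pixel size and the 8-bit RGB encoding are my assumptions, picked because they land close to the quoted figure. The function name `log10_image_count` is made up for this example. It works with logarithms because the number itself is far too large to print.

```python
import math

ATOMS_IN_UNIVERSE_LOG10 = 80  # about 10^80 atoms, as quoted above


def log10_image_count(width, height, channels=3, bits_per_channel=8):
    """Return log10 of the number of distinct images with these dimensions."""
    values_per_sample = 2 ** bits_per_channel   # 256 for 8-bit channels
    samples = width * height * channels         # independent numbers per image
    # count = values_per_sample ** samples,
    # so log10(count) = samples * log10(values_per_sample)
    return samples * math.log10(values_per_sample)


# Assumed size: 235 x 235 pixels, 8-bit RGB (an assumption, not a stated fact).
exponent = log10_image_count(235, 235)

# Each nesting layer multiplies the atom count by 10^80,
# so n layers give 10^(80 * n) atoms.
layers = exponent / ATOMS_IN_UNIVERSE_LOG10

print(f"about 10^{exponent:,.0f} possible images")
print(f"about {layers:,.0f} nested atom-universe layers")
```

Changing the width, height, or bit depth shows how quickly the exponent grows: doubling both dimensions quadruples it.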
As you can imagine, the vast majority of these are nothing but random noise:
Depending on your computer, you're seeing up to 60 random images per second. If you see anything that looks like a real image before the heat death of the universe, let me (or my descendants) know.
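To put a number on how hopeless that wait is, here is a small sketch that makes one random frame and estimates how many frames you could watch at 60 per second. The helper names are hypothetical, and the 10^100-year horizon is only an assumed order of magnitude for "heat death", not a precise figure.

```python
import math
import random

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def random_image(width, height, rng):
    """Return one random 8-bit RGB image as raw bytes (3 bytes per pixel)."""
    return bytes(rng.randrange(256) for _ in range(width * height * 3))


def log10_frames_seen(years, fps=60):
    """Return log10 of the number of frames shown in `years` at `fps`."""
    return math.log10(fps) + math.log10(SECONDS_PER_YEAR) + math.log10(years)


rng = random.Random(0)
frame = random_image(8, 8, rng)     # a tiny 8 x 8 noise image
horizon = log10_frames_seen(1e100)  # assumed horizon: 10^100 years

print(f"one frame is {len(frame)} bytes of noise")
print(f"frames seen by then: roughly 10^{horizon:.0f}")
```

Even under that generous assumption, the exponent stays in the low hundreds, nowhere near 400,000.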
Amazingly, diffusion models can navigate this vast space of possibilities to produce coherent results. Unlike humans, who start with a blank canvas and add paint, diffusion models start with random noise and gradually remove it until an image emerges.
[Interactive figure: From noise to image, shown over 29 steps. Prompt: "A monarch butterfly on a purple coneflower, macro close-up, delicate orange and black wing detail, morning dew, soft bokeh"]
If we think of all possible images as occupying a vast, multi-dimensional space, then a diffusion model starts at a random point in that space, and gradually forges a path towards a point that's consistent with your prompt.
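Here is a toy sketch of that idea of a path through image space. It is not a real diffusion model: a real model uses a trained network to predict the noise to remove and never sees the final image, whereas this example cheats with an oracle that already knows the target. The four-number "image", the function names, and the 29-step count are all assumptions for illustration.

```python
import random


def toy_denoise(target, steps=29, seed=0):
    """Walk from a random point toward `target`, recording every position.

    An oracle stands in for the trained network: it knows `target`,
    which a real diffusion model does not.
    """
    rng = random.Random(seed)
    x = [rng.random() for _ in target]  # random starting point (pure noise)
    path = [list(x)]
    for step in range(steps):
        remaining = steps - step
        # Remove 1/remaining of the leftover difference from the target.
        x = [xi + (ti - xi) / remaining for xi, ti in zip(x, target)]
        path.append(list(x))
    return path


def distance(a, b):
    """Euclidean distance between two points in image space."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


target = [0.9, 0.5, 0.1, 0.7]  # hypothetical 4-"pixel" image
path = toy_denoise(target)
dists = [distance(p, target) for p in path]

for step in (0, 10, 20, 29):
    print(f"step {step:2d}: distance to target = {dists[step]:.4f}")
```

The distance shrinks at every step, which is the "gradually forges a path" part. What the toy leaves out is the hard part: working out which direction to move using only the noisy image and the prompt.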