There's a high energy cost to generative AI. But even the massive amount of energy needed to train and operate the large language models pales in comparison to what's required to run the video models behind tools like OpenAI's viral Sora app, which are flooding our social media feeds with goofy fake clips.
Generative AI models, on the whole, require a lot of energy to power. The servers running your ChatGPT query do compute-intensive work that takes a lot of electricity to sustain. AI is the "biggest driver" of electricity use in North America, one report found. And that might be showing up in your power bill: AI datacenters are cropping up all over the US and raising the electric bills of households nearby. Some estimates say one AI query uses 10 times more energy than a simple Google search.
While the big AI firms are still hesitant to detail exactly how much it takes to train and run AI models, there's a growing field of research searching for answers. Sasha Luccioni, the AI and climate lead at Hugging Face -- one of the most popular AI platforms and research hubs -- is a leading researcher studying the energy demands of artificial intelligence. In a new study, Luccioni and her team examined several open-source AI video models. (Popular video tools such as Sora and Google's Veo 3 were not included in the study because they aren't open source.)
The team used the open-source Hugging Face codebase and created AI videos with a variety of models. They measured the amount of electricity required to create those clips as they changed different factors, including making the videos longer, higher resolution and higher quality (something achieved through a process called denoising). They ran the tests on an Nvidia H100 SXM GPU, a high-powered computer chip that can be used in AI datacenters.
"Video generation is definitely a more computationally-intensive task -- instead of words, you're generating pixels, and there are multiple frames per second to make the videos flow well," said Luccioni in an email. "It's complex."
Take an AI video that's 10 seconds long and 24 frames per second. That's 240 images that the AI needs to generate, Luccioni explains. Especially for high-dimensional content, "that really adds up in terms of compute power and energy," she said.
AI video energy usage
The study found that video diffusion is about 30 times more costly in terms of energy than image generation and about 2,000 times more costly than text generation. Creating a single AI video uses approximately 90 watt-hours, compared with the 2.9 Wh needed for image generation and 0.047 Wh for text generation.
To put those numbers into context, an average energy-efficient LED lightbulb draws between 8 and 10 watts. LCD televisions can draw between 50 and 200 watts, with newer technology like OLEDs helping run them more efficiently. For example, the 65-inch Samsung S95F, CNET's pick for the best picture quality of 2025, typically draws 146W, according to Samsung. So creating one AI video takes about as much energy as running this TV for 37 minutes.
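For readers who want to check the math, here is a minimal sketch of the arithmetic behind those comparisons. It uses only the figures quoted above; the variable and function names are illustrative and don't come from the study.

```python
# Per-generation energy figures quoted in the article, in watt-hours.
VIDEO_WH = 90.0
IMAGE_WH = 2.9
TEXT_WH = 0.047

# Typical power draw of the 65-inch Samsung S95F, per Samsung, in watts.
TV_WATTS = 146.0


def minutes_of_runtime(energy_wh: float, power_w: float) -> float:
    """Minutes a device drawing power_w watts can run on energy_wh watt-hours."""
    return energy_wh / power_w * 60


video_vs_image = VIDEO_WH / IMAGE_WH  # how many images equal one video
video_vs_text = VIDEO_WH / TEXT_WH    # how many text generations equal one video
tv_minutes = minutes_of_runtime(VIDEO_WH, TV_WATTS)

print(f"Video vs. image: about {video_vs_image:.0f}x")
print(f"Video vs. text: about {video_vs_text:.0f}x")
print(f"TV runtime on one video's energy: about {tv_minutes:.0f} minutes")
```

The ratios come out to roughly 31 and 1,900, which the study rounds to 30 and 2,000, and the TV figure lands at about 37 minutes.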
The energy demands of generative AI, particularly for video, are significant. That sets the stage for a huge problem as AI becomes more widely used.