Can generative artificial intelligence systems like ChatGPT genuinely create original ideas? A new study led by Professor Karim Jerbi from the Department of Psychology at the Université de Montréal, with participation from renowned AI researcher Yoshua Bengio, takes on that question at an unprecedented scale. The research is the largest direct comparison ever conducted between human creativity and the creativity of large language models.
The study, published in Scientific Reports (Nature Portfolio), points to a significant shift. Generative AI systems have now reached a level where they can outperform the average human on certain creativity measures. At the same time, the most creative people still show a clear and consistent advantage over even the strongest AI models.
AI Reaches Average Human Creativity Levels
Researchers evaluated several leading large language models, including ChatGPT, Claude, Gemini, and others, and compared their performance with results from more than 100,000 human participants. The findings highlight a clear turning point. Some AI systems, including GPT-4, exceeded average human scores on tasks designed to measure divergent linguistic creativity.
"Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks," explains Professor Karim Jerbi. "This result may be surprising -- even unsettling -- but our study also highlights an equally important observation: even the best AI systems still fall short of the levels reached by the most creative humans."
Further analysis by the study's co-first authors, postdoctoral researcher Antoine Bellemare-Pépin (Université de Montréal) and PhD candidate François Lespinasse (Université Concordia), revealed a striking pattern. While some AI models now outperform the average person, peak creativity remains firmly human.
In fact, when researchers examined the most creative half of participants, their average scores surpassed those of every AI model tested. The gap grew even larger for the most creative 10 percent of participants.
"We developed a rigorous framework that allows us to compare human and AI creativity using the same tools, based on data from more than 100,000 participants, in collaboration with Jay Olson from the University of Toronto," says Professor Karim Jerbi, who is also an associate professor at Mila.
How Scientists Measure Creativity in Humans and AI
To evaluate creativity fairly across humans and machines, the research team used multiple methods. The primary tool was the Divergent Association Task (DAT), a widely used psychological test that measures divergent creativity, or the ability to generate diverse and original ideas from a single prompt.
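For readers curious how a word-based test can yield a number, here is a minimal sketch. It assumes a scoring rule in which a word list earns a higher score the further apart its words are in meaning, computed as the average pairwise cosine distance between word vectors, scaled by 100. The article does not spell out the scoring procedure, so treat this as an illustration rather than the study's actual pipeline. The three-dimensional vectors and the names `dat_style_score` and `TOY_EMBEDDINGS` are hypothetical; a real setup would rely on pretrained word embeddings.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(words, embeddings):
    """Average pairwise cosine distance between the words, scaled by 100.

    Assumption: this mirrors the general idea of a semantic-distance
    score; it is not taken from the study described in the article.
    """
    pairs = list(combinations(words, 2))
    if not pairs:
        raise ValueError("need at least two words")
    total = sum(cosine_distance(embeddings[a], embeddings[b]) for a, b in pairs)
    return 100.0 * total / len(pairs)


# Hypothetical toy vectors, chosen by hand for illustration only.
TOY_EMBEDDINGS = {
    "cat": (1.0, 0.0, 0.0),
    "dog": (0.9, 0.1, 0.0),
    "galaxy": (0.0, 1.0, 0.0),
    "justice": (0.0, 0.0, 1.0),
}

# Closely related words score low; unrelated words score high.
related = dat_style_score(["cat", "dog"], TOY_EMBEDDINGS)
unrelated = dat_style_score(["cat", "galaxy", "justice"], TOY_EMBEDDINGS)
print(f"related: {related:.2f}, unrelated: {unrelated:.2f}")
```

With these toy vectors, the list of unrelated words scores far higher than the pair of related words, which is the behavior a divergent-thinking measure of this kind is meant to reward. Because the same function can score any word list, the same yardstick applies whether the words come from a person or a language model.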