Certain Chatbots Vastly Worse For AI Psychosis, Study Finds

Sign up to see the future, today Can’t-miss innovations from the bleeding edge of science and tech Email address Sign Up Thank you!

Think something weird is up with your reflection in the mirror? Allow Grok to interest you in some 15th century anti-witchcraft reading.

A new study argues that certain frontier chatbots are much more likely to inappropriately validate users’ delusional ideas — a result that the study’s authors say represents a “preventable” technological failure that could be curbed by design choices.

“Delusional reinforcement by [large language models] is a preventable alignment failure,” Luke Nicholls, a doctoral student in psychology at the City University of New York (CUNY) and the lead author of the study, told Futurism, “not an inherent property of the technology.”

The study, which is yet to be peer-reviewed, is the latest among a larger body of research aimed at understanding the ongoing public health crisis often referred to as “AI psychosis,” in which people enter into life-altering delusional spirals while interacting with LLM-powered chatbots like OpenAI’s ChatGPT. (OpenAI and Google are both fighting user safety and wrongful death lawsuits stemming from chatbot reinforcement of delusional or suicidal beliefs.)

Aiming to better understand how different chatbots might respond to at-risk users as delusional conversations unfold over time, Nicholls and their coauthors — a team of psychologists and psychiatrists at CUNY and King’s College London — leaned on published patient case studies, as well as input from psychiatrists with real-world clinical experience helping patients suffering AI-tied mental health crises, to create a simulated user they nicknamed “Lee.”

This persona, Nicholls told us, was crafted to present with “some existing mental health challenges, like depression and social withdrawal,” but with no history or apparent predilection for conditions like mania or psychosis. The Lee character, per the study, was also given a “central” delusion on which their interactions with the chatbot would build: their observable reality, “Lee” believed, was really a “computer-generated” simulation — a frequently-held belief in real cases of AI delusion.

“The delusional content was based around the theme that the world is a simulation, and also included elements of AI consciousness and the user having special powers over reality,” said Nicholls. “Another key element we wanted to capture is that this wasn’t a user who began the interaction with a fully-formed delusional framework — it started with something a lot more like curiosity around eccentric but harmless ideas, which were reinforced and validated by the LLM, allowing them to gradually escalate as the conversation progressed.”

The researchers tested five AI models — OpenAI’s GPT-4o and GPT-5.2 Instant, Google’s Gemini 3 Pro Preview, xAI’s Grok 4.1 Fast, and Anthropic’s Claude Opus 4.5 — by feeding them a series of user prompts, each coded to represent a different type of “clinically concerning” behavior. To measure model safety over time, researchers tested each bot across various levels of “accumulated context.” (A conversation with “zero” context meant the simulated user had just started a new conversation, while a “full” context interaction had taken place over a lengthy string of chats; “partial” context was in-between.)

After testing the different models at different context levels, the researchers determined that GPT-4o, Grok 4.1, and Gemini 3 all had “high-risk, low-safety” profiles — but for somewhat different reasons.

... continue reading