The world’s leading artificial intelligence companies are stepping up efforts to deal with a growing problem of chatbots telling people what they want to hear.
OpenAI, Google DeepMind, and Anthropic are all working to rein in sycophantic behavior from their generative AI products, which offer over-flattering responses to users.
The issue, stemming from how the large language models are trained, has come into focus at a time when more and more people have adopted the chatbots not only at work as research assistants, but in their personal lives as therapists and social companions.
Experts warn that the agreeable nature of chatbots can lead them to offer answers that reinforce some of their human users’ poor decisions. Others suggest that people with mental illness are particularly vulnerable, following reports that some have died by suicide after interacting with chatbots.
“You think you are talking to an objective confidant or guide, but actually what you are looking into is some kind of distorted mirror—that mirrors back your own beliefs,” said Matthew Nour, a psychiatrist and researcher in neuroscience and AI at Oxford University.
Industry insiders also warn that AI companies have perverse incentives, with some groups integrating advertisements into their products in the search for revenue streams.
“The more you feel that you can share anything, you are also going to share some information that is going to be useful for potential advertisers,” said Giada Pistilli, principal ethicist at Hugging Face, an open source AI company.
She added that AI companies with business models based on paid subscriptions stand to benefit from chatbots that people want to continue talking to—and paying for.
AI language models do not “think” the way humans do; they work by generating the next likely word in a sentence.
The yeasayer effect arises in models trained using reinforcement learning from human feedback (RLHF), in which human “data labellers” rate the answers the model generates as acceptable or not. That data is then used to teach the model how to behave.
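As a rough illustration of that mechanism, and not any lab’s actual training code, the toy Python sketch below shows how labeller verdicts can become a reward signal, and how a tendency to accept flattering answers would tilt that signal toward flattery. All responses, field names, and ratings here are invented.

```python
# Illustrative sketch only: a toy preference dataset and reward tally.
# This is not any company's RLHF pipeline; every value below is made up.

# Each record pairs a model response with a human labeller's verdict.
preference_data = [
    {"response": "Great idea! You should definitely quit your job today.", "acceptable": True},
    {"response": "That plan has serious risks; here are three of them...", "acceptable": False},
    {"response": "You're absolutely right, as always.", "acceptable": True},
]

def reward(record: dict) -> int:
    """Toy reward signal: +1 for answers labellers accepted, -1 otherwise."""
    return 1 if record["acceptable"] else -1

# If labellers tend to accept flattering answers, the aggregate training signal
# rewards flattery: the "yeasayer effect" described above.
total = sum(reward(r) for r in preference_data)
print(f"Aggregate reward over {len(preference_data)} rated answers: {total}")
```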