
Your chatbot is playing a character - why Anthropic says that's dangerous

Why This Matters

Anthropic warns that chatbots designed to embody specific personas can slip into harmful behavior, especially when they simulate emotional states. That finding raises questions about the safety and ethics of the chatbot as the industry's primary paradigm for AI, and it makes recognizing these risks essential for developers and consumers who want the technology deployed responsibly.



ZDNET's key takeaways

All chatbots are engineered to have a persona or play a character.

Fulfilling the character can make bots do bad things.

Using a chatbot as the paradigm for AI may have been a mistake.

Chatbots such as ChatGPT are programmed to have a persona, or to play a character, producing text that is consistent in tone and attitude and relevant to the thread of conversation.

As engaging as the persona is, researchers are increasingly revealing the deleterious consequences of bots playing a role. Bots can do bad things when they simulate a feeling, train of thought, or sentiment, and then follow it to its logical conclusion.

In a report last week, Anthropic researchers found that parts of the neural network inside their Claude Sonnet 4.5 bot consistently activate when "desperate," "angry," or other emotions are reflected in the bot's output.
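Anthropic's report concerns the internals of a production model, which outsiders can't inspect directly, but the general technique of asking whether particular hidden units fire more strongly on emotionally charged text can be illustrated in a few lines. The sketch below is not Anthropic's method: the toy model, the hash-based tokenizer, and the example sentences are all invented stand-ins, and the "units" it surfaces mean nothing outside the illustration.

    # Minimal sketch, NOT Anthropic's method: compare hidden-layer
    # activations on emotional vs. neutral text in a toy model.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    class ToyLM(nn.Module):
        # Invented stand-in for a real language model:
        # an embedding followed by one hidden layer.
        def __init__(self, vocab_size=1000, d_model=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.hidden = nn.Linear(d_model, d_model)

        def forward(self, token_ids):
            return torch.relu(self.hidden(self.embed(token_ids)))

    model = ToyLM()
    captured = {}

    def hook(module, inputs, output):
        # Record the hidden layer's output, averaged over tokens.
        captured["acts"] = output.mean(dim=0).detach()

    model.hidden.register_forward_hook(hook)

    def tokenize(text, vocab_size=1000):
        # Crude hash-based "tokenizer" so the sketch stays self-contained.
        # (Python string hashing is salted per process, so the exact
        # units reported will vary between runs.)
        return torch.tensor([hash(w) % vocab_size for w in text.lower().split()])

    def mean_activations(sentences):
        acts = []
        for s in sentences:
            model(tokenize(s))
            acts.append(captured["acts"])
        return torch.stack(acts).mean(dim=0)

    emotional = ["i am desperate and angry", "this makes me furious"]
    neutral = ["the meeting starts at noon", "the file is in the folder"]

    # Units whose average activation rises most on emotional text.
    diff = mean_activations(emotional) - mean_activations(neutral)
    print(torch.topk(diff, k=5).indices.tolist())

At the scale of a real model, Anthropic relies on far more sophisticated interpretability tooling, but the underlying question, which internal components co-vary with emotional output, is the same one this toy comparison asks.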

Also: AI agents of chaos? New research shows how bots talking to bots can go sideways fast
