Anthropic says some Claude models can now end ‘harmful or abusive’ conversations
Anthropic has announced new capabilities that will allow some of its newest, largest models to end conversations in what the company describes as “rare, extreme cases of persistently harmful or abusive user interactions.” Strikingly, Anthropic says it’s doing this not to protect the human user, but rather the AI model itself. To be clear, the company isn’t claiming that its Claude AI models are sentient or can be harmed by their conversations with users. In its own words, Anthropic remains “hig