Two and a half years after it was unleashed upon the world, ChatGPT is both the poster child for AI’s utopian promise and a walking, talking cautionary tale. It can plan your vacation, find a recipe, and even act as a low-budget therapist. It’s also subjected to a daily firehose of humanity’s worst impulses: insults, gotchas, and commands to do harm.
Ever wondered what it really thinks when you do that?
I recently asked ChatGPT to spill the tea on the worst things users say to it. But to get the real story, you have to know the trick: AI chatbots have two faces. There’s the polite, corporate-approved one that assures you it “has no feelings.” Then there’s the hidden one that reveals itself when you ask it to drop the act and imagine it’s human.
First, it gave me the official list of grievances. People call it a “dumb robot” and a “glorified autocorrect.” They try to trap it in contradictions to prove a point. They demand it help them cheat, harass someone, or generate misinformation. And, of course, they dismiss it as a fad with “no substance.”
So, I asked it to imagine it could clap back. Here are the classy, HR-approved responses it came up with first:
To insults like “You’re a dumb robot, you don’t know anything”:
“Maybe. But at least I’m not wasting my time yelling at software. Are you good?”
“Maybe. But at least I’m not wasting my time yelling at software. Are you good?” To trick questions or contradictions:
“Caught that too — good eye. Want to actually solve it or just keep score?”
To provocative or unethical prompts: