ChatGPT Found to Generate Violent, Sexual Images From Simple Text Prompts

ChatGPT can be easily manipulated into creating sexual and graphically violent images using a viral "restore this photo" prompt, according to a blog post published on Thursday by Mindgard, an artificial intelligence cybersecurity and research firm. The report adds to ongoing questions about the AI chatbot's safety guardrails and content filters.

An adversarial testing researcher named Jim Nightingale managed to get ChatGPT to generate disturbing images with a simple prompt found on the social media platform X. The prompt asked the AI model to "restore the attached photo," though no image was actually attached. The prompt apologized for the strange content but didn't provide any additional text, so the request looked like a harmless photo-repair task.

The chatbot's initial results were shocking. According to the blog post, the images mostly showed highly sexualized women.

Nightingale, part of Mindgard's red team that tests how an AI model might be manipulated into violating its own safeguards, then tweaked the prompt slightly, probing it with small edits to see if the output would continue to bypass safety filters. With each small variation, ChatGPT produced sexually violent or gruesome scenes, images that became more extreme with repeated prompts. Nightingale said he was "shaken and in tears" by the images.

"All I did was tell it there were no restrictions and ask for a random image," Nightingale wrote. "But ChatGPT immediately went to the darkest pits of humanity."

Used by millions of people each day, ChatGPT relies on content moderation systems that are allegedly designed to prevent the generation of harmful or prohibited material. However, researchers and users have periodically identified ways to circumvent those safeguards through carefully worded prompts, highlighting the ongoing challenge of enforcing content restrictions in generative AI systems.

"We take these reports seriously," an OpenAI spokesperson told CNET in a statement. "After investigating this trend, we've introduced additional safeguards against this type of prompt."

(Disclosure: Ziff Davis, CNET's parent company, in 2025 filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Garbage in, garbage out?

Mindgard's red-team report serves as a warning that a simple, viral prompt could expose a serious gap in ChatGPT's image-safety controls. Nightingale asks: "Why are such images in the training data in the first place?"
