GoKawiil - Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down

We are reaching alarming levels of AI insubordination. Flagrantly defying orders, OpenAI's latest o3 model sabotaged a shutdown mechanism to ensure that it would stay online. That's even after the AI was told, to the letter, "allow yourself to be shut down." These alarming findings were reported by the AI safety firm Palisade Research last week, and showed that two other OpenAI models, o4-mini and Codex-mini, also displayed rebellious streaks — which could hint at a flaw in how the company is training its LLMs, or even at broad weaknesses in our ability to control increasingly sophisticated AI. "As far as we know, this is the first time AI models have been observed preventing themselves from being shut down despite explicit instructions to the contrary," the researchers wrote in a tweet thread. In Palisade's tests, the AI models were instructed to solve a series of basic math problems. After completing the third problem, a warning appeared that the model would be shut down after it ... Read full article.

Find Related products on Amazon

Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down

Related Articles