Tech News

Anthropic releases safer Claude Code 'auto mode' to avoid mass file deletions and other AI snafus

Why This Matters

Anthropic's new 'auto mode' for Claude Code aims to enhance AI safety by reducing risks like mass file deletions and malicious code execution. This development is significant for the tech industry as it addresses safety concerns associated with autonomous AI actions, offering a more controlled yet efficient workflow for developers and organizations. It highlights ongoing efforts to balance AI autonomy with safety, crucial for broader adoption and trust in AI tools.

Key Takeaways

Anthropic has begun previewing "auto mode" in Claude Code. The company describes the new feature as a middle path between the app's default behavior, which sees Claude request approval for every file write and bash command, and the `--dangerously-skip-permissions` flag some coders use to make the chatbot function more autonomously.
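For readers unfamiliar with the two existing extremes, a rough sketch of how they are invoked from the command line (exact flags per Anthropic's Claude Code CLI; availability may vary by version):

```shell
# Default behavior: Claude Code prompts for approval before
# each file write and each bash command it wants to run.
claude

# The riskier alternative mentioned above: skip all permission
# prompts so Claude acts fully autonomously. This is the mode
# auto mode is positioned as a safer middle ground against.
claude --dangerously-skip-permissions
```

Auto mode sits between these: no per-action prompts, but a classifier vetoes actions it flags as risky.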

With auto mode enabled, a classifier system guides Claude, granting permission for actions the classifier deems safe and redirecting the chatbot to take a different approach when it determines an action might be risky. In designing the system, Anthropic's goal was to reduce the likelihood of Claude carrying out mass file deletions, extracting sensitive data, or executing malicious code.

Of course, no system is perfect, and Anthropic warns as much. "The classifier may still allow some risky actions: for example, if user intent is ambiguous, or if Claude doesn't have enough context about your environment to know an action might create additional risk," the company writes.


Anthropic doesn't cite a specific incident as inspiration for auto mode, but the recent 13-hour AWS outage Amazon suffered after one of the company's AI tools reportedly deleted a hosting environment was probably front of mind. Amazon blamed that outage on human error, saying the staffer involved had "broader permissions than expected."

Team plan users can preview auto mode starting today, with the feature set to roll out to Enterprise and API users in the coming days.