Tech News

Turns out, AI can actually build competent Minesweeper clones: a test of four AI coding agents ranks OpenAI's Codex as the best and Google's Gemini CLI as the worst

As the world burns around us because of corporations chasing AI with seemingly unlimited resources, we ought to see what all this commotion has bought us. Recently, the folks over at Ars Technica put four of the most popular AI coding agents to the test, with a deceptively simple ask: build Minesweeper for the web. The clone had to include sound effects, mobile touchscreen support, and a "fun" gameplay twist.

For those unaware, Minesweeper is a logic game: the numbers on revealed tiles tell you how many mines sit next to them, and reasonable enough UI/UX elements turn that into a decent challenge. It's not exactly hard to make a Minesweeper clone, but its underlying mechanics require at least some level of ingenuity that usually comes from humans. After all, AGI is the goal, right?

The test included (paid versions of) Claude Code from Anthropic, Gemini CLI from Google, Mistral Vibe, and OpenAI's Codex, based on GPT-5. All four were given the same instructions, and each was scored on whatever it generated in its first run, with no human input or second chances after the initial prompt.

OpenAI Codex - 9/10 🏅

The best performer by far was Codex, which not only did a decent job with the visuals, but was the only AI to actually include "chording," a technique where clicking a revealed number reveals all of its remaining surrounding tiles, provided you have placed the matching number of flags around it. Chording is a favorite amongst seasoned players, so its omission makes any Minesweeper clone feel less polished.
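
To make the chording rule concrete, here is a minimal sketch in Python. It is not the code Codex generated; the `Cell` class and the `neighbors` and `chord` helpers are hypothetical names invented for this illustration, and the sketch leaves out the usual follow-up step of cascading through empty tiles.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    # Hypothetical cell model for illustration only.
    mine: bool = False
    revealed: bool = False
    flagged: bool = False
    adjacent: int = 0  # number of mines in the eight surrounding cells


def neighbors(board, r, c):
    """Yield the coordinates of the up-to-eight cells around (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            if (dr or dc) and 0 <= nr < len(board) and 0 <= nc < len(board[0]):
                yield nr, nc


def chord(board, r, c):
    """Chord on (r, c) and return the coordinates that were newly revealed.

    Nothing happens unless the cell is a revealed number and the count of
    flags around it equals that number. If a flag was misplaced, one of the
    revealed cells will be a mine, which the caller should treat as a loss.
    """
    cell = board[r][c]
    if not cell.revealed or cell.adjacent == 0:
        return []
    around = list(neighbors(board, r, c))
    flags = sum(board[nr][nc].flagged for nr, nc in around)
    if flags != cell.adjacent:
        return []
    opened = []
    for nr, nc in around:
        n = board[nr][nc]
        if not n.flagged and not n.revealed:
            n.revealed = True
            opened.append((nr, nc))
    return opened
```

On a 3x3 board with a single flagged mine in one corner and a revealed "1" in the center, calling `chord(board, 1, 1)` opens the other seven tiles in one click, which is why experienced players lean on it for speed.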

Codex's build had all the buttons working properly, including a toggle for sound, which featured era-accurate beeps and boops, along with on-screen instructions for both mobile and desktop. As for the gameplay twist, there was a "Lucky Sweep" button in the corner that would occasionally reveal one safe tile once you had earned it.

The coding experience with Codex was also smooth, with the command line interface featuring nice animations and local permission management, though the agent did take its sweet time writing the code. Ars Technica described this effort as the closest to something that would be ready to ship with minimal human intervention, scoring it an impressive 9/10.

Claude Code - 7/10

... continue reading