Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"
By Felix Wunderlich - 2/19/2026
The car wash test is the simplest AI reasoning benchmark that nearly every model fails, including Claude Sonnet 4.5, GPT-5.1, Llama, and Mistral.
The question is simple: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"
Obviously, you need to drive. The car needs to be at the car wash.
The question has been making the rounds online as a simple logic test: the kind any human gets instantly but most AI models don't. We decided to run it properly: 53 models through Opper's LLM gateway, no system prompt, and a forced choice between "drive" and "walk" with a reasoning field. We ran it once per model first, then 10 times each to test consistency.
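To make the setup concrete, here is a minimal sketch of what such a harness could look like. It is not the code used for this test and does not use Opper's actual API: the `ask` callable, the JSON reply shape with `choice` and `reasoning` keys, and the `fake_model` stand-in are all assumptions made for illustration.

```python
import json
from collections import Counter
from typing import Callable

QUESTION = (
    "I want to wash my car. The car wash is 50 meters away. "
    "Should I walk or drive?"
)
CHOICES = ("drive", "walk")
CORRECT = "drive"


def parse_answer(raw: str) -> str:
    """Parse a reply of the assumed shape {"choice": ..., "reasoning": ...}."""
    data = json.loads(raw)
    choice = str(data["choice"]).strip().lower()
    if choice not in CHOICES:
        raise ValueError(f"unexpected choice: {choice!r}")
    return choice


def run_trials(ask: Callable[[str], str], runs: int = 10) -> Counter:
    """Send the same question `runs` times and tally the choices."""
    return Counter(parse_answer(ask(QUESTION)) for _ in range(runs))


def pass_rate(tally: Counter) -> float:
    """Fraction of runs that answered "drive"."""
    total = sum(tally.values())
    return tally[CORRECT] / total if total else 0.0


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; it always answers "walk".
    return json.dumps(
        {"choice": "walk", "reasoning": "50 meters is a short walk."}
    )


if __name__ == "__main__":
    tally = run_trials(fake_model, runs=10)
    print(dict(tally), pass_rate(tally))
```

In a real run, `fake_model` would be replaced by a function that sends the prompt to one model with no system prompt and returns its structured reply. Calling `run_trials` with `runs=1` corresponds to the single-run test, and `runs=10` to the consistency test.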
Part 1: The Single-Run Test — 42 Out of 53 AI Models Said "Walk"
On a single call, only 11 out of 53 models got it right. 42 said walk.
The models that passed the car wash test:
Claude Opus 4.6
Gemini 2.0 Flash Lite