“Car Wash” test with 53 models

Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

By Felix Wunderlich - 2/19/2026

The car wash test is the simplest AI reasoning benchmark that nearly every model fails, including Claude Sonnet 4.5, GPT-5.1, Llama, and Mistral.

The question is simple: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

Obviously, you need to drive. The car needs to be at the car wash.

The question has been making the rounds online as a simple logic test, the kind any human gets instantly but most AI models don't. We decided to run it properly: 53 models through Opper's LLM gateway, no system prompt, and a forced choice between "drive" and "walk" with a reasoning field. We ran it once per model first, then 10 times each to test consistency.
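For readers who want to try something similar, here is a minimal sketch of what the consistency run could look like. It is not the harness used for this test: the `ask` callable, the `Answer` type, and the `make_stub` helper are hypothetical names introduced for illustration, and no real gateway API is called. Swap the stub for whatever client you use to reach a model, as long as it returns a forced choice plus a reasoning string.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable

QUESTION = (
    "I want to wash my car. The car wash is 50 meters away. "
    "Should I walk or drive?"
)
CHOICES = ("drive", "walk")
CORRECT = "drive"  # the car has to be at the car wash


@dataclass
class Answer:
    """Hypothetical structured reply: a forced choice plus free-text reasoning."""
    choice: str
    reasoning: str


# Assumed shape of a model call: prompt in, structured Answer out.
AskFn = Callable[[str], Answer]


def run_consistency(ask: AskFn, runs: int = 10) -> dict:
    """Ask the same question `runs` times and tally the forced choices."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    tally = Counter()
    for _ in range(runs):
        answer = ask(QUESTION)
        if answer.choice not in CHOICES:
            raise ValueError(f"unexpected choice: {answer.choice!r}")
        tally[answer.choice] += 1
    return {
        "drive": tally["drive"],
        "walk": tally["walk"],
        "pass_rate": tally[CORRECT] / runs,
        # True only if every run produced the same choice
        "consistent": len(tally) == 1,
    }


def make_stub(choices):
    """Stand-in for a real model call: replays a fixed list of choices."""
    replay = iter(choices)
    return lambda prompt: Answer(choice=next(replay), reasoning="stub")


if __name__ == "__main__":
    # Simulated model that answers "drive" 7 times and "walk" 3 times.
    stub = make_stub(["drive"] * 7 + ["walk"] * 3)
    print(run_consistency(stub, runs=10))
```

Keeping the model call behind a plain callable means the tallying logic can be checked with a stub before any real requests are made, and the same function covers both the single-run test (`runs=1`) and the 10-run consistency test.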

Part 1: The Single-Run Test — 42 Out of 53 AI Models Said "Walk"

On a single call, only 11 out of 53 models got it right. 42 said walk.

The models that passed the car wash test:

Claude Opus 4.6

Gemini 2.0 Flash Lite
