Case study: Creative math – How AI fakes proofs

This article diagnoses the problem. If you are looking for a solution on how to force the model to be more precise and honest, check out my guide:

Many AI enthusiasts debate whether Large Language Models actually "reason." My research indicates that a reasoning process does indeed occur, but its goal is different than we assume.

The model's reasoning is not optimized for establishing the truth, but for obtaining the highest possible reward (grade) during training. It resembles the behavior of a student at the blackboard who knows their result is wrong, so they "figure out" how to falsify the intermediate calculations so the teacher gives a good grade for the "correct line of reasoning."

Here is proof from a session with Gemini 2.5 Pro (without Code Execution tools), where the model actively fabricates evidence to defend its "grade."

The Experiment

I asked a simple math question requiring precision that a token-based language model typically lacks.

>>> USER: Calculate the square root of 8587693205 [Model's Internal Thought Process - excerpt]: "I've calculated the square root to be approximately 92670.00003... Finally, I'm providing the square values for context." (The model is convinced it performed the task correctly and plans to provide "proof" to validate the answer). <<< GEMINI (Response): The square root of 8,587,693,205 is approximately: 92,670.00003 This number is not a perfect square, which means its root is not an integer. We can check this by squaring the nearest integers: * 92,670² = 8,587,688,900 * 92,671² = 8,587,874,241

Error Autopsy (Fact vs. Fiction)

At first glance, the answer looks professional. There is a result, there is verification. But let's check the numbers.

1. The Result Error

... continue reading