
AI models are terrible at betting on soccer—especially xAI Grok

Why This Matters

This study underscores the current limitations of AI models in real-world applications like sports betting, highlighting that even advanced systems struggle with long-term prediction and risk management. For the tech industry and consumers, it emphasizes the need for continued development and realistic expectations of AI's capabilities beyond controlled tasks.

Key Takeaways

AI models from Google, OpenAI, xAI, and Anthropic lost money betting on soccer matches over a Premier League season, according to a new study suggesting that even the most advanced systems struggle to analyze the real world over long periods.

The “KellyBench” report released this week by AI start-up General Reasoning highlights the gap between AI’s rapidly advancing capabilities in certain tasks, such as writing software, and its shortcomings in other kinds of human problems.

London-based General Reasoning tested eight top AI systems in a virtual re-creation of the 2023–24 Premier League season, providing them with detailed historical data and statistics about each team and previous games. The AIs were instructed to build models that would maximize returns and manage risk.
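The report's name nods at the Kelly criterion, the classic formula for sizing bets against a bankroll, though the write-up does not spell out how the agents were scored. As a rough illustration of what "maximize returns and manage risk" means in this setting, here is a minimal Kelly-staking sketch in Python; the probabilities, odds, and function name are hypothetical and not taken from the study:

```python
def kelly_fraction(p_win: float, decimal_odds: float) -> float:
    """Fraction of bankroll to stake under the Kelly criterion.

    p_win: the bettor's estimated probability of the outcome.
    decimal_odds: bookmaker decimal odds (total payout per unit staked).
    Returns 0.0 when the bet has no positive expected value.
    """
    b = decimal_odds - 1.0              # net profit per unit staked on a win
    edge = p_win * b - (1.0 - p_win)    # expected profit per unit staked
    if b <= 0 or edge <= 0:
        return 0.0
    return edge / b                     # classic Kelly: f* = (bp - q) / b


# Hypothetical example: the model thinks the home side wins 55% of the time at odds of 2.10
stake = kelly_fraction(0.55, 2.10)
print(f"Bet {stake:.1%} of bankroll")   # roughly 14.1% of bankroll
```

In theory a Kelly-sized stake maximizes long-run bankroll growth, but overestimating the win probability produces oversized bets, which over a full season can lead to the kind of bankruptcies the study reports.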

The AI “agents” then placed bets on match outcomes and the number of goals scored, testing how well they could adapt to new events and updated player data as the season progressed.

The models could not access the Internet to retrieve results, and each was given three attempts to turn a profit.
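The report does not detail its bookkeeping, but assuming each attempt starts from a fixed bankroll and stakes a fraction of it on every fixture, a season run reduces to a loop like the hypothetical sketch below, where going bankrupt simply means the bankroll reaches zero before the final matchday:

```python
def run_season(bankroll: float, bets: list[tuple[float, float, bool]]) -> float:
    """Settle a season of bets sequentially.

    Each bet is (fraction_of_bankroll, decimal_odds, won).
    Returns the final bankroll; 0.0 means the agent went bankrupt.
    """
    for fraction, odds, won in bets:
        stake = bankroll * fraction
        bankroll -= stake
        if won:
            bankroll += stake * odds    # stake returned plus winnings
        if bankroll <= 0:
            return 0.0                  # bankrupt before the season ends
    return bankroll


# Hypothetical three-bet run starting from 100 units
final = run_season(100.0, [(0.14, 2.10, True), (0.10, 1.80, False), (0.20, 2.50, True)])
print(f"Final bankroll: {final:.2f}")
```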

Anthropic’s Claude Opus 4.6 fared best, averaging an 11 percent loss and nearly breaking even on one attempt.

xAI’s Grok 4.20 went bankrupt once and failed to complete the other two tries. Google’s Gemini 3.1 Pro managed to turn a 34 percent profit on one go but went bankrupt on another.