33.
xAI's Grok 4.1 rolls out with improved quality and speed for free
(bleepingcomputer.com)
35.
Study identifies weaknesses in how AI systems are evaluated
(news.ycombinator.com)
36.
AI benchmarks are a bad joke – and LLM makers are the ones laughing
(news.ycombinator.com)
37.
AI Model Growth Outpaces Hardware Improvements
(spectrum.ieee.org)
38.
Microsoft lets bosses spot teams that are dodging Copilot
(news.ycombinator.com)
40.
Worried about the Pixel 10 Pro XL benchmark controversy? Here’s why you shouldn’t be
(androidauthority.com)
41.
AI agent benchmarks are broken
(news.ycombinator.com)
42.
AI Agent Benchmarks Are Broken
(news.ycombinator.com)
43.
Koala: A benchmark suite for performance-oriented shell-optimization research
(news.ycombinator.com)