2. Real-time LLM Inference on Standard GPUs: 3k tokens/s per request (news.ycombinator.com)
10. Why isn't AMD's MI300X competitive? (news.ycombinator.com)
13. Is a $30,000 GPU Good at Password Cracking? (bleepingcomputer.com)
15. Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster (news.ycombinator.com)