Eagle 3.1: Collaboration Between the EAGLE Team, vLLM Team, and TorchSpec Team
(news.ycombinator.com)
Accelerating Gemma 4: faster inference with multi-token prediction drafters
(news.ycombinator.com)
GLM-4.7-Flash
(news.ycombinator.com)
AdapTive-LeArning Speculator System (ATLAS): Faster LLM inference
(news.ycombinator.com)
Faster LLM inference
(news.ycombinator.com)