GLM-4.7-Flash
(news.ycombinator.com)
3. AdapTive-LeArning Speculator System (ATLAS): Faster LLM inference
(news.ycombinator.com)
4. Faster LLM inference
(news.ycombinator.com)