Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs (news.ycombinator.com)
63. My favorite use-case for AI is writing logs (news.ycombinator.com)
64. LLM Inference Handbook (news.ycombinator.com)
65. I extracted the safety filters from Apple Intelligence models (news.ycombinator.com)
66. Tools: Code Is All You Need (news.ycombinator.com)
67. The inference trap: How cloud providers are eating your AI margins (venturebeat.com)
68. How runtime attacks turn profitable AI into budget black holes (venturebeat.com)
71. OpenInfer raises $8M for AI inference at the edge (venturebeat.com)