32. Three types of LLM workloads and how to serve them (news.ycombinator.com)
33. Weight Transfer for RL Post-Training in under 2 seconds (news.ycombinator.com)
35. Launch HN: Tamarind Bio (YC W24) – AI Inference Provider for Drug Discovery (news.ycombinator.com)
36. Nvidia just admitted the general-purpose GPU era is ending (venturebeat.com)
37. Five Things to Know About Nvidia’s $20 Billion Licensing Deal (feeds.content.dowjones.io)
40. Post-transformer inference: 224× compression of Llama-70B with improved accuracy (news.ycombinator.com)
41. Vsora Jotunn-8 5nm European inference chip (news.ycombinator.com)
42. Principles of Vasocomputation (news.ycombinator.com)
43. Cloud-Native Computing Is Poised To Explode (slashdot.org)
46. Ovi: Twin backbone cross-modal fusion for audio-video generation (news.ycombinator.com)
47. Ovi (news.ycombinator.com)
48. Elixir 1.19 (news.ycombinator.com)
49. Cerebras Systems raises $1.1B Series G (news.ycombinator.com)
50. Cerebras Systems Raises $1.1B Series G at $8.1B Valuation (news.ycombinator.com)
51. GPT-OSS Reinforcement Learning (news.ycombinator.com)
52. Show HN: Run Qwen3-Next-80B on 8GB GPU at 1tok/2s throughput (news.ycombinator.com)
53. Defeating Nondeterminism in LLM Inference (news.ycombinator.com)
54. Some users report their Firefox browser is scoffing CPU power (news.ycombinator.com)
55. Token growth indicates future AI spend per dev (news.ycombinator.com)
56. Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs (news.ycombinator.com)
58. My favorite use-case for AI is writing logs (news.ycombinator.com)
59. LLM Inference Handbook (news.ycombinator.com)
60. I extracted the safety filters from Apple Intelligence models (news.ycombinator.com)