VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO
(news.ycombinator.com)