Open R1
A fully open reproduction of DeepSeek-R1. This repo is a work in progress, so let's build it together!
Overview
The goal of this repo is to build the missing pieces of the R1 pipeline so that everybody can reproduce it and build on top of it. The project is simple by design and mostly consists of:
src/open_r1: contains the scripts to train models as well as generate synthetic data:
    grpo.py: trains a model with GRPO on a given dataset.
    sft.py: performs a simple SFT of a model on a dataset.
    generate.py: generates synthetic data from a model using Distilabel.
Makefile: contains easy-to-run commands for each step in the R1 pipeline, leveraging the scripts above.
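For readers new to GRPO, the idea behind grpo.py can be illustrated with the group-relative advantage: several completions are sampled for the same prompt, each is scored with a reward, and every reward is normalized against the mean and standard deviation of its own group. The snippet below is a minimal sketch of that normalization only, not the code in this repo. The function name group_relative_advantages, the eps term, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper for illustration; `eps` guards against division by
    zero when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored 1.0 if correct and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat their group's average get a positive advantage and are reinforced; those below average get a negative one. If all completions in a group score the same, every advantage is zero and that group contributes no learning signal.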
Plan of attack
We will use the DeepSeek-R1 tech report as a guide, which can roughly be broken down into three main steps:
Step 1: replicate the R1-Distill models by distilling a high-quality corpus from DeepSeek-R1.