We Finally Know How Much It Cost to Train China’s Astonishing DeepSeek Model
Remember when DeepSeek briefly shook up the entire artificial intelligence industry by launching its large language model, R1, that was trained for a fraction of the money that OpenAI and other big players were pouring into their models? Thanks to a new paper published by the DeepSeek AI team in the journal Nature, we finally know what it took to train DeepSeek 1: $294,000 and 512 Nvidia H800 chips. The reason it was able to spend less, it seems, is because of the team’s use of trial-and-error-b