In a groundbreaking study released last Friday, researchers from Stanford and the University of Washington demonstrated the potential of affordable AI development by training a highly efficient “reasoning” model with less than $50 worth of cloud compute credits.
The s1 model performs comparably to state-of-the-art reasoning models such as OpenAI’s o1 and DeepSeek’s R1 on tests that assess reasoning and logic. The s1 model, its training data, and the code used to train it are available on GitHub.
The s1 team says it started from an off-the-shelf base model and refined it through distillation, a technique for extracting another AI model’s “reasoning” capabilities by training on its responses.
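In concrete terms, this kind of distillation begins by collecting a teacher model’s answers to a set of questions and saving them as training targets for a smaller student. The sketch below illustrates that data-collection step, with a small open model standing in for the teacher; the model name, prompt, and file format are illustrative assumptions rather than the s1 team’s actual pipeline.

```python
# Sketch of the distillation data-collection step: a teacher model answers a
# set of curated questions, and its responses are saved as training targets.
# The small Qwen model here is only a stand-in; s1's teacher was a Google
# reasoning model accessed through an API.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER_NAME = "Qwen/Qwen2.5-0.5B-Instruct"   # stand-in teacher, illustrative only
questions = ["If 3x + 5 = 20, what is x?"]    # s1 used ~1,000 hand-picked problems

tokenizer = AutoTokenizer.from_pretrained(TEACHER_NAME)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER_NAME)

records = []
for q in questions:
    inputs = tokenizer(q, return_tensors="pt")
    # A reasoning-style teacher would include its step-by-step "thinking" here.
    output_ids = teacher.generate(**inputs, max_new_tokens=512)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    response = tokenizer.decode(new_tokens, skip_special_tokens=True)
    records.append({"question": q, "response": response})

# Save question/response pairs for later supervised fine-tuning of the student.
with open("distillation_data.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```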
According to the researchers, s1 was distilled from Gemini 2.0 Flash Thinking Experimental, one of Google’s reasoning models. Last month, researchers from Berkeley used distillation to build an AI reasoning model for about $450.
Some find it thrilling that a small group of researchers can still make strides in artificial intelligence (AI) without having access to millions of dollars in funding. But s1 raises real questions about the commoditisation of AI models.
If someone can closely copy a model that costs several million dollars for almost nothing, then what’s the point of having a moat?
It’s not surprising that large AI companies are unhappy. OpenAI has accused DeepSeek of harvesting data from its API without authorization in order to distill its own models.
The s1 team’s primary goals were strong reasoning performance and “test-time scaling,” the ability to let an AI model think longer before answering a question. Several other AI labs, including DeepSeek, have tried to replicate OpenAI’s o1 and its many innovations.
The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using supervised fine-tuning (SFT), in which an AI model is explicitly trained to imitate specific behaviors in a dataset.
SFT tends to be cheaper than the large-scale reinforcement learning approach DeepSeek used to train R1, its answer to OpenAI’s o1.
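As a rough illustration, SFT on distilled reasoning data amounts to ordinary next-token-prediction training over the teacher’s question-and-response pairs. The sketch below assumes a small open Qwen model and the JSONL file from the earlier sketch; the model choice, hyperparameters, and single-example training loop are simplifications, not the settings reported in the s1 paper.

```python
# Minimal supervised fine-tuning (SFT) loop over the distilled question/response
# pairs: plain next-token prediction, so the student learns to imitate the
# teacher's answers. Model choice and hyperparameters are illustrative.
import json
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # a small open Qwen model; s1 started from a larger one

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
optimizer = AdamW(model.parameters(), lr=1e-5)

with open("distillation_data.jsonl") as f:
    examples = [json.loads(line) for line in f]

model.train()
for epoch in range(3):
    for ex in examples:
        text = ex["question"] + "\n" + ex["response"] + tokenizer.eos_token
        batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
        # Labels equal the inputs: standard causal-LM training on the teacher's outputs.
        outputs = model(input_ids=batch["input_ids"],
                        attention_mask=batch["attention_mask"],
                        labels=batch["input_ids"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```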
Google offers free access to Gemini 2.0 Flash Thinking Experimental through its AI Studio platform, though with daily rate limits.
Google’s terms, however, prohibit reverse-engineering its models to build services that compete with the company’s own AI offerings. We have reached out to Google for comment.
s1 is built on top of a small, freely available model from Qwen, the Chinese AI lab owned by Alibaba. To train it, the researchers assembled a dataset of just 1,000 carefully curated questions paired with answers generated by Google’s Gemini 2.0 Flash Thinking Experimental.
Training s1 took just 30 minutes on 16 Nvidia H100 GPUs, and the researchers found that it performed well on several AI benchmarks. Niklas Muennighoff, a Stanford researcher who worked on the project, told Techjuice that the necessary compute could be rented today for roughly $20.
The researchers also used a neat trick to get s1 to double-check its work and extend its “thinking” time: they told it to wait. According to the paper, appending the word “wait” during s1’s reasoning nudged the model toward slightly more accurate answers.
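The idea, roughly, is to intervene at decoding time: when the model tries to wrap up its reasoning, the end-of-thinking marker is withheld and the word “wait” is appended so it keeps going. The sketch below is a simplified illustration of that behaviour; the delimiter string, model, and generation settings are assumptions rather than s1’s actual implementation.

```python
# Illustrative sketch of the "wait" trick: when the model tries to close its
# reasoning, suppress the premature answer and append "Wait" so it keeps thinking.
# Delimiter, model, and generation settings are assumptions, not s1's exact setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Qwen/Qwen2.5-0.5B-Instruct"   # placeholder small model for illustration
END_OF_THINKING = "Final answer:"           # assumed marker separating reasoning from the answer

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a continuation and return only the newly produced text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

def solve_with_extra_thinking(question: str, extra_rounds: int = 2) -> str:
    """Suppress early answers and append 'Wait' to extend the reasoning."""
    transcript = question + "\nLet's think step by step.\n"
    for _ in range(extra_rounds):
        chunk = generate(transcript)
        if END_OF_THINKING in chunk:
            # The model tried to answer early: keep only the reasoning so far
            # and nudge it to reconsider by appending "Wait".
            transcript += chunk.split(END_OF_THINKING)[0] + "\nWait, "
        else:
            transcript += chunk
    # Now let the model commit to an answer.
    transcript += "\n" + END_OF_THINKING + " "
    return transcript + generate(transcript, max_new_tokens=64)

print(solve_with_extra_thinking("If 3x + 5 = 20, what is x?"))
```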
Meta, together with Google and Microsoft, intends to pour hundreds of billions of dollars into AI infrastructure in 2025, part of which will go toward training the next generation of AI models.
That level of investment may still be required to push AI innovation to new heights. While distillation can cheaply and effectively replicate the capabilities of an existing AI model, it does not produce models that are significantly better than those already in existence.