โŒ


A Chinese startup just showed every American tech company how quickly it's catching up in AI

22 January 2025 at 02:51
[Image: OpenAI CEO Sam Altman speaks at Station F in Paris. A new AI model from China's DeepSeek rivals OpenAI's o1. Joel Saget/AFP via Getty Images]

  • An AI startup in China just showed how it's closing the gap with America's top AI labs.
  • Chinese startup DeepSeek released a new AI model on Monday that appears to rival OpenAI's o1.
  • Its reasoning capabilities have stunned top American AI researchers.

Donald Trump started his new presidency by declaring America must lead the world. He just got a warning shot from a crack AI team in China that is ready to show US technological supremacy is not a given.

Meet DeepSeek, a Chinese startup spun off from a decade-old hedge fund that calculates shrewd trades with AI and algorithms. Its latest release, which came on Trump's inauguration day, has left many of America's top industry researchers stunned.

In a paper released Monday, DeepSeek unveiled a new flagship AI model called R1 that shows off a new level of "reasoning." The reason it has made such a strong impression on US AI experts matters.

🚀 DeepSeek-R1 is here!

⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!

🌎 Website & API are live now! Try DeepThink at https://t.co/v1TFy7LHNy today!

🐋 1/n pic.twitter.com/7BlpWAPu6y

– DeepSeek (@deepseek_ai) January 20, 2025

Some of Silicon Valley's most well-resourced AI labs have increasingly turned to "reasoning" as a frontier of research that can evolve their technology from a student-like level of intelligence to something that eclipses human intelligence entirely.

To accomplish this, OpenAI, Google, Anthropic, and others have focused on ensuring models spend more time thinking before responding to a user query. It's an expensive, intensive process that demands a lot from the computing power buzzing underneath.

As a reminder, OpenAI fully released o1, "models designed to spend more time thinking before they respond," to a glowing reception in December after an initial release in September. DeepSeek's R1 shows just how quickly China can close the gap.

DeepSeek narrows the gap

What exactly does R1 do? For one, DeepSeek says R1 achieves "performance comparable to OpenAI o1 across math, code, and reasoning tasks."

Its research paper says this is possible thanks to "pure reinforcement learning," a technique that Jim Fan, senior research manager at Nvidia, said was reminiscent of the secret behind making Google DeepMind's AlphaZero a master at games like Go and chess from scratch, "without imitating human grandmaster moves first." "This is the most significant takeaway from the paper," he wrote on X.

We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive - truly open, frontier research that empowers all. It makes no sense. The most entertaining outcome is the most likely.

DeepSeek-R1 not only open-sources a barrage of models but… pic.twitter.com/M7eZnEmCOY

– Jim Fan (@DrJimFan) January 20, 2025

DeepSeek, which launched in 2023, said in its paper that it did this because its goal was to explore the potential of AI to "develop reasoning capabilities without any supervised data." Supervised data, meaning human-labeled examples, is a common ingredient in AI training. The company also said that an earlier version of R1, called R1-Zero, gave its researchers an "aha moment" in which the AI "learns to allocate more thinking time to a problem by reevaluating its initial approach."

The end result offers what Wharton professor Ethan Mollick described as responses from R1 that read "like a human thinking out loud."

Notably, this level of transparency into the development of AI has been hard to come by in the notes published by companies like OpenAI when releasing models of a similar aptitude.

Nathan Lambert, a research scientist at the Allen Institute for AI, noted on Substack that R1's paper "is a major transition point in the uncertainty in reasoning model research" as "until now, reasoning models have been a major area of industrial research without a clear seminal paper."

Staying true to the open spirit, DeepSeek's R1 model, critically, has been fully open-sourced under an MIT license, a widely used permissive software license.

Together, these elements of R1 complicate matters for US players caught up in an AI arms race with China, Trump's main geopolitical rival, for a few reasons.

First, it shows that China can rival some of the top AI models in the industry and keep pace with cutting-edge developments coming out of Silicon Valley.

Second, open-sourcing highly advanced AI could also challenge companies that are seeking to make huge profits by selling their technology.

OpenAI, for instance, introduced a ChatGPT Pro plan in December that costs $200 per month. Its selling point was that it included "unlimited access" to its smartest model at the time, o1. If an open-source model offers similar capabilities for free, the incentive to buy a costly paid subscription could, in theory, diminish.

Nvidia's Fan described the situation like this on X: "We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive - truly open, frontier research that empowers all."

DeepSeek has shown off reasoning know-how before. In November, the company released an "R1-lite-preview" that showed its "transparent thought process in real time." In December, it released a model called V3 to serve as a new, bigger foundation for future reasoning models.

It's a big reason American researchers see a meaningful improvement in the latest model, R1.

Theo Browne, a software developer behind a popular YouTube channel for the tech community, said, "The new DeepSeek R1 model is incredible." Tanay Jaipuria, a partner investing in AI at Silicon Valley's Wing VC, also described it as "incredible."

DeepSeek R-1 is incredible.

- OpenAI o-1 level reasoning at 1/25th the cost
- Fully open source with MIT license
- API outputs can be used for distillation pic.twitter.com/YjHbylNuH8

– Tanay Jaipuria (@tanayj) January 20, 2025

Awni Hannun, a machine learning researcher at Apple, said that a key advantage of R1 was that it was less intensive, showing that the industry was "getting close to open-source o1, at home, on consumer hardware," referring to OpenAI's reasoning model introduced last year.

The model can be "distilled," meaning smaller but still powerful versions can run on far less powerful hardware than the data-center servers many tech companies depend on to run their AI models.
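In broad strokes, distillation trains a small "student" model to mimic the output probabilities of a large "teacher" model rather than learning from labels alone. The sketch below is a minimal, illustrative NumPy version of the standard distillation loss (temperature-softened softmax plus KL divergence); it is not DeepSeek's actual recipe, and all names and numbers are made up for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax: higher T spreads probability mass,
    exposing more of the teacher's 'dark knowledge' about wrong answers."""
    z = logits / temperature
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions --
    the quantity the student minimizes to imitate the teacher."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * np.log(p / q)))

teacher = np.array([4.0, 1.0, 0.5])   # big model's raw scores (illustrative)
student = np.array([3.5, 1.2, 0.4])   # small model's scores (illustrative)
loss = distillation_loss(teacher, student)
```

A training loop would backpropagate this loss through the student's parameters; when the student's distribution matches the teacher's, the loss falls to zero.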

Hannun demonstrated this by sharing a clip on X of a 671-billion-parameter version of R1 running on two Apple M2 Ultra chips, reasoning through a prompt asking whether a straight or a flush is better in a game of Texas Hold 'em. Hannun said its response came "faster than reading speed."

AI censorship

R1 does appear to have one key problem. Former OpenAI board member Helen Toner pointed out on X that there are demos of R1 "shutting itself down when asked about topics the CCP doesn't like."

Toner did suggest, however, that "the censorship is obviously being done by a layer on top, not the model itself." DeepSeek did not immediately respond to Business Insider's request for comment.

It is worth noting, of course, that OpenAI has introduced a new model called o3, meant to be a successor to the o1 model DeepSeek is currently rivaling. Lambert wrote in his blog that o3 was "likely technically ahead," with the key caveat that the model is "not generally available," nor will basic information like its "weights" be available anytime soon.

Given DeepSeek's track record so far, don't be surprised if its next model reaches parity with o3. America's tech leaders may have met their match in China.

Read the original article on Business Insider

โŒ
โŒ