DeepSeek is a wake-up call for the AI ecosystem

Assif Shameen

The news that China’s generative artificial intelligence (Gen AI) lab DeepSeek has seemingly usurped the position of OpenAI’s pioneering ChatGPT as the most downloaded free app on Apple’s App Store sent AI-related stocks reeling on Jan 27. The US$1 trillion ($1.36 trillion) stock-market rout was one of the worst in history, hitting stocks with exposure to AI particularly hard. Chip giant Nvidia went from the world’s largest firm by market value to No. 3. Within days, however, it had pipped software giant Microsoft to retake the No. 2 spot behind iPhone-maker Apple.

The emergence of DeepSeek is being seen as the first visible challenge to the costlier models from the likes of OpenAI and Anthropic, and it has raised questions over the hundreds of billions of dollars in planned spending on the technology by the likes of Microsoft, Instagram’s owner Meta Platforms and Alphabet’s Google. “DeepSeek achieved competitive performance with significantly fewer resources, using only 2,048 graphics processing units (GPUs) for 57 days compared to the 16,000 to 100,000 GPUs typically required,” notes Morgan Brown, vice-president for product development and growth at cloud storage firm Dropbox. The upstart AI lab demonstrated that high-performing AI models can be built with fewer resources than previously thought necessary, and that China has developed advanced AI capabilities despite US export controls on high-end GPUs.

DeepSeek’s success is all the more remarkable given the constraints Chinese AI companies face: tightening US export controls on cutting-edge chips from the likes of Nvidia, and limited access to Taiwan Semiconductor Manufacturing Co’s (TSMC) state-of-the-art chip plants and to the tools and equipment made by Dutch firm ASML. Clearly, these measures are not working as intended. Rather than weakening China’s AI capabilities, the sanctions may be driving start-ups like DeepSeek to innovate in ways that prioritise efficiency, resource-pooling and collaboration.

DeepSeek began as a side project of High-Flyer Capital Management, a Hangzhou-based hedge fund with US$14 billion in assets under management that uses quantitative mathematical and statistical models to develop trading strategies. Its boyish-looking “math nerd” founder Liang Wenfeng, 40, an alumnus of Zhejiang University, incubated DeepSeek because he wanted to use AI to help him pick stocks and to build artificial general intelligence (AGI), a form of AI that can match or beat humans on a range of tasks.

In December, DeepSeek released its V3 model with 671 billion parameters, trained for just US$5.58 million, a fraction of what labs like OpenAI typically spend. The Chinese upstart matched industry leaders like OpenAI’s GPT-4 and Anthropic’s Claude 3.5 Sonnet on benchmarks while being fully open source. On Jan 20, the Chinese chatbot creator released DeepSeek R1, a reasoning model that achieved strong results on mathematical and logical tasks.

How did an upstart Chinese quant firm beat the likes of Microsoft, Google, Elon Musk’s xAI and OpenAI? Necessity is the mother of invention. Denied access to top-end technology, chips and software, Liang and High-Flyer had to do more with less to keep pace with their US counterparts, rethinking everything from the ground up. Right now, training top AI models is insanely expensive, says Dropbox’s Brown. “OpenAI and Anthropic spend US$100 million-plus just on computing or AI chips designed by Nvidia. They need massive data centres with thousands of GPUs that each cost US$40,000. It’s like needing a whole power plant to run a factory. DeepSeek just showed up and said: ‘What if we did this for US$5 million instead?’ Its models match or beat GPT-4 and Claude on many tasks.”

Traditional AI models store every number in full 32-bit precision; DeepSeek used just eight bits. It is still accurate enough but needs 75% less memory. DeepSeek also reads in whole phrases at once, unlike normal AI, which reads like a first-grader: word by word. It is twice as fast and 90% as accurate. “When you’re processing billions of words, it matters,” Brown says. DeepSeek also built a mixture-of-experts system. “Instead of one massive AI trying to know everything, like having one person be a doctor, another a lawyer and an engineer, they have specialised experts that only wake up when they are needed,” he notes.
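
To make the memory arithmetic concrete, here is a minimal, illustrative Python sketch (not DeepSeek’s code) of how storing weights as 8-bit integers instead of 32-bit floats cuts memory by 75% while keeping values close to the originals:

```python
import numpy as np

# Illustrative only: the same million weights stored in 32-bit floats
# versus 8-bit integers after simple linear quantisation.
weights_fp32 = np.random.randn(1_000_000).astype(np.float32)

scale = np.abs(weights_fp32).max() / 127.0          # map the float range onto 256 integer levels
weights_int8 = np.round(weights_fp32 / scale).astype(np.int8)

print(f"32-bit copy: {weights_fp32.nbytes / 1e6:.1f} MB")   # ~4.0 MB
print(f" 8-bit copy: {weights_int8.nbytes / 1e6:.1f} MB")   # ~1.0 MB, i.e. 75% less memory

# Dequantise to confirm accuracy is largely preserved
recovered = weights_int8.astype(np.float32) * scale
print(f"worst-case rounding error: {np.abs(weights_fp32 - recovered).max():.4f}")
```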

Compare that with what traditional AI models tend to do: all 1.8 trillion parameters are active all the time for the models from OpenAI, Anthropic, Meta and Google. DeepSeek keeps only 37 billion of its 671 billion parameters active at once. “It’s like having a huge team but only calling in the experts you actually need for each task,” Brown says. That cut the training cost from US$100 million to about US$5 million; the number of GPUs needed for training was slashed from 100,000 units to just 2,000; application programming interface (API) costs are 95% cheaper; and the DeepSeek model can run on cheaper gaming GPUs instead of the expensive data-centre hardware that Microsoft, Google and xAI buy from Nvidia. Brown also notes that DeepSeek’s AI model is open source. “Anyone can check their work. The code is public. It’s just incredibly clever engineering.”
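
The “experts that only wake up when needed” idea is a mixture-of-experts architecture. The toy Python sketch below, an illustration of the general technique with made-up sizes rather than DeepSeek’s implementation, shows how a router activates only the top few experts per token, so the active parameter count stays a small fraction of the total:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d_model, top_k = 16, 512, 2   # toy sizes for illustration only

# Each "expert" is just a weight matrix here; a real model uses full networks.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_layer(token_vec):
    scores = token_vec @ router                    # router scores every expert for this token
    chosen = np.argsort(scores)[-top_k:]           # only the top-k experts "wake up"
    gates = np.exp(scores[chosen])
    gates /= gates.sum()                           # softmax over the chosen experts only
    return sum(g * (token_vec @ experts[i]) for g, i in zip(gates, chosen))

out = moe_layer(rng.standard_normal(d_model))
active_share = top_k / n_experts
print(f"parameters active per token: {active_share:.1%}")   # 12.5% in this toy; ~5.5% for DeepSeek (37B of 671B)
```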

Microsoft and OpenAI are now questioning how DeepSeek got to where it is so fast, so cheaply. To train its models, DeepSeek leveraged a process known as knowledge distillation, by which the learnings of a well-developed large “teacher model” are transferred to a smaller “student model”. The goal of the process is to train a more compact model to mimic the larger model at a fraction of the cost to operate.
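
Knowledge distillation, in its standard textbook form, trains the student to match the teacher’s softened output distribution. The PyTorch sketch below illustrates that general recipe; it is a hedged toy example, not DeepSeek’s training code:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both output distributions so the student also learns which
    # wrong answers the teacher considers "nearly right".
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student, scaled by T^2 by convention.
    return F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature**2

# Toy usage: a batch of 4 examples over a 10-token vocabulary.
teacher_logits = torch.randn(4, 10)                       # from the large "teacher model"
student_logits = torch.randn(4, 10, requires_grad=True)   # from the compact "student model"
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()                                           # gradients flow only into the student
print(float(loss))
```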

JPMorgan’s Michael Cembalest concedes that DeepSeek probably violated OpenAI’s terms of service and copyright. The irony, he says, is that OpenAI spent years training its own models on other people’s data.

German tech writer Jan Kammerath notes that the Chinese start-up’s chatbot referred to a single-party state as a “dictatorship”. “Why would it reject a one-party system unless it was trained on Western data with strong ideological beliefs? ... Why does it talk about presidents and ‘best cities to live in’ by talking about American ones, even when asked in German?” Kammerath noted in a Medium post that when asked to program an impossible graphics function, DeepSeek’s answer was 95% similar to ChatGPT’s but very different from the garbage that Microsoft Copilot and Google’s Gemini produced, which he sees as further evidence that it was trained on the output of a Western model.

Classic disruption story
DeepSeek shredded the belief that only mega-cap tech firms spending hundreds of billions can play in the AI league. You don’t need a billion-dollar data centre anymore; a few good GPUs will do. Nvidia’s entire business model is built on selling super-expensive GPUs with 80%-plus margins. If everyone can suddenly do AI with regular gaming GPUs, then Nvidia has a problem.

It took just 200 young researchers for DeepSeek to build its game-changing chatbot app. Meta, on the other hand, has teams whose compensation alone exceeds DeepSeek’s entire training budget, and their models aren’t as good. “This is a classic disruption story: Incumbents optimise existing processes, while disruptors rethink the fundamental approach,” Brown wrote on X, formerly Twitter. “What if we just did this smarter instead of throwing more hardware at it?” DeepSeek’s founder Liang asked.

The implications of DeepSeek’s emergence as a frontline AI player are huge: AI development becomes more accessible; competition increases dramatically; users can access AI at a fraction of the cost; the “moats” of big tech companies look more like puddles; and hardware requirements and costs plummet. It is unlikely that the giants will stand still. They are probably already working on or indeed even implementing these innovations. Clearly, the efficiency genie is out of the bottle. The “just throw more GPUs at it” approach no longer works. Musk tried to do just that with his xAI last year when he acquired as many GPUs as he could wrangle out of Nvidia. 

“This feels like one of those moments we’ll look back on as an inflection point,” says Brown. It’s like when PCs made mainframes less relevant, or when cloud computing changed everything. Prominent tech blogger Ben Thompson notes that there are two big winners from the DeepSeek breakthrough. The first is iPhone-maker Apple, which, unlike other tech giants, did not spend tens of billions on large language models (LLMs), and instead bet on small language models running on its own Apple silicon for edge AI, which processes data and runs algorithms directly on an endpoint device like the iPhone. “Apple silicon uses unified memory and is the best consumer chip for inference,” Thompson notes.

The other clear winner is Meta. “Dramatically cheaper inference and cheaper training” will help Meta be a formidable AI player, he adds.

Edison Lee, an analyst for Jefferies & Co in Hong Kong, believes DeepSeek’s breakthrough will push President Trump and Musk to recognise that further restrictions on AI chips risk forcing China to innovate faster. “Trump could relax controls on the sale of AI chips if that can be used as a bargaining chip as part of a broader deal with China,” notes veteran equity strategist Christopher Wood. While Nvidia might lose a bit as US hyperscalers cut back on AI chips, it will easily make up for that by selling more chips to China.

As higher-quality language models become available for free, market forces will help drive down prices. That will benefit end-users of AI, people like you and me. David Sacks, Trump’s “AI czar”, says a high-quality open-source model is a wake-up call for America. Until now, the US was believed to have a huge lead over China in AI, particularly because it was thought to be five years ahead in AI chips. Clearly, that gap has narrowed. That’s important because leadership in AI will not only determine economic primacy but will also be increasingly important for military effectiveness through enhanced decision-making and autonomous systems.

How Nvidia will be affected
A seismic shift in AI computing could be troublesome for Nvidia, whose valuation has been closely tied to its ability to sell expensive and immensely profitable advanced chips. “DeepSeek is an excellent AI advancement and a perfect example of Test Time Scaling,” an Nvidia spokesperson said in a statement, referring to the technical approach that has made reasoning models like OpenAI’s o1 and DeepSeek’s R1 possible. However, it downplayed the idea that DeepSeek’s model reduces the need for its advanced hardware. The process of training an AI model still “requires significant numbers of Nvidia GPUs and high-performance networking”, it said.

Nvidia sold AI chips, software and services worth an estimated US$130 billion in its just-ended fiscal year, and analysts expect it to earn over US$100 billion in net profit in the year ahead. Until two weeks ago, Wall Street was putting a 34 multiple on those earnings, giving the company a market capitalisation of US$3.4 trillion, almost neck-and-neck with Apple. Since Nvidia has sold almost all of the chips that it can make this year, its revenue is safe, as is its net profit.

Google will spend US$75 billion and Meta US$65 billion on AI infrastructure this year, mostly on Nvidia chips. What has changed is that Wall Street collectively feels Nvidia now deserves only a 28 multiple, or a market cap of US$2.8 trillion. The concern is whether the tech behemoth can keep growing sales of its top-end AI chips and keep persuading hyperscalers to buy, at a premium price, the software and services it sells as part of the chip bundle.
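
The back-of-envelope arithmetic behind those market-cap figures is straightforward, as this short sketch shows (using the roughly US$100 billion earnings estimate cited above):

```python
net_profit = 100e9                 # ~US$100 billion estimated annual net profit

for multiple in (34, 28):          # Wall Street's earlier vs revised earnings multiple
    market_cap = net_profit * multiple
    print(f"{multiple}x earnings -> roughly US${market_cap / 1e12:.1f} trillion market cap")
```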

If AI requires fewer GPUs and less computing power, fewer data centres will be needed. On his second day in office, Trump got SoftBank founder Masayoshi Son, Oracle chairman Larry Ellison and OpenAI CEO Sam Altman together to announce an investment of US$500 billion in data centres in the US. If the world needs fewer data centres, projects from Asia to America will have to be scaled down or abandoned. That, however, is unlikely to push the US economy into a recession the way the bursting of the tech bubble in 2000 did. Software investments will likely rise and offset declines in data-centre spending, which in turn could lead to strong productivity gains and even reinforce US exceptionalism.

Assif Shameen is a technology and business writer based in North America
