DeepSeeking the true value of Gen AI

Nurdianah Md Nur and Nicole Lim • 22 min read
Said to be based on open-source models that are on par with or surpass OpenAI’s ChatGPT o1, DeepSeek could spur generative AI advancements and accelerate real-world deployment. Photo: Bloomberg

A small Chinese AI firm has shaken financial markets with language and reasoning models developed at a fraction of the cost of their US counterparts. How will this achievement shape the future of AI?

Generative AI (Gen AI) has once again sent shockwaves worldwide, but this time, a Chinese name is hogging the headlines. In just a week, Chinese AI firm DeepSeek has rattled the tech community, investors and government leaders alike, shattering the long-held belief that Gen AI must be costly and energy-intensive to run.

DeepSeek claims its V3 large language model (LLM) and reasoning model, R1, are on par with or surpass OpenAI’s o1 on key benchmarks despite operating at a fraction of the cost and on less advanced Nvidia chips. Reasoning models are the latest fad in the Gen AI world. They are essentially LLMs that break a question into smaller parts and explore different ways of answering it.

Breaking a question down this way helps the model handle complex problems in a manner that mimics human thinking, but it typically requires more computing power and energy than simpler AI tasks like pattern recognition.

Impact on AI infrastructure

DeepSeek’s models are estimated to be 20 to 40 times less expensive than OpenAI’s. Since ChatGPT was launched, OpenAI has drastically reduced the cost of its models. GPT-4o tokens now cost about US$4 ($5.44) per million tokens, compared to GPT-4’s price of US$36 per million tokens at its initial release in March 2023.
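As a rough illustration of what that price gap means in practice, the arithmetic below compares the two cited per-million-token rates over an assumed workload. The 50-million-token monthly figure is a hypothetical example, not a number from the article:

```python
# Hypothetical cost comparison using the per-million-token prices cited above.
# WORKLOAD is an assumed monthly volume, chosen purely for illustration.

def monthly_cost(tokens: int, usd_per_million: float) -> float:
    """Return the USD cost of processing `tokens` at the given rate."""
    return tokens / 1_000_000 * usd_per_million

WORKLOAD = 50_000_000  # assumed: 50 million tokens per month

gpt4_launch = monthly_cost(WORKLOAD, 36.0)  # GPT-4 at its March 2023 price
gpt4o_now = monthly_cost(WORKLOAD, 4.0)     # GPT-4o's current cited price

print(f"GPT-4 (2023): ${gpt4_launch:,.0f}")              # $1,800
print(f"GPT-4o (now): ${gpt4o_now:,.0f}")                # $200
print(f"Reduction:    {gpt4_launch / gpt4o_now:.0f}x")   # 9x
```

A further 20 to 40 times reduction on top of that, as DeepSeek claims, would shrink the same hypothetical bill to single-digit dollars.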


This considerable price differential raises questions about the necessity of large investments in AI infrastructure by governments, hyperscalers, data centre providers and telcos.

For instance, US President Donald Trump recently announced the new US$500 billion Stargate AI venture from OpenAI, SoftBank and Oracle — a colossal leap from the US$100 billion previously floated. Meanwhile, Singapore said it will invest up to $500 million to secure high-performance computing resources to power AI innovation in the private and public sectors.

“We’ve always said that the cost of intelligence will continue to fall. We’ve shown this with 4o-mini and o1-mini so more people can access and also try to prioritise features, like low latency and enterprise-grade reliability, that are important to the user and customer experience,” says OpenAI in a written response to The Edge Singapore on the emergence of a cost-effective challenger.


“Our frontier models continue to set the standard and we’re just getting started. The reasoning paradigm is still in its infancy, yet we’ve seen tremendous progress — from o1 to o3 — in just months. There’s so much more to come. Over time, we believe the cost of intelligence will continue to fall exponentially, while demand and consumption will grow dramatically.”

Singtel’s regional data centre arm, Digital InfraCo, has been leasing GPU-as-a-Service (GPUaaS) to enterprises since 3Q2024. It says that having more foundation models like DeepSeek will only encourage more enterprises to adopt AI to transform their businesses and become more efficient.

CEO Bill Chang says its GPUaaS business is mostly utilised by large enterprises, including many publicly listed companies and agencies, which are increasingly embarking on AI use cases. “Models like DeepSeek, which can run on much smaller GPUs, will make AI more affordable to enterprises, thereby driving the volume of enterprises adopting AI and creating more demand for AIaaS offerings from RE:AI,” says Chang.

RE:AI is the AI cloud service of Singapore Telecommunications’ (Singtel’s) Digital InfraCo. Chang notes that most customers currently use GPUaaS to train their enterprise models, run retrieval-augmented generation experiments and perform fine-tuning exercises. Meanwhile, inferencing is still in its early stages, and he expects it to grow in the next 18 months as fine-tuning exercises yield more stable and usable models for customers.

Cutting AI’s computing costs could also ease environmental worries. The data centres powering these models guzzle electricity and water, mostly to keep servers from overheating, while also occupying land and producing electronic waste. Current estimates suggest that data centres account for 3% of global electricity consumption, with predictions indicating a rise to as much as 10% by 2030.

Governments around the world have scrambled to reduce the environmental impact of AI. Last May, Singapore launched a green data centre roadmap aiming to add at least 300 megawatts (MW) of capacity in the near term, with more to come through green energy.

The plan includes boosting the energy efficiency of all data centres in the city-state, deploying energy-efficient IT equipment and offering incentives or grants for resource efficiency. No announcement has been made about how the green data centre roadmap will be executed.


Meanwhile, data centre operators and hyperscalers like Equinix and Google are exploring new cooling methods and clean energy sources. For instance, recycled water is being reused multiple times to reduce a data centre’s water intake, while direct-to-chip liquid cooling helps dissipate heat from AI chips more efficiently, using less energy than traditional methods by directly targeting the heat source at the chip level. They are also considering using nuclear energy to power their energy-intensive data centres.

More efficient AI models like DeepSeek can complement those efforts as they use fewer resources and less energy without compromising performance. Recognising this, OpenAI told The Edge Singapore that it is “constantly working to improve efficiency”. “We carefully consider the best use of our computing power and support our partners’ efforts to achieve their sustainability goals. We also believe that AI can play a key role in accelerating scientific progress in the discovery of climate solutions.”

Experts, however, warn of the possibility of Jevons paradox, in which greater (model) efficiency could drive down costs, fuelling higher demand and offsetting the savings.

When bigger isn’t always better

DeepSeek’s cost-cutting achievement has been attributed to the “mixture of experts” (MoE) technique, wherein the AI model comprises smaller models, each with expertise in specific domains. When given a task, the AI model only activates the specific “experts” (or smaller models) needed, significantly reducing computation costs during pre-training and achieving faster performance during inference time.
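The routing idea can be sketched in a few lines. The “experts”, affinity scores and top-2 selection below are toy stand-ins, not DeepSeek’s actual architecture; the point is simply that only the highest-scoring experts are ever evaluated:

```python
# Toy mixture-of-experts routing: only the top-k scoring "experts" run for a
# given input, which is how MoE cuts computation. Experts, scores and k=2
# are illustrative placeholders.

def route(token, experts, scores, k=2):
    """Run only the k highest-scoring experts and blend their outputs."""
    chosen = sorted(scores, key=scores.get, reverse=True)[:k]
    total = sum(scores[name] for name in chosen)
    # Experts outside `chosen` are never evaluated at all.
    return sum(scores[name] / total * experts[name](token) for name in chosen)

experts = {
    "maths": lambda t: len(t) * 2.0,  # placeholder computations
    "code":  lambda t: len(t) * 3.0,
    "prose": lambda t: len(t) * 1.0,
}
scores = {"maths": 0.1, "code": 0.7, "prose": 0.2}  # router's per-expert affinity

out = route("def f():", experts, scores)
# Only "code" and "prose" were activated; "maths" never ran.
```

In a real MoE model the router is itself learned, and the experts are neural sub-networks rather than simple functions, but the saving comes from the same place: most of the model sits idle for any one token.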

Neither the MoE technique nor the idea of small language models (SLMs) is new. Companies such as French AI company Mistral and IBM have been popularising the MoE architecture over the past year and saw greater model efficiency by combining the technique with open source.

IBM’s Granite 13B is one such example. “Despite being five times smaller than models like LLaMA-2 70B, Granite 13B performs competitively across various tasks, particularly in specialised fields like finance and cybersecurity. Additionally, available in base and instruction-following model variants, Granite is especially suitable for tasks such as complex application modernisation, code generation, fixing bugs, explaining and documenting code, maintaining repositories and more,” claims Tan Siew San, general manager of IBM Singapore.

Instead of focusing on the size of an AI model, Tan emphasises the need for businesses to be able to customise and tailor their foundation models for evolving use cases — in short, to have fit-for-purpose AI models.

“Think of a bus that is carrying just one passenger. Is that the most efficient way to transport that person? In the world of Gen AI, that’s like an enterprise running a complex LLM of more than 70 billion parameters to complete specific tasks that are only accessing and using up to 2% of the data in the model. They do not need to run (or pay for) a model that large. Many enterprise use cases are best served with the enterprise’s own data, and every use case has unique needs. The key for businesses is finding a way to tap into that valuable data by picking the right ‘vehicle’ to eliminate those costly ‘empty seats’,” says Tan.

The usefulness of SLMs is more prominent in enterprises operating in specialised domains like telecommunications and healthcare. Tan says: “With SLMs, the cost of training them with domain-specific enterprise data is lower as they are not retraining an entire model with hundreds of billions of parameters. In addition, these models can be hosted in an enterprise’s data centre instead of the cloud; computation and inferencing take place as close to the data as possible, making it faster and more secure than through a cloud provider.”

Ying Shao Wei, senior partner and chief scientist at NCS, shares a similar view. “Designed for specific tasks, SLMs demonstrate that size is not everything. These models are highly efficient at handling specialised tasks with minimal computational resources, resulting in less energy consumption and a reduced environmental footprint. Given the looming US export controls on AI chips and the restriction of closed model weights, we anticipate a rise in SLMs as businesses seek cost-effective, task-focused solutions.”

Despite the benefits of SLMs, LLMs will continue to have a role in the enterprise as they are suited for tasks requiring a broad understanding of various topics and handling complex queries. However, LLMs have been known to “hallucinate” (produce incorrect or misleading results) and suffer from model drift (wherein their predictive accuracy degrades from its level during the training period) over time.

Common ways of addressing the accuracy and reliability concerns around LLMs include retrieval-augmented generation (RAG) and fine-tuning. RAG plugs an LLM into an organisation’s proprietary database so that the model can return more accurate responses with the added context of the internal data. Meanwhile, fine-tuning means retraining a model on a focused set of data so that the model generates more accurate, domain-specific results.
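A minimal sketch of the RAG pattern is shown below, with keyword overlap standing in for the vector search a production system would use. The documents and query are invented examples:

```python
# Minimal retrieval-augmented generation sketch: score documents by word
# overlap with the query, then prepend the best matches to the prompt.
# A real system would use embeddings and a vector store instead.

def retrieve(query, docs, k=2):
    """Rank docs by words shared with the query; return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    """Assemble the retrieved context plus the user's question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "Refund requests must be filed within 30 days of purchase.",
    "The office cafeteria opens at 8am on weekdays.",
    "Refund approvals are processed by the finance team within 5 days.",
]
prompt = build_prompt("When will my refund be processed", docs)
# Both refund documents are retrieved; the cafeteria note is not.
```

The LLM then answers the question using only the retrieved context, which is why RAG can stay current without retraining: updating the document store updates the answers.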

“By leveraging external, up-to-date knowledge sources, RAG minimises the need for expensive and resource-heavy retraining. It allows businesses to access the most current data without overhauling entire models. While fine-tuning LLMs remains valuable for highly specialised applications, it can require significant investment, making RAG a more versatile and economical alternative for many enterprises,” says NCS’s Ying.

He continues: “The combination of SLMs, RAG, and fine-tuning will play a crucial role in shaping the future of AI. Each approach offers distinct advantages depending on the specific needs of the business, and the growing diversity in AI solutions ensures that companies can choose the most appropriate tools to optimise both performance and cost-efficiency.”

AI agents taking action

DeepSeek’s R1 also uses reinforcement learning, in which an AI agent learns to optimally perform a task (or make a decision) through trial and error without any instructions from a human user. What is interesting is that AI agents are more action-oriented and autonomous than LLM-based chatbots, which create content based on human input.
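The trial-and-error loop can be illustrated with a toy example. This is a generic value-update sketch with made-up rewards, far simpler than the reinforcement learning used to train R1, but it shows how good behaviour emerges from feedback alone:

```python
# Toy reinforcement-learning loop: the agent repeatedly tries actions,
# observes rewards, and updates its value estimates -- trial and error with
# no human-written rules. The actions and rewards are invented stand-ins.
import random

random.seed(0)  # fixed seed so the illustration is reproducible
rewards = {"A": 1.0, "B": 5.0, "C": 2.0}  # hidden payoff of each action
values = {a: 0.0 for a in rewards}        # the agent's learned estimates
ALPHA, EPSILON = 0.5, 0.2

for step in range(200):
    if random.random() < EPSILON:          # explore occasionally
        action = random.choice(list(values))
    else:                                  # otherwise exploit the best estimate
        action = max(values, key=values.get)
    reward = rewards[action]
    values[action] += ALPHA * (reward - values[action])  # TD-style update

best = max(values, key=values.get)
# After enough trials the agent's estimates identify "B" as the best action.
```

No one told the agent that “B” pays best; it discovered that purely from the rewards it observed, which is the core idea behind reinforcement learning.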

The tech industry believes AI agents are the next phase of Gen AI and will bring us a step closer to making artificial general intelligence a reality. OpenAI recently released an AI agent called Deep Research, which can conduct time-consuming and complex online research on various topics. This is in addition to its Operator AI agent, which can help users book flights, plan grocery orders, and even complete purchases. Both AI agents are available to the platform’s Pro subscribers on ChatGPT’s online chatbot.

For enterprises, Salesforce and Zendesk offer out-of-the-box AI agents that can autonomously handle basic customer queries or lead qualifications to deliver better customer experience.

Investors are also optimistic that AI agents will be the next frontier of Gen AI. According to CB Insights, AI start-ups attracted 37% of venture capital funding and 17% of deal activity in 2024. Autonomous AI agents saw the largest growth in venture capital deal funding, up 150% y-o-y last year.

“Real-world applications of AI agents span across multiple domains. For example, AI agents can autonomously place orders with suppliers or adjust production schedules to maintain op­timal inventory levels. In healthcare, AI agents can monitor patient data, adjust treatment recommendations based on new test results and provide real-time feedback to clinicians,” says IBM’s Tan.

AI agents can perform complex tasks with a high degree of autonomy as they leverage multiple AI models that operate across various data types. “In practice, AI agents often involve a combination of traditional AI models and generative AI models, including SLMs, which support distinct steps in each workflow. SLMs are crucial in handling specific tasks like real-time speech-to-text conversion. In an agentic AI system, these models work together to provide comprehensive solutions. With the use of SLMs, AI agents can then be fine-tuned for specific tasks, delivering highly effective outcomes and revolutionising industries reliant on intricate workflows,” explains NCS’s Ying.

In the case of a call centre, SLMs can be trained to understand local accents and unique vocabularies, transcribing audio signals into text. The transcribed text is then processed by LLMs that assist human agents by suggesting responses. After a call, additional LLMs can handle tasks such as summarising the conversation, assessing quality and compliance, and alerting supervisors of discrepancies. This seamless workflow — where different AI models are integrated and specialised — has led to measurable gains in productivity, as exemplified in a project NCS did for Singapore’s Ministry of Manpower (MOM).
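The staged workflow described above can be sketched as chained functions. The stubs below only illustrate how the stages hand off to one another; a real deployment would call speech-to-text and LLM services at each step, and the compliance check here is a deliberately crude placeholder:

```python
# Sketch of a multi-model call-centre workflow. Each stage is a stub standing
# in for a real model call; only the hand-off structure is the point.

def transcribe(audio: str) -> str:
    """SLM stage: speech-to-text (stub returns its input unchanged)."""
    return audio

def suggest_reply(transcript: str) -> str:
    """LLM stage: draft a response for the human agent (stub)."""
    return f"Suggested reply to: {transcript!r}"

def summarise(transcript: str) -> dict:
    """Post-call LLM stage: summary plus a crude compliance flag (stub)."""
    return {
        "summary": transcript[:40],
        "flagged": "refund" in transcript.lower(),  # placeholder check
    }

call = transcribe("Customer asks about a refund for order 123")
reply = suggest_reply(call)     # shown to the human agent during the call
report = summarise(call)        # generated after the call ends
# report["flagged"] is True here, so a supervisor would be alerted.
```

Keeping each stage as a separate, swappable model is what lets an operator put a small on-premise model on transcription while reserving larger models for drafting and summarising.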

NCS partnered with AWS to add Gen AI features to MOM’s contact centre, reducing handling time by 12% and cutting average after-call work by over 50%, Ying shares.

Using AI tools also streamlined the tasks of call centre agents, allowing them to focus more on each caller’s specific queries. Job satisfaction increased and overall productivity improved by 6%.

Dr Leslie Teo, senior director of AI Products at AI Singapore, agrees that SLMs offer a complementary approach to AI agents. “AI agents are not a specific model per se but rather a framework and approach for groups of AI models (which may or may not be LLMs or SLMs) working to accomplish a particular task. With their compact size and optimised architectures, SLMs can perform complex reasoning and decision-making tasks required by AI agents while maintaining speed and efficiency. SLMs can also be trained to understand specific contexts and domains, allowing AI agents to operate effectively in their intended environments,” he adds.

Generative AI at the edge

SLMs, AI agents and efficient models like DeepSeek are expected to accelerate the advancement of edge devices such as robots, personal computers (PCs) and smartphones.

“The edge AI (or AI-powered devices) and mobile AI space could see long-term implications. While DeepSeek is still a cloud-first model, its efficiency breakthroughs point toward a future where powerful AI can run more effectively on local hardware. This could drive further investment in AI-optimised chips, benefiting firms like Qualcomm, Apple and AMD,” says Leslie Joseph, principal analyst at research and advisory firm Forrester.

NCS’s Ying adds that SLMs and AI agents can make “AI-powered devices more efficient, privacy-conscious, and widely accessible” by minimising the need to transmit sensitive data to cloud services.

SLMs enhance smartphone functionality and privacy. The latest devices from Google and Samsung feature Google’s smallest AI model, Gemini Nano, while iOS devices integrate on-device foundation models. By doing so, users can interact with their phones more seamlessly without needing an Internet connection or cloud-based processing, ensuring sensitive data stays on the device.

Research firm International Data Corp (IDC) predicts that Gen AI-enabled phones will be “the next big thing the mobile industry has to offer consumers”. Worldwide shipments of such smartphones are expected to grow at a CAGR of 78.4% to reach 912 million units by 2028.

As for AI robots, SLMs empower them with efficient, on-the-fly language understanding and decision-making. This allows robots to interpret instructions, navigate environments, and perform tasks autonomously without heavy reliance on cloud resources. “Due to their smaller size, SLMs are ideal for edge devices, where power and computational resources are limited. This makes robots more agile and responsive, enhancing their ability to perform real-time tasks in dynamic, real-world environments,” says NCS’s Ying.

AI robots are expected to become mainstream soon. A Citi GPS report in December 2024 indicated investors’ optimism as venture capital investment in robotics reached US$10 billion in 2023, of which 38% went to Asia. The report also estimates that the global AI robot population will hit 1.3 billion by 2035 and increase to 4 billion by 2050. These robots will transform various sectors, including healthcare, manufacturing and hospitality.

Embedding AI into PCs

First introduced in late 2023, AI PCs are expected to see greater adoption this year due to AI advancements and the promise of improved productivity. Data from research firm Canalys shows that 13.3 million AI PC units were shipped globally in 3Q2024, accounting for one-fifth of all PC shipments that quarter. The top three Windows-based AI PC providers were HP, Lenovo and Dell.

These AI PCs are equipped with neural processing units (NPUs), or specialised AI chips, which can manage complex computations more efficiently than regular PCs can.

“AI PCs represent a significant leap in computing, requiring a sophisticated interplay of hardware and software. Central to this, the NPU handles AI-specific workloads, freeing up the CPU and GPU to focus on their core functions. This specialised processing allows for significantly faster speeds, enabling complex tasks like running generative AI models and AI-assistant applications locally on the PC,” says Jacinta Quah, vice-president for the APJC client solutions group at Dell Technologies.

She continues: “Integration with small language models and AI applications is also essential for providing the intelligent features users anticipate, including improved search capabilities, studio effects, and real-time translations, [even when the PC is not connected to the Internet].”

AI PCs are also energy efficient. Quah says: “Specialised energy efficiency and cooling solutions are important, especially for higher-performance AI PCs, to ensure optimal performance and longevity. Efficient thermal design is another key component for managing the heat generated by high-performance components such as NPUs.”

Enhanced personalisation is another advantage of AI PCs. “AI PCs leverage on-device AI capabilities to learn and adapt to individual needs and preferences to help streamline workflows, optimise performance, and enhance user experience,” says Ivan Cheung, vice-president and chief operating officer for Asia Pacific at Lenovo.

Lenovo AI Now, for example, is an on-device AI agent that offers user-adaptive computing performance. This is in addition to the AI tools running locally on Lenovo’s AI PCs for task automation, document summaries, and natural language interactions.

Lenovo's ThinkPad X9 Aura Editions feature advanced AI tools like Lenovo AI Now for task automation and workflow optimisation. Photo: Lenovo

Cheung also highlights that AI PCs process data locally on the device instead of in cloud services or platforms, minimising the risk of data breaches and unauthorised access. “This is paramount for data privacy in the AI era, also helping enterprises to comply with regulatory requirements.”

Since AI PCs come at a premium, price-sensitive consumers and businesses have been hesitant to adopt them. In response, AI PC makers are introducing devices at various price points. “Our AI PC lineup — Dell, Dell Pro, and Dell Pro Max — offers a range of configurations across silicon partners (AMD, Intel, Qualcomm, Nvidia), allowing customers to choose the AI capability that best fits their needs and budget,” says Dell’s Quah.

AI PC makers also expect users to recognise the productivity benefits of AI PCs soon, driving greater demand for these devices.

According to Dell’s Quah, AI PCs are reshaping how we work and interact with technology in two ways. The first is the integration of intelligence into familiar productivity apps, which elevates users’ productivity by automating tasks, providing insights, and streamlining workflows, making every action more impactful.

“Equipped with more compact language models, AI PCs also allow businesses to tailor their tools to meet specialised needs in customer service or retail. With advancements in SLMs, these devices can now efficiently support multiple models running simultaneously. This capability opens the door for highly customised, intelligent workflows directly on the PC,” she says.

Lenovo’s Cheung adds that AI PCs will cater to diverse users. “Beyond helping users be more productive and businesses gain a competitive edge, AI PCs can enhance workflows for creative users and deliver the next generation of immersive and interactive experiences to gamers. Even educators are recognising the potential of AI PCs to revolutionise education by providing personalised learning experiences.”

Realising Gen AI’s promise

By open-sourcing R1, DeepSeek enables researchers and developers to use, modify and commercialise the model freely. This could ultimately accelerate the real-world deployment of Gen AI.

“The rise of models like DeepSeek will shift the focus from model development to deployment [that will drive] more practical applications of AI. Companies that prioritise data readiness and ensuring AI outputs are trustworthy, explainable, and actionable will emerge as leaders, while those clinging to outdated notions of AI exceptionalism risk falling behind,” says Mike Capone, CEO of Qlik, a data integration, quality and analytics solutions provider.

He adds: “Ultimately, this shift presents an opportunity for global businesses, especially in Asia. With the region’s diverse linguistic and cultural landscape, localised AI models like AI Singapore’s Sea-Lion have already demonstrated the potential for tailored innovation. DeepSeek accelerates this trend by lowering barriers to entry, encouraging the development of region-specific AI solutions that can cater to unique local demands.”

The Southeast Asian Languages in One Network (Sea-Lion) is a family of open-source LLMs designed to better understand Southeast Asia’s diverse contexts, languages and cultures. Teo shares that AI Singapore will release a set of small models under Sea-Lion in the next few months, which can be used independently or in an agentic system to provide multilingual and multicultural perspectives.

“The open nature of our Sea-Lion models means they should be accessible. However, organisations need more than just a model. So, we are also building services and tools such as application programming interfaces (APIs) to reduce the technical barriers to using Sea-Lion models. We’re also actively working with industry partners to identify and develop real-world applications across various sectors, demonstrating their practical value,” he says.

He adds that driving widespread adoption of Gen AI at the national level requires affordable and scalable compute resources, open datasets that meet standardised quality benchmarks, and a robust AI regulatory framework with clear governance and ethical guidelines. Fostering research and innovation, and upskilling and retraining the workforce, are also essential.

Meanwhile, organisations looking to fully harness generative AI — including SLMs and AI agents — must design and optimise their IT infrastructure to support AI effectively. “Typically, only 10% of an AI system’s code is the actual AI model; the rest comprises supporting infrastructure and applications. This makes the resilience and robustness of the underlying digital infrastructure crucial for AI success,” says NCS’s Ying.

A recent global survey by NCS and IDC found a strong correlation between AI adoption and digital resilience. Companies with the highest levels of both achieved 1.3 times more revenue and 1.4 times more cost savings than others, highlighting the need for a strong digital foundation before embarking on AI adoption.

Cybersecurity should be another priority, especially for edge devices using SLMs. While SLMs can enhance data security and privacy by operating on-device and keeping the data local, there is no guarantee. Ying says those models still require cloud connections for updates and may be part of hybrid systems that share data externally.

He adds that SLMs are not immune to adversarial attacks and can be vulnerable to data poisoning or manipulation if they are not well-secured. As SLM-powered edge devices become more widespread, enterprises must also address the increasingly complex challenge of managing security across numerous distributed edge devices.

According to IBM’s Tan, there is a risk of malfunction when organisations implement multiple AI agents to execute specific complex tasks. “Multi-agent systems built on the same foundation models may experience shared pitfalls. Such weaknesses could cause a system-wide failure of all involved agents or expose vulnerability to adversarial attacks. This highlights the importance of data governance in building foundation models and thorough training and testing processes.”

Data and AI governance are also key to addressing the ethical concerns of SLMs and AI agents. “SLMs still carry risks of bias, misinformation, and ethical dilemmas, yet they are not as good as LLMs in detecting and correcting for these. Their limited capacity may even inadvertently amplify biases present in their training data. With AI agents, their capacity for autonomous decision-making introduces potential risks, from unintended actions to biases in the agent’s learning process. As organisations consider SLMs and agentic AI, robust governance frameworks will be essential to prevent undesirable outcomes,” says Ying.

With DeepSeek open-sourcing R1, we can expect the emergence of new AI models that challenge R1’s efficiency advantages. Forrester’s Joseph advises enterprises to regularly evaluate new AI models against cost, performance, and task suitability to avoid missing out on better alternatives. This requires setting up an LLM evaluation pipeline that tracks inference efficiency, accuracy, and overall ROI across multiple models.
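Such a pipeline can start very simply: score each candidate model on accuracy, cost and latency, then rank. The model names, figures and weights below are invented placeholders, not benchmark results, and a real pipeline would measure these on the enterprise's own evaluation set:

```python
# Toy model-evaluation pipeline: combine accuracy, cost and latency into a
# single score and rank candidates. All numbers are made-up placeholders.

candidates = {
    # model: (accuracy 0-1, USD per million tokens, latency in seconds)
    "model_a": (0.91, 15.0, 1.2),
    "model_b": (0.88, 0.6, 0.9),
    "model_c": (0.80, 0.3, 0.5),
}

def score(accuracy, cost, latency, w_acc=1.0, w_cost=0.02, w_lat=0.1):
    """Higher is better: reward accuracy, penalise cost and latency."""
    return w_acc * accuracy - w_cost * cost - w_lat * latency

ranking = sorted(candidates, key=lambda m: score(*candidates[m]), reverse=True)
# model_b ranks first: its low cost and latency outweigh model_a's accuracy edge.
```

The weights encode the business trade-off, which is the real decision: an accuracy-critical workload would raise `w_acc` and could flip the ranking back toward the most accurate model.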

“Net-net, AI adoption is no longer about choosing the best model — it is about strategically integrating multiple models for optimal performance and cost-efficiency. Companies that master this approach will gain a significant competitive advantage as AI continues its rapid evolution,” he says.

© 2025 The Edge Publishing Pte Ltd. All rights reserved.