Amid the excitement surrounding agentic AI’s transformative potential, several publications have dubbed 2025 the year of agentic AI. The reality, however, is that the technology remains in its infancy. Despite the blistering pace of innovation, fully autonomous AI agents are, in many respects, still an aspirational goal — held back by significant technical, ethical and regulatory challenges. Take AutoGPT, for instance — the first widely recognised autonomous AI agent, released in 2023. As an agentic counterpart to ChatGPT (both are powered by OpenAI’s GPT models), AutoGPT has faced persistent criticism for slow processing speeds and glaring inaccuracies, severely limiting its real-world utility. As a result, many users still prefer the reliability and responsiveness of ChatGPT. But fixating on what qualifies as truly “agentic” quickly becomes an exercise in unmitigated pedantry. A more useful lens is to view agentic functionality as existing on a spectrum. Within this framework, the current trend of agentic transformation is less about achieving full autonomy and more about integrating agentic capabilities into existing GenAI workflows — enhancing these processes with improved decision-making, multi-tasking and automation features (see Diagram 1).
Recent entrants in this space include Salesforce’s Agentforce platform, Zendesk’s Resolution Platform and Deloitte’s Zora AI, all of which leverage hybrid generative-agentic AI technologies and multi-agent systems (MAS) to deliver a range of AI-driven automation solutions (see Diagram 2 for an example of a multi-agent system).
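For readers curious about what a multi-agent set-up looks like in practice, the toy Python sketch below shows the basic pattern: a planner that breaks a request into sub-tasks and hands each to a specialised worker agent. It is purely illustrative; the agent names and stubbed functions are our own placeholders, not how Agentforce, Zendesk or Zora AI are actually built.

```python
# Toy multi-agent workflow (illustrative only): a planner delegates
# sub-tasks to specialised worker agents instead of relying on one
# monolithic prompt. The worker functions are stand-ins for LLM/tool calls.

def research_agent(task: str) -> str:
    # Would normally query a knowledge base or call a search tool.
    return f"[research notes for: {task}]"

def drafting_agent(task: str, context: str) -> str:
    # Would normally call a language model to compose a reply.
    return f"[draft reply to '{task}' using {context}]"

def planner(request: str) -> str:
    # Step 1: gather information; Step 2: act on it.
    notes = research_agent(request)
    return drafting_agent(request, notes)

print(planner("Customer asks why their invoice doubled this month"))
```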
This trend aligns with broader investment patterns in the AI inferencing market, where capex is increasingly directed towards enterprise-focused co-pilots and domain-specific chatbots. Here, customer service stands out as the dominant use case, owing to its straightforward application and broad relevance across industries.
That said, the landscape is by no means confined to narrow enterprise functions. The high-profile launch in March 2025 of Manus AI, from Chinese start-up Monica — a general-purpose assistant with mass consumer appeal — highlighted the growing potential of agentic products in everyday consumer markets.
THE WINNERS
A. The AI-powered workforce: Users
When it comes to technologies that promise efficiency, cost savings and productivity gains, the advantages, as always, accrue to the first movers. Broadly, these players fall into two camps: those who adopt the technology, and those who develop it. At present, sectors such as media and telecommunications, financial and professional services, and scientific research and development rank in the top quartile of AI adoption and usage. This pattern is unlikely to change with the rise of agentic AI, given that businesses with existing tech infrastructure are better positioned to deploy agentic systems more quickly — allowing them to capture the earliest and most substantial gains — while lagging industries continue playing catch-up on core AI advancements. What is clear is that the industry-wide pivot towards agentic workflows marks a significant evolution in the role of automation at work: from personal productivity tools to deeply integrated components of core business systems and processes.
The emergence of Service-as-a-Software 2.0, or SaaS 2.0 (notice the twist on the familiar terminology: service comes first, not software), presents further opportunities for cost savings by enabling businesses to outsource not just software, but entire processes. AI-powered systems autonomously handle tasks, transforming software from a tool for workers into a worker itself. Where traditional SaaS once disrupted licensing and ownership models in the early 2000s with subscription-based services, SaaS 2.0 is predicated on an outcome-based business model. Under this approach, costs are directly tied to specific business outcomes. This alignment of value and cost supports more accurate budgeting and allows businesses to maintain operational efficiency with leaner teams. Accordingly, SaaS 2.0 has the potential to displace not only traditional SaaS subscriptions but also conventional business process outsourcing (BPO) models by offering a more cost-effective and flexible alternative. Though outcome-based pricing is not yet the industry norm, tech giants like Amazon and Microsoft have already begun experimenting with flexible, usage-based pricing models in certain AI-related services, moving closer to an outcome-based structure.
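To make the pricing shift concrete, here is a stylised comparison of the two models. Every figure is a made-up assumption for illustration, not any vendor's actual pricing.

```python
# Illustrative comparison of a flat per-seat SaaS subscription versus an
# outcome-based model where the vendor is paid per resolved ticket.
# All numbers are hypothetical assumptions.

seats, per_seat_per_month = 50, 80                      # traditional SaaS
tickets_resolved, price_per_resolution = 3_000, 1.20    # SaaS 2.0-style

subscription_cost = seats * per_seat_per_month          # cost tied to headcount
outcome_cost = tickets_resolved * price_per_resolution  # cost tied to outcomes

print(f"Subscription:  ${subscription_cost:,}/month")   # $4,000/month
print(f"Outcome-based: ${outcome_cost:,.0f}/month")      # $3,600/month
```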
Separately, cybersecurity has always been a key concern in any discussion surrounding AI — and this is only expected to compound as agentic AI gains traction. In 2025 alone, global cybercrime is projected to cost a staggering US$10.5 trillion. Mainstream adoption of GenAI and agentic AI will likely accelerate cloud adoption, as companies increasingly opt to process workloads in the cloud rather than on-premise. In this context, cloud-native security providers, such as CrowdStrike and SentinelOne, along with platforms offering integrated cloud and data security ecosystems — like Microsoft with its Defender and Sentinel products — are well-positioned to lead the sector.
Further, as agentic AI drives a proliferation of endpoints, autonomous agents and digital identities, investment in identity and access management (IAM) and zero-trust architecture solutions will likely become more critical. Prominent players include Okta, CyberArk and Zscaler. For cybersecurity vendors themselves, staying ahead as malicious actors adopt increasingly sophisticated AI-driven tactics will require products that embed AI into their defensive strategies. Case in point: CrowdStrike launched the industry’s first AI-powered Indicator of Attack (IoA) capability on its flagship Falcon platform, using AI and machine learning for real-time behavioural analysis and automated threat detection.
B. Builders of the future: Infrastructure
GenAI and agentic AI can otherwise be understood as a means for monetising foundation models — a way to generate returns on the capital spent developing and training the AI models and building the underlying infrastructure. That is, rising adoption rates in these technologies naturally benefit the usual tech heavyweights — the large language model (LLM) developers themselves, such as OpenAI, Anthropic and Meta, alongside hyperscalers (that is, large-scale cloud providers) — the “big three”, which comprise Amazon, Microsoft and Google. Some players like Google and Meta operate across both domains, building and serving AI models as well as the supporting cloud infrastructure.
Within this space, two developments are worth highlighting: DeepSeek’s splashy January launch, featuring a vastly more efficient training model; and more recently, Microsoft’s decision to scale back on existing data centre lease agreements. Both raise concerns of potential oversupply in data centres. But to clarify two points: first, there is more to data centres than just racks of servers, and more to running an AI model than just Nvidia graphics processing units (GPUs) and TSMC chips. For any given AI system, its performance can be said to be a function of four interdependent elements: chips define computational power, storage determines speed, network enables scale, and power determines the overall feasibility of running the system. Sweeping claims about data centre oversupply — which have yet to be broadly corroborated — lack meaningful context and verge on fearmongering without clarifying which components are being overbuilt, if any.
This leads to the second point: It is important to distinguish between training and inferencing workloads, as they place different demands on infrastructure. Currently, the bulk of data-centre capex caters to training demand. GenAI and agentic AI primarily drive inferencing workloads — where trained models are deployed to make real-time decisions on new data. For inferencing, raw computing power takes a backseat to low latency and computing efficiency — in other words, fast response times and low power consumption. This is because deployed models must serve enormous volumes of concurrent requests, many of which are time-sensitive, such as those needed for autonomous driving and augmented reality/virtual reality (AR/VR) applications. To give a sense of the potential differences in hardware requirements, consider the following two areas: chips and memory.
When it comes to chips, Nvidia remains the dominant supplier for both training and inferencing. However, application-specific integrated circuit (ASIC) chips represent a fast-growing segment expected to surpass Nvidia in the inferencing market over the longer term. Unlike general-purpose GPUs, ASICs can be custom-built for specific inference workloads, making them more efficient in terms of performance per watt and latency. Google currently leads this space with its tensor-processing unit (TPU) series, followed by Amazon with its Inferentia2 chip. Broadcom is also a major player, leading in third-party ASIC development. Beyond ASICs, other AI-specialised accelerators in development include neural-processing units (NPUs) and field programmable gate arrays (FPGAs), both of which are designed to support the next wave of efficient, low-powered inferencing needs.
For memory, Samsung, SK Hynix and Micron remain the top producers of data centre memory technologies, including preferred DRAM products for optimised inferencing workloads. That said, Chinese manufacturers such as ChangXin Memory Technologies and Huawei-backed Swaysure have been making aggressive gains in the DRAM market. As it stands, one of the current bottlenecks to low-latency performance lies in memory bandwidth limitations. Inference workloads rely heavily on matrix multiplications, which require large volumes of data to be constantly shuttled back and forth between various hardware components. While this issue is even more pronounced in training workloads, it remains a significant constraint in inferencing, particularly when models are large or require real-time responses. A fundamental mismatch exists, as data retrieval lags compute speeds, because improvements in I/O — the pathways that govern how data enters and exits a chip — have not kept pace with advancements in chip performance. To mitigate this, several workarounds have emerged for inference, including near-memory computing design (where memory and processing units are engineered to be in closer physical proximity, reducing the distance data must travel), and the growing trend of edge computing, which brings processing closer to the data source.
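A rough back-of-the-envelope calculation illustrates why bandwidth, not raw compute, often sets the ceiling on inference speed: in autoregressive decoding, generating each new token requires streaming roughly the entire set of model weights from memory, so tokens per second for a single request cannot exceed bandwidth divided by model size. The numbers below are illustrative assumptions, not benchmarks.

```python
# Sketch of the memory-bandwidth ceiling on single-stream decode speed.
# Peak tokens/second is bounded by (memory bandwidth) / (model weight footprint).

def max_tokens_per_second(params_billions: float, bytes_per_param: int,
                          bandwidth_gb_s: float) -> float:
    model_gb = params_billions * bytes_per_param   # weight footprint in GB
    return bandwidth_gb_s / model_gb

# A 7-billion-parameter model stored in FP16 (2 bytes per weight) is ~14GB:
print(max_tokens_per_second(7, 2, 100))    # ~7 tokens/s on ~100 GB/s (edge-class memory)
print(max_tokens_per_second(7, 2, 3000))   # ~214 tokens/s on ~3 TB/s (HBM-class memory)
```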
With respect to edge computing, AI processing is increasingly being integrated as a core feature in devices such as smartphones, laptops and Internet of Things (IoT) hardware. Today, flagship chipsets for phones and laptops — like Qualcomm’s Snapdragon X Elite — are already capable of handling light inferencing workloads efficiently. This enables parts of the processing to be “offloaded” to local devices or handled entirely by lightweight AI models such as Gemini Nano that run fully on-device. The importance of edge processing is magnified in the era of agentic AI, where keeping sensitive data local not only reduces latency but also mitigates security and privacy risks. Looking ahead, breakthroughs in advanced packaging and memory integration are expected to unlock more complex, multi-modal inferencing on smartphones as early as 2026. Ultimately, this shift towards distributed computing could accelerate refresh cycles for consumer technology, as rapid AI advancements drive demand for higher-performance edge devices. In turn, this will potentially benefit vendors across the mobile and consumer tech supply chain.
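As a hypothetical sketch of how such offloading might be orchestrated, the snippet below keeps short or privacy-sensitive requests on a small on-device model and escalates heavier ones to the cloud. The function names and thresholds are our own assumptions, not any vendor's actual routing logic.

```python
# Hypothetical hybrid edge/cloud router: sensitive or simple requests stay
# local on a lightweight model; complex requests go to a larger cloud model.

def run_on_device(prompt: str) -> str:
    return f"[on-device small-model answer to: {prompt}]"   # stand-in call

def run_in_cloud(prompt: str) -> str:
    return f"[cloud large-model answer to: {prompt}]"        # stand-in call

def route(prompt: str, contains_personal_data: bool) -> str:
    # Keep private data local and avoid network round-trips for short prompts.
    if contains_personal_data or len(prompt.split()) < 30:
        return run_on_device(prompt)
    return run_in_cloud(prompt)

print(route("Summarise my last three text messages", contains_personal_data=True))
```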
WHAT’S NEXT?
When it comes to AI, the question of how we got here is deceptively simple — its answer spans decades of research and breakthroughs, and an even longer history of imaginative speculation. That is why it is hard to avoid a degree of speculation when talking about AI. Science fiction, after all, is also known as speculative fiction. As for science fact, what lies ahead as we explore agentic AI is no less complex, with implications that cut across industries and a promise to reshape how we work and interact with technology. What we are witnessing now is a quiet, yet profound, shift towards a new technological foundation built not just for generating words, but also for getting things done.
To capitalise on the unfolding AI revolution, we are setting up a brand new portfolio, Tong’s AI Portfolio. The concept — to make a decade-long investment bet on the future of AI. As with our other portfolios, this too will be a real portfolio. The Tong’s AI Portfolio will start with US$100,000 and will consist of eight to 10 stocks at any point in time. We envision a mixture of companies, including those involved in AI infrastructure, the service providers and enterprise end users. Some will be mega-cap household names, others lesser known among retail investors but well established within the tech landscape. The portfolio will also include some “moonshot” stocks, which are relatively small and unknown today and have the potential to offer substantial upside, but also carry significant risk of failure. We have shortlisted the stocks in the accompanying table, from which we will decide on the final selection for the AI Portfolio. Read next week’s column for the actual stocks we choose and why.
Box Article: How did we get here?
Additional context for developments that preceded agentic AI:
In the realm of artificial intelligence (AI), we’ve progressed from machine learning to deep learning to generative AI (GenAI) — and now, agentic AI. To understand the field’s rapid advancement in recent years, the breakthrough in transformer architecture is a good place to start. Deep learning, a subset of machine learning, allows computer systems to learn without explicit programming — by using complex algorithms inspired by the human brain’s structure. These systems are trained to identify complex patterns by processing vast amounts of unstructured data. Early deep learning systems, however, were constrained by sequential processing.
The introduction of transformer architecture in 2017 changed this by enabling parallel data processing, which drastically increased speed, scalability and the capacity to train models with billions of parameters while maintaining stability. This was especially transformational for the field of natural language processing (NLP) as language relies heavily on context.
Transformers allowed language models to analyse entire sentences and passages simultaneously, which paved the way for large language models (LLMs). This unlocked models with human-like fluency and broad, universal applicability across various domains.
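For the technically inclined, the stripped-down sketch below shows the self-attention step at the heart of the transformer, in plain NumPy with toy dimensions and random weights standing in for learned ones. Real models add multiple attention heads, masking and many stacked layers, but the key idea is visible here: every token scores itself against every other token in the sequence at once, which is what makes parallel processing of whole passages possible.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Project every token's embedding into queries, keys and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # All-pairs attention scores: each token attends to every other token
    # in the sequence simultaneously, rather than word by word.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                                # context-aware outputs

# Toy example: 4 tokens, embedding size 8, random weights as placeholders.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8): one contextual vector per token
```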
Today, LLMs represent the most commercially viable application of modern deep learning and serve as the foundation for two emerging paradigms: (text-based) GenAI and agentic AI.
— End of Box Article —
The Malaysian Portfolio was up 0.1% for the week ended May 7. Insas Bhd – Warrants C (+14.3%) and Kim Loong Resources (+0.5%) closed higher, while United Plantations (-0.1%) finished the week marginally lower. Total portfolio returns now stand at 186.1% since inception. This portfolio is outperforming the benchmark FBM KLCI, which is down 15.3% over the same period, by a long, long way.
Meanwhile, the Absolute Returns Portfolio ended flat last week. Total returns since inception now stand at 22.2%. Gains from Tencent Holdings (+2.6%), SPDR Gold (+2.4%), JPMorgan Chase (+1.9%) and Goldman Sachs (+0.8%) offset losses from US Steel Corp (-7.3%), Berkshire Hathaway (-2.8%) and CrowdStrike (-1.5%).
Disclaimer: This is a personal portfolio for information purposes only and does not constitute a recommendation or solicitation or expression of views to influence readers to buy/sell stocks, including the particular stocks mentioned herein. It does not take into account an individual investor's particular financial situation, investment objectives, investment horizon, risk profile and/or risk preference. Our shareholders, directors and employees may have positions in or may be materially interested in any of the stocks. We may also have or have had dealings with or may provide or have provided content services to the companies mentioned in the reports.