AI Singapore has launched a new large language model (LLM), Qwen-SEA-LION-v4, with support from Alibaba Cloud, to better serve the linguistic, cultural and commercial needs of Southeast Asia. The model is designed to run even on a consumer laptop with 32GB of RAM while delivering stronger multilingual accuracy and cultural understanding.
Alibaba’s Qwen3-32B foundation model was trained on more than 100 billion words and phrases from Southeast Asian languages, drawn from a dataset spanning 119 languages and dialects. Through this training, the model learns to interpret local expressions, conversational styles and cultural references that global AI models typically miss.
Moreover, the Qwen team increased the amount of translation and cross-lingual training tasks during post-training. This helps the model handle everyday multilingual scenarios common across the region, including code-switching, informal chat, and mixed English-local language usage.
“Our collaboration with Alibaba on Qwen-SEA-LION-v4 is an important milestone in advancing AI inclusivity and making it more representative of Southeast Asia. It embodies our shared vision of accelerating AI innovation across the region and ensuring that developers, enterprises, and public institutions have access to AI that is open, affordable, and locally relevant, and is designed to truly understand the languages, cultures, and communities of this region,” says Dr Leslie Teo, senior director of AI Products at AI Singapore.
Qwen-SEA-LION-v4 is part of SEA-LION (Southeast Asian Languages in One Network), a family of large language models that AI Singapore built to reflect the region’s cultural contexts and linguistic nuances.
Under the collaboration, Alibaba provided the Qwen3-32B foundation model and technical support for advanced post-training while AI Singapore provided region-specific open-source data, optimisation and evaluation across Southeast Asian language tasks.
To boost accuracy across languages, Qwen-SEA-LION-v4 uses byte-pair encoding, which breaks text into smaller, more predictable pieces to enable more efficient and accurate multilingual text processing. Training has also been expanded with additional datasets covering Burmese, Filipino, Indonesian, Malay, Tamil, Thai and Vietnamese, further strengthening cultural fluency and contextual understanding.
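Byte-pair encoding is a general technique rather than anything specific to this model: it repeatedly merges the most frequent adjacent symbol pair in a corpus, so common word parts become single tokens. The following is a minimal sketch of that merge procedure; the toy Malay/Indonesian corpus and the helper names are illustrative, not AI Singapore's actual tokenizer.

```python
from collections import Counter

def train_bpe(words, num_merges):
    # Start with each word as a sequence of characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the corpus.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Merge the most frequent pair into one symbol everywhere.
        merged = {}
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] = freq
        vocab = merged
    return merges

def tokenize(word, merges):
    # Apply the learned merges, in order, to a new word.
    symbols = list(word)
    for a, b in merges:
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == (a, b):
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols

corpus = ["makan", "makanan", "minum", "minuman"]  # toy corpus
merges = train_bpe(corpus, num_merges=6)
print(tokenize("makanan", merges))
```

Because frequent subwords like the "-an" suffix get merged into single tokens, the same vocabulary budget covers many languages more evenly than a fixed word list would.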
With a native 32k-token context length, Qwen-SEA-LION-v4 can now handle complex interactions such as document-level reasoning and summarisation. It is available in 4-bit and 8-bit quantised versions, making it easier and more cost-effective for developers and enterprises to deploy on local infrastructure without significant performance trade-offs.
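The memory savings behind those quantised releases come from storing each weight in 8 (or 4) bits plus a shared scale factor instead of a 32-bit float. A minimal sketch of 8-bit absmax quantisation, using NumPy and illustrative helper names (this is the general idea, not the specific scheme used for Qwen-SEA-LION-v4):

```python
import numpy as np

def quantize_int8(w):
    # Absmax quantisation: map floats onto [-127, 127] integers
    # with a single shared scale per tensor.
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original weights.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage uses a quarter of the float32 memory for the same tensor,
# at the cost of a small rounding error bounded by scale / 2.
print(q.nbytes, w.nbytes, float(np.abs(w - w_hat).max()))
```

Scaled to a 32B-parameter model, the same arithmetic is what brings the 8-bit version within reach of a 32GB machine, with the 4-bit version halving memory again for a somewhat larger rounding error.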
“By combining our model's multilingual and reasoning strengths with AI Singapore's deep regional expertise, Qwen-SEA-LION-v4 demonstrates how open collaboration can make advanced AI more inclusive and locally relevant. We look forward to enabling more developers, enterprises and public-sector partners to build applications that truly understand the languages and cultures of this region,” says Hon Keat Choong, General Manager of Singapore at Alibaba Cloud Intelligence.
Qwen-SEA-LION-v4 currently ranks first among open-source language models under 200 billion parameters on Southeast Asian language benchmarks. It is available for free download on AI Singapore’s website and Hugging Face.
