
DeepSeek power, but at lower cost? India’s new AI model prompts scrutiny, scepticism

Source: CNA | Technology


Homegrown firm Sarvam AI claims its new 105-billion-parameter model, launched at a landmark event in India recently, can rival China’s DeepSeek at a fraction of the computational cost. CNA tests how it stacks up against global peers.

SINGAPORE: In recent years, India’s artificial intelligence (AI) ambitions have been accompanied by a persistent perception: that while the country excels at building AI-powered software, it has yet to produce a frontier foundation model capable of standing shoulder to shoulder with global leaders such as OpenAI, Google, Anthropic, DeepSeek or Alibaba’s Qwen.

These perceptions surfaced again at the recent India AI Impact Summit in February, when India’s homegrown firm Sarvam AI unveiled a 105-billion-parameter (105B) model alongside a 30B model, marking the company’s most ambitious effort yet.

Four other Indian firms unveiled their own large language models (LLMs) at the event: Gnani.ai with two speech-focused foundation models, BharatGen with its 17-billion-parameter Param 2, Tech Mahindra with an 8-billion-parameter Hindi education model, and Fractal Analytics with a healthcare-focused AI system. However, these were either smaller in scale or tailored to specific use cases.

Sarvam AI said both of its models were built from scratch, unlike its earlier Sarvam-M model launched in May 2025, which was built on top of Mistral’s Small model.

At the summit, co-founder and CEO Pratyush Kumar presented slides showing the 105B model performing broadly on par with - and in some cases surpassing - leading open-weight models such as OpenAI’s GPT-OSS 120B, Qwen3 Next 80B and Zhipu AI’s GLM 4.5 Air.

An open-weight model is one whose trained parameters and code are publicly released, allowing developers to download, test, and fine-tune it independently.
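In practice, an open-weight release can be downloaded and run with standard tooling such as Hugging Face’s transformers library. A minimal sketch of what that looks like, using OpenAI’s gpt-oss-120b repository name as an assumed example of an open-weight checkpoint:

```python
# Illustrative only: loading an open-weight model for local testing.
# "openai/gpt-oss-120b" is used here as an example repository id; any
# open-weight checkpoint can be loaded the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-120b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```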

In AI systems, parameters refer to the internal values a model learns during training that shape how it interprets inputs and generates responses. Generally, more parameters increase a model’s capacity to handle complex tasks - though design and training also play key roles.
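As a rough illustration of where such counts come from, the short Python sketch below tallies the parameters in a single dense layer; the sizes are arbitrary and not drawn from any model discussed in this article.

```python
# Illustrative only: counting the learned parameters in one dense layer.
# The sizes here are arbitrary; real LLMs stack hundreds of such layers.
d_in, d_out = 4096, 4096

weights = d_in * d_out  # one learned value per input-output connection
biases = d_out          # one learned offset per output
print(f"{weights + biases:,} parameters in a single layer")  # 16,781,312
```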

Sarvam’s 105B and 30B models, however, are not open-weight. Neither their weights nor their code has been made public, meaning external developers cannot independently test or verify the company’s performance claims.

One of Sarvam’s most striking claims is that its 105B model uses only about 9 billion “active parameters” at a time. Although the model contains 105 billion parameters in total, it activates only a small fraction of them for each token of a query it processes.

The company said this selective activation allows it to operate at significantly lower computational cost - requiring less processing power, memory and energy - than some leading global models.

By comparison, DeepSeek’s R1 and V3 models activate around 37 billion parameters per token, implying substantially higher compute requirements. OpenAI’s GPT-OSS 120B, however, activates only about 5.1 billion parameters per token.
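A common back-of-the-envelope estimate puts forward-pass compute at roughly two floating-point operations per active parameter per generated token. Under that assumption - a simplification, not a figure from any of these companies - the sketch below compares the per-token compute implied by the active-parameter counts above.

```python
# Back-of-the-envelope: assume roughly 2 FLOPs per active parameter per
# generated token (a common simplification, not a vendor figure).
active_params = {
    "Sarvam 105B": 9e9,
    "DeepSeek R1/V3": 37e9,
    "GPT-OSS 120B": 5.1e9,
}

for name, n in active_params.items():
    gflops = 2 * n / 1e9  # GFLOPs per token under this approximation
    print(f"{name}: ~{gflops:.0f} GFLOPs per token")
```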

“I’m impressed but also cautious,” said Abhishek Chatterjee, CEO of Karmaloop AI - a firm that builds AI automation tools.

“If it is really achieving this level of reasoning with just 9 billion active parameters, that’s serious innovation. But we need the model weights to verify that.”

However, an AFP report suggested that India was unlikely to experience its own “DeepSeek moment” - the kind of surge China saw last year with the launch of a high-performance, low-cost chatbot - any time soon.

CNA tested Sarvam’s 105B model and spoke with analysts and engineers.

The broader question remains: Has India delivered its long-awaited breakthrough, or is its leading homegrown model still playing catch-up in the global AI race?

INDIA’S 105B MODEL: GLOBAL AI BREAKTHROUGH OR BOLD CLAIM?

Building an LLM of this scale from scratch is a significant undertaking. The achievement is particularly notable in India, which until recently lacked broad access to large clusters of graphics processing units (GPUs) - the specialised chips that power advanced AI systems.

Chatterjee said Sarvam executives at the AI summit acknowledged they “couldn’t have done this a year ago” due to compute constraints, adding that government-backed GPU access under the IndiaAI Mission made the breakthrough possible.

In May 2025, India’s Ministry of Electronics and Information Technology (MeitY) granted Sarvam AI access to 4,096 high-end Nvidia H100 chips.

With that infrastructure in place, the company’s main claim centres on efficiency - just 9 billion active parameters per token. Experts told CNA that, if validated, this could mark a significant global step forward in reducing the cost of running AI queries.

Chatterjee suggested the model might even run locally on a high-end MacBook Pro with 128GB unified memory - potentially enabling developers to build applications on local devices without relying on costly cloud GPUs.

However, Partha Rao, co-founder and CEO of AI firm Pints.AI, urged caution. He said that for high accuracy, the model would need about 420GB of memory to run - meaning only very high-end machines, such as a 512GB Mac Studio, could handle it without modification.
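Rao’s figure is consistent with storing all 105 billion parameters at full 32-bit precision; lower-precision formats shrink the footprint, typically trading off some accuracy. A minimal sketch of that arithmetic, assuming standard numeric formats:

```python
# Rough memory needed just to hold a 105B-parameter model's weights.
# Assumes standard numeric formats; ignores activations and other overhead.
n_params = 105e9

for fmt, bytes_per_param in [("FP32", 4), ("FP16/BF16", 2), ("INT8", 1), ("4-bit", 0.5)]:
    gb = n_params * bytes_per_param / 1e9
    print(f"{fmt}: ~{gb:.0f} GB")
# FP32 works out to ~420 GB, matching the estimate quoted above.
```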

That said, experts told CNA that Sarvam has not disclosed any fundamentally new architecture.

“I do not see any technical innovation there,” said Amit Verma, Head of Engineering and AI at Neuron7.ai, who evaluated the model via its application programming interface.

“They are using Mixture of Experts. That’s about it.”

Mixture of Experts (MoE) - an AI design architecture popularised by DeepSeek’s R1 model last year - splits a model into multiple specialised sub-networks, or “experts”. A routing mechanism activates only the most relevant experts for each input, which reduces compute costs.

Think of an MoE model as a team of specialists - a doctor, lawyer or engineer - where only the most relevant expert responds to a question while the others remain idle.
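A minimal sketch of the routing idea in Python is shown below. It is illustrative only: the expert count, routing rule and layer sizes are assumptions, not details of Sarvam’s or DeepSeek’s actual architectures, and real MoE layers weight the chosen experts rather than simply averaging them.

```python
import numpy as np

# Minimal Mixture-of-Experts sketch. A router scores each expert for the
# incoming token and only the top-scoring experts run; the rest stay idle,
# so only a fraction of the total parameters are "active" per token.
rng = np.random.default_rng(0)

d_model, n_experts, top_k = 16, 8, 2
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    scores = x @ router                   # one relevance score per expert
    chosen = np.argsort(scores)[-top_k:]  # keep only the top-k experts
    return sum(x @ experts[i] for i in chosen) / top_k  # simple average

x = rng.standard_normal(d_model)
print(moe_forward(x))  # output computed by just 2 of the 8 experts
```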

With the model weights yet to be released, Sarvam’s performance and efficiency claims remain, for now, impossible for outside developers to verify independently.
