The story so far: At the AI Impact Summit, the Bengaluru-based startup Sarvam AI released two Large Language Models (LLMs), the kind of foundation model that powers AI systems such as Google’s Gemini and OpenAI’s ChatGPT. The two models have 35 billion and 105 billion parameters respectively, and were less power- and compute-intensive to train than comparable models while demonstrating improvements over other models in Indian languages, Sarvam co-founder Pratyush Kumar said.

How are LLMs trained?

LLMs are trained and operated on clusters of Graphics Processing Units (GPUs). The combined cost of the GPUs and the electricity needed to run them long enough to train a model runs into millions of dollars. The grist for this mill is data, largely scraped from the Internet, where English, European languages and East Asian languages like Korean and Japanese are more richly represented than Indian languages.

This creates a twofold challenge for training an LLM on Indian soil with Indian capital. First, with scarce data sources, many LLMs either perform worse in Indian languages, or burn more “tokens” on inference by translating prompts into English (and translating responses back) to perform better; because machine translation has improved dramatically for Indian languages, this remains the gold standard for many LLMs. Secondly, since capital is also scarce, efforts by Indian firms to train an LLM for Indian users can be challenging, especially if there is no immediate business use case for doing so.

Relying on translation is a particular problem for developers who want to build on local LLMs, like Sarvam’s 35 billion parameter model, which was demonstrated running on a feature phone during the summit’s research symposium, since suboptimal performance in Indian languages can hurt both adoption and quality.

Has there been government support?
The IndiaAI Mission has subsidised efforts to conduct training in India by commissioning over 36,000 GPUs in data centres operated by Indian firms like Yotta, and allowing researchers and startups to run training and inference workloads for a relatively nominal fee. The government gave Sarvam access to 4,096 GPUs from its common compute cluster, and the subsidy so far is estimated at almost ₹100 crore. The “bill of materials” for this cluster is ₹246 crore, though the GPUs will likely remain available for others to use.

The Ministry of Electronics and Information Technology has encouraged domestic LLM development for several reasons. The main one is a belief that foreign developers of LLMs lack either the capabilities or the business case to make their models work well with Indian languages. Additionally, nurturing talent that can train LLMs is seen as important to fostering the Indian AI ecosystem. As such, Sarvam’s announcement of its two models is a significant step in India’s quest to develop a powerful and relatively inexpensive LLM. When China’s DeepSeek developed its R1 LLM, the entire AI industry quickly adopted its techniques, which cut training and inference costs without compromising output quality. The government has sought to spark a similar cost advantage.

Mr. Kumar of Sarvam said that the LLM was trained “from scratch,” and that the model would be made open source. However, while it has been made available on an app named Indus, it is not yet available on platforms like Hugging Face, making it difficult for outside experts to scrutinise the firm’s claims.

What is the MoE architecture?

A key breakthrough for AI models seeking to function locally was the Mixture of Experts (MoE) architecture. When the first LLMs with hundreds of billions, or even over a trillion, parameters were launched, inference typically involved “activating” all parameters, making queries expensive.
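The alternative is a gating layer: a small router scores a pool of expert networks and only the few best-scoring experts actually run for a given input. The sketch below is a minimal toy illustration of top-k gating; the expert count, dimensions, and use of plain linear maps are assumptions for clarity, not details of Sarvam’s (or any real) model.

```python
import numpy as np

def moe_forward(x, experts, gate_w, top_k=2):
    """Run input x through only the top_k highest-scoring experts,
    combining their outputs with softmax-normalised gate weights."""
    scores = x @ gate_w                   # one gating score per expert
    chosen = np.argsort(scores)[-top_k:]  # indices of the top_k experts
    weights = np.exp(scores[chosen])
    weights /= weights.sum()              # softmax over the chosen few
    return sum(w * experts[i](x) for w, i in zip(weights, chosen))

rng = np.random.default_rng(0)
dim, n_experts = 16, 8
# Toy "experts": independent linear maps; real MoE layers use small MLPs.
mats = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]
experts = [lambda x, W=W: x @ W for W in mats]
gate_w = rng.normal(size=(dim, n_experts))

y = moe_forward(rng.normal(size=dim), experts, gate_w)
# Only 2 of the 8 experts did any work for this input.
print(y.shape)
```

Because only `top_k` of the expert blocks execute per query, compute scales with the active fraction rather than the full parameter count, which is why a sparse 105-billion-parameter model can serve answers far more cheaply than a dense model of the same size.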
But an MoE model activates only a fraction of its overall parameters for any given query, making it faster and far less demanding of computing resources. Even 105 billion parameters, Sarvam acknowledges, “is significantly smaller than the frontier models powering global consumer chat applications today,” and the firm says it is “intentionally focused on accuracy, usefulness, efficiency, and alignment for the Indian context before training bigger foundational models”. As such, its answers are not as in-depth as responses from paid versions of Gemini or ChatGPT. That part will come later, Sarvam says, when it has the investments needed for a larger training run.

Another LLM developed and trained on the common compute cluster came from BharatGen, the IIT Bombay-incubated firm, which trained a “multilingual” 17 billion parameter model. That model, the firm says, is meant for sectors like education and healthcare. Gnani.ai, another firm, launched a small text-to-speech model.

Published – February 26, 2026 08:30 am IST