Loud and clear: Open source for India’s linguistic diversity

© Indian Institute of Science
In a country as linguistically diverse as India, voice technology has the potential to transform how people connect with information, services, and each other. Yet for millions, this promise remains out of reach. The Indian Institute of Science (IISc) in collaboration with the BMZ initiative “FAIR Forward—Artificial Intelligence for All” seeks to change that.
By building high-quality Text-To-Speech (TTS) corpora in nine Indian languages, IISc is laying the foundation for inclusive, accessible voice-based applications that serve people in their own languages and cultural contexts.

IISc’s SYSPIN project focuses on nine languages: Bengali, Bhojpuri, Chhattisgarhi, Hindi, Kannada, Magahi, Maithili, Marathi, and Telugu. These languages have historically lacked the technological resources needed to support modern voice systems based on artificial intelligence (AI). IISc addresses this gap by creating 720 hours of studio-quality audio data: 40 hours each from one male and one female voice artist per language.
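The corpus arithmetic above can be sketched in a few lines. This is an illustrative calculation, not project code; only the figures (nine languages, two voice artists per language, 40 hours per artist) come from the article.

```python
# Illustrative sketch of the SYSPIN corpus breakdown:
# 9 languages x 2 voice artists (one male, one female) x 40 hours each.
LANGUAGES = [
    "Bengali", "Bhojpuri", "Chhattisgarhi", "Hindi", "Kannada",
    "Magahi", "Maithili", "Marathi", "Telugu",
]
SPEAKERS_PER_LANGUAGE = 2   # one male and one female voice artist
HOURS_PER_SPEAKER = 40      # studio-quality recordings per artist

hours_per_language = SPEAKERS_PER_LANGUAGE * HOURS_PER_SPEAKER
total_hours = len(LANGUAGES) * hours_per_language

print(f"{hours_per_language} hours per language, {total_hours} hours in total")
# 80 hours per language, 720 hours in total
```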

But this project isn’t just about quantity. What sets it apart is its attention to quality, phonetic diversity, and human-centric validation at every step.

Our goal was to build a foundation that is not only scientifically robust, but also socially meaningful. We worked hard to ensure that the voices in our datasets truly reflect the people and regions they represent.

Prof. Prasanta Ghosh, project lead at IISc

The project began by selecting native speakers to compose and validate sentences from everyday topics, ensuring both linguistic authenticity and phonetic diversity. Carefully chosen voice artists then recorded these texts in professional studios, resulting in high-quality audio data that was thoroughly checked for accuracy and consistency.
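One way to picture the phonetic-diversity check mentioned above is as a coverage test: do the candidate sentences, taken together, exercise the language’s full sound inventory? The sketch below is a simplified illustration, not the project’s actual tooling. Real pipelines work on phonemes via a grapheme-to-phoneme converter; here plain characters stand in, and the inventory and sentences are hypothetical toy data.

```python
# Toy coverage check: which units of a sound inventory are still missing
# from a candidate sentence set, and how often is each covered unit used?
from collections import Counter

def coverage_report(sentences, inventory):
    """Return (units still missing, frequency of each covered unit)."""
    counts = Counter(ch for s in sentences for ch in s if ch in inventory)
    missing = set(inventory) - set(counts)
    return missing, counts

# Hypothetical data: Latin letters stand in for a phoneme inventory.
inventory = set("abcdeghiklmnoprstuvy")
sentences = ["the monsoon arrives early", "markets open at dawn"]

missing, counts = coverage_report(sentences, inventory)
print("missing units:", sorted(missing))
# missing units: ['b', 'c', 'g', 'u']
```

In practice, sentence authors would keep composing and swapping in text until the missing set is empty and rare units appear often enough to train on.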

The usefulness of the datasets and models

© SPIRE Lab


Watch on YouTube

These types of digital public goods for voice technologies unlock a wide range of applications—from voice assistants and screen readers to language learning tools, automated helplines, and accessible public service delivery. They empower users in rural and low-literacy communities to interact with digital systems in their native languages, bridging the digital divide and fostering more inclusive participation in education, healthcare, governance, and beyond.

© Bhashini AI Solutions

SYSPIN’s mission aligns closely with India’s national effort to make language technology accessible to all. “Open, inclusive voice tech in Indian languages is not a luxury—it’s a necessity,” says Mr. Nag, head of India’s Bhashini mission. By contributing high-quality datasets and tools to the open-source ecosystem, SYSPIN supports broader initiatives to democratize AI and foster digital inclusion across the country.

Implemented in partnership with German Development Cooperation, SYSPIN is a powerful example of global cooperation on digital public goods. It shows what’s possible when cutting-edge research, community engagement, and policy support come together with a shared vision.

In parallel with SYSPIN, a sister project called RESPIN developed datasets for Automatic Speech Recognition (ASR) in Indian languages. Supported by the Gates Foundation, RESPIN aimed to create high-quality speech data to power inclusive voice-driven applications across diverse linguistic communities. Together, SYSPIN and RESPIN form a complementary effort to advance voice technology for Indian languages.

How datasets and models were created

© SPIRE Lab


Watch on YouTube