IIT-BHU Innovator Unveils Luna AI, World’s First Speech-to-Speech Model That Feels and Sings

While Big Tech pours billions into mechanical voice bots, a 25-year-old founder from Jaipur has shown the world another path. Sparsh Agrawal, founder of Pixa AI, has unveiled Luna AI, the world’s first speech-to-speech foundational model that doesn’t just talk but can also sing, whisper, pause, and respond with emotional intelligence in real time.

Unlike traditional systems that convert speech to text and back, Luna AI directly processes audio and generates human-like speech, removing conversion latency and enabling faster, more expressive, emotionally intelligent conversations.

This zero-to-one architectural breakthrough allows Luna to whisper, pause, sing, and respond in context, creating a level of emotional nuance and responsiveness that feels human. Benchmark evaluations show up to 50% lower latency and improved emotional fidelity compared to text-mediated voice models.
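
The latency argument for skipping the text round trip can be illustrated with a small sketch. It assumes a cascaded voice system runs its stages back to back, so stage latencies add up, while a direct speech-to-speech model is a single stage. The stage names and every millisecond value below are hypothetical placeholders chosen to show the arithmetic; they are not measurements of Luna or of any other system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: float  # placeholder value, not a measurement


def total_latency_ms(stages: list[Stage]) -> float:
    """Stages run one after another, so their latencies add up."""
    return sum(stage.latency_ms for stage in stages)


# Hypothetical text-mediated pipeline: speech -> text -> text -> speech.
cascaded = [
    Stage("speech-to-text", 150.0),
    Stage("text model", 250.0),
    Stage("text-to-speech", 200.0),
]

# Hypothetical direct pipeline: audio in, audio out, no text hop.
direct = [Stage("speech-to-speech model", 400.0)]

# Fractional latency reduction of the direct path relative to the cascade.
reduction = 1 - total_latency_ms(direct) / total_latency_ms(cascaded)

print(f"cascaded: {total_latency_ms(cascaded):.0f} ms")
print(f"direct:   {total_latency_ms(direct):.0f} ms")
print(f"reduction: {reduction:.0%}")
```

The point of the sketch is structural: removing intermediate conversions removes terms from the sum, so any real saving depends on how fast the single direct stage is compared with the stages it replaces.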

Internal evaluations show Luna outperforming leading real-time systems, including OpenAI’s, on key metrics of accuracy and speech naturalness:

  • ASR (Automatic Speech Recognition) error rate, lower is better: Luna 5.24%, ahead of Deepgram Nova (8.38%) and ElevenLabs Scribe (5.81%)

  • TTS WER (Text-to-Speech Word Error Rate), lower is better: Luna 1.3%, ahead of Sesame (2.9%) and GPT-4o TTS (3.2%)

  • MOS (Mean Opinion Score) for naturalness, higher is better: Luna 4.62, ahead of GPT-real-time (4.15)
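
For readers unfamiliar with these metrics, here is a minimal Python sketch of how a word error rate and a mean opinion score are generally computed. It is not Pixa AI's evaluation code, and the article does not describe the test sets or scoring setup; the function names and sample sentences are illustrative assumptions. For text-to-speech, WER is commonly obtained by transcribing the synthesized audio with a speech recognizer and comparing the transcript with the input text.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def mean_opinion_score(ratings: list[int]) -> float:
    """Average of listener ratings, conventionally on a 1 to 5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(ratings) / len(ratings)


# Illustrative inputs only: one dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
mos = mean_opinion_score([5, 4, 5, 4])
print(f"WER: {wer:.2%}, MOS: {mos:.2f}")
```

Because WER counts errors, lower values are better, whereas MOS averages listener ratings, so higher values are better.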

“I didn’t have a research lab or a $100 million runway,” said Sparsh Agrawal, Founder, Pixa AI. “I borrowed GPUs, cloud credits, and even credit card debt. Luna is proof that world-class technology can be built from India, with resourcefulness, not just resources. This is exactly what the IndiaAI Mission stands for: innovation that’s homegrown, open, and globally competitive,” he added.

Luna has been trained on millions of hours of speech data and fine-tuned for real-time performance, emotion recognition, and expressiveness.

The rise of speech-to-speech AI is opening new frontiers in entertainment, mobility, wellness, and companionship. Heavy inbound demand for Luna already shows universal appeal cutting across industries and geographies, from automakers to kids’ apps to consumer AI products.

Through a licensing-led business model, Pixa AI aims to make Luna AI’s capabilities available to global partners, including entertainment studios, wellness platforms, and companies in the automotive and gaming sectors, as part of its roadmap for international expansion.

Pixa AI is backed by WTFund (where Agrawal was the only solo founder selected from 15,000+ applicants) and other renowned investors such as Kunal Shah (CRED), Shankar Narayana (ex-MD, Carlyle Asia Growth Partners), and Kunal Kapoor (celebrity and investor).

While leading players such as OpenAI and ElevenLabs focus on utility-first use cases such as customer call-centre support, Luna is emotion-first and entertainment-first, achieving faster response times, smoother continuity, and more expressive, natural-sounding voices.

The result is a model that feels less like an interface and more like a companion, capable of conversing, singing, storytelling, and adapting tone and tempo in real time.

DIGITAL TERMINAL
digitalterminal.in