Simran Vohra

Author

  • Published: Oct 29 2025 04:35 PM
  • Last Updated: Oct 29 2025 04:55 PM

Karan Goel, IIT Delhi alumnus, raises $100M for Cartesia. Sonic-3 AI voice technology redefines real-time conversation.




Karan Goel, a graduate of IIT Delhi and Stanford, is the founder and CEO of Cartesia, an AI voice technology company based in Silicon Valley. Goel’s startup raised $100 million from Kleiner Perkins, Index Ventures, Lightspeed, and NVIDIA, positioning Cartesia as a key innovator in voice AI.

Who is Karan Goel? IIT Delhi Alumnus Profile


Karan Goel studied at Delhi Public School, IIT Delhi (Electrical Engineering), Carnegie Mellon, and Stanford, earning top honors including the Siebel Scholarship. At Stanford’s AI Lab, Goel and his co-founder Albert Gu advanced "State Space Models" (SSMs) and launched Cartesia to bring real-time, natural voice AI to businesses worldwide.

What is Sonic-3? The Fastest Natural AI Voice Model

Sonic-3 is Cartesia’s real-time text-to-speech (TTS) AI model, designed to make conversations sound human-like. It can generate laughter and express a range of emotions during live conversations. Sonic-3 supports 42 languages and achieves an end-to-end latency of 190 milliseconds, making it one of the fastest voice AI models on the market.

How Sonic-3 Works: State Space Models (SSMs) Explained

Unlike most voice AI tools, which are built on Transformers, Sonic-3 is built on State Space Models (SSMs). A Transformer attends over the entire preceding conversation to generate each new token, so the work per step grows as the conversation gets longer, which adds delay. SSMs, pioneered by Goel and his co-founder at the Stanford AI Lab, instead carry a fixed-size running state that summarizes what has come so far, much as a person keeps track of the topic and tone of a conversation. This lets Sonic-3 respond in real time, more naturally and efficiently.
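To make the fixed-size state idea concrete, here is a minimal sketch of a toy linear state space recurrence. It is not Cartesia's model or code: the class name DiagonalSSM, the helper functions, and all parameter values are hypothetical, chosen only for illustration. The streaming version updates a small state once per input, while the second function recomputes every output from the whole input history, which is the kind of growing per-step cost the paragraph above describes. The second function is the same toy model computed without a state, not a Transformer.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DiagonalSSM:
    """Toy linear state space model with a diagonal state matrix (illustrative only).

    h_t = a * h_(t-1) + b * x_t   (elementwise over the state)
    y_t = sum(c * h_t)
    """
    a: List[float]  # per-dimension decay applied to the previous state
    b: List[float]  # how strongly each state dimension takes in the input
    c: List[float]  # how each state dimension contributes to the output

    def initial_state(self) -> List[float]:
        return [0.0] * len(self.a)

    def step(self, state: List[float], x: float) -> Tuple[List[float], float]:
        # One update costs O(state size), no matter how long the history is.
        new_state = [a_i * h_i + b_i * x
                     for a_i, b_i, h_i in zip(self.a, self.b, state)]
        y = sum(c_i * h_i for c_i, h_i in zip(self.c, new_state))
        return new_state, y


def run_streaming(model: DiagonalSSM, inputs: List[float]) -> List[float]:
    """Process inputs one at a time, carrying only the fixed-size state."""
    state = model.initial_state()
    outputs = []
    for x in inputs:
        state, y = model.step(state, x)
        outputs.append(y)
    return outputs


def run_full_history(model: DiagonalSSM, inputs: List[float]) -> List[float]:
    """Same outputs, but each step re-reads every earlier input (cost grows with t)."""
    outputs = []
    for t in range(len(inputs)):
        y = 0.0
        for k in range(t + 1):
            for a_i, b_i, c_i in zip(model.a, model.b, model.c):
                y += c_i * (a_i ** (t - k)) * b_i * inputs[k]
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    model = DiagonalSSM(a=[0.5], b=[1.0], c=[1.0])
    print(run_streaming(model, [1.0, 1.0, 1.0]))
    print(run_full_history(model, [1.0, 1.0, 1.0]))
```

Both functions produce the same outputs. The difference is that the streaming version does a constant amount of work per new input, which is the property that makes this family of models attractive for low-latency, real-time generation.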

Sonic-3 vs Other AI Voice Tools Like Eleven Labs

| Feature | Sonic-3 (Cartesia) | Eleven Labs |
| --- | --- | --- |
| Latency | 90ms model latency, 190ms end-to-end latency | 75ms (lower quality) to 300ms+ |
| Voice Quality | Natural, expressive, emotional range including laughter | Less natural, fewer emotions |
| Audio Required for Cloning | 3 seconds for instant voice clones | 10-30 seconds minimum audio |
| Language Support | 42 languages | 32 languages |
| Model Architecture | State Space Models (SSMs) | Transformer-based |
| Deployment | Supports on-prem and on-device | No on-prem or device support |
| Voice Customization | Speed and emotion controls, synthetic voice mixing | Stability, style exaggeration controls |
| User Preference | Preferred by 62% in blind human tests | Preferred by 38.6% in tests |

Sonic-3 stands out with faster speeds, richer emotional range, and more flexible deployment options compared to competitors like Eleven Labs.

Karan Goel’s Viral Tweet: $100M Funding + Sonic-3 Launch

Karan Goel announced the $100 million funding and the Sonic-3 launch in a tweet that drew thousands of likes and replies. In the tweet, he described Sonic-3’s SSM approach and pledged $5,000 to charity if Cartesia cannot improve a qualified user’s voice AI.

Karan Goel's post quickly went viral, and hundreds of engineers and founders commented on it:

  • Future Stacked: "190ms end-to-end is seriously impressive. The breakthrough on emotional range is what really caught our attention, that’s been the missing piece for natural conversation."
  • Kevin Garber: "Would be curious to use this for @LogicGlue_ - we are particularly interested in latency improvements over OpenAI."

Cartesia’s funding and innovation put it ahead among global AI startups, as top investors bet on new technology for voice and conversation. SSMs may become the new standard, replacing older Transformer-based models that are slower and less natural.

FAQ

Sonic-3 uses state space models (SSMs), making it much faster and more natural than transformer-based systems.

Leading investors Kleiner Perkins, Index Ventures, Lightspeed, and NVIDIA funded Cartesia.

Sonic-3 achieves 90ms model latency and 190ms end-to-end latency, placing it among the fastest voice AI models on the market.

Sonic-3 supports 42 languages, making it ideal for global use.

Businesses and users can test Sonic-3 or book a free demo at cartesia.ai/sonic.

