Cartesia: Pioneering real-time voice AI

Tuesday, March 11, 2025

Voice is the most natural way humans communicate. But making real-time voice interactions seamless and lifelike has eluded AI. Existing models are too slow and computationally expensive to power instant, high-quality speech.

Enter Cartesia.

Founded by Stanford researchers Karan Goel, Albert Gu, Arjun Desai, and Brandon Yang, Cartesia is pioneering a new era of AI efficiency with state space models (SSMs). Unlike traditional transformer-based architectures, SSMs enable continuous, real-time processing, allowing models to handle long input sequences while dramatically improving latency, cost and scalability. That makes them ideal for context-heavy, latency-sensitive use cases like real-time audio generation, and it has led to fast uptake of Cartesia’s models in customer support, content creation, and entertainment applications.

Its first product, Sonic, is already redefining AI-powered voice synthesis. Sonic delivers low-latency, lifelike speech generation, balancing inference speed, quality and throughput. More than 10,000 customers already rely on Sonic for applications spanning customer engagement, media, gaming and enterprise automation.

Today, we’re thrilled to partner with Cartesia and lead its Series A. We believe real-time AI is the next major computing shift, and Cartesia is positioned to lead this transformation. As more applications demand instant, high-fidelity voice interactions, the company’s breakthroughs will unlock new possibilities across industries.

They’re also hiring across the board. Join them in shaping the future of real-time AI.

- Bucky & Nadia