Your voice. Your brand. Clone a voice from a short recording and manage your entire voice catalog from the xAI console.
Today, we're introducing Custom Voices. Clone your voice from a few seconds of audio and use it instantly across the Grok Text to Speech and Voice Agent APIs.
Alongside Custom Voices, the new Voice Library gives your team a single place to browse, preview, and manage all your voices from the xAI console.
Custom Voices unlock a new class of applications.
I need help with my recent order.
Of course! Let me pull up your order details.
Give your customer support agent a consistent, recognizable voice that matches your brand identity, not a generic preset.
In today's episode, we dive deep into the future of AI and what it means for creators everywhere.
Narrate videos, podcasts, and social posts in your own voice at scale, without re-recording every time.
Create personalized voices for individuals who have lost the ability to speak, preserving their vocal identity.
Deliver your CEO's keynote naturally in every major language: English, Spanish, French, German, Chinese, Japanese, and more.
Bring characters to life with unique voices without scheduling studio time for every line of dialogue.
She opened the notebook and found the handwriting unmistakably her own, though she had no memory of writing it.
Make your narrative engaging. Turn scripts into full audiobooks narrated in your own voice, chapter by chapter, without stepping into a studio.
Clone your voice in under two minutes. Use it everywhere.
Record about a minute of natural speech in the xAI console. Our pipeline verifies you're the voice owner, processes your recording, and delivers a production-ready voice model, all in under two minutes. Your custom voice inherits every TTS capability: speech tags, multilingual output, and both REST and WebSocket streaming.
Custom voices work everywhere our built-in voices do. Pass the voice_id to any TTS endpoint or use it with the Voice Agent API for real-time conversational agents.
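As a rough illustration of passing a voice_id, the sketch below builds (but does not send) an HTTP request that selects a voice by ID. The endpoint URL, the JSON field names, the bearer-token header, and the XAI_API_KEY environment variable are all placeholders assumed for this example and are not taken from the xAI API reference, so check the official documentation for the real values.

```python
import json
import os
import urllib.request

# ASSUMPTION: placeholder endpoint, not the real xAI TTS URL.
TTS_URL = "https://api.example.com/v1/tts"


def build_tts_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    """Build a POST request that asks for `text` spoken in the given voice.

    The body fields ("text", "voice_id") are assumed names for illustration.
    """
    if not voice_id:
        raise ValueError("voice_id must be a non-empty string")
    body = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_tts_request(
        "Welcome back to the show.",
        voice_id="my-custom-voice",  # placeholder ID for a custom voice
        api_key=os.environ.get("XAI_API_KEY", "test-key"),  # assumed variable name
    )
    # Inspect the request instead of sending it to the placeholder URL.
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Because the voice is just an identifier in the request body, switching from a built-in voice to a custom one should only require changing that one value.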
Every custom voice goes through a two-stage verification process before it can be created. First, the speaker reads a verification phrase that our STT engine transcribes and matches in real time, confirming intent and presence. Then we compute speaker embeddings from the verification clip and the full recording to confirm they belong to the same person.
You can't clone a voice from a pre-existing recording, and you can't clone someone else's voice.
Read a verification phrase aloud. Our STT engine transcribes and matches it in real time, verifying your consent and presence.
Speaker embeddings from the verification clip and the full recording are compared to confirm they belong to the same person.
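The two stages described above can be sketched in a few lines. This is an illustration only, not xAI's pipeline: it assumes the transcript and the embeddings have already been produced elsewhere, uses cosine similarity as one common way to compare speaker embeddings (the announcement does not say which measure is used), and picks an arbitrary threshold of 0.75. All function names are hypothetical.

```python
import math
import re
from typing import List, Sequence

# ASSUMPTION: arbitrary illustrative cutoff; a real system would tune this.
SIMILARITY_THRESHOLD = 0.75


def normalize(text: str) -> List[str]:
    """Lowercase and drop punctuation so transcripts compare word by word."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_matches(expected: str, transcript: str) -> bool:
    """Stage 1: the transcribed speech must match the verification phrase."""
    return normalize(expected) == normalize(transcript)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_speaker(clip_emb: Sequence[float], recording_emb: Sequence[float],
                 threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Stage 2: embeddings of the clip and the recording must be close."""
    return cosine_similarity(clip_emb, recording_emb) >= threshold


def verify(expected: str, transcript: str,
           clip_emb: Sequence[float], recording_emb: Sequence[float]) -> bool:
    """A voice passes only if both stages succeed."""
    return phrase_matches(expected, transcript) and same_speaker(clip_emb, recording_emb)
```

Requiring both checks is what ties the recording to a live, consenting speaker: the phrase match shows the person is present and reading the prompt, and the embedding comparison shows the longer recording comes from that same person.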
The Voice Library is a new section in the xAI console that organizes every voice available to your team, with your custom creations alongside our built-in voices. Browse, preview, and manage voices from a single page.
We've expanded our built-in voice catalog to over 80 voices across 28 languages. Listen to any voice across different scenarios before choosing one for your application.
There is no extra charge to use the Text to Speech or Voice Agent APIs with custom voices.