Our most capable voice agent is now available via API.
Today, we're excited to announce a step change in xAI's Voice Agent capabilities with grok-voice-think-fast-1.0, our new flagship voice model.
This new model excels at complex, ambiguous, multi-step workflows across customer support, sales, and enterprise applications. It is especially well-suited for high-stakes scenarios that demand precise data entry and high-volume tool calling to address the user's request.
We built grok-voice-think-fast-1.0 through tight collaboration with partners like Starlink to combine top-tier intelligence with low response latency and organic conversational ability.
Our model prioritizes snappy responses and cost effectiveness without compromising on accuracy or tool orchestration. The result is a model that lets teams confidently deploy complex, multi-turn voice experiences across a wide range of use cases: customer support, phone sales, appointment booking, restaurant reservations, and more.
This new model takes the top spot on the τ-voice Bench leaderboard, which evaluates full-duplex voice agents under realistic conditions including noise, accents, interruptions, and turn-taking. See the benchmark details here.
Order handling, returns, and promotions in noisy environments
Booking changes, delays, and complex itineraries
Plan changes, billing disputes, and technical troubleshooting
The model has been battle-tested in the toughest real-world conditions: telephony audio, background noise, heavy accents, and frequent interruptions. It natively supports 25+ languages, making it ideal for global deployments.
Collecting and confirming user information is critical for many workflows. Grok Voice can collect email addresses, physical street addresses, phone numbers, full names, account numbers, and other structured data, even when the information is spoken quickly or with a strong accent. It handles speech disfluencies gracefully and accepts natural corrections as a human would.
The model handles the spoken corrections and extracts the intended address.
It invokes the address lookup tool with the corrected query parameter.
It reads back the normalized address, including the location, for user confirmation.
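As a rough illustration of the application side of this flow, the sketch below shows how a tool handler might receive the corrected query, run a lookup, and build the read-back sentence. Everything in it is an assumption made for illustration: the tool name `lookup_address`, the `query` argument, the result shape, and the sample address data are hypothetical and are not taken from the xAI API.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class NormalizedAddress:
    street: str
    city: str
    state: str
    postal_code: str


# Hypothetical in-memory directory standing in for a real address service.
# The entry is made-up sample data.
_DIRECTORY = {
    "1450 maple street springfield": NormalizedAddress(
        "1450 Maple St", "Springfield", "IL", "62704"
    ),
}


def lookup_address(query: str) -> Optional[NormalizedAddress]:
    """Hypothetical lookup tool: normalize the query and search the directory."""
    key = " ".join(query.lower().replace(",", " ").split())
    return _DIRECTORY.get(key)


def handle_tool_call(name: str, arguments_json: str) -> str:
    """Run the tool the model asked for and return a JSON result string."""
    if name != "lookup_address":
        return json.dumps({"error": f"unknown tool: {name}"})
    query = json.loads(arguments_json)["query"]
    match = lookup_address(query)
    if match is None:
        return json.dumps({"found": False})
    return json.dumps({"found": True, "address": asdict(match)})


def readback(result_json: str) -> str:
    """Build the confirmation sentence the agent would speak to the caller."""
    result = json.loads(result_json)
    if not result.get("found"):
        return "I couldn't find that address. Could you repeat it?"
    a = result["address"]
    return (
        f"I have {a['street']}, {a['city']}, {a['state']} {a['postal_code']}. "
        "Is that correct?"
    )


# Suppose the caller said "1415 Maple Street... sorry, 1450 Maple Street,
# Springfield". The query below is what a model would pass to the tool
# after applying the spoken correction.
result = handle_tool_call(
    "lookup_address", json.dumps({"query": "1450 Maple Street, Springfield"})
)
print(readback(result))
```

Keeping the read-back in a separate function means the confirmation step can be reused for other structured fields, such as phone numbers or account numbers, without touching the tool dispatch.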
Grok Voice Think Fast performs reasoning in the background, allowing it to think through challenging queries and workflows in real time with no impact on response latency. This enables intelligent answers while retaining the dexterity needed for natural conversation.
Voice models often default to confident, plausible-sounding answers that are completely wrong. We've built grok-voice-think-fast-1.0 to reason through edge cases before responding, catching obvious mistakes that other models make.
grok-voice-think-fast-1.0: None of the months are spelled with the letter X. You can check them all, but X doesn't appear in any month name.
Other models: Only one month is spelled with the letter X. It's February.
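The claim that no month name contains the letter X is easy to verify mechanically. The following is a minimal check over the English month names. The helper `months_containing` is a name introduced here for illustration, and the check says nothing about how the model itself reasons.

```python
from typing import List

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def months_containing(letter: str) -> List[str]:
    """Return the English month names that contain the given letter (case-insensitive)."""
    return [m for m in MONTHS if letter.lower() in m.lower()]


print(months_containing("x"))  # prints []
```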
Grok Voice powers Starlink's phone sales and customer support experience at +1 (888) GO STARLINK. This requires working across numerous languages, resolving customer support issues, and onboarding new customers through sales: