Grok Voice API

Fast, accurate STT and TTS APIs at the best price

API
Artificial Intelligence
Audio

Hunted by Zac Zuo

Grok now offers standalone Speech-to-Text and Text-to-Speech APIs for developers. The new voice stack covers real-time and batch transcription, multispeaker diarization, multichannel audio, text formatting, expressive TTS with speech tags, multilingual support, and simple usage-based pricing.

Top comment

Hi everyone!

With the new transcription (Speech-to-Text) API now available, combined with their Voice Agent capabilities, it’s clear that @Grok is making a systematic push to capture the entire Voice AI ecosystem.

Looking specifically at the STT model, they have shipped a highly pragmatic feature set. It includes native WebSocket support for real-time streaming, built-in speaker diarization (a must-have for meetings), and intelligent text formatting that automatically handles numbers and currencies (it's cool and pretty useful in production!).

The pricing is also very aggressive: $0.10 per hour for batch and $0.20 per hour for streaming. xAI is once again putting some real price pressure on the market, isn't it?
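To put those rates in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes billing scales linearly with audio duration at the two per-hour prices quoted above, with no minimum charge, free tier, or per-request rounding; none of that is confirmed here. The function name `estimate_cost` is made up for illustration and is not part of any xAI SDK.

```python
# Rates quoted in the comment above, in US dollars per hour of audio.
BATCH_RATE_PER_HOUR = 0.10
STREAMING_RATE_PER_HOUR = 0.20


def estimate_cost(audio_hours: float, streaming: bool = False) -> float:
    """Estimate the transcription cost in dollars for a given amount of audio.

    Assumes simple linear, usage-based billing (an assumption, not a
    documented billing rule).
    """
    if audio_hours < 0:
        raise ValueError("audio_hours must be non-negative")
    rate = STREAMING_RATE_PER_HOUR if streaming else BATCH_RATE_PER_HOUR
    # Round to whole cents for display purposes.
    return round(audio_hours * rate, 2)


if __name__ == "__main__":
    hours = 1000  # e.g. a month of recorded meetings
    print(f"batch:     ${estimate_cost(hours):.2f}")
    print(f"streaming: ${estimate_cost(hours, streaming=True):.2f}")
```

Under these assumptions, 1,000 hours of audio would come to about $100 in batch mode and about $200 when streamed.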

Comment highlights

I've always appreciated the extent to which Grok can utilize voice for projects. Is the text-to-speech fluent across all manner of accents as well?

@zaczuo — the pricing puts real pressure on Deepgram and Whisper API. Curious about multilingual coverage — is speaker diarization accuracy consistent across languages, or is English still the primary target where the model performs best? That's usually where the gap shows up in production.

the multispeaker diarization built right into the STT is a nice touch — that's usually a painful separate step. how's the latency on the real-time streaming? would love to see benchmarks vs whisper and deepgram

About Grok Voice API on Product Hunt

Grok Voice API launched on Product Hunt on April 18th, 2026 and earned 116 upvotes and 6 comments, placing #11 on the daily leaderboard.

Grok Voice API was featured in API (98k followers), Artificial Intelligence (466.3k followers) and Audio (2k followers) on Product Hunt. Together, these topics include over 99.7k products, making this a competitive space to launch in.

Who hunted Grok Voice API?

Grok Voice API was hunted by Zac Zuo. A “hunter” on Product Hunt is the community member who submits a product to the platform — uploading the images, the link, and tagging the makers behind it. Hunters typically write the first comment explaining why a product is worth attention, and their followers are notified the moment they post. Around 79% of featured launches on Product Hunt are self-hunted by their makers, but a well-known hunter still acts as a signal of quality to the rest of the community. See the full all-time top hunters leaderboard to discover who is shaping the Product Hunt ecosystem.

Reviews

Grok Voice API has received 12 reviews on Product Hunt with an average rating of 4.58/5. Read all reviews on Product Hunt.
