Amazon Polly's Neural voices sound great - when they work. But developers report a persistent issue: Polly switches from Neural to Standard voice mid-synthesis without warning. Same text, same settings, but some sentences come out robotic while the rest sound natural.
The switch creates a jarring quality difference that listeners notice immediately. According to these reports, certain SSML constructs and phrase patterns trigger a silent downgrade from Neural to Standard, with no warning or error code.
When you're generating high volumes of files for IVR systems, e-learning, or content platforms, you can't listen to every file. TTSAudit scans each Polly batch and flags files where voice quality dropped - whether from Neural/Standard switching, service errors, or consistency drift.
What developers are saying
"When using Amazon Polly with Neural voices, Amazon Polly will switch to the standard voice. The 1st and 3rd sentences are read using the Neural voice, while the second sentence is read using the Standard voice. This is affecting a lot of our customers."
AWS re:Post
"When listening to neural voice Ivy, if you type 'The dog's name is Cal.' Ivy is going to read it as 'The dog's name is California.' How can I stop this?"
u/Qasym on AWS re:Post
"5 customers have hit this problem this year. Any plans to fix or represent?"
AWS re:Post follow-up
"We can't say to our customers to not use similar forms of writing in their TTS scripts."
AWS re:Post
How TTSAudit solves this
Voice Switch Detection
Detect when Polly silently switches between Neural and Standard voices. Flag files where quality dropped mid-synthesis.
Consistency Scoring
Every file is scored against the batch baseline, so you catch subtle quality drops that manual review misses.
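To make the idea of scoring against a batch baseline concrete, here is a minimal sketch. It is not TTSAudit's actual scoring method, which this page does not describe. It assumes you already have one numeric quality metric per file (the metric itself, the function name score_against_baseline, and the 3.5 threshold are all hypothetical) and flags files that sit far from the batch median, using the median absolute deviation as a robust spread estimate.

```python
from statistics import median


def score_against_baseline(metrics, threshold=3.5):
    """Score each file against the batch median (illustrative sketch only).

    metrics: dict mapping file name -> one numeric quality metric.
             Which metric to use is an assumption left to the caller.
    Returns (scores, flagged): robust z-scores per file, and the sorted
    names whose score exceeds the threshold.
    """
    values = list(metrics.values())
    center = median(values)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(v - center) for v in values)
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    scale = 1.4826 * mad

    scores = {}
    for name, value in metrics.items():
        if scale == 0:
            # Every typical file has the same value; any deviation stands out.
            scores[name] = 0.0 if value == center else float("inf")
        else:
            scores[name] = abs(value - center) / scale

    flagged = sorted(name for name, s in scores.items() if s > threshold)
    return scores, flagged


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    batch = {"a.mp3": 3000, "b.mp3": 3050, "c.mp3": 2980,
             "d.mp3": 3020, "e.mp3": 1900}
    _, flagged = score_against_baseline(batch)
    print(flagged)
```

A median-based baseline is used here instead of a mean because a few downgraded files would otherwise pull the baseline toward themselves and hide the drop.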
Batch Error Recovery
Know exactly which files in a failed Polly batch need re-processing. No more guessing.
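As a rough illustration of what identifying files for re-processing can involve, the sketch below compares the file names a batch was expected to produce with what actually landed in an output directory. It is an assumption-laden example, not TTSAudit's implementation: the function name files_needing_reprocessing and the rule "missing or zero-byte means failed" are hypothetical, and a real check would also inspect the audio itself.

```python
from pathlib import Path


def files_needing_reprocessing(expected_names, output_dir):
    """Return expected file names that are missing or empty in output_dir.

    Illustrative sketch: treats a file as failed if it does not exist
    or has zero bytes. The order of expected_names is preserved.
    """
    out = Path(output_dir)
    pending = []
    for name in expected_names:
        path = out / name
        if not path.is_file() or path.stat().st_size == 0:
            pending.append(name)
    return pending
```

Calling files_needing_reprocessing(["intro.mp3", "menu.mp3"], "out/") returns only the names that should be sent back through synthesis, so the rest of the batch is left untouched.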
Engine-Agnostic
Works with all Polly voice engines - Standard, Neural, Long-Form, and Generative.