Play.ht (rebranded Play AI) was acquired by Meta in mid-2025 and permanently shut down on December 31, 2025. The API went offline first and the full platform - studio, voice library, cloning tools, and Voice Agents - sunset at the end of the year, with stored projects purged. If you were running a content pipeline on Play.ht, you are already migrating. This post covers the quality issues Play.ht was known for, what they teach us about Text-to-Speech QA in general, and how to audit whichever platform you are moving to.
Play.ht's documented quality issues
Before the shutdown, Play.ht was well known in production communities for a specific set of recurring quality problems. Users reported voice quality degradation during peak usage: the same API returned great audio at 2 AM and noticeably worse audio during business hours, when server load was high. Voice cloning required significantly longer training audio than the marketing claimed (30-plus seconds rather than the advertised 10), and reference audio quality mattered more than documented. A speed glitch in the UI caused narration speed to reset to maximum after saving, silently corrupting projects. Of the 140-plus languages claimed on the marketing page, only about 20 had usable production quality. And PlayHT 1.0's retirement earlier in 2025 had already forced users to migrate to 2.0/3.0 inside the platform before the platform itself went away.
What Play.ht's issues teach us about Text-to-Speech QA
The Play.ht story is a useful concentrated example of the risks that every hosted Text-to-Speech platform carries. Server load affects quality, so the same API can give different results at different times of day. Language coverage claims rarely match production reality - always test your specific language before committing. Voice cloning is more fragile than platforms admit. Model deprecation is a real risk: PlayHT 1.0's retirement broke existing workflows even before the full shutdown. And relying on a single hosted provider with no QA layer means you have no safety net when any of these hit you - it is the difference between noticing a regression in 5 minutes and noticing it in 5 months after your users have churned.
Every one of these problems exists somewhere else too. Gemini 2.5 Pro had its own metallic-noise regression in December 2025. ElevenLabs v3 ships updates weekly and quality shifts subtly each time. The useful takeaway is not to avoid hosted Text-to-Speech (self-hosting has its own set of problems), but to build a quality layer on top that catches regressions independent of which provider produced the audio.
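As a rough illustration of what a provider-independent quality layer can look like, here is a minimal Python sketch. It assumes you already have some numeric quality score per clip (from a listening panel, an automated scorer, or anything else) and that you log which provider produced the clip and at what hour. The function name, the tuple layout, and the 0.3 tolerance are all hypothetical choices for this example, not part of any provider's API.

```python
from collections import defaultdict
from statistics import median


def flag_regressions(samples, baseline, tolerance=0.3):
    """Flag (provider, hour) buckets whose median score falls below baseline.

    samples:   iterable of (provider, hour_of_day, score) tuples (assumed layout)
    baseline:  dict mapping provider name -> expected score for that provider
    tolerance: how far below baseline a bucket may drop before it is flagged
    """
    # Group scores by provider and hour so load-related dips show up.
    buckets = defaultdict(list)
    for provider, hour, score in samples:
        buckets[(provider, hour)].append(score)

    flagged = []
    for (provider, hour), scores in buckets.items():
        if provider not in baseline:
            continue  # no baseline recorded yet, nothing to compare against
        bucket_median = median(scores)
        if bucket_median < baseline[provider] - tolerance:
            flagged.append((provider, hour, bucket_median))
    return sorted(flagged)
```

Bucketing by hour is what would surface the "great at 2 AM, worse during business hours" pattern, and keying the baseline by provider keeps the same check usable after you switch vendors.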
Where Play.ht users are migrating
ElevenLabs is the most common destination - the highest quality ceiling on the market and the strongest voice cloning, at the highest credit cost. See our v3 vs Turbo comparison for model-level guidance. Fish Audio is a strong option for long-form consistency at a significantly lower price (45-70 percent less than ElevenLabs on comparable workloads). Murf AI sits at a similar price to Fish and has strong video sync tooling. Gemini 2.5 Pro Text-to-Speech on Google Cloud is new, cheap at scale, and improving fast, with a buggy baseline that requires active monitoring. Open-source options like Chatterbox, Fish Audio S2, and Kokoro are worth looking at if you have the infrastructure to self-host and the appetite for more aggressive QA.
QA your migration
The worst migration mistake is assuming your new platform is artifact-free because it is not the one you just left. Run the same reference script through your new Text-to-Speech provider and compare against your Play.ht archive. Set quality baselines on the new provider before you go into production. Use TTSAudit to batch-check the migrated audio for voice drift, glitches, pacing anomalies, and script accuracy. The audit will tell you within minutes whether your new provider is a clean upgrade or whether you are about to ship a different flavour of the same failures. See our artifact reference for the full list of what to check for.
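If you want a quick home-grown sanity check alongside a full audit, two of the checks above (script accuracy and pacing) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: you already have a transcript of the generated clip from whatever speech-to-text tool you use, and you know the clip's duration and the duration of the matching clip in your archive. The function names and the 5 percent and 15 percent thresholds are hypothetical and should be tuned to your content.

```python
import re


def words(text):
    # Lowercase and strip punctuation so "Fox." and "fox" compare equal.
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(script, transcript):
    """Word-level edit distance between script and transcript, divided by script length."""
    ref, hyp = words(script), words(transcript)
    if not ref:
        return 0.0 if not hyp else 1.0
    # Standard Levenshtein distance, computed one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from the audio
                            curr[j - 1] + 1,      # extra word in the audio
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def audit_clip(script, transcript, seconds, baseline_seconds,
               max_wer=0.05, max_pace_shift=0.15):
    """Return a list of human-readable issues for one migrated clip."""
    issues = []
    wer = word_error_rate(script, transcript)
    if wer > max_wer:
        issues.append(f"script accuracy: WER {wer:.2f}")
    # Same script, so a large duration change means the pacing changed.
    shift = abs(seconds - baseline_seconds) / baseline_seconds
    if shift > max_pace_shift:
        issues.append(f"pacing: duration differs {shift:.0%} from baseline")
    return issues
```

An empty list means the clip passed both checks. This does not cover voice drift or glitches, which need audio-level analysis rather than text and timing.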
Audit your post-migration output
Upload a batch from your new Text-to-Speech provider and compare against your baseline. 100 free credits on signup.
Audit Your Migration
What developers are saying
"PlayHT is great for long-form content like audiobooks, but some users mentioned inconsistencies across different languages."
Reddit aggregation via Acciyo
"Play.ht provides ultra-realistic voices through its advanced voice cloning feature, but pricing can be a barrier, especially for small businesses."
Reddit r/TextToSpeech
"Play.ht is a 'middle-of-the-road performer' - solid for podcast and long-form but not quite the realism of ElevenLabs for creative content."
Reddit r/TextToSpeech
How TTSAudit solves this
Clone Quality Verification
Detect when cloned-voice output from your new provider deviates from the expected voice characteristics across your batch.
Noise & Artifact Detection
Catch background noise bleed-through and audio artifacts in cloned voice output.
Tone Consistency Scoring
Flag files where emotional tone doesn't match the rest of the batch. Catch the flat/robotic outliers.
Multi-Language QA
Verify quality in every language you ship instead of trusting coverage claims like Play.ht's 142 languages. Catch language-specific quality issues.