Find the bad files in your Text-to-Speech batch
Submit Text-to-Speech audio files and get back a report telling you which ones have anomalies and need regenerating.
What we check for
Every file is checked for speaker consistency, audio quality, pace, and script accuracy. Anomalies are flagged so you know exactly what to regenerate.
Speaker Consistency
Detect when the voice changes mid-batch.
Example audit report
Here's what a real batch audit looks like - every file scored, anomalies clearly flagged.
Overall Score: Looking good
| Files | Passed | Flagged | Checks | Cost |
|---|---|---|---|---|
| 8 | 6 | 2 | 3 | 24¢ |
2 Files Recommended for Regeneration
These tracks failed quality checks and should be regenerated for best results.
Track Overview
| # | File | Status |
|---|---|---|
| 1 | chapter_01.mp3 | Pass |
| 2 | chapter_02.mp3 | Pass |
| 3 | chapter_03.mp3 | Failed |
| 4 | chapter_04.mp3 | Pass |
| 5 | chapter_05.mp3 | Failed |
| 6 | chapter_06.mp3 | Pass |
| 7 | chapter_07.mp3 | Pass |
| 8 | chapter_08.mp3 | Pass |
Sign up to run your own audit - 100 free credits (three full 10-file audits), no card required.
Built for every text-to-speech pipeline
Wherever you're generating audio at scale, TTSAudit fits in as the quality gate between "generated" and "shipped".
Audiobook Production
Find the chapters your listeners won't forgive - voice drift, mispronunciations, and silent mistakes buried in hours of narration.
Audiobook QA guide
E-Learning & Courses
Keep pacing and voice consistent across every lesson in a course. Flag the bad modules before they reach your learners.
See the workflow
Content Pipelines
Drop TTSAudit into your generation pipeline as a quality gate. Get a verdict on every file, regenerate only what failed.
See the workflow
One API call. Full batch report.
Submit your files, get back a per-file anomaly report with scores and regeneration recommendations.
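The curl request that follows turns each check on or off through a query parameter. As a minimal sketch of building that same URL from code, the Python below assumes only the endpoint and the four parameter names shown in the curl example; the function name `build_audit_url` and its argument names are hypothetical, not part of any TTSAudit client.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl example; the helper itself is illustrative only.
AUDIT_URL = "https://api.ttsaudit.com/v1/audit"


def build_audit_url(comparison: bool, quality: bool, pace: bool, script_accuracy: bool) -> str:
    """Return the audit URL with every check spelled out as true/false."""
    checks = {
        "comparison": comparison,
        "quality": quality,
        "pace": pace,
        "scriptAccuracy": script_accuracy,
    }
    # The curl example uses lowercase "true"/"false", not Python's "True"/"False".
    query = urlencode({name: str(enabled).lower() for name, enabled in checks.items()})
    return f"{AUDIT_URL}?{query}"


url = build_audit_url(comparison=True, quality=True, pace=False, script_accuracy=False)
print(url)
```

Spelling out all four checks, rather than only the enabled ones, mirrors the curl example and avoids relying on server-side defaults that this page does not document.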
# Audit tracks - specify each check as true/false via query params
curl -X POST "https://api.ttsaudit.com/v1/audit?comparison=true&quality=true&pace=false&scriptAccuracy=false" \
-H "X-API-Key: YOUR_API_KEY" \
-F "files=@track1.mp3" \
-F "files=@track2.mp3" \
-F "files=@track3.mp3" \
-F "accuracy=standard" \
-F "deviationThreshold=0.15"
# Check balance
curl -s https://api.ttsaudit.com/v1/balance \
-H "X-API-Key: YOUR_API_KEY"

{
"score": 82,
"summary": "2 files flagged for regeneration.",
"fileCount": 8,
"tracksToRegenerate": [
{
"file": "chapter_03.mp3",
"reasons": [
{ "check": "comparison", "message": "deviation 41.00%", "deviation": 0.41 }
]
},
{
"file": "chapter_05.mp3",
"reasons": [
{ "check": "quality", "message": "score 32 (pop, static, silence)", "score": 32.0 }
]
}
],
"checks": {
"comparison": {
"score": 85,
"summary": "7 of 8 files match the batch speaker profile.",
"tracks": [
{ "file": "chapter_01.mp3", "similarity": 0.96, "deviation": 0.02, "flagged": false },
{ "file": "chapter_03.mp3", "similarity": 0.42, "deviation": 0.41, "flagged": true },
...
]
},
"quality": {
"score": 81,
"summary": "1 file has significant audio issues.",
"tracks": [
{
"score": 32, "issueCount": 3, "flagged": true,
"issues": [
{ "type": "pop", "timeSec": 3.1, "durationMs": 8, "audibilityLabel": "severe" },
{ "type": "static", "timeSec": 18.7, "durationMs": 420, "audibilityLabel": "distracting" },
{ "type": "silence", "timeSec": 22.0, "durationMs": 1200, "audibilityLabel": "noticeable" }
]
},
...
]
},
"pace": {
"score": 96,
"median": 138,
"summary": "All files within normal speaking speed range.",
"tracks": [
{ "wpm": 142, "wordCount": 312, "deviation": 0.029, "flagged": false },
...
]
},
"scriptAccuracy": {
"score": 0,
"summary": "1 tag spoken aloud - chapter_02.mp3 said [scoff].",
"spokenTagCount": 1,
"tracks": [
{ "accuracy": 97.0, "flagged": false, "tags": { "found": 0, "spoken": [] } },
{ "accuracy": 85.0, "flagged": true, "tags": { "found": 1, "spoken": [{ "tag": "[scoff]", "content": "scoff", "timeSec": 3.24, "endSec": 3.71 }] } },
...
]
}
},
"creditsUsed": 32
}

Developer resources
Everything you need to integrate TTSAudit into your pipeline.
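As one sketch of that integration, the sample report shown earlier can be reduced to a regeneration list and a ship/no-ship decision. The code below reads only fields that appear in the sample response (`tracksToRegenerate`, `file`, `reasons`, `check`, `message`). The helper names and the gate rule ("ship only when nothing is flagged") are assumptions made for illustration, not documented TTSAudit behavior.

```python
import json

# Trimmed copy of the sample response; only fields shown there are used.
SAMPLE_REPORT = json.loads("""
{
  "score": 82,
  "fileCount": 8,
  "tracksToRegenerate": [
    {"file": "chapter_03.mp3",
     "reasons": [{"check": "comparison", "message": "deviation 41.00%", "deviation": 0.41}]},
    {"file": "chapter_05.mp3",
     "reasons": [{"check": "quality", "message": "score 32 (pop, static, silence)", "score": 32.0}]}
  ]
}
""")


def regeneration_plan(report: dict) -> dict:
    """Map each flagged file to the readable reasons it was flagged."""
    plan = {}
    for track in report.get("tracksToRegenerate", []):
        plan[track["file"]] = [
            f'{reason["check"]}: {reason["message"]}' for reason in track.get("reasons", [])
        ]
    return plan


def gate_passes(report: dict) -> bool:
    """Hypothetical gate rule: ship only when no file is flagged for regeneration."""
    return not report.get("tracksToRegenerate")


plan = regeneration_plan(SAMPLE_REPORT)
for name, reasons in plan.items():
    print(name, "->", "; ".join(reasons))
print("ship batch:", gate_passes(SAMPLE_REPORT))
```

In a generation pipeline, the keys of `plan` would feed the regeneration step, and `gate_passes` would decide whether the batch moves on to publishing.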
From the blog
Provider-specific QA guides, voice AI deep-dives, and audio production notes from the team.
Why Your Text-to-Speech Voice Changes Between Files
Generate twenty Text-to-Speech files for one project and one of them sounds different. This is voice drift. Here's why it happens and how to catch it.
Why Text-to-Speech Voices Switch Accent
Some Text-to-Speech models shift accent based on what the text is about. An American voice narrating a Scottish landmark comes back Scottish. Here's why.
AI Audiobook QA Checklist: Ship Without ACX Rejections
A complete quality assurance checklist for AI-narrated audiobooks - ACX technical requirements, TTS-specific failure modes, and automated detection.
Stop guessing which Text-to-Speech files are bad
100 free credits on signup - three full 10-file audits. No credit card required.