TTSAudit is an automated quality assurance API for text-to-speech audio. You submit a batch of TTS files and get back a per-file anomaly report showing which files have voice drift, audio artifacts, pacing issues, or script accuracy problems, so you regenerate only the bad ones instead of the whole batch.

How much does TTSAudit cost?

Each check costs 1 credit per file ($0.01). You get 100 free credits on signup with no credit card required. Re-auditing unchanged files costs only 0.2 credits per check. Credit packs start at $5 for 500 credits and never expire.
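As a rough illustration of the pricing above, the sketch below estimates the credits and dollar cost of one audit. It assumes the 0.2-credit re-audit rate applies per check per unchanged file and that new and unchanged files can be mixed in one batch; the function names are hypothetical and not part of the TTSAudit API.

```python
from decimal import Decimal

CREDIT_PRICE_USD = Decimal("0.01")  # 1 credit = $0.01, per the pricing above
REAUDIT_RATE = Decimal("0.2")       # credits per check for an unchanged file


def estimate_credits(new_files: int, checks: int, unchanged_files: int = 0) -> Decimal:
    """Estimate credits for one audit.

    Assumption: 1 credit per check per new file, and 0.2 credits per check
    per unchanged (re-audited) file.
    """
    if new_files < 0 or unchanged_files < 0 or checks < 0:
        raise ValueError("counts must be non-negative")
    return Decimal(new_files * checks) + REAUDIT_RATE * unchanged_files * checks


def credits_to_usd(credits: Decimal) -> Decimal:
    """Convert credits to US dollars at $0.01 per credit."""
    return credits * CREDIT_PRICE_USD
```

For example, `estimate_credits(8, 3)` covers eight new files with three checks each, and `estimate_credits(2, 3, unchanged_files=6)` covers regenerating two files while re-auditing the other six.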

What checks does TTSAudit run?

TTSAudit runs four checks: Speaker Consistency (detects voice drift across files using speaker embeddings), Audio Quality (flags clipping, noise, pops, and artifacts), Speaking Speed (measures WPM and pacing consistency), and Script Accuracy (compares transcription against source scripts and detects spoken tags).
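To make the Speaking Speed check more concrete, here is a minimal sketch of one way pacing consistency could be measured: compare each file's WPM against the batch median and flag large relative deviations. The formula and the 0.15 default threshold are assumptions for illustration; TTSAudit's actual method is not documented here.

```python
from statistics import median


def pace_deviations(wpm_by_file: dict[str, float], threshold: float = 0.15) -> dict[str, dict]:
    """Relative WPM deviation from the batch median, flagged above `threshold`.

    Assumption: deviation = |wpm - median| / median. This is a guess at the
    approach, not the service's confirmed algorithm.
    """
    if not wpm_by_file:
        return {}
    batch_median = median(wpm_by_file.values())
    if batch_median <= 0:
        raise ValueError("median WPM must be positive")
    report = {}
    for name, wpm in wpm_by_file.items():
        deviation = abs(wpm - batch_median) / batch_median
        report[name] = {
            "wpm": wpm,
            "deviation": round(deviation, 3),
            "flagged": deviation > threshold,
        }
    return report
```

A file read at 190 WPM in a batch whose median is 138 WPM would be flagged under this sketch, while one at 142 WPM would not.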

Which audio formats does TTSAudit support?

TTSAudit supports MP3, WAV, OGG, FLAC, M4A, AAC, and most other common audio formats. Files are uploaded as multipart form data through the REST API or the web dashboard.

How do I integrate TTSAudit into my pipeline?

TTSAudit provides a REST API with API key authentication. Send a POST request to /v1/audit with your audio files and specify which checks to run. You can also use x402 micropayments (USDC on Base) for agent-driven access without creating an account.

Batch anomaly detection API

Find the bad files in your Text-to-Speech batch

Submit Text-to-Speech audio files and get back a report telling you which ones have anomalies and need regenerating.

Start Audit See How It Works

What we check for

Every file is analyzed across multiple dimensions. Anomalies are flagged so you know exactly what to regenerate.

Speaker Consistency

Detect when the voice changes mid-batch

Sample labels: Consistent, Accent Drift, Consistent

Audio Quality

Catch glitches, noise, and artifacts

Speaking Speed

Flag abnormal pacing and speed drift

Sample labels: Accepted, Too Fast

Script Accuracy

Catch when Text-to-Speech speaks your audio tags out loud

Example audit report

Here's what a real batch audit looks like - every file scored, anomalies clearly flagged.

Summary panel: Overall Score (Looking good), Files, Passed, Flagged, Checks, Cost (24¢)

2 Files Recommended for Regeneration

These tracks failed quality checks and should be regenerated for best results.

chapter_03.mp3

different speaker detected

chapter_05.mp3

noise artifacts, low quality score

Track Overview

#	File	Similarity	Quality	WPM	Status
1	chapter_01.mp3	96%	94	142	Pass
2	chapter_02.mp3	95%	91	138	Pass
3	chapter_03.mp3	42%	87	145	Failed
4	chapter_04.mp3	97%	88	140	Pass
5	chapter_05.mp3	94%	32	136	Failed
6	chapter_06.mp3	96%	95	141	Pass
7	chapter_07.mp3	93%	90	139	Pass
8	chapter_08.mp3	95%	93	143	Pass

Built for every text-to-speech pipeline

Wherever you're generating audio at scale, TTSAudit fits in as the quality gate between "generated" and "shipped".

All use cases

Audiobook Production

Find the chapters your listeners won't forgive - voice drift, mispronunciations, and silent mistakes buried in hours of narration.

Audiobook QA guide

E-Learning & Courses

Keep pacing and voice consistent across every lesson in a course. Flag the bad modules before they reach your learners.

See the workflow

Content Pipelines

Drop TTSAudit into your generation pipeline as a quality gate. Get a verdict on every file, regenerate only what failed.

See the workflow

One API call. Full batch report.

Submit your files, get back a per-file anomaly report with scores and regeneration recommendations.

API Key / x402 · USDC on Base

# Audit tracks - specify each check as true/false via query params
curl -X POST "https://api.ttsaudit.com/v1/audit?comparison=true&quality=true&pace=false&scriptAccuracy=false" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "files=@track1.mp3" \
  -F "files=@track2.mp3" \
  -F "files=@track3.mp3" \
  -F "accuracy=standard" \
  -F "deviationThreshold=0.15"

# Check balance
curl -s https://api.ttsaudit.com/v1/balance \
  -H "X-API-Key: YOUR_API_KEY"
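For pipelines that build the request in code rather than with curl, the sketch below assembles the same audit URL from a set of enabled checks. The endpoint and the four query parameter names are taken from the curl example above; the helper name `build_audit_url` is hypothetical, and the sketch only builds the URL without sending anything.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.ttsaudit.com/v1/audit"  # endpoint from the curl example
CHECKS = ("comparison", "quality", "pace", "scriptAccuracy")  # query params from the curl example


def build_audit_url(enabled: set[str]) -> str:
    """Return the audit URL with every check set explicitly to true or false."""
    unknown = set(enabled) - set(CHECKS)
    if unknown:
        raise ValueError(f"unknown checks: {sorted(unknown)}")
    params = {name: "true" if name in enabled else "false" for name in CHECKS}
    return f"{BASE_URL}?{urlencode(params)}"


url = build_audit_url({"comparison", "quality"})
```

The resulting URL can then be passed to any HTTP client together with the X-API-Key header and the multipart file fields shown in the curl command.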

JSON Response

{
  "score": 82,
  "summary": "2 files flagged for regeneration.",
  "fileCount": 8,
  "tracksToRegenerate": [
    {
      "file": "chapter_03.mp3",
      "reasons": [
        { "check": "comparison", "message": "deviation 41.00%", "deviation": 0.41 }
      ]
    },
    {
      "file": "chapter_05.mp3",
      "reasons": [
        { "check": "quality", "message": "score 32 (pop, static, silence)", "score": 32.0 }
      ]
    }
  ],
  "checks": {
    "comparison": {
      "score": 85,
      "summary": "7 of 8 files match the batch speaker profile.",
      "tracks": [
        { "file": "chapter_01.mp3", "similarity": 0.96, "deviation": 0.02, "flagged": false },
        { "file": "chapter_03.mp3", "similarity": 0.42, "deviation": 0.41, "flagged": true },
        ...
      ]
    },
    "quality": {
      "score": 81,
      "summary": "1 file has significant audio issues.",
      "tracks": [
        {
          "score": 32, "issueCount": 3, "flagged": true,
          "issues": [
            { "type": "pop", "timeSec": 3.1, "durationMs": 8, "audibilityLabel": "severe" },
            { "type": "static", "timeSec": 18.7, "durationMs": 420, "audibilityLabel": "distracting" },
            { "type": "silence", "timeSec": 22.0, "durationMs": 1200, "audibilityLabel": "noticeable" }
          ]
        },
        ...
      ]
    },
    "pace": {
      "score": 96,
      "median": 138,
      "summary": "All files within normal speaking speed range.",
      "tracks": [
        { "wpm": 142, "wordCount": 312, "deviation": 0.029, "flagged": false },
        ...
      ]
    },
    "scriptAccuracy": {
      "score": 0,
      "summary": "1 tag spoken aloud - chapter_02.mp3 said [scoff].",
      "spokenTagCount": 1,
      "tracks": [
        { "accuracy": 97.0, "flagged": false, "tags": { "found": 0, "spoken": [] } },
        { "accuracy": 85.0, "flagged": true, "tags": { "found": 1, "spoken": [{ "tag": "[scoff]", "content": "scoff", "timeSec": 3.24, "endSec": 3.71 }] } },
        ...
      ]
    }
  },
  "creditsUsed": 32
}
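A pipeline would typically read only the tracksToRegenerate list from this response. The sketch below shows one way to do that, using a trimmed copy of the sample response above; the helper name `files_to_regenerate` is hypothetical, and the field names are assumed to match the sample.

```python
import json


def files_to_regenerate(report: dict) -> dict[str, list[str]]:
    """Map each flagged file to its list of 'check: message' reasons."""
    result = {}
    for entry in report.get("tracksToRegenerate", []):
        reasons = [f"{r['check']}: {r['message']}" for r in entry.get("reasons", [])]
        result[entry["file"]] = reasons
    return result


# Trimmed copy of the sample response shown above.
sample = json.loads("""
{
  "score": 82,
  "fileCount": 8,
  "tracksToRegenerate": [
    {"file": "chapter_03.mp3",
     "reasons": [{"check": "comparison", "message": "deviation 41.00%", "deviation": 0.41}]},
    {"file": "chapter_05.mp3",
     "reasons": [{"check": "quality", "message": "score 32 (pop, static, silence)", "score": 32.0}]}
  ],
  "creditsUsed": 32
}
""")

for name, reasons in files_to_regenerate(sample).items():
    print(name, "->", "; ".join(reasons))
```

Only the files returned here would be sent back to the TTS provider, and the rest of the batch is left alone.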

Developer resources

Everything you need to integrate TTSAudit into your pipeline.

Full API reference

API Overview

REST API, authentication, rate limits, error codes.

Create Audit

Submit a batch of files for per-file analysis.

List & Get Audits

Poll for results or retrieve past audits by ID.

Security

How we handle your audio, keys, and data.
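The "List & Get Audits" entry above mentions polling for results. A generic polling loop might look like the sketch below. The retrieval endpoint and response shape are not given on this page, so the caller supplies both the fetch function and the completion test; the "status" field in the usage example is purely hypothetical.

```python
import time
from typing import Callable


def poll_audit(
    fetch: Callable[[], dict],
    is_done: Callable[[dict], bool],
    interval_sec: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` until `is_done` accepts the result or attempts run out."""
    for attempt in range(max_attempts):
        audit = fetch()
        if is_done(audit):
            return audit
        if attempt < max_attempts - 1:
            sleep(interval_sec)  # wait before the next attempt
    raise TimeoutError(f"audit not finished after {max_attempts} attempts")


# Usage with canned responses; "status" is a hypothetical field name.
responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "score": 82},
])
result = poll_audit(
    fetch=lambda: next(responses),
    is_done=lambda audit: audit.get("status") == "completed",
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
```

Injecting `fetch` and `is_done` keeps the loop independent of the real endpoint, which should be taken from the API reference.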

From the blog

Provider-specific QA guides, voice AI deep-dives, and audio production notes from the team.

All posts

Voice AI 10 min read

Why Your Text-to-Speech Voice Changes Between Files

Generate twenty Text-to-Speech files for one project and one of them sounds different. This is voice drift. Here's why it happens and how to catch it.

TTSAudit Team · 2026-04-12

Voice AI 7 min read

Why Text-to-Speech Voices Switch Accent

Some Text-to-Speech models shift accent based on what the text is about. An American voice narrating a Scottish landmark comes back Scottish. Here's why.

TTSAudit Team · 2026-04-12

Audiobooks 14 min read

AI Audiobook QA Checklist: Ship Without ACX Rejections

A complete quality assurance checklist for AI-narrated audiobooks - ACX technical requirements, TTS-specific failure modes, and automated detection.

TTSAudit Team · 2026-04-12

Stop guessing which Text-to-Speech files are bad

100 free credits on signup - three full 10-file audits. No credit card required.
