Question 1

Does ACX accept AI-narrated audiobooks?

Accepted Answer

ACX's public policy on fully AI-generated narration remains conservative. In 2024, however, ACX launched a Narrator Voice Replicas beta that lets invited human narrators create and monetize AI replicas of their own voices. Titles narrated with a replica are labelled in the narrator field. Outside that beta, TTS output from any generator is not an officially supported submission path, but the same audio engineering bar still applies: RMS between -23 dB and -18 dB, peaks below -3 dB, and a noise floor under -60 dB. AI submissions with voice drift or audio artifacts are rejected by the automated scanner in the same way human submissions are.
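
The three thresholds quoted above can be checked locally before upload. The following is a minimal sketch, assuming the audio has already been decoded to float samples in [-1.0, 1.0] and that a separate room-tone segment is available for the noise floor measurement. The function names (`rms_db`, `peak_db`, `acx_precheck`) are hypothetical, and this is not ACX's own scanner; it only illustrates the arithmetic.

```python
import math

# Thresholds quoted in the answer above (dB relative to full scale).
RMS_MIN_DB, RMS_MAX_DB = -23.0, -18.0
PEAK_MAX_DB = -3.0
NOISE_FLOOR_MAX_DB = -60.0


def to_db(amplitude):
    """Convert a linear amplitude to dB; zero amplitude maps to -inf."""
    return 20.0 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def rms_db(samples):
    """RMS level of float samples in [-1.0, 1.0], in dB."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))


def peak_db(samples):
    """Highest absolute sample value, in dB."""
    return to_db(max((abs(s) for s in samples), default=0.0))


def acx_precheck(samples, room_tone):
    """Return a list of failure messages; an empty list means all checks pass.

    Assumption: `room_tone` is a stretch of the file with no speech,
    used here as a stand-in for the noise floor.
    """
    failures = []
    rms = rms_db(samples)
    if not (RMS_MIN_DB <= rms <= RMS_MAX_DB):
        failures.append(f"RMS {rms:.1f} dB outside {RMS_MIN_DB}..{RMS_MAX_DB} dB")
    peak = peak_db(samples)
    if peak >= PEAK_MAX_DB:
        failures.append(f"peak {peak:.1f} dB not below {PEAK_MAX_DB} dB")
    noise = rms_db(room_tone)
    if noise >= NOISE_FLOOR_MAX_DB:
        failures.append(f"noise floor {noise:.1f} dB not under {NOISE_FLOOR_MAX_DB} dB")
    return failures


if __name__ == "__main__":
    chapter = [0.1] * 1000       # synthetic signal at about -20 dB RMS
    room_tone = [0.0005] * 1000  # synthetic room tone at about -66 dB
    print(acx_precheck(chapter, room_tone) or "all checks pass")
```

Running the check per file matters because each file has to clear all three thresholds on its own, not just the book on average.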

Question 2

Which Text-to-Speech platform is best for audiobook production?

Accepted Answer

ElevenLabs v3 is the quality leader for long-form narration and voice cloning. Gemini 2.5 Pro is cheaper on large projects and explicitly designed for audiobooks. Open-source options like Chatterbox or Coqui XTTS are workable if you accept more regenerations and run rigorous automated QA. Whichever you pick, treat the audit pass as mandatory.

Question 3

How many regenerations should I expect per chapter?

Accepted Answer

Across batch audits, we see that roughly 5 to 15 percent of chapters need at least one regeneration on production-grade providers. That share climbs for Turbo-style fast tiers and open-source models. Planning for 10 percent regeneration is a reasonable starting budget.
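
As a rough planning aid, the 5 to 15 percent range and the 10 percent starting budget can be turned into chapter counts. This is a minimal sketch; the function name `regeneration_range` is hypothetical, and rounding up to whole chapters is an assumption made here to keep the budget conservative.

```python
import math


def regeneration_range(chapters, low=0.05, planned=0.10, high=0.15):
    """Return (low, planned, high) counts of chapters likely to need
    at least one regeneration, rounded up to whole chapters."""
    if chapters < 0:
        raise ValueError("chapters must be non-negative")

    def count(rate):
        # round() guards against float noise such as 3.0000000000000004
        return math.ceil(round(chapters * rate, 9))

    return count(low), count(planned), count(high)


if __name__ == "__main__":
    print(regeneration_range(40))  # (2, 4, 6)
```

For fast tiers or open-source models, pass higher rates than the defaults, since the answer above notes that the share climbs for those.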

Question 4

Can I mix Text-to-Speech and human narration in one audiobook?

Accepted Answer

Technically yes - you can submit a mixed audiobook to ACX as long as every individual file meets the technical requirements. In practice, mixed narration sounds jarring unless the voices and delivery are very carefully matched. Most teams commit to one approach per book.

Question 5

What is the real cost difference between Text-to-Speech and human narration?

Accepted Answer

A 10-hour audiobook from a union narrator costs $2,500 to $5,000 (roughly $250 to $500 per finished hour). The same book on ElevenLabs v3 costs around $50 to $100 in credits plus an audit pass. The catch is that Text-to-Speech requires a proper QA pipeline - without one, you will ship a worse product than a human narrator at any price.
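
The per-finished-hour comparison is each quoted total divided by the running time. The sketch below uses the totals from the answer above; the helper name `per_finished_hour` and the `extra` parameter are hypothetical. The cost of the audit pass is not quoted in the answer, so it is left as an input rather than guessed.

```python
def per_finished_hour(total_low, total_high, hours, extra=0.0):
    """Return the (low, high) cost per finished hour.

    `extra` is a flat add-on, such as an audit pass; its value is an
    assumption the caller must supply.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (total_low + extra) / hours, (total_high + extra) / hours


if __name__ == "__main__":
    HOURS = 10
    human = per_finished_hour(2500, 5000, HOURS)  # union narrator totals
    tts = per_finished_hour(50, 100, HOURS)       # TTS credits only
    print(f"human: ${human[0]:.0f} to ${human[1]:.0f} per finished hour")
    print(f"TTS:   ${tts[0]:.0f} to ${tts[1]:.0f} per finished hour, before audit")
```

Adding a realistic `extra` for QA tooling and regeneration time gives a fairer comparison than credits alone.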

AI Audiobook QA Checklist

ACX technical requirements you must hit

The Text-to-Speech-specific QA checklist

Manual vs automated QA

Common failures and how to fix them

Which Text-to-Speech platform for audiobooks

Run your audiobook through TTSAudit before submission

Key capabilities

Chapter-Level Analysis

Drift Detection

Per-Chapter Scores

ACX Pre-Check

Frequently asked questions
