API Documentation

RESTful API with simple authentication

POST /audit

Run one or more audio checks on a batch of files. Enable each check via query parameters and upload files as multipart form data.

Each enabled check costs 1 credit per file.
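Because pricing is per enabled check per file, the cost of a batch can be estimated before uploading. The following is a minimal sketch, assuming no cached-file discount applies; `estimate_credits` is a hypothetical helper for illustration and not part of the API.

```python
def estimate_credits(file_count: int, *, comparison: bool = False,
                     quality: bool = False, pace: bool = False,
                     script_accuracy: bool = False) -> int:
    """Estimate the cost of one audit: 1 credit per enabled check per file."""
    if file_count < 1:
        raise ValueError("file_count must be at least 1")
    # Booleans sum as 0/1, so this counts the enabled checks.
    enabled = sum([comparison, quality, pace, script_accuracy])
    if enabled == 0:
        raise ValueError("at least one check must be enabled")
    return enabled * file_count


# Three files with three checks enabled.
print(estimate_credits(3, comparison=True, quality=True, script_accuracy=True))  # 9
```

The estimate is an upper bound under the stated assumption; the creditsUsed field in the response reports what was actually charged.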

Known Limitations

Primarily designed and tested with English audio. Other languages are untested and may produce unexpected results.
Best results come from one speaker per file. Multiple speakers per file are not supported.
Audio containing intentional sound effects (jingles, ambient music, crowd noise, intro/outro effects) can trigger false positives on the garble and staticNoise quality sub-checks. If your audio includes sound effects, disable these checks via the qualityChecks parameter.
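For audio with intentional sound effects, the qualityChecks form field can switch off the two affected sub-checks. A small sketch in Python follows; the key names `garble` and `staticNoise` are taken from the sub-check names above and are assumed to be the JSON keys, and the helper name is hypothetical.

```python
import json


def quality_checks_for_sound_effects() -> str:
    """Return a qualityChecks JSON string that disables garble and staticNoise."""
    # Only the two overrides are listed; other sub-checks are left untouched.
    return json.dumps({"garble": False, "staticNoise": False})


print(quality_checks_for_sound_effects())  # {"garble": false, "staticNoise": false}
```

The returned string is meant to be sent as the value of the qualityChecks multipart field.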

Headers

Authenticate with an API key or pay per-request via x402 (USDC on Base). See Authentication for details.

Header	Type	Required	Description
X-API-Key	`string`	Required*	Your API key (or use `Authorization: Bearer`). Not needed when paying via x402.
Payment-Signature	`string`	Optional	Base64-encoded x402 payment receipt. Send with the retry after receiving a 402 response.
X-Wallet	`string`	Optional	Your wallet address, sent on the initial request to receive cached-file discounts in the 402 price.
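The header table describes two paths: an API key, or x402 payment with an optional wallet address and a payment receipt on the retry. One way to assemble the headers for either path is sketched below. `build_auth_headers` is a hypothetical helper, and letting the API key take precedence over the x402 headers is an assumption of this sketch. Producing the receipt itself is outside its scope.

```python
from typing import Dict, Optional


def build_auth_headers(api_key: Optional[str] = None,
                       wallet: Optional[str] = None,
                       payment_signature: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for API-key auth or for the x402 flow."""
    if api_key:
        # API-key path: no x402 headers are needed.
        return {"X-API-Key": api_key}
    headers: Dict[str, str] = {}
    if wallet:
        # Initial x402 request: lets the 402 price reflect cached-file discounts.
        headers["X-Wallet"] = wallet
    if payment_signature:
        # Retry after a 402 response: Base64-encoded payment receipt.
        headers["Payment-Signature"] = payment_signature
    return headers


print(build_auth_headers(api_key="YOUR_API_KEY"))  # {'X-API-Key': 'YOUR_API_KEY'}
```

With the x402 path, the first call would typically pass only `wallet`, and the retry would add `payment_signature` once a receipt is available.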

Query Parameters

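The four check parameters (comparison, quality, pace, scriptAccuracy) are required; forceFresh is optional. Values are the strings "true" or "false", and at least one check must be "true".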

Parameter	Type	Required	Description
comparison	`string`	Required	Enable speaker consistency analysis
quality	`string`	Required	Enable audio quality analysis (SNR, artifacts, clipping)
pace	`string`	Required	Enable speaking speed consistency analysis
scriptAccuracy	`string`	Required	Enable script accuracy and spoken tag detection
forceFresh	`string`	Optional	Skip cached-file discount and charge full credits for every file
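The query string rules above (all four check parameters present, string values, at least one "true") can be enforced on the client before sending. This is a minimal sketch; the base URL is copied from the curl example in this document, and `build_audit_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.ttsaudit.com/v1/audit"  # taken from the curl example


def build_audit_url(comparison: bool, quality: bool, pace: bool,
                    script_accuracy: bool, force_fresh: bool = False) -> str:
    """Build the POST /audit URL with all four check parameters set explicitly."""
    checks = {
        "comparison": comparison,
        "quality": quality,
        "pace": pace,
        "scriptAccuracy": script_accuracy,
    }
    if not any(checks.values()):
        raise ValueError('at least one check must be "true"')
    # The API expects the literal strings "true" and "false".
    query = {name: "true" if on else "false" for name, on in checks.items()}
    if force_fresh:
        query["forceFresh"] = "true"
    return f"{BASE_URL}?{urlencode(query)}"


print(build_audit_url(True, True, False, False))
# https://api.ttsaudit.com/v1/audit?comparison=true&quality=true&pace=false&scriptAccuracy=false
```

Sending every check parameter explicitly, even when it is "false", mirrors the requirement that all four be present.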

Request Body

Encoded as multipart/form-data.

Field	Type	Required	Description
files	`File`	Required	Audio files to analyze (min 1 for quality-only, min 2 for comparison or pace). Use field name "files" for each part
accuracy	`string`	Optional	Comparison accuracy: "standard" (default), "high", or "highest"
deviationThreshold	`number`	Optional	Flagging sensitivity across all checks, 0–1. Default 0.15
qualityChecks	`JSON string`	Optional	Toggle individual quality sub-checks. All default to true; include only the keys you want to change
scripts	`JSON string`	Optional	JSON object mapping each filename to its script text. Improves spoken tag detection when scriptAccuracy is enabled; without scripts, repetitions and likely spoken tags are still detected. Example: {"file1.mp3": "Hello [laughs] world!"}
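Two of the body rules lend themselves to a client-side check: the minimum file count per check, and the scripts mapping keyed by filename. The sketch below assumes that "filename" means the base name of each uploaded file; both helper names are hypothetical.

```python
import json
from pathlib import PurePath
from typing import Dict, List


def check_min_files(file_count: int, comparison: bool, pace: bool) -> None:
    """Raise if the batch is too small: 2 files for comparison or pace, else 1."""
    minimum = 2 if (comparison or pace) else 1
    if file_count < minimum:
        raise ValueError(f"need at least {minimum} file(s), got {file_count}")


def build_scripts_field(paths: List[str], scripts_by_name: Dict[str, str]) -> str:
    """Return the scripts JSON string, rejecting keys that match no uploaded file."""
    # Assumption: the API matches scripts on the file's base name.
    names = {PurePath(p).name for p in paths}
    unknown = sorted(set(scripts_by_name) - names)
    if unknown:
        raise ValueError(f"scripts given for files not in the batch: {unknown}")
    return json.dumps(scripts_by_name, ensure_ascii=False)


check_min_files(2, comparison=True, pace=False)
print(build_scripts_field(["audio/file1.mp3", "audio/file2.mp3"],
                          {"file1.mp3": "Hello [laughs] world!"}))
# {"file1.mp3": "Hello [laughs] world!"}
```

Files without a script are simply left out of the mapping, since the scripts field is optional.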

Code Examples

curl -X POST "https://api.ttsaudit.com/v1/audit?comparison=true&quality=true&pace=false&scriptAccuracy=false" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "files=@chapter1.mp3" \
  -F "files=@chapter2.mp3" \
  -F "files=@chapter3.mp3" \
  -F "accuracy=standard" \
  -F "deviationThreshold=0.15"

# To disable specific quality sub-checks, add qualityChecks as a JSON string:
#   -F 'qualityChecks={"garble": false, "silenceGaps": false}'
# Omitted keys default to true.
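For callers not using curl, the same request can be sketched with only the Python standard library. The example encodes the multipart body by hand, repeating the field name "files" for every file part, and builds the request without sending it. The helper names and the placeholder file bytes are illustrative. The per-part content type `application/octet-stream` is an assumption, and filenames are not escaped, so they must not contain double quotes.

```python
import urllib.request
import uuid
from typing import Dict, List, Tuple

AUDIT_URL = ("https://api.ttsaudit.com/v1/audit"
             "?comparison=true&quality=true&pace=false&scriptAccuracy=false")


def encode_multipart(fields: Dict[str, str],
                     files: List[Tuple[str, bytes]]) -> Tuple[bytes, str]:
    """Encode text fields and (filename, content) pairs as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts: List[bytes] = []
    for name, value in fields.items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    for filename, content in files:
        parts += [
            f"--{boundary}".encode(),
            # Every file part uses the same field name, "files".
            f'Content-Disposition: form-data; name="files"; filename="{filename}"'.encode(),
            b"Content-Type: application/octet-stream",
            b"",
            content,
        ]
    parts += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


def build_audit_request(api_key: str,
                        files: List[Tuple[str, bytes]]) -> urllib.request.Request:
    """Build (but do not send) a POST /audit request."""
    body, content_type = encode_multipart(
        {"accuracy": "standard", "deviationThreshold": "0.15"}, files)
    return urllib.request.Request(
        AUDIT_URL, data=body, method="POST",
        headers={"X-API-Key": api_key, "Content-Type": content_type})


# Placeholder bytes stand in for real audio so the sketch runs offline.
request = build_audit_request(
    "YOUR_API_KEY", [("chapter1.mp3", b"\x00"), ("chapter2.mp3", b"\x00")])
print(request.get_method())  # POST
```

Passing the request to `urllib.request.urlopen` would send it; that step is omitted so the sketch makes no network call.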

Response

Returns an overall score, a list of files to regenerate, and per-check results. Each check contains a score, summary, and tracks[] array.

Property	Type	Description
score	`number`	Overall score (0–100), average of all enabled check scores
summary	`string`	One-sentence overview of the audit result
fileCount	`number`	Number of audio files in the batch
auditId	`string`	Unique identifier for this audit session
reportUrl	`string`	Direct link to view this audit on ttsaudit.com
tracksToRegenerate	`array`	Files flagged by any check. Each entry has a file name (file) and a list of reasons
checks.comparison	`object`	Speaker consistency results including voice similarity and volume consistency (if enabled)
checks.quality	`object`	Audio quality analysis results (if enabled)
checks.pace	`object`	Speaking speed consistency results (if enabled)
checks.scriptAccuracy	`object`	Script accuracy and spoken tag detection results (if enabled)
creditsUsed	`number`	Total credits consumed by this audit
timing	`object`	Performance breakdown in seconds

{
  "score": 87.2,
  "summary": "2 of 3 checks passed. 1 file flagged for regeneration.",
  "fileCount": 3,
  "auditId": "a1b2c3d4e5f6",
  "reportUrl": "https://ttsaudit.com/dashboard?tab=audit&session=a1b2c3d4e5f6",
  "tracksToRegenerate": [
    {
      "file": "chapter3.mp3",
      "reasons": [
        { "check": "comparison", "message": "deviation 18.00%", "deviation": 0.18 }
      ]
    }
  ],
  "checks": {
    "comparison": {
      "score": 91.2,
      "summary": "1 of 3 files flagged - 91% average consistency.",
      "similarityMatrix": [[1.0, 0.92, 0.88], [0.92, 1.0, 0.91], [0.88, 0.91, 1.0]],
      "volumeConsistency": { "score": 96.2, "medianDb": -18.3, "spreadDb": 1.8, "outliers": [] },
      "tracks": [
        { "file": "chapter1.mp3", "similarity": 0.93, "deviation": -0.02, "flagged": false },
        { "file": "chapter2.mp3", "similarity": 0.91, "deviation": 0.00, "flagged": false },
        { "file": "chapter3.mp3", "similarity": 0.81, "deviation": 0.18, "flagged": true }
      ]
    },
    "quality": {
      "score": 94.5,
      "summary": "Good audio quality - average score 94.5 across 3 files.",
      "tracks": [
        {
          "score": 95.2,
          "flagged": false,
          "snrDb": 45.2,
          "issueCount": 2,
          "issueSummary": { "total": 2, "severe": 0, "noticeable": 2, "garbleCount": 1, "staticCount": 0, "silenceCount": 0, "worstLabel": "noticeable" },
          "issues": [
            { "timeSec": 3.21, "endSec": 3.242, "durationMs": 32.0, "type": "click", "severity": 0.4, "audibility": 0.45, "audibilityLabel": "noticeable" },
            { "timeSec": 6.66, "endSec": 6.692, "durationMs": 32.0, "type": "garble", "severity": 0.62, "audibility": 0.58, "audibilityLabel": "noticeable" }
          ],
          "clipping": { "clipCount": 0, "clipPercentage": 0 },
          "bandwidth": { "cutoffHz": null, "spectralCentroidHz": 1824.3, "bandwidthRatio": 1.0 }
        }
      ]
    },
    "scriptAccuracy": {
      "score": 0,
      "summary": "1 tag spoken aloud across 3 files.",
      "spokenTagCount": 1,
      "tracks": [
        {
          "transcript": "Hello scoff well that was something.",
          "accuracy": 85.0,
          "wordErrorRate": 0.15,
          "flagged": true,
          "tags": {
            "found": 1,
            "spoken": [{ "tag": "[scoff]", "content": "scoff", "spokenWord": "scoff", "timeSec": 3.24, "endSec": 3.71 }]
          }
        },
        {
          "transcript": "This is the second chapter of our story.",
          "accuracy": 100.0,
          "wordErrorRate": 0.0,
          "flagged": false,
          "tags": { "found": 0, "spoken": [] }
        }
      ]
    }
  },
  "creditsUsed": 9,
  "timing": { "decode": 3.1, "checks": 4.8, "total": 8.2 }
}
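A common next step is to turn tracksToRegenerate into a work list. The sketch below maps each flagged file to the checks that flagged it, using only the fields shown in the sample response; `summarize_flags` is a hypothetical helper.

```python
from typing import Any, Dict, List


def summarize_flags(result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each flagged file to the names of the checks that flagged it."""
    flagged: Dict[str, List[str]] = {}
    for entry in result.get("tracksToRegenerate", []):
        flagged[entry["file"]] = [
            reason["check"] for reason in entry.get("reasons", [])
        ]
    return flagged


# A trimmed copy of the sample response above.
sample = {
    "score": 87.2,
    "tracksToRegenerate": [
        {
            "file": "chapter3.mp3",
            "reasons": [
                {"check": "comparison", "message": "deviation 18.00%", "deviation": 0.18}
            ],
        }
    ],
}
print(summarize_flags(sample))  # {'chapter3.mp3': ['comparison']}
```

An empty dictionary means no file was flagged by any enabled check.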
