# Cleanvoice AI

> Cleanvoice is a REST API for automated audio and podcast editing. It removes filler words ("um", "uh"), long silences, stutters, mouth sounds, and breathing from recordings. It also supports noise reduction, studio sound processing, loudness normalization, transcription (Whisper, 99 languages), chapter summarization, and social media content generation.

Base URL: https://api.cleanvoice.ai/v2
Authentication: X-API-Key header
Interactive API reference: https://api.cleanvoice.ai/docs

## Docs

- [Introduction](https://docs.cleanvoice.ai/docs/v2)
- [Python SDK — Quick Start](https://docs.cleanvoice.ai/docs/v2/python/quick-start)
- [Python SDK — Configuration Reference](https://docs.cleanvoice.ai/docs/v2/python/configuration)
- [Python SDK — Uploads](https://docs.cleanvoice.ai/docs/v2/python/uploads)
- [Python SDK — Authentication](https://docs.cleanvoice.ai/docs/v2/python/authentication)
- [Python SDK — Languages](https://docs.cleanvoice.ai/docs/v2/python/languages)
- [Python SDK — Recommendations](https://docs.cleanvoice.ai/docs/v2/python/scalability)
- [Python SDK — SDK Reference](https://docs.cleanvoice.ai/docs/v2/python/sdk-reference)
- [JavaScript SDK — Quick Start](https://docs.cleanvoice.ai/docs/v2/javascript/quick-start)
- [JavaScript SDK — Configuration Reference](https://docs.cleanvoice.ai/docs/v2/javascript/configuration)
- [JavaScript SDK — Uploads](https://docs.cleanvoice.ai/docs/v2/javascript/uploads)
- [JavaScript SDK — Authentication](https://docs.cleanvoice.ai/docs/v2/javascript/authentication)
- [JavaScript SDK — Languages](https://docs.cleanvoice.ai/docs/v2/javascript/languages)
- [JavaScript SDK — Recommendations](https://docs.cleanvoice.ai/docs/v2/javascript/scalability)
- [JavaScript SDK — SDK Reference](https://docs.cleanvoice.ai/docs/v2/javascript/sdk-reference)
- [REST API — Quick Start](https://docs.cleanvoice.ai/docs/v2/rest/quick-start)
- [REST API — Configuration Reference](https://docs.cleanvoice.ai/docs/v2/rest/configuration)
- [REST API — Uploads](https://docs.cleanvoice.ai/docs/v2/rest/uploads)
- [REST API — Create an Edit](https://docs.cleanvoice.ai/docs/v2/rest/edits/create)
- [REST API — Retrieve an Edit](https://docs.cleanvoice.ai/docs/v2/rest/edits/retrieve)
- [REST API — Delete Files](https://docs.cleanvoice.ai/docs/v2/rest/delete-files)
- [REST API — Languages](https://docs.cleanvoice.ai/docs/v2/rest/languages)
- [REST API — Rate Limits](https://docs.cleanvoice.ai/docs/v2/rest/rate-limits)
- [REST API — Recommendations](https://docs.cleanvoice.ai/docs/v2/rest/scalability)
- [Make.com Integration](https://docs.cleanvoice.ai/docs/v2/make)
- [n8n Integration](https://docs.cleanvoice.ai/docs/v2/n8n)

## API endpoints

- POST /v2/edits — submit an edit job; returns { id, status: "PENDING" }
- GET /v2/edits/{edit_id} — poll for status and result
- DELETE /v2/edits/{edit_id} — delete an edit and its associated files
- POST /v2/uploads — get a signed upload URL for a local file
- GET /v2/auth — verify API key
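
As a rough sketch of how the endpoints above combine with the `X-API-Key` header, the snippet below builds (but does not send) requests using only the Python standard library. The helper names `build_request` and `send` are hypothetical, and the JSON `Content-Type` header is an assumption rather than something stated here. Paths are written relative to the base URL, so `/auth` corresponds to `GET /v2/auth`.

```python
import json
import urllib.request

BASE_URL = "https://api.cleanvoice.ai/v2"  # base URL as listed at the top of this file


def build_request(method, path, api_key, body=None):
    """Return an unsent urllib Request that carries the X-API-Key header.

    `path` is relative to BASE_URL, e.g. "/edits" for POST /v2/edits.
    """
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    request.add_header("X-API-Key", api_key)
    if data is not None:
        # Assumption: the API expects JSON request bodies.
        request.add_header("Content-Type", "application/json")
    return request


def send(request):
    """Send the request and decode a JSON response (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build a key-verification request without sending it.
auth_check = build_request("GET", "/auth", "your_key")
print(auth_check.get_method(), auth_check.full_url)
```

Calling `send(auth_check)` would perform the actual request; it is left out here so the sketch runs offline.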

## Edit lifecycle

PENDING → STARTED → SUCCESS (result.url available)
                  ↘ FAILURE (permanent, check error field)
                  ↗ RETRY (auto-retry on transient failure)
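
One way to act on these statuses is sketched below. It assumes the edit is available as a plain dict with `status`, `result.url`, and `error` keys, matching the names used in this file; the exact response shape is not spelled out here, and `summarize_edit` is a hypothetical helper.

```python
IN_PROGRESS = {"PENDING", "STARTED", "RETRY"}


def summarize_edit(edit):
    """Map an edit dict to (action, detail) based on its lifecycle status."""
    status = edit.get("status")
    if status == "SUCCESS":
        # Assumed shape: the output location lives under result.url.
        return ("done", edit["result"]["url"])
    if status == "FAILURE":
        # Permanent failure: surface the error field to the caller.
        return ("failed", edit.get("error"))
    if status in IN_PROGRESS:
        # RETRY is handled by the service itself, so the client just keeps waiting.
        return ("wait", None)
    raise ValueError(f"unexpected status: {status!r}")
```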

## Create an edit — request body

```json
{
  "input": {
    "files": ["https://example.com/episode.mp3"],
    "config": {
      "fillers": { "enabled": true },
      "long_silences": { "enabled": true },
      "normalize": { "enabled": true }
    }
  }
}
```

Multi-track (interview with separate tracks): pass multiple URLs in `files` and set `"upload_type": "multitrack"`.
Batch processing (multiple independent files): submit one POST /v2/edits per file — do NOT pass multiple files in one request unless they are multi-track.
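
The batch rule above can be sketched as a small helper that produces one request body per file, each of which would be sent as its own POST /v2/edits. `build_batch_payloads` is a hypothetical name; the body shape is copied from the example request above.

```python
import copy


def build_batch_payloads(file_urls, config):
    """Return one request body per file, each with exactly one entry in `files`."""
    return [
        # deepcopy keeps the payloads independent if one is modified later
        {"input": {"files": [url], "config": copy.deepcopy(config)}}
        for url in file_urls
    ]


payloads = build_batch_payloads(
    ["https://example.com/ep1.mp3", "https://example.com/ep2.mp3"],
    {"fillers": {"enabled": True}, "normalize": {"enabled": True}},
)
print(len(payloads))  # one payload per independent file
```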

## Deliver result to your own storage

Pass a pre-signed PUT URL as `signed_url` in the request body. Cleanvoice will PUT the cleaned file directly to your storage (S3, GCS, etc.) instead of hosting it.

```json
{
  "input": {
    "files": ["https://example.com/episode.mp3"],
    "signed_url": "https://your-bucket.s3.amazonaws.com/cleaned.mp3?X-Amz-Signature=...",
    "config": { "fillers": { "enabled": true } }
  }
}
```

## Configuration options (REST / Python / JavaScript)

### Audio cleaning (all default: false)
- fillers / fillers=True / fillers: true — remove filler words ("um", "uh", "like", etc.)
- long_silences / long_silences=True / long_silences: true — trim long pauses
- mouth_sounds / mouth_sounds=True / mouth_sounds: true — remove clicks, lip smacks
- breath / breath=True / breath: true — remove audible breathing. Accepts: true (recommended), "legacy" (conservative, for clean audio), "natural" (lighter, preserves more breathing feel), false (disabled)
- stutters / stutters=True / stutters: true — remove repeated word fragments
- hesitations / hesitations=True / hesitations: true — remove short hesitation sounds that aren't full filler words
- muted / muted=True / muted: true — silence edits instead of cutting, preserves original timing

### Audio enhancement
- remove_noise / remove_noise=True / remove_noise: true — reduce background noise. On by default; pass false to disable
- studio_sound / studio_sound=True / studio_sound: true — aggressive studio-quality enhancement. Accepts: true (recommended), "nightly" (advanced/experimental, currently similar to true), false (disabled, default)
- normalize / normalize=True / normalize: true — loudness normalization (default: false)
- keep_music / keep_music=True / keep_music: true — preserve music sections during noise reduction (default: false)
- autoeq / autoeq=True / autoeq: true — legacy automatic EQ. Prefer studio_sound; autoeq will be removed in a future release (default: false)
- mute_lufs / mute_lufs=True / mute_lufs: true — enable LUFS targeting (default: false)
- target_lufs / target_lufs=-16 / target_lufs: -16 — target LUFS. -16 is the standard for podcasts

### Output
- export_format / export_format="mp3" / export_format: "mp3" — audio-only output format: mp3 | wav | flac | m4a | opus | aac | auto (default: auto, matches input). Video jobs keep the original container format
- video / video=True / video: true — must be set to true for video editing. SDKs auto-detect from file extension, but explicit is safer. Without it, video input is treated as audio only
- merge / merge=True / merge: true — merge multi-track files into a single output. Only for multi-track editing
- audio_for_edl / audio_for_edl=True / audio_for_edl: true — return a separate audio track alongside video output (video workflows only)
- signed_url — pre-signed PUT URL to deliver result directly to your storage (S3, GCS, etc.)

### Content generation (all default: false)
- transcription / transcription=True / transcription: true — full transcript (Whisper, auto-detects language)
- summarize / summarize=True / summarize: true — chapters and key learnings (requires transcription)
- social_content / social_content=True / social_content: true — social media posts (requires transcription)
- export_timestamps / export_timestamps=True / export_timestamps: true — include timestamps in edit results
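
Because `summarize` and `social_content` both require `transcription`, a client-side check before submitting can catch a missing flag early. The sketch below works on SDK-style keyword options; `check_content_options` is a hypothetical helper, not part of either SDK.

```python
REQUIRES_TRANSCRIPTION = ("summarize", "social_content")


def check_content_options(options):
    """Raise ValueError if an option that needs transcription is enabled without it."""
    needing = [name for name in REQUIRES_TRANSCRIPTION if options.get(name)]
    if needing and not options.get("transcription"):
        raise ValueError(
            "transcription must be enabled for: " + ", ".join(needing)
        )
    return options
```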

## Recommendations

- breath modes: true (recommended for most audio), "legacy" (conservative, for clean recordings), "natural" (lighter touch). false = disabled (default).
- studio_sound modes: true (recommended), "nightly" (advanced/experimental, similar to true). false = disabled (default). autoeq is legacy; use studio_sound instead.
- muted: set to true to preserve original timing. Edits are silenced instead of cut, so the file keeps the same duration. Useful when syncing with video timelines or subtitle files.

## Common presets

- Noise only: remove_noise + normalize
- Studio polish: remove_noise + studio_sound + normalize
- Full podcast edit: fillers + long_silences + mouth_sounds + breath + stutters + remove_noise + normalize
- Transcript only: transcription
- Full analysis: transcription + summarize + social_content
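
The presets above could be kept as keyword-argument dicts and passed to the Python SDK's `process()`, for example `client.process(url, **preset_options("studio_polish"))`. The `PRESETS` mapping and `preset_options` helper are illustrative names, not part of the SDK.

```python
PRESETS = {
    "noise_only": {"remove_noise": True, "normalize": True},
    "studio_polish": {"remove_noise": True, "studio_sound": True, "normalize": True},
    "full_podcast_edit": {
        "fillers": True,
        "long_silences": True,
        "mouth_sounds": True,
        "breath": True,
        "stutters": True,
        "remove_noise": True,
        "normalize": True,
    },
    "transcript_only": {"transcription": True},
    "full_analysis": {"transcription": True, "summarize": True, "social_content": True},
}


def preset_options(name, **overrides):
    """Return a copy of the named preset with any overrides applied on top."""
    options = dict(PRESETS[name])
    options.update(overrides)
    return options
```

Since `remove_noise` is on by default, a transcript-only job that should leave the audio untouched might use `preset_options("transcript_only", remove_noise=False)`.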

## Python SDK

Install: pip install cleanvoice-sdk
GitHub: https://github.com/cleanvoice/cleanvoice-python

```python
import asyncio

from cleanvoice import Cleanvoice, AsyncCleanvoice

# Sync
client = Cleanvoice.from_env()  # reads CLEANVOICE_API_KEY env var
# or: client = Cleanvoice(api_key="your_key")

# process() blocks until the job finishes and returns the result
result = client.process(
    "https://example.com/episode.mp3",  # URL
    # "/path/to/episode.mp3",           # local file path
    # (numpy_array, sample_rate),        # NumPy array
    fillers=True,
    long_silences=True,
    normalize=True,
)
result.audio.download("cleaned.mp3")

# Async: `await` is only valid inside a coroutine, so wrap the call
async def main():
    async_client = AsyncCleanvoice.from_env()
    return await async_client.process("episode.mp3", fillers=True)

result = asyncio.run(main())

# Lower-level methods
edit_id = client.create_edit("episode.mp3", fillers=True)
edit = client.get_edit(edit_id)          # poll manually
file_url = client.upload_file("/path/to/episode.mp3")  # upload local file
account = client.check_auth()            # verify API key
```

## JavaScript SDK

Install: npm install @cleanvoice/cleanvoice-sdk
GitHub: https://github.com/cleanvoice/cleanvoice-js
Works in: Node.js, Deno, Bun, edge runtimes

```typescript
import { Cleanvoice } from '@cleanvoice/cleanvoice-sdk';

const client = Cleanvoice.fromEnv();  // reads CLEANVOICE_API_KEY
// or: new Cleanvoice({ apiKey: '...' })

const result = await client.process('https://example.com/episode.mp3', {
  fillers: true, long_silences: true, normalize: true,
});
console.log(result.audio.url);
await result.audio.download('cleaned.mp3');

const editId = await client.createEdit('episode.mp3', { fillers: true });
const edit   = await client.getEdit(editId);
const account = await client.checkAuth();
```

## Languages

Audio enhancement (noise reduction, silences, normalization, mouth sounds, breathing, stutters) is language-agnostic — works for all languages.

Filler word detection — confirmed working:
English (en), German (de), French (fr), Dutch (nl), Spanish (es), Italian (it), Portuguese (pt), Romanian (ro), Polish (pl), Arabic (ar), Turkish (tr), Bulgarian (bg)
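
A client can guard the `fillers` option with the confirmed list above. This is a minimal sketch: the `supports_filler_detection` helper is hypothetical and matches only the bare two-letter codes listed here.

```python
# Language codes copied from the confirmed filler-detection list above.
FILLER_LANGUAGES = frozenset(
    {"en", "de", "fr", "nl", "es", "it", "pt", "ro", "pl", "ar", "tr", "bg"}
)


def supports_filler_detection(language_code):
    """Return True if the two-letter code is on the confirmed list."""
    return language_code.strip().lower() in FILLER_LANGUAGES
```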

Transcription — powered by Whisper, supports ~99 languages, auto-detected (no manual language field needed).

## File retention

Edit files are automatically deleted after 7 days. To delete immediately: DELETE /v2/edits/{edit_id}

## Processing time

Expect roughly 30 seconds for a 2–3 minute clip and 5–10 minutes for a 1-hour file. Poll GET /v2/edits/{edit_id} every 10 seconds after an initial 30-second wait.
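
The polling schedule can be sketched as a loop that takes the fetch function as a parameter. Here `get_edit` is assumed to be any callable that returns a dict with a `status` key (for example, a thin wrapper around GET /v2/edits/{edit_id}). The `wait_for_edit` name and the 30-minute `timeout` default are assumptions made for this example.

```python
import time


def wait_for_edit(get_edit, edit_id, initial_wait=30.0, interval=10.0,
                  timeout=1800.0, sleep=time.sleep):
    """Poll until the edit reaches SUCCESS or FAILURE, then return it.

    Waits `initial_wait` seconds before the first poll and `interval`
    seconds between polls; raises TimeoutError once the next wait would
    exceed `timeout` seconds in total.
    """
    sleep(initial_wait)
    waited = initial_wait
    while True:
        edit = get_edit(edit_id)
        if edit["status"] in ("SUCCESS", "FAILURE"):
            return edit
        if waited + interval > timeout:
            raise TimeoutError(
                f"edit {edit_id} still {edit['status']} after {waited:.0f}s"
            )
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.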