# Cleanvoice AI

> Cleanvoice is a REST API for automated audio and podcast editing. It removes filler words ("um", "uh"), long silences, stutters, mouth sounds, and breathing from recordings. Also supports noise reduction, studio sound processing, loudness normalization, transcription (Whisper, 99 languages), chapter summarization, and social media content generation.

- Base URL: https://api.cleanvoice.ai/v2
- Authentication: X-API-Key header
- Interactive API reference: https://api.cleanvoice.ai/docs

## Docs

- [Introduction](https://docs.cleanvoice.ai/docs/v2)
- [Python SDK — Quick Start](https://docs.cleanvoice.ai/docs/v2/python/quick-start)
- [Python SDK — Configuration Reference](https://docs.cleanvoice.ai/docs/v2/python/configuration)
- [Python SDK — Uploads](https://docs.cleanvoice.ai/docs/v2/python/uploads)
- [Python SDK — Authentication](https://docs.cleanvoice.ai/docs/v2/python/authentication)
- [Python SDK — Languages](https://docs.cleanvoice.ai/docs/v2/python/languages)
- [Python SDK — Recommendations](https://docs.cleanvoice.ai/docs/v2/python/scalability)
- [Python SDK — SDK Reference](https://docs.cleanvoice.ai/docs/v2/python/sdk-reference)
- [JavaScript SDK — Quick Start](https://docs.cleanvoice.ai/docs/v2/javascript/quick-start)
- [JavaScript SDK — Configuration Reference](https://docs.cleanvoice.ai/docs/v2/javascript/configuration)
- [JavaScript SDK — Uploads](https://docs.cleanvoice.ai/docs/v2/javascript/uploads)
- [JavaScript SDK — Authentication](https://docs.cleanvoice.ai/docs/v2/javascript/authentication)
- [JavaScript SDK — Languages](https://docs.cleanvoice.ai/docs/v2/javascript/languages)
- [JavaScript SDK — Recommendations](https://docs.cleanvoice.ai/docs/v2/javascript/scalability)
- [JavaScript SDK — SDK Reference](https://docs.cleanvoice.ai/docs/v2/javascript/sdk-reference)
- [REST API — Quick Start](https://docs.cleanvoice.ai/docs/v2/rest/quick-start)
- [REST API — Configuration Reference](https://docs.cleanvoice.ai/docs/v2/rest/configuration)
- [REST API — Uploads](https://docs.cleanvoice.ai/docs/v2/rest/uploads)
- [REST API — Create an Edit](https://docs.cleanvoice.ai/docs/v2/rest/edits/create)
- [REST API — Retrieve an Edit](https://docs.cleanvoice.ai/docs/v2/rest/edits/retrieve)
- [REST API — Delete Files](https://docs.cleanvoice.ai/docs/v2/rest/delete-files)
- [REST API — Languages](https://docs.cleanvoice.ai/docs/v2/rest/languages)
- [REST API — Rate Limits](https://docs.cleanvoice.ai/docs/v2/rest/rate-limits)
- [REST API — Recommendations](https://docs.cleanvoice.ai/docs/v2/rest/scalability)
- [Make.com Integration](https://docs.cleanvoice.ai/docs/v2/make)
- [n8n Integration](https://docs.cleanvoice.ai/docs/v2/n8n)

## API endpoints

- POST /v2/edits — submit an edit job; returns { id, status: "PENDING" }
- GET /v2/edits/{edit_id} — poll for status and result
- DELETE /v2/edits/{edit_id} — delete an edit and its associated files
- POST /v2/uploads — get a signed upload URL for a local file
- GET /v2/auth — verify API key

## Edit lifecycle

- PENDING → STARTED → SUCCESS (result.url available)
- ↘ FAILURE (permanent, check error field)
- ↗ RETRY (auto-retry on transient failure)

## Create an edit — request body

```json
{
  "input": {
    "files": ["https://example.com/episode.mp3"],
    "config": {
      "fillers": { "enabled": true },
      "long_silences": { "enabled": true },
      "normalize": { "enabled": true }
    }
  }
}
```

Multi-track (interview with separate tracks): pass multiple URLs in `files` and set `"upload_type": "multitrack"`.

Batch processing (multiple independent files): submit one POST /v2/edits per file — do NOT pass multiple files in one request unless they are multi-track.

## Deliver result to your own storage

Pass a pre-signed PUT URL as `signed_url` in the request body. Cleanvoice will PUT the cleaned file directly to your storage (S3, GCS, etc.) instead of hosting it.
```json
{
  "input": {
    "files": ["https://example.com/episode.mp3"],
    "signed_url": "https://your-bucket.s3.amazonaws.com/cleaned.mp3?X-Amz-Signature=...",
    "config": {
      "fillers": { "enabled": true }
    }
  }
}
```

## Configuration options (REST / Python / JavaScript)

### Audio cleaning (all default: false)

- fillers / fillers=True / fillers: true — remove filler words ("um", "uh", "like", etc.)
- long_silences / long_silences=True / long_silences: true — trim long pauses
- mouth_sounds / mouth_sounds=True / mouth_sounds: true — remove clicks, lip smacks
- breath / breath=True / breath: true — remove audible breathing. Accepts: true (recommended), "legacy" (conservative, for clean audio), "natural" (lighter, preserves more breathing feel), false (disabled)
- stutters / stutters=True / stutters: true — remove repeated word fragments
- hesitations / hesitations=True / hesitations: true — remove short hesitation sounds that aren't full filler words
- muted / muted=True / muted: true — silence edits instead of cutting, preserves original timing

### Audio enhancement

- remove_noise / remove_noise=True / remove_noise: true — reduce background noise. On by default; pass false to disable
- studio_sound / studio_sound=True / studio_sound: true — aggressive studio-quality enhancement. Accepts: true (recommended), "nightly" (advanced/experimental, currently similar to true), false (disabled, default)
- normalize / normalize=True / normalize: true — loudness normalization (default: false)
- keep_music / keep_music=True / keep_music: true — preserve music sections during noise reduction (default: false)
- autoeq / autoeq=True / autoeq: true — legacy automatic EQ. Prefer studio_sound; autoeq will be removed in a future release (default: false)
- mute_lufs / mute_lufs=True / mute_lufs: true — enable LUFS targeting (default: false)
- target_lufs / target_lufs=-16 / target_lufs: -16 — target LUFS. -16 is the standard for podcasts

### Output

- export_format / export_format="mp3" / export_format: "mp3" — audio-only output format: mp3 | wav | flac | m4a | opus | aac | auto (default: auto, matches input). Video jobs keep the original container format
- video / video=True / video: true — must be set to true for video editing. SDKs auto-detect from file extension, but explicit is safer. Without it, video input is treated as audio only
- merge / merge=True / merge: true — merge multi-track files into a single output. Only for multi-track editing
- audio_for_edl / audio_for_edl=True / audio_for_edl: true — return a separate audio track alongside video output (video workflows only)
- signed_url — pre-signed PUT URL to deliver result directly to your storage (S3, GCS, etc.)

### Content generation (all default: false)

- transcription / transcription=True / transcription: true — full transcript (Whisper, auto-detects language)
- summarize / summarize=True / summarize: true — chapters and key learnings (requires transcription)
- social_content / social_content=True / social_content: true — social media posts (requires transcription)
- export_timestamps / export_timestamps=True / export_timestamps: true — include timestamps in edit results

## Recommendations

- breath modes: true (recommended for most audio), "legacy" (conservative, for clean recordings), "natural" (lighter touch). false = disabled (default).
- studio_sound modes: true (recommended), "nightly" (advanced/experimental, similar to true). false = disabled (default).
- autoeq is legacy — use studio_sound instead.
- muted: set to true to preserve original timing — edits are silenced instead of cut, keeping the file the same duration. Useful when syncing with video timelines or subtitle files.
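The accepted values and dependencies listed under Configuration options can be checked locally before a job is submitted. The following is a minimal sketch, not part of the Cleanvoice SDK: `check_options` and the constant names are hypothetical, and it assumes options are held in a flat dict keyed by the option names shown above (the same form as the Python keyword arguments).

```python
from typing import Any

# Allowed values copied from the option descriptions above.
BREATH_MODES = (True, False, "legacy", "natural")
STUDIO_SOUND_MODES = (True, False, "nightly")
EXPORT_FORMATS = ("mp3", "wav", "flac", "m4a", "opus", "aac", "auto")
# Options documented as "requires transcription".
NEEDS_TRANSCRIPTION = ("summarize", "social_content")


def check_options(options: dict[str, Any]) -> list[str]:
    """Return the problems found in a flat options dict (empty list if none)."""
    problems: list[str] = []

    # summarize and social_content both build on the transcript.
    for name in NEEDS_TRANSCRIPTION:
        if options.get(name) and not options.get("transcription"):
            problems.append(f"{name} requires transcription")

    # Options that accept named modes in addition to true/false.
    if options.get("breath", False) not in BREATH_MODES:
        problems.append(f"breath: unsupported value {options['breath']!r}")
    if options.get("studio_sound", False) not in STUDIO_SOUND_MODES:
        problems.append(
            f"studio_sound: unsupported value {options['studio_sound']!r}"
        )
    if options.get("export_format", "auto") not in EXPORT_FORMATS:
        problems.append(
            f"export_format: unsupported value {options['export_format']!r}"
        )

    # autoeq still works but is documented as legacy.
    if options.get("autoeq"):
        problems.append("autoeq is legacy; prefer studio_sound")

    return problems
```

A caller might run `check_options(opts)` and only pass `opts` on as keyword arguments to the SDK when the returned list is empty. The service remains the authority on what it accepts, so this kind of check only catches obvious mistakes early.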
## Common presets

- Noise only: remove_noise + normalize
- Studio polish: remove_noise + studio_sound + normalize
- Full podcast edit: fillers + long_silences + mouth_sounds + breath + stutters + remove_noise + normalize
- Transcript only: transcription
- Full analysis: transcription + summarize + social_content

## Python SDK

Install: pip install cleanvoice-sdk

GitHub: https://github.com/cleanvoice/cleanvoice-python

```python
import asyncio

from cleanvoice import Cleanvoice, AsyncCleanvoice

# Sync
client = Cleanvoice.from_env()  # reads CLEANVOICE_API_KEY env var
# or: client = Cleanvoice(api_key="your_key")

# process() blocks until the job finishes and returns the result
result = client.process(
    "https://example.com/episode.mp3",  # URL
    # "/path/to/episode.mp3",           # local file path
    # (numpy_array, sample_rate),       # NumPy array
    fillers=True,
    long_silences=True,
    normalize=True,
)
result.audio.download("cleaned.mp3")

# Async: await has to run inside a coroutine
async def main():
    async_client = AsyncCleanvoice.from_env()
    return await async_client.process("episode.mp3", fillers=True)

result = asyncio.run(main())

# Lower-level methods
edit_id = client.create_edit("episode.mp3", fillers=True)
edit = client.get_edit(edit_id)  # poll manually
file_url = client.upload_file("/path/to/episode.mp3")  # upload local file
account = client.check_auth()  # verify API key
```

## JavaScript SDK

Install: npm install @cleanvoice/cleanvoice-sdk

GitHub: https://github.com/cleanvoice/cleanvoice-js

Works in: Node.js, Deno, Bun, edge runtimes

```typescript
import { Cleanvoice } from '@cleanvoice/cleanvoice-sdk';

const client = Cleanvoice.fromEnv(); // reads CLEANVOICE_API_KEY
// or: new Cleanvoice({ apiKey: '...' })

const result = await client.process('https://example.com/episode.mp3', {
  fillers: true,
  long_silences: true,
  normalize: true,
});
console.log(result.audio.url);
await result.audio.download('cleaned.mp3');

const editId = await client.createEdit('episode.mp3', { fillers: true });
const edit = await client.getEdit(editId);
const account = await client.checkAuth();
```

## Languages

Audio enhancement (noise reduction, silences, normalization, mouth sounds, breathing, stutters) is language-agnostic — works for all languages.

Filler word detection — confirmed working: English (en), German (de), French (fr), Dutch (nl), Spanish (es), Italian (it), Portuguese (pt), Romanian (ro), Polish (pl), Arabic (ar), Turkish (tr), Bulgarian (bg)

Transcription — powered by Whisper, supports ~99 languages, auto-detected (no manual language field needed).

## File retention

Edit files are automatically deleted after 7 days. To delete immediately: DELETE /v2/edits/{edit_id}

## Processing time

~30 seconds for a 2–3 minute clip. 5–10 minutes for a 1-hour file. Poll GET /v2/edits/{edit_id} every 10 seconds after an initial 30-second wait.
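The polling guidance above (wait 30 seconds, then poll every 10 seconds) can be sketched with the standard library alone. This is an illustrative sketch rather than the SDK's implementation: `fetch_edit`, `wait_for_edit`, and the `max_polls` cap are hypothetical names and choices, and the code assumes the GET response is a JSON object with a top-level `status` field, as the POST response is described earlier in this document.

```python
import json
import time
import urllib.request
from typing import Any, Callable

BASE_URL = "https://api.cleanvoice.ai/v2"
# SUCCESS and FAILURE end the job; PENDING, STARTED and RETRY do not.
TERMINAL = {"SUCCESS", "FAILURE"}


def fetch_edit(edit_id: str, api_key: str) -> dict[str, Any]:
    """GET /v2/edits/{edit_id} with the X-API-Key header; return decoded JSON."""
    request = urllib.request.Request(
        f"{BASE_URL}/edits/{edit_id}", headers={"X-API-Key": api_key}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_edit(
    fetch: Callable[[], dict[str, Any]],
    initial_wait: float = 30.0,
    interval: float = 10.0,
    max_polls: int = 120,  # assumption: give up after about 20 minutes
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the edit reaches SUCCESS or FAILURE; return that response."""
    sleep(initial_wait)
    for _ in range(max_polls):
        edit = fetch()
        if edit.get("status") in TERMINAL:
            return edit
        # Still in progress (PENDING, STARTED or RETRY): wait and ask again.
        sleep(interval)
    raise TimeoutError(f"edit still running after {max_polls} polls")
```

A caller would typically write `edit = wait_for_edit(lambda: fetch_edit(edit_id, api_key))` and then branch on `edit["status"]`. Passing `fetch` and `sleep` in as parameters keeps the loop testable without network access or real delays.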