Cleanvoice Docs
Python SDK

Recommendations

Recommended patterns for high-volume and production audio processing with the Python SDK.

Choosing the right breath mode

Value      When to use
True       Recommended for most recordings. Best default for noisy or challenging audio.
"legacy"   Source audio is already clean and well-recorded. Conservative, predictable removal.
"natural"  You want a lighter touch that preserves more of the speaker's natural breathing.
False      Leave breathing untouched (default).
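The table above can be encoded as a small helper for choosing a setting programmatically. This is an illustrative sketch, not part of the SDK: the function name and its keyword flags are hypothetical; only the returned values are real options from the table.

```python
def pick_breath_setting(remove=True, already_clean=False, preserve_breathing=False):
    """Map recording conditions to a breath value, following the table above.

    Hypothetical helper -- only the returned values are real options.
    """
    if not remove:
        return False       # leave breathing untouched (default)
    if preserve_breathing:
        return "natural"   # lighter touch, keeps more natural breathing
    if already_clean:
        return "legacy"    # conservative, predictable removal
    return True            # recommended default for most recordings
```

Pass the result as the breath option on your processing call (assuming that parameter name in your SDK version).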

Choosing the right studio_sound mode

Value      When to use
True       Recommended. Aggressive enhancement for studio-quality output.
"nightly"  Advanced/experimental variant. Currently behaves similarly to True.
False      Disabled (default). Use remove_noise alone for lighter cleanup.

autoeq is legacy and will be removed in a future release. Use studio_sound instead.
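Following the deprecation note above, a minimal sketch of migrating a settings dictionary off autoeq before submitting a job. The helper name is hypothetical; the two option keys come from this page.

```python
def migrate_autoeq(options: dict) -> dict:
    """Replace the legacy autoeq flag with studio_sound (hypothetical helper)."""
    options = dict(options)  # don't mutate the caller's dict
    if options.pop("autoeq", False):
        # autoeq is deprecated; studio_sound is its replacement.
        # setdefault keeps an explicit studio_sound value if one is already set.
        options.setdefault("studio_sound", True)
    return options
```

Run your stored job presets through this once, then drop autoeq from your codebase.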


Preserve original timing with muted

If you need the output to keep the exact same duration as the input — for example, to stay in sync with a video timeline or subtitle file — set muted=True. Edits are silenced instead of cut, so no content shifts in time.
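A toy model of the difference, treating edits as (start, end) sample ranges. This function is illustrative only, not part of the SDK:

```python
def apply_edits(samples, edit_ranges, muted):
    """Cut edited ranges out, or silence them in place when muted=True."""
    if muted:
        out = list(samples)
        for start, end in edit_ranges:
            out[start:end] = [0.0] * (end - start)
        return out  # same length as the input: timing is preserved
    # cut mode: edited ranges are removed, shifting everything after them
    cut = set()
    for start, end in edit_ranges:
        cut.update(range(start, end))
    return [s for i, s in enumerate(samples) if i not in cut]
```

With muted=True the output length always equals the input length, which is what keeps video timelines and subtitle files in sync.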


Video: send audio only when you're enhancing, not editing

When you want to edit a video — remove filler words, cut silences, trim stutters — you must send the video file directly. Cleanvoice applies cuts to the video timeline and returns a shortened video with the edits applied.

In the Python SDK, common video filenames and URLs are auto-detected, but explicit video=True is still safest for ambiguous or extensionless URLs.
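To see why extensionless URLs are ambiguous, consider what extension-based detection can do. The sketch below is illustrative (the extension set is an assumption, not the SDK's actual list): it has nothing to go on for an extensionless URL, which is exactly when an explicit video=True is the reliable signal.

```python
import os
from urllib.parse import urlparse

# Illustrative set -- not the SDK's actual detection list
VIDEO_EXTENSIONS = {".mp4", ".mov", ".mkv", ".webm", ".avi"}

def looks_like_video(url: str) -> bool:
    """Guess by file extension alone; fails on extensionless URLs."""
    path = urlparse(url).path
    return os.path.splitext(path)[1].lower() in VIDEO_EXTENSIONS
```
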

When you only want to enhance audio (noise reduction, studio sound, normalization) without any cuts, the video stream is irrelevant. Extract the audio, process the small file, then mux the cleaned audio back. The upload can be 10–50× smaller and you get the same result.

Job type                                           What to send
Fillers, silences, stutters, mouth sounds, breath  Send the video file directly
Noise reduction, studio sound, normalization only  Extract audio → process → mux back

Enhancement-only pattern:

# 1. Extract audio (no re-encode, no quality loss)
ffmpeg -i input.mp4 -vn -acodec copy audio.aac

# 2. Send audio.aac to Cleanvoice, get back cleaned.aac

# 3. Mux cleaned audio back into the original video container
ffmpeg -i input.mp4 -i cleaned.aac -c:v copy -c:a copy -map 0:v:0 -map 1:a:0 output.mp4

In Python:

import subprocess
from cleanvoice import Cleanvoice

client = Cleanvoice.from_env()

# Step 1: extract audio
subprocess.run(
    ["ffmpeg", "-i", "input.mp4", "-vn", "-acodec", "copy", "audio.aac", "-y"],
    check=True,
)

# Step 2: enhance audio only (no editing/cuts)
result = client.process(
    "audio.aac",
    remove_noise=True,
    studio_sound=True,
    normalize=True,
    output_path="cleaned.aac",
)

# Step 3: mux back — video is untouched, only audio is replaced
subprocess.run([
    "ffmpeg", "-i", "input.mp4", "-i", "cleaned.aac",
    "-c:v", "copy", "-c:a", "copy",
    "-map", "0:v:0", "-map", "1:a:0",
    "output.mp4", "-y",
], check=True)

-acodec copy extracts the audio stream without re-encoding it. -c:v copy -c:a copy in the mux step remuxes without transcoding. No quality loss at either step.


Deliver results directly to your storage

By default, Cleanvoice hosts the cleaned file and you download it from our servers. At scale, this means every file travels the full round trip: your servers → Cleanvoice → your servers.

Use signed_url to eliminate the return leg. Generate a pre-signed PUT URL for your S3/GCS/Azure bucket before submitting the job. Cleanvoice will PUT the cleaned file directly into your storage — you never need to download it.

import boto3
from cleanvoice import Cleanvoice

client = Cleanvoice.from_env()
s3 = boto3.client("s3")

# Generate a pre-signed PUT URL (valid for 1 hour)
signed_url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-bucket", "Key": "cleaned/episode.mp3"},
    ExpiresIn=3600,
)

result = client.process(
    "audio.aac",
    fillers=True,
    normalize=True,
    signed_url=signed_url,
)
# The cleaned file is now at s3://my-bucket/cleaned/episode.mp3
# No download needed — result.audio.url points to your own bucket

The signed_url must be a PUT URL, not a GET URL. Most S3-compatible storage providers support pre-signed PUT URLs. The URL must remain valid until Cleanvoice finishes processing — use at least a 1-hour expiry for long files.


Upload once, reuse across requests

If you need to process the same file multiple times with different settings (or retry a failed job), upload it once and reuse the URL. The uploaded URL is not public — it's a private Cleanvoice storage URL only accessible by your API key.

# Upload once
file_url = client.upload_file("episode.mp3")
# → "https://storage.cleanvoice.ai/uploads/your-key/..."

# Reuse for multiple jobs — no re-upload needed
edit_id_1 = client.create_edit(file_url, fillers=True, normalize=True)
edit_id_2 = client.create_edit(file_url, transcription=True)
edit_id_3 = client.create_edit(file_url, remove_noise=True, studio_sound=True)

Uploaded files are retained for 7 days, then deleted automatically. You can also delete them early with DELETE /v2/edits/{edit_id}.


Batch processing

Submit all jobs first, then poll — never wait serially between files.

from cleanvoice import Cleanvoice
import time

client = Cleanvoice.from_env()

files = ["ep1.mp3", "ep2.mp3", "ep3.mp3"]

# Submit all jobs immediately
edit_ids = [
    client.create_edit(f, fillers=True, normalize=True)
    for f in files
]

# Poll for completion
results = {}
pending = list(zip(edit_ids, files))

while pending:
    still_pending = []
    for edit_id, filename in pending:
        edit = client.get_edit(edit_id)
        if edit.status == "SUCCESS":
            results[filename] = edit
        elif edit.status == "FAILURE":
            print(f"Failed: {filename}")
        else:
            still_pending.append((edit_id, filename))
    pending = still_pending
    if pending:
        time.sleep(10)

print(f"Done: {len(results)}/{len(files)}")

Do not pass multiple files in one process() call to simulate batching — that activates multi-track mode (for interviews recorded on separate mics). Submit one job per file.
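The polling loop above sleeps a fixed 10 seconds between rounds. For large batches, a variant with exponential backoff and an overall timeout is gentler on the API. The helper below is a sketch: the status checker is injected, so you can pass something like lambda eid: client.get_edit(eid).status; the status strings match the example above.

```python
import time

def wait_for_edits(get_status, edit_ids, poll=5.0, max_poll=60.0, timeout=3600.0):
    """Poll until every edit reaches SUCCESS or FAILURE, backing off between rounds.

    get_status(edit_id) must return the job's status string.
    Returns {edit_id: final_status}; raises TimeoutError on the deadline.
    """
    deadline = time.monotonic() + timeout
    pending = set(edit_ids)
    done = {}
    while pending:
        if time.monotonic() > deadline:
            raise TimeoutError(f"{len(pending)} edits still pending")
        for edit_id in list(pending):
            status = get_status(edit_id)
            if status in ("SUCCESS", "FAILURE"):
                done[edit_id] = status
                pending.discard(edit_id)
        if pending:
            time.sleep(poll)
            poll = min(poll * 2, max_poll)  # exponential backoff, capped
    return done
```

Because the checker is injected, the same helper works unchanged if you later swap clients or add per-file retry logic around it.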