Editing & Results

Run an edit with client.process(), understand the job lifecycle, and work with the result object.

How it works

When you call client.process(), the SDK runs the full lifecycle for you:

Your file → Upload (if needed) → Submit job → Poll for completion → Return result
  1. Upload — if you passed a local path or NumPy array, the SDK uploads the file first.
  2. Submit — the job is sent to the Cleanvoice API (POST /v2/edits).
  3. Poll — the SDK polls the job status every few seconds until it finishes.
  4. Return — once the status is SUCCESS, you get back a result object.

client.process() blocks until the result is ready. Processing typically takes ~30 seconds for a 2–3 minute clip and 5–10 minutes for a 1-hour file.
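
The four steps above can be sketched as a plain function. This is an illustrative outline only, not the SDK's implementation: `backend` and its `upload`, `submit`, `status`, and `result` methods are hypothetical stand-ins, the 5-second interval is an assumed value for "every few seconds", and the sketch handles only string sources (a URL or a local path).

```python
import time
from typing import Any, Callable


def process_sketch(
    backend: Any,
    source: str,
    *,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    **options: Any,
) -> Any:
    """Illustrative outline of upload -> submit -> poll -> return.

    `backend` is a hypothetical object, not part of the Cleanvoice SDK.
    """
    # 1. Upload: only needed when the source is not already a URL.
    if source.startswith(("http://", "https://")):
        file_url = source
    else:
        file_url = backend.upload(source)

    # 2. Submit: send the job and keep its ID.
    job_id = backend.submit(file_url, options)

    # 3. Poll: check the status until the job reaches a terminal state.
    while True:
        status = backend.status(job_id)
        if status == "SUCCESS":
            # 4. Return: hand back the finished result.
            return backend.result(job_id)
        if status == "FAILURE":
            raise RuntimeError(f"job {job_id} failed")
        sleep(poll_interval)
```

Because the loop only exits on a terminal status, the call blocks for as long as the job runs, which matches the blocking behavior described above.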


Running an edit

Pass your file source as the first argument, followed by any editing options:

from cleanvoice import Cleanvoice

client = Cleanvoice.from_env()

result = client.process(
    "https://example.com/episode.mp3",  # or local path, or (numpy_array, sample_rate)
    fillers=True,        # remove "um", "uh", filler words
    long_silences=True,  # trim long pauses
    studio_sound=True,   # studio-quality audio enhancement
    normalize=True,      # consistent loudness
)

For the full list of options, see the Configuration Reference.

Common presets

# Enhance and normalize the audio
result = client.process(
    "episode.mp3",
    studio_sound=True,
    normalize=True,
)

If studio_sound is too aggressive, use remove_noise=True instead.

# Enhance the audio and also target fillers, long silences, mouth sounds, breaths, and stutters
result = client.process(
    "episode.mp3",
    studio_sound=True,
    normalize=True,
    fillers=True,
    long_silences=True,
    mouth_sounds=True,
    breath=True,
    stutters=True,
)

# Trim long pauses and normalize loudness
result = client.process(
    "episode.mp3",
    long_silences=True,
    normalize=True,
)

# Generate a transcript, a summary, and social content
result = client.process(
    "episode.mp3",
    transcription=True,
    summarize=True,       # chapters + key learnings (enables transcription)
    social_content=True,  # tweets, LinkedIn posts, show notes (enables summarize)
)
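
If you reuse the same combinations often, one option is to keep them in a dictionary and unpack the chosen one into the call. The sketch below is a convenience pattern, not an SDK feature: the preset names (`enhance`, `full_cleanup`, `tighten`, `text`) and the `preset_options` helper are hypothetical, while the option flags are the ones used in the examples above.

```python
from typing import Dict

# Hypothetical preset names; the flags come from the examples above.
PRESETS: Dict[str, Dict[str, bool]] = {
    "enhance": {"studio_sound": True, "normalize": True},
    "full_cleanup": {
        "studio_sound": True,
        "normalize": True,
        "fillers": True,
        "long_silences": True,
        "mouth_sounds": True,
        "breath": True,
        "stutters": True,
    },
    "tighten": {"long_silences": True, "normalize": True},
    "text": {"transcription": True, "summarize": True, "social_content": True},
}


def preset_options(name: str, **overrides: bool) -> Dict[str, bool]:
    """Return a copy of a preset's options with per-call overrides applied."""
    try:
        options = dict(PRESETS[name])  # copy so the preset itself is never mutated
    except KeyError:
        raise ValueError(f"unknown preset: {name!r}") from None
    options.update(overrides)
    return options
```

A call would then look like `client.process("episode.mp3", **preset_options("enhance", fillers=True))`.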

The result object

client.process() returns a result object. Here is what it contains:

result = client.process(
    "episode.mp3",
    fillers=True,
    transcription=True,
    summarize=True,
    social_content=True,
)

# Cleaned audio URL (hosted by Cleanvoice, valid for 7 days)
print(result.audio.url)

# Flattened transcript view (if transcription=True)
print(result.transcript.text)
print(result.transcript.paragraphs[0].text)
print(result.transcript.detailed.words[0].text)

# Summary fields are also mirrored onto result.transcript when summarize=True
print(result.transcript.summary)
print(result.transcript.title)
print(result.transcript.chapters)

# Full summarization object (if summarize=True)
print(result.summarization.title)
print(result.summarization.summary)
print(result.summarization.chapters)
print(result.summarization.key_learnings)
print(result.summarization.summary_of_summary)
print(result.summarization.episode_description)

# Social content (if social_content=True)
print(result.social_content.newsletter)
print(result.social_content.twitter_thread)
print(result.social_content.linkedin)

# Task ID — use this if you need to correlate progress or logs later
print(result.task_id)

result.media is an alias of result.audio — useful for video workflows:

print(result.media.url)   # same as result.audio.url
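
Several of these fields exist only when the matching option was enabled, so it can help to check what is present before reading nested attributes. The helper below is a minimal sketch that assumes a section which was not requested is either missing or `None`; that behavior is an assumption, not something this page confirms, and `available_sections` is a hypothetical name.

```python
from typing import Any, List

# Top-level sections of the result object described above.
SECTION_NAMES = ("audio", "transcript", "summarization", "social_content")


def available_sections(result: Any) -> List[str]:
    """List the result sections that are present and not None.

    Assumes (unverified) that sections you did not request are absent or None.
    """
    return [name for name in SECTION_NAMES if getattr(result, name, None) is not None]
```

For example, `if "summarization" in available_sections(result):` guards a block that reads `result.summarization.title`.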

Text output shapes

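When you enable transcription, summarize, or social_content, the result returned by process() contains the fields below. The shapes are written schematically as dictionaries; in code, read them as attributes, as in the examples above: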

# result.transcript
{
    "text": str,
    "paragraphs": [
        {"start": float, "end": float, "text": str},
    ],
    "detailed": {
        "words": [
            {"id": int, "start": float, "end": float, "text": str},
        ],
        "paragraphs": [
            {"id": int, "start": float, "end": float, "speaker": str},
        ],
    },
    "summary": str | None,
    "title": str | None,
    "chapters": [{"start": float, "title": str}] | None,
    "summarization": {
        "title": str,
        "summary": str,
        "chapters": [{"start": float, "title": str}],
        "summaries": [str],
        "key_learnings": str,
        "summary_of_summary": str,
        "episode_description": str,
    } | None,
}

# result.summarization
{
    "title": str,
    "summary": str,
    "chapters": [{"start": float, "title": str}],
    "summaries": [str],
    "key_learnings": str,
    "summary_of_summary": str,
    "episode_description": str,
}

# result.social_content
{
    "newsletter": str,
    "twitter_thread": str,
    "linkedin": str,
}

client.process() returns the SDK-shaped result shown above. If you use client.get_edit(), you get the raw API polling payload instead, where the fields live under edit.result.transcription, edit.result.summarization, and edit.result.social_content.
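
As an example of working with the chapters shape, the sketch below turns a chapter list into timestamped lines for show notes. It assumes `start` is expressed in seconds, which this page does not state, and it reads the chapters as dictionaries as drawn in the shapes above; if the SDK hands you objects, swap `c["start"]` for `c.start`. Both function names are hypothetical.

```python
from typing import Any, List, Mapping, Optional, Sequence


def format_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS, or H:MM:SS once the value reaches an hour."""
    total = int(seconds)  # drop the fractional part
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def chapter_lines(chapters: Optional[Sequence[Mapping[str, Any]]]) -> List[str]:
    """Build one "timestamp title" line per chapter; None yields an empty list."""
    return [f"{format_timestamp(c['start'])} {c['title']}" for c in chapters or []]
```

Handling `None` matters because `chapters` is optional in the transcript shape.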


Downloading the result

Save to a file

result.audio.download("cleaned.mp3")

The output format is set by export_format; by default it matches the input format.

Convenience: process and download in one call

result, saved_path = client.process_and_download(
    "episode.mp3",
    "cleaned.mp3",
    fillers=True,
    normalize=True,
)
print(saved_path)  # → "cleaned.mp3"

Save path inline with output_path

result = client.process(
    "episode.mp3",
    normalize=True,
    output_path="cleaned.mp3",  # downloaded automatically on completion
)

Download as a NumPy array

cleaned_audio, sample_rate = result.download_audio(as_numpy=True)

Deliver directly to your own storage

Pass a pre-signed PUT URL and Cleanvoice will upload the cleaned file directly to your bucket — nothing goes through your server.

result = client.process(
    "episode.mp3",
    normalize=True,
    signed_url="https://your-bucket.s3.amazonaws.com/cleaned.mp3?X-Amz-Signature=...",
)

Generate the pre-signed PUT URL on your backend before calling process().
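
For an S3 bucket, one way to produce such a URL is boto3's `generate_presigned_url`. The sketch below assumes you use boto3 and pass in a client created with `boto3.client("s3")`; the helper name, bucket, and key are hypothetical, and the one-hour expiry is an assumed default.

```python
from typing import Any


def make_put_url(s3_client: Any, bucket: str, key: str, expires_in: int = 3600) -> str:
    """Create a pre-signed PUT URL for a single object.

    `s3_client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    `expires_in` is the URL lifetime in seconds (assumed default: one hour).
    """
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Choose an expiry longer than the expected processing time, since the upload to your bucket can only happen after the edit has finished.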


Manual job control (advanced)

If you need to submit a job and retrieve the result separately — for example in a background worker, serverless function, or retry logic — use create_edit() and get_edit() instead of process().

Prefer client.process() for most use cases, since it handles polling automatically. Use create_edit() and get_edit() only when you need to decouple submission from retrieval.

1. Submit the job

edit_id = client.create_edit(
    "https://example.com/episode.mp3",
    fillers=True,
    normalize=True,
)
print("Job submitted:", edit_id)

2. Poll for completion

import time

while True:
    edit = client.get_edit(edit_id)
    print("Status:", edit.status)

    if edit.status == "SUCCESS":
        print("Download URL:", edit.result.download_url)
        break
    elif edit.status == "FAILURE":
        print("Processing failed")
        break

    time.sleep(5)

Job statuses:

Status          Meaning
PENDING         Queued, not yet started
PREPROCESSING   Input analysis and setup is in progress
CLASSIFICATION  Cleanvoice is classifying events in the media
EDITING         The main edit pass is running
POSTPROCESSING  Final cleanup and result assembly is running
EXPORT          Output files are being written
STARTED         Processing has started
PROCESSING      Generic in-progress state from the API
QUEUED          Waiting for a worker to pick up the job
SUCCESS         Complete: edit.result.download_url is available
FAILURE         Permanent failure
RETRY           Transient failure, will be retried automatically
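
A polling loop with no upper bound can hang a worker, so you may want to wrap it in a helper with a timeout. The sketch below relies only on `client.get_edit()` and `edit.status` as shown above. It treats SUCCESS and FAILURE as terminal and every other status, including RETRY, as still in progress. The `wait_for_edit` name, the 5-second interval, and the 30-minute timeout are assumptions.

```python
import time
from typing import Any, Callable

# Statuses after which the job will not change again (from the table above).
TERMINAL_STATUSES = frozenset({"SUCCESS", "FAILURE"})


def wait_for_edit(
    client: Any,
    edit_id: str,
    *,
    interval: float = 5.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll get_edit() until the job is terminal or the timeout elapses.

    Returns the last edit object; the caller still checks edit.status.
    Raises TimeoutError if the job is not terminal before the deadline.
    """
    deadline = clock() + timeout
    while True:
        edit = client.get_edit(edit_id)
        if edit.status in TERMINAL_STATUSES:
            return edit
        if clock() >= deadline:
            raise TimeoutError(f"edit {edit_id} still {edit.status} after {timeout} seconds")
        sleep(interval)
```

The `sleep` and `clock` parameters exist so the helper can be tested without real waiting; in normal use, leave them at their defaults.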