Python SDK

The official Cleanvoice Python SDK handles authentication, file uploads, job polling, and audio downloads, so you can focus on building.

Installation

pip install cleanvoice-sdk

Initialization

from cleanvoice import Cleanvoice

# Using an explicit API key
client = Cleanvoice(api_key="YOUR_API_KEY")

# Or read from the CLEANVOICE_API_KEY environment variable
client = Cleanvoice.from_env()

Custom base URL and timeout:

client = Cleanvoice(
    api_key="YOUR_API_KEY",
    base_url="https://api.cleanvoice.ai/v2",
    timeout=120,
)

client.process()

Submit a file and wait for the result. This is the recommended method for most use cases.

result = client.process(
    file_input,          # URL string, local path, or (numpy_array, sample_rate)
    fillers=True,
    long_silences=True,
    mouth_sounds=True,
    breath=True,
    stutters=True,
    remove_noise=True,
    studio_sound=False,
    normalize=True,
    transcription=False,
    summarize=False,
    social_content=False,
    export_format="mp3",  # "mp3", "wav", "flac", "m4a"
    output_path=None,     # save directly to a file
    progress_callback=None,
)

Parameters

Parameter          Type         Description
file_input         str | tuple  URL, local file path, or (numpy_array, sample_rate)
fillers            bool         Remove filler words
long_silences      bool         Trim long silences
mouth_sounds       bool         Remove mouth noises
breath             bool         Remove audible breathing
stutters           bool         Remove stutters
remove_noise       bool         Reduce background noise
studio_sound       bool         Apply studio sound enhancement
normalize          bool         Normalize loudness
mute_lufs          float        Gate level for LUFS measurement (e.g. -120)
target_lufs        float        Target LUFS (e.g. -16.0)
transcription      bool         Return transcript
summarize          bool         Generate summary and chapters
social_content     bool         Generate social media copy
export_format      str          Output format (mp3, wav, flac, m4a)
output_path        str          Save audio to this path automatically
progress_callback  callable     Called with a dict payload containing status, result, edit_id, and attempt
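
Because these options are plain keyword arguments, a set of them can be kept in a dictionary and unpacked into client.process(). The sketch below assumes that usage. The PRESETS table and the build_options helper are hypothetical names for illustration and are not part of the SDK.

```python
# Hypothetical presets, built only from parameter names documented above.
PRESETS = {
    "podcast": {
        "fillers": True,
        "long_silences": True,
        "mouth_sounds": True,
        "breath": True,
        "normalize": True,
        "target_lufs": -16.0,
    },
    "transcript_only": {
        "transcription": True,
        "summarize": True,
    },
}


def build_options(preset, **overrides):
    """Return a new options dict: the named preset plus any overrides."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    options = dict(PRESETS[preset])  # copy so the preset itself is never mutated
    options.update(overrides)
    return options


options = build_options("podcast", export_format="wav")
# With a real client: result = client.process("episode.mp3", **options)
```

Keeping presets in one place makes it easier to apply the same settings to every episode of a show.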

Working with the result

result = client.process(
    "episode.mp3",
    fillers=True,
    transcription=True,
    summarize=True,
    social_content=True,
)

# Download audio to a file
result.audio.download("cleaned.mp3")

# Or get as a numpy array
audio_array, sample_rate = result.download_audio(as_numpy=True)

# Access transcript (if transcription=True)
if result.transcript:
    print(result.transcript.text)
    print(result.transcript.paragraphs[0].text)
    print(result.transcript.detailed.words[0].text)
    print(result.transcript.summary)
    print(result.transcript.title)
    print(result.transcript.chapters)

# Access summary (if summarize=True)
if result.summarization:
    print(result.summarization.title)
    print(result.summarization.summary)
    print(result.summarization.chapters)
    print(result.summarization.key_learnings)
    print(result.summarization.summary_of_summary)
    print(result.summarization.episode_description)

# Access social content (if social_content=True)
if result.social_content:
    print(result.social_content.newsletter)
    print(result.social_content.twitter_thread)
    print(result.social_content.linkedin)

Text output shapes

# result.transcript
{
    "text": str,
    "paragraphs": [{"start": float, "end": float, "text": str}],
    "detailed": {
        "words": [{"id": int, "start": float, "end": float, "text": str}],
        "paragraphs": [{"id": int, "start": float, "end": float, "speaker": str}],
    },
    "summary": str | None,
    "title": str | None,
    "chapters": [{"start": float, "title": str}] | None,
    "summarization": {
        "title": str,
        "summary": str,
        "chapters": [{"start": float, "title": str}],
        "summaries": [str],
        "key_learnings": str,
        "summary_of_summary": str,
        "episode_description": str,
    } | None,
}

# result.summarization
{
    "title": str,
    "summary": str,
    "chapters": [{"start": float, "title": str}],
    "summaries": [str],
    "key_learnings": str,
    "summary_of_summary": str,
    "episode_description": str,
}

# result.social_content
{
    "newsletter": str,
    "twitter_thread": str,
    "linkedin": str,
}
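
The chapters lists above can be turned into show notes with a small formatter. This is a minimal sketch that assumes start is an offset in seconds and that each chapter is a plain dict shaped as shown. If the SDK returns objects instead, switch to attribute access. format_timestamp and format_chapters are hypothetical helpers, not SDK functions.

```python
def format_timestamp(seconds):
    """Format a non-negative offset in seconds as H:MM:SS."""
    total = int(seconds)  # drop the fractional part
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def format_chapters(chapters):
    """Turn [{"start": float, "title": str}, ...] into printable lines."""
    if not chapters:  # chapters can be None when no summary was requested
        return []
    return [f"{format_timestamp(c['start'])} {c['title']}" for c in chapters]


sample = [
    {"start": 0.0, "title": "Intro"},
    {"start": 754.2, "title": "Interview"},
]
lines = format_chapters(sample)
print("\n".join(lines))
```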

client.create_edit()

Submit a job without waiting for completion. Returns the edit_id for later polling.

edit_id = client.create_edit(
    "https://example.com/episode.mp3",
    fillers=True,
    long_silences=True,
)
print("Edit ID:", edit_id)

client.get_edit()

Retrieve the current status and result of a previously created edit.

edit = client.get_edit("edit_abc123")
# Possible statuses: PENDING, PREPROCESSING, CLASSIFICATION, EDITING,
# POSTPROCESSING, EXPORT, SUCCESS, FAILURE, RETRY
print(edit.status)

if edit.status == "SUCCESS":
    print(edit.result.download_url)
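
When a job is submitted with create_edit(), get_edit() has to be called repeatedly until the job finishes. The loop below is one way to do that. It assumes SUCCESS and FAILURE are the only terminal statuses and treats every other status, including RETRY, as still in progress. wait_for_edit and the stand-in classes are hypothetical and exist only so the sketch runs without network access.

```python
import time

# Assumption: only these two statuses mean the job has stopped.
TERMINAL_STATUSES = {"SUCCESS", "FAILURE"}


def wait_for_edit(client, edit_id, interval=5.0, timeout=600.0):
    """Poll client.get_edit(edit_id) until a terminal status or the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        edit = client.get_edit(edit_id)
        if edit.status in TERMINAL_STATUSES:
            return edit
        if time.monotonic() >= deadline:
            raise TimeoutError(f"edit {edit_id} still {edit.status} after {timeout}s")
        time.sleep(interval)


# Stand-ins so the sketch runs offline; use a real Cleanvoice client instead.
class FakeEdit:
    def __init__(self, status):
        self.status = status


class FakeClient:
    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_edit(self, edit_id):
        return FakeEdit(next(self._statuses))


fake = FakeClient(["PENDING", "EDITING", "SUCCESS"])
edit = wait_for_edit(fake, "edit_abc123", interval=0)
print(edit.status)
```

The caller still needs to check whether the returned status is SUCCESS or FAILURE before reading the result.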

client.upload_file()

Upload a local file and get back a remote URL you can use in edit requests.

remote_url = client.upload_file(
    "/path/to/episode.mp3",
    filename="episode.mp3",  # optional
)
print("Uploaded to:", remote_url)
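
upload_file() and create_edit() can be combined so that local files and remote URLs go through the same submission path. The sketch below assumes create_edit() expects a URL, as in its example above. The submit helper and the stand-in client are hypothetical and only illustrate the call order.

```python
def submit(client, source, **options):
    """Create an edit from a URL or a local path and return the edit ID."""
    if source.startswith(("http://", "https://")):
        url = source  # already remote, no upload needed
    else:
        url = client.upload_file(source)  # local path: upload first
    return client.create_edit(url, **options)


# Stand-in client so the sketch runs offline; use a real Cleanvoice client instead.
class FakeClient:
    def __init__(self):
        self.uploaded = []
        self.last_request = None

    def upload_file(self, path):
        self.uploaded.append(path)
        return "https://example.com/uploads/episode.mp3"

    def create_edit(self, url, **options):
        self.last_request = (url, options)
        return "edit_abc123"


client = FakeClient()
edit_id = submit(client, "/path/to/episode.mp3", fillers=True)
print("Edit ID:", edit_id)
```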

client.check_auth()

Verify your API key and retrieve account information.

account = client.check_auth()
print(account)

client.process_and_download()

Convenience method that processes a file and saves the result in one call.

result, saved_path = client.process_and_download(
    "episode.mp3",
    "cleaned.mp3",
    fillers=True,
    long_silences=True,
)
print(saved_path)

Async client

All methods are available in an async variant via AsyncCleanvoice:

import asyncio
from cleanvoice import AsyncCleanvoice

async def main():
    async with AsyncCleanvoice.from_env() as client:
        result = await client.process(
            "https://example.com/episode.mp3",
            fillers=True,
        )
        await result.download_audio_async("cleaned.mp3")

asyncio.run(main())
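
An async client makes it possible to process several files at once. The sketch below caps the number of jobs in flight with a semaphore. It assumes a single AsyncCleanvoice instance can serve concurrent process() calls, which this page does not confirm. process_all and the stand-in client are hypothetical.

```python
import asyncio


async def process_all(client, sources, max_concurrent=3, **options):
    """Process several inputs concurrently, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def process_one(source):
        async with semaphore:
            return await client.process(source, **options)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(process_one(s) for s in sources))


# Stand-in client so the sketch runs offline; use AsyncCleanvoice instead.
class FakeAsyncClient:
    async def process(self, source, **options):
        await asyncio.sleep(0)  # yield to the event loop like a real request
        return f"processed:{source}"


results = asyncio.run(
    process_all(FakeAsyncClient(), ["a.mp3", "b.mp3", "c.mp3"], fillers=True)
)
print(results)
```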

Progress callbacks

Track processing progress with a callback function:

def on_progress(update):
    print(f"Status: {update['status']}")

result = client.process(
    "episode.mp3",
    fillers=True,
    progress_callback=on_progress,
)
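
A callback that prints on every poll can get noisy. The sketch below reports only when the status changes, using the status, edit_id, and attempt keys described in the parameter table. The exact value types are an assumption, and make_status_logger is a hypothetical helper, not part of the SDK.

```python
def make_status_logger(log=print):
    """Return a progress callback that reports only when the status changes."""
    last_status = None

    def on_progress(update):
        nonlocal last_status
        status = update.get("status")
        if status != last_status:
            log(f"[{update.get('edit_id')}] attempt {update.get('attempt')}: {status}")
            last_status = status

    return on_progress


# Simulated payloads; a real run would pass the callback to client.process().
messages = []
callback = make_status_logger(log=messages.append)
for payload in [
    {"status": "PENDING", "edit_id": "edit_abc123", "attempt": 1, "result": None},
    {"status": "PENDING", "edit_id": "edit_abc123", "attempt": 2, "result": None},
    {"status": "EDITING", "edit_id": "edit_abc123", "attempt": 3, "result": None},
]:
    callback(payload)
print(messages)
```

With a real client, pass the callback as progress_callback=make_status_logger().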

NumPy audio arrays

The Python SDK accepts and returns audio as NumPy arrays, which is useful in Jupyter notebooks and audio processing pipelines:

import numpy as np
from cleanvoice import Cleanvoice

client = Cleanvoice.from_env()

# Process an array
audio = np.random.randn(44100 * 60)  # 60 seconds of random noise at 44.1 kHz, standing in for real audio
result = client.process((audio, 44100), fillers=True)

# Get result as array
cleaned_audio, sample_rate = result.download_audio(as_numpy=True)