Chatterbox TTS API
Local, OpenAI-compatible text-to-speech with zero-shot voice cloning. Run it on your own hardware and keep your data private.
Chatterbox TTS API wraps the Chatterbox voice cloning model in a production-ready FastAPI server. Clone any voice from a short sample, generate speech in 22 languages, and integrate with any OpenAI-compatible application.

Features
Production-ready voice cloning API with everything you need.
OpenAI Compatible
Drop-in replacement for OpenAI TTS API. Works with Open WebUI, AnythingLLM, and any OpenAI-compatible client.
Zero-Shot Voice Cloning
Clone any voice from a short audio sample. No training required, instant results.
22 Languages
Generate speech in Arabic, German, English, Spanish, French, Japanese, Korean, and 15 more languages.
Docker Ready
One-command deployment with Docker. GPU and CPU variants available for any setup.
Voice Library
Upload, manage, and organize custom voices. Reference voices by name in API calls.
Streaming Support
Real-time audio streaming with Server-Sent Events. Receive audio as it is generated.
Run Locally
Your data stays on your machine. No external API calls, complete privacy.
Memory Management
Advanced memory monitoring with automatic cleanup. Efficient resource usage.
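The Streaming Support feature above delivers audio over Server-Sent Events. This page does not state the streaming endpoint or what each event carries, so the sketch below covers only the generic, client-side part: splitting an SSE text stream into per-event `data` payloads. The helper name `iter_sse_data` is hypothetical, and decoding the payload into audio is left out because the event schema is not given here.

```python
def iter_sse_data(lines):
    """Yield the data payload of each complete event in an SSE stream.

    `lines` is any iterable of decoded text lines. What the payload
    contains (for example encoded audio chunks) is NOT specified on this
    page, so this sketch returns it as an opaque string.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment line, often used as a keep-alive; ignore it.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event not terminated by a blank line is discarded, as in the SSE format.
```

Any iterable of text lines works as input, so the same function can be fed from a decoded HTTP response body or from a list of strings while experimenting.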
22 Languages Supported
Generate natural-sounding speech with language-aware voice cloning.
Quick Start
Get up and running in minutes.
Local Installation
# Clone the repository
git clone https://github.com/travisvn/chatterbox-tts-api.git
cd chatterbox-tts-api
# Install dependencies with uv
uv sync
# Run the server
uv run main.py
Docker
# GPU version (recommended)
docker run -d -p 8004:8004 --gpus all \
  -v voices:/app/voices \
  travisvn/chatterbox-tts-api:latest
# CPU version
docker run -d -p 8004:8004 \
  -v voices:/app/voices \
  travisvn/chatterbox-tts-api:cpu
API Usage
# OpenAI-compatible API call
curl -X POST "http://localhost:8004/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello, this is my cloned voice!",
    "voice": "my-custom-voice",
    "language": "en"
  }' --output speech.mp3
Built With
A FastAPI backend wrapping the Chatterbox TTS model, with an optional React frontend for voice management and testing.
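Because the backend is a plain HTTP API, the curl call under API Usage can be reproduced from any language. The following is a minimal sketch in Python using only the standard library. It assumes the server is reachable at `http://localhost:8004` as in the Docker examples and that a voice named `my-custom-voice` already exists in the voice library. The function names `build_speech_request` and `synthesize` are illustrative and not part of the project.

```python
import json
import urllib.request

# Base URL taken from the Docker examples (port 8004); adjust for your setup.
BASE_URL = "http://localhost:8004"


def build_speech_request(text, voice, language="en", base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/audio/speech.

    The JSON fields mirror the curl example: input, voice, language.
    """
    payload = {"input": text, "voice": voice, "language": language}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, voice, out_path="speech.mp3", **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, voice, **kwargs)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With a server running, `synthesize("Hello, this is my cloned voice!", "my-custom-voice")` should behave like the curl command and save the response body to `speech.mp3`. Keeping request construction separate from sending makes the payload easy to inspect without a live server.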
Ready to clone some voices?
Star the repo, try the demo, or spin up your own instance.