Open Source

Chatterbox TTS API

Local, OpenAI-compatible text-to-speech with zero-shot voice cloning. Run it on your own hardware and keep your data private.

Chatterbox TTS API wraps the Chatterbox voice cloning model in a production-ready FastAPI server. Clone any voice from a short sample, generate speech in 22 languages, and integrate with any OpenAI-compatible application.

AGPL-3.0 License
Self-hosted
GPU & CPU support
[Screenshot: Chatterbox TTS API interface running at localhost:8004]

Features

Production-ready voice cloning API with everything you need.

OpenAI Compatible

Drop-in replacement for the OpenAI TTS API. Works with Open WebUI, AnythingLLM, and any OpenAI-compatible client.

Zero-Shot Voice Cloning

Clone any voice from a short audio sample. No training required, instant results.

22 Languages

Generate speech in Arabic, German, English, Spanish, French, Japanese, Korean, and 15 more languages.

Docker Ready

One-command deployment with Docker. GPU and CPU variants available for any setup.

Voice Library

Upload, manage, and organize custom voices. Reference voices by name in API calls.

Streaming Support

Real-time audio streaming with Server-Sent Events. Receive audio as it is generated.
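
For readers new to Server-Sent Events, here is a minimal sketch of how a client might reassemble streamed audio. The line parsing follows the standard SSE wire format. The event payload is an assumption: JSON with a base64-encoded `audio` field, modeled on OpenAI-style speech events. Check the server's actual event schema before relying on it.

```python
import base64
import json


def iter_sse_data(lines):
    """Yield the data payload of each complete SSE event from text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comments (":...") and other fields (event:, id:, retry:) are ignored.
    # An event not terminated by a blank line is discarded, as in the SSE format.


def collect_audio(lines):
    """Concatenate decoded audio chunks from an SSE line stream."""
    audio = bytearray()
    for data in iter_sse_data(lines):
        event = json.loads(data)
        chunk = event.get("audio")  # hypothetical field name (assumption)
        if chunk:
            audio.extend(base64.b64decode(chunk))
    return bytes(audio)
```

`collect_audio` accepts any iterable of text lines, so it can be fed from a decoded HTTP response body or from a recorded stream while testing.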

Run Locally

Your data stays on your machine. No external API calls, complete privacy.

Memory Management

Built-in memory monitoring with automatic cleanup keeps resource usage efficient.

Multilingual

22 Languages Supported

Generate natural-sounding speech with language-aware voice cloning.

English (English)
Spanish (Español)
French (Français)
German (Deutsch)
Japanese (日本語)
Korean (한국어)
Chinese (中文)
Arabic (العربية)
Portuguese (Português)
Italian (Italiano)
Russian (Русский)
Dutch (Nederlands)
Polish (Polski)
Swedish (Svenska)
Danish (Dansk)
Finnish (Suomi)
Norwegian (Norsk)
Hindi (हिन्दी)
Turkish (Türkçe)
Greek (Ελληνικά)
Hebrew (עברית)
Malay (Bahasa Melayu)
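
The API request on this page passes the language as a short code (`"language": "en"`). The sketch below is a hypothetical client-side helper that validates a code before sending a request. It assumes the server uses ISO 639-1 codes for all 22 languages, which this page only confirms for English.

```python
# ISO 639-1 codes for the 22 languages listed above.
# Assumption: the server accepts exactly these codes; only "en" appears on this page.
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "ja": "Japanese", "ko": "Korean", "zh": "Chinese", "ar": "Arabic",
    "pt": "Portuguese", "it": "Italian", "ru": "Russian", "nl": "Dutch",
    "pl": "Polish", "sv": "Swedish", "da": "Danish", "fi": "Finnish",
    "no": "Norwegian", "hi": "Hindi", "tr": "Turkish", "el": "Greek",
    "he": "Hebrew", "ms": "Malay",
}


def normalize_language(code):
    """Return a lowercase, validated language code or raise ValueError."""
    key = code.strip().lower()
    if key not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return key
```

Validating on the client gives a clear error early instead of relying on however the server reports an unknown language.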

Quick Start

Get up and running in minutes.

Local Installation

# Clone the repository
git clone https://github.com/travisvn/chatterbox-tts-api.git
cd chatterbox-tts-api

# Install dependencies with uv
uv sync

# Run the server
uv run main.py

Docker

# GPU version (recommended)
docker run -d -p 8004:8004 --gpus all \
  -v voices:/app/voices \
  travisvn/chatterbox-tts-api:latest

# CPU version
docker run -d -p 8004:8004 \
  -v voices:/app/voices \
  travisvn/chatterbox-tts-api:cpu
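
Because `docker run -d` returns before the server is reachable, a script that calls the API right away may fail. A minimal sketch of a wait loop follows; `wait_for_port` is a hypothetical helper, not part of the project. It only checks that the TCP port accepts connections, which does not guarantee the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a connection is enough to probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait, then retry.
            time.sleep(interval)
    return False
```

For the containers above, `wait_for_port("localhost", 8004)` would block until the mapped port is open or a minute has passed.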

API Usage

# OpenAI-compatible API call
curl -X POST "http://localhost:8004/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello, this is my cloned voice!",
    "voice": "my-custom-voice",
    "language": "en"
  }' --output speech.mp3
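
The same request can be made from Python using only the standard library. This is a sketch that mirrors the curl call above: the endpoint, port, and JSON fields come from that example, while the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8004"  # port from the Docker examples on this page


def build_speech_request(text, voice, language="en", base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {"input": text, "voice": voice, "language": language}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, voice, out_path, language="en"):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, voice, language)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

With a server running and a voice named `my-custom-voice` uploaded, `synthesize("Hello, this is my cloned voice!", "my-custom-voice", "speech.mp3")` writes the audio file and returns its size in bytes.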

Built With

Python, FastAPI, PyTorch, Chatterbox, Docker, React, TypeScript

A FastAPI backend wrapping the Chatterbox TTS model, with an optional React frontend for voice management and testing.

Ready to clone some voices?

Star the repo, try the demo, or spin up your own instance.