🌐 HTTP API
Piper can be run as a web server, providing a simple HTTP API for text-to-speech synthesis. This is useful for network-based applications or for providing TTS as a shared service.
1. Installation
To run the web server, install Piper with its optional http dependencies:
pip install 'piper-tts[http]'
2. Running the Server
After installing the dependencies and downloading a voice model, you can start the server using the piper.http_server module.
python3 -m piper.http_server -m en_US-lessac-medium.onnx
By default, the server starts on localhost at port 5000. You can change this using the --host and --port flags:
python3 -m piper.http_server -m en_US-lessac-medium.onnx --host 0.0.0.0 --port 8080
If your voices are stored in a different directory, point the server at it with the --data-dir argument:
python3 -m piper.http_server --data-dir /path/to/voices -m en_US-lessac-medium.onnx
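If the server needs to be launched from a Python script instead of a shell, the command lines above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in this section (-m, --host, --port, --data-dir). The helper name build_server_command is hypothetical and not part of Piper.

```python
import sys
from typing import List, Optional


def build_server_command(
    model: str,
    host: Optional[str] = None,
    port: Optional[int] = None,
    data_dir: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for `python -m piper.http_server`.

    Hypothetical helper: it only mirrors the flags documented above.
    """
    # sys.executable runs the server with the current interpreter, so the
    # same environment (and its piper-tts install) is used.
    command = [sys.executable, "-m", "piper.http_server", "-m", model]
    if host is not None:
        command += ["--host", host]
    if port is not None:
        command += ["--port", str(port)]
    if data_dir is not None:
        command += ["--data-dir", data_dir]
    return command
```

The returned list can be passed to subprocess.Popen to start the server in the background, and the resulting process object can be stopped later with terminate(). Omitted options are left out of the command, so the server falls back to its own defaults.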
3. API Endpoints
POST /
This is the main endpoint for synthesizing speech. It accepts a JSON body and returns a WAV audio stream.
Example Request using curl:
curl -X POST -H 'Content-Type: application/json' \
-d '{ "text": "This is a test from the HTTP API." }' \
-o test.wav http://localhost:5000
This will create a test.wav file with the synthesized audio.
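To confirm that the downloaded file is valid audio, its header can be read with Python's standard wave module. This is a sketch that assumes the response is a regular PCM WAV file whose header carries the frame count. The function name describe_wav is hypothetical.

```python
import wave


def describe_wav(source) -> dict:
    """Read basic header fields from a WAV file.

    `source` may be a path string such as "test.wav" or a binary
    file-like object.
    """
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": rate,
            # Duration is derived from the frame count in the header.
            "duration_seconds": frames / rate if rate else 0.0,
        }
```

Calling describe_wav("test.wav") after the curl command above returns the channel count, sample width, sample rate, and approximate duration of the synthesized audio.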
JSON Body Parameters:
text (string, required): The text to synthesize.
voice (string, optional): The name of the voice to use (e.g., en_US-lessac-medium.onnx). Defaults to the voice specified with the -m flag when starting the server.
speaker (string, optional): The name of the speaker for multi-speaker voices.
speaker_id (integer, optional): The ID of the speaker for multi-speaker voices. Overrides speaker if both are provided.
length_scale (float, optional): Speaking speed. Values < 1.0 are faster, > 1.0 are slower. Defaults to 1.0.
noise_scale (float, optional): Amount of audio variability.
noise_w_scale (float, optional): Amount of phoneme width variability.
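The same request can be sent from Python using only the standard library. The sketch below builds the JSON body from the parameters listed above and writes the returned audio to a file. The function names build_synthesis_request and synthesize_to_file are hypothetical, and the URL assumes the default host and port.

```python
import json
import shutil
import urllib.request


def build_synthesis_request(url: str, text: str, **options) -> urllib.request.Request:
    """Build a POST request carrying the JSON body described above.

    `options` may hold any of the optional body parameters (voice,
    speaker, speaker_id, length_scale, noise_scale, noise_w_scale).
    Options set to None are omitted so the server defaults apply.
    """
    payload = {"text": text}
    payload.update({key: value for key, value in options.items() if value is not None})
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(url: str, text: str, out_path: str, **options) -> None:
    """Send the request and stream the WAV response into out_path."""
    request = build_synthesis_request(url, text, **options)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

With a server running on the default address, synthesize_to_file("http://localhost:5000", "This is a test from the HTTP API.", "test.wav", length_scale=0.9) would produce slightly faster speech than the default. Keeping request construction separate from the network call makes the payload easy to inspect without a running server.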
GET /voices
This endpoint returns a list of available voices that the server can use.
Example Request using curl:
curl http://localhost:5000/voices
Example Response:
{
"en_US-lessac-medium.onnx": {
"name": "en_US-lessac-medium",
"sample_rate": 22050,
/* ... other voice metadata ... */
}
}
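A client can use this endpoint to discover voices before synthesizing. The sketch below assumes the response has the shape shown in the example above (an object keyed by voice file name, each with name and sample_rate fields). The function names parse_voices and fetch_voices are hypothetical.

```python
import json
import urllib.request


def parse_voices(body: bytes) -> dict:
    """Map each voice key to its sample rate.

    Assumes the response shape from the example above. A voice without
    a sample_rate field maps to None.
    """
    voices = json.loads(body)
    return {key: metadata.get("sample_rate") for key, metadata in voices.items()}


def fetch_voices(base_url: str = "http://localhost:5000") -> dict:
    """Query GET /voices on a running server and summarize the result."""
    with urllib.request.urlopen(f"{base_url}/voices") as response:
        return parse_voices(response.read())
```

Any key returned here can then be supplied as the voice parameter in the body of a POST / request.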