🌐 HTTP API
Piper can be run as a web server, providing a simple HTTP API for text-to-speech synthesis. This is useful for network-based applications or for providing TTS as a shared service.
1. Installation
To run the web server, install the http extra dependencies:
pip install 'piper-tts[http]'
2. Running the Server
After installing the dependencies and downloading a voice model, you can start the server using the piper.http_server module.
python3 -m piper.http_server -m en_US-lessac-medium.onnx
By default, the server starts on localhost at port 5000. You can change this using the --host and --port flags:
python3 -m piper.http_server -m en_US-lessac-medium.onnx --host 0.0.0.0 --port 8080
If your voices are stored in a different directory, use the --data-dir argument:
python3 -m piper.http_server --data-dir /path/to/voices -m en_US-lessac-medium.onnx
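If the server is launched from another Python program, the flags above can be assembled into an argument list. The following is a minimal sketch: build_server_command is a hypothetical helper name, not part of Piper, and it relies only on the -m, --host, --port, and --data-dir flags described in this section.

```python
import sys


def build_server_command(model, host=None, port=None, data_dir=None):
    """Return an argv list for starting piper.http_server (hypothetical helper)."""
    cmd = [sys.executable, "-m", "piper.http_server", "-m", model]
    # Add only the flags that were explicitly requested.
    if host is not None:
        cmd += ["--host", host]
    if port is not None:
        cmd += ["--port", str(port)]
    if data_dir is not None:
        cmd += ["--data-dir", data_dir]
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it.
    print(" ".join(build_server_command("en_US-lessac-medium.onnx", port=8080)))
```

Passing the returned list to subprocess.Popen would start the server as a child process, assuming piper-tts[http] is installed in the same Python environment.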
3. API Endpoints
POST /
This is the main endpoint for synthesizing speech. It accepts a JSON body and returns a WAV audio stream.
Example request using curl:
curl -X POST -H 'Content-Type: application/json' \
-d '{ "text": "This is a test from the HTTP API." }' \
-o test.wav http://localhost:5000
This creates a test.wav file containing the synthesized audio.
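The same request can be made from Python using only the standard library. This is a minimal sketch rather than an official client: the function names are hypothetical, and the base URL assumes the default localhost:5000 address.

```python
import json
import urllib.request


def build_synthesis_request(text, base_url="http://localhost:5000"):
    """Build a POST request with a JSON body containing the text to speak."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text, path, base_url="http://localhost:5000"):
    """Send the request and write the returned WAV bytes to `path`."""
    request = build_synthesis_request(text, base_url)
    with urllib.request.urlopen(request) as response:
        wav_bytes = response.read()
    with open(path, "wb") as wav_file:
        wav_file.write(wav_bytes)
```

With a server running, synthesize_to_file("This is a test from the HTTP API.", "test.wav") should produce the same file as the curl command.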
JSON Body Parameters:
text (string, required): The text to synthesize.
voice (string, optional): The name of the voice to use (e.g., en_US-lessac-medium.onnx). Defaults to the voice specified with the -m flag when starting the server.
speaker (string, optional): The name of the speaker for multi-speaker voices.
speaker_id (integer, optional): The ID of the speaker for multi-speaker voices. Overrides speaker if both are provided.
length_scale (float, optional): Speaking speed. Values < 1.0 are faster; values > 1.0 are slower. Defaults to 1.0.
noise_scale (float, optional): Amount of audio variability.
noise_w_scale (float, optional): Amount of phoneme width variability.
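The parameters above can be combined into a single JSON body. The sketch below uses a hypothetical helper, build_payload, that includes only the options that were explicitly set and leaves every other key out of the body; it is an illustration, not part of Piper.

```python
import json


def build_payload(text, voice=None, speaker=None, speaker_id=None,
                  length_scale=None, noise_scale=None, noise_w_scale=None):
    """Return the JSON body as a dict, omitting options left unset."""
    if not text:
        raise ValueError("text is required")
    options = {
        "voice": voice,
        "speaker": speaker,
        "speaker_id": speaker_id,
        "length_scale": length_scale,
        "noise_scale": noise_scale,
        "noise_w_scale": noise_w_scale,
    }
    payload = {"text": text}
    # Keep only the keys that were given a value (0 is a valid speaker_id).
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


if __name__ == "__main__":
    # Slower speech (length_scale > 1.0) with an explicit speaker ID.
    print(json.dumps(build_payload("Hello.", speaker_id=0, length_scale=1.2)))
```

The resulting dict can be serialized with json.dumps and sent as the body of the POST / request.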
GET /voices
This endpoint returns a list of available voices that the server can use.
Example request using curl:
curl http://localhost:5000/voices
Example Response:
{
  "en_US-lessac-medium.onnx": {
    "name": "en_US-lessac-medium",
    "sample_rate": 22050,
    /* ... other voice metadata ... */
  }
}
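A client can use this endpoint to discover voices before synthesizing. The sketch below assumes the response is a JSON object keyed by voice file name, with a sample_rate field in each entry, as in the example above (the comment placeholder there stands in for elided fields and would not appear in real JSON). The function names are hypothetical.

```python
import json
import urllib.request


def parse_voices(body):
    """Map each voice key from a /voices response to its sample rate."""
    voices = json.loads(body)
    # .get() returns None if an entry has no sample_rate field.
    return {key: meta.get("sample_rate") for key, meta in voices.items()}


def fetch_voices(base_url="http://localhost:5000"):
    """Request /voices from a running server and summarize the result."""
    with urllib.request.urlopen(base_url + "/voices") as response:
        return parse_voices(response.read())
```

Any key returned here can then be passed as the voice parameter of a POST / request.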