Quick Start Guide
This guide provides a minimal "Hello, World!" example to get you up and running with Piper in just a few steps.
Prerequisites
Make sure you have completed the Installation Guide and have:
- Installed the piper-tts Python package.
- Downloaded at least one voice model.
This guide assumes you have downloaded the en_US-lessac-medium voice.
Step 1: Synthesize Text to a WAV File
The most basic use of Piper is to convert a line of text into an audio file using the command-line interface.
Open your terminal and run the following command. This tells Piper to use the specified model (-m) to generate a WAV file (-f) from the provided text.
python3 -m piper -m en_US-lessac-medium.onnx -f output.wav -- 'Welcome to the world of speech synthesis!'
After running the command, you will find a file named output.wav in your current directory. Play this file to hear the synthesized audio.
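If you want to confirm that the file was written correctly before playing it, a short check with Python's standard-library wave module can help. The sketch below is not part of Piper; describe_wav is a hypothetical helper name, and it only reads the WAV header.

```python
import wave


def describe_wav(path):
    """Return basic header information for a WAV file.

    Hypothetical helper for this guide; it uses only the standard library
    and does not depend on Piper.
    """
    with wave.open(str(path), "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            # Duration is the frame count divided by frames per second.
            "duration_s": frames / rate if rate else 0.0,
        }
```

Calling describe_wav("output.wav") after Step 1 should return a dictionary with a non-zero duration. The exact sample rate depends on the voice you downloaded.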
Note: The -m argument must point to the .onnx model file. The command above assumes the file is in your current directory; if the voice files are in a different directory, provide the full path or use the --data-dir flag to specify their location.
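If you prefer to launch the same command from a script, the following sketch assembles the argument list used in this guide and runs it with subprocess. It uses only the flags shown above (-m, -f, --data-dir); build_piper_command and synthesize are hypothetical helper names, and the sketch assumes piper-tts is installed for the interpreter that runs the script.

```python
import subprocess
import sys


def build_piper_command(model, text, output_file=None, data_dir=None):
    """Assemble the CLI invocation from this guide as an argument list.

    Hypothetical helper: it only mirrors the flags shown in this guide.
    """
    # sys.executable is the current interpreter (assumed to have piper-tts).
    cmd = [sys.executable, "-m", "piper", "-m", str(model)]
    if data_dir is not None:
        cmd += ["--data-dir", str(data_dir)]
    if output_file is not None:
        cmd += ["-f", str(output_file)]
    # "--" ends option parsing, so text that starts with "-" is not read as a flag.
    cmd += ["--", text]
    return cmd


def synthesize(model, text, output_file, data_dir=None):
    """Run Piper and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_piper_command(model, text, output_file, data_dir), check=True)
```

For example, synthesize("en_US-lessac-medium.onnx", "Welcome to the world of speech synthesis!", "output.wav") runs the same command as Step 1. Passing the arguments as a list avoids shell quoting problems with apostrophes in the text.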
Step 2: Synthesize and Play Audio Directly
If you have ffplay (part of the FFmpeg suite) installed, you can have Piper synthesize and play the audio directly without saving it to a file. Simply omit the -f (output file) argument.
python3 -m piper -m en_US-lessac-medium.onnx -- 'This will play on your speakers.'
This command loads the model, synthesizes the audio, and streams it directly to ffplay for immediate playback.
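Because direct playback depends on ffplay, a script may want to check for it first and fall back to writing a file. The sketch below is one way to make that choice with shutil.which; choose_output_args is a hypothetical helper name, and the fallback file name is an assumption.

```python
import shutil


def choose_output_args(fallback_file="output.wav", which=shutil.which):
    """Return the extra CLI arguments to use for output.

    Hypothetical helper: returns an empty list when ffplay is on PATH
    (so -f is omitted and audio plays directly), otherwise returns
    ["-f", fallback_file] so the audio is saved to a file instead.
    The `which` parameter exists only to make the function easy to test.
    """
    if which("ffplay") is not None:
        return []
    return ["-f", str(fallback_file)]
```

The returned list can be spliced into the command before the "--" separator, for example ["python3", "-m", "piper", "-m", "en_US-lessac-medium.onnx", *choose_output_args(), "--", "This will play on your speakers."].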
What's Next?
You have successfully synthesized your first speech with Piper! To explore more advanced features, check out the following guides:
- Command-Line Interface: For a detailed look at all CLI options.
- Python API: To integrate Piper into your Python applications.
- HTTP API: To run Piper as a web server.