🔧 C/C++ API (libpiper)
For high-performance applications, Piper offers a shared library (libpiper) with a C-style API that can be used from C, C++, and other languages that support C bindings.
Building libpiper
The libpiper library is built using CMake. From the libpiper/ directory in the repository:
- Configure the build:
  cmake -Bbuild -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$PWD/install
- Build the library:
  cmake --build build
- Install the library and headers:
  cmake --install build
This process will automatically download and build espeak-ng and download the pre-compiled onnxruntime shared libraries. The final artifacts will be placed in the libpiper/install directory.
To use libpiper in your project, you will need to:
- Include the header file: install/include/piper.h
- Link against the libpiper shared library: install/libpiper.so (or .dll/.dylib)
- Link against the libonnxruntime shared library: install/lib/libonnxruntime.so
- Ensure the espeak-ng-data directory (install/espeak-ng-data/) is available at runtime.
C++ Example
Here is a basic example of how to use the C API from C++:
#include <fstream>
#include "piper.h"

int main() {
    // Create the synthesizer
    piper_synthesizer *synth = piper_create("/path/to/voice.onnx",
                                            "/path/to/voice.onnx.json",
                                            "/path/to/espeak-ng-data");
    if (!synth) {
        // Handle error
        return 1;
    }

    // Open a file to write the raw audio samples
    std::ofstream audio_stream("output.raw", std::ios::binary);

    // Get and modify default synthesis options
    piper_synthesize_options options = piper_default_synthesize_options(synth);
    // options.length_scale = 1.5; // 50% slower
    // options.speaker_id = 5;

    // Start synthesis; bail out if the request is rejected
    if (piper_synthesize_start(synth, "Welcome to the world of speech synthesis!",
                               &options) != PIPER_OK) {
        piper_free(synth);
        return 1;
    }

    // Write each chunk as it arrives; the loop ends on PIPER_DONE or an error code
    piper_audio_chunk chunk;
    while (piper_synthesize_next(synth, &chunk) == PIPER_OK) {
        audio_stream.write(reinterpret_cast<const char *>(chunk.samples),
                           chunk.num_samples * sizeof(float));
    }

    // Free resources
    piper_free(synth);
    return 0;
}
To play the output file, you can use a tool like aplay:
aplay -r 22050 -c 1 -f FLOAT_LE -t raw output.raw
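A raw float file only plays correctly when the player is told the sample rate and format by hand. As a rough sketch, the collected samples can instead be converted to 16-bit PCM and wrapped in a minimal WAV header, which most players open directly. The helpers below (put_le, float_to_pcm16, make_wav_pcm16) are hypothetical and not part of libpiper, and the code assumes mono audio with samples nominally in the range [-1.0, 1.0].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: append the low `bytes` bytes of `value`, little-endian.
static void put_le(std::string &out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper: convert one float sample to signed 16-bit PCM.
// Assumes samples are nominally in [-1.0, 1.0]; anything outside is clamped.
static std::int16_t float_to_pcm16(float sample) {
    const float clamped = std::max(-1.0f, std::min(1.0f, sample));
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Build a complete mono 16-bit PCM WAV file image in memory.
std::string make_wav_pcm16(const std::vector<float> &samples, int sample_rate) {
    assert(sample_rate > 0);
    const std::uint32_t rate = static_cast<std::uint32_t>(sample_rate);
    const std::uint32_t data_bytes =
        static_cast<std::uint32_t>(samples.size() * 2);

    std::string wav;
    wav += "RIFF";
    put_le(wav, 36 + data_bytes, 4);  // size of everything after this field
    wav += "WAVE";
    wav += "fmt ";
    put_le(wav, 16, 4);               // fmt chunk size for PCM
    put_le(wav, 1, 2);                // format tag 1 = integer PCM
    put_le(wav, 1, 2);                // channels (mono)
    put_le(wav, rate, 4);             // sample rate
    put_le(wav, rate * 2, 4);         // byte rate = rate * channels * 2 bytes
    put_le(wav, 2, 2);                // block align = channels * 2 bytes
    put_le(wav, 16, 2);               // bits per sample
    wav += "data";
    put_le(wav, data_bytes, 4);
    for (float s : samples) {
        put_le(wav, static_cast<std::uint16_t>(float_to_pcm16(s)), 2);
    }
    return wav;
}
```

To use it, append each chunk's samples to a std::vector<float> inside the synthesis loop, then write the string returned by make_wav_pcm16(samples, chunk.sample_rate) to a std::ofstream opened with std::ios::binary.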
C API Reference
This section details the functions and structs exposed by piper.h.
Structs
piper_synthesizer
An opaque struct representing the text-to-speech synthesizer instance.
piper_audio_chunk
Contains a chunk of synthesized audio and associated metadata.
- const float *samples: Raw floating-point audio samples.
- size_t num_samples: The number of samples in the chunk.
- int sample_rate: Sample rate in Hertz (e.g., 22050).
- bool is_last: True if this is the final audio chunk for the synthesis request.
- const char32_t *phonemes: Phoneme codepoints. See the Alignments documentation for details.
- size_t num_phonemes: Number of phoneme codepoints.
- const int *phoneme_ids: Phoneme IDs used by the model.
- size_t num_phoneme_ids: Number of phoneme IDs.
- const int *alignments: Audio sample count for each phoneme ID. Requires a patched model.
- size_t num_alignments: Number of alignments.
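Since alignments holds a sample count per phoneme ID, a running sum divided by sample_rate gives approximate start and end times for each phoneme. The sketch below illustrates that arithmetic. PhonemeSpan and phoneme_spans are hypothetical names, and the code assumes that the counts are contiguous, in order, and that num_alignments matches the number of phoneme IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the time span covered by one phoneme ID.
struct PhonemeSpan {
    int phoneme_id;
    double start_sec;
    double end_sec;
};

// Turn per-phoneme-ID sample counts into start/end times in seconds.
// The arguments correspond to the phoneme_ids, alignments, num_alignments
// and sample_rate fields of piper_audio_chunk.
// Assumption: one alignment entry per phoneme ID, laid out back to back.
std::vector<PhonemeSpan> phoneme_spans(const int *phoneme_ids,
                                       const int *alignments,
                                       std::size_t num_alignments,
                                       int sample_rate) {
    assert(sample_rate > 0);
    std::vector<PhonemeSpan> spans;
    spans.reserve(num_alignments);

    long long cursor = 0;  // samples consumed so far
    for (std::size_t i = 0; i < num_alignments; ++i) {
        const long long next = cursor + alignments[i];
        spans.push_back({phoneme_ids[i],
                         static_cast<double>(cursor) / sample_rate,
                         static_cast<double>(next) / sample_rate});
        cursor = next;
    }
    return spans;
}
```

Inside the synthesis loop this would be called as phoneme_spans(chunk.phoneme_ids, chunk.alignments, chunk.num_alignments, chunk.sample_rate), after checking that chunk.num_alignments is non-zero (it stays empty without a patched model).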
piper_synthesize_options
Configuration for a synthesis request.
- int speaker_id: ID of the speaker for multi-speaker models (0 for the first speaker).
- float length_scale: Speaking speed (default: 1.0). Larger values produce slower speech; for example, 1.5 is 50% slower.
- float noise_scale: Audio variability (e.g., 0.667).
- float noise_w_scale: Phoneme length variability (e.g., 0.8).
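Because length_scale grows as speech slows down, it can be easier to think in terms of a speaking-rate multiplier and convert. The helper below is a hypothetical convenience, not part of libpiper, and it assumes that length_scale stretches durations roughly linearly, as the "1.5 is 50% slower" comment in the example suggests.

```cpp
#include <cassert>

// Hypothetical helper: length_scale for a desired speaking-rate multiplier.
// rate 2.0 means roughly twice as fast, 0.5 means roughly half speed.
// Assumption: output duration scales linearly with length_scale.
float length_scale_for_rate(float rate) {
    assert(rate > 0.0f);
    return 1.0f / rate;
}
```

For example, options.length_scale = length_scale_for_rate(0.5f) would request speech at about half the default pace.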
Functions
piper_synthesizer *piper_create(const char *model_path, const char *config_path, const char *espeak_data_path)
Creates and initializes a synthesizer. Returns NULL on failure.
- model_path: Path to the .onnx voice model.
- config_path: Path to the .onnx.json config file. If NULL, it's assumed to be model_path + .json.
- espeak_data_path: Path to the espeak-ng-data directory.
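piper_create only reports failure as NULL, so a pre-flight check of the input files can make a bad path easier to diagnose. The sketch below mirrors the documented config_path fallback and lists which voice files cannot be opened. Both function names are hypothetical, and the check covers only the model and config files, not the espeak-ng-data directory.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper mirroring the documented fallback:
// a NULL config_path means model_path + ".json".
std::string resolve_config_path(const std::string &model_path,
                                const char *config_path) {
    assert(!model_path.empty());
    return config_path ? std::string(config_path) : model_path + ".json";
}

// Hypothetical helper: return the voice files that cannot be opened for reading.
std::vector<std::string> unreadable_voice_files(const std::string &model_path,
                                                const char *config_path) {
    const std::vector<std::string> candidates = {
        model_path, resolve_config_path(model_path, config_path)};

    std::vector<std::string> unreadable;
    for (const std::string &path : candidates) {
        std::ifstream probe(path, std::ios::binary);
        if (!probe.is_open()) {
            unreadable.push_back(path);
        }
    }
    return unreadable;
}
```

Calling unreadable_voice_files before piper_create and printing any returned paths gives a concrete error message in the common case of a mistyped or missing file.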
void piper_free(piper_synthesizer *synth)
Frees all resources associated with a synthesizer.
piper_synthesize_options piper_default_synthesize_options(piper_synthesizer *synth)
Returns the default synthesis options for a given voice model.
int piper_synthesize_start(piper_synthesizer *synth, const char *text, const piper_synthesize_options *options)
Begins the synthesis process for the given text. Call piper_synthesize_next to retrieve audio chunks. Returns PIPER_OK on success.
- text: The UTF-8 encoded text to synthesize.
- options: Synthesis options. If NULL, default options are used.
int piper_synthesize_next(piper_synthesizer *synth, piper_audio_chunk *chunk)
Retrieves the next chunk of synthesized audio. The memory for chunk members is valid until the next call to this function. Returns PIPER_OK if a chunk is available, PIPER_DONE if synthesis is complete, or an error code.
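Because chunk memory is only valid until the next call, any audio that must outlive the loop iteration has to be copied into caller-owned storage first. A minimal sketch of that copy follows; append_samples is a hypothetical helper, not part of libpiper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy one chunk's samples into caller-owned storage
// so they remain valid after the library reuses its internal buffer.
void append_samples(std::vector<float> &owned,
                    const float *samples,
                    std::size_t num_samples) {
    assert(samples != nullptr || num_samples == 0);
    if (num_samples > 0) {
        owned.insert(owned.end(), samples, samples + num_samples);
    }
}
```

Within the synthesis loop this would be called as append_samples(all_samples, chunk.samples, chunk.num_samples) before the next call to piper_synthesize_next. Storing the chunk.samples pointer itself for later use would leave it dangling.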