🔧 C/C++ API (libpiper)
For high-performance applications, Piper offers a shared library (libpiper) with a C-style API that can be used from C, C++, and other languages that support C bindings.
Building libpiper
The libpiper library is built using CMake. From the libpiper/ directory in the repository:
- Configure the build:
    cmake -Bbuild -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$PWD/install
- Build the library:
    cmake --build build
- Install the library and headers:
    cmake --install build
This process automatically downloads and builds espeak-ng, and downloads the pre-compiled onnxruntime shared libraries. The final artifacts are placed in the libpiper/install directory.
To use libpiper in your project, you will need to:
- Include the header file: install/include/piper.h
- Link against the libpiper shared library: install/libpiper.so (or .dll/.dylib)
- Link against the libonnxruntime shared library: install/lib/libonnxruntime.so
- Ensure the espeak-ng-data directory (install/espeak-ng-data/) is available at runtime.
C++ Example
Here is a basic example of how to use the C API from C++:
    #include <fstream>
    #include "piper.h"

    int main() {
        // Create the synthesizer
        piper_synthesizer *synth = piper_create("/path/to/voice.onnx",
                                                "/path/to/voice.onnx.json",
                                                "/path/to/espeak-ng-data");
        if (!synth) {
            // Handle error
            return 1;
        }

        // Open a file to write the raw audio samples
        std::ofstream audio_stream("output.raw", std::ios::binary);

        // Get and modify default synthesis options
        piper_synthesize_options options = piper_default_synthesize_options(synth);
        // options.length_scale = 1.5; // 50% slower
        // options.speaker_id = 5;

        // Start synthesis
        piper_synthesize_start(synth, "Welcome to the world of speech synthesis!", &options);

        piper_audio_chunk chunk;
        while (piper_synthesize_next(synth, &chunk) != PIPER_DONE) {
            audio_stream.write(reinterpret_cast<const char *>(chunk.samples),
                               chunk.num_samples * sizeof(float));
        }

        // Free resources
        piper_free(synth);
        return 0;
    }
To play the output file, you can use a tool like aplay:
    aplay -r 22050 -c 1 -f FLOAT_LE -t raw output.raw
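The example above writes headerless 32-bit float samples, which many players cannot open without extra flags. The following is a minimal sketch of how the same samples could be packed into a mono 16-bit PCM WAV file image instead. The helper names (float_to_pcm16, put_le, make_wav) are hypothetical and not part of piper.h, and the sketch assumes the samples lie roughly in the range [-1.0, 1.0].

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of piper.h): convert one float sample to
// signed 16-bit PCM. Assumes samples are nominally in [-1.0, 1.0]; values
// outside that range are clamped.
inline std::int16_t float_to_pcm16(float sample) {
    float clamped = std::max(-1.0f, std::min(1.0f, sample));
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Append the low `bytes` bytes of `value` to `out` in little-endian order.
inline void put_le(std::string &out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
}

// Build a complete mono 16-bit PCM WAV file image in memory.
inline std::string make_wav(const std::vector<float> &samples,
                            std::uint32_t sample_rate) {
    const std::uint32_t data_bytes =
        static_cast<std::uint32_t>(samples.size() * 2);
    std::string wav;
    wav += "RIFF";
    put_le(wav, 36 + data_bytes, 4);  // size of everything after this field
    wav += "WAVE";
    wav += "fmt ";
    put_le(wav, 16, 4);               // size of the fmt chunk
    put_le(wav, 1, 2);                // audio format: 1 = PCM
    put_le(wav, 1, 2);                // channels: mono
    put_le(wav, sample_rate, 4);      // samples per second
    put_le(wav, sample_rate * 2, 4);  // bytes per second (mono, 2 bytes each)
    put_le(wav, 2, 2);                // block align: bytes per sample frame
    put_le(wav, 16, 2);               // bits per sample
    wav += "data";
    put_le(wav, data_bytes, 4);       // size of the sample data
    for (float s : samples) {
        put_le(wav, static_cast<std::uint16_t>(float_to_pcm16(s)), 2);
    }
    return wav;
}
```

One way to use this would be to collect every chunk's samples into a std::vector<float> during the synthesis loop, then write the string returned by make_wav(samples, 22050) to a file opened in binary mode. The sample rate could also be taken from chunk.sample_rate rather than hard-coded.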
C API Reference
This section details the functions and structs exposed by piper.h.
Structs
piper_synthesizer
An opaque struct representing the text-to-speech synthesizer instance.
piper_audio_chunk
Contains a chunk of synthesized audio and associated metadata.
- const float *samples: Raw floating-point audio samples.
- size_t num_samples: The number of samples in the chunk.
- int sample_rate: Sample rate in Hertz (e.g., 22050).
- bool is_last: True if this is the final audio chunk for the synthesis request.
- const char32_t *phonemes: Phoneme codepoints. See the Alignments documentation for details.
- size_t num_phonemes: Number of phoneme codepoints.
- const int *phoneme_ids: Phoneme IDs used by the model.
- size_t num_phoneme_ids: Number of phoneme IDs.
- const int *alignments: Audio sample count for each phoneme ID. Requires a patched model.
- size_t num_alignments: Number of alignments.
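Because alignments holds a sample count per phoneme ID and sample_rate gives samples per second, the two can be combined into timestamps. The sketch below shows one way to do that. The function names are hypothetical, and it assumes the counts describe consecutive, non-overlapping spans starting at the beginning of the chunk, which this section does not state explicitly.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: duration of a chunk in seconds, from the
// num_samples and sample_rate fields of piper_audio_chunk.
inline double chunk_duration_seconds(std::size_t num_samples, int sample_rate) {
    return static_cast<double>(num_samples) / sample_rate;
}

// Hypothetical helper: start time in seconds of each phoneme ID.
// Assumption: alignments[i] is the number of audio samples covered by
// phoneme ID i, and the spans follow one another from the chunk start.
inline std::vector<double> alignment_start_times(const int *alignments,
                                                 std::size_t num_alignments,
                                                 int sample_rate) {
    std::vector<double> starts;
    starts.reserve(num_alignments);
    long long elapsed_samples = 0;  // running total of samples before phoneme i
    for (std::size_t i = 0; i < num_alignments; ++i) {
        starts.push_back(static_cast<double>(elapsed_samples) / sample_rate);
        elapsed_samples += alignments[i];
    }
    return starts;
}
```

With a real chunk, the call would look like alignment_start_times(chunk.alignments, chunk.num_alignments, chunk.sample_rate). The pointers are only valid until the next call to piper_synthesize_next, so the conversion should run before fetching the next chunk.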
piper_synthesize_options
Configuration for a synthesis request.
- int speaker_id: ID of the speaker for multi-speaker models (0 for the first speaker).
- float length_scale: Speaking speed (default: 1.0).
- float noise_scale: Audio variability (e.g., 0.667).
- float noise_w_scale: Phoneme length variability (e.g., 0.8).
Functions
piper_synthesizer *piper_create(const char *model_path, const char *config_path, const char *espeak_data_path)
Creates and initializes a synthesizer. Returns NULL on failure.
- model_path: Path to the .onnx voice model.
- config_path: Path to the .onnx.json config file. If NULL, it's assumed to be model_path + .json.
- espeak_data_path: Path to the espeak-ng-data directory.
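The config_path fallback described above can be sketched as a small helper, which may be handy for logging or for checking that the config file exists before calling piper_create. The function name resolve_config_path is hypothetical. It mirrors the documented rule only and is not taken from libpiper's implementation.

```cpp
#include <string>

// Hypothetical helper illustrating the documented fallback: when no
// config path is given, the config is assumed to sit next to the model
// with ".json" appended to the model path.
inline std::string resolve_config_path(const char *model_path,
                                       const char *config_path) {
    if (config_path != nullptr) {
        return std::string(config_path);  // an explicit path always wins
    }
    return std::string(model_path) + ".json";
}
```

For example, resolve_config_path("/path/to/voice.onnx", nullptr) yields "/path/to/voice.onnx.json", the same file the C++ example earlier passes explicitly.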
void piper_free(piper_synthesizer *synth)
Frees all resources associated with a synthesizer.
piper_synthesize_options piper_default_synthesize_options(piper_synthesizer *synth)
Returns the default synthesis options for a given voice model.
int piper_synthesize_start(piper_synthesizer *synth, const char *text, const piper_synthesize_options *options)
Begins the synthesis process for the given text. Call piper_synthesize_next to retrieve audio chunks. Returns PIPER_OK on success.
- text: The UTF-8 encoded text to synthesize.
- options: Synthesis options. If NULL, default options are used.
int piper_synthesize_next(piper_synthesizer *synth, piper_audio_chunk *chunk)
Retrieves the next chunk of synthesized audio. The memory for chunk members is valid until the next call to this function. Returns PIPER_OK if a chunk is available, PIPER_DONE if synthesis is complete, or an error code.
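Since chunk memory is only valid until the next call, any audio that must outlive the loop iteration has to be copied first. The sketch below shows one way to accumulate samples into an owning buffer. The helper name append_samples is hypothetical, and it takes a plain pointer and count so that it does not depend on piper.h.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copy `num_samples` floats into an owning buffer so
// they remain valid after the source memory is reused or freed.
inline void append_samples(std::vector<float> &out,
                           const float *samples,
                           std::size_t num_samples) {
    if (samples == nullptr || num_samples == 0) {
        return;  // nothing to copy, e.g. an empty final chunk
    }
    out.insert(out.end(), samples, samples + num_samples);
}
```

Inside a synthesis loop this would be called as append_samples(all_samples, chunk.samples, chunk.num_samples) before the next piper_synthesize_next call. Comparing the return value against PIPER_OK, rather than only against PIPER_DONE, would also let the loop stop on an error code.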