🔧 C/C++ API (libpiper)

For high-performance applications, Piper offers a shared library (libpiper) with a C-style API that can be used from C, C++, and other languages that support C bindings.

Building `libpiper`

The libpiper library is built using CMake. From the libpiper/ directory in the repository:

Configure the build:

```
cmake -Bbuild -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$PWD/install
```

Build the library:
```
cmake --build build
```
Install the library and headers:
```
cmake --install build
```

This process will automatically download and build espeak-ng and download the pre-compiled onnxruntime shared libraries. The final artifacts will be placed in the libpiper/install directory.

To use libpiper in your project, you will need to:

Include the header file: install/include/piper.h
Link against the libpiper shared library: install/libpiper.so (or .dll/.dylib)
Link against the libonnxruntime shared library: install/lib/libonnxruntime.so
Ensure the espeak-ng-data directory (install/espeak-ng-data/) is available at runtime.

C++ Example

Here is a basic example of how to use the C API from C++:

```cpp
#include <fstream>
#include "piper.h"

int main() {
    // Create the synthesizer from a voice model, its config, and the espeak-ng data
    piper_synthesizer *synth = piper_create("/path/to/voice.onnx",
                                            "/path/to/voice.onnx.json",
                                            "/path/to/espeak-ng-data");

    if (!synth) {
        // piper_create returns NULL on failure
        return 1;
    }

    // Open a file to write the raw audio samples
    std::ofstream audio_stream("output.raw", std::ios::binary);

    // Get and modify default synthesis options
    piper_synthesize_options options = piper_default_synthesize_options(synth);
    // options.length_scale = 1.5; // 50% slower
    // options.speaker_id = 5;

    // Start synthesis; release the synthesizer if the request is rejected
    if (piper_synthesize_start(synth, "Welcome to the world of speech synthesis!",
                               &options) != PIPER_OK) {
        piper_free(synth);
        return 1;
    }

    // Write chunks while they are available; the loop ends on PIPER_DONE
    // or on an error code, so an error cannot cause an endless loop
    piper_audio_chunk chunk;
    int status;
    while ((status = piper_synthesize_next(synth, &chunk)) == PIPER_OK) {
        audio_stream.write(reinterpret_cast<const char *>(chunk.samples),
                           chunk.num_samples * sizeof(float));
    }

    // Free resources
    piper_free(synth);

    return status == PIPER_DONE ? 0 : 1;
}
```

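The output file contains headerless 32-bit float samples in a single channel. To play it, you can use a tool like aplay, setting -r to the voice's sample rate (22050 Hz here; see sample_rate in piper_audio_chunk): `aplay -r 22050 -c 1 -f FLOAT_LE -t raw output.raw`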
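
If a file that ordinary audio players can open is preferable to raw floats, the samples can be converted to 16-bit PCM and wrapped in a WAV header. The following is a minimal sketch and not part of libpiper: `float_to_pcm16`, `make_wav`, `put_le`, and `put_tag` are hypothetical helper names, and the code assumes mono audio with samples roughly in the range [-1.0, 1.0].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert float samples (assumed to lie in [-1.0, 1.0]) to 16-bit signed PCM.
// Out-of-range values are clamped rather than allowed to wrap around.
std::vector<std::int16_t> float_to_pcm16(const float *samples, std::size_t num_samples) {
    assert(samples != nullptr || num_samples == 0);
    std::vector<std::int16_t> pcm(num_samples);
    for (std::size_t i = 0; i < num_samples; ++i) {
        const float clamped = std::max(-1.0f, std::min(1.0f, samples[i]));
        pcm[i] = static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
    }
    return pcm;
}

// Append the low `num_bytes` bytes of `value` in little-endian order.
static void put_le(std::vector<std::uint8_t> &out, std::uint32_t value, int num_bytes) {
    for (int i = 0; i < num_bytes; ++i) {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Append a four-character RIFF tag such as "RIFF" or "data".
static void put_tag(std::vector<std::uint8_t> &out, const char *tag) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<std::uint8_t>(tag[i]));
    }
}

// Build a complete mono, 16-bit PCM WAV file image in memory.
std::vector<std::uint8_t> make_wav(const std::vector<std::int16_t> &pcm, int sample_rate) {
    assert(sample_rate > 0);
    const std::uint32_t data_bytes = static_cast<std::uint32_t>(pcm.size() * 2);
    std::vector<std::uint8_t> wav;
    wav.reserve(44 + data_bytes);

    put_tag(wav, "RIFF");
    put_le(wav, 36 + data_bytes, 4);                             // bytes after this field
    put_tag(wav, "WAVE");
    put_tag(wav, "fmt ");
    put_le(wav, 16, 4);                                          // size of the fmt chunk
    put_le(wav, 1, 2);                                           // format tag 1 = PCM
    put_le(wav, 1, 2);                                           // one channel
    put_le(wav, static_cast<std::uint32_t>(sample_rate), 4);
    put_le(wav, static_cast<std::uint32_t>(sample_rate) * 2, 4); // bytes per second
    put_le(wav, 2, 2);                                           // bytes per sample frame
    put_le(wav, 16, 2);                                          // bits per sample
    put_tag(wav, "data");
    put_le(wav, data_bytes, 4);
    for (std::int16_t sample : pcm) {
        put_le(wav, static_cast<std::uint16_t>(sample), 2);
    }
    return wav;
}
```

One way to use this is to append the result of `float_to_pcm16(chunk.samples, chunk.num_samples)` to a single vector inside the synthesis loop, then write the bytes returned by `make_wav` (with the chunk's `sample_rate`) to a `.wav` file opened in binary mode.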

C API Reference

This section details the functions and structs exposed by piper.h.

Structs

`piper_synthesizer`

An opaque struct representing the text-to-speech synthesizer instance.

`piper_audio_chunk`

Contains a chunk of synthesized audio and associated metadata.

const float *samples: Raw floating-point audio samples.
size_t num_samples: The number of samples in the chunk.
int sample_rate: Sample rate in Hertz (e.g., 22050).
bool is_last: True if this is the final audio chunk for the synthesis request.
const char32_t *phonemes: Phoneme codepoints. See the Alignments documentation for details.
size_t num_phonemes: Number of phoneme codepoints.
const int *phoneme_ids: Phoneme IDs used by the model.
size_t num_phoneme_ids: Number of phoneme IDs.
const int *alignments: Audio sample count for each phoneme ID. Requires a patched model.
size_t num_alignments: Number of alignments.
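
Since each alignment is a sample count, dividing by the sample rate gives timing in seconds. The sketch below illustrates that arithmetic under two assumptions that are not stated above: the entries are in playback order, and they are contiguous starting at the first sample of the chunk. `IdSpan` and `alignment_spans` are hypothetical names, not part of piper.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Start and end time (in seconds, relative to the chunk start) of one phoneme ID.
struct IdSpan {
    double start_sec;
    double end_sec;
};

// Turn per-ID sample counts into time spans by accumulating a running offset.
std::vector<IdSpan> alignment_spans(const int *alignments,
                                    std::size_t num_alignments,
                                    int sample_rate) {
    assert(sample_rate > 0);
    std::vector<IdSpan> spans;
    spans.reserve(num_alignments);
    long long cursor = 0;  // samples consumed so far
    for (std::size_t i = 0; i < num_alignments; ++i) {
        const long long next = cursor + alignments[i];
        IdSpan span;
        span.start_sec = static_cast<double>(cursor) / sample_rate;
        span.end_sec = static_cast<double>(next) / sample_rate;
        spans.push_back(span);
        cursor = next;
    }
    return spans;
}
```

With a chunk in hand, this could be called as `alignment_spans(chunk.alignments, chunk.num_alignments, chunk.sample_rate)`; entry `i` would then describe `chunk.phoneme_ids[i]`, provided the two arrays have the same length.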

`piper_synthesize_options`

Configuration for a synthesis request.

int speaker_id: ID of the speaker for multi-speaker models (0 for the first speaker).
float length_scale: Speaking speed; values above 1.0 slow speech down, e.g. 1.5 is 50% slower (default: 1.0).
float noise_scale: Audio variability (e.g., 0.667).
float noise_w_scale: Phoneme length variability (e.g., 0.8).

Functions

`piper_synthesizer *piper_create(const char *model_path, const char *config_path, const char *espeak_data_path)`

Creates and initializes a synthesizer. Returns NULL on failure.

model_path: Path to the .onnx voice model.
config_path: Path to the .onnx.json config file. If NULL, it's assumed to be model_path + .json.
espeak_data_path: Path to the espeak-ng-data directory.

`void piper_free(piper_synthesizer *synth)`

Frees all resources associated with a synthesizer.
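
In C++ code with several early returns, it can be convenient to tie the free call to scope exit. The sketch below is one possible approach using `std::unique_ptr` with a function-pointer deleter; `make_owned` is a hypothetical helper, not something libpiper provides, and it is written generically so it compiles without piper.h.

```cpp
#include <cassert>
#include <memory>

// Wrap a raw pointer from a C API so that `free_fn` runs when the wrapper
// goes out of scope. A null `raw` is allowed: std::unique_ptr never invokes
// its deleter on a null pointer, so a failed create call needs no special case.
template <typename T>
std::unique_ptr<T, void (*)(T *)> make_owned(T *raw, void (*free_fn)(T *)) {
    assert(free_fn != nullptr);
    return std::unique_ptr<T, void (*)(T *)>(raw, free_fn);
}

// Intended use with libpiper (requires piper.h, so shown only as a comment):
//
//   auto synth = make_owned(piper_create(model, config, espeak_data), piper_free);
//   if (!synth) { return 1; }
//   piper_synthesize_start(synth.get(), "Hello", nullptr);
```

The trade-off is that the raw pointer must be passed to the C functions through `synth.get()`, and the wrapper must not outlive anything the synthesizer depends on.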

`piper_synthesize_options piper_default_synthesize_options(piper_synthesizer *synth)`

Returns the default synthesis options for a given voice model.

`int piper_synthesize_start(piper_synthesizer *synth, const char *text, const piper_synthesize_options *options)`

Begins the synthesis process for the given text. Call piper_synthesize_next to retrieve audio chunks. Returns PIPER_OK on success.

text: The UTF-8 encoded text to synthesize.
options: Synthesis options. If NULL, default options are used.

`int piper_synthesize_next(piper_synthesizer *synth, piper_audio_chunk *chunk)`

Retrieves the next chunk of synthesized audio. The memory for chunk members is valid until the next call to this function. Returns PIPER_OK if a chunk is available, PIPER_DONE if synthesis is complete, or an error code.
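
Because chunk memory is only valid until the next call, samples that are needed later have to be copied out before asking for another chunk. The sketch below shows one way to do that while also stopping on error codes. It is written as a template so it compiles without piper.h: `collect_samples` is a hypothetical helper, and the caller supplies the status values (for libpiper, `PIPER_OK` and `PIPER_DONE`) rather than the sketch guessing their numeric values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Repeatedly call `next(&chunk)` and copy each chunk's samples into `out`.
// Returns true if `done_code` was reached, false on any other non-OK status.
// `Chunk` must expose `samples` (const float *) and `num_samples` members.
template <typename Chunk, typename NextFn>
bool collect_samples(NextFn next, int ok_code, int done_code, std::vector<float> &out) {
    assert(ok_code != done_code);
    Chunk chunk{};
    for (;;) {
        const int status = next(&chunk);
        if (status == done_code) {
            return true;   // synthesis finished normally
        }
        if (status != ok_code) {
            return false;  // error code: stop instead of looping forever
        }
        // Copy now; the pointed-to buffer may be reused by the next call.
        out.insert(out.end(), chunk.samples, chunk.samples + chunk.num_samples);
    }
}
```

With libpiper this might be invoked as `collect_samples<piper_audio_chunk>([&](piper_audio_chunk *c) { return piper_synthesize_next(synth, c); }, PIPER_OK, PIPER_DONE, samples)` after a successful `piper_synthesize_start`.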
