Voximplant now includes a native Cartesia module for streaming, low-latency text-to-speech (TTS). You can use a single VoxEngine API to synthesize speech in real time, connect it to any call (PSTN, SIP, WebRTC, WhatsApp) and control playback from a Large Language Model (LLM) or other source, all inside VoxEngine.
This adds another model, with more voice choices, to Voximplant’s existing real-time streaming speech synthesis options and its large list of speech providers, so developers can choose the features and voices that best fit their applications.
Highlights
- Expressive voices and controls: all Sonic model capabilities are supported, including speed, volume, and emotion controls and custom pronunciation, for more lifelike delivery.
- Designed for low latency: the integration supports input streaming, timestamps, and multiplexing, which makes it well suited to LLM agents.
- Bring your own API key or use Voximplant: pass a Cartesia key per scenario when you want access to customized and cloned voices billed under your existing Cartesia account.
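
To make the input-streaming point concrete, here is a minimal sketch of how an LLM reply might be fed to a player in pieces. `splitIntoChunks` and `streamTranscript` are hypothetical helper names, not part of the Cartesia module. The only player method assumed is `generationRequest()`, with the `transcript`, `context_id`, and `continue` fields used in the quick start further down. The sketch also assumes that every chunk except the last is sent with `"continue": true`, mirroring that quick start.

```javascript
// Hypothetical helper: split text after sentence-ending punctuation.
// Whitespace is preserved, so joining the chunks reproduces the input.
function splitIntoChunks(text) {
  return text.match(/[^.!?]*[.!?]+|[^.!?]+$/g) || [];
}

// Hypothetical helper: send each chunk on the context named in baseRequest.
// `player` is any object that exposes generationRequest(request).
function streamTranscript(player, text, baseRequest) {
  const chunks = splitIntoChunks(text);
  chunks.forEach(function (chunk, index) {
    const isLast = index === chunks.length - 1;
    player.generationRequest(
      Object.assign({}, baseRequest, {
        "transcript": chunk,
        "continue": !isLast // only the final chunk is sent with false
      })
    );
  });
  return chunks.length; // number of requests sent
}
```

In a scenario, `player` would be the object returned by `Cartesia.createRealtimeTTSPlayer()`, and `baseRequest` would carry the `model_id`, `voice`, `language`, and `context_id` fields.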
Developer Notes
- Native VoxEngine module: `require(Modules.Cartesia)` and call `Cartesia.createRealtimeTTSPlayer()` to spin up a streaming TTS player you can route into calls.
- True realtime synthesis: stream text chunks with `RealtimeTTSPlayer.generationRequest()`, cancel on barge-in with `cancelContextRequest()`, and clear the buffer instantly.
- Pass Cartesia fields directly: use the `generationRequestParameters` object to provide the same JSON you’d send in Cartesia’s generation API (e.g., `model_id`, `transcript`, `voice`, `language`, etc.).
- Barge-in: on user interruption, call `clearBuffer()` and start a new context with `generationRequest()` to keep prosody natural.
- `AudioChunksPlaybackFinished`: event fired when all Cartesia audio currently buffered for the player has actually finished playing to the caller, giving you a precise “TTS turn finished” hook to advance dialog state, start ASR, or record latency metrics (note this feature will be available by late November).
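
The barge-in note above can be sketched as a small helper. This is an illustration under stated assumptions, not the module's prescribed pattern: `createBargeInHandler` and the `turn-N` context naming are hypothetical, and the only player methods used are `clearBuffer()` and `generationRequest()`, as named in the notes.

```javascript
// Hypothetical helper: returns a function to call when the user interrupts.
// `player` is any object that exposes clearBuffer() and generationRequest().
function createBargeInHandler(player, baseRequest) {
  let turn = 0; // used only to derive a fresh context_id per interruption

  return function onBargeIn(nextTranscript) {
    // Drop audio that is buffered but not yet played
    player.clearBuffer();

    // Start the reply on a new context instead of continuing the old one
    turn += 1;
    const request = Object.assign({}, baseRequest, {
      "transcript": nextTranscript,
      "context_id": "turn-" + turn, // assumed naming scheme
      "continue": false
    });
    player.generationRequest(request);
    return request;
  };
}
```

A scenario would call the returned function from whatever signal it uses to detect interruption, for example an ASR event. How that signal is detected is outside this sketch.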
Demo video
See the video below for a demonstration.
Quick start (VoxEngine)
```javascript
require(Modules.Cartesia); // Cartesia Realtime TTS

let call, playerUrl;

VoxEngine.addEventListener(AppEvents.CallAlerting, function (e) {
  e.call.answer();
  call = e.call;

  e.call.addEventListener(CallEvents.Connected, function (callEvent) {
    const text = 'Hi, I’m Voxy.';

    // First chunk: "continue": true signals that more text will follow on this context
    const request = {
      "model_id": "sonic-2",
      "transcript": "I now support Cartesia Text-to-Speech",
      "voice": {
        "mode": "id",
        "id": "a0e99841-438c-4a64-b679-ae501e7d6091"
      },
      "language": "en",
      "context_id": "something-unique",
      "continue": true
    };
    const cartesiaRealtimeTTSPlayerParameters = {
      generationRequestParameters: request
    };

    // Create the streaming player and route its audio into the call
    const player = Cartesia.createRealtimeTTSPlayer(text, cartesiaRealtimeTTSPlayerParameters);
    player.sendMediaTo(call);

    // Final chunk on the same context: "continue": false ends the input
    player.generationRequest({
      "model_id": "sonic-2",
      "transcript": "natively inside VoxEngine",
      "voice": {
        "mode": "id",
        "id": "a0e99841-438c-4a64-b679-ae501e7d6091"
      },
      "language": "en",
      "context_id": "something-unique",
      "continue": false
    });

    // When TTS playback is finished
    player.addEventListener(PlayerEvents.PlaybackFinished, (playerEvent) => {
      // Play an mp3 file
      playerUrl = VoxEngine.createURLPlayer(
        "https://cdn.voximplant.com/yodl.mp3",
        true
      );
      playerUrl.sendMediaTo(call);
      // End the session in 10 seconds
      setTimeout(VoxEngine.terminate, 10000);
    });
  });

  // End the session when the call disconnects
  e.call.addEventListener(CallEvents.Disconnected, VoxEngine.terminate);
});
```
Pricing and availability
Cartesia support is available today. Pricing is $0.00056 per 10 characters for all models when using the default Voximplant account.
If you choose to use your own API key, your Cartesia usage is billed according to your plan in Cartesia’s portal. Voximplant charges an additional $2 per one million characters when you bring your own API key.
See voximplant.com/pricing for full details.
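
As a rough worked example of the rates above, the sketch below estimates the Voximplant-side charge for a given character count. `estimateVoximplantCostUsd` is a hypothetical helper, and it assumes billing is pro-rated per character with no rounding to 10-character blocks, which this post does not specify. With your own key, the figure excludes whatever Cartesia bills under your plan.

```javascript
// Rates from this post, expressed in micro-dollars per character
// so the multiplication stays in whole numbers.
const DEFAULT_ACCOUNT_MICRO_USD_PER_CHAR = 56; // $0.00056 per 10 characters
const OWN_KEY_MICRO_USD_PER_CHAR = 2; // $2 per 1,000,000 characters

// Hypothetical helper: Voximplant-side cost in US dollars.
// Assumes linear per-character billing (an assumption, not a documented rule).
function estimateVoximplantCostUsd(characters, useOwnKey) {
  const rate = useOwnKey
    ? OWN_KEY_MICRO_USD_PER_CHAR
    : DEFAULT_ACCOUNT_MICRO_USD_PER_CHAR;
  return (characters * rate) / 1e6;
}
```

Under these assumptions, one million characters comes to $56 on the default Voximplant account, or $2 on the Voximplant side when you bring your own key.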
Resources
- VoxEngine Cartesia reference — classes, parameters, and methods. docs.voximplant.com
- Realtime TTS guide (Cartesia example) — end‑to‑end scenario walkthrough. docs.voximplant.com
- Cartesia models — Sonic‑3 features; Sonic/Sonic‑turbo snapshots and latencies. docs.cartesia.ai




