Reducing audio and TTS playback latency

Voximplant lets you play audio files in your calls and conferences and use text-to-speech (TTS), where a synthesized voice pronounces the text you specify. Both features are available through the Player module of Voxengine. With this module, you can create player instances of three types: URL (audio file), TTS, and ToneScript. ToneScript is less commonly used, whereas the other two are widespread in our clients' JavaScript scenarios.

Usage is simple: you create a player instance and assign it to a variable, then use the player's built-in methods to control playback. Here is a simple example:

require(Modules.Player)

let call,
  player

VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  call = e.call
  call.answer()

  call.addEventListener(CallEvents.Connected, (callevent) => {
    // Play intro speech
    player = VoxEngine.createTTSPlayer("Hello, you have called Voximplant Player demo", Language.US_ENGLISH_FEMALE)
    player.sendMediaTo(call)

    // When TTS playback is finished
    player.addEventListener(PlayerEvents.PlaybackFinished, (playerevent) => {
      player = VoxEngine.createURLPlayer("http://cdn.voximplant.com/yodl.mp3", true)
      // Play mp3 file
      player.sendMediaTo(call)
      // end session in 10 sec
      setTimeout(VoxEngine.terminate, 10000)
    })
  })
  call.addEventListener(CallEvents.Disconnected, VoxEngine.terminate)
})

The scenario answers a call and greets the caller via the TTS player. After the TTS greeting finishes, another player plays audio for about 10 seconds, then the scenario terminates the session. These are the basics of using the Player module, and they work exactly the same way in complicated scenarios, but there is a nuance: the cache.

All types of players cache their audio after the very first playback, and the cached data of each player instance is stored for up to 2 weeks. But what about the undesirable delay that can occur before that first playback? For example, you may want the TTS player to pronounce a long phrase, or the URL player may have to download a file of the maximum allowed size (10 MB). In such cases, you can fill the cache in advance.

URL player's onPause parameter

The URL player has the onPause parameter, which pauses the player right after its creation. If this parameter is set to true, a newly created player instance immediately starts downloading the file to the cache. Later, when you need to play the audio, call the resume method to start playback directly from the cache, without delay. Here are the changes to the first example:

require(Modules.Player)

let call,
    TTSplayer,
    URLplayer

VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  call = e.call
  call.answer()

  call.addEventListener(CallEvents.Connected, (callevent) => {
    // Play intro speech
    TTSplayer = VoxEngine.createTTSPlayer("Hello, you have called Voximplant Player demo", Language.US_ENGLISH_FEMALE)
    TTSplayer.sendMediaTo(call)

    // Create the URL player paused (third argument is onPause = true)
    // so that the file starts downloading to the cache right away
    URLplayer = VoxEngine.createURLPlayer("http://cdn.voximplant.com/yodl.mp3", true, true)

    // When TTS player playback is finished
    TTSplayer.addEventListener(PlayerEvents.PlaybackFinished, (playerevent) => {
      // Play mp3 file
      URLplayer.resume()
      URLplayer.sendMediaTo(call)
      // End session in 10 sec
      setTimeout(VoxEngine.terminate, 10000)
    })
  })
  call.addEventListener(CallEvents.Disconnected, VoxEngine.terminate)
})
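
If a scenario uses several audio files, the same approach can be wrapped in a small helper. The following is only a sketch: `preloadUrlPlayers` and `playPreloaded` are hypothetical names, not part of the Player module, and the `engine` argument is assumed to be the `VoxEngine` global. The only platform calls used are the ones from the example above: `createURLPlayer`, `resume`, and `sendMediaTo`.

```javascript
// Hypothetical helper: creates one paused URL player per file so that
// every download can start immediately. `engine` is assumed to be VoxEngine.
function preloadUrlPlayers(engine, urls) {
  const players = {}
  for (const [name, url] of Object.entries(urls)) {
    // Same argument values as in the example above;
    // the third one is the onPause flag
    players[name] = engine.createURLPlayer(url, true, true)
  }
  return players
}

// Hypothetical helper: resumes a preloaded player and sends its audio to the call
function playPreloaded(players, name, call) {
  const player = players[name]
  if (!player) {
    throw new Error(`No preloaded player named "${name}"`)
  }
  player.resume()
  player.sendMediaTo(call)
  return player
}

// Inside a scenario this might be used as follows:
//   const players = preloadUrlPlayers(VoxEngine, { music: "http://cdn.voximplant.com/yodl.mp3" })
//   ...later, when the audio is needed...
//   playPreloaded(players, "music", call)
```

Passing the engine in as an argument is a design choice made for this sketch: it keeps the helper free of globals, so it can be exercised with a stub outside of a Voxengine session.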

TTS player 

The TTS player doesn't have an onPause-like parameter. Usually, there is no delay in TTS playback, but you can use the same trick as in the previous piece of code: create the instance in advance. The combination of pre-cached URL and TTS players looks like this:

require(Modules.Player)

let call,
    TTSplayer,
    URLplayer

VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  call = e.call
  call.answer()

  // Create TTS player and save audio to the cache
  TTSplayer = VoxEngine.createTTSPlayer("Hello, you have called Voximplant Player demo", Language.US_ENGLISH_FEMALE)
  call.addEventListener(CallEvents.Connected, (callevent) => {
    // Play intro speech
    TTSplayer.sendMediaTo(call)

    // Create URL player and save audio to the cache
    URLplayer = VoxEngine.createURLPlayer("http://cdn.voximplant.com/yodl.mp3", true, true)

    // When TTSplayer playback is finished
    TTSplayer.addEventListener(PlayerEvents.PlaybackFinished, (playerevent) => {
      // Play mp3 file
      URLplayer.resume()
      URLplayer.sendMediaTo(call)
      // End session in 10 sec
      setTimeout(VoxEngine.terminate, 10000)
    })
  })
  call.addEventListener(CallEvents.Disconnected, VoxEngine.terminate)
})
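
When more than two pre-created players have to play back to back, the nested listeners above can be replaced with a loop-like helper. This is a minimal sketch under stated assumptions: `playInSequence` is a hypothetical function, `finishedEvent` is assumed to be `PlayerEvents.PlaybackFinished`, each entry marks whether its player was created paused, and every player is assumed to fire that event eventually.

```javascript
// Hypothetical helper: plays pre-created players one after another.
// entries: [{ player, paused }], where `paused` is true for URL players
// created with onPause = true. `finishedEvent` is assumed to be
// PlayerEvents.PlaybackFinished.
function playInSequence(entries, call, finishedEvent, onDone) {
  const playNext = (index) => {
    if (index >= entries.length) {
      // All players have finished
      if (onDone) onDone()
      return
    }
    const { player, paused } = entries[index]
    // Start the next player only when the current one reports it has finished
    player.addEventListener(finishedEvent, () => playNext(index + 1))
    // Paused players start from the cache once resumed
    if (paused) player.resume()
    player.sendMediaTo(call)
  }
  playNext(0)
}

// Inside the Connected handler this might be called as:
//   playInSequence(
//     [{ player: TTSplayer }, { player: URLplayer, paused: true }],
//     call,
//     PlayerEvents.PlaybackFinished
//   )
```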

Use the force

We hope this small piece of advice comes in handy during your development journey. Have fun with Voximplant, see you there!

Tags: TTS, text-to-speech, audio, playback