How to use ASR

ASR (automatic speech recognition) operates during a call and recognizes either a word or phrase from a given set of variants, or arbitrary ("freeform") speech.


ASR is provided by the ASR module, which must be loaded into a scenario via the require syntax. The module is used as follows:

  • Create an ASR object by calling the VoxEngine.createASR method.
  • Subscribe to ASR object events such as ASREvents.Result.
  • Send media from a call object to the ASR object via the sendMediaTo method.
  • Receive the recognized text via events.
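The four steps above can be sketched as a single helper. This is a minimal illustration, not the original listing: the function name `startRecognition` is hypothetical, and the `profile` value `ASRProfileList.Google.en_US` as well as the `text` and `confidence` fields on the Result event are assumptions about the platform API. `VoxEngine`, `ASREvents` and `ASRProfileList` are expected to be globals supplied by the scenario environment.

```javascript
// In a VoxEngine scenario the ASR module has to be loaded first:
//   require(Modules.ASR);

// Hypothetical helper: creates an ASR object, feeds it the call audio,
// and passes every recognized text to the onText callback.
function startRecognition(call, onText) {
  // Step 1: create the ASR object (the profile value is an assumption).
  const asr = VoxEngine.createASR({ profile: ASRProfileList.Google.en_US });

  // Steps 2 and 4: subscribe to the Result event and receive the text.
  // The "text" and "confidence" fields are assumed event properties.
  asr.addEventListener(ASREvents.Result, (e) => {
    onText(e.text, e.confidence);
  });

  // Step 3: send media from the call object to the ASR object.
  call.sendMediaTo(asr);
  return asr;
}
```

In a scenario, a helper like this would typically be called once the call is connected, so that audio is already flowing when the ASR object starts listening.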

Recognition: Phrase Hints

Use the following code if you want to build an IVR with real-time recognition of words or phrases from a specified array:

ASR with hints


Phrase hints are supported by the Google profile only (see line 12 in the code above). Note that they do not limit recognition to the specified list. Instead, words in the list have a higher chance of being selected.
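Because hints only bias recognition and do not restrict it, an IVR still has to check each result against the expected variants. The sketch below shows one way to do that. The names `normalize`, `pickVariant` and `startHintedRecognition` are hypothetical, and the `phraseHints` parameter name, the `ASRProfileList.Google.en_US` profile value and the `text` field of the Result event are assumptions about the platform API.

```javascript
// Lower-cases the text and replaces punctuation with single spaces.
function normalize(text) {
  return String(text)
    .toLowerCase()
    .replace(/[^\p{L}\p{N}']+/gu, " ")
    .trim();
}

// Returns the first variant that occurs as a whole word or phrase in the
// recognized text, or null when none of the variants was heard.
function pickVariant(text, variants) {
  const heard = ` ${normalize(text)} `;
  const match = variants.find((v) => {
    const wanted = normalize(v);
    return wanted !== "" && heard.includes(` ${wanted} `);
  });
  return match === undefined ? null : match;
}

// Hypothetical helper: recognition biased towards the given variants.
function startHintedRecognition(call, variants, onChoice) {
  const asr = VoxEngine.createASR({
    profile: ASRProfileList.Google.en_US, // hints need the Google profile
    phraseHints: variants,                // assumed parameter name
  });
  asr.addEventListener(ASREvents.Result, (e) => {
    // The first argument is null when the caller said something else.
    onChoice(pickVariant(e.text, variants), e.text);
  });
  call.sendMediaTo(asr);
  return asr;
}
```

Matching on whole words rather than substrings keeps a short variant such as "no" from matching inside "know". When `onChoice` receives null, the scenario can repeat the prompt instead of guessing.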

Recognition: Freeform

The Result event is triggered once speech has been recognized. There is always a delay between capture and recognition, so plan user interaction accordingly.

The following code shows how to use the Result event for streaming recognition of arbitrary speech:

ASR without hints
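As a rough sketch of freeform recognition, the helper below creates an ASR object without hints, appends the text of every Result event to a transcript, and hands the transcript over when the call ends. The name `transcribeCall` is hypothetical. The `ASRProfileList.Google.en_US` profile value and the `text` field of the Result event are assumptions about the platform API, and `CallEvents.Disconnected` is used as the end-of-call signal.

```javascript
// Hypothetical helper: collects everything recognized during a call.
function transcribeCall(call, onFinished) {
  const parts = []; // recognized fragments in arrival order

  // No hints are passed, so recognition is not biased towards any phrase.
  const asr = VoxEngine.createASR({ profile: ASRProfileList.Google.en_US });

  // Each Result event carries one recognized fragment (assumed "text" field).
  asr.addEventListener(ASREvents.Result, (e) => {
    if (e.text) {
      parts.push(e.text);
    }
  });

  call.sendMediaTo(asr);

  // Deliver the joined transcript when the call is over.
  call.addEventListener(CallEvents.Disconnected, () => {
    onFinished(parts.join(" "));
  });
  return asr;
}
```

Since results arrive with a delay after the speech is captured, the last words spoken just before a hang-up may be missing from the transcript.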

False Start

The CaptureStarted event can be triggered by background noise. Voximplant VAD (voice activity detection) can be used to mitigate this:



Voice Activity Detection
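One way to combine the two signals is sketched below: the scenario reacts to speech only after both the ASR object has reported CaptureStarted and VAD has reported an active microphone. The helper name `watchForSpeech` is hypothetical. The `call.handleMicStatus(true)` call, the `CallEvents.MicStatusChange` event and its `active` field are assumptions about the platform API. The relative order of the two events is not known here, so the sketch accepts either order.

```javascript
// Hypothetical helper: calls onSpeech once per utterance, and only when
// both the ASR capture start and the VAD "mic active" state have been seen.
function watchForSpeech(call, asr, onSpeech) {
  let micActive = false;      // latest state reported by VAD
  let captureStarted = false; // ASR has reported CaptureStarted
  let reported = false;       // onSpeech already fired for this utterance

  function check() {
    if (micActive && captureStarted && !reported) {
      reported = true;
      onSpeech();
    }
  }

  // Assumed API: enables MicStatusChange events for this call.
  call.handleMicStatus(true);

  call.addEventListener(CallEvents.MicStatusChange, (e) => {
    micActive = Boolean(e.active);
    if (!micActive) {
      // The utterance is over: wait for a fresh pair of signals.
      captureStarted = false;
      reported = false;
    }
    check();
  });

  asr.addEventListener(ASREvents.CaptureStarted, () => {
    // A capture start caused by noise is held until VAD confirms speech.
    captureStarted = true;
    check();
  });
}
```

A typical use of `onSpeech` is to stop a prompt that is still playing, so that noise alone does not interrupt it.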