Real-time Speech-to-Text (STT) / Automatic speech recognition (ASR)

Voximplant lets you run voice recognition during a call and process the results in real time.

The automatic speech recognition module supports a wide range of languages and recognition models for the most popular use cases, such as audio from video or phone calls and the pronunciation of dates, numbers, or addresses.

You can use voice recognition to create interactive voice menus, transcribe calls, and even use artificial intelligence to detect whether a pre-recorded voice message or a real person is talking.

Voximplant voice activity detection helps to distinguish speech from background noise, and the profanity filter masks explicit speech with asterisks.

Phrase hints mode

By default, the result event triggers as soon as the voice is recognized. When you turn on the phrase hints mode, ASR compares the speech with a predefined array of commonly used phrases, and these phrases have a higher chance of being picked.

Recognition based on phrase hints is useful for building IVRs, where the expected user input is usually limited to a small number of variants.

Phrase hints

Phrase hints mode is available only for the Google profile. Phrase hints do not limit recognition to the specified list; instead, words in the list have a higher chance of being selected.
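As an illustration of the IVR use case, here is a minimal sketch of a scenario that passes menu phrases as hints and routes the caller by the recognized text. The menu, the queue names, and the helper functions buildPhraseHints and routeUtterance are hypothetical. The VoxEngine names used here (createASR, the phraseHints parameter, ASRProfileList.Google.en_US, ASREvents.Result) are assumptions to verify against the API reference. Because hints only raise the chance of a phrase being selected, the code also handles text that matches nothing on the menu.

```javascript
// Hypothetical IVR menu: expected spoken phrase -> destination queue.
const MENU = {
  'sales': 'sales_queue',
  'technical support': 'support_queue',
  'billing': 'billing_queue',
};

// The phrase hints are simply the phrases we expect callers to say.
function buildPhraseHints(menu) {
  return Object.keys(menu);
}

// Hints only bias recognition, so the recognized text can still be anything.
// Returns the matching destination, or null when no menu phrase was said.
function routeUtterance(text, menu) {
  const normalized = text.trim().toLowerCase();
  for (const phrase of Object.keys(menu)) {
    if (normalized.includes(phrase)) return menu[phrase];
  }
  return null;
}

// This part runs only inside a VoxEngine scenario.
// The API names below are assumptions; check them against the API reference.
if (typeof VoxEngine !== 'undefined') {
  VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
    const call = e.call;
    call.answer();
    call.addEventListener(CallEvents.Connected, () => {
      const asr = VoxEngine.createASR({
        profile: ASRProfileList.Google.en_US,
        phraseHints: buildPhraseHints(MENU),
      });
      asr.addEventListener(ASREvents.Result, (ev) => {
        const destination = routeUtterance(ev.text, MENU);
        Logger.write(destination ? `Routing to ${destination}` : 'No menu match');
      });
      // Start feeding the caller's audio to the recognizer.
      call.sendMediaTo(asr);
    });
  });
}
```

Keeping the matching logic in a plain function makes it possible to check the routing rules outside the platform before wiring them into a live call.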

You can find all the ASR properties in our API reference.

Beta features

Beta features can significantly extend your app's functionality, but they are still in development and can change.

To enable beta features, set the beta parameter to true.

For example, the beta features of the Google v1p1beta1 Speech API can add proper punctuation to the speech recognition results, save timestamps for every spoken word, or even process up to 3 different languages simultaneously.
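A minimal sketch of switching the beta flag on when building ASR parameters. The helper asrParameters is hypothetical, and the profile is passed in as a plain value so the snippet also runs outside VoxEngine. Only the beta parameter itself comes from the text above; the individual beta options are listed in the API reference and are not shown here.

```javascript
// Builds an ASR parameters object (hypothetical helper).
// `profile` would be a value such as ASRProfileList.Google.en_US in a scenario.
function asrParameters(profile, useBeta) {
  const params = { profile };
  if (useBeta) {
    params.beta = true; // opt in to beta features
  }
  return params;
}

// Placeholder profile string, used only so the example runs on its own.
const stable = asrParameters('Google.en_US', false);
const withBeta = asrParameters('Google.en_US', true);
```

In a scenario, the resulting object would be passed to the call that creates the ASR instance, assuming that call accepts a beta property as described above.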

You can find more beta features in our API reference.