AI & Natural Language Processing

Integrate telephony into your AI and NLP voicebots using SIP, WebSockets, or the Dialogflow connector. Automate routine tasks and improve the user experience.
We Can Help When You

  • Already have a voicebot and want to connect it to phone calls
  • Want to automate routine tasks
  • Don’t have a voicebot but are eager to build it
  • Want to upgrade your chatbot to a voicebot
  • Want to improve your live agent performance
Built-in AI for simple scenarios

Standard voicebots use AI only for speech synthesis and recognition. This works well for cases such as appointment reminders, order confirmations, and simple feedback collection.

You can choose the AI from one of our built-in providers.

Advanced NLP for intent recognition

Sometimes it’s not enough to just capture words that customers say. There are cases when you need to capture specific keywords from speech, recognize customer intent, and ask for missing information. This is where you need an advanced NLP voicebot.

We provide you with the Dialogflow connector
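
The "ask for missing information" step described above is often called slot filling. Below is a minimal sketch of the idea, not the Dialogflow API: the intent names, slot names, and prompts are hypothetical and only illustrate how a bot might decide what to ask next once an intent has been recognized.

```python
from typing import Optional

# Hypothetical intents and the slots each one needs before the bot can act.
REQUIRED_SLOTS = {
    "book_appointment": ["date", "time"],
    "check_order": ["order_id"],
}

# Hypothetical follow-up questions, one per slot.
PROMPTS = {
    "date": "What day works for you?",
    "time": "What time would you prefer?",
    "order_id": "What is your order number?",
}


def next_prompt(intent: str, slots: dict) -> Optional[str]:
    """Return the question for the first missing slot, or None if all are filled."""
    for name in REQUIRED_SLOTS.get(intent, []):
        if not slots.get(name):
            return PROMPTS[name]
    return None


# The caller said "book me in on Friday": the date is known, the time is not.
print(next_prompt("book_appointment", {"date": "Friday"}))
```

In a real deployment the NLP service performs both intent recognition and slot tracking; the sketch only shows the control flow that results.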

Easily Integrate Your Voicebots

SIP

A classic interface that does not require installing additional software, although it lacks flexibility. SIP media is carried over RTP, so you need to ensure that the data you exchange with your bot is RTP-compatible. You can connect to an IBM Watson voicebot using SIP.

WebSockets

A more flexible option that is compatible with all AI providers. WebSockets are advanced enough to support media streams, metadata, and control messages in the same channel.
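
To illustrate how media, metadata, and control messages can share one channel, here is a minimal sketch that tags every frame with a type. The JSON envelope, field names, and helper functions are assumptions made for this example; they are not Voximplant's wire format.

```python
import base64
import json


def encode_media(chunk: bytes, seq: int) -> str:
    """Wrap a raw audio chunk as a text frame; base64 keeps the bytes JSON-safe."""
    payload = base64.b64encode(chunk).decode("ascii")
    return json.dumps({"type": "media", "seq": seq, "payload": payload})


def encode_metadata(**fields) -> str:
    """Wrap call metadata (hypothetical fields such as call_id)."""
    return json.dumps({"type": "metadata", "fields": fields})


def encode_control(action: str) -> str:
    """Wrap a control message (hypothetical actions such as 'hangup')."""
    return json.dumps({"type": "control", "action": action})


def dispatch(frame: str):
    """Route one incoming frame by its 'type' field."""
    msg = json.loads(frame)
    kind = msg["type"]
    if kind == "media":
        return ("media", base64.b64decode(msg["payload"]))
    if kind == "metadata":
        return ("metadata", msg["fields"])
    if kind == "control":
        return ("control", msg["action"])
    raise ValueError(f"unknown frame type: {kind!r}")


# All three kinds of message travel through the same dispatch function.
for frame in (encode_media(b"\x00\x01", seq=1),
              encode_metadata(call_id="abc"),
              encode_control("hangup")):
    print(dispatch(frame))
```

A production integration would more likely send audio as binary frames to avoid the base64 overhead; the text envelope is used here only to keep the example short.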

Dialogflow connector

Google Dialogflow is one of the leading solutions for voicebots, and we offer a one-click integration with it. Its gRPC-based interface is often considered the most powerful option for recognizing customer intents.

Avoid Voicemail and Connect Agents with Real People

If you make outbound calls to deliver notifications or reach prospective customers, you probably want to know whether a person or a machine answers. If a call goes to voicemail, it's better to leave a pre-recorded message in the inbox.
When you make outbound calls, our answering machine detector checks whether a voicemail, an answering machine, or a live person is on the other end. This allows you to spend time only on engaged individuals.
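
The routing decision described above can be sketched as follows. The enum values and action names are hypothetical, not part of any Voximplant API, and treating an answering machine the same way as voicemail is an assumption of this example.

```python
from enum import Enum


class Answerer(Enum):
    """Possible detection results (hypothetical names)."""
    HUMAN = "human"
    VOICEMAIL = "voicemail"
    ANSWERING_MACHINE = "answering_machine"


def next_action(result: Answerer) -> str:
    """Pick what the outbound call should do after detection."""
    if result is Answerer.HUMAN:
        # A live person answered: hand the call to an agent.
        return "connect_agent"
    # Assumption: voicemail and answering machines both get the recorded message.
    return "play_recorded_message"


for result in Answerer:
    print(result.value, "->", next_action(result))
```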

Our Prices

  • Speech recognition: billed per 15 seconds. Tiers: Default, Enhanced.
  • Speech synthesis: billed per 10 characters. Tiers: Default, WaveNet, Neural; one option is priced by volume (up to 1M characters, over 1M characters).
  • Audio input/output: billed per 15 seconds or per 1 minute.

This table shows approximate prices. Contact our expert to get the final price.
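
As a rough illustration of how per-15-second and per-10-character billing adds up, the sketch below estimates the cost of one call. The rates are placeholders, not Voximplant prices, and rounding partial units up is an assumption; the source does not state the rounding rule.

```python
from decimal import Decimal


def billable_units(amount: int, unit_size: int) -> int:
    """Count billing units, rounding any partial unit up (assumption)."""
    return -(-amount // unit_size)  # integer ceiling division


def estimate_cost(call_seconds: int, tts_characters: int,
                  stt_rate: Decimal, tts_rate: Decimal) -> Decimal:
    """Estimate speech costs for one call.

    stt_rate is the price per 15 seconds of recognition,
    tts_rate is the price per 10 characters of synthesis.
    """
    stt_units = billable_units(call_seconds, 15)
    tts_units = billable_units(tts_characters, 10)
    return stt_units * stt_rate + tts_units * tts_rate


# Placeholder rates, chosen only to make the arithmetic easy to follow.
# 40 s of recognition -> 3 units; 125 characters of synthesis -> 13 units.
print(estimate_cost(40, 125, Decimal("0.01"), Decimal("0.002")))
```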

Key Features

Mobile, landline, virtual and toll-free phone numbers

Speech recognition/STT/ASR

Converts speech to text during a call or afterwards for further processing and analysis. Voximplant's STT supports 118 languages and dialects provided by Google Speech Cloud, Microsoft Azure STT, Amazon Transcribe, and Yandex Speech Cloud.

Speech synthesis/TTS

Converts text into a human-sounding voice in real time. Developers use TTS when building IVRs and other voice apps. Voximplant has 150 voice options provided by Google Speech Cloud, Amazon Polly, Yandex Speech Cloud, Microsoft Azure TTS, and Tinkoff VoiceKit.

Audio call recordings

Answering Machine Detection/AMD

AI-based tool that helps to recognize a voicemail when you call your customers. Our AMD system is pre-trained to ensure 99% detection accuracy.

Call transcriptions

Recognizes and transcribes calls; the results are saved to a text file at the end of the call.

Dialogflow connector

Allows users to connect a Dialogflow bot with Voximplant for inbound or outbound telephony apps such as Smart IVRs. Developers can also connect Dialogflow bots to PSTN or SIP and fine-tune their interactions in VoxEngine.