Standard voicebots use AI only for speech synthesis and recognition. This works well for cases such as appointment reminders, order confirmations, and simple feedback collection.
Synthesizes speech in 27 languages and provides 197 voice options, 100 of which are neural. Google’s technology is used in Google Assistant, Google Search, and Google Translate.
Provides 116 total voice options covering 35 languages and 49 dialects. These options include 36 neural options based on the latest deep learning technology.
Turns text into lifelike speech. Polly's text-to-speech service covers 18 languages and 58 voice options, including 13 neural ones.
Offers 41 voices, 19 of them neural, in 11 languages. The IBM Watson solution can learn from customer conversations.
Uses deep neural network models for speech recognition and synthesis and was used to create a financial voice assistant called Oleg. It’s available in Russian.
Lets you recognize speech or synthesize any text in 3 languages. SpeechKit is what powers Alice, the Yandex voice assistant.
Sometimes it’s not enough to just capture words that customers say. There are cases when you need to capture specific keywords from speech, recognize customer intent, and ask for missing information. This is where you need an advanced NLP voicebot.
Dialogflow ES is suitable for short conversations and has been praised for its simplicity. ES is often used in voice apps where a short utterance matches one intent. For instance, in a food delivery voice app you can say "I would like to order a pizza", and the voicebot will offer available pizzas and ask for the size, quantity, and delivery address.
Dialogflow CX is an advanced voicebot type that is suitable for long, complex conversations of over 10 minutes. CX has two key features: voicebots can transfer calls to live agents, and users can interrupt the voicebot so that it starts listening again.
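The food delivery example above, where the voicebot matches an intent and then asks for missing details, can be sketched as simple slot filling. This is only an illustrative sketch: the slot names, prompts, and the keyword-based `detect_intent` function are hypothetical stand-ins and do not reflect how any particular NLP engine works internally.

```python
from dataclasses import dataclass, field

# Hypothetical slots for the pizza-ordering example (illustrative names only).
REQUIRED_SLOTS = ("pizza", "size", "quantity", "address")

PROMPTS = {
    "pizza": "Which pizza would you like?",
    "size": "What size?",
    "quantity": "How many?",
    "address": "What is the delivery address?",
}


@dataclass
class Order:
    """Holds the details collected from the caller so far."""
    slots: dict = field(default_factory=dict)

    def missing(self):
        # Slots are requested in the order they appear in REQUIRED_SLOTS.
        return [name for name in REQUIRED_SLOTS if name not in self.slots]


def detect_intent(utterance: str) -> str:
    """Toy keyword matcher standing in for a real intent recognizer."""
    text = utterance.lower()
    if "order" in text and "pizza" in text:
        return "order_pizza"
    return "fallback"


def next_prompt(order: Order):
    """Return the question for the first missing slot, or None when complete."""
    missing = order.missing()
    return PROMPTS[missing[0]] if missing else None
```

With an empty `Order`, `next_prompt` asks for the pizza first. Once every slot in `REQUIRED_SLOTS` has a value it returns `None`, which is the point at which a bot could confirm the order.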
A classic interface that does not require installing additional programs, although it lacks flexibility. SIP is limited by RTP, so you need to ensure that the data you exchange with your bot is RTP-compatible. You can connect to an IBM Watson voicebot using SIP.
A more flexible option that is compatible with all AI providers. WebSockets are advanced enough to support media streams, metadata, and control messages in the same channel.
One of the leading solutions for voicebots is Google Dialogflow, and we offer a one-click integration with it. Dialogflow's gRPC-based interface is often considered the most powerful option for recognizing customer intents.
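To illustrate how a single WebSocket channel can carry media, metadata, and control messages together, here is a minimal sketch of a message envelope. The JSON layout, the `type` field, and the function names are assumptions made for illustration; they are not the wire format of any provider mentioned here. The sketch covers only encoding and dispatching, not the socket connection itself.

```python
import base64
import json


def encode_media(chunk: bytes, seq: int) -> str:
    """Wrap a raw audio chunk in a text frame (hypothetical envelope)."""
    payload = base64.b64encode(chunk).decode("ascii")
    return json.dumps({"type": "media", "seq": seq, "payload": payload})


def encode_metadata(**fields) -> str:
    """Wrap call metadata, such as a caller ID, in a text frame."""
    return json.dumps({"type": "metadata", "fields": fields})


def encode_control(action: str) -> str:
    """Wrap a control command, such as 'start' or 'stop', in a text frame."""
    return json.dumps({"type": "control", "action": action})


def dispatch(raw: str):
    """Decode one frame and return a (kind, value) pair based on its type."""
    message = json.loads(raw)
    kind = message["type"]
    if kind == "media":
        return kind, base64.b64decode(message["payload"])
    if kind == "metadata":
        return kind, message["fields"]
    if kind == "control":
        return kind, message["action"]
    raise ValueError(f"unknown message type: {kind}")
```

Because every frame names its own type, the receiver can route audio to the recognizer and commands to the call logic without opening a second connection.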
[Pricing table: provider names and prices not recovered. Billing units listed: per 15 seconds, per 10 characters, per 1 minute. Tiers listed: Default, Enhanced, WaveNet, Neural. Volume bands listed: up to 1M characters, over 1M characters.]
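The billing units listed above (per 15 seconds and per 10 characters) can be turned into a rough cost estimate. The sketch below assumes that usage is rounded up to the next whole billing unit, which is a common practice but is not confirmed here for any specific provider. The rates are hypothetical parameters, not actual prices.

```python
import math


def billable_units(amount: float, unit_size: float) -> int:
    """Number of billing units, assuming partial units are rounded up."""
    return math.ceil(amount / unit_size)


def estimate_cost(seconds: float, characters: int,
                  rate_per_15_seconds: float,
                  rate_per_10_characters: float) -> float:
    """Estimate the cost of one call; both rates are hypothetical inputs."""
    time_units = billable_units(seconds, 15)
    character_units = billable_units(characters, 10)
    return (time_units * rate_per_15_seconds
            + character_units * rate_per_10_characters)
```

Under this rounding assumption, a 31-second call counts as three 15-second units, and 95 characters count as ten 10-character units.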