We recently added IBM Watson™ Text-to-Speech to our list of speech synthesis engine options, expanding the number of voices you can dynamically synthesize as part of phone and web calls. This update brings 27 new, unique voices across 11 languages and 14 dialects. The new set also includes 18 of IBM Watson’s newer, more natural-sounding neural voices based on deep learning technology.

With our direct integration with IBM Watson Text-to-Speech (TTS), you can access these voices without having to touch the Watson APIs or create an IBM account. More voice options give you more opportunities to choose a voice that fits your brand and application requirements. IBM Watson TTS also includes unique SSML options for customizing the voice to your specific brand and customer needs.

With these recent additions, Voximplant now offers nearly 150 voices through leading speech engine providers like Google, Amazon, Yandex, and Tinkoff.

Check out the full list of IBM voice options available in Voximplant:

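| Code | Neural | Dialect | IBM Ver | Special features |
|---|---|---|---|---|
| ar_AR_Omar | | Arabic | 1 | |
| de_DE_Birgit | Yes | German | 3 | |
| de_DE_Dieter | Yes | German | 3 | |
| de_DE_Erika | Yes | German | 3 | |
| en_GB_Kate | Yes | British English | 3 | |
| en_US_Allison | Yes | American English | 3 | Expressive SSML, voice transformation |
| en_US_Emily | Yes | American English | 3 | |
| en_US_Henry | Yes | American English | 3 | |
| en_US_Kevin | Yes | American English | 3 | |
| en_US_Lisa | Yes | American English | 3 | Voice transformation |
| en_US_Michael | Yes | American English | 3 | Voice transformation |
| en_US_Olivia | Yes | American English | 3 | |
| es_ES_Enrique | Yes | Castilian Spanish | 3 | |
| es_ES_Laura | Yes | Castilian Spanish | 3 | |
| es_LA_Sofia | Yes | Latin American Spanish | 3 | |
| es_US_Sofia | Yes | American Spanish | 3 | |
| fr_FR_Renee | Yes | French | 3 | |
| it_IT_Francesca | Yes | Italian | 3 | |
| ja_JP_Emi | Yes | Japanese | 3 | |
| ko_KR_Youngmi | | Korean | 1 | |
| ko_KR_Yuna | | Korean | 1 | |
| nl_NL_Emma | | Dutch | 1 | |
| nl_NL_Liam | | Dutch | 1 | |
| pt_BR_Isabela | | Brazilian Portuguese | 3 | |
| zh_CN_LiNa | | Chinese (Mainland China) | 1 | |
| zh_CN_WangWei | | Chinese (Mainland China) | 1 | |
| zh_CN_ZhangJing | | Chinese (Mainland China) | 1 | |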

Using IBM Watson Voices in Voximplant is Easy

To use IBM Watson voices in Voximplant, simply choose one of the IBM voices from VoiceList and pass it as the language option, alongside any ttsOptions, when using call.say, an IVR prompt, or createTTSPlayer. The example below uses the neural version of IBM’s Allison voice with the call.say function:

// Pick your voice here
const ttsVoice = VoiceList.IBM.Neural.en_US_Allison
// The active call, assigned when an incoming call arrives
let call

function onCallConnected(e) {
  // Extract just the name portion from the voice without the prepended language code
  const ttsVoiceNameMatch = /_([A-Z][a-z]+)/.exec(ttsVoice.voice)
  const ttsVoiceName = ttsVoiceNameMatch ? ttsVoiceNameMatch[1] : "unknown"

  // The voice goes in "language"; engine-specific settings go in "ttsOptions"
  const sayOptions = { "language": ttsVoice, "ttsOptions": { "rate": "slow" } }
  // SSML is XML, so empty elements such as <break> must be self-closed
  call.say(`<speak>Hi, I am <emphasis>${ttsVoiceName}</emphasis> from <say-as interpret-as="letters">${ttsVoice.provider}</say-as>. <break time="1s"/> See <emphasis>Voximplant.com</emphasis> for more information on speech synthesis <break time="3s"/></speak>`, sayOptions)

  // End the session once the prompt has finished playing
  call.addEventListener(CallEvents.PlaybackFinished, () => {
    VoxEngine.terminate()
  })
}

VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  call = e.call
  call.addEventListener(CallEvents.Connected, onCallConnected)
  call.addEventListener(CallEvents.Disconnected, VoxEngine.terminate)
  call.answer()
})
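
Note that the regular expression in the example keeps only the first capitalized run after an underscore, so camel-cased names such as LiNa would be cut short. The sketch below illustrates that behavior and one possible alternative. It assumes the voice identifier is a plain string shaped like the voice codes listed above (for example en_US_Allison), which may differ from what ttsVoice.voice actually holds, and both helper names are hypothetical.

```javascript
// Hypothetical helpers; the input format is an assumption based on the voice codes listed above.

// Same pattern as the example: an underscore, one capital letter, then lowercase letters.
function nameByPattern(voiceCode) {
  const match = /_([A-Z][a-z]+)/.exec(voiceCode);
  return match ? match[1] : "unknown";
}

// Alternative: keep everything after the last underscore.
function nameAfterLastUnderscore(voiceCode) {
  const idx = voiceCode.lastIndexOf("_");
  return idx >= 0 && idx < voiceCode.length - 1 ? voiceCode.slice(idx + 1) : "unknown";
}

console.log(nameByPattern("en_US_Allison"));        // Allison
console.log(nameByPattern("zh_CN_LiNa"));           // Li
console.log(nameAfterLastUnderscore("zh_CN_LiNa")); // LiNa
```

If the spoken name matters for voices like LiNa, WangWei, or ZhangJing, the second approach may be the safer starting point.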

Accessing IBM Watson Voices in Voximplant Kit

IBM Watson voices are also available in Voximplant Kit. You will find IBM and the voices listed above in the Text to Speech, Interactive Menu, and Dialogflow Connector blocks.

Pricing

IBM Watson voices are priced at $25 USD per 1 million characters for both standard and neural voices. Billing is quantized in units of 10 characters. See our pricing page for more details.
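
As a rough illustration of how those two numbers combine, the sketch below estimates the cost of a prompt. It assumes that a partial unit is rounded up to the next multiple of 10 characters, which the pricing note implies but does not spell out, and the function names are hypothetical. The pricing page remains the authoritative source.

```javascript
const PRICE_PER_MILLION_CHARS_USD = 25; // from the pricing note above
const BILLING_UNIT_CHARS = 10;          // billing is quantized in units of 10 characters

// Assumption: a partial unit is rounded up to a full unit.
function billableChars(charCount) {
  return Math.ceil(charCount / BILLING_UNIT_CHARS) * BILLING_UNIT_CHARS;
}

// Estimated cost in USD for a prompt of the given length.
function estimateCostUsd(charCount) {
  return (billableChars(charCount) * PRICE_PER_MILLION_CHARS_USD) / 1000000;
}

console.log(billableChars(123));   // 130
console.log(estimateCostUsd(123)); // 0.00325
```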

IBM Watson Voice Features

Neural voices

As noted earlier, IBM Watson offers a neural voice option based on recent advancements in Deep Neural Network (DNN) technology for voices in many languages. Select from VoiceList.IBM.Neural in VoxEngine or from the neural options in Voximplant Kit. In most cases, you should use the neural voice option if it is available, unless you are using one of the advanced SSML features noted below.
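
That rule of thumb can be written as a small selection helper. This is only a sketch: pickIbmVoice is a hypothetical function, and the stand-in object mimics the VoiceList.IBM and VoiceList.IBM.Neural layout with placeholder values so the snippet runs outside VoxEngine. In a real scenario you would pass VoiceList.IBM instead.

```javascript
// Stand-in for VoiceList.IBM so the sketch runs outside VoxEngine (placeholder values).
const ibmVoices = {
  en_US_Allison: "standard Allison",
  ko_KR_Yuna: "standard Yuna",
  Neural: { en_US_Allison: "neural Allison" },
};

// Prefer the neural voice unless advanced SSML is needed or no neural version exists.
function pickIbmVoice(voices, code, needsAdvancedSsml = false) {
  const neural = voices.Neural ? voices.Neural[code] : undefined;
  if (neural !== undefined && !needsAdvancedSsml) {
    return neural;
  }
  return voices[code];
}

console.log(pickIbmVoice(ibmVoices, "en_US_Allison"));       // neural Allison
console.log(pickIbmVoice(ibmVoices, "en_US_Allison", true)); // standard Allison
console.log(pickIbmVoice(ibmVoices, "ko_KR_Yuna"));          // standard Yuna
```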

SSML

Voximplant passes along all SSML tags encoded in prompts sent to TTS engines, including IBM. Remember that SSML support details vary by speech engine. See IBM’s SSML support page for details and limitations. Voximplant also has a generic HowTo on using SSML in VoxEngine.

Enhanced SSML

Besides standard SSML features, IBM also offers some advanced speech modification options. Their Expressive SSML for VoiceList.IBM.en_US_Allison allows changing the speaking style to a good news, apologetic, or uncertain tone using a custom express-as tag, for example <express-as type="GoodNews">. Note that this does not work on the neural version of this voice. See IBM’s Expressive SSML guide for details.
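
To show how such a prompt might be assembled, here is a minimal sketch that wraps text in the express-as tag. The helper names are hypothetical, and only the GoodNews type is taken from this post, so confirm other type values against IBM’s Expressive SSML guide. The resulting string would be passed to call.say together with the standard (non-neural) Allison voice.

```javascript
// Escape characters that would otherwise break the SSML markup.
function escapeSsml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: wrap text in IBM's express-as tag.
function expressAs(type, text) {
  return `<speak><express-as type="${type}">${escapeSsml(text)}</express-as></speak>`;
}

const goodNewsPrompt = expressAs("GoodNews", "Your order shipped & arrives today!");
console.log(goodNewsPrompt);
// <speak><express-as type="GoodNews">Your order shipped &amp; arrives today!</express-as></speak>
```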
 
IBM also offers a <voice-transformation> element that can make larger changes to how a voice sounds. You can use it to create a voice that sounds very different from the base model, customized to your parameters: younger, breathier, or softer, along with many other fine-tuning options. See IBM’s documentation for details and their demo of this feature. Voice transformation is only available on the following voices:

 VoiceList.IBM.en_US_Allison
 VoiceList.IBM.en_US_Lisa
 VoiceList.IBM.en_US_Michael
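
A transformed prompt is built like any other SSML string. The sketch below is illustrative only: the type and strength attributes and the Young value are assumptions that should be checked against IBM’s documentation, and the helper name is hypothetical.

```javascript
// Hypothetical helper: wrap text in IBM's <voice-transformation> element.
// Assumption: the element accepts `type` and `strength` attributes; verify against IBM's docs.
function transformVoice(type, strengthPercent, text) {
  // Clamp the strength to a whole percentage between 0 and 100.
  const strength = Math.min(100, Math.max(0, Math.round(strengthPercent)));
  // Escape characters that would otherwise break the markup.
  const safeText = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  return `<speak><voice-transformation type="${type}" strength="${strength}%">${safeText}</voice-transformation></speak>`;
}

console.log(transformVoice("Young", 80, "Thanks for calling."));
// <speak><voice-transformation type="Young" strength="80%">Thanks for calling.</voice-transformation></speak>
```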

Ready to enhance your voice experience?

● Listen to sample voices from IBM here.