We recently added IBM Watson™ Text-to-Speech to our list of speech synthesis engine options, expanding the number of voices you can dynamically synthesize as part of phone and web calls. This update brings 27 new, unique voices across 11 different languages and 14 different dialects. This new set of voices also includes 18 of IBM Watson’s newer, more natural-sounding neural voices based on deep learning technology.
With our direct integration with IBM Watson Text-to-Speech (TTS), you can access these voices without having to touch the Watson APIs or set up an IBM account. More voice options give you more opportunities to choose a voice that fits your brand and application requirements. IBM Watson TTS also includes unique SSML options for customizing the voice to your specific brand and customer needs.
With these recent additions, Voximplant now offers nearly 150 voices through leading speech engine providers like Google, Amazon, Yandex, and Tinkoff.
Check out the full list of IBM voice options available in Voximplant:

| Code | Neural | Dialect | IBM Ver | Special features |
|---|---|---|---|---|
| ar_AR_Omar | | Arabic | 1 | |
| de_DE_Birgit | ✓ | German | 3 | |
| de_DE_Dieter | ✓ | German | 3 | |
| de_DE_Erika | ✓ | German | 3 | |
| en_GB_Kate | ✓ | British English | 3 | |
| en_US_Allison | ✓ | American English | 3 | ✓ |
| en_US_Emily | ✓ | American English | 3 | |
| en_US_Henry | ✓ | American English | 3 | |
| en_US_Kevin | ✓ | American English | 3 | |
| en_US_Lisa | ✓ | American English | 3 | ✓ |
| en_US_Michael | ✓ | American English | 3 | ✓ |
| en_US_Olivia | ✓ | American English | 3 | |
| es_ES_Enrique | ✓ | Castilian Spanish | 3 | |
| es_ES_Laura | ✓ | Castilian Spanish | 3 | |
| es_LA_Sofia | ✓ | Latin American | 3 | |
| es_US_Sofia | ✓ | American Spanish | 3 | |
| fr_FR_Renee | ✓ | French | 3 | |
| it_IT_Francesca | ✓ | Italian | 3 | |
| ja_JP_Emi | ✓ | Japanese | 3 | |
| ko_KR_Youngmi | | Korean | 1 | |
| ko_KR_Yuna | | Korean | 1 | |
| nl_NL_Emma | | Dutch | 1 | |
| nl_NL_Liam | | Dutch | 1 | |
| pt_BR_Isabela | | Brazilian Portuguese | 3 | |
| zh_CN_LiNa | | Mainland China | 1 | |
| zh_CN_WangWei | | Mainland China | 1 | |
| zh_CN_ZhangJing | | Mainland China | 1 | |

Using IBM Watson Voices in Voximplant is Easy
To use IBM Watson voices in Voximplant, simply choose one of the IBM voices from the VoiceList options and set it as the language value in the options you pass to call.say, an IVR prompt, or createTTSPlayer. The example below demonstrates how you’d use the neural version of IBM’s Allison voice with the call.say function:
// Pick your voice here
const ttsVoice = VoiceList.IBM.Neural.en_US_Allison
// Holds the incoming call so the handlers below can reach it
let call

function onCallConnected(e) {
  // Extract just the name portion of the voice, without the prepended language code
  const ttsVoiceNameMatch = /_([A-Z][a-z]+)/.exec(ttsVoice.voice)
  const ttsVoiceName = ttsVoiceNameMatch ? ttsVoiceNameMatch[1] : "unknown"
  const sayOptions = { language: ttsVoice, ttsOptions: { rate: "slow" } }
  // End the session once the prompt has finished playing
  call.addEventListener(CallEvents.PlaybackFinished, () => {
    VoxEngine.terminate()
  })
  // SSML is XML, so empty elements such as <break> are written self-closed
  call.say(`<speak>Hi, I am <emphasis>${ttsVoiceName}</emphasis> from <say-as interpret-as="letters">${ttsVoice.provider}</say-as>. <break time="1s"/> See <emphasis>Voximplant.com</emphasis> for more information on speech synthesis <break time="3s"/></speak>`, sayOptions)
}

VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  call = e.call
  call.addEventListener(CallEvents.Connected, onCallConnected)
  call.addEventListener(CallEvents.Disconnected, VoxEngine.terminate)
  call.answer()
})
Accessing IBM Watson Voices in Voximplant Kit
IBM Watson voices are also available in Voximplant Kit. You will find IBM and the voices listed above in the Text to Speech, Interactive Menu, and Dialogflow Connector blocks.
Pricing
IBM Watson voices are priced at $25 USD per 1M chars for both standard and neural voices. Billing is quantized in units of 10 characters. See our pricing page for more details.
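To get a feel for what the 10-character quantization means in practice, here is a rough cost estimate in JavaScript. It is a sketch under two assumptions that the text above does not confirm: each synthesis request is billed on its own, and a partial 10-character unit is rounded up. The function names are hypothetical, and whether SSML markup counts toward the character total is not covered here, so check the pricing page before relying on the numbers.

```javascript
// Figures taken from the pricing paragraph above
const PRICE_PER_MILLION_CHARS_USD = 25
const BILLING_UNIT_CHARS = 10

// Assumption: a partial unit is rounded up to the next multiple of 10
function billedCharacters(charCount) {
  return Math.ceil(charCount / BILLING_UNIT_CHARS) * BILLING_UNIT_CHARS
}

// Estimated cost in USD for a single prompt of `charCount` characters
function estimateIbmTtsCostUsd(charCount) {
  return (billedCharacters(charCount) * PRICE_PER_MILLION_CHARS_USD) / 1000000
}

// Example: a 123-character prompt is billed as 130 characters
const prompt = "x".repeat(123)
console.log(billedCharacters(prompt.length), estimateIbmTtsCostUsd(prompt.length))
```

Under these assumptions, many very short prompts cost slightly more than the same text sent as one longer prompt, because each request is rounded up separately.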
IBM Watson Voice Features
Neural voices
As noted earlier, IBM Watson offers a neural voice option, based on recent advancements in Deep Neural Network (DNN) technology, for voices in many languages. Select from VoiceList.IBM.Neural in VoxEngine or from the neural options in Voximplant Kit. In most cases, you should use the neural voice if one is available, unless you are using one of the advanced SSML features noted below.
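The "neural unless you need advanced SSML" rule can be captured in a small helper. This is only a sketch: pickIbmVoice is a hypothetical function, and it assumes the object you pass in is shaped like VoxEngine's VoiceList, with standard voices under IBM and neural ones under IBM.Neural. It has been exercised only against a mock object, not the real VoiceList.

```javascript
// Hypothetical helper: prefer the neural variant of an IBM voice unless the
// caller needs the enhanced SSML features, which require the standard voice.
// `voiceList` is assumed to be shaped like VoxEngine's VoiceList.
function pickIbmVoice(voiceList, code, needsEnhancedSsml = false) {
  const ibm = voiceList.IBM || {}
  const standard = ibm[code]
  const neural = (ibm.Neural || {})[code]
  const chosen = needsEnhancedSsml ? standard : (neural || standard)
  if (!chosen) {
    throw new Error(`No suitable IBM voice found for ${code}`)
  }
  return chosen
}

// In a VoxEngine scenario you might call it like this:
// const voice = pickIbmVoice(VoiceList, "en_US_Allison")
// call.say("Hello!", { language: voice })
```

Throwing on a missing voice, rather than silently falling back to another language, keeps a typo in the voice code from going unnoticed.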
SSML
Voximplant passes along all SSML tags encoded in prompts sent to TTS engines, including IBM. Remember that SSML support details vary by speech engine. See IBM’s SSML support page for details and limitations. Voximplant also has a generic HowTo on using SSML in VoxEngine.
Enhanced SSML
Besides standard SSML features, IBM also offers some advanced speech modification options. Its Expressive SSML for VoiceList.IBM.en_US_Allison lets you change the speaking style to a good news, apologetic, or uncertain tone using a custom express-as tag, for example <express-as type="GoodNews">. Note that this does not work on the neural version of this voice. See IBM’s Expressive SSML guide for details.
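As an illustration, the express-as tag can be wrapped around a prompt with a small helper. This is a minimal sketch: expressAs and escapeXml are hypothetical names, and only the GoodNews type is taken from the text above. Check IBM's Expressive SSML guide for the exact type names of the other styles.

```javascript
// Escape characters that would otherwise break the SSML (XML) markup
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
}

// Hypothetical helper: wrap plain text in IBM's express-as tag
function expressAs(type, text) {
  return `<speak><express-as type="${type}">${escapeXml(text)}</express-as></speak>`
}

const ssml = expressAs("GoodNews", "Your order has shipped & is on its way!")
console.log(ssml)

// In a VoxEngine scenario, use the standard (non-neural) Allison voice:
// call.say(ssml, { language: VoiceList.IBM.en_US_Allison })
```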
IBM also offers a <voice-transformation> element that can make larger changes to how a voice sounds. This feature can be used to create a voice that sounds very different from the base model, customized to your parameters. You can make a voice sound younger, breathier, or softer, along with many other fine-tuning options. See IBM’s documentation for details and their demo of this feature. Voice transformation is only available on the following voices:
VoiceList.IBM.en_US_Allison
VoiceList.IBM.en_US_Lisa
VoiceList.IBM.en_US_Michael
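Here is a sketch of how a transformation might be applied while guarding against unsupported voices. The helper name transformVoice is hypothetical. The type="Young" value and the strength attribute reflect our reading of IBM's documentation and are not confirmed by this post, so verify them there before use.

```javascript
// Voices that support <voice-transformation>, per the list above
const TRANSFORMABLE_VOICES = ["en_US_Allison", "en_US_Lisa", "en_US_Michael"]

// Hypothetical helper: build SSML with a built-in transformation.
// Assumption: IBM accepts type="Young" or type="Soft" with a strength percentage.
function transformVoice(voiceCode, type, strengthPercent, text) {
  if (!TRANSFORMABLE_VOICES.includes(voiceCode)) {
    throw new Error(`${voiceCode} does not support voice transformation`)
  }
  // Escape characters that would break the XML markup
  const safeText = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
  return `<speak><voice-transformation type="${type}" strength="${strengthPercent}%">${safeText}</voice-transformation></speak>`
}

const youngerLisa = transformVoice("en_US_Lisa", "Young", 80, "Hi, thanks for calling!")
console.log(youngerLisa)

// In a VoxEngine scenario, pair it with the matching standard voice:
// call.say(youngerLisa, { language: VoiceList.IBM.en_US_Lisa })
```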
Ready to enhance your voice experience?
● Listen to sample voices from IBM here.