In this quickstart, you will learn how to build a first-line support robot for a restaurant. The robot will handle:
- Questions about opening hours (intent: “openHours”)
- Questions about delivery options (intent: “delivery”)
- Reservation requests (intent: “reservation”)
Create an avatar
Go to the Avatars section of the control panel and click the Create button. Enter your avatar name and click Create again.
Now you have an empty avatar with predefined basic intents and a basic dialog scenario.
Set up intents
To add an intent, click the Add intent button, type “openHours” in the modal window, and click Create:
Now you can set up your intent by adding phrases a user might say and default responses to those phrases.
Add 10-20 training phrases, for example:
- What are your opening hours?
- Do you work tomorrow?
- How late are you open until tonight?
Then go to the Responses tab and add a default response: "We are open every day from 7 am till 9 pm."
Repeat this procedure for the “delivery” and “reservation” intents. To avoid doing this manually, you can import all the data from this file by clicking the Import button.
When you add the data, you will see the “Training required” hint. This means that the neural network needs to be retrained with the new data, so click the hint and wait until the training finishes.
Your avatar is ready to handle natural speech, so let’s jump into building a scenario.
Write a dialog scenario
Go to the Dialog scenario tab. Here you will find the default conversation scenario; replace it with your own. Let’s start with a very simple one-state dialog.
In this scenario, the avatar greets the user upon entering the state. If the user says something that matches one of the trained intents, the avatar responds with the default response defined in the UI in the previous step. If the intent is unknown, it replies that it can’t get that and continues listening.
addState registers a dialog state. The dialog can be in only one state at a time. A state defines the avatar’s reactions to user input.
The onEnter handler is called when the dialog enters a state. It is like a doorway where you can greet the user.
The onUtterance handler is called when the user says something to the avatar while the dialog is in the given state. Here you can check the user’s intent and any extracted entities, and then respond.
Response creates a special object that defines the avatar’s reaction when a handler is triggered. With it, you can return the utterance to send to the user, set the listen flag specifying whether the avatar should listen for further user input, and set nextState to indicate the state the dialog should jump to.
setStartState defines the dialog entry point.
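To make these pieces concrete, here is a minimal sketch of such a one-state scenario. The names `addState`, `Response`, and `setStartState` come from the quickstart above, but the stub implementations and the exact field names (`utterance`, `listen`, `event.intent`, `event.response`) are assumptions for local illustration; in a real avatar scenario, the platform provides these functions.

```javascript
// Minimal local stubs so the sketch runs outside the platform; in a real
// avatar scenario, addState, Response, and setStartState are provided for you.
const states = new Map();
let startState = null;
const addState = (state) => states.set(state.name, state);
const setStartState = (name) => { startState = name; };
const Response = (options) => options;

addState({
  name: 'start',
  // Greet the user when the dialog enters the state and keep listening.
  onEnter: async () => Response({ utterance: 'Hi! How can I help you?', listen: true }),
  // React to user utterances while the dialog is in this state.
  onUtterance: async (event) => {
    if (['openHours', 'delivery', 'reservation'].includes(event.intent)) {
      // event.response is assumed to carry the default response set up in the UI.
      return Response({ utterance: event.response, listen: true });
    }
    // Unknown intent: apologize and continue listening.
    return Response({ utterance: "Sorry, I can't get that", listen: true });
  },
});

setStartState('start');
```

The dialog loops in this single state forever: every utterance either gets a default response or the fallback reply, which is exactly the infinity problem addressed next.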
Time to test your avatar! To do this, save your scenario and click Debug. This dialog box pops up:
Type some questions and check if your avatar responds correctly:
Everything works, but you need to add exit points so the scenario does not run forever. For example, the dialog should end:
- If the avatar has failed to understand the user three times in a row.
- When the avatar asks “Can we help you with anything else?” and the user says “no”.
With these exit points, the scenario becomes a bit more complex. Here is what you need to do:
- Check if the entry state is entered for the first time. If yes, let your avatar greet the user; if no, let it ask if the user needs help with something else. For this purpose, use visitsCounter and increment it every time the dialog enters the state.
- In the onUtterance handler, add a branch to handle the “yes” and “no” intents in reply to “Can we help you with anything else?”. Depending on the response, finish the dialog (by going to the final state) or continue the conversation.
- Every time the user says something unrecognized while the dialog stays in the same state, increment utteranceCounter; reset it when leaving the state. If the user keeps saying something unknown, the counter grows, and when it reaches 3, the avatar stops the dialog.
- Entering the final state terminates the dialog. At this point, you can pass additional information to the scenario, for example a custom key such as needRedirectionToOperator.
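The steps above can be sketched as follows. The scenario API is again stubbed so the snippet is self-contained; the `nextState` and `customData` fields, and the `yes`/`no` intent names, are assumptions following the description above rather than confirmed platform signatures.

```javascript
// Local stubs for the scenario API; the platform provides the real ones.
const states = new Map();
const addState = (state) => states.set(state.name, state);
const setStartState = () => {};
const Response = (options) => options;

let visitsCounter = 0;     // how many times the dialog entered the start state
let utteranceCounter = 0;  // misunderstood utterances in a row

addState({
  name: 'start',
  onEnter: async () => {
    visitsCounter += 1;
    // Greet on the first visit; otherwise ask whether more help is needed.
    const utterance = visitsCounter === 1
      ? 'Hi! How can I help you?'
      : 'Can we help you with anything else?';
    return Response({ utterance, listen: true });
  },
  onUtterance: async (event) => {
    if (event.intent === 'no') {
      // Nothing else is needed: exit through the final state.
      utteranceCounter = 0;
      return Response({ utterance: 'Goodbye!', nextState: 'final' });
    }
    if (event.intent === 'yes') {
      utteranceCounter = 0;
      return Response({ utterance: 'What can I do for you?', listen: true });
    }
    // Unknown intent: count it, and give up after three misses in a row.
    utteranceCounter += 1;
    if (utteranceCounter >= 3) {
      return Response({
        nextState: 'final',
        // Hypothetical way to pass a custom key out of the scenario.
        customData: { needRedirectionToOperator: true },
      });
    }
    return Response({ utterance: "Sorry, I can't get that", listen: true });
  },
});

// Entering the final state terminates the dialog.
addState({ name: 'final', onEnter: async () => Response({}) });
setStartState('start');
```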
Now your avatar can understand user intents and respond to them, so it is time to teach it how to book tables.
Track the conversation context alongside the conversation state when filling out the reservation form. To do this, declare the reservationForm object at the very beginning of the scenario. Use it to store all the collected information about the reservation.
Since the user can request a reservation both with “Can I book a table?” (no parameters) and with “Can I book a table for two?” (people: 2), start collecting the extracted entities (date and number of people) from the very first phrase, in the onUtterance handler of the start state. Store the collected info in the reservation form.
In the new reservation state, formulate a question to the user depending on what is already in the form. If the user does not answer and the weirdUtterancesInRow counter reaches its limit, stop trying to complete the form and return to the start state.
In the onUtterance handler of the reservation state, check if the requested information is given through entities. If yes, add it to the form and continue the loop. If not, increment the weirdUtterancesInRow counter to avoid getting stuck in an endless loop.
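These two steps, asking for the missing information and guarding against an endless loop, can be sketched as plain functions. The entity names (`date`, `numberOfPeople`), the form fields, and the return markers (`'confirm'`, `'backToStart'`) are illustrative assumptions; match them to the entities your avatar actually extracts and to your real state names.

```javascript
// The form declared at the very beginning of the scenario; it accumulates
// everything collected about the reservation across states.
const reservationForm = { date: null, people: null };
let weirdUtterancesInRow = 0;

// Merge entities extracted from a user utterance into the form.
// Returns true if anything new was filled in.
function fillForm(form, entities = {}) {
  let filled = false;
  if (entities.numberOfPeople) { form.people = entities.numberOfPeople; filled = true; }
  if (entities.date) { form.date = entities.date; filled = true; }
  return filled;
}

// Formulate the next question based on what is still missing;
// null means the form is complete and the dialog can move to confirmation.
function nextQuestion(form) {
  if (form.people === null) return 'For how many people?';
  if (form.date === null) return 'For which date?';
  return null;
}

// Core of the reservation state's onUtterance handler: either advance the
// form or count a "weird" utterance so the dialog never loops forever.
function handleReservationUtterance(entities) {
  if (fillForm(reservationForm, entities)) {
    weirdUtterancesInRow = 0;
    return nextQuestion(reservationForm) ?? 'confirm';
  }
  weirdUtterancesInRow += 1;
  return weirdUtterancesInRow >= 3 ? 'backToStart' : nextQuestion(reservationForm);
}
```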
At the same time, thanks to the intent check in the onUtterance handler, the avatar can keep answering clarifying questions without interrupting the form-filling process.
In the reservationConfirm state, double-check the provided information with the user. If everything is OK, return to the start state, where the avatar asks whether it can help with anything else.
The final state passes the information from the form to the scenario. Alternatively, you can send it to your CRM/backend right from the avatar scenario by calling the httpRequest method.
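For example, the final state might forward the completed form roughly like this. `httpRequest` is the method named above, but it is stubbed here so the sketch runs anywhere; its real options shape, and the endpoint URL, are assumptions to adapt to your backend.

```javascript
// Stubbed httpRequest that records calls locally; the platform provides a
// real httpRequest method in the scenario (its exact options are assumed here).
const sentRequests = [];
const httpRequest = async (url, options) => {
  sentRequests.push({ url, options });
  return { code: 200 };
};

const reservationForm = { date: 'tomorrow', people: 2 };

// onEnter handler of the final state: push the completed form to a
// CRM/backend. The URL is a placeholder.
async function onFinalEnter() {
  const result = await httpRequest('https://example.com/api/reservations', {
    method: 'POST',
    headers: ['Content-Type: application/json'],
    postData: JSON.stringify(reservationForm),
  });
  return result.code === 200;
}
```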
Now, you have a fully functional bot with a complex scenario that supports a flexible and natural flow of dialog. Let’s integrate it with telephony.
Integrate with telephony and chat
You have two integration options:
- Using the Avatar class gives you fine-grained control over the integration.
- Using the VoiceAvatar class gives you ready-made ASR and TTS integration with telephony on top of the avatar logic.
Let’s implement the second option.
To do that:
- Copy the code from the Integration section of your avatar.
- Create a platform scenario and paste the code there.
- Specify ASR and TTS options in the configuration step (select a language and voice).
- Set everything up to handle calls via this scenario and test it.
Congratulations! Now you can call your avatar and have a voice conversation with it.
Refer to the AvatarEngine and VoxEngine.VoximplantAvatar API references to create a more complex solution.