Since Voximplant supports the WebSocket protocol, you can enrich your NLP/NLU experience with additional services. This article shows how to use the WebSocket module in your scenarios, taking a third-party NLP/NLU service as an example.
The Voximplant cloud opens an outgoing WebSocket connection and sends audio through it. The connection is opened to a backend server, which in turn exchanges data with an NLP/NLU service.
The scenario should look like this:
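A minimal sketch of such a scenario is shown below. It answers an incoming call, opens an outgoing WebSocket to the backend, and streams the call audio into it. The wss:// URL is a placeholder for your public backend address, and the exact method and event names should be checked against the VoxEngine WebSocket module reference:

```javascript
// Hedged sketch of a VoxEngine scenario (runs in the Voximplant cloud).
require(Modules.WebSocket);

VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  e.call.answer();
  e.call.addEventListener(CallEvents.Connected, () => {
    // Placeholder URL: substitute the public address of your backend here
    const webSocket = VoxEngine.createWebSocket("wss://<your-public-backend-host>");
    // Stream the caller's audio over the open socket to the backend
    e.call.sendMediaTo(webSocket);
  });
});
```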
In your application, create a rule (to enable proper scenario execution) and a user (you will need it later to log in to a web phone).
Backend Server (Node.js implementation)
The backend server acts as an intermediary between the Voximplant cloud and an external speech recognition service; in our case, that is the Google Cloud Speech-to-Text API. The backend accepts audio from Voximplant, parses it, and sends it in base64 format to Google:
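The "parse and send in base64" step can be sketched in isolation. The helper below is hypothetical (not part of the server code referenced above); it shows how a binary audio chunk received over the WebSocket would be re-encoded before being placed into a request to Google:

```javascript
// Hypothetical helper: re-encode a raw binary audio chunk (a Node.js
// Buffer received from the WebSocket) as the base64 string expected
// in a Speech-to-Text request payload.
function toBase64Chunk(binaryChunk) {
  return Buffer.from(binaryChunk).toString("base64");
}

// Example: four arbitrary bytes standing in for audio data
const chunk = Buffer.from([0x52, 0x49, 0x46, 0x46]);
console.log(toBase64Chunk(chunk)); // → "UklGRg=="
```

Note that when you use the `@google-cloud/speech` client library in streaming mode, you can pass raw bytes directly; explicit base64 encoding is needed when you build the JSON request payload yourself.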
As the server code uses the ws and @google-cloud/speech packages, you must install them before running this code.
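The installation is a single command (package names are taken from the text above):

```shell
npm install ws @google-cloud/speech
```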
You also have to obtain service account credentials and provide them so the Google client library can authenticate with its backend. Run this export command in the same workspace (the same Terminal tab) before starting the server:
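The command looks like this; the key file path is a placeholder for the service account JSON key you downloaded from Google Cloud:

```shell
# Placeholder path: point this at your downloaded service-account JSON key
export GOOGLE_APPLICATION_CREDENTIALS="./service-account-key.json"
```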
Finally, expose your locally running server to the Internet via the ngrok utility. It generates a unique public URL that you should substitute for the example value in your Voximplant scenario, line 7:
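Exposing the server is a single command; the port number below is an assumption and must match the port your backend actually listens on:

```shell
# Expose the local backend (assumed to listen on port 3000) to the Internet;
# ngrok prints the public URL to copy into the scenario
ngrok http 3000
```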
How to See It in Action
Log in to a web phone (e.g., https://phone.voximplant.com/) as a user of your Voximplant application, click Call, and start talking. You'll see the transcription results in your Terminal window in real time.
Details on how the code of both the JS scenario and the backend server works are given in the WebSocket integration section.