WebSocket protocol

WebSocket is a standard protocol for full-duplex (two-way) communication between a client and a third-party service in real time. It keeps a single connection open for continuous data exchange, so no extra HTTP requests are needed.

Our platform provides the WebSocket module to connect with other web services. With this module, you can create outgoing and accept incoming WebSocket connections and easily send data through them.

Outbound connection

The first thing you should do is create a WebSocket object via the VoxEngine.createWebSocket method. It accepts two parameters: a URL in the format 'wss://' + domain + path, and protocols (optional).

Then you can send data through the WebSocket with the call.sendMediaTo method. For audio, you can set an encoding format, a tag, and custom parameters. If you do not set an encoding, PCM8 is selected by default. The WebSocket.send method, in turn, sends text data, such as a JSON string, through the WebSocket. The service that handles your requests can send messages back over the same connection.

The WebSocket.close method closes the connection. Note that the connection can be closed from either the client side or the server side.
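
The outbound steps above can be sketched roughly as follows. This is a minimal sketch, not the platform's reference listing: `engine` stands in for the VoxEngine global and `events` for the WebSocket event constants, and both are passed in as parameters so the function can be tried outside a scenario. The event names `OPEN` and `MESSAGE`, the `addEventListener` method on the WebSocket object, the `text` field of a message event, and the tag value are assumptions made for illustration.

```javascript
// Sketch under assumptions: `engine` mimics VoxEngine, `events` holds
// hypothetical event-name constants (OPEN, MESSAGE).
function openOutbound(engine, events, call, url, onMessage) {
  // The second createWebSocket argument (protocols) is optional and omitted here.
  const webSocket = engine.createWebSocket(url);

  webSocket.addEventListener(events.OPEN, () => {
    // Stream call audio to the service; with no encoding set, the default applies.
    call.sendMediaTo(webSocket, { tag: "outgoing" });
    // Text goes through send(); here it is a small JSON string.
    webSocket.send(JSON.stringify({ type: "hello" }));
  });

  webSocket.addEventListener(events.MESSAGE, (event) => {
    // `event.text` is an assumed field name for the received text.
    onMessage(event.text);
  });

  return webSocket;
}
```

The returned object can later be closed with its close method. Waiting for the open event before sending is a design choice of this sketch, so that nothing is written to a connection that is not yet established.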

See the picture below to learn how it all works:

[Figure: Outbound WebSocket]

The code for connecting to a web service looks like this:

[Code listing: Outbound WebSocket]

Inbound connection

To make incoming WebSocket connections available, call the VoxEngine.allowWebSocketConnections method. Then subscribe to the AppEvents.WebSocket event. You will receive this event every time a connection is made to the session URL, and the event carries the WebSocket object: event.WebSocket.

The session URL can be obtained from the API response of the StartScenarios method or directly from the AppEvents.Started event. Note that 'https' must be changed to 'wss' in the URL.
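
As a small illustration of the scheme change described above, the helper below rewrites an 'https' session URL into its 'wss' form. The function name and the example URL are hypothetical; a real session URL comes from StartScenarios or AppEvents.Started.

```javascript
// Replace the leading https:// scheme with wss://, leaving the rest untouched.
function toWebSocketUrl(sessionUrl) {
  const pattern = /^https:\/\//i;
  if (!pattern.test(sessionUrl)) {
    throw new Error("Expected a session URL starting with https://");
  }
  return sessionUrl.replace(pattern, "wss://");
}

// Hypothetical session URL, for illustration only.
const exampleWssUrl = toWebSocketUrl("https://media.example.com/session/abc123");
```

Rejecting anything that does not start with 'https://' keeps a mistyped or already converted URL from passing through silently.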

Once the connection is established, you can send data through the WebSocket with the call.sendMediaTo method.

See the picture below to learn how it all works:

[Figure: Inbound WebSocket]

The sample code for accepting incoming connections looks like this:

[Code listing: Inbound WebSocket]

Please note

The number of incoming WebSocket connections cannot exceed the number of calls in one session + 3. An attempt to open one more connection results in an error and triggers the NewWebSocketFailed event. Existing connections are not destroyed after a call ends.
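
The inbound flow and the connection limit can be sketched as follows. This is a minimal sketch under stated assumptions: `engine` stands in for the VoxEngine global and `appEvents` for AppEvents, both passed in so the function can be tried outside a scenario. Exposing NewWebSocketFailed as a member of `appEvents`, the lowercase `websocket` fallback, the tag value, and the function names are assumptions, not confirmed platform details.

```javascript
// Sketch under assumptions: `engine` mimics VoxEngine, `appEvents` mimics AppEvents.
function acceptInbound(engine, appEvents, call, log) {
  // Incoming connections are refused until this is called.
  engine.allowWebSocketConnections();

  engine.addEventListener(appEvents.WebSocket, (event) => {
    // The article names the property event.WebSocket; the lowercase
    // fallback is an assumption in case the platform spells it differently.
    const webSocket = event.WebSocket ?? event.websocket;
    call.sendMediaTo(webSocket, { tag: "inbound" });
  });

  engine.addEventListener(appEvents.NewWebSocketFailed, () => {
    log("Incoming WebSocket connection was rejected");
  });
}

// Connection limit stated above: number of calls in the session + 3.
function maxInboundConnections(callsInSession) {
  return callsInSession + 3;
}
```

For a session with two calls, `maxInboundConnections(2)` gives 5, so a sixth connection attempt would be the one that fails.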

Sending audio and text data via WebSocket

To send text via WebSocket, use the WebSocket.send method:

[Code listing: Sending text]

Use an echo server to check that everything works properly:

[Code listing: Server code]
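
A minimal echo handler might look like the sketch below. It is written against a generic socket shape, an object with `on("message", callback)` and `send(data, options)`, which is how sockets in the `ws` package for Node.js behave. Using `ws` on the server is an assumption here; the article does not name a server library.

```javascript
// Echo every incoming message back to the sender, keeping its frame type.
// `socket` is assumed to expose on("message", cb) and send(data, options).
function attachEcho(socket) {
  socket.on("message", (data, isBinary) => {
    // Text frames are returned as text, binary frames as binary.
    socket.send(data, { binary: Boolean(isBinary) });
  });
}
```

With `ws`, such a handler would typically be registered for each new connection on the server object. Text sent with WebSocket.send from the scenario should then come back unchanged.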

To send audio via WebSocket, use the call.sendMediaTo method. Here you can set a preferred encoding format, a tag, and some custom parameters. If you do not set an encoding, PCM8 is selected by default.

[Code listing: Sending audio to WebSocket]

Sending audio from a WebSocket to a call is also possible. For example, audio sent through the WebSocket to an echo server can be routed back to the call like this: webSocket.sendMediaTo(call, { "tag": "incoming" }). The tag parameter (an arbitrary value) lets you select one audio stream out of several sent simultaneously.

[Code listing: Sending audio from WebSocket to a call]

Write some server code to receive audio:

[Code listing: Server code]
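
On the receiving side, a server has to interpret the start, media, and stop messages of the audio protocol described later in this article. The sketch below collects decoded audio per tag. It assumes a Node.js server (it uses the built-in `Buffer`) and assumes each message arrives as one JSON text frame. The name `createAudioReceiver` is hypothetical.

```javascript
// Collects base64-decoded audio per tag from protocol messages.
function createAudioReceiver() {
  const streams = new Map(); // tag -> { format, chunks, stopInfo }

  return {
    // Process one JSON text message and return its event name.
    handle(text) {
      const msg = JSON.parse(text);
      if (msg.event === "start") {
        streams.set(msg.start.tag, {
          format: msg.start.mediaFormat,
          chunks: [],
          stopInfo: null,
        });
      } else if (msg.event === "media") {
        const stream = streams.get(msg.media.tag);
        // Media for a tag that never started is ignored in this sketch.
        if (stream) stream.chunks.push(Buffer.from(msg.media.payload, "base64"));
      } else if (msg.event === "stop") {
        const stream = streams.get(msg.stop.tag);
        if (stream) stream.stopInfo = msg.stop.mediaInfo;
      }
      return msg.event;
    },
    // All audio bytes received so far for a tag, in arrival order.
    audio(tag) {
      const stream = streams.get(tag);
      return stream ? Buffer.concat(stream.chunks) : Buffer.alloc(0);
    },
    // mediaInfo from the stop message, or null if the stream is still open.
    stopInfo(tag) {
      const stream = streams.get(tag);
      return stream ? stream.stopInfo : null;
    },
  };
}
```

Once a stop message arrives, comparing `audio(tag).length` with the reported `bytesSent` is one way to check that no media message was lost.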

Here is the protocol for transmitting audio data via WebSocket (it works both ways, so use it when you send audio to WebSocket and from WebSocket to a call):

  1. Data stream description
{
  "event": "start", // audio stream start
  "sequenceNumber": "0", // message counter
  "start": {
    "tag": "incoming", // tag passed to sendMediaTo
    "mediaFormat": {
      "encoding": "audio/x-mulaw",
      "sampleRate": 8000,
      "channels": 1
    },
    "customParameters": {
      "text1": "12312" // any text
    }
  }
}
  2. Data stream termination
{
  "event": "stop",
  "sequenceNumber": "777",
  "stop": {
    "tag": "incoming", // tag passed to sendMediaTo
    "mediaInfo": {
      "bytesSent": 21100, // audio bytes before base64
      "duration": 124 // sec
    }
  }
}
  3. Data stream format
{
  "event": "media",
  "sequenceNumber": "2",
  "media": {
    "tag": "incoming",
    "chunk": "1",      // message count within one tag
    "timestamp": "5",  // to synchronize audio streams if needed
    "payload": "no+JhoaJjpzSHxAKBgYJ...=="
  }
}
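
To show how the three message types fit together when a server sends audio toward a call, here is a minimal sketch of a message builder. It assumes Node.js (for `Buffer`). The counter bases, with sequenceNumber and chunk both starting at 0, are inferred from the sample messages above and are not confirmed. The timestamp unit is left to the caller because the article does not state it. The name `createAudioStream` is hypothetical.

```javascript
// Builds start, media, and stop messages for one tagged audio stream.
// Assumption: sequenceNumber and chunk both count from 0, as strings.
function createAudioStream(tag, mediaFormat, customParameters = {}) {
  let sequence = 0;  // counts every message in the stream
  let chunk = 0;     // counts media messages only
  let bytesSent = 0; // raw audio bytes, before base64 encoding
  const nextSequence = () => String(sequence++);

  return {
    start() {
      return JSON.stringify({
        event: "start",
        sequenceNumber: nextSequence(),
        start: { tag, mediaFormat, customParameters },
      });
    },
    // audioBytes: Uint8Array or Buffer of encoded audio; timestamp: caller-defined.
    media(audioBytes, timestamp) {
      bytesSent += audioBytes.length;
      return JSON.stringify({
        event: "media",
        sequenceNumber: nextSequence(),
        media: {
          tag,
          chunk: String(chunk++),
          timestamp: String(timestamp),
          payload: Buffer.from(audioBytes).toString("base64"),
        },
      });
    },
    stop(durationSeconds) {
      return JSON.stringify({
        event: "stop",
        sequenceNumber: nextSequence(),
        stop: { tag, mediaInfo: { bytesSent, duration: durationSeconds } },
      });
    },
  };
}
```

Each returned string is meant to be sent as one text message over the WebSocket, in the order start, any number of media messages, then stop.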

Learn more about setting up the Voximplant side and the server side to send audio to a third-party service in the Connect external STT providers article.