4. Speech Recognition

You’ve probably noticed that our backend code contains some lines related to Google Cloud.

The library itself is imported this way:

const speech = require('@google-cloud/speech');

Now you need to tell the recognizer how to interpret the incoming audio. To do that, set encoding, sampleRateHertz, and languageCode in the config. The values below describe 8 kHz μ-law audio with speech in US English:

const config = {
    encoding: 'MULAW',
    sampleRateHertz: 8000,
    languageCode: 'en-US',
};
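
The config is not passed to the recognizer on its own. As a minimal sketch, assuming the request variable used later in this section follows the shape the Speech-to-Text streaming API expects, it could wrap the config like this. The interimResults value is an assumption; the original backend code may set it differently.

```javascript
// Same values as the config above.
const config = {
    encoding: 'MULAW',
    sampleRateHertz: 8000,
    languageCode: 'en-US',
};

// Assumed shape of the `request` object passed to streamingRecognize.
const request = {
    config,
    interimResults: false, // assumption: only final results are wanted
};

console.log(JSON.stringify(request, null, 2));
```

With interimResults set to false, the recognizer should report only finalized transcripts, which keeps the messages sent back to the client short.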

Then, using Node's built-in fs module, create a write stream that saves a copy of the raw audio to a binary file:

const wstream = fs.createWriteStream('myBinaryFile');

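When everything is set up, parse each incoming message, decode its base64 audio payload, and write the resulting bytes to recognizeStream (and to the file):

// Each message is a JSON string; only "media" events carry audio.
const data = JSON.parse(message);
if (data.event === 'media') {
    const b64data = data.media.payload;
    // Decode the base64 payload into raw audio bytes.
    const buff = Buffer.from(b64data, 'base64');
    recognizeStream.write(buff); // send the audio to the recognizer
    wstream.write(buff);         // keep a copy on disk
}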
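
The snippet above assumes every message is valid JSON with a media payload. A slightly more defensive variant is sketched below; decodeMediaPayload is a hypothetical helper name, not part of the original backend, and the sample message is made up for illustration.

```javascript
// Hypothetical helper: returns the decoded audio bytes for "media" events,
// or null for any other message (including malformed JSON).
function decodeMediaPayload(message) {
    let data;
    try {
        data = JSON.parse(message);
    } catch (err) {
        return null; // not valid JSON
    }
    if (!data || data.event !== 'media' || !data.media ||
        typeof data.media.payload !== 'string') {
        return null;
    }
    return Buffer.from(data.media.payload, 'base64');
}

// Made-up sample message carrying three bytes of audio.
const sample = JSON.stringify({
    event: 'media',
    media: { payload: Buffer.from([0xff, 0x7f, 0x00]).toString('base64') },
});

const audio = decodeMediaPayload(sample);
console.log(audio.length); // 3
```

In the message handler, a null return value would simply mean there is nothing to write to recognizeStream for that message.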

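The recognition request itself is started by calling streamingRecognize on the Speech-to-Text client (client here is typically an instance of speech.SpeechClient). The call returns the recognizeStream that the audio is written to, and every result the stream emits is sent back over the WebSocket:

recognizeStream = client
    .streamingRecognize(request)
    .on('error', console.error) // log recognition errors instead of crashing
    .on('data', data => {
        // Forward the top transcript of the first result to the client.
        ws.send(data.results[0].alternatives[0].transcript);
    });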
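
The data handler indexes straight into data.results[0].alternatives[0], so a response without a result would throw. One way to guard against that is sketched below; extractTranscript is a hypothetical helper name, and the response objects are hand-written stand-ins rather than real API output.

```javascript
// Hypothetical helper: returns the top transcript of the first result,
// or null when the response carries no usable transcript.
function extractTranscript(data) {
    const result = data && data.results && data.results[0];
    const alternative = result && result.alternatives && result.alternatives[0];
    return alternative && typeof alternative.transcript === 'string'
        ? alternative.transcript
        : null;
}

// Hand-written stand-ins for streaming responses.
const withText = { results: [{ alternatives: [{ transcript: 'hello world' }] }] };
const empty = { results: [] };

console.log(extractTranscript(withText)); // hello world
console.log(extractTranscript(empty));    // null
```

Inside the 'data' handler, the transcript would then be sent with ws.send only when the helper returns a non-null value.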

Lastly, the Google library needs your service account credentials to connect to its backend. To obtain them, go to the Google Authentication page and complete all the steps listed there. Then run this export command in the same terminal session (the same Terminal tab) in which you will run node your_file_name.js, before starting the script:

export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/[FILE_NAME].json"
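
Because the variable only exists in the shell where it was exported, a quick check at startup can make a missing or mistyped path easier to spot. This is a minimal sketch using only Node built-ins; checkCredentials is a hypothetical helper name, and the check only confirms that the file exists, not that the key is valid.

```javascript
const fs = require('fs');

// Hypothetical helper: returns a description of the problem,
// or null if the variable is set and points at an existing file.
function checkCredentials(env) {
    const path = env.GOOGLE_APPLICATION_CREDENTIALS;
    if (!path) {
        return 'GOOGLE_APPLICATION_CREDENTIALS is not set in this shell';
    }
    if (!fs.existsSync(path)) {
        return `Credentials file not found: ${path}`;
    }
    return null;
}

const problem = checkCredentials(process.env);
if (problem) {
    console.error(problem);
}
```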