Speech Analytics

How it works - Speech Analytics

MiaRec automatically uploads audio files to the Google Cloud Speech service for transcription.

Once transcription is completed, the results are shown in the call details.

The screenshot below shows the transcription, a textual representation of the conversation.

When you play back the recording, the corresponding position in the transcript is automatically highlighted (see the yellow background in the following screenshot). Click on any word in the transcription and the audio player will fast-forward to that location.

MiaRec Speech Analytics

You can use the "Advanced Search" page to locate recordings containing a particular keyword or phrase in the transcription text.

MiaRec Speech Analytics


Set up Google Cloud Speech API

This guide provides step-by-step instructions for configuring the Google Cloud Speech API, a speech-to-text conversion service powered by machine learning.

MiaRec uses the Google Cloud Speech API to transcribe voice recordings to text. The transcribed text is then used for speech analytics in the MiaRec application.

The Google Speech API recognizes over 110 languages and variants. The MiaRec application automatically uploads audio to Google Cloud for transcription and retrieves the results back into the application.
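Under the hood, such a transcription request boils down to a simple JSON payload sent to the Speech API. The sketch below builds the body of a long-running recognition request using only Python's standard library; the bucket name, file name, and audio settings are illustrative placeholders, not values MiaRec necessarily uses.

```python
import json

def build_recognize_request(gcs_uri, language_code="en-US"):
    """Sketch of a Speech API v1 long-running recognition request body.

    The gs:// URI points at an audio file previously uploaded to a
    Cloud Storage bucket (see the bucket setup section below).
    """
    return {
        "config": {
            "encoding": "LINEAR16",       # uncompressed PCM WAV
            "sampleRateHertz": 8000,      # typical telephony sample rate
            "languageCode": language_code,
        },
        "audio": {"uri": gcs_uri},
    }

# Placeholder bucket and object names for illustration:
request = build_recognize_request("gs://example-bucket/recording-001.wav")
print(json.dumps(request, indent=2))
```

A payload of this shape is POSTed to the `speech:longrunningrecognize` endpoint; the response is an operation that is polled until the transcript is ready.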

1. Create a Google Cloud Platform account

  1. Sign in to your Google account. If you don't already have one, sign up for a new account.

  2. Open the GCP Console at console.cloud.google.com.

  3. If you have not used Google Cloud Platform before, click the Sign up for a free trial button at the top of the page or Try for free in the middle of the screen.

  4. Provide customer info (address, primary contact, and payment method: credit card or bank account).

  5. The Welcome screen is displayed when the account is activated.

2. Create a new project

  1. Create a new project by clicking My First Project in the top menu and then clicking the + button.

  2. Choose a name for the project. In our example, we chose miarec-speech-analytics. Note the Project ID for your project; Google requires the project ID to be a globally unique identifier.

3. Enable Google Cloud Speech API for your project

  1. Select the newly created project from the list.

  2. Navigate to APIs & Services.

  3. Click Enable APIs and Services.

  4. Type speech in the Search box and click Google Cloud Speech API.

  5. Click the Enable button for the Google Cloud Speech API.

4. Create a service account key

  1. Navigate to Credentials in the left pane and click the Create credentials button. Choose Service account key from the drop-down menu.

  2. Choose a Service account name, set Role to Project -> Owner, and click the Create button.

  3. Save the JSON file to a secure place. You will need to import this file into the MiaRec application.

    The JSON file looks like the following (the private key is stored in the private_key attribute):

      "type": "service_account",
      "project_id": "miarec-speech-analytics",
      "private_key_id": "123456789f276ed94a5bd2a11ee645678945679",
      "private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvAIBA...
      "client_email": "miarec@miarec-speech-analytics.iam.gserviceaccount.com",
      "client_id": "12345678945678945613",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://accounts.google.com/o/oauth2/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/miarec%40miarec-speech- analytics.iam.gserviceaccount.com"

Create Google Cloud Storage bucket

This guide provides step-by-step instructions for configuring a Google Cloud Storage bucket.

MiaRec needs to upload the audio file to a Google Cloud Storage bucket before it is submitted to the Speech API service for transcription.

1. Create a bucket

  1. Navigate to Google Cloud Storage console at https://console.cloud.google.com/storage

  2. Make sure the previously created project is selected. Then click Create bucket.

  3. Choose a globally unique name for the bucket, set Default storage class to Regional, and choose a region close to your datacenter (in our example, we chose us-west1).

2. Create lifecycle rule

The Cloud Storage bucket is used only for temporary storage of audio files. The MiaRec application uploads the files to this bucket and instructs the Speech API to take the file from there for transcription. Once the transcription is completed, the file can be deleted from the bucket.

In this step, we will configure automatic deletion of audio files after 24 hours.

  1. In the bucket browser, click None in the Lifecycle column for the previously created bucket.

  2. In the new window, click Add rule.

  3. Set the Object condition to Age: 1 day and the Action to Delete. Click the Continue buttons and then Save.
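The same 24-hour deletion rule can also be applied from the command line. The sketch below writes the equivalent lifecycle policy as JSON (the format accepted by `gsutil lifecycle set`); the bucket name in the comment is a placeholder.

```python
import json

# Lifecycle policy equivalent to the rule configured above:
# delete objects one day after creation.
lifecycle_policy = {
    "lifecycle": {
        "rule": [
            {"action": {"type": "Delete"}, "condition": {"age": 1}}
        ]
    }
}

with open("lifecycle.json", "w") as f:
    json.dump(lifecycle_policy, f, indent=2)

# Then apply it to your (placeholder) bucket:
#   gsutil lifecycle set lifecycle.json gs://your-bucket-name
```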

MiaRec configuration

1. Configure audio file format

First, you need to change the audio file format settings to increase transcription accuracy. Navigate to Administration -> Storage -> File format and apply the following changes:

  1. Set WAV file format
  2. Set Stereo format
  3. Disable Automatic Gain Control (AGC) filter
  4. Disable Packet Loss Concealment (PLC) filter

2. Configure speech recognition job

The speech recognition job automatically uploads audio recordings to the cloud service for transcription and then retrieves the transcription results back into the application. Multiple jobs can be created with unique settings; for example, one job can process recordings in English and a second in Spanish.

  1. Navigate to Administration -> Speech Analytics -> Speech-to-Text Jobs, click "New Job".

  2. Choose a descriptive name for this job. Upload the Google Cloud Service Key JSON file created in the previous steps. Set the Mode to Incremental.

  3. Set Language.

    Optionally, provide Phrase hints. You may use phrase hints in a few ways:

    • Improve the accuracy for specific words and phrases that may tend to be overrepresented in your audio data. For example, if specific commands are typically spoken by the user, you can provide these as phrase hints. Such additional phrases may be particularly useful if the supplied audio contains noise or the contained speech is not very clear.

    • Add additional words to the vocabulary of the recognition task. The Cloud Speech API includes a very large vocabulary. However, if proper names or domain-specific words are out-of-vocabulary, you can add them to the phrases provided to your request's speechContext.

    Phrases may be provided both as small groups of words or as single words. When provided as multi-word phrases, hints boost the probability of recognizing those words in sequence but also, to a lesser extent, boost the probability of recognizing portions of the phrase, including individual words.

    In general, be sparing when providing speech context hints. Better recognition accuracy can be achieved by limiting phrases to only those expected to be spoken. For example, if there are multiple dialog states or device operating modes, provide only the hints that correspond to the current state, rather than always supplying hints for all possible states.
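On the API side, phrase hints are carried in the speechContexts field of the recognition config. The sketch below shows that shape using only Python's standard library; the example phrases are illustrative, not values MiaRec sends by default.

```python
import json

def add_phrase_hints(config, phrases):
    """Attach phrase hints to a recognition config.

    In the Speech API v1 REST format, hints go into the speechContexts
    list; each context carries a "phrases" array.
    """
    config = dict(config)  # do not mutate the caller's config
    config["speechContexts"] = [{"phrases": list(phrases)}]
    return config

config = {"languageCode": "en-US"}
# Domain-specific names likely to be out-of-vocabulary (illustrative):
config = add_phrase_hints(config, ["MiaRec", "call recording", "speech analytics"])
print(json.dumps(config, indent=2))
```

Keeping the phrase list short and specific, as advised above, tends to help accuracy more than supplying every possible term.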

  4. Specify Filtering criteria for recordings. For example, you can limit transcription to a specific group, duration, date, etc.

  5. Configure a Schedule for the transcription job. The job can be run either manually or on a schedule (every hour/day/week, or more often). In the example below, the transcription job runs every 2 minutes.

3. View results

If you run the job manually, you can see the progress of the upload process:

MiaRec Speech Analytics

It takes some time for the cloud service to complete the transcription and return results. Usually, the results are available within a couple of minutes after upload.

You can check the status of recently uploaded files via the menu Administration -> Speech Analytics -> Speech Analytics Processed Records.

After the status changes to "COMPLETE", you can view the call details and transcription by clicking "View call" directly on this page, or you can open the call details from the "Recordings" page as usual.

The screenshot below shows an example of a transcription.

When you play back the recording, the corresponding position in the transcript is automatically highlighted (see the yellow background in the following screenshot). Click on any word in the transcription and the audio player will fast-forward to that location.