FAQs: Speech and text analytics

If an interaction is already summarized by Agent Copilot, will I be charged again for Virtual Supervisor’s AI Summary and Insights?

No. If Agent Copilot is enabled for an interaction and already generates a summary, Virtual Supervisor’s AI Summary & Insights uses that same summary without incurring additional cost. You are not billed twice for the same interaction.

Sentiment analysis – Can I get a sentiment score of 100?

Q: What does a sentiment score of 100 represent?
A: A score of 100 is not a goal or KPI; it is a signal of an exceptional emotional experience. It indicates that:

  • The customer expressed multiple strong positive sentiments, like delight, gratitude, or praise.
  • There was no negative sentiment detected during the interaction.

Q: Is it realistic to aim for a score of 100?
A: Not really. The score reflects emotionally impactful experiences; it does not imply that every good interaction should score 100. Most effective, respectful interactions will score below that and still be considered excellent.

Q: So what should I aim for instead?
A: Focus on ensuring that:

  • The interaction is appropriate for the context. Don’t force positive sentiment where it’s not relevant.
  • In emotionally charged conversations, strive for a positive Sentiment Trend and expressions of empathy, especially toward the end.
  • In purely transactional interactions, clarity, speed, and resolution may be more important than emotional tone.
  • Regardless of tone, the customer should feel heard, respected, and that their issue was addressed.

Note: The goal is emotional impact, not a perfect number.

Sentiment analysis – What do sentiment metrics really mean, and how should I interpret them?

Q: What’s the difference between sentiment score and sentiment trend?
A: These two metrics answer different questions:

  • Sentiment score reflects how the customer felt overall, with greater weight given to their emotions at the end of the interaction. It’s a normalized weighted average of all sentiment events during the conversation.

  • Sentiment trend measures how the customer’s emotional state changed over time. It calculates the difference between average sentiment at the beginning and at the end, divided by two.

Q: How should I read specific sentiment scores?
A:

  • Score of 60: A lightly positive interaction. This might reflect a single positive phrase later in the call, with no major emotional highs or lows.
  • Score of 80: A more sustained positive interaction. Typically, multiple positive events occurred, especially later in the call.
  • Score of 100: Extremely rare. Indicates a peak emotional experience—several clear positive expressions (like praise or delight) and no negativity.

Q: What does the sentiment trend tell me?
A: It captures the emotional trajectory of the customer:

  • A positive trend indicates emotional improvement.
  • A negative trend may suggest increasing frustration—even if the overall score is high.

Examples:

  • Score 60 + Trend +30 = Emotional improvement, mild to positive.

  • Score 80 + Trend +5 = Consistently positive.

  • Score 80 + Trend -20 = Started well but declined—pay attention!

For more information about how these formulas are calculated, see , and .

Is there a charge to generate Supervisor Copilot summary for interactions that were already summarized by Agent Copilot?

Genesys Cloud will not double-bill for summaries when an interaction has already been summarized by Agent Copilot, even if the AI Insights toggle is also enabled and a conversation summary is generated.

For more information, see .

Do the existing evaluation limits apply to AI Scoring driven evaluations? 

Yes. Agent- and evaluation-centered daily evaluation limits (50 per day) apply to both regular and AI-driven evaluations. The limit can be increased on an individual customer basis to either 100 or 200 per day by creating a customer care ticket with this request.

Is STA (Speech and Text Analytics) required for AI Scoring? 

Yes, if you intend to use AI scoring to evaluate calls. AI Scoring requires interactions to be transcribed.

What is required to enable AI Scoring?

To use Virtual Supervisor AI Scoring, your organization must have Speech and Text Analytics, Quality Management, and AI Experience Tokens enabled.

How is AI Summary and Insights enabled? 

AI Insights can be turned on at the program level. Only interactions tied to queues or flows that are mapped to the specific program are processed.

Do permissions affect Virtual Supervisor billing?

No. Assigning Virtual Supervisor permissions does not trigger charges. Charges are consumption-based. For example, you are charged when summaries or insights are generated, when an interaction is translated, and when an interaction is evaluated with a form that has questions enabled with AI Scoring.

Is there a charge to view summaries already created by Virtual Supervisor?

No. If summaries or insights are generated by Agent Copilot, viewing them via Virtual Supervisor does not incur an additional charge.

How is Virtual Supervisor billed?

  • AI Summary and Insights – Billing occurs only when summaries or insights are generated, not when they are viewed or retrieved via API. Charges are based on Genesys AI Tokens, where 1 token = 50 summaries and insights.
  • AI Translate – Billing occurs when an interaction transcript is translated from one language to another. Charges are based on Genesys AI Tokens, where 1 token = 2 interactions translated.
  • AI Scoring – Billing occurs when an interaction is scored with an evaluation form that has questions enabled with AI Scoring. Charges are based on Genesys AI Tokens, where 1 token = 20 evaluations analyzed with AI Scoring.
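
To estimate consumption from these ratios, you can convert usage counts into tokens. The following sketch is illustrative only; the rounding behavior (fractional tokens) is an assumption, not a statement of how Genesys bills.

# Illustrative estimate of Genesys AI Token consumption from the ratios above.
# Assumption: fractional tokens are allowed here for simplicity; actual billing
# granularity may differ.
TOKENS_PER_UNIT = {
    "summaries_and_insights": 1 / 50,   # 1 token = 50 summaries and insights
    "translations": 1 / 2,              # 1 token = 2 interactions translated
    "ai_scored_evaluations": 1 / 20,    # 1 token = 20 evaluations analyzed with AI Scoring
}

def tokens_consumed(usage):
    """Estimate total Genesys AI Tokens for a set of usage counts."""
    return sum(count * TOKENS_PER_UNIT[kind] for kind, count in usage.items())

# Example: 500 summaries, 40 translations, and 100 AI-scored evaluations
print(tokens_consumed({
    "summaries_and_insights": 500,
    "translations": 40,
    "ai_scored_evaluations": 100,
}))  # 10.0 + 20.0 + 5.0 = 35.0 tokens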

What does Virtual Supervisor include?

Virtual Supervisor includes:

  • AI Scoring
  • AI Summary and Insights
  • AI Translate

For more information, see: .

How do I choose appropriate strictness values when entering a topic phrase?

To determine the optimal strictness for a phrase, start with a default setting, then evaluate the captured events and adjust the strictness accordingly. Experimentation is key to finding the right balance between precision and recall.

Longer phrases with more meaningful words often require less strict matching. For example, a three-word phrase might match effectively with medium strictness (two out of three words), while a five-word phrase might only need three matches (medium-low strictness). The ideal strictness depends on the specific phrase and its intended use.
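
As an illustration of this tradeoff, the sketch below treats each strictness level as a minimum fraction of a phrase’s words that must be detected. The levels and fractions are assumptions chosen only to mirror the examples above; they are not the actual Genesys matching algorithm.

import math

# Illustration only: assumed mapping from strictness level to the minimum fraction of
# phrase words that must match. These values are not the actual Genesys settings.
STRICTNESS_FRACTION = {
    "low": 0.4,
    "medium-low": 0.55,
    "medium": 0.65,
    "medium-high": 0.8,
    "high": 1.0,
}

def required_matches(phrase, strictness):
    """Minimum number of phrase words that must match under a given strictness."""
    word_count = len(phrase.split())
    return math.ceil(word_count * STRICTNESS_FRACTION[strictness])

print(required_matches("cancel my subscription", "medium"))      # 3-word phrase -> 2 matched words
print(required_matches("I want to cancel today", "medium-low"))  # 5-word phrase -> 3 matched words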

Below are some examples where adjusting strictness can either help or hinder your results, highlighting the importance of tweaking strictness for the best precision/recall tradeoff.

For more information, see , and .

Topic spotting – What is the best way to organize my topics?

The best way to organize topics is to tag them. 

Tags provide a versatile framework for categorizing and classifying topics across a wide range of contexts. Tags create contextual connections between seemingly unrelated topics. By tagging topics, you establish an intricate web of associations, enabling users to discover related information. For example, tags can be used to find topics in Topic Definitions.

Does Genesys Cloud stream media outside your region when using the Nuance Recognizer as a Service integration?

Genesys Cloud streams audio in real time to a user-configurable API URL endpoint for Nuance Recognizer as a Service (NRaaS). NRaaS is a Genesys Cloud integration with Nuance, an external provider, which requires Genesys to route requests to cloud services hosted outside the Genesys cloud infrastructure. Therefore, for example, even if the Genesys Cloud organization that uses the NRaaS integration is located in Mumbai, Genesys Cloud cannot guarantee that it keeps all streamed audio exclusively within the Mumbai region. Also, if a Genesys org is based in Mumbai, but the NRaaS integration supports languages like Canadian French or Italian, Nuance serves those languages from data centers in Canada and Europe, respectively. According to the Nuance documentation, ASR requests for these languages must be sent to Nuance URLs hosted in those regions, outside the Mumbai region. The same applies to all Genesys Cloud regions.

Note: Nuance has announced that Nuance Recognizer as a Service will reach end-of-life (EOL) by 2027. For more information, see .

Voice transcription – How much does Extended Voice Transcription Services or Native Voice Transcription cost?

Extended Voice Transcription Services (EVTS) and Native Voice Transcription are billed on a per-minute basis.

Regardless of the transcription engine used, the cost for Voice Transcription is the same and is billed in the currency of your contract.

Depending on your organization’s Voice Transcription offer, a fair use allocation will be included. The following table shows the latest pricing for both EVTS and Native Voice Transcription:

USD     CAD     AUD     NZD     GBP     EUR     BRL     JPY     ZAR
0.0100  0.0110  0.0130  0.0140  0.0070  0.0080  0.0400  1.2000  0.1420

As noted above, your organization will have a fair use allocation for Native Voice Transcription, EVTS, or both, depending on the offer available to you.
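
As a rough illustration of how the per-minute rates combine with a fair use allocation, the sketch below assumes that only minutes beyond the allocation are billed; the allocation figure in the example is a placeholder, not your contractual value.

# Illustrative only: estimate a monthly overage charge from the per-minute rates above.
# Assumption: only minutes beyond the fair use allocation are billed; the allocation in
# the example is a placeholder.
RATE_PER_MINUTE = {
    "USD": 0.0100, "CAD": 0.0110, "AUD": 0.0130, "NZD": 0.0140, "GBP": 0.0070,
    "EUR": 0.0080, "BRL": 0.0400, "JPY": 1.2000, "ZAR": 0.1420,
}

def overage_cost(transcribed_minutes, fair_use_minutes, currency):
    """Cost of transcribed minutes that exceed the fair use allocation."""
    billable_minutes = max(0, transcribed_minutes - fair_use_minutes)
    return billable_minutes * RATE_PER_MINUTE[currency]

# Example: 120,000 transcribed minutes against a hypothetical 100,000-minute allocation
print(overage_cost(120_000, 100_000, "USD"))  # 20,000 minutes x 0.0100 = 200.0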

Voice transcription offers

There are two Voice Transcription offers in Genesys Cloud:

  • Voice Transcription (Native and Extended) consists of one SKU (this is the default offer):
    • GC-170-NV-VOICETRANSCRIPTION
  • Voice Transcription (Legacy) consists of the following SKUs (this offer is not available for new contracts):
    • GC-170-NV-VTFAIRUSEO
    • GC-170-NV-EVTS

An organization can only subscribe to one of these offers at a time, not both.

Fair use allocation

  • Under the Voice Transcription (Native and Extended) offer (GC-170-NV-VOICETRANSCRIPTION), EVTS and Native Voice Transcription share a combined fair use allocation. This ensures consistent global support for a wide range of dialects and languages. See the for details about current allocations.

  • Under the Voice Transcription (Legacy) offer (GC-170-NV-VTFAIRUSEO and GC-170-NV-EVTS), EVTS does not include a fair use allocation and is billed from the first minute of use. Native Voice Transcription does include a fair use allocation, as defined in the .

Note
  • To enable Extended Voice Transcription Services (EVTS), administrators may need to activate it through AppFoundry or Integrations, or request assistance from their CSM.
  • When using EVTS, transcribed users are not billed for Genesys Cloud CX 1 WEM Add-on II, Genesys Cloud CX 2 WEM Add-on I, or the Speech and Text Analytics Add-on, provided that Topic Spotting is not enabled for those interactions.

Voice transcription – How does Extended Voice Transcription Services – Azure provide customer data security?

Extended Voice Transcription Services streams media outside of Genesys Cloud to a third party to generate voice transcripts. Currently, these Extended Voice Transcription Services are provided by Microsoft through their Azure Speech-to-Text offering. As part of this combined offering, Genesys ensures data security in the following ways:

Note: Genesys Cloud is transitioning the Extended Voice Transcription Services engine from Microsoft Azure to AWS Transcribe. Impacted organizations will receive advance notice prior to any changes.
  • Azure Speech-to-Text does not store any audio or transcription data at rest. All data in transit is encrypted. For more information, see .
  • The media sent to Azure services is processed only in Azure’s server memory, and no data is stored at rest by the third party.
  • Once transcribed, all transcripts are encrypted and safely stored within Genesys Cloud.
  • All media sent to a third party is encrypted using TLS.
  • Transcripts created by Extended Voice Transcription and recorded interactions are stored by Genesys Cloud using the same type of encryption.

For more information, see , and .

Voice transcription – What is the difference between Genesys Cloud Voice Transcription and Extended Voice Transcription Services?

Both Genesys Cloud Voice Transcription and Extended Voice Transcription Services (EVTS) can transcribe voice interactions.  

The differences between Genesys Cloud Voice Transcription and Extended Voice Transcription Services (EVTS) are summarized in the following list.

  • EVTS extends Genesys Cloud’s own native transcription.
  • EVTS uses third party transcription services and may have different performance attributes.
  • EVTS can provide access to additional dialects and languages.
  • EVTS uses a non-customizable transcription model. Customization is only available with Genesys Voice Transcription.
  • Non-Genesys Cloud CX 3 customers are also billed for the WEM Add-on, in addition to EVTS charges, when Topic Spotting is used.

Note: During call segments, WEM voice transcription may use transcripts generated by Google Dialogflow. For more information, see .

For more information about EVTS, see:

    Content search – For how long can you search for an interaction?

    The amount of time you can search for and retrieve an interaction depends on the query filter options you use:

    • Words or phrases – You can search for words or phrases in transcripts for up to 90 days from the date of the interaction. 
    • Topics, sentiment score, and sentiment trend – You can search for topics and/or sentiment scores for up to 1.5 years from the date of the interaction.

    For more information, see the

    Topic spotting – Can I turn off topics detection in the IVR leg of the voice call?

    Yes, it is possible to turn off topic spotting in the IVR portion of an interaction. 

    To turn off topic spotting in the IVR portion of an interaction, you must remove the default program and ensure that only queues are mapped to programs. 

    Follow the steps below to review your configuration:

    1. Navigate to Admin > Quality > Speech and Text Analytics.
    2. Click Menu > Conversation Intelligence > Speech and Text Analytics > Speech and Text Configuration.
    3. Ensure that Voice Transcription is set to Enabled based on Queue configuration or Flow action.
    4. Change the default program selection to NONE.
    5. Navigate to Programs under the Quality section.
    6. Review the program list and look for programs that have a non-zero entry in the Assign Flows column.
    7. If you find a program with one or more flows mapped to it, open it by clicking on the program name.
    8. Expand the Queues menu and ensure that the queues that will be transcribed by this program are selected.
    9. Expand the Flows drop-down menu and ensure that no options are selected.
    10. Click Publish.

    Voice transcription – Are interaction transcripts encrypted when stored in the cloud?

    Interaction transcripts are encrypted and safely stored to protect them from unauthorized access.  Transcripts are encrypted with AES 256-bit encryption using customer/organization-specific encryption keys.  For more information, see .

    An organization may choose to make transcripts searchable as a part of the content search feature.  In this case, transcript information is indexed in this search cluster using a Genesys Cloud-wide encryption key, not an organization-specific encryption key.

    Note: These transcripts are only searchable and are stored in this manner for 35 days. Organizations can opt in or out of having searchable transcript information.

    Topic spotting – Are email and chat attachments analyzed?

    No, email and chat attachments are not analyzed. Speech and text analytics features only analyze the email and chat itself.

    Licensing and costs – Are there any additional costs when voice transcription is enabled?

    Voice transcription is a speech and text analytics feature and it is included as part of the Genesys Cloud CX 1 WEM Add-on II or Genesys Cloud CX 2 WEM Add-on I, and Genesys Cloud CX 3 license. A fair use policy is in place for voice transcription that allows customers to use an allocated number of transcribed audio minutes, per Genesys Cloud user, per month, without incurring additional costs.

    For more information, see: , and .

    Topic spotting – Are there out-of-the-box topics?

    The speech and text analytics solution includes a variety of out-of-the-box topics for the following areas of analysis:

    • Agent behavior
    • Contact reason
    • Customer experience

    For more information, see .

    Topic spotting – Are topics and intents from bots the same thing?

    Although they are similar concepts, topics and bot intents are distinct and serve different purposes in Genesys Cloud.

    For speech and text analytics, topics are a collection of phrases that indicate a specific business opportunity that the organization is actively looking for within the interaction transcriptions.

    For more information, see .

    Best practices for topic creation

    To create topics for your business, best practice recommends that you follow these steps:

    1. Create a default program if it does not already exist. For more information, see .
    2. Define the topic and determine its scope by establishing the type of linguistic events you hope to find using this topic.
    3. Name the topic using naming conventions that are easy for end users to understand (for example, customer dissatisfaction, agent rapport building, and so on).
    4. Populate the topic with phrases.
      1. Consider possible phrases.
      2. Collect phrases from existing interactions.
      3. Search for the words in the phrases you are considering, to better understand the context in which they appear.
      4. Create additional variations of the phrases you already created.
        For more information, see and .
    5. Add the topic to the desired program. For more information, see .

    Voice transcription – Best practices when setting up voice transcription

    To set up voice transcription, best practice recommends that you follow these steps:

    1. Determine whether or not your organization will benefit from transcribing all agent interactions or only a specific set of lines of business.
    2. Enable voice transcription. For more information, see .
    3. Determine the best way to identify specific agents. Should you target specific queues, or should you create an Architect flow action?
    4. Create a program and set it as the default program. For more information, see and .
    5. Assign the default program to the queues and/or flows that should have transcription enabled.

    Voice transcription – Can I download a voice transcript?

    You can export transcripts from one or more interactions using the speech and text analytics API.
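
    The following sketch shows one way to retrieve a transcript download link through the Platform API. The endpoint path and response field shown are assumptions to verify against the current speech and text analytics API reference; the region host, token, and IDs are placeholders.

# Sketch only: fetch a transcript download URL via the speech and text analytics API.
# Verify the endpoint path and response shape against the current Genesys Cloud
# Platform API reference; the host, token, and IDs below are placeholders.
import requests

API_HOST = "https://api.mypurecloud.com"   # replace with your region's API host
ACCESS_TOKEN = "YOUR_OAUTH_TOKEN"          # obtained from an OAuth client
conversation_id = "CONVERSATION_ID"
communication_id = "COMMUNICATION_ID"

url = (f"{API_HOST}/api/v2/speechandtextanalytics/conversations/"
       f"{conversation_id}/communications/{communication_id}/transcripturl")
resp = requests.get(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
resp.raise_for_status()

# The response is expected to contain a short-lived URL from which the transcript
# document can be downloaded.
transcript = requests.get(resp.json()["url"]).json()
print(transcript)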

    Also, a transcript can be copied manually from the Interaction Details page by clicking the Copy Transcript option in the top right corner of the transcript. For more information, see .

    For more information, see .

    Sentiment analysis – Can I turn sentiment analysis on or off?

    For voice transcripts, sentiment analysis is always performed automatically after transcription is complete; the two are interconnected.

    For digital transcripts, sentiment analysis only occurs if the expected dialect is set to one specific supported language. For more information, see the Select one or more dialects for digital interactions section in the  article.

    For more information, see .

    Licensing and costs – Can voice transcription usage be monitored?

    Currently, voice transcription usage cannot be monitored externally. Genesys does measure voice transcription usage internally and contacts customers who are nearing their usage limit.

    Since voice transcription usage is not available externally, Genesys exercises leniency when a customer goes over the allotted transcription usage quota, and provides the necessary time required to adjust their usage.

    For more information, see .

    Topic spotting – How do I create and configure a default program?

    To create a program and configure it as a default program, you must have administrator privileges. 

    The default program highlights and transcribes the topics and phrases associated with all of the interactions that are not serviced by another active program.

    To create a program, click Admin > QM > Programs, and then click Create Program. Alternatively, click Menu > Conversation Intelligence > Speech and Text Analytics, and then click Programs. For more information, see .

    To configure a specific program as the default program, click Admin > Quality, and then click Speech and text analytics settings. Alternatively, click Menu > Conversation Intelligence > Speech and Text Analytics, and then click Programs. Select the specific program from the Default program list. For more information, see .

    Topic spotting – How do I generate and use out-of-the-box topics?

    To generate and use out-of-the-box topics, you must invoke the create general program API using the following parameters:

    • Dialect – en-US, es-US, en-AU or en-GB
    • Mode – merge or skip

    For example, /api/v2/speechandtextanalytics/programs/general/jobs.
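
    The following sketch shows how such a request might be issued with the parameters above. The request body field names are assumptions to verify against the Platform API reference; the host and token are placeholders.

# Sketch only: request generation of the out-of-the-box topics through the general
# program jobs endpoint named above. Verify the body field names against the current
# Genesys Cloud Platform API reference; the host and token are placeholders.
import requests

API_HOST = "https://api.mypurecloud.com"   # replace with your region's API host
ACCESS_TOKEN = "YOUR_OAUTH_TOKEN"

resp = requests.post(
    f"{API_HOST}/api/v2/speechandtextanalytics/programs/general/jobs",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={"dialect": "en-US", "mode": "Skip"},  # dialect and mode per the parameters listed above
)
resp.raise_for_status()
print(resp.json())  # job details for the topic generation request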

    For more information, see .

    Voice transcription – What is the accuracy of voice transcription and how do I increase it by including brand names, acronyms or internal terminology?

    Genesys Cloud’s voice transcription accuracy is comparable to that of other leading providers and hyperscalers. Several factors can influence accuracy, including audio quality, speaker accents, background noise, and the complexity of the language.

    For guidance on improving overall accuracy, see .

    To help the system recognize business- or domain-specific terms, such as brand names, acronyms, or internal terminology, add them as phrases to a topic. For more information, see .

    Sentiment analysis – How do I use sentiment analysis to improve business operations?

    Sentiment analysis automatically highlights the exact moment when a customer conveys positive, neutral, or negative sentiment. 

    The highlighted words and/or entire phrases become actionable business insights that can be used (among numerous use cases) to:

    • Track customer perception.
    • Identify and reward exceptional agent performance.
    • Improve customer experience.
    • Improve products and services.

    Sentiment analysis – How is the overall customer sentiment score calculated?

    Overall customer sentiment score

    The customer sentiment score aims to capture the level of satisfaction experienced by the customer towards your company, products, or customer service. The customer sentiment score is computed by weighing each found sentiment event (positive or negative), with greater significance placed on the sentiment events that occurred towards the end of the interaction.  In essence, the customer sentiment score answers the question, did the customer leave the interaction happy, or dissatisfied?

    How is the overall customer sentiment score calculated?

    The overall customer sentiment score is represented with a value that ranges from -100 to +100, with -100 being extremely dissatisfied and +100 being extremely satisfied.

    The overall customer sentiment score is computed by weighting each detected sentiment event (positive or negative) based on its relative location in the interaction, with greater weight given to events that occur towards the end of the interaction.

    The relative location is calculated by taking the index of the customer’s phrase (in which the sentiment event takes place), and dividing that index by the total number of phrases on the customer side.

    In the following examples, assume that each call has 42 total phrases, 20 of which are spoken by the customer.

    Let’s assume the following events were detected by customer sentiment analysis:

    Sentiment Event #    Sentiment Event    Event Location in the Interaction
    1                    Positive           0.50 (for example, 10th customer phrase in a 20 phrase call)
    2                    Positive           0.90 (for example, 18th customer phrase in a 20 phrase call)

    As a result, the overall sentiment score would be f((+1 x 0.50) + (+1 x 0.90)) = f(1.40), where f is the function that normalizes the customer sentiment score into the range -100 to +100. The function f is defined as Normalized Overall Sentiment Score = 100 x tanh(0.75 x Overall Sentiment Score), giving a Normalized Overall Sentiment Score of 78.

    Another example:

    Sentiment Event #    Sentiment Event    Event Location in the Interaction
    1                    Negative           0.10 (for example, 2nd customer phrase in a 20 phrase call)
    2                    Positive           0.40 (for example, 8th customer phrase in a 20 phrase call)
    3                    Negative           0.85 (for example, 17th customer phrase in a 20 phrase call)

    As a result, the overall sentiment score would be f((-1 x 0.10) + (+1 x 0.40) + (-1 x 0.85)) = f(-0.55), where f is the function that normalizes the customer sentiment score into the range -100 to +100. The function f is defined as Normalized Overall Sentiment Score = 100 x tanh(0.75 x Overall Sentiment Score), giving a Normalized Overall Sentiment Score of -39.
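
    The two examples above can be reproduced with a short calculation; the sketch below applies the weighting and normalization exactly as described.

# Worked example of the calculation described above: each sentiment event contributes
# +1 (positive) or -1 (negative) multiplied by its relative location, and the sum is
# normalized with 100 x tanh(0.75 x sum).
import math

def normalized_sentiment_score(events):
    """events: (polarity, relative_location) pairs, where polarity is +1 or -1."""
    raw_score = sum(polarity * location for polarity, location in events)
    return round(100 * math.tanh(0.75 * raw_score))

print(normalized_sentiment_score([(+1, 0.50), (+1, 0.90)]))              # 78
print(normalized_sentiment_score([(-1, 0.10), (+1, 0.40), (-1, 0.85)]))  # -39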

    For more information, see .

    How much storage do transcripts consume and how much does this storage cost?

    Voice transcription storage does not generate an extra cost. Transcripts do not count towards the fair use data storage usage calculation. 

    Sentiment analysis – Is sentiment analysis based on tone or pitch?

    Currently, sentiment analysis is based solely on textual content that conveys specific customer sentiment (positive, negative, neutral).

    For more information, see .

    Voice transcription – Is voice transcription supported using third parties such as Amazon, Google, or Microsoft?

    Genesys Cloud uses its own native transcription engine and includes Extended Voice Transcription Services (EVTS) as an alternative to native voice transcription. The underlying provider for Extended Voice Transcription Services can be either Microsoft Azure Speech-to-Text, or AWS Transcribe.

    EVTS provides customers with additional language support beyond the Genesys Cloud native transcription engine, and a choice between the engines when transcribing voice interactions.

    For other voice transcription providers such as Google, you must integrate using existing AudioHook and Transcription connector capabilities.

    For more information, see: , and

    Topic spotting – What are programs used for?

    A program is a group of topics that provide instructions to speech and text analytics about the specific phrases and conditions it should focus on.

    Programs are essential, since different parts of the contact center may have different business intents of interest.

    For example, the retention department can have very specific scripts or processes that do not apply to the rest of the organization. In this case, it would make sense to have a program that includes topics that apply to this specific department and a different program that includes topics that apply to the rest of the company.

    For more information, see .

    Topic spotting – What are topics used for?

    A topic encapsulates a business level intent detected within interactions. Key phrases represent business level intents, and they can be a call reason (for example, cancellation), an agent skill (for example, upsell attempt), or an indication of a customer’s dissatisfaction (for example, escalation). 

    The recognition engine locates and tags key phrases for easy data search and mining.

    For more information, see , , and

    Best practices – What is the best way to generate business value from speech and text analytics?

    With speech and text analytics, users within your organization can gain deep insight into what occurs within your contact center.

    Speech and text analytics use cases

    The following use cases are examples of how your organization can extract valuable information about your business.

    Quality Managers and Supervisors can use speech and text analytics output to measure agent performance regarding key skills or behaviors, allowing them to identify areas of focus for coaching or recognition in a much more automated and comprehensive manner.

    Business Analysts can use speech and text analytics results to visualize and explore information that describes business performance based on what was said by customers or agents during the interaction. This can be used to identify issues and find opportunities to improve the business.

    Risk Managers can better protect customers and the business by identifying high risk interactions where there may be complaints or inappropriate agent behavior that should be investigated or mitigated.

    Speech and text analytics best practice example

    The following three steps describe how to set up speech and text analytics and extract data for the use cases described above.

    Step one

    Apply voice transcription to turn the audio from contact center voice interactions into structured data (for example, text) that can then be used for large-scale analysis. For more information, see .

    Step two

    Define key topics of interest within conversations. The topics of interest will depend on the KPIs (Key Performance Indicator) and use cases you are targeting. For more information, see .

    For Quality Managers and Supervisors, decide which agent skills or behaviors you want to track. For example,

    • Greeting
    • Compliance language
    • Build rapport
    • Express empathy
    • Check for resolution

    Most objective evaluation criteria and some subjective evaluation criteria can be measured by detecting phrases within transcripts.

    For Business Analysts, understanding contact reasons is usually the starting point of any improvement program. These contact reasons are usually specific to your business and need some internal discussion to gain consensus on what needs to be tracked; however, some examples are:

    • Balance inquiry
    • Billing issue
    • Cancel mention
    • Make payment

    When defining contact reasons, you should also create topics about specific products or services that your organization provides. Such data will increase your understanding about why customers are contacting you.

    For Risk Managers, detecting phrases that put the business at risk is of utmost importance. This may range from language that indicates fraud by the customer or agent, severe complaints or legal action threatened by customers, or specific compliance language that must be communicated on calls. Topics can be created to watch for these markers in conversations.

    Step three

    Study the resulting data in the analytics views to help you determine how you should proceed in light of what the performance data reveals.

    Quality Managers and Supervisors should look at the Topics tab in the Agents View to isolate top and bottom performers with regards to measured skills and behaviors. This information enables you to decide which coaching or recognition opportunities are relevant for the agents involved.

    Business Analysts should look at the Topics Trend view to see if there are any unusual trends in call reasons or mentions of products or services as defined by topics. It is often useful to look at this information according to handle time, to see what types of call reasons result in longer calls. This information enables you to decide what process improvements can be implemented to speed up these interactions.

    Risk Managers should periodically view trends for key topics, or perform ad hoc searches in content search to investigate any concerns raised by the business. This information enables you to protect customers and the business, for example, by identifying high-risk interactions that include complaints or inappropriate agent behavior so that they can be investigated or mitigated.

    Voice transcription – What is the expected latency and level of accuracy for voice transcription?

    Within Genesys Cloud, audio is transcribed in near real time, within seconds, and is accessible through our .  The full interaction transcript becomes available in the Interaction Details UI immediately after the call, usually within 15 seconds.

    • Expected latency: approximately 3–5 seconds with this toggle enabled, compared to 35–40 seconds without it.
    • There is no additional cost for customers who use this feature.

    For more information, see , and , .

    Licensing and costs – What license do I need to use speech and text analytics?

    Speech and text analytics features are included as part of the Genesys Cloud CX 1 WEM Add-on II or Genesys Cloud CX 2 WEM Add-on I, and Genesys Cloud CX 3 license. For more information, see .

    Voice transcription – What makes Genesys voice transcription unique and better than third parties?

    The language model used within the Genesys voice transcription capability is trained based on contact center conversations. 

    Since Genesys voice transcription focuses on your specific contact center conversations, it is best suited to transcribe your conversations. As a result, it consistently produces more accurate transcriptions of call center conversations when compared to general transcription engines.

    The speech-to-text transcription model adapts and expands when phrases are added as part of a topic. By doing this, the recognition engine is tailored to find and highlight actionable areas in the transcription that facilitate targeted data search and retrieval.

    For more information, see .

    Sentiment analysis – What should I do to make sentiment analysis available for email, chat, and messages?

    To run sentiment analysis for digital interactions (email, chats, and messages), you must set an expected dialect (language) in the speech and text analytics settings page.

    Currently, English and Spanish dialects support sentiment analysis.

    For more information, see .

    Voice transcription – When I play back a recording the transcript time and the audio are not synchronized. What should I do?

    Interaction player and transcription synchronization mismatches occur when there is a clock drift issue. To minimize a clock drift issue, Network Time Protocol (NTP) should be enabled. 

    For more information, see .

    Voice transcription – Why are there so many ellipses (…) in my voice transcriptions instead of words?

    The transcription confidence filter determines how frequently ellipses (…) appear in transcriptions instead of words.

    To reduce the number of ellipses in a transcript, lower the transcription confidence filter.

    A lower setting results in more words and fewer ellipses; however, because lower-confidence words are then displayed, a higher number of transcription errors can be expected.

    To lower the transcription confidence filter, you must have administrator privileges.

    1. Click Admin and select Quality > Speech and Text Analytics. Alternatively, click Menu > Conversation Intelligence > Speech and Text Analytics.
    2. Lower the Transcript Confidence Filter to 20.

    If the number of ellipses in the transcripts is still large, repeat these steps and lower the Transcript Confidence Filter further.

    For more information, see .

    Sentiment analysis – What is the customer sentiment trend?

    The sentiment trend is determined by comparing the sentiment events found closer to the start of the interaction to the sentiment events found closer to the end of the interaction. For this reason, the sentiment trend may be updated when additional follow-ups occur within the same interaction. A minimum number of customer phrases is required for the sentiment trend to be calculated; usually around six or more customer phrases are required.

    Customer sentiment trend

    Sentiment events are clustered into two groups:

    • Sentiment events closer to the start of the interaction are defined as start-events.
    • Sentiment events closer to the end of the interaction are defined as end-events.

    sentimentTrend = (sentiment score of end-events – sentiment score of start-events) / 2

    The sentiment trend is presented to the user as follows:

    • Improving – If the trend score is greater than +55, it is defined as improving.
    • Slightly improving – If the trend score is between +20 and +55, it is defined as slightly improving.
    • No change – If the trend score is between -20 and +20, it is defined as no change.
    • Slightly declining – If the trend score is between -20 and -55, it is defined as slightly declining.
    • Declining – If the trend score is less than -55, it is defined as declining.
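
    A minimal sketch of the formula and the presentation bands above is shown below; the start-event and end-event sentiment scores are taken as inputs here rather than derived from the event clustering.

def sentiment_trend(start_events_score, end_events_score):
    """sentimentTrend = (sentiment score of end-events - sentiment score of start-events) / 2."""
    return (end_events_score - start_events_score) / 2

def trend_label(trend):
    """Map a trend score to the labels described above."""
    if trend > 55:
        return "Improving"
    if trend > 20:
        return "Slightly improving"
    if trend >= -20:
        return "No change"
    if trend >= -55:
        return "Slightly declining"
    return "Declining"

# Example: start-events score -40, end-events score +60 -> trend +50 -> "Slightly improving"
print(trend_label(sentiment_trend(-40, 60)))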

    For more information, see .