FAQs: AI Scoring

What is the agreement rate, and how is it used to fine-tune an auto-complete evaluation form?

The agreement rate measures how often the answers selected by Virtual Supervisor match those selected by a human evaluator. It is calculated both per question and as an overall metric for the evaluation form. A higher agreement rate indicates stronger alignment between AI-generated evaluations and human judgment, while lower agreement highlights areas that may need refinement.

Teams use the agreement rate to identify unclear questions, inconsistent scoring logic, or gaps in evaluation guidance, and to improve how effectively the auto-complete evaluation form performs at scale.
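
As an illustration of the arithmetic, the sketch below computes per-question and overall agreement rates from paired AI and human answers. It is a minimal sketch, assuming evaluations are available as simple dictionaries; Genesys Cloud computes these metrics internally, and the data shape here is invented for the example.

from collections import defaultdict

def agreement_rates(evaluations):
    """Compute per-question and overall agreement between AI and human answers.

    Each evaluation is assumed to map a question ID to an
    (ai_answer, human_answer) pair -- an illustrative shape only.
    """
    matches = defaultdict(int)
    totals = defaultdict(int)
    for evaluation in evaluations:
        for question_id, (ai_answer, human_answer) in evaluation.items():
            totals[question_id] += 1
            if ai_answer == human_answer:
                matches[question_id] += 1
    per_question = {q: matches[q] / totals[q] for q in totals}
    overall = sum(matches.values()) / sum(totals.values())
    return per_question, overall

# Two evaluations, two questions each: q1 agrees twice, q2 agrees once.
evals = [
    {"q1": ("Yes", "Yes"), "q2": ("No", "Yes")},
    {"q1": ("Yes", "Yes"), "q2": ("Yes", "Yes")},
]
per_question, overall = agreement_rates(evals)
print(per_question)  # {'q1': 1.0, 'q2': 0.5}
print(overall)       # 0.75

Questions like q2 in this example, with low per-question agreement, are the ones the refinement steps below target for rewording.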

How to calculate and use the agreement rate

  1. Evaluate interactions using the form
    • Go to Menu > Analytics > Analytics Workspace > Interactions.
    • Open an interaction and navigate to the Quality Summary tab.
    • Click Create Evaluation.
    • Select the Agent Auto-Complete evaluation form.
    • Choose a human evaluator to manually review the auto-completed answers.
    • Click Create to generate the evaluation.
  2. Review and update the evaluation
    • Review each question in the evaluation.
    • Use the transcript as evidence to update any incorrect automated responses.
    • Submit the completed evaluation.
  3. Test the form across multiple interactions
    • Repeat this process for at least 20 different interactions to ensure reliable agreement data.
  4. Review agreement metrics
    • Go to Conversation Intelligence > Quality Management > Evaluation Forms.
    • Open the latest published version of the evaluation form.
    • Review the overall agreement rate and the agreement rate for each question.
  5. Refine the form
    • Identify questions with low agreement rates.
    • Update question wording, answer options, or scoring logic based on observed discrepancies.
    • Retest the form as needed until the desired agreement rate is achieved.


When Does AI Scoring Generate a Charge?

A charge for AI Scoring is incurred whenever a quality evaluation form includes one or more AI-Scoring-enabled questions and that form is used to evaluate an interaction. Charges apply regardless of whether the AI-Scoring-enabled questions are ultimately answered.

The only exception is when the evaluation encounters an AI Scoring–related error at the evaluation level. In those cases, no charge is generated.
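
Restated as a predicate, the billing rule reduces to the sketch below (hypothetical Python, not a Genesys API; the function and flags are illustrative):

def ai_scoring_charge_applies(form_has_ai_scoring_questions: bool,
                              form_used_on_interaction: bool,
                              evaluation_level_ai_error: bool) -> bool:
    """Return True when an AI Scoring charge would be incurred.

    A charge applies when an AI-Scoring-enabled form is used on an
    interaction, whether or not the AI questions are answered; an
    evaluation-level AI Scoring error waives the charge.
    """
    if evaluation_level_ai_error:
        return False
    return form_has_ai_scoring_questions and form_used_on_interaction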

Will Reports Include Auto-Complete Evaluation Data?

Q: Do current reports include data from auto-complete evaluations?

A: Not yet. Currently, reports do not include data generated from auto-complete evaluations. Support for this data is planned for both existing Quality Management reports and the new question-level reports, with availability targeted for mid-Q2 2026.

Q: What does this mean for supervisors and analysts?

A: Until reporting support is released, auto-complete evaluation data will not appear in dashboards or exported reports. Once the update becomes available, you’ll be able to review and analyze auto-complete evaluations alongside manually completed evaluations, providing a more complete picture of overall quality performance.

Q: Will any action be required to access this data once it becomes available?

A: No. Once the reporting update is released, auto-complete evaluation data will be included automatically in all applicable reports; no configuration changes or additional setup are required.

Can I use Quality Policies to create Agent Auto-Complete Evaluations?

No, Quality Policies do not support Agent Auto-Complete evaluations.

For more information on how to generate an Agent Auto-Complete evaluation, see the next question.

How do I generate an Agent Auto-Complete Evaluation?

You can generate evaluations using an Agent Auto-Complete evaluation form in two ways: manually, per interaction, or automatically at scale using the AI Scoring Rules Management API. The automated approach is described below.

Generating Auto-Complete Evaluations Using AI Scoring Rules Management

To automate the generation of evaluations at scale, configure an Agent Scoring Rule using the AI Scoring Rules Management API.

Create an Agent Scoring Rule

Use the following API:

POST /api/v2/quality/programs/{programId}/agentscoringrules

Example Request:

POST /api/v2/quality/programs/bd27fab3-6e94-4a93-831e-6f92e664fc61/agentscoringrules HTTP/1.1
Host: api.inindca.com
Authorization: Bearer *******************
Content-Type: application/json

Example JSON body:

{
  "programId": "bd27fab3-6e94-4a93-831e-6f92e664fc61",
  "samplingType": "Percentage",
  "submissionType": "Automated",
  "evaluationFormContextId": "14818b50-88c0-4cc5-8284-4ed0b76e3193",
  "enabled": true,
  "published": true,
  "samplingPercentage": 97
}

Field Explanations

  • programId – ID of the Speech & Text Analytics (STA) program.
  • evaluationFormContextId – The contextId of the automated evaluation form to use.
  • samplingPercentage – Percentage of interactions that should automatically generate evaluations.
  • enabled – Must be true for the scoring rule to be active.
  • published – Must be true for the rule to take effect.
  • submissionType – Set to "Automated" to ensure evaluations are auto-generated.

Once the rule is active, evaluations will automatically be created for interactions that meet the rule’s criteria.
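
If you prefer to script this call, the sketch below uses Python's requests library. The endpoint, body, and host come from the example above; the access token is a placeholder you must supply, and error handling is minimal.

import requests

API_HOST = "https://api.inindca.com"  # use your region's API host
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"    # placeholder OAuth bearer token
PROGRAM_ID = "bd27fab3-6e94-4a93-831e-6f92e664fc61"

body = {
    "programId": PROGRAM_ID,
    "samplingType": "Percentage",
    "submissionType": "Automated",  # required for auto-generated evaluations
    "evaluationFormContextId": "14818b50-88c0-4cc5-8284-4ed0b76e3193",
    "enabled": True,
    "published": True,
    "samplingPercentage": 97,  # evaluate 97% of matching interactions
}

response = requests.post(
    f"{API_HOST}/api/v2/quality/programs/{PROGRAM_ID}/agentscoringrules",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json=body,
    timeout=30,
)
response.raise_for_status()
print(response.json())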


Which Genesys Cloud regions support AI scoring, and how are they mapped to AWS Bedrock regions?

The following table shows the AWS region mappings used by Genesys Cloud for AI scoring with Bedrock models.

Genesys Region            Mapped to Bedrock Region for AI Scoring
us-east-1                 us-east-1
me-central-1              eu-west-1
eu-west-2                 eu-central-1
us-west-2                 us-west-2
ap-southeast-2            ap-southeast-2
ap-northeast-2            ap-northeast-2
ap-northeast-1            ap-northeast-1
ap-northeast-3            ap-northeast-1
eu-west-2                 eu-west-2
sa-east-1                 sa-east-1
ca-central-1              ca-central-1
ap-south-1                ap-south-1
FedRAMP – us-east-2       us-east-1, us-west-2*
eu-central-1              eu-central-1

*Done via AWS using cross-region inference.

Is there a best practices guide for using AI Scoring?

Yes. The AI Scoring best practices guide explains how to use AI Scoring effectively and get the most accurate results.

How should I confirm that the agent closed the conversation properly?

Include a question about summarizing outcomes or confirming satisfaction before ending the interaction.


Example: “Did the agent confirm customer satisfaction or summarize next steps before closing the conversation?”
AI marks Yes when the agent checks resolution or restates next steps clearly. This confirms that the customer’s issue was addressed before the call or chat ended.

How can I design a question to handle dead air or silence?

Ask whether the agent acknowledged or explained any pause longer than a set threshold.


Example: “Did the agent avoid unnecessary dead air or long silences without explaining the reason?”
AI marks Yes when the agent explains pauses (for example, “I’ll place you on a brief hold while I check this”). Unexplained silence longer than 15 seconds is marked No.
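
To make the threshold concrete, dead-air detection can be thought of as gap analysis over utterance timestamps. The sketch below is illustrative only, assuming a list of timed utterances; the hold phrases, data shape, and 15-second default are assumptions, not the product's actual logic.

def unexplained_silences(utterances, threshold_s=15.0):
    """Find silent gaps longer than threshold_s seconds between utterances.

    utterances: list of dicts with "start" and "end" times in seconds and
    "text", ordered by start time (an assumed shape for this example).
    A gap counts as explained when the preceding utterance announces it.
    """
    hold_phrases = ("brief hold", "place you on hold", "while i check")
    gaps = []
    for prev, curr in zip(utterances, utterances[1:]):
        gap = curr["start"] - prev["end"]
        explained = any(p in prev["text"].lower() for p in hold_phrases)
        if gap > threshold_s and not explained:
            gaps.append((prev["end"], curr["start"]))
    return gaps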

How do I handle compliance or disclosure questions in AI scoring?

Compliance questions should reference required statements that appear in the transcript.


Example: “Did the agent comply with mandatory disclosure or compliance statements (for example, terms, disclaimers, or legal requirements)?”
AI marks Yes when mandatory phrases—such as legal disclaimers or security verifications—are found. Keep help text specific to your industry’s compliance standards.

How should escalation be evaluated by AI?

Write questions that specify the elements of a complete escalation explanation.


Example: “Did the agent explain the escalation process clearly, including who to contact, what information is needed, and expected response times?”
AI marks Yes when all three elements are present. Include examples in the help text to illustrate acceptable responses.

What’s the best way to confirm that the agent provided a resolution?

Ask whether the agent clearly described an action or next step that resolves the customer’s concern.


Example: “Did the agent provide a clear resolution step, such as a refund, replacement, or troubleshooting instruction?”
AI marks Yes when specific actions are stated. General apologies or vague responses without follow-through are not considered resolutions.

How can I measure empathy without using subjective language?

Replace emotional judgments with observable acknowledgment.


Example: “Did the agent acknowledge the customer’s issue before moving to resolution?”
AI marks Yes when the transcript includes acknowledgment phrases such as “I’m sorry,” “I understand,” or “That must be frustrating.” Avoid vague words like “empathetic” or “polite,” which require human interpretation.

Should I include questions about using the customer’s name?

Yes. Personalization improves customer engagement and can be objectively measured.


Example: “Did the agent use the customer’s name at least once during the conversation?”
AI marks Yes when the transcript shows the name provided by the customer being used. Include guidance in the help text so evaluators understand when name usage is required.

How do I ensure AI accurately scores interruptions?

Ask direct questions focused on whether the agent allowed the customer to finish speaking.


Example: “Did the agent allow the customer to finish speaking without interruptions?”
AI marks Yes when the agent listens fully before responding. Short, polite interjections like “Sorry, please continue” after a pause are not counted as interruptions. Avoid combining this with other behaviors in the same question.

How can I include identity verification checks in AI-scored forms?

Include questions that ask whether the agent verified at least one customer credential before discussing account details.


Example: “Did the agent verify the customer’s identity before addressing account-specific concerns?”
AI marks Yes when a credential such as date of birth, account ID, or phone number is confirmed in the transcript. This helps ensure compliance and security in agent interactions.

How should I design greeting questions for AI scoring?

To evaluate greetings, write transcript-based questions that ask whether the agent used a standard greeting phrase at the start of the interaction.

Example: “Did the agent greet the customer using a standard phrase such as ‘hi,’ ‘hello,’ or ‘good morning’?”
AI marks Yes when a polite greeting is detected. Avoid subjective wording like “Was the agent friendly?” as AI can only recognize the presence of the greeting, not tone or emotion.
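
Presence-based wording scores reliably because it reduces to simple pattern matching over the transcript. The sketch below is illustrative only; the phrase list, window, and matching are assumptions, not the product's implementation.

import re

# Illustrative greeting phrases; tailor the list to your own scripts.
GREETING_PATTERN = re.compile(
    r"\b(hi|hello|good (morning|afternoon|evening))\b", re.IGNORECASE
)

def opens_with_greeting(agent_lines, window=2):
    """Check whether any of the first `window` agent lines contains a greeting.

    Only the presence of a phrase is observable in a transcript; tone and
    friendliness are not, which is why subjective wording scores poorly.
    """
    return any(GREETING_PATTERN.search(line) for line in agent_lines[:window])

print(opens_with_greeting(["Good morning, thanks for calling!", "How can I help?"]))  # True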

Do the existing evaluation limits apply to AI Scoring-driven evaluations?

Yes. The agent-centered and evaluator-centered daily evaluation limits (50 per day) apply to both regular and AI-driven evaluations. The limit can be increased to either 100 or 200 per day on an individual customer basis by creating a Customer Care ticket with this request.

Is STA (Speech and Text Analytics) required for AI Scoring? 

Yes, if you intend to use AI Scoring to evaluate calls, because AI Scoring requires interactions to be transcribed.