Design and optimize topics for topic spotting – best practices
Topic spotting helps organizations identify interactions that contain specific customer intents, requests, outcomes, or business events. Effective topic design enables supervisors, business analysts, and operational leaders to uncover trends, monitor customer behavior, and automate workflows based on conversation content.
The accuracy of topic spotting depends on thoughtful topic construction, representative phrases, and ongoing refinement. Well-designed topics produce actionable insights and reduce the effort required to review large volumes of interactions.
This guide explains how to choose the right topic spotting approach, design effective topics, build representative phrases, and continuously improve topic performance.
Step 1: Choose your topic spotting approach
Topic spotting supports two primary approaches: lexical topic spotting and semantic topic spotting. Choosing between them is one of the most important decisions when designing a topic.
How lexical topic spotting works
Lexical topic spotting identifies interactions based on specific words or phrases. The system searches conversations for language that closely matches the phrases configured in the topic. Before matching, it applies stemming, a process that reduces each word to its root form. For matching purposes, talk, talking, and talks are therefore treated as the same word.
Example:
- Topic: Credit card cancellation
Matching phrases:
- Cancel my credit card
- Close my card
- Terminate my account
- Cancel this card
Scoring
Lexical topic matches are scored between 0 and 100 based on the percentage of stemmed words in the topic phrase that match a sequence of stemmed words in the conversation text. A lower weighting is applied to stop words (common words that add little semantic meaning, such as at, our, and this) compared to other words.
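The scoring idea above can be sketched in Python. This is an illustration under stated assumptions, not the product's implementation: the suffix-stripping `stem` function, the `STOP_WORDS` list, the `STOP_WORD_WEIGHT` value, and the sliding-window comparison are all hypothetical stand-ins for behavior the article describes only in general terms.

```python
import re

# Assumed values: the product's real stop-word list and weighting are not documented here.
STOP_WORDS = {"a", "an", "the", "at", "our", "this", "my", "to", "of"}
STOP_WORD_WEIGHT = 0.25


def stem(word):
    """Toy stemmer that strips a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Return (stem, weight) pairs; stop words are kept as-is with a lower weight."""
    pairs = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in STOP_WORDS:
            pairs.append((word, STOP_WORD_WEIGHT))
        else:
            pairs.append((stem(word), 1.0))
    return pairs


def lexical_score(phrase, conversation):
    """Weighted percentage of phrase stems found in the best-matching window."""
    phrase_pairs = normalize(phrase)
    convo_stems = [s for s, _ in normalize(conversation)]
    total = sum(w for _, w in phrase_pairs)
    if total == 0 or not convo_stems:
        return 0.0
    window = len(phrase_pairs)
    best = 0.0
    # Slide a window the length of the phrase across the conversation.
    for start in range(max(1, len(convo_stems) - window + 1)):
        span = set(convo_stems[start:start + window])
        matched = sum(w for s, w in phrase_pairs if s in span)
        best = max(best, matched)
    return round(100 * best / total, 1)


print(lexical_score("cancel my credit card", "I am canceling my credit cards today"))  # 100.0
print(lexical_score("cancel this card", "please cancel the card"))  # 88.9
```

In the second call, the only unmatched word is the stop word "this", so the score drops only slightly. A missing content word such as "card" would lower the score much more.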
Double quotation marks can be used around words to enforce an exact match. For the phrase to be considered a match, the quoted words must appear in the conversation text exactly as written, regardless of other words in the phrase.
Understanding how matches are scored helps you choose the strictness of phrase matching, because a stricter setting requires a higher match score.
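The quoted-word rule can be pictured as a separate gate that a conversation must pass regardless of its score. The following is a minimal sketch; the helper names `quoted_terms` and `passes_quoted_terms` are hypothetical, and the case-insensitive, whole-word comparison is an assumption rather than documented behavior.

```python
import re


def quoted_terms(phrase):
    """Return the words or word groups wrapped in double quotation marks."""
    return re.findall(r'"([^"]+)"', phrase)


def passes_quoted_terms(phrase, conversation):
    """True when every quoted term appears as written (no stemming) in the text."""
    text = conversation.lower()
    for term in quoted_terms(phrase):
        # Assumption: comparison ignores case and requires whole-word boundaries.
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        if not re.search(pattern, text):
            return False
    return True


print(passes_quoted_terms('cancel my "platinum" card', "Please cancel my platinum card"))  # True
print(passes_quoted_terms('cancel my "platinum" card', "I want to cancel my gold card"))   # False
print(passes_quoted_terms('"talks"', "we are talking now"))                                # False
```

The last call shows the difference from stemmed matching: without quotation marks, talks and talking would count as the same word.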
Best use cases
- Exact wording matters (using double quotes)
- Compliance language must be monitored
- Regulatory disclosures must be detected
- Product names are unique
- False positives must be minimized
- Customer language is predictable
Advantages
- Predictable results
- Easy to troubleshoot
- Greater control over matching behavior
- Strong precision
Limitations
- May miss alternative wording
- Requires phrase maintenance
- Less effective when customer language varies significantly
How semantic topic spotting works
Semantic topic spotting identifies conversations based on intent and meaning rather than exact wording.
It uses a technology called embeddings to represent words and phrases as numerical vectors, allowing the system to recognize semantically similar meanings even when different wording is used. The system evaluates the context of a conversation and determines whether it matches the intended topic.
Example:
- Topic: Credit card cancellation
Matching phrases:
- I don’t need this card anymore
- Please close my account
- I want to stop using this card
- I’m switching to another provider
Although the wording differs, the customer intent remains the same.
Scoring
Semantic topic matches are scored between 0 and 100 based on semantic similarity. The score is derived from cosine similarity, a measure of the angle between two embedding vectors: the smaller the angle between the vectors, the more similar the meanings and the higher the score.
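Cosine similarity itself is a standard formula and can be sketched in a few lines of Python. The two-dimensional vectors below are toy values, since real embeddings have hundreds of dimensions, and the mapping onto a 0 to 100 scale (clamping negative values to 0 and multiplying by 100) is an assumption made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_score(a, b):
    """Assumed mapping of cosine similarity onto a 0-100 scale."""
    return round(100 * max(0.0, cosine_similarity(a, b)), 1)


print(semantic_score([1, 0], [1, 0]))  # 100.0 (same direction)
print(semantic_score([1, 1], [1, 0]))  # 70.7  (45 degrees apart)
print(semantic_score([1, 0], [0, 1]))  # 0.0   (perpendicular)
```

Because only the angle matters, two vectors of different lengths that point in the same direction still score 100.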
Double quotation marks can be used around words to enforce an exact match. For the phrase to be considered a match, the quoted words must appear in the conversation text, regardless of other words in the phrase.
Best use cases
- Customers express the same intent in many ways
- Customer language varies across regions
- Emerging trends must be identified
- Customer feedback is being analyzed
- Broad intent discovery is required
Advantages
- Captures more interactions
- Requires fewer examples
- Better for discovering customer trends
- More resilient to language variation
Limitations
- May return broader matches
- Can require additional validation
- Less deterministic than lexical matching
Use case recommendations
| Use case | Recommended approach |
|---|---|
| Compliance monitoring | Lexical |
| Required disclosures | Lexical |
| Product-specific requests | Lexical |
| Policy acknowledgements | Lexical |
| Intent discovery | Semantic |
| Customer feedback analysis | Semantic |
| Emerging issue detection | Semantic |
| Customer journey analysis | Semantic |
Step 2: Design your topic
Each topic should represent a single customer intent, business event, or operational outcome.
Combining multiple intents into one topic reduces accuracy and makes reporting difficult.
Good examples:
- Cancel service
- Refund request
- Billing dispute
- Upgrade service
- Address change
Poor examples:
- Billing issues and cancellations
- Complaints and refunds
- Service requests and account maintenance
Design topics around what the customer is trying to accomplish.
Customers rarely use internal terminology. Topics should reflect customer language and customer goals.
Better: Topic: Upgrade service
Less effective: Topic: Product migration workflow
Build topics using phrases observed in actual interactions.
Real customer language consistently produces better topic performance than assumed language.
Sources include:
- Interaction transcripts
- Speech analytics results
- Text analytics results
- Quality evaluations
- Customer surveys
- Agent feedback
Every topic should support a business question.
If a topic cannot support a measurable business outcome, reconsider its purpose.
Examples include:
- How often do customers request refunds?
- How many customers mention competitors?
- How frequently do customers discuss billing issues?
- What percentage of customers express churn risk?
Step 3: Build effective phrases
Building phrases for your topics should be approached differently for lexical and semantic topic spotting. Applying the same phrase strategy to both approaches often reduces accuracy.
The AI-powered Generate Phrases button can suggest phrases for a topic. Review each suggestion individually before you add it.
Use the right phrase strategy for the topic type you are building. Lexical topics rely on exact language, while semantic topics focus on intent and meaning.
When possible, validate phrases using the Test phrase feature and refine them based on actual interaction results.
Build phrases for lexical topics
Focus on customer wording
Use phrases that customers actually say. Lexical topics work best when they match common customer language instead of formal internal wording.
Example topic: Cancel service
Recommended phrases:
- Cancel my service
- Close my account
- End my subscription
- Stop the service
Avoid:
- Service termination request
- Subscription discontinuation
Customers rarely use formal business language.
Include common variations
Customers express the same intent differently, so include natural variations that reflect how they speak.
Example topic: Refund request
Recommended phrases:
- I want a refund
- Can I get my money back
- Refund this purchase
- Return my payment
- I’d like reimbursement
Include abbreviations and common terminology
Consider common customer terminology, product names, and abbreviations.
Example topic: Password reset
Recommended phrases:
- Reset my password
- Forgot my password
- Can’t log in
- Login isn’t working
- Need a password reset
Avoid overly broad phrases
Broad phrases often generate excessive matches.
Avoid:
- Problem
- Help
- Charge
- Payment
Better:
- Incorrect charge
- Charged twice
- Billing problem
- Payment error
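One rough way to catch overly broad phrases before deployment is to flag any phrase that consists of a single word. This is only a heuristic sketch: the `is_too_broad` helper and its `min_words` default are hypothetical, and a single unique product name can still be a perfectly good lexical phrase.

```python
def is_too_broad(phrase, min_words=2):
    """Heuristic: treat phrases with fewer than min_words words as too broad."""
    return len(phrase.split()) < min_words


candidates = [
    "Problem", "Help", "Charge", "Payment",
    "Incorrect charge", "Charged twice", "Billing problem", "Payment error",
]

# Collect the phrases that deserve a second look.
flagged = [p for p in candidates if is_too_broad(p)]
print(flagged)  # ['Problem', 'Help', 'Charge', 'Payment']
```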
Continuously refine phrases
After deployment:
- Review matched interactions.
- Identify missed interactions.
- Add proven phrase variations.
- Remove phrases that generate false positives.
Lexical topics generally require more maintenance than semantic topics.
Build phrases for semantic topics
Focus on intent diversity
Semantic topics focus on intent rather than exact wording. The goal is to show the model how customers express a particular intent.
Example topic: Cancel service
Recommended phrases:
- I don’t need this service anymore
- Please close my account
- I’m switching providers
- Stop charging me
- I want to discontinue the service
These phrases represent different expressions of the same intent.
Include multiple communication styles
Customers communicate intent through requests, questions, complaints, explanations, and outcome statements.
Example topic: Refund request
Request: I want a refund
Question: Can I get my money back?
Complaint: I was charged incorrectly
Outcome statement: I’d like this payment reversed
Prioritize conceptual variety
A common mistake is creating phrases that differ by only one or two words. Strong semantic examples represent multiple conceptual expressions, not one repeated pattern.
Weak semantic examples:
- Cancel my service
- Cancel my account
- Cancel this service
- Please cancel my subscription
Strong semantic examples:
- I no longer need this service
- I’m moving to another provider
- Stop billing me
- Close my account
- I’m ending my subscription
These phrases represent multiple conceptual expressions.
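A quick check for conceptual variety is to measure how many words two phrases share. The sketch below uses Jaccard overlap (shared words divided by all distinct words) as a stand-in for "differs by only one or two words". The `near_duplicates` helper and the 0.5 threshold are assumptions chosen for illustration; word overlap is only a proxy for how a semantic model perceives similarity.

```python
import re
from itertools import combinations


def word_set(phrase):
    """Lowercased set of words in a phrase."""
    return set(re.findall(r"[a-z']+", phrase.lower()))


def jaccard(a, b):
    """Shared words divided by all distinct words across both phrases."""
    wa, wb = word_set(a), word_set(b)
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def near_duplicates(phrases, threshold=0.5):
    """Pairs of phrases whose word overlap meets the threshold."""
    return [(a, b) for a, b in combinations(phrases, 2) if jaccard(a, b) >= threshold]


weak = ["Cancel my service", "Cancel my account",
        "Cancel this service", "Please cancel my subscription"]
strong = ["I no longer need this service", "I'm moving to another provider",
          "Stop billing me", "Close my account", "I'm ending my subscription"]

print(near_duplicates(weak))    # two pairs, both involving "Cancel my service"
print(near_duplicates(strong))  # []
```

The weak set produces overlapping pairs because the phrases repeat one pattern, while the strong set shares almost no wording across phrases.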
Include edge cases
Consider less obvious statements that may indicate the same intent.
Example topic: Churn risk
Potential phrases:
- I’m looking at other companies
- Your competitor offered a better price
- I’m considering leaving
- I don’t know if I want to stay
These interactions may indicate churn risk even though the customer never explicitly says “cancel.”
Keep the topic focused
Do not combine multiple intents into a single semantic topic.
Avoid: Topic: Billing issues and cancellations
Better:
- Billing dispute
- Refund request
- Cancel service
Step 4: Configure and optimize topics
The Strictness setting on topics and phrases allows you to decide how close the match must be. The greater the strictness, the higher the match score must be.
Depending on your use case, you may need to raise or lower the strictness to maximize accuracy. Use the Test phrase feature to evaluate which strictness level works best.
Use the Test phrase button on each topic to confirm that the topic works as expected. Customers often phrase concepts in unexpected ways, so you cannot judge how effective a topic is until you see it run against real conversations.
Testing a phrase previews its effectiveness before deployment. You can test each phrase on up to 5,000 transcripts, then tune the phrase by editing its text and strictness and repeating the test.
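When comparing strictness levels, it can help to record which test matches were actually relevant and then see how precision and recall change as the minimum score rises. The sketch below assumes you have noted a match score and a manual relevant/not relevant judgment for each reviewed transcript. The sample data, the `precision_recall` helper, and the idea that a strictness level corresponds to a minimum score cut-off are all illustrative assumptions.

```python
def precision_recall(results, min_score):
    """results: list of (score, is_relevant) pairs from a reviewed test run."""
    matched = [relevant for score, relevant in results if score >= min_score]
    total_relevant = sum(1 for _, relevant in results if relevant)
    true_positives = sum(matched)
    precision = true_positives / len(matched) if matched else 0.0
    recall = true_positives / total_relevant if total_relevant else 0.0
    return precision, recall


# Hypothetical reviewed results: (match score, was the match actually relevant?)
results = [(95, True), (88, True), (82, False), (74, True), (66, False), (51, False)]

for min_score in (90, 70, 50):
    precision, recall = precision_recall(results, min_score)
    print(f"min score {min_score}: precision {precision:.2f}, recall {recall:.2f}")
```

With this sample data, a cut-off of 90 keeps only correct matches but misses two relevant transcripts, while a cut-off of 70 finds every relevant transcript at the cost of one false positive.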
The language used in conversations also evolves, so retest and revise your topics periodically to maintain accuracy.
After deployment, review topic performance regularly and refine topics as needed.
- Review matched interactions.
- Identify missed interactions.
- Add proven phrase variations.
- Remove phrases that generate false positives.
- Retest periodically as customer language evolves.
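The review steps above can be supported by a simple tally of how often each phrase produced an incorrect match. This is a minimal sketch that assumes reviewers record the matching phrase and a correct/incorrect judgment for each reviewed interaction. The sample data, the function names, and the 30% cut-off are hypothetical.

```python
from collections import defaultdict


def false_positive_rates(reviews):
    """reviews: iterable of (phrase, was_correct) pairs from manual review."""
    counts = defaultdict(lambda: [0, 0])  # phrase -> [false positives, total matches]
    for phrase, was_correct in reviews:
        counts[phrase][1] += 1
        if not was_correct:
            counts[phrase][0] += 1
    return {phrase: fp / total for phrase, (fp, total) in counts.items()}


def phrases_to_revisit(reviews, max_rate=0.3):
    """Phrases whose false-positive rate exceeds the chosen cut-off."""
    rates = false_positive_rates(reviews)
    return sorted(phrase for phrase, rate in rates.items() if rate > max_rate)


reviews = [
    ("charged twice", True), ("charged twice", True),
    ("charge", False), ("charge", False), ("charge", True),
    ("billing problem", True),
]
print(phrases_to_revisit(reviews))  # ['charge']
```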
Step 5: Quick reference – lexical vs semantic
Key differences between lexical and semantic phrase building
| Area | Lexical Topic Spotting | Semantic Topic Spotting |
|---|---|---|
| Primary goal | Match language | Match intent |
| Focus | Exact words and phrases | Meaning and context |
| Phrase strategy | Wording variations | Intent variations |
| Phrase count | Typically higher | Typically lower |
| Maintenance | Higher | Lower |
| Precision | Higher | Moderate |
| Discovery capability | Lower | Higher |
| Best for | Compliance, disclosures, product names | Intent discovery, customer feedback, trend analysis |