Symptom checker

Use this guide to diagnose and fix search quality issues. Start by identifying the symptom, then follow the corresponding troubleshooting steps.

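Common symptoms:

  • No matches found
  • Wrong answer returned
  • Low confidence scores
  • Too many similar matches
  • Technical terms ignored
  • Paraphrases don't match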

Testing Tool: Use Knowledge Base > Test to reproduce issues and see detailed scoring information for debugging.

Issue: No matches found

Symptom: Customer questions receive no results, or the chatbot uses a fallback message like "I don't know" or "Let me connect you with support."

Diagnostic checklist:

  1. □ Check if Q&A pair exists: Search your knowledge base for content related to the question
  2. □ Verify embeddings are generated: Semantic search won't work without embeddings
  3. □ Check Q&A pair status: Ensure the Q&A is enabled/active, not disabled or in draft
  4. □ Test in Knowledge Base test tool: Reproduce the issue to see if any matches appear with scores
  5. □ Review confidence threshold: Matches may exist but fall below your chatbot's minimum threshold

Solutions:

If no Q&A pair exists:

  • Add a new Q&A pair covering this topic
  • Use the customer's actual phrasing for the question
  • Write a clear, concise answer
  • Generate embeddings after adding

If Q&A exists but embeddings are missing:

  1. Go to Knowledge Base
  2. Click "Generate Embeddings"
  3. Wait for generation to complete
  4. Re-test the question

If Q&A exists but is disabled:

  • Edit the Q&A pair and set status to "Active"
  • Ensure the "Enabled" toggle is on
  • Save changes

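If matches exist but scores are too low:

  • Follow the steps under "Issue: Low confidence scores" to raise the scores
  • Review your chatbot's confidence threshold if relevant matches consistently fall just below it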

Common Mistake: Adding Q&A pairs but forgetting to generate embeddings. Semantic search won't work until embeddings exist.
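
The diagnostic checklist for this issue can be sketched as a small triage function. This is a minimal illustration, not the product's API: the QAPair fields (enabled, has_embedding), the best_score argument, and the 0.70 default threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QAPair:
    # Hypothetical record; field names are assumptions, not the product's schema.
    question: str
    enabled: bool
    has_embedding: bool


def triage_no_match(pair: Optional[QAPair], best_score: Optional[float],
                    threshold: float = 0.70) -> str:
    """Return the first checklist item that explains a missing match."""
    if pair is None:
        return "add a Q&A pair"
    if not pair.has_embedding:
        return "generate embeddings"
    if not pair.enabled:
        return "enable the Q&A pair"
    # best_score is the top combined score seen in the test tool, if any.
    if best_score is None or best_score < threshold:
        return "improve phrasing or review threshold"
    return "no issue found"


print(triage_no_match(QAPair("How much does shipping cost?", True, False), None))
```

The checks run in the same order as the checklist, so the cheapest fix (a missing pair or missing embeddings) is reported before any threshold tuning is suggested.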

Issue: Wrong answer returned

Symptom: The chatbot returns an answer, but it's not relevant or addresses a different question than the customer asked.

Diagnostic checklist:

  1. □ Test the question: Use Knowledge Base test tool to see all matches and their scores
  2. □ Check if correct answer exists: Verify you have a Q&A for what the customer is actually asking
  3. □ Compare scores: See if the correct answer appears but ranks lower than the wrong one
  4. □ Review question phrasing: Check if questions in knowledge base use different terminology
  5. □ Look for duplicates: Multiple similar Q&A pairs may confuse matching

Solutions:

If correct answer doesn't exist:

  • Add a new Q&A pair with precise question phrasing
  • Include key terms the customer used
  • Generate embeddings

If correct answer exists but ranks lower:

  • Improve the question phrasing: Edit to better match how customers ask
  • Add keywords to the question: Include specific terms customers use
  • Consider search weights: If technical terms are ignored, increase BM25 weight (see Tuning Search Weights)
  • Regenerate embeddings after editing

If duplicate Q&A pairs exist:

  • Identify semantic duplicates (similar questions with different wording)
  • Merge or delete duplicates, keeping only the best-phrased one
  • See Knowledge Base Hygiene for duplicate management

Example: Customer asks "How much does shipping cost?" but gets an answer about return policy. Check if your knowledge base has a shipping Q&A, and verify its question includes "shipping cost" or "delivery fee."

Issue: Low confidence scores

Symptom: Questions match Q&A pairs, but combined scores are below 0.70. The chatbot may use fallback responses even though a match exists.

Diagnostic checklist:

  1. □ Test the question: Check semantic, BM25, and combined scores
  2. □ Compare the customer's question to the Q&A question: How different is the phrasing?
  3. □ Check for keyword overlap: Does the Q&A question include terms from the customer's question?
  4. □ Review semantic score: Low semantic score indicates phrasing/meaning mismatch
  5. □ Review BM25 score: Low BM25 score indicates missing keywords

Solutions:

If semantic score is low (< 0.60):

  • Rephrase the question to better match customer language
  • Use more natural, conversational phrasing
  • Consider creating multiple Q&A pairs for different phrasings
  • Regenerate embeddings after editing

If BM25 score is low (< 0.50):

  • Add key terms from the customer question into your Q&A question
  • Include synonyms and alternate terminology
  • Ensure important technical terms or product names appear in the question

If both scores are moderate but combined is low:

  • This can happen when neither score is clearly weak, so no single fix stands out
  • Consider whether this Q&A should really match, or whether a new Q&A is needed
  • You may need to adjust the chatbot's confidence threshold
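
The three cases above can be expressed as a short decision helper. This is a sketch only: the function name is hypothetical, and the cutoffs (0.60 for semantic, 0.50 for BM25, 0.70 for the combined score) are taken from this section rather than from any product setting.

```python
from typing import List


def diagnose_scores(semantic: float, bm25: float, combined: float,
                    threshold: float = 0.70) -> List[str]:
    """Map a score breakdown to the suggestions listed in this section."""
    if combined >= threshold:
        return []  # the match already clears the threshold
    advice = []
    if semantic < 0.60:
        advice.append("rephrase the question in customer language, then regenerate embeddings")
    if bm25 < 0.50:
        advice.append("add customer keywords and synonyms to the question")
    if not advice:
        # Both scores are moderate: no single component is the culprit.
        advice.append("decide whether a new Q&A pair or a threshold change is needed")
    return advice


print(diagnose_scores(semantic=0.65, bm25=0.40, combined=0.58))
```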

Example improvement:

Customer question: "Can I get a refund?"

Before (low scores):
  Q: "What is our refund policy?"
  Semantic: 0.65  (okay match)
  BM25: 0.40      (only "refund" overlaps with the customer's wording)
  Combined: 0.58  (below threshold)

After (improved):
  Q: "Can I get a refund or return?"
  Semantic: 0.88  (much better match)
  BM25: 0.75      (shares "Can I get a refund" with the customer's wording)
  Combined: 0.84  (above threshold) ✓

Important: Always regenerate embeddings after editing questions. The semantic score won't improve until new embeddings are generated.
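
The scores in the example above are consistent with a weighted sum at a 70/30 semantic/BM25 split, the baseline ratio this guide refers to. The sketch below assumes that formula to show how each score contributes. Treat it as an illustration, since the exact combination method is not stated here, and the function name is hypothetical.

```python
def combined_score(semantic: float, bm25: float,
                   semantic_weight: float = 0.70) -> float:
    """Weighted sum of the two scores (assumed formula, 70/30 by default)."""
    return semantic_weight * semantic + (1.0 - semantic_weight) * bm25


before = combined_score(0.65, 0.40)  # "What is our refund policy?"
after = combined_score(0.88, 0.75)   # "Can I get a refund or return?"

print(f"before: {before:.3f}")
print(f"after:  {after:.3f}")
```

Under this assumption the "before" question scores about 0.575 and the "after" question about 0.841, in line with the 0.58 and 0.84 shown in the example.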

Issue: Too many similar matches

Symptom: A customer question matches multiple Q&A pairs with similar scores. The chatbot may return inconsistent answers or show low confidence.

Diagnostic checklist:

  1. □ Test the question: Check how many Q&A pairs match with similar scores
  2. □ Review matched Q&As: Are they duplicates or genuinely different topics?
  3. □ Compare answers: Do the matched Q&As provide the same or conflicting information?
  4. □ Check for vague questions: Overly broad Q&A questions match too many customer questions

Solutions:

If matched Q&As are semantic duplicates:

  • Merge duplicates: Keep the best-phrased question and delete the rest
  • If answers differ slightly, combine into one comprehensive answer
  • See Knowledge Base Hygiene for duplicate detection

If Q&As cover different topics but seem similar:

  • Make questions more specific: Add distinguishing details to each question
  • Include context in questions: "How do I return a damaged item?" vs. "How do I return an unwanted item?"
  • Add specific keywords that differentiate the topics

If Q&As conflict (provide different answers to similar questions):

  • Treat this as a high-priority fix: the chatbot may return either answer to the same question
  • Identify the correct answer and delete or fix the incorrect one
  • If both are correct for different scenarios, clarify the questions to distinguish them

Example: Customer asks "How long does shipping take?" Multiple Q&As match: "Domestic shipping times," "International shipping times," "Express shipping options." Make each question more specific to reduce ambiguity.
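
One way to spot this symptom in test results is to flag matches whose scores sit close to the top score. The sketch below is a hedged illustration: the function name, the 0.05 margin, and the scores in the sample data are invented for the example and do not come from the product.

```python
from typing import List, Tuple


def ambiguous_matches(matches: List[Tuple[str, float]],
                      margin: float = 0.05) -> List[str]:
    """Return questions whose score is within `margin` of the top score.

    An empty list means the top match is clearly ahead of the rest.
    """
    if not matches:
        return []
    top = max(score for _, score in matches)
    close = [question for question, score in matches if top - score <= margin]
    return close if len(close) > 1 else []


# Illustrative scores for "How long does shipping take?"
results = [
    ("Domestic shipping times", 0.78),
    ("International shipping times", 0.76),
    ("Express shipping options", 0.74),
    ("Return policy", 0.40),
]
print(ambiguous_matches(results))
```

Any question the helper returns is a candidate for merging (if it is a duplicate) or for more specific phrasing (if it covers a different topic).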

Issue: Technical terms ignored

Symptom: Questions with specific technical terms, product codes, or model numbers match incorrect or overly general answers.

Diagnostic checklist:

  1. □ Test with technical term: Check if the specific term appears in matched Q&A questions
  2. □ Review BM25 score: Should be high if technical term matches
  3. □ Check search weights: May be too semantic-heavy for your use case
  4. □ Verify Q&A exists: Do you have a Q&A pair specifically for this technical term?

Solutions:

If Q&A with technical term doesn't exist:

  • Add specific Q&A pairs for each technical term, product code, or model
  • Include the exact technical term in the question
  • Example: "How do I configure OAuth2?" not just "How do I configure authentication?"

If Q&A exists but doesn't rank first:

  • Increase BM25 weight: Change the semantic/BM25 ratio from 70/30 to 50/50 or 40/60 (see Tuning Search Weights)
  • This gives keyword matching more influence
  • Test with multiple technical questions before deploying

If technical term appears in question but isn't prominent:

  • Rephrase question to emphasize the technical term
  • Put technical terms early in the question
  • Example: "OAuth2 configuration steps" rather than "How do I set up authentication using OAuth2?"

Best Practice: For technical content, API documentation, or product catalogs, use 50/50 or 40/60 semantic/BM25 weights to ensure precise matching of technical terms.
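
To see why shifting weight toward BM25 helps, the sketch below ranks two candidate questions under different ratios. It assumes the combined score is a weighted sum of the semantic and BM25 scores. The function name and the scores are made up for illustration: the general question is given a higher semantic score, and the OAuth2 question a higher BM25 score because it contains the exact term.

```python
from typing import Dict, Tuple


def top_match(candidates: Dict[str, Tuple[float, float]],
              semantic_weight: float) -> str:
    """Return the best question under an assumed weighted-sum score."""
    def score(item: Tuple[str, Tuple[float, float]]) -> float:
        semantic, bm25 = item[1]
        return semantic_weight * semantic + (1.0 - semantic_weight) * bm25
    return max(candidates.items(), key=score)[0]


# question -> (semantic score, BM25 score); illustrative values only
candidates = {
    "How do I configure OAuth2?": (0.70, 0.90),
    "How do I configure authentication?": (0.90, 0.50),
}

for weight in (0.70, 0.50, 0.40):
    print(f"semantic weight {weight:.2f}: {top_match(candidates, weight)}")
```

With these sample scores the general question wins at 70/30, while the OAuth2 question wins at 50/50 and 40/60. Your own results depend on your actual scores, so test several technical questions before changing the weights.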

Issue: Paraphrases don't match

Symptom: Customers rephrase questions in different ways, and the chatbot fails to match even though you have the information.

Diagnostic checklist:

  1. □ Verify embeddings exist: Semantic matching requires embeddings
  2. □ Test the paraphrase: Check semantic score in test tool
  3. □ Review search weights: May be too keyword-heavy (high BM25 weight)
  4. □ Check question phrasing: Is your Q&A question too specific or formal?

Solutions:

If embeddings are missing:

  • Generate embeddings immediately (semantic search won't work without them)
  • Re-test after generation completes

If semantic score is low for paraphrases:

  • Rephrase Q&A question to be more natural and conversational
  • Use common, everyday language instead of formal or technical phrasing
  • Consider creating multiple Q&A pairs for common variations
  • Regenerate embeddings after editing

If BM25 weight is too high:

  • Increase semantic weight: Change the semantic/BM25 ratio from 70/30 to 80/20 or 85/15
  • This prioritizes meaning over exact keyword matches
  • See Tuning Search Weights

Example improvement:

Customer variations:
  "Can I send this back?"
  "What if I don't like it?"
  "How do I get my money back?"

Before (formal phrasing - low semantic match):
  Q: "What is the return policy for purchased items?"

After (natural phrasing - high semantic match):
  Q: "Can I return or get a refund?"

Pro Tip: Review real customer questions from support tickets or chat logs. Use their exact language when writing Q&A questions.
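
A quick way to check a batch of customer variations is to score each one and list those that fall below your threshold. The helper below is a sketch: failing_variations is a hypothetical name, and word_overlap is a toy stand-in for keyword matching (not BM25 and not the product's scorer), used only so the example runs on its own.

```python
import re
from typing import Callable, List


def failing_variations(variations: List[str],
                       score_fn: Callable[[str], float],
                       threshold: float = 0.70) -> List[str]:
    """Return the variations whose score falls below the threshold."""
    return [v for v in variations if score_fn(v) < threshold]


def word_overlap(a: str, b: str) -> float:
    """Toy keyword score: shared words divided by all distinct words."""
    words_a = set(re.findall(r"[a-z0-9']+", a.lower()))
    words_b = set(re.findall(r"[a-z0-9']+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


variations = [
    "Can I send this back?",
    "What if I don't like it?",
    "How do I get my money back?",
]
qa_question = "Can I return or get a refund?"

for v in failing_variations(variations, lambda v: word_overlap(v, qa_question)):
    print("below threshold:", v)
```

All three variations score low on word overlap alone, even against the improved question. This is why paraphrase matching depends on the semantic score, and why a keyword-heavy weighting tends to miss rephrased questions.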

General diagnostics

If none of the specific issues above apply, use these general troubleshooting steps.

Quick diagnostic workflow:

  1. Reproduce the issue: Use Knowledge Base > Test to test the problematic question
  2. Review all matches: See what Q&A pairs match and with what scores
  3. Analyze score breakdown: Check semantic, BM25, and combined scores
  4. Compare to expectations: Determine which Q&A should match and why it doesn't
  5. Apply targeted fix: Use appropriate solution from above sections
  6. Regenerate embeddings: If you edited any questions or answers
  7. Re-test: Verify the issue is resolved

Common root causes:

  • 70% of issues: Missing Q&A pairs or poor question phrasing
  • 20% of issues: Embeddings not generated after changes
  • 10% of issues: Search weight configuration or duplicates

When to contact support:

Reach out to SoundMinds.ai support if:

  • Embeddings generation consistently fails
  • Search behaves inconsistently (same question, different results)
  • All troubleshooting steps fail to improve search quality
  • You need help analyzing complex scoring issues

Prevention: Regular knowledge base maintenance prevents most issues. See Knowledge Base Hygiene for best practices.