Purpose

Simulations provide automated, multi-turn testing of your chatbot. Six simulation types are available so you can cover a wide range of test scenarios.

Available Simulators

Launching Simulations

New Simulator Page Workflow

When you click “New Simulation,” you’ll be setting up a batch of automated conversations. Each batch can run up to 100 simulations in parallel, with each simulation taking approximately 40 seconds to complete. Here’s how to set it up:
  1. Select the Simulator Type: Choose from one of the available simulator types (e.g., Goal Simulator, Topic Simulator, etc.).
  2. Configure Parameters: Set up the parameters for your simulation. For example, you can specify product categories, personas, max turns, and other relevant settings.
  3. Set Execution Runs: Determine the number of runs for each parameter combination. The total number of simulations is the number of parameter combinations (the product of the number of values chosen for each parameter) multiplied by the number of execution runs.
  4. Launch Simulation: Once configured, launch your batch of simulations. They will run in parallel for efficiency.
  5. Monitor on Dashboard: As the simulations run, you can monitor their progress on a dashboard, which provides real-time updates on the status of each simulation.
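
Before launching, it can help to estimate how large a configuration is and roughly how long it will take. The sketch below is a back-of-envelope calculation, not part of the product: the function names are hypothetical, and it assumes that any simulations beyond the 100-per-batch limit would be submitted as additional batches that run one after another. The limit and the 40-second figure come from the description above.

```python
import math

MAX_PARALLEL = 100      # from the text: up to 100 simulations run in parallel per batch
SECONDS_PER_SIM = 40    # from the text: approximate duration of one simulation


def total_simulations(values_per_parameter, execution_runs):
    """Product of the number of values for each parameter, times execution runs."""
    return math.prod(values_per_parameter) * execution_runs


def estimate_seconds(total):
    """Rough wall-clock estimate.

    Assumption (not stated in the docs): work beyond MAX_PARALLEL is split
    into extra batches that run sequentially.
    """
    batches = math.ceil(total / MAX_PARALLEL)
    return batches * SECONDS_PER_SIM


# Three parameters with 3, 3, and 1 selected values, and 3 execution runs
total = total_simulations([3, 3, 1], execution_runs=3)
print(total, estimate_seconds(total))  # 27 40
```

Because every added parameter value multiplies the total, checking the count this way before launch is a cheap guard against accidentally configuring a much larger batch than intended.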

Parameter Combinations

The system automatically creates a simulation for every possible combination of your parameter values. Each simulator type has its own specific parameters, but several common parameters are shared across most simulators.
Common Parameters:
  • Max Turns: Maximum number of conversation turns (default: 10). This limits how long each simulated conversation can go.
  • Personas: Simulated personalities that affect conversation style. Personas help test how your chatbot handles different types of users:
    • Different conversation styles and communication preferences
    • Varied expertise levels (e.g., beginner vs professional)
    • Language preferences (you can specify language in persona descriptions)
  • Execution Runs: Number of times to run each parameter combination
Here are two examples showing both shared and simulator-specific parameters:
Product Recommendations Simulator Example:
Simulator: Product Recommendations Simulator
Parameters:
- Product Categories: ["Power Tools", "Paint & Stain"]     // 2 values
- Deep Search: False                                       // 1 value (only 1 value allowed for entire batch)
- Max Turns: 4                                            // 1 value (only 1 value allowed for entire batch)
- Execution Runs: 2                                       // 2 runs each
Total Simulations: 2 × 1 × 1 × 2 = 4 simulations
Topic Simulator Example:
Simulator: Topic Simulator
Parameters:
- Topics: ["troubleshooting", "product_info", "billing"]   // 3 values
- Personas: ["technical", "casual", "frustrated"]          // 3 values
- Max Turns: 5                                            // 1 value (only 1 value allowed for entire batch)
- Execution Runs: 3                                       // 3 runs each
Total Simulations: 3 × 3 × 1 × 3 = 27 simulations
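
The expansion in the Topic Simulator example can be reproduced with a Cartesian product. This is a minimal sketch of the counting rule only, assuming each parameter is given as a list of values; the `expand` function and the dictionary keys are hypothetical names for illustration and do not reflect the platform's actual API or configuration format.

```python
from itertools import product


def expand(parameters, execution_runs):
    """List every simulation a batch would contain.

    parameters: dict mapping a parameter name to its list of values.
    Each combination of values is repeated once per execution run.
    """
    names = list(parameters)
    simulations = []
    for combo in product(*(parameters[name] for name in names)):
        for run in range(1, execution_runs + 1):
            simulations.append({**dict(zip(names, combo)), "run": run})
    return simulations


# Values taken from the Topic Simulator example above
topic_batch = {
    "topic": ["troubleshooting", "product_info", "billing"],
    "persona": ["technical", "casual", "frustrated"],
    "max_turns": [5],  # single value for the whole batch
}

simulations = expand(topic_batch, execution_runs=3)
print(len(simulations))  # 27
print(simulations[0])
```

Listing the combinations rather than only counting them makes it easy to confirm that each topic is paired with each persona the expected number of times.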

Evaluation System

Available Evaluators

Select from 6 specialized evaluators, each assessing a different aspect of the chatbot's responses:
  • Follow-up Refusal Detector: Ensures appropriate follow-up suggestions
  • Language Detection: Validates that input and output languages are the same
  • Product Relevance: Checks that recommended products are relevant
  • Product Specs Contradiction: Detects information inconsistencies between chatbot’s
  • Response Style: Evaluates stylistic elements of the response
  • Search Term Relevance: Ensures recommended search terms/product categories align with user needs