Creating and Managing Benchmarks

Benchmarks in Blast are collections of single-turn test cases that validate your chatbot’s performance on specific prompts. Each benchmark serves as a focused test suite that you can run repeatedly to track improvements and catch regressions.

Creating New Benchmarks

Benchmark Setup

Create a new benchmark from the benchmarks dashboard:
  1. New Benchmark: Click the “New Benchmark” button
  2. Basic Information:
    • Name: Descriptive benchmark name (e.g., “Customer Service Basics”, “Product Search Tests”)
    • Description: Purpose and scope of the benchmark
    • Tags: Optional tags for organization and filtering
  3. Add Test Cases: Add initial test cases using either:
    • Bulk Add Rows: Type multiple test prompts directly
    • CSV Upload: Import test cases from a CSV file
  4. Create: Save the benchmark with your test cases

Benchmark Organization

Use clear naming and tagging strategies to keep benchmarks organized:
  • Functional Names: “Payment Processing”, “Return Policy”, “Product Recommendations”
  • Priority Tags: “critical”, “regression”, “new-features”
  • Domain Tags: “customer-service”, “e-commerce”, “technical-support”

Adding Test Cases

CSV File Upload

Upload multiple test cases at once using a CSV file:

CSV Format Requirements

Your CSV file must contain a single column:
  • prompt: The user input or query to test

Example CSV Format

prompt
"What are your store hours?"
"How do I return a defective item?"
"Do you ship internationally?"
"What payment methods do you accept?"
"I can't find my order confirmation"
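If you assemble prompts programmatically, Python's built-in csv module takes care of quoting and escaping for you. A minimal sketch of producing the single-column file above; the prompt list and filename are just examples:

```python
import csv

# Hypothetical example prompts; replace with your own test inputs.
prompts = [
    "What are your store hours?",
    "How do I return a defective item?",
    "Do you ship internationally?",
]

# Write the single-column format Blast expects:
# a "prompt" header followed by one prompt per row.
with open("benchmark_prompts.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, quoting=csv.QUOTE_ALL)
    writer.writerow(["prompt"])
    for p in prompts:
        writer.writerow([p])
```

Letting the csv module do the writing avoids hand-rolled quoting bugs when prompts contain commas or quotation marks.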

Upload Process

  1. File Upload Tab: Select the “File Upload” option when creating or editing a benchmark
  2. Choose File: Select your prepared CSV file
  3. Validation: Blast automatically validates the format
  4. Preview: Review the parsed test cases
  5. Import: Confirm to add all test cases to your benchmark
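Blast validates the file during upload, but a quick local pre-flight check can catch format problems before you get there. A sketch, assuming the single-column format described above; validate_prompt_csv is a hypothetical helper, not part of Blast:

```python
import csv

def validate_prompt_csv(path):
    """Return a list of problems found in a prompt CSV (empty list = OK).

    Checks for the single 'prompt' header, one column per row,
    and no blank prompts.
    """
    problems = []
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != ["prompt"]:
            problems.append(f"expected a single 'prompt' header, got {header}")
        for lineno, row in enumerate(reader, start=2):
            if len(row) != 1:
                problems.append(f"line {lineno}: expected 1 column, got {len(row)}")
            elif not row[0].strip():
                problems.append(f"line {lineno}: empty prompt")
    return problems
```

Run it on the file you are about to upload; an empty result means the file matches the expected shape.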

Manual Test Addition (Bulk Add Rows)

Add multiple test cases manually using the bulk interface:
  1. Bulk Add Rows: Select this option when creating or editing a benchmark
  2. Add Prompts: Type each test prompt in the provided fields
  3. Add More Rows: Click “Add Row” to include additional test cases
  4. Save: Add all prompts to your benchmark

Example Test Cases

Create prompts that represent real user interactions:
prompt
"What's your return policy for electronics?"
"Can I change my shipping address after ordering?"
"Do you price match competitor offers?"
"How long does standard shipping take?"
"What if I receive a damaged product?"
"Can I cancel my order?"
"Do you have a loyalty program?"
"What are your customer service hours?"

Adding Tests from Playground

Database Icon Workflow

Convert playground testing sessions into benchmark test cases:
  1. Playground Testing: Test scenarios manually in the playground
  2. Database Icon: Click the database icon next to any query you want to save
  3. Add to Benchmark Flow:
    • Edit Prompt: Modify the query text if needed
    • Select Benchmark: Choose an existing benchmark or create a new one
    • Add Test: Save the prompt as a new test case

Workflow Example

Playground Session → Identify Valuable Tests → Click Database Icon → 
Select/Create Benchmark → Save as Test Case → Run Benchmark

Adding Tests from Simulation Conversations

Database Icon in Conversations

Convert simulation interactions into focused benchmark tests:
  1. Review Simulations: Analyze completed simulation conversations
  2. Identify Key Interactions: Find specific user inputs worth testing individually
  3. Database Icon: Click the database icon next to the relevant user input
  4. Add to Benchmark:
    • Edit Prompt: Adjust the input text if needed (e.g., if the question alone is missing context that must be added)
    • Select Benchmark: Choose target benchmark or create new one
    • Save Test: Add as a single-turn test case

Managing Existing Benchmarks

Benchmark Dashboard

Navigate and manage your benchmark collection:
  • Benchmark List: View all benchmarks with key metrics
  • Columns: Name, Description, Number of Rows, Tags, Last Updated, Created

Individual Benchmark Management

Once inside a specific benchmark:
  • Test List: View all test cases with their recent results
  • Add Tests: Add new test cases using any of the methods above
  • Run Tests: Execute individual tests or the entire benchmark
  • Delete Tests: Remove outdated or irrelevant test cases
  • Export: Export test cases

Next Steps

  • [Run your benchmarks](/benchmarks/running-tests) to validate chatbot performance
  • [Analyze results](/benchmarks/results-analysis) to identify improvement opportunities
  • [Compare performance over time](/benchmarks/results-analysis) to track progress