Creating and Managing Benchmarks
Benchmarks in Blast are collections of single-turn test cases that validate your chatbot’s performance on specific prompts. Each benchmark serves as a focused test suite that you can run repeatedly to track improvements and catch regressions.
Creating New Benchmarks
Benchmark Setup
Create a new benchmark from the benchmarks dashboard:
- New Benchmark: Click the “New Benchmark” button
- Basic Information:
- Name: Descriptive benchmark name (e.g., “Customer Service Basics”, “Product Search Tests”)
- Description: Purpose and scope of the benchmark
- Tags: Optional tags for organization and filtering
- Add Test Cases: Add initial test cases using either:
- Bulk Add Rows: Type multiple test prompts directly
- CSV Upload: Import test cases from a CSV file
- Create: Save the benchmark with your test cases
Benchmark Organization
Use clear naming and tagging strategies to keep benchmarks organized:
- Functional Names: “Payment Processing”, “Return Policy”, “Product Recommendations”
- Priority Tags: “critical”, “regression”, “new-features”
- Domain Tags: “customer-service”, “e-commerce”, “technical-support”
Adding Test Cases
CSV File Upload
Upload multiple test cases at once using a CSV file.
CSV Format Requirements
Your CSV file must contain a single column:
- prompt: The user input or query to test
Example CSV Format
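A minimal file has the prompt header on the first line and one test prompt per row (the prompts below are illustrative):

```csv
prompt
How do I return an item I bought last week?
What payment methods do you accept?
Do you ship internationally?
```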
Upload Process
- File Upload Tab: Select the “File Upload” option when creating or editing a benchmark
- Choose File: Select your prepared CSV file
- Validation: Blast automatically validates the format
- Preview: Review the parsed test cases
- Import: Confirm to add all test cases to your benchmark
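If you generate test prompts programmatically, a short script can produce a file in the required single-column format. This is a minimal sketch; the filename and prompts are placeholders, not part of Blast itself:

```python
import csv

# Placeholder prompts; replace with your own test inputs.
prompts = [
    "How do I return an item I bought last week?",
    "What payment methods do you accept?",
    "Do you ship internationally?",
]

# Blast expects a single column named "prompt" (see format above).
with open("benchmark_tests.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["prompt"])            # header row
    writer.writerows([p] for p in prompts) # one prompt per row
```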
Manual Test Addition (Bulk Add Rows)
Add multiple test cases manually using the bulk interface:
- Bulk Add Rows: Select this option when creating or editing a benchmark
- Add Prompts: Type each test prompt in the provided fields
- Add More Rows: Click “Add Row” to include additional test cases
- Save: Add all prompts to your benchmark
Example Test Cases
Create prompts that represent real user interactions, for example:
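A few illustrative prompts, drawn from the kinds of domains tagged above (adapt them to your own chatbot):
- “Can I return a sale item after 30 days?” (return policy)
- “Which card types do you accept?” (payment processing)
- “Recommend a laptop for video editing under $1,000.” (product recommendations)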
Adding Tests from Playground
Database Icon Workflow
Convert playground testing sessions into benchmark test cases:
- Playground Testing: Test scenarios manually in the playground
- Database Icon: Click the database icon next to any query you want to save
- Add to Benchmark Flow:
- Edit Prompt: Modify the query text if needed
- Select Benchmark: Choose an existing benchmark or create a new one
- Add Test: Save the prompt as a new test case
Workflow Example
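As an illustrative scenario: while testing return-policy handling in the playground, you send the prompt “Can I return a sale item after 30 days?” and decide the response is worth tracking over time. You click the database icon next to the query, tighten the wording if needed, select the “Return Policy” benchmark (or create one), and save the prompt as a new single-turn test case.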
Adding Tests from Simulation Conversations
Database Icon in Conversations
Convert simulation interactions into focused benchmark tests:
- Review Simulations: Analyze completed simulation conversations
- Identify Key Interactions: Find specific user inputs worth testing individually
- Database Icon: Click the database icon next to the relevant user input
- Add to Benchmark:
- Edit Prompt: Adjust the input text if needed (e.g., add context that the question lacks when read on its own)
- Select Benchmark: Choose target benchmark or create new one
- Save Test: Add as a single-turn test case
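For example (illustrative), a mid-conversation input like “What about the blue one?” only makes sense given earlier turns. Before saving it as a single-turn test, rewrite it as something self-contained, such as “Do you have the blue version of this jacket in stock?”, so the benchmark can evaluate it without the surrounding conversation.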