Purpose

Benchmarks provide deterministic, single-turn testing for your chatbot, complementing the multi-turn testing offered by Blast’s simulation engine. Simulations explore conversational flows and uncover previously unknown issues; benchmarks test specific, predetermined inputs to catch regressions.

Single-Turn Testing Concept

Benchmarks test individual question-answer pairs in isolation. Each test case consists of:

Input: A specific user query or message
Evaluation: Automated scoring against defined criteria

This approach allows you to validate specific functionality, edge cases, and compliance requirements with consistent, repeatable results.
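
As a rough illustration of the idea, the sketch below models a test case as an input plus an evaluation function and runs each case in isolation. It is not Blast's API: `BenchmarkCase`, `run_benchmark`, and `stub_chatbot` are hypothetical names invented for this example, and the stub stands in for a real chatbot.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BenchmarkCase:
    """Hypothetical single-turn test case: one input, one evaluation."""
    name: str
    user_input: str
    evaluate: Callable[[str], bool]  # returns True when the reply passes


def run_benchmark(cases: List[BenchmarkCase],
                  chatbot: Callable[[str], str]) -> Dict[str, bool]:
    """Send each input to the chatbot on its own, with no shared history."""
    results = {}
    for case in cases:
        reply = chatbot(case.user_input)
        results[case.name] = case.evaluate(reply)
    return results


def stub_chatbot(message: str) -> str:
    """Stand-in for a real chatbot (assumption for this sketch)."""
    if "refund" in message.lower():
        return "Refunds are available within 30 days of purchase."
    return "Sorry, I can't help with that."


cases = [
    BenchmarkCase("refund_policy", "What is your refund policy?",
                  lambda reply: "30 days" in reply),
    BenchmarkCase("out_of_scope", "Tell me a joke",
                  lambda reply: "can't help" in reply),
]

results = run_benchmark(cases, stub_chatbot)
print(results)
```

Because each case carries its own input and its own pass criterion, rerunning the suite after a change to the chatbot shows directly which predetermined behaviors still hold.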

Integration Strategy

The most effective testing approach combines all three tools:

Simulations for discovery and comprehensive testing
Benchmarks for validation and regression testing
Playground for manual verification and test case creation

Use simulation results to identify specific scenarios worth adding to your benchmark suite, creating a comprehensive testing strategy that covers both exploration and validation.
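
One way to picture this workflow is the sketch below, which adds issues found during simulation to a benchmark suite keyed by the triggering input. The `Finding` type, the `promote_findings` helper, and the sample data are all hypothetical and do not reflect how Blast stores simulation results.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    """Hypothetical record of an issue surfaced by a simulation run."""
    user_message: str  # the input that triggered the issue
    issue: str         # short description of what went wrong


def promote_findings(suite: Dict[str, str],
                     findings: List[Finding]) -> Dict[str, str]:
    """Return a new suite that also covers each finding's input.

    Inputs already in the suite keep their existing entry.
    """
    updated = dict(suite)
    for finding in findings:
        updated.setdefault(finding.user_message, finding.issue)
    return updated


suite = {"What is your refund policy?": "must mention the 30-day window"}
findings = [
    Finding("Can I get a refund after 60 days?", "bot promised a refund"),
    Finding("What is your refund policy?", "duplicate of an existing case"),
]

suite = promote_findings(suite, findings)
print(len(suite))
```

Keying the suite by input keeps it free of duplicates as more simulation runs feed into it.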
