
yb list

List available built-in evaluators and their descriptions.

yb list [options]

The list command displays all built-in evaluators available in youBencha. It shows evaluator names, descriptions, and key capabilities to help you choose the right evaluators for your use case.

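| Option | Short | Type | Default | Description |
|--------|-------|------|---------|-------------|
| --format | -f | string | table | Output format: table, json, yaml |
| --verbose | -v | flag | false | Show detailed evaluator information |
| --help | -h | flag | - | Show help message |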

List all available evaluators:

Terminal
yb list

Output:

Available Evaluators:
┌───────────────┬────────────────────────────────────────────┐
│ Evaluator     │ Description                                │
├───────────────┼────────────────────────────────────────────┤
│ git-diff      │ Analyzes Git changes made by the agent     │
│ expected-diff │ Compares output against expected reference │
│ agentic-judge │ AI-powered code quality assessment         │
└───────────────┴────────────────────────────────────────────┘
Use 'yb list -v' for detailed information.

Show detailed information about each evaluator:

Terminal
yb list -v
Output
git-diff
────────────────────────────────────────────
Analyzes Git changes made by the AI agent.
Metrics:
  files_changed      Number of files modified
  lines_added        Total lines added
  lines_removed      Total lines removed
  change_entropy     Distribution of changes
Assertions (optional):
  max_files_changed  Maximum allowed files
  max_lines_added    Maximum lines added
  max_lines_removed  Maximum lines removed
Use when: You want to track scope of changes or enforce limits.

Get evaluator list in JSON format:

Terminal
yb list --format json
Output
{
  "evaluators": [
    {
      "name": "git-diff",
      "description": "Analyzes Git changes made by the agent",
      "metrics": ["files_changed", "lines_added", "lines_removed", "change_entropy"],
      "requires_config": false
    },
    {
      "name": "expected-diff",
      "description": "Compares output against expected reference",
      "metrics": ["similarity_score", "matching_files", "differing_files"],
      "requires_config": true
    },
    {
      "name": "agentic-judge",
      "description": "AI-powered code quality assessment",
      "metrics": ["custom assertions"],
      "requires_config": true
    }
  ]
}
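
The JSON format is the easiest one to consume from a script. The following is a minimal Python sketch, not part of youBencha itself, that parses a payload shaped like the sample output above and picks out the evaluators that need configuration. The helper name `evaluators_needing_config` is hypothetical, and the payload is embedded as a string so the example runs without the CLI installed; in practice it would presumably come from capturing the output of `yb list --format json`.

```python
import json

# Sample payload copied from the documented output of `yb list --format json`.
SAMPLE = """
{
  "evaluators": [
    {"name": "git-diff",
     "description": "Analyzes Git changes made by the agent",
     "metrics": ["files_changed", "lines_added", "lines_removed", "change_entropy"],
     "requires_config": false},
    {"name": "expected-diff",
     "description": "Compares output against expected reference",
     "metrics": ["similarity_score", "matching_files", "differing_files"],
     "requires_config": true},
    {"name": "agentic-judge",
     "description": "AI-powered code quality assessment",
     "metrics": ["custom assertions"],
     "requires_config": true}
  ]
}
"""


def evaluators_needing_config(payload: str) -> list[str]:
    """Return the names of evaluators whose requires_config field is true."""
    data = json.loads(payload)
    return [e["name"] for e in data["evaluators"] if e["requires_config"]]


if __name__ == "__main__":
    print(evaluators_needing_config(SAMPLE))
```

Filtering on `requires_config` is one quick way to see which evaluators can be dropped into a suite as-is and which need extra settings first.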

Get evaluator list in YAML format:

Terminal
yb list --format yaml
Output
evaluators:
  - name: git-diff
    description: Analyzes Git changes made by the agent
    requires_config: false
  - name: expected-diff
    description: Compares output against expected reference
    requires_config: true
  - name: agentic-judge
    description: AI-powered code quality assessment
    requires_config: true

The git-diff evaluator measures the scope and distribution of changes made by the AI agent.

| Metric | Type | Description |
|--------|------|-------------|
| files_changed | number | Count of modified files |
| lines_added | number | Total lines added |
| lines_removed | number | Total lines removed |
| change_entropy | number | Distribution score (0-1) |

Use when: You want to track change scope, enforce limits, or analyze change patterns.
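
This page does not say how change_entropy is computed. Purely to illustrate what a 0-1 distribution score can mean, the Python sketch below uses normalized Shannon entropy over per-file line counts: values near 0 mean the changes are concentrated in one file, values near 1 mean they are spread evenly. The function name and the choice of formula are assumptions, not youBencha's documented behavior.

```python
import math


def normalized_entropy(lines_changed_per_file: list[int]) -> float:
    """Illustrative 0-1 distribution score (assumed formula, not youBencha's)."""
    # Ignore files with no changed lines; they carry no probability mass.
    counts = [c for c in lines_changed_per_file if c > 0]
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        # Zero or one changed file: the change is fully concentrated.
        return 0.0
    # Shannon entropy of the share of changed lines in each file.
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    # Divide by the maximum possible entropy so the result lies in [0, 1].
    return entropy / math.log(len(counts))


if __name__ == "__main__":
    print(normalized_entropy([10, 10, 10, 10]))  # evenly spread, close to 1
    print(normalized_entropy([100, 1, 1]))       # concentrated, much lower
```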

Configuration Example
evaluators:
  - name: git-diff
    config:
      assertions:
        max_files_changed: 5
        max_lines_added: 100
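
To make the intent of these limits concrete, here is a small Python sketch of how max_* assertions could be checked against the git-diff metrics. It assumes each `max_<metric>` assertion is an inclusive upper bound on the metric of the same name; that interpretation, and the helper name `check_limits`, are assumptions rather than documented youBencha behavior.

```python
def check_limits(metrics: dict[str, int], assertions: dict[str, int]) -> list[str]:
    """Return one message per max_* assertion that the metrics exceed."""
    violations = []
    for key, limit in assertions.items():
        if not key.startswith("max_"):
            continue  # only max_* limits are handled in this sketch
        metric = key[len("max_"):]  # e.g. max_files_changed -> files_changed
        value = metrics.get(metric)
        # Assumption: the limit is inclusive, so only value > limit fails.
        if value is not None and value > limit:
            violations.append(f"{metric}={value} exceeds {key}={limit}")
    return violations


if __name__ == "__main__":
    metrics = {"files_changed": 7, "lines_added": 80, "lines_removed": 12}
    assertions = {"max_files_changed": 5, "max_lines_added": 100}
    print(check_limits(metrics, assertions))
    # prints ['files_changed=7 exceeds max_files_changed=5']
```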

The expected-diff evaluator compares the agent’s output against a known-correct reference implementation.

| Metric | Type | Description |
|--------|------|-------------|
| similarity_score | number | Overall similarity (0-1) |
| matching_files | number | Files that match exactly |
| differing_files | number | Files with differences |

Requires: expected_source and expected in suite config.

Configuration Example
expected_source: branch
expected: feature/completed
evaluators:
  - name: expected-diff
    config:
      threshold: 0.85
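
Because expected-diff depends on two top-level suite keys, it can be useful to catch a missing one before a run. The Python sketch below shows one way to do that on a suite config that has already been loaded into a dictionary mirroring the YAML above. The helper name `missing_expected_keys` is hypothetical, and youBencha may well perform its own validation.

```python
REQUIRED_KEYS = ("expected_source", "expected")


def missing_expected_keys(suite: dict) -> list[str]:
    """Return required top-level keys that are absent when expected-diff is used."""
    uses_expected_diff = any(
        evaluator.get("name") == "expected-diff"
        for evaluator in suite.get("evaluators", [])
    )
    if not uses_expected_diff:
        return []  # the requirement only applies when expected-diff is listed
    return [key for key in REQUIRED_KEYS if key not in suite]


if __name__ == "__main__":
    suite = {
        "expected_source": "branch",
        "evaluators": [{"name": "expected-diff", "config": {"threshold": 0.85}}],
    }
    print(missing_expected_keys(suite))  # prints ['expected']
```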

The agentic-judge evaluator uses an AI agent to evaluate code quality based on custom assertions.

| Metric | Type | Description |
|--------|------|-------------|
| Custom assertions | number | Each returns 0-1 score |

Use when: You need a subjective quality assessment or want to check complex criteria.

Configuration Example
evaluators:
  - name: agentic-judge
    config:
      type: copilot-cli
      assertions:
        has_tests: "Unit tests were added. Score 1 if yes, 0 if no."
        clean_code: "Code follows best practices. Score 0-1."

Exit codes:

| Code | Meaning |
|------|---------|
| 0 | List displayed successfully |