Suite YAML Reference

This page documents all available fields in the suite configuration file (suite.yaml).

Schema Overview

Field	Type	Required	Default	Description
`name`	string	No	-	Test case name
`description`	string	No	-	Test case description
`repo`	string	Yes	-	Repository URL (HTTP/HTTPS)
`branch`	string	No	default branch	Git branch to check out
`commit`	string	No	-	Specific commit SHA
`expected_source`	string	No	-	Reference type (`branch`, `commit`)
`expected`	string	No	-	Reference branch/commit
`timeout`	number	No	`300000`	Timeout in milliseconds
`workspace_dir`	string	No	`.youbencha-workspace`	Workspace directory
`agent`	object	Yes	-	Agent configuration
`evaluators`	array	Yes	-	List of evaluators
`pre_execution`	array	No	`[]`	Pre-execution hooks
`post_evaluation`	array	No	`[]`	Post-evaluation hooks

Complete Example

# Test case metadata
name: add-auth-feature
description: Evaluate adding JWT authentication to the API

# Repository configuration
repo: https://github.com/example/api-server.git
branch: main
commit: abc123  # Optional: specific commit

# Expected reference (for comparison)
expected_source: branch
expected: feature/auth-complete

# Timeout configuration
timeout: 300000  # 5 minutes

# Workspace directory
workspace_dir: .youbencha-workspace

# AI Agent
agent:
  type: copilot-cli
  model: claude-sonnet-4.5
  agent_name: my-custom-agent
  config:
    prompt_file: ./prompts/add-auth.md

# Pre-execution hooks
pre_execution:
  - name: script
    config:
      command: bash
      args: ["-c", "echo 'Setup complete'"]
      timeout_ms: 30000

# Evaluators
evaluators:
  - name: git-diff
    config:
      assertions:
        max_files_changed: 10

  - name: expected-diff
    config:
      threshold: 0.85

  - name: agentic-judge
    config:
      type: copilot-cli
      assertions:
        auth_implemented: "JWT auth is implemented. Score 1 if yes, 0 if no."

# Post-evaluation hooks
post_evaluation:
  - name: database
    config:
      type: json-file
      output_path: ./results.jsonl
      append: true

  - name: webhook
    config:
      url: ${SLACK_WEBHOOK_URL}
      method: POST

Repository Configuration

repo (required)

The repository URL to clone and evaluate.

repo: https://github.com/example/repo.git

Requirements:

Must be HTTP or HTTPS URL
No localhost or internal network URLs (security)

branch

The branch to check out.

branch: main

Default: Repository’s default branch

commit

Check out a specific commit SHA.

commit: abc123def456

Note: If both branch and commit are specified, the commit is checked out after the branch.
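As a sketch of pinning a run to an exact revision, the fragment below sets both fields together. The repository URL, branch name, and SHA are placeholders taken from the examples above, not real values.

```yaml
# Placeholder values, for illustration only
repo: https://github.com/example/repo.git
branch: main          # checked out first
commit: abc123def456  # then this commit is checked out
```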

Expected Reference

Configure a reference for comparison using the expected-diff evaluator.

expected_source

Type of reference source.

expected_source: branch  # or: commit

expected

The reference branch name or commit SHA.

expected_source: branch
expected: feature/completed
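To illustrate how the reference fields pair with the evaluator that consumes them, here is a sketch that uses a commit as the reference instead of a branch. The SHA is a placeholder.

```yaml
# Reference the expected result by commit (placeholder SHA)
expected_source: commit
expected: abc123def456

# The expected-diff evaluator compares the agent's output to that reference
evaluators:
  - name: expected-diff
    config:
      threshold: 0.85
```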

Agent Configuration

agent.type (required)

The agent adapter to use.

agent:
  type: copilot-cli

Supported values: copilot-cli

agent.model

Specify the AI model to use.

agent:
  type: copilot-cli
  model: claude-sonnet-4.5

Supported models:

claude-sonnet-4.5, claude-sonnet-4, claude-haiku-4.5
gpt-5, gpt-5.1, gpt-5.1-codex-mini, gpt-5.1-codex
gemini-3-pro-preview

agent.agent_name

Use a named agent from the .github/agents/ directory.

agent:
  type: copilot-cli
  agent_name: my-custom-agent

When specified:

The .github/agents/ directory is copied to the workspace
The agent is invoked with the --agent <name> flag

agent.config.prompt

Inline prompt for the agent.

agent:
  type: copilot-cli
  config:
    prompt: "Add error handling to all API endpoints"

agent.config.prompt_file

Load prompt from an external file.

agent:
  type: copilot-cli
  config:
    prompt_file: ./prompts/add-auth.md
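For prompts that are longer than one line but not worth a separate file, a YAML block scalar may be convenient. This sketch assumes that prompt accepts any YAML string; the prompt text itself is only illustrative.

```yaml
agent:
  type: copilot-cli
  config:
    # "|" is standard YAML block-scalar syntax and preserves line breaks
    prompt: |
      Add error handling to all API endpoints.
      Keep the existing response format unchanged.
```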

Timeout Configuration

timeout

Maximum time, in milliseconds, for the entire evaluation run.

timeout: 300000  # 5 minutes (default)

Value	Duration
`60000`	1 minute
`300000`	5 minutes (default)
`600000`	10 minutes
`1800000`	30 minutes

workspace_dir

Directory where the repository is cloned and the agent operates.

workspace_dir: .youbencha-workspace

Default: .youbencha-workspace

The directory is created relative to where yb run is executed.
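As a worked example of the millisecond arithmetic, a 10-minute limit is 10 x 60 x 1000 = 600000. The sketch below combines that with a custom workspace location; the directory name is hypothetical.

```yaml
timeout: 600000            # 10 min x 60 s x 1000 ms
workspace_dir: .bench-tmp  # hypothetical name, resolved relative to where yb run is executed
```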

Evaluators Configuration

Evaluator Schema

Each evaluator in the evaluators array follows this schema:

Field	Type	Required	Description
`name`	string	Yes*	Built-in evaluator name
`file`	string	Yes*	Path to external evaluator file
`config`	object	No	Evaluator-specific configuration

*Specify exactly one of name or file for each evaluator, not both.

Basic Evaluator

evaluators:
  - name: git-diff

Evaluator with Config

evaluators:
  - name: agentic-judge
    config:
      type: copilot-cli
      assertions:
        test_added: "Tests were added. Score 1 if yes, 0 if no."

External Evaluator Definition

Reference an evaluator defined in a separate file:

evaluators:
  - file: ./evaluators/test-coverage.yaml
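The name/file rule applies per entry, so a single list can presumably mix both kinds. The following sketch assumes that mixing is allowed, which the schema above implies but does not state outright.

```yaml
evaluators:
  # Built-in evaluator, referenced by name
  - name: git-diff
  # External evaluator, referenced by file (path reused from the example above)
  - file: ./evaluators/test-coverage.yaml
```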

git-diff Configuration

Option	Type	Description
`assertions.max_files_changed`	number	Maximum files that can be modified
`assertions.max_lines_added`	number	Maximum lines that can be added
`assertions.max_lines_removed`	number	Maximum lines that can be removed
`assertions.max_total_changes`	number	Maximum total changes
`assertions.min_change_entropy`	number	Minimum entropy (enforce distributed changes)
`assertions.max_change_entropy`	number	Maximum entropy (enforce focused changes)

evaluators:
  - name: git-diff
    config:
      assertions:
        max_files_changed: 10
        max_lines_added: 200
        max_total_changes: 300

expected-diff Configuration

Option	Type	Default	Description
`threshold`	number	`0.85`	Similarity threshold (0.0 - 1.0)

evaluators:
  - name: expected-diff
    config:
      threshold: 0.85

Threshold Guidelines:

Range	Description
`1.0`	Exact match (very strict)
`0.9-0.99`	Very similar, minor differences
`0.7-0.89`	Mostly similar, moderate differences
`<0.7`	Significantly different (lenient)
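To make the guideline bands concrete, this sketch raises the threshold into the 0.9-0.99 band. It assumes the evaluator passes when measured similarity meets or exceeds the threshold, which is the usual reading but is not spelled out here.

```yaml
evaluators:
  - name: expected-diff
    config:
      # Stricter than the 0.85 default: only minor differences are tolerated
      threshold: 0.95
```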

agentic-judge Configuration

Option	Type	Description
`type`	string	Agent type (e.g., `copilot-cli`)
`agent_name`	string	Named agent from `.github/agents/`
`model`	string	AI model to use
`prompt_file`	string	Custom instructions file
`assertions`	object	Key-value pairs of assertions

evaluators:
  - name: agentic-judge
    config:
      type: copilot-cli
      model: claude-sonnet-4.5
      assertions:
        code_quality: "Code follows best practices. Score 0-1."
        tests_added: "Appropriate tests were added. Score 0-1."
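The options in the table can be combined. As a sketch, the judge below runs as a named agent with its own instructions file; both the agent name and the file path are hypothetical.

```yaml
evaluators:
  - name: agentic-judge
    config:
      type: copilot-cli
      agent_name: my-custom-agent      # hypothetical agent under .github/agents/
      prompt_file: ./prompts/judge.md  # hypothetical instructions file
      assertions:
        tests_added: "Appropriate tests were added. Score 0-1."
```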

Hooks Configuration

Pre-Execution Hooks

Pre-execution hooks run after workspace setup but before agent execution.

Field	Type	Required	Description
`name`	string	Yes	Hook type (`script`)
`config.command`	string	Yes	Command to run
`config.args`	array	No	Command arguments
`config.timeout_ms`	number	No	Timeout in milliseconds

pre_execution:
  - name: script
    config:
      command: bash
      args: ["-c", "echo 'Setup complete'"]
      timeout_ms: 30000

Available Environment Variables:

Variable	Description
`WORKSPACE_DIR`	Absolute path to workspace directory
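A hook script can read WORKSPACE_DIR like any other environment variable. The sketch below lists the workspace contents before the agent starts. It writes the variable without braces so that bash expands it at run time, on the assumption that only the ${VAR_NAME} form is substituted when the suite file is loaded.

```yaml
pre_execution:
  - name: script
    config:
      command: bash
      # bash expands $WORKSPACE_DIR when the hook runs
      args: ["-c", 'ls -la "$WORKSPACE_DIR"']
      timeout_ms: 30000
```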

Post-Evaluation Hooks

Post-evaluation hooks run after all evaluators complete.

Database Hook

post_evaluation:
  - name: database
    config:
      type: json-file
      output_path: ./results.jsonl
      append: true

Option	Type	Description
`type`	string	Storage type (`json-file`)
`output_path`	string	Path to output file
`append`	boolean	Append to existing file

Webhook Hook

post_evaluation:
  - name: webhook
    config:
      url: ${SLACK_WEBHOOK_URL}
      method: POST

Option	Type	Description
`url`	string	Webhook URL
`method`	string	HTTP method (`POST`)

Environment Variables

Use ${VAR_NAME} syntax to reference environment variables:

post_evaluation:
  - name: webhook
    config:
      url: ${SLACK_WEBHOOK_URL}