yb suggest-suite

Generate evaluation suite suggestions by interacting with an AI agent.

Synopsis

yb suggest-suite --agent <type> --output-dir <path> [options]

Description

The suggest-suite command uses an AI agent to analyze existing agent output and generate matching evaluation criteria. It helps you bootstrap a suite configuration by:

- Analyzing what the agent produced
- Detecting patterns in the changes
- Recommending appropriate evaluators
- Generating complete configuration files

Options

Option	Short	Type	Default	Description
`--agent`	`-a`	string	Required	Agent to use (e.g., `copilot-cli`)
`--output-dir`	`-o`	string	Required	Path to the output folder of a successful agent run
`--agent-file`		string	`agents/suggest-suite.agent.md`	Custom agent file for suggestions
`--save`	`-s`	string	stdout	Save generated suite to file
`--interactive`	`-i`	flag	`true`	Ask clarifying questions
`--no-interactive`		flag	`false`	Skip questions, use defaults
`--help`	`-h`	flag	-	Show help message

Examples

Basic Usage

Analyze an agent output and get suggestions:

yb suggest-suite \
  --agent copilot-cli \
  --output-dir .youbencha-workspace/run-2024-11-15/src-modified

Save to File

Generate and save the suggested configuration:

yb suggest-suite \
  --agent copilot-cli \
  --output-dir ./agent-output \
  --save suggested-suite.yaml

Non-Interactive Mode

Skip questions and use defaults:

yb suggest-suite \
  --agent copilot-cli \
  --output-dir ./agent-output \
  --no-interactive \
  --save auto-suite.yaml

Custom Agent File

Use a custom suggestion agent:

yb suggest-suite \
  --agent copilot-cli \
  --output-dir ./agent-output \
  --agent-file ./my-custom-suggester.agent.md

Interactive Workflow

When you run suggest-suite, the AI agent performs these steps:

1. Analyze output directory - Examine what the agent produced
2. Ask about baseline - Determine if there’s a reference to compare against
3. Request original intent - Understand what the agent was supposed to do
4. Detect patterns - Identify auth changes, tests, API modifications, docs updates
5. Recommend evaluators - Suggest appropriate evaluators with reasoning
6. Generate configuration - Output a complete suite configuration

Example interactive session:

$ yb suggest-suite --agent copilot-cli --output-dir ./agent-output --save suggested-suite.yaml

🔍 Analyzing output directory...

Found 3 modified files:
  - src/auth/login.ts
  - src/auth/middleware.ts
  - tests/auth.test.ts

📋 Questions:

1. Is there a reference branch to compare against?
   > Yes, feature/auth-complete

2. What was the original task?
   > Add JWT authentication to the login endpoint

3. Should tests be required?
   > Yes

🎯 Recommended Suite:

Based on your answers, I recommend:

✅ expected-diff (threshold: 0.85)
   - You have a reference branch for comparison

✅ git-diff
   - Track scope: max 5 files, max 200 lines

✅ agentic-judge
   - Assertions:
     - jwt_implemented: "JWT authentication is implemented"
     - tests_added: "Unit tests cover the new auth flow"
     - middleware_secured: "Routes are protected by auth middleware"

Generating suite.yaml...

📄 Configuration saved to: suggested-suite.yaml

With --no-interactive, the agent skips the questions and relies on auto-detected patterns:

$ yb suggest-suite --agent copilot-cli --output-dir ./agent-output --no-interactive

🔍 Analyzing output directory...

Found 3 modified files:
  - src/auth/login.ts
  - src/auth/middleware.ts
  - tests/auth.test.ts

🎯 Auto-detected patterns:

✅ Authentication changes detected
✅ Test files present
✅ No reference branch (using git-diff only)

📄 Generated configuration:

repo: https://github.com/...
branch: main
agent:
  type: copilot-cli
  config:
    prompt: "[auto-detected from changes]"
evaluators:
  - name: git-diff
  - name: agentic-judge
    config:
      type: copilot-cli
      assertions:
        auth_implemented: "Authentication functionality is implemented correctly"
        tests_present: "Tests cover the new functionality"
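
Because the non-interactive output contains placeholders, it can help to sanity-check a saved suite before running it. The following is a minimal sketch, not part of yb: the function name check_suite_keys is hypothetical, and the list of top-level keys (repo, branch, agent, evaluators) is an assumption inferred from the examples on this page rather than a documented schema.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a generated suite file.
# ASSUMPTION: the key list below is inferred from the examples on this page,
# not taken from a documented schema.
check_suite_keys() {
  file="$1"
  missing=""
  for key in repo branch agent evaluators; do
    # Only top-level keys (no leading indentation) are considered.
    grep -q "^${key}:" "$file" || missing="$missing $key"
  done
  if [ -n "$missing" ]; then
    echo "missing top-level keys:$missing"
    return 1
  fi
  echo "ok"
}

# Usage:
#   check_suite_keys auto-suite.yaml
```

This only looks for key names with grep; it does not parse YAML, so a proper review of the file is still needed.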

Output Example

The generated configuration includes settings for each recommended evaluator:

# Generated by yb suggest-suite
# Review and customize before running

repo: https://github.com/your-org/your-repo.git
branch: main
expected_source: branch
expected: feature/auth-complete

agent:
  type: copilot-cli
  config:
    prompt: "Add JWT authentication to the login endpoint"

evaluators:
  - name: expected-diff
    config:
      threshold: 0.85

  - name: git-diff
    config:
      assertions:
        max_files_changed: 5
        max_lines_added: 200

  - name: agentic-judge
    config:
      type: copilot-cli
      assertions:
        jwt_implemented: "JWT authentication is properly implemented. Score 1 if complete, 0.5 if partial, 0 if missing."
        tests_added: "Unit tests cover the authentication flow. Score 1 if comprehensive, 0.5 if basic, 0 if none."
        middleware_secured: "API routes are protected by authentication middleware. Score 1 if all routes secured, 0 if not."
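
To get a feel for what limits such as max_files_changed: 5 and max_lines_added: 200 mean in practice, the sketch below counts changed files and added lines from `git diff --numstat` output. It is an illustration only: how the git-diff evaluator actually measures scope is not described here, and check_diff_scope is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `git diff --numstat` output on stdin and compares
# it with limits like max_files_changed and max_lines_added.
# ASSUMPTION: this is not how the git-diff evaluator is implemented; it only
# illustrates the kind of scope the limits describe.
check_diff_scope() {
  max_files="$1"
  max_lines="$2"
  awk -v max_files="$max_files" -v max_lines="$max_lines" '
    # numstat columns: added, deleted, path. Binary files report "-".
    NF >= 3 { files++; if ($1 != "-") added += $1 }
    END {
      printf "files_changed=%d lines_added=%d\n", files, added
      if (files > max_files || added > max_lines) exit 1
    }
  '
}

# Usage (inside a git repository):
#   git diff --numstat main...HEAD | check_diff_scope 5 200
```

The function prints the two counts and returns a non-zero status when either limit is exceeded, so it can be used to pick realistic limits before editing the generated suite.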

Pattern Detection

The AI agent can detect these common patterns:

Pattern	Detection Criteria	Suggested Evaluators
Authentication	Auth-related file changes	agentic-judge with security assertions
Tests	Test file additions	git-diff with test requirements
API Changes	Route/endpoint modifications	agentic-judge with API assertions
Documentation	README/docs updates	git-diff with doc requirements
Configuration	Config file changes	agentic-judge with config validation
Database	Migration/model changes	agentic-judge with data integrity checks
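
The table above can be approximated with a simple path-based heuristic, which is sometimes handy for predicting what the agent is likely to flag. This is a rough sketch only: the actual detection is performed by the AI agent, and the function name classify_path and every glob pattern below are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical path classifier loosely mirroring the pattern table.
# ASSUMPTION: the real detection is done by the AI agent; these globs are
# illustrative guesses, not the tool's rules.
classify_path() {
  case "$1" in
    # Tests are checked first so that tests/auth.test.ts counts as a test.
    *.test.*|*.spec.*|tests/*|*/tests/*) echo "tests" ;;
    *auth*)                              echo "authentication" ;;
    *migration*|*/models/*)              echo "database" ;;
    *route*|*/api/*)                     echo "api" ;;
    README*|*/README*|docs/*|*/docs/*)   echo "documentation" ;;
    *config*|*.yaml|*.yml|*.toml)        echo "configuration" ;;
    *)                                   echo "other" ;;
  esac
}

# Usage:
#   classify_path src/auth/login.ts
```

The order of the case arms matters because the first matching pattern wins, which is why the test patterns come before the broader *auth* pattern.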

Exit Codes

Code	Meaning
`0`	Suite generated successfully
`1`	Output directory not found
`2`	Agent not available
`3`	User cancelled interactive prompts
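
In scripts, these exit codes can be turned into readable messages. The sketch below assumes only the codes listed in the table; the function name explain_exit_code is hypothetical, and any other code is reported as unexpected.

```shell
#!/bin/sh
# Hypothetical helper mapping the documented exit codes to messages.
explain_exit_code() {
  case "$1" in
    0) echo "suite generated successfully" ;;
    1) echo "output directory not found" ;;
    2) echo "agent not available" ;;
    3) echo "user cancelled interactive prompts" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Usage (assumes `yb` is on PATH):
#   yb suggest-suite --agent copilot-cli --output-dir ./agent-output \
#     --no-interactive --save auto-suite.yaml
#   explain_exit_code "$?"
```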

See Also

Configuration Reference - Complete configuration options
Evaluators Overview - Available evaluators
agentic-judge - AI-powered evaluation