Reusable Evaluator Definitions
Define evaluators in separate files to reuse them across multiple suites and keep configurations DRY.
Overview
Instead of defining evaluators inline in each suite, you can:
- Create evaluator definition files
- Reference them in your suites
- Share across multiple test cases
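As a rough illustration of the third point, two suites can point at the same definition file. This is only a sketch: the suite file names and the second prompt are hypothetical, and the field layout is assumed to match the suite example later on this page.

```yaml
# suite-auth.yaml (hypothetical suite file)
repo: https://github.com/example/app.git
branch: main

agent:
  type: copilot-cli
  config:
    prompt: "Add user authentication feature"

evaluators:
  - file: ./evaluators/security.yaml   # shared definition

---
# suite-upload.yaml (hypothetical second suite, kept in a separate file)
repo: https://github.com/example/app.git
branch: main

agent:
  type: copilot-cli
  config:
    prompt: "Add file upload endpoint"

evaluators:
  - file: ./evaluators/security.yaml   # same file, nothing duplicated
```

The `---` line only separates the two files in this listing. A change to `security.yaml` would then apply to both suites.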
Creating Evaluator Files
Evaluator Definition Structure
```yaml
name: agentic-judge:test-coverage
description: "Ensures code changes include appropriate test coverage"

config:
  type: copilot-cli
  agent_name: agentic-judge
  assertions:
    unit_tests: "New code has unit tests. Score 1 if comprehensive, 0.5 if basic, 0 if none."
    edge_cases: "Edge cases are tested. Score 1 if covered, 0 if missing."
    test_quality: "Tests are well-structured and maintainable. Score 0-1."
```

Referencing in Suite
```yaml
evaluators:
  # External evaluator definition
  - file: ./evaluators/test-coverage.yaml

  # Can mix with inline definitions
  - name: git-diff
    config:
      assertions:
        max_files_changed: 10
```

Organization Patterns
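However the files are grouped, a suite refers to each one by its path. A minimal sketch, assuming the by-project-type layout shown later in this section and assuming that nested paths resolve the same way as the flat `./evaluators/` paths used elsewhere on this page:

```yaml
evaluators:
  # Project-specific checks for a backend change
  - file: ./evaluators/backend/api-security.yaml
  - file: ./evaluators/backend/database-safety.yaml

  # Checks that apply to every project type
  - file: ./evaluators/shared/error-handling.yaml
  - file: ./evaluators/shared/logging.yaml
```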
By Focus Area
```
evaluators/
├── security.yaml
├── testing.yaml
├── documentation.yaml
├── code-quality.yaml
└── performance.yaml
```

By Project Type
```
evaluators/
├── frontend/
│   ├── react-best-practices.yaml
│   └── accessibility.yaml
├── backend/
│   ├── api-security.yaml
│   └── database-safety.yaml
└── shared/
    ├── error-handling.yaml
    └── logging.yaml
```

Example: Security Evaluator
```yaml
name: agentic-judge:security
description: "Evaluates security best practices in code changes"

config:
  type: copilot-cli
  model: claude-sonnet-4.5
  prompt_file: ./prompts/security-expert.txt
  assertions:
    input_validation: |
      All user inputs are validated before use.
      Score 1 if properly validated with sanitization.
      Score 0.5 if validated but not sanitized.
      Score 0 if no validation.
    no_injection: |
      Code is protected against injection attacks (SQL, XSS, command).
      Score 1 if parameterized queries/escaped output used.
      Score 0 if raw input used in queries or output.
    auth_check: |
      Protected routes verify authentication.
      Score 1 if all routes check auth.
      Score 0 if any route is unprotected.
    secrets_safe: |
      No hardcoded secrets or credentials.
      Score 1 if secrets from env vars.
      Score 0 if hardcoded values found.
```

Example: Documentation Evaluator
```yaml
name: agentic-judge:documentation
description: "Ensures code is properly documented"

config:
  type: copilot-cli
  assertions:
    jsdoc_comments: |
      All public functions have JSDoc comments.
      Score 1 if all documented with @param and @returns.
      Score 0.5 if documented but incomplete.
      Score 0 if missing documentation.
    readme_updated: |
      README is updated for new features.
      Score 1 if README documents new functionality.
      Score 0.5 if README exists but not updated.
      Score 0 if no README or severely outdated.
    inline_comments: |
      Complex logic has explanatory comments.
      Score 1 if complex sections are explained.
      Score 0 if confusing code lacks comments.
```

Using Multiple Evaluator Files
```yaml
repo: https://github.com/example/app.git
branch: main

agent:
  type: copilot-cli
  config:
    prompt: "Add user authentication feature"

evaluators:
  # Reusable evaluators
  - file: ./evaluators/security.yaml
  - file: ./evaluators/testing.yaml
  - file: ./evaluators/documentation.yaml

  # Task-specific inline evaluator
  - name: git-diff
    config:
      assertions:
        max_files_changed: 15
```

Overriding Evaluator Config
You can override settings from the file:
```yaml
evaluators:
  - file: ./evaluators/security.yaml
    config:
      model: gpt-5  # Override model from file
```

Sharing Across Teams
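Shared definitions can still be tuned per suite. The sketch below assumes that the per-reference `config` override described above also applies to files pulled in from a shared location. The `evaluators/shared/` path matches the submodule example that follows. This page does not say whether an override merges with or replaces the file's `config` block, so verify that against the tool before relying on it.

```yaml
evaluators:
  # Team-wide definition, used exactly as published
  - file: ./evaluators/shared/security.yaml

  # Same shared file style, with the model changed for this suite only
  # (error-handling.yaml is a hypothetical shared definition)
  - file: ./evaluators/shared/error-handling.yaml
    config:
      model: gpt-5
```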
Section titled “Sharing Across Teams”Git Submodule
```bash
git submodule add https://github.com/org/shared-evaluators.git evaluators/shared
```

```yaml
evaluators:
  - file: ./evaluators/shared/security.yaml
```

npm Package
Create a package with evaluator definitions:
```json
{
  "name": "@org/evaluators",
  "files": ["evaluators/"]
}
```

```yaml
evaluators:
  - file: ./node_modules/@org/evaluators/security.yaml
```

Best Practices
- Descriptive names - Use agentic-judge:focus-area naming
- Clear descriptions - Document what the evaluator checks
- Multi-line assertions - Use YAML multi-line for readable assertions
- Version control - Track evaluator changes in git
- Centralize shared evaluators - Keep team standards in one place
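The naming, description, and multi-line points can be seen together in one small definition. This is an illustrative sketch: the file name, the `error-handling` focus area, and the `errors_handled` assertion are hypothetical and only follow the structure of the examples above.

```yaml
# evaluators/error-handling.yaml (hypothetical file)
name: agentic-judge:error-handling          # agentic-judge:focus-area naming
description: "Checks that new code handles failures explicitly"

config:
  type: copilot-cli
  assertions:
    # The block scalar (|) keeps each scoring rule on its own line
    errors_handled: |
      Failure paths are handled explicitly.
      Score 1 if all failure paths are handled.
      Score 0.5 if only some are handled.
      Score 0 if errors are ignored.
```

Keeping each scoring rule on its own line also keeps git diffs small when a single threshold changes.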