Post-Evaluation Hooks

Post-evaluation hooks run after all evaluators complete. Use them for result export, notifications, and custom analysis.

  • Run in parallel (not sequential)
  • Never fail the main evaluation
  • Have read-only access to results
  • Execute after all evaluators complete
Three hook types are available:

| Hook | Purpose |
| --- | --- |
| database | Export to JSON/JSONL files |
| webhook | POST to HTTP endpoints |
| script | Custom shell scripts |

Export results to structured storage:

post_evaluation:
  - name: database
    config:
      type: json-file
      output_path: ./results-history.jsonl
      include_full_bundle: true
      append: true
| Option | Description | Default |
| --- | --- | --- |
| type | Storage type (json-file) | Required |
| output_path | File path for output | Required |
| include_full_bundle | Include complete results | false |
| append | Append to existing file | false |
For example, to accumulate a history across runs:

post_evaluation:
  - name: database
    config:
      type: json-file
      output_path: ./evaluation-history.jsonl
      append: true

Each run appends a JSON line:

{"timestamp":"2024-11-15T10:00:00Z","status":"passed","suite":"auth-eval"}
{"timestamp":"2024-11-15T11:00:00Z","status":"passed","suite":"auth-eval"}
{"timestamp":"2024-11-15T12:00:00Z","status":"failed","suite":"auth-eval"}
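Because each run is one JSON line, the history file can be summarized with standard shell tools. A minimal sketch, using the sample lines above as input:

```shell
# Write the sample history to a temp file, then count outcomes.
hist_file=$(mktemp)
cat > "$hist_file" <<'EOF'
{"timestamp":"2024-11-15T10:00:00Z","status":"passed","suite":"auth-eval"}
{"timestamp":"2024-11-15T11:00:00Z","status":"passed","suite":"auth-eval"}
{"timestamp":"2024-11-15T12:00:00Z","status":"failed","suite":"auth-eval"}
EOF

# grep -c counts matching lines; works because the hook writes compact JSON.
passed=$(grep -c '"status":"passed"' "$hist_file")
failed=$(grep -c '"status":"failed"' "$hist_file")
echo "passed=$passed failed=$failed"   # prints: passed=2 failed=1
rm -f "$hist_file"
```

For richer queries (per-suite breakdowns, trends over time), a JSON-aware tool such as jq is a better fit than grep.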

POST results to HTTP endpoints:

post_evaluation:
  - name: webhook
    config:
      url: ${SLACK_WEBHOOK_URL}
      method: POST
      headers:
        Content-Type: "application/json"
      retry_on_failure: true
| Option | Description | Default |
| --- | --- | --- |
| url | HTTP endpoint URL | Required |
| method | HTTP method | POST |
| headers | Custom headers | {} |
| retry_on_failure | Retry on error | false |
| timeout_ms | Request timeout | 30000 |
For example, to notify a Slack channel:

post_evaluation:
  - name: webhook
    config:
      url: ${SLACK_WEBHOOK_URL}
      method: POST
      headers:
        Content-Type: "application/json"

youBencha sends a JSON payload:

{
  "text": "Evaluation Complete",
  "attachments": [{
    "color": "#36a64f",
    "fields": [
      { "title": "Status", "value": "passed", "short": true },
      { "title": "Suite", "value": "auth-evaluation", "short": true }
    ]
  }]
}
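To preview the payload shape before pointing a webhook at a real channel, the same JSON can be assembled in a script. The field names mirror the example above; delivery via curl is left commented out so nothing is sent:

```shell
STATUS="passed"
SUITE="auth-evaluation"

# Build the payload with printf; the two %s slots take status and suite.
payload=$(printf '{"text":"Evaluation Complete","attachments":[{"color":"#36a64f","fields":[{"title":"Status","value":"%s","short":true},{"title":"Suite","value":"%s","short":true}]}]}' "$STATUS" "$SUITE")

echo "$payload"
# Deliver with: curl -X POST "$SLACK_WEBHOOK_URL" -H "Content-Type: application/json" -d "$payload"
```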
To post results to an authenticated API endpoint:

post_evaluation:
  - name: webhook
    config:
      url: https://api.example.com/evaluations
      method: POST
      headers:
        Authorization: "Bearer ${API_TOKEN}"
        Content-Type: "application/json"
      retry_on_failure: true

Run custom analysis scripts:

post_evaluation:
  - name: script
    config:
      command: ./scripts/analyze-results.sh
      args: ["${RESULTS_PATH}"]
      env:
        SLACK_WEBHOOK: "${SLACK_WEBHOOK_URL}"
| Option | Description | Default |
| --- | --- | --- |
| command | Script/command to run | Required |
| args | Command arguments | [] |
| env | Environment variables | {} |
| timeout_ms | Execution timeout | 60000 |
Example script, `scripts/notify-slack.sh`:
#!/bin/bash
RESULTS=$1
STATUS=$(jq -r '.summary.overall_status' "$RESULTS")
PASSED=$(jq -r '.summary.passed' "$RESULTS")
FAILED=$(jq -r '.summary.failed' "$RESULTS")

COLOR="good"
if [ "$STATUS" != "passed" ]; then
  COLOR="danger"
fi

curl -X POST "$SLACK_WEBHOOK" -H "Content-Type: application/json" -d "{
  \"attachments\": [{
    \"color\": \"$COLOR\",
    \"title\": \"youBencha Evaluation: $STATUS\",
    \"fields\": [
      {\"title\": \"Passed\", \"value\": \"$PASSED\", \"short\": true},
      {\"title\": \"Failed\", \"value\": \"$FAILED\", \"short\": true}
    ]
  }]
}"
Then wire the script into the suite:

post_evaluation:
  - name: script
    config:
      command: ./scripts/notify-slack.sh
      args: ["${RESULTS_PATH}"]
      env:
        SLACK_WEBHOOK: "${SLACK_WEBHOOK_URL}"

Post-evaluation hooks have access to:

| Variable | Description |
| --- | --- |
| WORKSPACE_DIR | Workspace directory path |
| REPO_DIR | Repository directory |
| ARTIFACTS_DIR | Artifacts directory |
| RESULTS_PATH | Path to results.json |
| TEST_CASE_NAME | Test case name |
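A hook script can read these variables directly from its environment. A minimal sketch; the fallback values after `:-` are illustrative placeholders for running the snippet outside youBencha, which injects the real values at run time:

```shell
# youBencha sets these for real runs; the defaults here are for local testing only.
RESULTS_PATH="${RESULTS_PATH:-./results.json}"
TEST_CASE_NAME="${TEST_CASE_NAME:-demo-case}"

echo "Processing ${TEST_CASE_NAME}: results at ${RESULTS_PATH}"
```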

All post-evaluation hooks run in parallel:

post_evaluation:
  # These all run at the same time
  - name: database
    config:
      type: json-file
      output_path: ./history.jsonl
      append: true
  - name: webhook
    config:
      url: ${SLACK_WEBHOOK_URL}
  - name: script
    config:
      command: ./scripts/upload-to-s3.sh
      args: ["${RESULTS_PATH}"]
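Conceptually, the runner fans the hooks out and waits for all of them, similar to this shell pattern (an illustration of the semantics, not youBencha's actual implementation):

```shell
# Stand-in for a hook; each invocation represents one configured hook.
run_hook() { echo "hook $1: done"; }

# Start all hooks concurrently, then block until every one finishes.
run_hook database &
run_hook webhook &
run_hook script &
wait

echo "all hooks finished"
```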

Post-evaluation hooks never fail the main evaluation:

  • Hook failures are logged but don’t affect overall status
  • Other hooks continue even if one fails
  • Results are still saved regardless of hook failures
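The isolation rule can be mimicked in shell: a failing hook's exit code is captured and logged rather than propagated (an illustration of the behavior, not the runner's code):

```shell
# Stand-in for a hook that exits non-zero.
failing_hook() { return 1; }

# Capture the exit code instead of letting it abort the run.
rc=0
failing_hook || rc=$?
echo "hook exited with code $rc (logged); evaluation status unchanged"
```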
Best practices:

  1. Use environment variables for sensitive data
  2. Set appropriate timeouts for external calls
  3. Enable retry for unreliable endpoints
  4. Keep scripts idempotent so retries are safe
  5. Log hook actions for debugging
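Idempotency (practice 4) can be sketched with a marker file keyed on a run identifier; `RUN_ID` and the marker path are illustrative assumptions, not variables youBencha provides:

```shell
RUN_ID="run-123"
tmpdir=$(mktemp -d)
marker="$tmpdir/notified-${RUN_ID}"

notify() { echo "notification sent for ${RUN_ID}"; }

# Only notify if this run has not been handled yet, so a retry is a no-op.
if [ ! -f "$marker" ]; then
  notify
  touch "$marker"
else
  echo "already notified for ${RUN_ID}; skipping"
fi
```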