Workspace Management
Every youBencha evaluation runs in an isolated workspace. Understanding workspace management helps with debugging and optimization.
Workspace Structure
Each run creates a directory:

    .youbencha-workspace/
    └── run-{timestamp}-{hash}/
        ├── src-modified/              # Code after agent execution
        ├── src-expected/              # Reference code (if configured)
        ├── artifacts/
        │   ├── results.json           # Machine-readable results
        │   ├── report.md              # Human-readable report
        │   ├── youbencha.log.json     # Agent execution log
        │   ├── git-diff.patch         # Git diff output
        │   └── expected-diff.json     # Similarity analysis
        └── .youbencha.lock            # Workspace metadata

Directory Purposes
src-modified/
Contains the repository after agent execution:
- Cloned from specified repo/branch
- Modified by pre-execution hooks
- Modified by the AI agent
Use this directory to debug what the agent produced.
src-expected/
Contains the reference code (when using expected-diff):
- Cloned from the expected branch/commit
- Unmodified, used for comparison
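For a quick manual look at how the two trees differ, a recursive diff can complement the expected-diff artifact. The following is a minimal sketch, not a youBencha feature: compare_run is a hypothetical helper name, and it assumes src-modified/ and src-expected/ sit side by side in one run directory, as in the structure shown earlier.

```shell
# Sketch: list files that differ between the agent's output and the reference.
# compare_run is a hypothetical helper, not a youBencha command.
compare_run() {
  run_dir=$1
  if [ ! -d "$run_dir/src-expected" ]; then
    echo "no src-expected/ in $run_dir (is a reference configured?)" >&2
    return 2
  fi
  # -r: recurse into subdirectories
  # -q: report only which files differ, not how
  # -x .git: skip Git metadata, which always differs between clones
  diff -rq -x .git "$run_dir/src-modified" "$run_dir/src-expected"
}
```

Like diff itself, the helper returns 0 when the trees match and 1 when they differ, so it can be used in a conditional, for example: compare_run .youbencha-workspace/run-20241115-103000-a1b2c3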
artifacts/
Evaluation outputs:
| File | Purpose |
|---|---|
| results.json | Complete evaluation results |
| report.md | Human-readable summary |
| youbencha.log.json | Agent execution details |
| git-diff.patch | Raw git diff output |
| expected-diff.json | File similarity analysis |
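To confirm that a run produced the files listed in the table, something like the following could be used. It is a sketch only: check_artifacts is a hypothetical helper name, and expected-diff.json is presumably absent when expected-diff is not configured, so a "missing" line for it is not necessarily a failure.

```shell
# Sketch: report which of the documented artifact files exist in a run directory.
# check_artifacts is a hypothetical helper, not a youBencha command.
check_artifacts() {
  ca_dir=$1/artifacts
  ca_missing=0
  for ca_file in results.json report.md youbencha.log.json git-diff.patch expected-diff.json; do
    if [ -f "$ca_dir/$ca_file" ]; then
      echo "ok       $ca_file"
    else
      echo "missing  $ca_file"
      ca_missing=1
    fi
  done
  # Non-zero exit status when at least one file is absent.
  return "$ca_missing"
}
```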
.youbencha.lock
Workspace metadata:

    {
      "created_at": "2024-11-15T10:30:00Z",
      "suite_name": "auth-evaluation",
      "status": "completed",
      "repo": "https://github.com/example/repo.git",
      "branch": "main"
    }

Workspace Naming
Format: run-{timestamp}-{hash}
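- timestamp: Compact date and time, YYYYMMDD-HHMMSS (close to ISO 8601 basic format, which would separate date and time with T rather than a hyphen)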
- hash: Short unique identifier
Example: run-20241115-103000-a1b2c3
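When scripting against run directories, it can help to split a name back into its parts. The sketch below assumes the run-YYYYMMDD-HHMMSS-hash layout from the example; parse_run_name is a hypothetical helper, and the output format is an arbitrary choice.

```shell
# Sketch: split a run directory name into its date, time, and hash parts.
# Assumes the run-YYYYMMDD-HHMMSS-hash layout; parse_run_name is hypothetical.
parse_run_name() {
  prn_name=$(basename "$1")
  case $prn_name in
    run-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]-?*) ;;
    *) echo "not a run directory name: $prn_name" >&2; return 1 ;;
  esac
  prn_rest=${prn_name#run-}   # 20241115-103000-a1b2c3
  prn_date=${prn_rest%%-*}    # 20241115
  prn_rest=${prn_rest#*-}     # 103000-a1b2c3
  prn_time=${prn_rest%%-*}    # 103000
  prn_hash=${prn_rest#*-}     # a1b2c3
  echo "date=$prn_date time=$prn_time hash=$prn_hash"
}

parse_run_name run-20241115-103000-a1b2c3
# prints: date=20241115 time=103000 hash=a1b2c3
```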
Custom Workspace Location
Override the default location:

    workspace_dir: /tmp/youbencha-evaluations

Or via environment:

    WORKSPACE_DIR=/tmp/youbencha yb run -c suite.yaml

Workspace Cleanup
Automatic Cleanup
Delete workspace after successful run:

    yb run -c suite.yaml --delete-workspace

Manual Cleanup
Remove all workspaces:

    rm -rf .youbencha-workspace/

Remove specific run:

    rm -rf .youbencha-workspace/run-20241115-103000-a1b2c3/

Cleanup Script
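
    #!/bin/bash
    # Remove run directories older than 7 days.
    # -mindepth 1 and -name 'run-*' keep find from matching (and deleting)
    # the workspace root itself.
    find .youbencha-workspace -mindepth 1 -maxdepth 1 -type d -name 'run-*' -mtime +7 -exec rm -rf {} +

Inspecting Workspaces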
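The commands in this section use a run-* glob, which expands to every run directory. With more than one run present, git -C and cat may act on the wrong run or on several at once. A minimal sketch for selecting the most recent run follows. It assumes run names sort chronologically because they start with the timestamp; latest_run is a hypothetical helper.

```shell
# Sketch: print the most recent run directory under a workspace root.
# latest_run is a hypothetical helper; the default root is the documented one.
latest_run() {
  lr_root=${1:-.youbencha-workspace}
  lr_latest=
  # Globs expand in sorted order and run names start with the timestamp,
  # so the last match is the newest run.
  for lr_dir in "$lr_root"/run-*/; do
    if [ -d "$lr_dir" ]; then
      lr_latest=${lr_dir%/}
    fi
  done
  if [ -z "$lr_latest" ]; then
    echo "no runs found in $lr_root" >&2
    return 1
  fi
  echo "$lr_latest"
}
```

The result can then replace the glob, for example: run=$(latest_run) && git -C "$run/src-modified" diff --stat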
View Agent Output

    # See what files changed
    git -C .youbencha-workspace/run-*/src-modified diff --stat

    # View specific file
    cat .youbencha-workspace/run-*/src-modified/src/auth.ts

View Results

    # Machine-readable
    jq . .youbencha-workspace/run-*/artifacts/results.json

    # Human-readable
    cat .youbencha-workspace/run-*/artifacts/report.md

View Agent Log

    jq . .youbencha-workspace/run-*/artifacts/youbencha.log.json

Workspace Isolation
Each evaluation is fully isolated:
- No shared state between runs
- Fresh clone for each evaluation
- Independent artifacts per run
- No mutation of original repository
Debugging Failed Evaluations
1. Check Workspace Exists

    ls -la .youbencha-workspace/

2. Review Agent Log

    jq '.agent_output' .youbencha-workspace/run-*/artifacts/youbencha.log.json

3. Check Pre-Execution

    jq '.pre_execution' .youbencha-workspace/run-*/artifacts/youbencha.log.json

4. Compare Diffs

    # What agent changed
    cat .youbencha-workspace/run-*/artifacts/git-diff.patch

    # Expected vs actual (if using expected-diff)
    jq . .youbencha-workspace/run-*/artifacts/expected-diff.json

CI/CD Considerations
Section titled “CI/CD Considerations”Upload Artifacts

    - name: Upload Workspace
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: workspace-${{ github.run_id }}
        path: .youbencha-workspace/
        # .youbencha-workspace/ is a hidden directory; upload-artifact v4.4+
        # skips hidden files unless this is set.
        include-hidden-files: true
        retention-days: 7

Ephemeral Runners
On ephemeral CI runners, workspaces are discarded when the job completes. Upload them as artifacts if you need them afterwards.
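If the full workspace is too large to keep, one option is to persist only the artifacts/ directories and drop the cloned source trees. The following is a rough sketch under that assumption; bundle_artifacts and the default archive name are hypothetical, not part of youBencha.

```shell
# Sketch: pack only run-*/artifacts into one tarball, leaving out the clones.
# bundle_artifacts and the default archive name are hypothetical.
bundle_artifacts() {
  ba_root=${1:-.youbencha-workspace}
  ba_out=${2:-youbencha-artifacts.tar.gz}
  # Reuse the function's positional parameters as the list of paths to pack.
  set --
  for ba_dir in "$ba_root"/run-*/artifacts; do
    if [ -d "$ba_dir" ]; then
      # Store paths relative to the workspace root.
      set -- "$@" "${ba_dir#"$ba_root"/}"
    fi
  done
  if [ "$#" -eq 0 ]; then
    echo "no artifacts found under $ba_root" >&2
    return 1
  fi
  tar -czf "$ba_out" -C "$ba_root" "$@"
}
```

The resulting tarball is a regular, non-hidden file, so it can be handed to whatever artifact mechanism the CI system provides.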
Best Practices
- Clean up regularly - Old workspaces consume disk space
- Archive on failure - Keep failed runs for debugging
- Use custom paths - Avoid conflicts on shared systems
- Inspect before cleanup - Review results before deleting
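Several of these practices can be combined in one cleanup pass that removes only completed runs and keeps everything else for debugging. This is a sketch under assumptions: prune_completed and its --dry-run argument are hypothetical, and the only status value taken from this page is "completed" (any other value, or a missing lock file, is treated as "keep").

```shell
# Sketch: remove runs whose .youbencha.lock reports "completed"; keep the rest.
# prune_completed and its --dry-run argument are hypothetical.
prune_completed() {
  pc_root=${1:-.youbencha-workspace}
  pc_mode=${2:-}
  for pc_dir in "$pc_root"/run-*/; do
    [ -d "$pc_dir" ] || continue
    pc_dir=${pc_dir%/}
    # Plain grep keeps the sketch dependency-free; jq would be more robust.
    if grep -q '"status"[[:space:]]*:[[:space:]]*"completed"' "$pc_dir/.youbencha.lock" 2>/dev/null; then
      if [ "$pc_mode" = "--dry-run" ]; then
        echo "would remove $pc_dir"
      else
        echo "removing $pc_dir"
        rm -rf "$pc_dir"
      fi
    else
      echo "keeping $pc_dir"
    fi
  done
}
```

Running it first with --dry-run as the second argument prints what would be deleted without touching anything, which fits the "inspect before cleanup" advice.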