
Workspace Management

Every youBencha evaluation runs in an isolated workspace. Understanding workspace management helps with debugging and optimization.

Each run creates a directory:

.youbencha-workspace/
└── run-{timestamp}-{hash}/
    ├── src-modified/              # Code after agent execution
    ├── src-expected/              # Reference code (if configured)
    ├── artifacts/
    │   ├── results.json           # Machine-readable results
    │   ├── report.md              # Human-readable report
    │   ├── youbencha.log.json     # Agent execution log
    │   ├── git-diff.patch         # Git diff output
    │   └── expected-diff.json     # Similarity analysis
    └── .youbencha.lock            # Workspace metadata

src-modified/ contains the repository after agent execution:

  • Cloned from specified repo/branch
  • Modified by pre-execution hooks
  • Modified by the AI agent

Use this directory to debug what the agent produced.

src-expected/ contains the reference code (when using expected-diff):

  • Cloned from expected branch/commit
  • Unmodified, used for comparison

artifacts/ holds the evaluation outputs:

File                   Purpose
results.json           Complete evaluation results
report.md              Human-readable summary
youbencha.log.json     Agent execution details
git-diff.patch         Raw git diff output
expected-diff.json     File similarity analysis

.youbencha.lock stores the workspace metadata:

{
  "created_at": "2024-11-15T10:30:00Z",
  "suite_name": "auth-evaluation",
  "status": "completed",
  "repo": "https://github.com/example/repo.git",
  "branch": "main"
}
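
Because the lock file is plain JSON, a script can read it before deciding what to do with a run. The following is a minimal sketch, assuming jq is installed and that the file has the shape shown above; `workspace_status` is a hypothetical helper name, not part of youBencha.

```shell
# Print the "status" field from a run's .youbencha.lock file.
# Assumption: the lock file is valid JSON with a top-level "status" key.
workspace_status() {
  local run_dir="${1%/}"   # drop a trailing slash, if any
  jq -r '.status' "$run_dir/.youbencha.lock"
}

# Example call (hypothetical run directory):
#   workspace_status .youbencha-workspace/run-20241115-103000-a1b2c3
```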

Run directories are named run-{timestamp}-{hash}:

  • timestamp: date and time in compact form (YYYYMMDD-HHMMSS)
  • hash: Short unique identifier

Example: run-20241115-103000-a1b2c3
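
A script that needs the two parts of a run name can split them with shell parameter expansion. This is a sketch only, assuming the hash never contains a hyphen; `parse_run_name`, `run_timestamp`, and `run_hash` are hypothetical names introduced for illustration.

```shell
# Split a run directory name of the form run-{timestamp}-{hash}.
# Assumption: the hash itself contains no hyphen.
parse_run_name() {
  local name="${1%/}"          # drop a trailing slash, if any
  name="${name##*/}"           # keep only the last path component
  local rest="${name#run-}"    # e.g. 20241115-103000-a1b2c3
  run_timestamp="${rest%-*}"   # everything before the last hyphen
  run_hash="${rest##*-}"       # everything after the last hyphen
}

parse_run_name ".youbencha-workspace/run-20241115-103000-a1b2c3/"
echo "timestamp=${run_timestamp} hash=${run_hash}"
# prints: timestamp=20241115-103000 hash=a1b2c3
```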

Override the default location:

workspace_dir: /tmp/youbencha-evaluations

Or via environment:

WORKSPACE_DIR=/tmp/youbencha yb run -c suite.yaml
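
Helper scripts that inspect or clean workspaces need to look in the same place the run wrote to. The sketch below resolves that location from the environment. It assumes the default is .youbencha-workspace in the current directory, and it does not read a workspace_dir value from the suite file. `workspace_root` is a hypothetical helper name.

```shell
# Resolve the workspace root used by helper scripts.
# Assumption: WORKSPACE_DIR wins when set; otherwise the default
# .youbencha-workspace directory in the current directory is used.
workspace_root() {
  printf '%s\n' "${WORKSPACE_DIR:-.youbencha-workspace}"
}

# Example: ls -la "$(workspace_root)"
```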

Delete workspace after successful run:

yb run -c suite.yaml --delete-workspace

Remove all workspaces:

rm -rf .youbencha-workspace/

Remove specific run:

rm -rf .youbencha-workspace/run-20241115-103000-a1b2c3/
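
Automate cleanup with a script such as scripts/cleanup-old-workspaces.sh:

#!/bin/bash
# Remove run directories older than 7 days.
# -mindepth 1 keeps find from matching the workspace root itself.
find .youbencha-workspace -mindepth 1 -maxdepth 1 -type d -name 'run-*' -mtime +7 -exec rm -rf {} +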
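
Before deleting anything, it can help to preview which runs an age-based cleanup would remove. The following sketch lists matching run directories without touching them. It assumes run directories follow the run-* naming shown earlier; `list_old_workspaces` is a hypothetical helper, not a youBencha command.

```shell
# List run directories older than a given number of days, without deleting.
# Arguments: workspace root (default .youbencha-workspace), age in days (default 7).
list_old_workspaces() {
  local root="${1:-.youbencha-workspace}"
  local days="${2:-7}"
  # -mindepth 1 excludes the root itself; -name 'run-*' limits matches to runs.
  find "$root" -mindepth 1 -maxdepth 1 -type d -name 'run-*' -mtime +"$days" -print
}

# Example: list_old_workspaces .youbencha-workspace 7
```

Once the list looks right, the same find expression can be given a delete action instead of -print.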

Inspect the modified code:

# See what files changed
# (run-* must match exactly one run; otherwise name the run directory)
git -C .youbencha-workspace/run-*/src-modified diff --stat
# View specific file
cat .youbencha-workspace/run-*/src-modified/src/auth.ts

View the results:

# Machine-readable
jq . .youbencha-workspace/run-*/artifacts/results.json
# Human-readable
cat .youbencha-workspace/run-*/artifacts/report.md

View the agent execution log:

jq . .youbencha-workspace/run-*/artifacts/youbencha.log.json
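
The run-* glob in the commands above matches every run in the workspace, so with several runs present it is often more convenient to target only the newest one. A minimal sketch, assuming run names sort chronologically because of their YYYYMMDD-HHMMSS timestamp; `latest_run` is a hypothetical helper, and two runs with the same timestamp would be ordered by hash.

```shell
# Print the path of the most recent run directory, or return 1 if none exist.
# Relies on glob expansion being sorted, so the last match has the latest timestamp.
latest_run() {
  local root="${1:-.youbencha-workspace}"
  local latest=""
  local dir
  for dir in "$root"/run-*/; do
    if [ -d "$dir" ]; then
      latest="${dir%/}"
    fi
  done
  if [ -z "$latest" ]; then
    return 1
  fi
  printf '%s\n' "$latest"
}

# Example: cat "$(latest_run)/artifacts/report.md"
```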

Each evaluation is fully isolated:

  • No shared state between runs
  • Fresh clone for each evaluation
  • Independent artifacts per run
  • No mutation of original repository

List the existing workspaces:

ls -la .youbencha-workspace/

Show the agent output recorded in the execution log:

jq '.agent_output' .youbencha-workspace/run-*/artifacts/youbencha.log.json

Show the pre-execution entry recorded in the execution log:

jq '.pre_execution' .youbencha-workspace/run-*/artifacts/youbencha.log.json

Compare the diffs:

# What agent changed
cat .youbencha-workspace/run-*/artifacts/git-diff.patch
# Expected vs actual (if using expected-diff)
jq . .youbencha-workspace/run-*/artifacts/expected-diff.json
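
Upload the workspace as a build artifact in .github/workflows/youbencha.yml:

- name: Upload Workspace
  uses: actions/upload-artifact@v4
  if: always()
  with:
    name: workspace-${{ github.run_id }}
    path: .youbencha-workspace/
    # Recent v4 releases skip hidden (dot-prefixed) paths unless this is set
    include-hidden-files: true
    retention-days: 7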

On CI, workspaces are cleaned up automatically when the job completes. Upload them as artifacts if you need them afterwards.

  1. Clean up regularly - Old workspaces consume disk space
  2. Archive on failure - Keep failed runs for debugging
  3. Use custom paths - Avoid conflicts on shared systems
  4. Inspect before cleanup - Review results before deleting
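
For the "archive on failure" practice, one option is to pack a run directory into a tarball before it is cleaned up. This is a sketch under stated assumptions: the caller decides which runs count as failed, and `archive_run` and the destination directory are hypothetical names rather than youBencha features.

```shell
# Pack one run directory into {dest}/{run-name}.tar.gz.
# Arguments: run directory, destination directory (default: current directory).
archive_run() {
  local run_dir="${1%/}"        # drop a trailing slash, if any
  local dest="${2:-.}"
  local name="${run_dir##*/}"   # e.g. run-20241115-103000-a1b2c3
  local parent="${run_dir%/*}"  # directory that contains the run
  if [ "$parent" = "$run_dir" ]; then
    parent="."                  # run_dir had no slash: it is in the current directory
  fi
  mkdir -p "$dest"
  # -C makes paths inside the archive start at the run directory name.
  tar -czf "$dest/$name.tar.gz" -C "$parent" "$name"
}

# Example: archive_run .youbencha-workspace/run-20241115-103000-a1b2c3 ./failed-runs
```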