Workspace Management

Every youBencha evaluation runs in an isolated workspace. Knowing how workspaces are laid out makes it easier to debug failed runs and to keep disk usage under control.

Workspace Structure

Each run creates a directory:

.youbencha-workspace/
└── run-{timestamp}-{hash}/
    ├── src-modified/              # Code after agent execution
    ├── src-expected/              # Reference code (if configured)
    ├── artifacts/
    │   ├── results.json           # Machine-readable results
    │   ├── report.md              # Human-readable report
    │   ├── youbencha.log.json     # Agent execution log
    │   ├── git-diff.patch         # Git diff output
    │   └── expected-diff.json     # Similarity analysis
    └── .youbencha.lock            # Workspace metadata
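
To check a run directory against the tree above, something like the following can help. It is a minimal sketch: `check_run_layout` is a hypothetical helper, not part of youBencha, and treating `src-expected/` and `expected-diff.json` as optional is an assumption based on the "(if configured)" note.

```shell
# check_run_layout RUN_DIR
# Prints "ok" or "missing" for each entry from the tree above and
# exits non-zero if a required entry is absent.
# Assumption: src-expected/ and expected-diff.json exist only when
# expected-diff is configured, so they are reported but not required.
check_run_layout() (
  # The body runs in a subshell, so "exit" only ends this function.
  run_dir=$1
  status=0
  for entry in src-modified artifacts/results.json artifacts/report.md \
               artifacts/youbencha.log.json artifacts/git-diff.patch \
               .youbencha.lock; do
    if [ -e "$run_dir/$entry" ]; then
      echo "ok       $entry"
    else
      echo "missing  $entry"
      status=1
    fi
  done
  for entry in src-expected artifacts/expected-diff.json; do
    if [ ! -e "$run_dir/$entry" ]; then
      echo "optional $entry (not present)"
    fi
  done
  exit "$status"
)
```

Usage: `check_run_layout .youbencha-workspace/run-20241115-103000-a1b2c3`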

Directory Purposes

src-modified/

Contains the repository after agent execution:

Cloned from specified repo/branch
Modified by pre-execution hooks
Modified by the AI agent

Use this directory to inspect what the agent actually produced.

src-expected/

Contains the reference code (when using expected-diff):

Cloned from expected branch/commit
Unmodified, used for comparison

artifacts/

Evaluation outputs:

File	Purpose
`results.json`	Complete evaluation results
`report.md`	Human-readable summary
`youbencha.log.json`	Agent execution details
`git-diff.patch`	Raw git diff output
`expected-diff.json`	File similarity analysis

.youbencha.lock

Workspace metadata:

{
  "created_at": "2024-11-15T10:30:00Z",
  "suite_name": "auth-evaluation",
  "status": "completed",
  "repo": "https://github.com/example/repo.git",
  "branch": "main"
}
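
A script can read single fields from the lock file, for example to skip runs that did not complete. The sketch below avoids a JSON parser and therefore assumes the flat layout shown above (one `"key": "value"` pair per line, no escaped quotes). `lock_field` is a hypothetical helper name.

```shell
# lock_field RUN_DIR KEY
# Prints the string value of KEY from RUN_DIR/.youbencha.lock.
# Assumes one "key": "value" pair per line, as in the example above.
# Prints nothing if the key is not found.
lock_field() (
  run_dir=$1
  key=$2
  sed -n "s/^[[:space:]]*\"$key\"[[:space:]]*:[[:space:]]*\"\(.*\)\",\{0,1\}[[:space:]]*\$/\1/p" \
    "$run_dir/.youbencha.lock"
)
```

Usage: `lock_field .youbencha-workspace/run-20241115-103000-a1b2c3 status`. Where `jq` is installed, `jq -r .status` on the same file is the more robust choice.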

Workspace Naming

Format: run-{timestamp}-{hash}

timestamp: Date and time in compact YYYYMMDD-HHMMSS form
hash: Short unique identifier

Example: run-20241115-103000-a1b2c3
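
The two parts of a run name can be split with plain shell parameter expansion. This is a sketch that assumes names follow `run-{timestamp}-{hash}` exactly as in the example and that the hash contains no hyphen. `run_timestamp` and `run_hash` are hypothetical helper names.

```shell
# run_hash NAME
# Prints the trailing identifier, i.e. everything after the last "-".
run_hash() {
  printf '%s\n' "${1##*-}"
}

# run_timestamp NAME
# Prints the YYYYMMDD-HHMMSS part of the name.
run_timestamp() (
  name=${1#run-}               # drop the "run-" prefix
  printf '%s\n' "${name%-*}"   # drop the trailing "-{hash}"
)
```

Because the timestamp is fixed-width and ordered from year down to second, sorting run names alphabetically also sorts them chronologically.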

Custom Workspace Location

Override the default location:

workspace_dir: /tmp/youbencha-evaluations

Or via environment:

WORKSPACE_DIR=/tmp/youbencha yb run -c suite.yaml

Workspace Cleanup

Automatic Cleanup

Delete workspace after successful run:

yb run -c suite.yaml --delete-workspace

Manual Cleanup

Remove all workspaces:

rm -rf .youbencha-workspace/

Remove specific run:

rm -rf .youbencha-workspace/run-20241115-103000-a1b2c3/

Cleanup Script

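#!/bin/bash
# Remove run directories older than 7 days.
# -mindepth 1 and -name 'run-*' keep find from matching the workspace root itself.
find .youbencha-workspace -mindepth 1 -maxdepth 1 -type d -name 'run-*' -mtime +7 -exec rm -rf {} +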
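
An alternative to age-based cleanup is to keep a fixed number of recent runs. The sketch below relies on run names sorting chronologically. `prune_keep_latest` is a hypothetical helper, not a youBencha command.

```shell
# prune_keep_latest WORKSPACE_ROOT KEEP
# Deletes every run-* directory under WORKSPACE_ROOT except the KEEP
# newest ones, judged by name (the timestamp sorts chronologically).
prune_keep_latest() (
  root=$1
  keep=$2
  cd "$root" || exit 1
  # Emit run directories, newest first, then skip the first KEEP entries.
  for dir in run-*/; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
    fi
  done | sort -r | tail -n +"$((keep + 1))" |
  while IFS= read -r dir; do
    rm -rf -- "$dir"
  done
)
```

Usage: `prune_keep_latest .youbencha-workspace 5` keeps the five most recent runs.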

Inspecting Workspaces

View Agent Output

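# Pick one run: with several runs present, run-* expands to multiple paths and git -C fails
RUN=.youbencha-workspace/run-20241115-103000-a1b2c3

# See what files changed
git -C "$RUN/src-modified" diff --stat

# View specific file
cat "$RUN/src-modified/src/auth.ts"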

View Results

# Machine-readable
jq . .youbencha-workspace/run-*/artifacts/results.json

# Human-readable
cat .youbencha-workspace/run-*/artifacts/report.md

View Agent Log

jq . .youbencha-workspace/run-*/artifacts/youbencha.log.json
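
The `run-*` glob in these commands matches every run once more than one exists. To target only the most recent run, a small helper can resolve its path first. `latest_run` is a hypothetical name, and the sketch assumes the default `.youbencha-workspace` location unless a path is passed.

```shell
# latest_run [WORKSPACE_ROOT]
# Prints the path of the newest run directory, chosen by name.
# WORKSPACE_ROOT defaults to .youbencha-workspace.
# Exits non-zero if no run directory exists.
latest_run() (
  root=${1:-.youbencha-workspace}
  latest=
  for dir in "$root"/run-*/; do
    [ -d "$dir" ] || continue
    # Globs expand in sorted order, so the last match is the newest run.
    latest=${dir%/}
  done
  [ -n "$latest" ] || exit 1
  printf '%s\n' "$latest"
)
```

Usage: `jq . "$(latest_run)/artifacts/results.json"`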

Workspace Isolation

Each evaluation is fully isolated:

No shared state between runs
Fresh clone for each evaluation
Independent artifacts per run
No mutation of original repository

Debugging Failed Evaluations

1. Check Workspace Exists

ls -la .youbencha-workspace/

2. Review Agent Log

jq '.agent_output' .youbencha-workspace/run-*/artifacts/youbencha.log.json

3. Check Pre-Execution

jq '.pre_execution' .youbencha-workspace/run-*/artifacts/youbencha.log.json

4. Compare Diffs

# What agent changed
cat .youbencha-workspace/run-*/artifacts/git-diff.patch

# Expected vs actual (if using expected-diff)
jq . .youbencha-workspace/run-*/artifacts/expected-diff.json
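
For a quick overview before reading the full patch, the touched files can be listed from the `diff --git` header lines. This sketch assumes `git-diff.patch` is in standard `git diff` format. `patch_files` is a hypothetical helper, and paths that themselves contain ` b/` would be ambiguous.

```shell
# patch_files PATCH
# Prints one line per file touched in a git-format patch, taken from
# the "diff --git a/<path> b/<path>" header lines.
patch_files() {
  sed -n 's|^diff --git a/.* b/\(.*\)$|\1|p' "$1"
}
```

Usage: `patch_files .youbencha-workspace/run-20241115-103000-a1b2c3/artifacts/git-diff.patch`. An empty result suggests the patch records no file changes, which is worth checking against the agent log.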

CI/CD Considerations

Upload Artifacts

- name: Upload Workspace
  uses: actions/upload-artifact@v4
  if: always()
  with:
    name: workspace-${{ github.run_id }}
    path: .youbencha-workspace/
    retention-days: 7

Ephemeral Runners

On ephemeral CI runners, the workspace directory is discarded together with the runner when the job ends. Upload it as an artifact if you need the results afterwards.

Best Practices

Clean up regularly - Old workspaces consume disk space
Archive on failure - Keep failed runs for debugging
Use custom paths - Avoid conflicts on shared systems
Inspect before cleanup - Review results before deleting
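
The "archive on failure" practice can be sketched as a helper that packs one run into a tarball before any cleanup happens. `archive_run` and the destination directory are hypothetical. The helper only creates the archive and leaves the run directory in place.

```shell
# archive_run RUN_DIR DEST_DIR
# Writes DEST_DIR/<run-name>.tar.gz containing the run directory and
# prints the archive path. The run directory itself is not deleted.
archive_run() (
  run_dir=${1%/}
  name=$(basename "$run_dir")
  mkdir -p "$2" || exit 1
  # Resolve DEST_DIR to an absolute path so tar's -C does not affect it.
  dest_dir=$(cd "$2" && pwd) || exit 1
  tar -czf "$dest_dir/$name.tar.gz" -C "$(dirname "$run_dir")" "$name" || exit 1
  printf '%s\n' "$dest_dir/$name.tar.gz"
)
```

Usage: `archive_run .youbencha-workspace/run-20241115-103000-a1b2c3 ./failed-runs`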