| 1 | +--- |
| 2 | +name: blacksmith-testbox |
| 3 | +description: > |
| 4 | + Validate code changes against real CI. Use for all tests, builds, |
| 5 | + migrations, and any command that depends on secrets or services. |
| 6 | + Use when testing, validating, checking, verifying, or before any |
| 7 | + commit or push. |
| 8 | +--- |
| 9 | + |
| 10 | +# Blacksmith Testbox |
| 11 | + |
| 12 | +## Install the CLI |
| 13 | + |
| 14 | +If `blacksmith` is not installed, install it: |
| 15 | + |
| 16 | + curl -fsSL https://get.blacksmith.sh | sh |
| 17 | + |
| 18 | +For the canary channel (bleeding-edge): |
| 19 | + |
| 20 | + BLACKSMITH_CHANNEL=canary sh -c 'curl -fsSL https://get.blacksmith.sh | sh' |
| 21 | + |
| 22 | +Then authenticate: |
| 23 | + |
| 24 | + blacksmith auth login |
| 25 | + |
| 26 | +## Agent-triggered browser auth (non-interactive) |
| 27 | + |
| 28 | +When an agent needs to ensure the user is authenticated before running testbox |
| 29 | +commands (e.g. warmup, run), use browser-based auth with non-interactive mode. |
| 30 | +This opens the browser for the user to sign in; the agent does not interact with |
| 31 | +the browser. The org selector in the dashboard is skipped, so the user only sees |
| 32 | +the sign-in flow. |
| 33 | + |
| 34 | +**Required command** (`--organization` is required with `--non-interactive`): |
| 35 | + |
| 36 | + blacksmith auth login --non-interactive --organization <org-slug> |
| 37 | + |
| 38 | +The org slug can come from `BLACKSMITH_ORG` env var or the `--org` global flag. |
| 39 | +If neither is set, the agent should use the project's known org (e.g. from repo |
| 40 | +config or user context). Example: |
| 41 | + |
| 42 | + blacksmith auth login --non-interactive --organization acme-corp |
| 43 | + blacksmith --org acme-corp auth login --non-interactive --organization acme-corp |
| 44 | + |
| 45 | +**Flow**: The CLI starts a local callback server, opens the browser to the |
| 46 | +dashboard auth page, and blocks for up to 2 minutes. The user completes sign-in |
| 47 | +and authorization in the browser. The dashboard redirects to localhost with the |
| 48 | +token; the CLI saves credentials and exits. The agent then proceeds. |
| 49 | + |
| 50 | +**Do not use** `--api-token` for this flow — that is for headless/token-based |
| 51 | +auth. This skill focuses on browser-based auth when the user prefers signing in |
| 52 | +via the web UI. |
| 53 | + |
| 54 | +Optional flags: |
| 55 | +- `--dashboard-url <url>` — Override dashboard URL (e.g. for staging) |
| 56 | + |
| 57 | +## Setup: Warmup before coding |
| 58 | + |
| 59 | +Before writing any code, warm up a testbox. This returns an ID instantly |
| 60 | +and boots the CI environment in the background while you work: |
| 61 | + |
| 62 | + blacksmith testbox warmup code-quality-testbox.yml |
| 63 | + # → tbx_01jkz5b3t9... |
| 64 | + |
| 65 | +Save this ID. You need it for every `run` command. |
| 66 | + |
| 67 | +Warmup dispatches a GitHub Actions workflow that provisions a VM with the |
| 68 | +full CI environment: dependencies installed, services started, secrets |
| 69 | +injected, and a clean checkout of the repo at the warmup ref (default branch unless `--ref` is set).
| 70 | + |
| 71 | +Options: |
| 72 | + |
| 73 | + --ref <branch> Git ref to dispatch against (default: repo's default branch) |
| 74 | + --job <name> Specific job within the workflow (if it has multiple) |
| 75 | + --idle-timeout <min> Idle timeout in minutes (default: 30) |
| 76 | + |
| 77 | +## CRITICAL: Always run from the repo root |
| 78 | + |
| 79 | +ALWAYS invoke `blacksmith testbox` commands from the **root of the git |
| 80 | +repository**. The CLI syncs the current working directory to the testbox |
| 81 | +using rsync with `--delete`. If you run from a subdirectory (e.g. |
| 82 | +`cd backend && blacksmith testbox run ...`), rsync will mirror only that |
| 83 | +subdirectory and **delete everything else** on the testbox — wiping other |
| 84 | +directories like `dashboard/`, `cli/`, etc. |
| 85 | + |
| 86 | + # CORRECT — run from repo root, use paths in the command |
| 87 | + blacksmith testbox run --id <ID> "cd backend && php artisan test" |
| 88 | + blacksmith testbox run --id <ID> "cd dashboard && npm test" |
| 89 | + |
| 90 | + # WRONG — do NOT cd into a subdirectory before invoking the CLI |
| 91 | + cd backend && blacksmith testbox run --id <ID> "php artisan test" |
| 92 | + |
| 93 | +If your shell is in a subdirectory, `cd` back to the repo root first: |
| 94 | + |
| 95 | + cd "$(git rev-parse --show-toplevel)" |
| 96 | + blacksmith testbox run --id <ID> "cd backend && php artisan test" |
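
One way to make the repo-root rule hard to violate is a small wrapper, sketched below. The name `testbox_run` is hypothetical and not part of the `blacksmith` CLI; the sketch assumes `git` is on the PATH and that the shell is somewhere inside the work tree. It only combines the two commands shown above.

```shell
# Hypothetical helper (not part of the blacksmith CLI): run a testbox command
# from the repository root regardless of the caller's current directory.
# The function body is a subshell, so the caller's directory is left unchanged.
testbox_run() (
  id="$1"
  shift
  # Resolve the top-level directory of the current git work tree.
  root="$(git rev-parse --show-toplevel)" || exit 1
  cd "$root" || exit 1
  blacksmith testbox run --id "$id" "$@"
)

# Usage (from any subdirectory of the repo):
#   testbox_run <ID> "cd backend && php artisan test"
```

Because the `cd` happens inside the subshell, the sync would start from the repo root even when the helper is called from `backend/`.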
| 97 | + |
| 98 | +## Running commands |
| 99 | + |
| 100 | + blacksmith testbox run --id <ID> "<command>" |
| 101 | + |
| 102 | +The `run` command automatically waits for the testbox to become ready if |
| 103 | +it is still booting, so you can call `run` immediately after warmup without |
| 104 | +needing to check status first. |
| 105 | + |
| 106 | +## Downloading files from a testbox |
| 107 | + |
| 108 | +Use the `download` command to retrieve files or directories from a running |
| 109 | +testbox to your local machine. This is useful for fetching build artifacts, |
| 110 | +test results, coverage reports, or any output generated on the testbox. |
| 111 | + |
| 112 | + blacksmith testbox download --id <ID> <remote-path> [local-path] |
| 113 | + |
| 114 | +The remote path is relative to the testbox working directory (same as `run`). |
| 115 | +If no local path is specified, the file is saved to the current directory |
| 116 | +using the same base name. |
| 117 | + |
| 118 | +To download a directory, append a trailing `/` to the remote path — this |
| 119 | +triggers recursive mode: |
| 120 | + |
| 121 | + # Download a single file |
| 122 | + blacksmith testbox download --id <ID> coverage/report.html |
| 123 | + |
| 124 | + # Download a file to a specific local path |
| 125 | + blacksmith testbox download --id <ID> build/output.tar.gz ./output.tar.gz |
| 126 | + |
| 127 | + # Download an entire directory |
| 128 | + blacksmith testbox download --id <ID> test-results/ ./results/ |
| 129 | + |
| 130 | +Options: |
| 131 | + |
| 132 | + --ssh-private-key <path> Path to SSH private key (if warmup used --ssh-public-key) |
| 133 | + |
| 134 | +## How file sync works |
| 135 | + |
| 136 | +Understanding this model is critical for using Testbox correctly. |
| 137 | + |
| 138 | +When you call `run`, the CLI performs a **delta sync** of your local changes |
| 139 | +to the remote testbox before executing your command: |
| 140 | + |
| 141 | +1. The testbox VM starts from a clean `actions/checkout` at the warmup ref. |
| 142 | + The workflow's setup steps (e.g. `npm install`, `pip install`, `composer install`) |
| 143 | + run during warmup and populate dependency directories on the remote VM. |
| 144 | + |
| 145 | +2. On each `run`, the CLI uses **git** to detect which files changed locally
| 146 | + since the last sync. It syncs ONLY tracked files and untracked non-ignored
| 147 | + files (i.e. the set that `git ls-files --cached --others --exclude-standard` reports).
| 148 | + |
| 149 | +3. **`.gitignore`'d directories are never synced.** This means directories |
| 150 | + like `node_modules/`, `vendor/`, `.venv/`, `build/`, `dist/`, etc. are |
| 151 | + NOT transferred from your local machine. The testbox uses its own copies |
| 152 | + of those directories, populated during the warmup workflow steps. |
| 153 | + |
| 154 | +4. If nothing has changed since the last sync (same git commit and working |
| 155 | + tree state), the sync is skipped entirely for speed. |
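
To preview which local files fall inside the managed set described in the list above, a plain git query is usually enough. The helper below is a sketch: `list_sync_candidates` is a hypothetical name, and it only approximates the CLI's selection with standard `git ls-files` flags. It is not the CLI's actual implementation.

```shell
# Hypothetical helper: approximate the set of files a git-based sync considers.
#   --cached            tracked files
#   --others            untracked files
#   --exclude-standard  apply .gitignore and the other standard ignore rules
list_sync_candidates() {
  git ls-files --cached --others --exclude-standard
}

# Usage (from the repo root):
#   list_sync_candidates | grep node_modules   # expected to print nothing
#                                              # when node_modules/ is gitignored
```

If a file you expect on the testbox does not appear in this listing, it is probably ignored and would have to be produced on the testbox itself (for example by an install or build step).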
| 156 | + |
| 157 | +### Why this matters |
| 158 | + |
| 159 | +- **Changing dependencies**: If you modify `package.json`, `requirements.txt`, |
| 160 | + `composer.json`, `go.mod`, or similar dependency manifests, the lock/manifest |
| 161 | + file will be synced but the actual dependency directory will NOT. You must |
| 162 | + re-run the install command on the testbox: |
| 163 | + |
| 164 | + blacksmith testbox run --id <ID> "npm install && npm test" |
| 165 | + blacksmith testbox run --id <ID> "pip install -r requirements.txt && pytest" |
| 166 | + blacksmith testbox run --id <ID> "composer install && phpunit" |
| 167 | + |
| 168 | +- **Generated/build artifacts**: If your tests depend on a build step (e.g. |
| 169 | + `npm run build`, `make`), and you changed source files that affect the build |
| 170 | + output, re-run the build on the testbox before testing. |
| 171 | + |
| 172 | +- **New untracked files**: New files you create locally ARE synced (as long as |
| 173 | + they are not gitignored). You do not need to `git add` them first. |
| 174 | + |
| 175 | +- **Deleted files**: Files you delete locally are also deleted on the remote |
| 176 | + testbox. The sync model keeps the remote in lockstep with your local managed |
| 177 | + file set. |
| 178 | + |
| 179 | +## CRITICAL: Never run tests locally |
| 180 | + |
| 181 | +Your local machine does not have the secrets, services, or environment |
| 182 | +parity needed to validate changes reliably. Every test run, migration, |
| 183 | +build, or integration check MUST go through Testbox. |
| 184 | + |
| 185 | +## When to use |
| 186 | + |
| 187 | +ALWAYS use Testbox when: |
| 188 | +- Running any test (unit, integration, e2e) |
| 189 | +- Running database migrations |
| 190 | +- Building the project to check for compile errors |
| 191 | +- Running any command that depends on secrets or environment variables |
| 192 | +- Validating changes before committing |
| 193 | + |
| 194 | +The ONLY exception is trivial checks with zero external dependencies |
| 195 | +(e.g., running a linter or formatter locally). |
| 196 | + |
| 197 | +## Workflow |
| 198 | + |
| 199 | +1. Warm up immediately when you receive a coding task: |
| 200 | + `blacksmith testbox warmup code-quality-testbox.yml` → save the ID |
| 201 | +2. Write code while the testbox boots in the background. |
| 202 | +3. Run tests (the CLI auto-waits if the testbox isn't ready yet): |
| 203 | + `blacksmith testbox run --id <ID> "npm test"` |
| 204 | +4. If tests fail, fix code and re-run (fast — same warm testbox, only |
| 205 | + changed files are synced). |
| 206 | +5. If you changed dependency manifests (package.json, etc.), prepend |
| 207 | + the install command: `blacksmith testbox run --id <ID> "npm install && npm test"` |
| 208 | +6. If you need artifacts (coverage reports, build outputs, etc.), download them: |
| 209 | + `blacksmith testbox download --id <ID> coverage/ ./coverage/` |
| 210 | +7. Once green, commit and push. |
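
For scripted, one-shot checks, the steps above can be folded into a single helper. This is a sketch under assumptions: `testbox_once` is a hypothetical name, and it assumes `warmup` prints only the testbox ID on stdout (as the `# → tbx_...` comment suggests). It uses only the `warmup`, `run`, and `stop` subcommands documented in this file.

```shell
# Hypothetical helper (not part of the blacksmith CLI): warm up a testbox,
# run one command on it, then stop it and return the command's exit status.
# Assumption: `blacksmith testbox warmup` prints only the testbox ID on stdout.
testbox_once() (
  workflow="$1"
  cmd="$2"
  # Always operate from the repo root (see the repo-root rule in this file).
  root="$(git rev-parse --show-toplevel)" || exit 1
  cd "$root" || exit 1
  id="$(blacksmith testbox warmup "$workflow")" || exit 1
  status=0
  blacksmith testbox run --id "$id" "$cmd" || status=$?
  # Stop the testbox whether or not the command succeeded, to free resources.
  blacksmith testbox stop --id "$id"
  exit "$status"
)

# Usage:
#   testbox_once code-quality-testbox.yml "npm test"
```

For interactive work, keeping the ID and reusing the warm testbox across runs (as in steps 3 to 5) is likely faster than warming up a new one each time.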
| 211 | + |
| 212 | +## Examples |
| 213 | + |
| 214 | + blacksmith testbox warmup code-quality-testbox.yml |
| 215 | + # → tbx_01jkz5b3t9... |
| 216 | + |
| 217 | + # Run tests |
| 218 | + blacksmith testbox run --id <ID> "npm test -- --testPathPattern=handler.test" |
| 219 | + blacksmith testbox run --id <ID> "go test ./pkg/api/... -run TestHandler -v" |
| 220 | + blacksmith testbox run --id <ID> "python -m pytest tests/test_api.py -k test_auth" |
| 221 | + |
| 222 | + # Re-install deps after changing package.json, then test |
| 223 | + blacksmith testbox run --id <ID> "npm install && npm test" |
| 224 | + |
| 225 | + # Build and test |
| 226 | + blacksmith testbox run --id <ID> "npm run build && npm test" |
| 227 | + |
| 228 | + # Download artifacts from the testbox |
| 229 | + blacksmith testbox download --id <ID> coverage/lcov-report/ ./coverage/ |
| 230 | + blacksmith testbox download --id <ID> build/output.tar.gz |
| 231 | + |
| 232 | +## Waiting for the testbox to be ready |
| 233 | + |
| 234 | +The `run` command automatically waits for the testbox, so explicit waiting is |
| 235 | +usually unnecessary. If you do need to check readiness separately (e.g. before |
| 236 | +a series of runs), use `status --wait`. Do NOT use a sleep-and-recheck loop.
| 237 | + |
| 238 | +Correct: block until ready with a timeout: |
| 239 | + |
| 240 | + blacksmith testbox status --id <ID> --wait [--wait-timeout 5m] |
| 241 | + |
| 242 | +Wrong: never use sleep + status in a loop: |
| 243 | + |
| 244 | + # BAD — do not do this |
| 245 | + sleep 30 && blacksmith testbox status --id <ID> |
| 246 | + while ! blacksmith testbox status --id <ID> | grep ready; do sleep 5; done |
| 247 | + |
| 248 | +`--wait` polls the status and exits as soon as the testbox is ready (or when the |
| 249 | +timeout is reached). Default timeout is 5m; use `--wait-timeout` for longer |
| 250 | +(e.g. `10m`, `1h`). |
| 251 | + |
| 252 | +## Managing testboxes |
| 253 | + |
| 254 | + # Check status of a specific testbox |
| 255 | + blacksmith testbox status --id <ID> |
| 256 | + |
| 257 | + # List all active testboxes for the current repo |
| 258 | + blacksmith testbox list |
| 259 | + |
| 260 | + # Stop a testbox when you're done (frees resources) |
| 261 | + blacksmith testbox stop --id <ID> |
| 262 | + |
| 263 | +Testboxes automatically shut down after being idle (default: 30 minutes). |
| 264 | +If you need a longer session, increase the timeout at warmup time: |
| 265 | + |
| 266 | + blacksmith testbox warmup code-quality-testbox.yml --idle-timeout 60 |
| 267 | + |
| 268 | +## With options |
| 269 | + |
| 270 | + blacksmith testbox warmup code-quality-testbox.yml --ref main |
| 271 | + blacksmith testbox warmup code-quality-testbox.yml --idle-timeout 60 |
| 272 | + blacksmith testbox run --id <ID> "go test ./..." |