[roadmap] /ba:prove — capture screenshot evidence of a feature for the MR

## Idea

A last-mile command that captures **objective screenshot evidence that a feature works** and stages it for attachment to the MR/PR. Lately this has been done by hand with chrome-devtools MCP (or `agent-browser`): drive the app, often **mocking network requests to reach different scenarios** (empty / error / loading / success), snap screenshots, and attach them to the MR.

This lives at the *same last-mile moment* as `/ba:polish` (server up, browser open, last step before shipping) and shares the **ui-driver / browser seam** — but it is a **different lane** and should not be folded into polish.

## Why it is not polish (and not review)

Same moment + same tooling, but different on every axis the polish brainstorm locked down:

| | **polish** | **/ba:prove** (this) | *(verification — separately deferred)* |
|---|---|---|---|
| Signal | subjective (feel) | objective (it works) | objective (it works) |
| Audience | **you, now** | **the reviewer, later** | the agent, as a gate |
| Output | ephemeral, **no artifacts** | **durable artifacts** (PNGs on the MR) | pass/fail |
| Drives state | observes current state | **constructs states** (mock network per scenario) | constructs states |

Folding it into polish would break polish's locked identity ("purely conversational, no persisted artifacts, subjective feel"). It also doesn't belong in `/ba:review`, which runs without a server and is already slow.

## Proposed shape

- **Own command, `/ba:prove`** (its own lane: distinct audience + a non-trivial mocking core), **orchestrated by `/ba:propose`** since the artifact is MR-facing — e.g. `/ba:propose --capture` calls it, or it runs standalone.
- **Shares the ui-driver seam** with polish; does not reach into polish's conversational loop.
- **Cheap → expensive split:**
  - **v1 (mock-free):** capture the happy path: navigate (or take a list of routes as input), snap, stage images for the MR body. Immediately useful.
  - **v2 (scenario-driven):** define states, mock network per state, drive + snap each. The real value, the real cost. Likely overlaps with the repo's existing mocking (MSW-style / fixtures).
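
One possible shape for the scenario definitions, as a sketch only. Every name here (`MockRule`, `Scenario`, `planCaptures`, the `/orders` route, the `/api/orders` URL) is hypothetical, and the driver call that actually intercepts requests is deliberately left out because the choice of driver is still open. The sketch only shows how v1 (no mocks) and v2 (mocks per state) could share one plan format.

```typescript
// Hypothetical types: nothing here is an existing /ba:prove API.
interface MockRule {
  urlPattern: string; // request URL substring to intercept
  status: number;     // HTTP status to return
  body?: unknown;     // JSON body to return, if any
  delayMs?: number;   // artificial latency, e.g. to hold a loading state
}

interface Scenario {
  name: string;       // used in the screenshot file name
  route: string;      // app route to open
  mocks: MockRule[];  // empty list = v1 behaviour (capture whatever the app shows)
}

interface CaptureStep {
  scenario: string;
  route: string;
  mocks: MockRule[];
  outputPath: string;
}

// Turn scenario definitions into an ordered capture plan for the driver.
function planCaptures(scenarios: Scenario[], outDir: string): CaptureStep[] {
  return scenarios.map((s, i) => ({
    scenario: s.name,
    route: s.route,
    mocks: s.mocks,
    outputPath: `${outDir}/${i + 1}-${s.name}.png`,
  }));
}

// Example data (assumed app): one route, four states.
const scenarios: Scenario[] = [
  { name: "success", route: "/orders", mocks: [] },
  { name: "empty", route: "/orders", mocks: [{ urlPattern: "/api/orders", status: 200, body: [] }] },
  { name: "error", route: "/orders", mocks: [{ urlPattern: "/api/orders", status: 500 }] },
  { name: "loading", route: "/orders", mocks: [{ urlPattern: "/api/orders", status: 200, body: [], delayMs: 5000 }] },
];

const plan = planCaptures(scenarios, "evidence");
```

Keeping the plan as plain data means the same definitions could later be fed from existing fixtures or mocks instead of inline values.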

## Prior art — Every's compound-engineering plugin

- [`ce-demo-reel`](https://github.com/EveryInc/compound-engineering-plugin/blob/main/plugins/compound-engineering/skills/ce-demo-reel/SKILL.md) (closest match). Core principle: *"Evidence means USING THE PRODUCT, not running tests."* It captures in tiers (browser GIF / terminal recording / screenshot reel / static PNGs / no-evidence), **scans for secrets before upload**, uploads to a public host, returns `{tier, description, url, path}`, and **the caller integrates it into the PR description** under a "Demo"/"Screenshots" label. Strong template for the artifact + propose hand-off.
- [`ce-test-browser`](https://github.com/EveryInc/compound-engineering-plugin/blob/63b6b260c345ba70ce9d9a393eeedefb64e4e0a0/plugins/compound-engineering/skills/ce-test-browser/SKILL.md) — diff→route mapping, test each page, pass/fail table. This is the **verification** lane (objective gate for the agent), *not* this issue — useful reference for the deferred verification idea instead.
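
A rough sketch of what the caller side of that hand-off could look like. The field names follow the `{tier, description, url, path}` shape described above; the tier string values, the `EvidenceResult` type, `renderEvidenceSection`, and the rule for picking "Screenshots" vs. "Demo" are assumptions for illustration, not code taken from `ce-demo-reel`.

```typescript
// Tier values are assumed spellings of the tiers listed above.
type Tier = "browser-gif" | "terminal-recording" | "screenshot-reel" | "static-png" | "none";

interface EvidenceResult {
  tier: Tier;
  description: string;
  url: string | null;  // public URL after upload, if any
  path: string | null; // local file path, if any
}

// Build the markdown section the caller splices into the MR/PR description.
function renderEvidenceSection(results: EvidenceResult[]): string {
  const shown = results.filter((r) => r.tier !== "none" && r.url !== null);
  if (shown.length === 0) {
    return ""; // nothing to attach, leave the MR body untouched
  }
  // Assumed rule: still images go under "Screenshots", anything animated under "Demo".
  const allStatic = shown.every((r) => r.tier === "static-png" || r.tier === "screenshot-reel");
  const heading = allStatic ? "## Screenshots" : "## Demo";
  const lines = shown.map((r) => `![${r.description}](${r.url})`);
  return [heading, ""].concat(lines).join("\n");
}
```

Returning an empty string for the no-evidence tier lets the caller append the result unconditionally.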

**Key divergence:** neither Every skill does **network mocking / scenario states** — they capture whatever happy-path state the app is in. The scenario-mocking layer is the net-new, expensive part of this idea and the thing to scope carefully.

## Borrow from `ce-demo-reel`
- Tiered capture with a graceful "no evidence needed" tier (text/config-only changes).
- **Secret scanning before any upload** (patterns like `sk-`, `ghp_`, `Bearer`, `?token=`); enter any credentials outside the captured region so they never appear in the evidence.
- Standardized return that `propose` splices into the MR body.
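
A minimal sketch of the pre-upload secret scan, assuming the capture step can hand over text to check (page text, visible URLs, terminal output). The prefixes come from the list above; the exact regexes, `SECRET_PATTERNS`, `findSecrets`, and `assertSafeToUpload` are illustrative assumptions, not `ce-demo-reel`'s actual pattern list.

```typescript
// Prefixes mirror the examples above (sk-, ghp_, Bearer, ?token=);
// lengths and character classes are guesses for illustration.
const SECRET_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "sk- key", pattern: /\bsk-[A-Za-z0-9_-]{8,}/ },
  { label: "GitHub token", pattern: /\bghp_[A-Za-z0-9]{8,}/ },
  { label: "Bearer token", pattern: /\bBearer\s+[A-Za-z0-9._~+\/-]{8,}/ },
  { label: "token query param", pattern: /[?&]token=[^&\s]+/ },
];

// Return the label of every pattern found in the text.
// An empty list means nothing matched.
function findSecrets(text: string): string[] {
  return SECRET_PATTERNS.filter((p) => p.pattern.test(text)).map((p) => p.label);
}

// Gate to call before any upload: throws instead of uploading when something matches.
function assertSafeToUpload(text: string): void {
  const hits = findSecrets(text);
  if (hits.length > 0) {
    throw new Error(`refusing to upload, possible secrets: ${hits.join(", ")}`);
  }
}
```

A text scan like this cannot see secrets that only exist as pixels in a screenshot, which is one more reason to keep credentials out of the captured region in the first place.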

## Meta-insight (ui-driver seam)
This is the **second concrete consumer** of the ui-driver seam (after polish; verification would be a third). That's the justification we said was missing when we deferred the seam-as-extension-point — it upgrades "ui-driver seam" from polish-private convenience to deliberate shared last-mile browser infra. Doesn't change v0 polish scope; it's a reason to keep the seam clean.

## Open questions
- Standalone `/ba:prove` vs. `/ba:propose --capture` vs. both?
- Where do scenario definitions live — inline args, a small config, or reuse existing test fixtures/mocks?
- chrome-devtools MCP vs. `agent-browser` as the driver (request interception for mocking is the deciding capability).
- How to host/attach screenshots: upload to a public host (ce-demo-reel style), commit into the MR, or inline?

## Status
Deliberately a roadmap item, not scheduled. Gated on the ui-driver seam existing. Distinct from `/ba:polish` (feel) and the separately-deferred verification/gate idea.

https://claude.ai/code/session_014AfNMUnKn3oAsZhvNk5Vxa
