From 5351546e46edfca07f1cfcc8aba3f57bd3f404c0 Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Fri, 24 Apr 2026 19:57:37 +0900
Subject: [PATCH 01/12] ops(deploy): rolling-update via GitHub Actions over
 Tailscale
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Today's rolling-update flow is manual: operators SSH from a workstation
with required env vars and invoke scripts/rolling-update.sh. That has
no audit trail, no approval gate, no dry-run, and relies on per-operator
secret handling.

This change adds a workflow_dispatch workflow that joins the Tailnet
via tailscale/github-action (ephemeral OAuth node, tag:ci-deploy),
SSHes into each cluster node over MagicDNS, and invokes the existing
scripts/rolling-update.sh — unchanged. Nodes stay as they are; the
script's env-var contract is the integration boundary.

Highlights:
- workflow_dispatch only (no auto-deploy); production GitHub environment
  gates non-dry-run runs on required reviewers.
- inputs: ref (required), image_tag (rollback override), nodes (subset
  filter), dry_run (default true).
- Dry-run does everything up to the container touch: renders NODES +
  SSH_TARGETS from env variables, verifies the image exists on ghcr.io,
  and tailscale-pings every target. Catches typo'd inputs before any
  production effect.
- Concurrency group "rolling-update", cancel-in-progress: false, so
  parallel invocations queue rather than stomp.
- No node-side changes required: nodes are assumed to already run
  tailscaled and expose authorized_keys for the deploy user. Plain SSH
  over Tailscale; Tailscale SSH (keyless) is called out as a follow-up.
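
The `nodes` subset filter can be tried locally before dispatching the
workflow. The following is a minimal POSIX-shell sketch of the same
key-matching idea, not the workflow's exact code; the host values are
made up for illustration:

```shell
# Sketch: keep only the "raftId=value" entries whose raftId is in the filter.
# The body runs in a subshell, so the IFS change does not leak to the caller.
filter_csv() (
  all=$1
  filter=$2
  out=""
  IFS=','
  for e in $all; do
    key=${e%%=*}            # text before the first '='
    for w in $filter; do
      if [ "$key" = "$w" ]; then
        out="${out}${e},"
        break
      fi
    done
  done
  printf '%s\n' "${out%,}"  # drop the trailing comma
)

filter_csv "n1=kv01,n2=kv02,n3=kv03" "n1,n3"   # prints n1=kv01,n3=kv03
```

A filter that matches nothing yields an empty line, which the render
step treats as an error.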

Design doc: docs/design/2026_04_24_proposed_deploy_via_tailscale.md
Operator runbook: docs/deploy_via_tailscale_runbook.md

Validation:
- actionlint passes (0 errors)
- YAML parses
- No changes to scripts/rolling-update.sh; the workflow calls it with
  the same env var contract already documented in the script's usage().

Out of scope (follow-ups): post-deploy health verification,
auto-rollback on script failure, Jepsen gating, image-signature
verification, Tailscale SSH (keyless), shared deploy user.
---
 .github/workflows/rolling-update.yml          | 199 ++++++++++++++++++
 docs/deploy_via_tailscale_runbook.md          | 150 +++++++++++++
 ...026_04_24_proposed_deploy_via_tailscale.md | 191 +++++++++++++++++
 3 files changed, 540 insertions(+)
 create mode 100644 .github/workflows/rolling-update.yml
 create mode 100644 docs/deploy_via_tailscale_runbook.md
 create mode 100644 docs/design/2026_04_24_proposed_deploy_via_tailscale.md

diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
new file mode 100644
index 000000000..25a1c992f
--- /dev/null
+++ b/.github/workflows/rolling-update.yml
@@ -0,0 +1,199 @@
+name: Rolling update
+
+# Manually-triggered production rollout. Joins the Tailnet, SSHes over
+# MagicDNS into each node, and invokes scripts/rolling-update.sh.
+# See docs/design/2026_04_24_proposed_deploy_via_tailscale.md.
+
+on:
+  workflow_dispatch:
+    inputs:
+      ref:
+        description: Git ref (tag or sha) to deploy. Also used as the image tag unless image_tag is set.
+        required: true
+        type: string
+      image_tag:
+        description: Override the image tag (default = ref). Used for rollbacks.
+        required: false
+        type: string
+        default: ""
+      nodes:
+        description: Comma-separated raft IDs to roll (e.g. "n1,n2"). Empty = all nodes in NODES_RAFT_MAP.
+        required: false
+        type: string
+        default: ""
+      dry_run:
+        description: Render the plan and run a reachability check only; do NOT touch containers.
+        required: true
+        type: boolean
+        default: true
+
+permissions:
+  contents: read
+  id-token: write   # required by tailscale/github-action OIDC flow
+
+concurrency:
+  group: rolling-update
+  cancel-in-progress: false
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    # Approval gate — see GitHub environment settings for required reviewers.
+    # Dry-runs also use this environment so the secret wiring is identical.
+    # Required reviewers apply to every run; auto-approving dry-runs, if that
+    # distinction is desired, needs a custom deployment protection rule
+    # (GitHub UI: "Deployment protection rules").
+    environment: production
+    timeout-minutes: 60
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v6
+        with:
+          ref: ${{ inputs.ref }}
+
+      - name: Install jq
+        run: sudo apt-get install -y --no-install-recommends jq
+
+      - name: Verify image exists on ghcr.io
+        env:
+          IMAGE_BASE: ${{ vars.IMAGE_BASE }}
+          IMAGE_TAG: ${{ inputs.image_tag || inputs.ref }}
+          GHCR_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: |
+          set -euo pipefail
+          if [[ -z "$IMAGE_BASE" ]]; then
+            echo "::error::IMAGE_BASE repository variable is not set"
+            exit 1
+          fi
+          echo "Checking $IMAGE_BASE:$IMAGE_TAG"
+          echo "$GHCR_TOKEN" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin >/dev/null
+          if ! docker manifest inspect "$IMAGE_BASE:$IMAGE_TAG" >/dev/null; then
+            echo "::error::image $IMAGE_BASE:$IMAGE_TAG not found on ghcr.io"
+            exit 1
+          fi
+
+      - name: Join Tailnet (ephemeral)
+        uses: tailscale/github-action@v3
+        with:
+          oauth-client-id: ${{ secrets.TS_OAUTH_CLIENT_ID }}
+          oauth-secret: ${{ secrets.TS_OAUTH_SECRET }}
+          tags: tag:ci-deploy
+
+      - name: Configure SSH
+        env:
+          SSH_KEY: ${{ secrets.DEPLOY_SSH_PRIVATE_KEY }}
+          KNOWN_HOSTS: ${{ secrets.DEPLOY_KNOWN_HOSTS }}
+        run: |
+          set -euo pipefail
+          mkdir -p ~/.ssh
+          chmod 700 ~/.ssh
+          printf '%s\n' "$SSH_KEY" > ~/.ssh/id_ed25519
+          chmod 600 ~/.ssh/id_ed25519
+          printf '%s\n' "$KNOWN_HOSTS" > ~/.ssh/known_hosts
+          chmod 644 ~/.ssh/known_hosts
+          # Sanity: reject an empty key file, then confirm ssh-keygen can parse it.
+          test -s ~/.ssh/id_ed25519 || { echo "::error::DEPLOY_SSH_PRIVATE_KEY is empty"; exit 1; }
+          ssh-keygen -lf ~/.ssh/id_ed25519 >/dev/null
+
+      - name: Render NODES and SSH_TARGETS
+        id: render
+        env:
+          NODES_RAFT_MAP: ${{ vars.NODES_RAFT_MAP }}
+          SSH_TARGETS_MAP: ${{ vars.SSH_TARGETS_MAP }}
+          NODES_FILTER: ${{ inputs.nodes }}
+        run: |
+          set -euo pipefail
+          if [[ -z "$NODES_RAFT_MAP" || -z "$SSH_TARGETS_MAP" ]]; then
+            echo "::error::NODES_RAFT_MAP or SSH_TARGETS_MAP is not set in the production environment variables"
+            exit 1
+          fi
+          if [[ -n "$NODES_FILTER" ]]; then
+            # Filter NODES_RAFT_MAP and SSH_TARGETS_MAP to the requested subset.
+            filter_csv() {
+              local all="$1"
+              local filter="$2"
+              local out=""
+              IFS=',' read -r -a entries <<< "$all"
+              IFS=',' read -r -a wanted <<< "$filter"
+              for e in "${entries[@]}"; do
+                key="${e%%=*}"
+                for w in "${wanted[@]}"; do
+                  if [[ "$key" == "$w" ]]; then
+                    out+="${e},"
+                    break
+                  fi
+                done
+              done
+              echo "${out%,}"
+            }
+            NODES_RAFT_MAP="$(filter_csv "$NODES_RAFT_MAP" "$NODES_FILTER")"
+            SSH_TARGETS_MAP="$(filter_csv "$SSH_TARGETS_MAP" "$NODES_FILTER")"
+            if [[ -z "$NODES_RAFT_MAP" ]]; then
+              echo "::error::nodes filter '$NODES_FILTER' matches nothing in NODES_RAFT_MAP"
+              exit 1
+            fi
+          fi
+          {
+            echo "NODES=$NODES_RAFT_MAP"
+            echo "SSH_TARGETS=$SSH_TARGETS_MAP"
+          } >> "$GITHUB_OUTPUT"
+          echo "::group::Deploy plan"
+          echo "NODES=$NODES_RAFT_MAP"
+          echo "SSH_TARGETS=$SSH_TARGETS_MAP"
+          echo "::endgroup::"
+
+      - name: Tailscale reachability check
+        env:
+          SSH_TARGETS: ${{ steps.render.outputs.SSH_TARGETS }}
+        run: |
+          set -euo pipefail
+          IFS=',' read -r -a entries <<< "$SSH_TARGETS"
+          failed=0
+          for e in "${entries[@]}"; do
+            host="${e##*=}"
+            host="${host%%:*}"
+            # strip user@ if present
+            host="${host##*@}"
+            if tailscale ping --c 2 --timeout 3s "$host" >/dev/null 2>&1; then
+              echo "  ok   $host"
+            else
+              echo "::error::$host not reachable over tailnet"
+              failed=1
+            fi
+          done
+          if [[ "$failed" -ne 0 ]]; then
+            exit 1
+          fi
+
+      - name: Dry-run summary
+        if: ${{ inputs.dry_run }}
+        env:
+          NODES: ${{ steps.render.outputs.NODES }}
+          SSH_TARGETS: ${{ steps.render.outputs.SSH_TARGETS }}
+          IMAGE_BASE: ${{ vars.IMAGE_BASE }}
+          IMAGE_TAG: ${{ inputs.image_tag || inputs.ref }}
+          SSH_USER: ${{ vars.SSH_USER }}
+        run: |
+          set -euo pipefail
+          cat <<EOF
+          ==== DRY RUN — no containers were touched ====
+          image:       ${IMAGE_BASE}:${IMAGE_TAG}
+          SSH user:    ${SSH_USER}
+          NODES:       ${NODES}
+          SSH_TARGETS: ${SSH_TARGETS}
+          ref:         ${{ inputs.ref }}
+          Re-run with dry_run=false to apply.
+          EOF
+
+      - name: Roll cluster
+        if: ${{ !inputs.dry_run }}
+        env:
+          NODES: ${{ steps.render.outputs.NODES }}
+          SSH_TARGETS: ${{ steps.render.outputs.SSH_TARGETS }}
+          SSH_USER: ${{ vars.SSH_USER }}
+          IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
+          SSH_STRICT_HOST_KEY_CHECKING: "yes"
+        run: |
+          set -euo pipefail
+          ./scripts/rolling-update.sh
diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
new file mode 100644
index 000000000..e1ddeb9d2
--- /dev/null
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -0,0 +1,150 @@
+# Deploy via Tailscale + GitHub Actions — Runbook
+
+Companion doc to `docs/design/2026_04_24_proposed_deploy_via_tailscale.md`. This
+runbook is for operators: what to configure on GitHub and Tailscale so the
+`rolling-update` workflow can execute a production deploy.
+
+## 1. Precondition: Tailscale on every node
+
+Each cluster node must have `tailscale` installed, logged into the tailnet, and
+tagged so the CI runner's ACL can reach it.
+
+```
+# on each kv0X node
+sudo tailscale up \
+  --ssh=false \
+  --advertise-tags=tag:elastickv-node \
+  --accept-routes=false
+```
+
+Verify the node is reachable by MagicDNS from another tailnet peer:
+
+```
+tailscale status | grep kv0X
+ping kv0X.<tailnet>.ts.net
+```
+
+## 2. Tailscale ACL
+
+In the Tailscale admin console, add the deploy rule to the tailnet ACL:
+
+```jsonc
+"tagOwners": {
+  "tag:ci-deploy":      ["autogroup:admin"],
+  "tag:elastickv-node": ["autogroup:admin"],
+},
+"acls": [
+  {
+    "action": "accept",
+    "src":    ["tag:ci-deploy"],
+    "dst":    ["tag:elastickv-node:22"],
+  },
+],
+```
+
+`tag:ci-deploy` must NOT have access to any other port on the tailnet. The
+deploy workflow only needs SSH.
+
+## 3. Tailscale OAuth client
+
+Admin console → Settings → OAuth clients → New client:
+
+- Description: `elastickv GitHub Actions deploy`
+- Scopes: `auth_keys` (write)
+- Tags: `tag:ci-deploy`
+
+Copy the client ID and secret; they go into GitHub in the next step.
+
+## 4. GitHub environment: `production`
+
+Repo → Settings → Environments → New environment: `production`.
+
+### Required reviewers
+Configure "Required reviewers" on the environment. Deploys will pause until
+one of the reviewers approves. The built-in rule gates every run, dry-runs
+included; auto-approving when `dry_run == true` needs a custom deployment
+protection rule (optional; cuts friction for previews).
+
+### Environment secrets
+
+| Name | Value |
+|------|-------|
+| `TS_OAUTH_CLIENT_ID`        | Tailscale OAuth client ID from step 3 |
+| `TS_OAUTH_SECRET`           | Tailscale OAuth secret from step 3 |
+| `DEPLOY_SSH_PRIVATE_KEY`    | OpenSSH private key, authorized on every node under the deploy user |
+| `DEPLOY_KNOWN_HOSTS`        | `ssh-keyscan kv01.<tailnet>.ts.net kv02.<tailnet>.ts.net …` output (one host per line) |
+
+The SSH key should be ed25519, dedicated to CI (not a reused developer key).
+Regenerate on operator rotation.
+
+### Environment variables
+
+| Name | Value | Example |
+|------|-------|---------|
+| `IMAGE_BASE`      | Container image path (no tag)     | `ghcr.io/bootjp/elastickv` |
+| `SSH_USER`        | SSH login on every node           | `bootjp` |
+| `NODES_RAFT_MAP`  | Comma-separated `raftId=host:port` | `n1=kv01:50051,n2=kv02:50051,n3=kv03:50051,n4=kv04:50051,n5=kv05:50051` |
+| `SSH_TARGETS_MAP` | Comma-separated `raftId=ssh-host` | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
+
+## 5. Running a deploy
+
+Actions tab → "Rolling update" → Run workflow.
+
+Inputs:
+
+- `ref` — the git tag or sha to deploy (also used as the container image tag)
+- `image_tag` — overrides the image tag, which otherwise equals `ref`; use it
+  for rollbacks (e.g., `v1.2.3` redeploys that image whatever `ref` is)
+- `nodes` — subset of raft IDs, e.g., `n1,n2`. Empty rolls all nodes.
+- `dry_run` — default `true`. Renders the plan and checks reachability without
+  touching containers.
+
+Recommended first-run sequence:
+
+1. `dry_run: true`, `nodes: n1`, `ref: <target>` — confirms tailnet join,
+   SSH config, image availability, target mapping. No production impact.
+2. `dry_run: false`, `nodes: n1` — roll a single node, verify the cluster
+   stays healthy and the image is correct.
+3. `dry_run: false`, `nodes:` (empty) — full roll.
+
+## 6. Rollback
+
+Re-run the workflow with `image_tag` set to the previous-known-good sha. The
+`nodes` input can target specific nodes if only some carry the bad image.
+
+## 7. What the workflow does NOT do (yet)
+
+- **No post-deploy health verification beyond tailnet reachability.** The
+  script itself blocks on `raftadmin` leadership transfer and health-gate
+  timeouts, but the workflow does not independently probe Prometheus or
+  Redis after the roll. Add this when we have a canonical post-deploy
+  assertion suite.
+- **No auto-rollback on failure.** If the script exits non-zero mid-roll,
+  the cluster is left in whatever state the script reached. The operator
+  must inspect and either re-roll or roll back manually.
+- **No Jepsen gate.** The deploy does not require a green Jepsen run on
+  `ref` before proceeding.
+- **No image-signature check.** `cosign verify` on the image is a follow-up.
+
+## 8. Troubleshooting
+
+### Job pauses indefinitely at "Waiting for approval"
+Expected for non-dry-run deploys — a reviewer from the `production` environment
+must click Approve. Check the "Required reviewers" list in the environment
+settings.
+
+### `tailscale ping` fails for a node
+The node may not be running `tailscaled`, may not be tagged
+`tag:elastickv-node`, or the tailnet ACL may have drifted. `tailscale status`
+on the node should show the tag; the admin console should list the node's
+IP under `tag:elastickv-node`.
+
+### `image ... not found on ghcr.io`
+The verification step hit the ghcr manifest API and got a 404. Either the
+image tag was not pushed (check the `Docker Image CI` workflow for `ref`) or
+the tag is a moving tag (`latest`) that the verification step can't
+distinguish from stale. Specify an immutable tag.
+
+### SSH `Host key verification failed`
+`DEPLOY_KNOWN_HOSTS` is stale. Re-run `ssh-keyscan` against every node and
+update the secret.
diff --git a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
new file mode 100644
index 000000000..17610ab60
--- /dev/null
+++ b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
@@ -0,0 +1,191 @@
+# Deploy via Tailscale + GitHub Actions
+
+**Status:** Proposed
+**Author:** bootjp
+**Date:** 2026-04-24
+
+---
+
+## 1. Background
+
+Today the rolling-update flow is manual: from their workstation, an operator
+exports the required env vars (`NODES`, `SSH_TARGETS`, image tag, etc.),
+invokes `scripts/rolling-update.sh`, and watches it roll the cluster.
+
+Problems:
+
+- **No audit trail.** Who rolled what, when, and from which commit is only
+  visible in each operator's local shell history.
+- **Manual secret handling.** SSH keys, Tailscale auth, and S3 creds live on
+  operator workstations. Joining and leaving the ops rotation requires key
+  shuffling.
+- **No approval gate.** The production cluster is rolled by whoever types the
+  command. A typo can take out the cluster before anyone else sees it.
+- **No dry-run.** The script supports neither `--dry-run` nor a preview mode;
+  operators who want to verify targeting have to read the script.
+
+The 2026-04-24 incident compounds the risk: the cluster is fragile enough that
+a rolling update executed against the wrong `NODES` list could cascade into an
+election storm.
+
+## 2. Proposal
+
+Move rolling deploys to a GitHub Actions workflow that joins the Tailnet via
+`tailscale/github-action`, SSHes into each node over Tailscale MagicDNS, and
+invokes the existing `scripts/rolling-update.sh`. All secrets live in GitHub
+environments; every deploy becomes a logged, reviewable workflow run.
+
+**Precondition (operator responsibility):** Tailscale is already installed and
+logged in on every node, with SSH access enabled over the tailnet.
+
+### 2.1 Workflow shape
+
+```
+name: Rolling update
+on: workflow_dispatch:
+  inputs:
+    ref:           # git sha/tag of the image to deploy
+    image_tag:     # defaults to $ref; override only for rollbacks
+    nodes:         # subset of raft IDs; empty = full roll
+    dry_run:       # bool, default TRUE — renders plan but doesn't roll
+
+jobs:
+  deploy:
+    environment: production    # requires approval
+    concurrency:
+      group: rolling-update
+      cancel-in-progress: false
+    runs-on: ubuntu-latest
+    steps:
+      - checkout
+      - join tailnet (tailscale/github-action, ephemeral)
+      - configure SSH (write DEPLOY_SSH_PRIVATE_KEY and known_hosts to ~/.ssh)
+      - render NODES + SSH_TARGETS from repo config
+      - if dry_run: print the derived env and exit
+      - else: ./scripts/rolling-update.sh
+```
+
+### 2.2 Secrets and variables
+
+Stored in a GitHub `production` environment (not repo-wide):
+
+**Secrets:**
+- `TS_OAUTH_CLIENT_ID`, `TS_OAUTH_SECRET` — Tailscale OAuth client scoped to
+  "devices:write" on a single tag (e.g., `tag:ci-deploy`). Ephemeral nodes;
+  cleaned up automatically after the job.
+- `DEPLOY_SSH_PRIVATE_KEY` — SSH key authorized on every node. Restricted to
+  the `deploy` user (if we split it out) or `bootjp` (initial).
+- `DEPLOY_KNOWN_HOSTS` — pre-populated `known_hosts` with the Tailnet MagicDNS
+  entries. Prevents the first-connect TOFU prompt.
+
+**Variables (non-secret):**
+- `NODES_RAFT_MAP` — `n1=kv01:50051,n2=kv02:50051,...` (advertised addresses
+  as seen from inside the tailnet).
+- `SSH_TARGETS_MAP` — `n1=kv01.tailnet.ts.net,...` (MagicDNS).
+- `IMAGE_BASE` — `ghcr.io/bootjp/elastickv` (tag is appended from the input).
+- `SSH_USER` — e.g., `bootjp`.
+
+### 2.3 Tailscale authentication
+
+Use OAuth ephemeral nodes (not a long-lived auth key):
+
+- Create an OAuth client in Tailscale admin console with scope
+  `devices:write` on tag `tag:ci-deploy`.
+- Store client ID + secret in GitHub env secrets.
+- `tailscale/github-action@v3` joins the tailnet for the duration of the job
+  as an ephemeral tagged node; disconnects automatically on job exit.
+
+ACLs on the Tailnet side should limit `tag:ci-deploy` to SSH (tcp/22) on
+`tag:elastickv-node` only. No other ports, no other tags.
+
+### 2.4 SSH
+
+Two options:
+
+- **A. Tailscale SSH.** Lets CI SSH in without managing an SSH keypair: the
+  Tailnet ACL is the authorization model. Requires the nodes to have
+  `--ssh` flag on `tailscaled` (or `tailscale up --ssh`) and the Tailnet ACL
+  to grant `tag:ci-deploy` SSH access to node tag + user. No SSH keys in
+  GitHub at all.
+- **B. Plain SSH over Tailscale.** CI brings an SSH key; nodes continue to
+  use `~/.ssh/authorized_keys`. Tailscale is just the network layer.
+
+**Recommendation for v1: B** (plain SSH). Nodes already have `authorized_keys`
+for the current manual flow; nothing to change on the node side. Tailscale
+SSH (A) can be a follow-up once the key-rotation story is written up.
+
+### 2.5 Dry-run semantics
+
+With `dry_run: true` (the default):
+
+- Everything up to script invocation runs (checkout, image check, tailnet
+  join, SSH key setup, `NODES`/`SSH_TARGETS` render).
+- The script is not invoked; the rendered env is printed as a collapsed
+  log group.
+- `tailscale ping` is run against each SSH target to confirm reachability.
+- The actual `docker stop/rm/run` loop does NOT execute.
+
+This catches the common failure modes (bad secret, bad env mapping, a node
+unreachable over the tailnet) before touching any live container.
+
+### 2.6 Production environment approval
+
+Mark the `production` GitHub environment as requiring approval from a list of
+reviewers. A non-dry-run deploy will pause until approved; a dry run
+itself does not need approval (it only needs the tailnet join).
+
+Alternative: require approval unconditionally and treat the dry-run as a
+"preview" that an approver must ack. Simpler policy, slightly more friction.
+
+**Recommendation:** approval required for non-dry-run only. Dry-runs are
+cheap and useful.
+
+### 2.7 Rollback
+
+Rolling back uses the same workflow with `image_tag: <previous-sha>`. The
+script already supports the rollout order env var (`ROLLING_ORDER`) so an
+operator can force-roll only the affected nodes.
+
+**Gap:** there is no "stop mid-rollout" control today. If the workflow is
+cancelled via GitHub UI during a roll, the in-flight node may be mid-recreate.
+`rolling-update.sh` is supposed to be idempotent and crash-safe, but this
+should be verified before we call the workflow production-ready.
+
+## 3. Open questions
+
+- **SSH user.** Continue using `bootjp` (personal) or provision a shared
+  `deploy` user on each node? v1 sticks with `bootjp` to keep scope tight;
+  follow-up can introduce `deploy` with a limited sudo rule for `docker`.
+- **Secret scope.** Environment-scoped secrets (as proposed) vs.
+  repository-scoped. Environment-scoped wins on blast radius but requires
+  the GitHub environment to be pre-created. Assume pre-created.
+- **Image availability check.** Should the workflow verify the image tag
+  exists on ghcr.io before starting the roll? Cheap to add (`docker manifest
+  inspect` in a pre-step) and prevents a half-rolled cluster when the tag is
+  typo'd.
+- **Jepsen gating.** A `jepsen-test.yml` workflow already exists.
+  Option: require a green Jepsen run on `ref` within the last N hours before
+  allowing deploy. Skipped for v1; worth revisiting before rolling this out
+  to high-traffic periods.
+
+## 4. Out of scope for v1
+
+- Automatic deploys on merge to main (needs more test coverage before we'd
+  trust it).
+- Blue-green or canary strategies (we don't have the traffic-routing layer
+  for it).
+- Metrics-based rollback trigger (watch p99, auto-revert if it jumps).
+- Tailscale SSH (option A above).
+- A shared `deploy` user with restricted sudo.
+
+## 5. Implementation plan
+
+1. Write `.github/workflows/rolling-update.yml` implementing §2.1.
+2. Document the secrets/variables setup in
+   `docs/deploy_via_tailscale_runbook.md` (new).
+3. Run once with `dry_run: true` on a feature branch to validate secrets
+   wiring without touching prod.
+4. Run once with `dry_run: false` targeting a single node (via the `nodes`
+   input) to prove the happy path.
+5. Cut over: archive the operator-local rolling flow, document the new one
+   as the canonical path.

From 6322748fe47e9cc59d39124bd56249cc9707acb1 Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Sat, 25 Apr 2026 00:10:01 +0900
Subject: [PATCH 02/12] fix(docs): align deploy-via-tailscale with script's
 NODES format
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- NODES_RAFT_MAP examples no longer include ":50051". scripts/rolling-update.sh
  auto-appends RAFT_PORT to each entry, so a port in NODES produces
  host:port:port when the script constructs --address (Gemini HIGH).
- OAuth client scope: design doc §2.3 said "devices:write" while the
  runbook §3 said "auth_keys (write)". Aligned both on "auth_keys (write)"
  which is what tailscale/github-action requires to mint the ephemeral
  join key. Added a note that newer action versions may additionally
  need "devices:core" (write) and to consult the action's README as
  the authoritative scope list (addresses the Gemini MEDIUM comment
  while hedging against action-version drift).
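
The doubled port is easy to reproduce. The sketch below is a
hypothetical stand-in for the script's address construction, not the
actual code in scripts/rolling-update.sh: `build_address` is an
illustrative name, and 50051 is the port from the old examples.

```shell
RAFT_PORT=50051

# Hypothetical helper: append RAFT_PORT to the host part of "raftId=host".
build_address() {
  printf '%s:%s\n' "${1#*=}" "$RAFT_PORT"
}

build_address "n1=kv01"         # prints kv01:50051
build_address "n1=kv01:50051"   # prints kv01:50051:50051 (the doubled port)
```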
---
 docs/deploy_via_tailscale_runbook.md                   |  7 +++++--
 .../design/2026_04_24_proposed_deploy_via_tailscale.md | 10 +++++++---
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index e1ddeb9d2..cdca043e3 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -50,7 +50,10 @@ deploy workflow only needs SSH.
 Admin console → Settings → OAuth clients → New client:
 
 - Description: `elastickv GitHub Actions deploy`
-- Scopes: `auth_keys` (write)
+- Scopes: `auth_keys` (write). Recent `tailscale/github-action` versions
+  may additionally require `devices:core` (write); enable that if the
+  join step fails with an authorization error. The action's README is
+  the definitive source for current scope requirements.
 - Tags: `tag:ci-deploy`
 
 Copy the client ID and secret; they go into GitHub in the next step.
@@ -83,7 +86,7 @@ Regenerate on operator rotation.
 |------|-------|---------|
 | `IMAGE_BASE`      | Container image path (no tag)     | `ghcr.io/bootjp/elastickv` |
 | `SSH_USER`        | SSH login on every node           | `bootjp` |
-| `NODES_RAFT_MAP`  | Comma-separated `raftId=host:port` | `n1=kv01:50051,n2=kv02:50051,n3=kv03:50051,n4=kv04:50051,n5=kv05:50051` |
+| `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`) | `n1=kv01,n2=kv02,n3=kv03,n4=kv04,n5=kv05` |
 | `SSH_TARGETS_MAP` | Comma-separated `raftId=ssh-host` | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
 
 ## 5. Running a deploy
diff --git a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
index 17610ab60..d69216d3f 100644
--- a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
+++ b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
@@ -79,8 +79,9 @@ Stored in a GitHub `production` environment (not repo-wide):
   entries. Prevents the first-connect TOFU prompt.
 
 **Variables (non-secret):**
-- `NODES_RAFT_MAP` — `n1=kv01:50051,n2=kv02:50051,...` (advertised addresses
-  as seen from inside the tailnet).
+- `NODES_RAFT_MAP` — `n1=kv01,n2=kv02,...` (advertised hostnames as seen
+  from inside the tailnet; the script appends `RAFT_PORT` automatically,
+  so do NOT include a port here).
 - `SSH_TARGETS_MAP` — `n1=kv01.tailnet.ts.net,...` (MagicDNS).
 - `IMAGE_BASE` — `ghcr.io/bootjp/elastickv` (tag is appended from the input).
 - `SSH_USER` — e.g., `bootjp`.
@@ -90,7 +91,10 @@ Stored in a GitHub `production` environment (not repo-wide):
 Use OAuth ephemeral nodes (not a long-lived auth key):
 
 - Create an OAuth client in Tailscale admin console with scope
-  `devices:write` on tag `tag:ci-deploy`.
+  `auth_keys` (write) on tag `tag:ci-deploy`. (`tailscale/github-action`
+  uses the OAuth client to mint a short-lived auth key on each run;
+  recent action versions may also require `devices:core` — consult the
+  action's README for the current scope list.)
 - Store client ID + secret in GitHub env secrets.
 - `tailscale/github-action@v3` joins the tailnet for the duration of the job
   as an ephemeral tagged node; disconnects automatically on job exit.

From ad00bdc0badd0e402be45a27712ec83b761abab2 Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Sat, 25 Apr 2026 00:34:08 +0900
Subject: [PATCH 03/12] fix(docs,workflow): address round-2
 deploy-via-tailscale review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- workflow: add `packages: read` to the workflow-level permissions so the
  `Verify image exists on ghcr.io` step's `docker manifest inspect`
  call works against private ghcr.io images (Codex P1).
- runbook §1: explain that `--ssh=false` disables Tailscale SSH and
  the workflow relies on the system sshd — operators who use
  Tailscale SSH elsewhere need to keep that in mind (Gemini Medium).
- runbook §4: change `ssh-keyscan` example + troubleshooting to
  `ssh-keyscan -H` so known_hosts entries are hashed and the secret
  does not leak tailnet topology in plaintext (Gemini Security
  Medium).
- runbook §4 variables: document that `NODES_RAFT_MAP` /
  `SSH_TARGETS_MAP` are workflow-side names the render step maps to
  the script's `NODES` / `SSH_TARGETS`; manual invocation from a
  workstation must use the script-side names (Gemini Medium).
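
For regenerating `DEPLOY_KNOWN_HOSTS`, the host list can be derived
from the `SSH_TARGETS_MAP` value instead of being typed by hand. This
is a sketch only: `targets_to_hosts` is a hypothetical helper, the
`example.ts.net` names are placeholders, and the trimming mirrors what
the workflow's reachability step does to each entry.

```shell
# Hypothetical helper: list the SSH hosts named in an SSH_TARGETS_MAP value,
# stripping the raftId, an optional :port suffix, and an optional user@ prefix.
targets_to_hosts() (
  IFS=','
  for e in $1; do
    host=${e##*=}
    host=${host%%:*}
    host=${host##*@}
    printf '%s\n' "$host"
  done
)

targets_to_hosts "n1=kv01.example.ts.net,n2=bootjp@kv02.example.ts.net:22"
# prints kv01.example.ts.net and kv02.example.ts.net, one per line

# Needs tailnet access, so it is left commented out here:
# ssh-keyscan -H $(targets_to_hosts "$SSH_TARGETS_MAP") > known_hosts
```

Run the `ssh-keyscan -H` line from a machine that is already on the
tailnet, and compare the resulting fingerprints against the nodes
before storing the output as the secret.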

Not addressed: Gemini HIGH claim that the workflow file is missing
(line 187) — it IS included at `.github/workflows/rolling-update.yml`
in this PR; the reviewer misread the file list.

Not addressed: Gemini HIGH re native --dry-run flag + zero-downtime
strategy (line 128) — dry-run is deliberately a workflow-level
input, not a script-level flag, so the script stays invokable from
a workstation without CI-specific options; zero-downtime cutover is
outside the scope of a CI wrapper and is tracked in the
resilience-roadmap follow-ups.
---
 .github/workflows/rolling-update.yml |  1 +
 docs/deploy_via_tailscale_runbook.md | 22 ++++++++++++++++++----
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
index 25a1c992f..1e691fde5 100644
--- a/.github/workflows/rolling-update.yml
+++ b/.github/workflows/rolling-update.yml
@@ -30,6 +30,7 @@ on:
 permissions:
   contents: read
   id-token: write   # required by tailscale/github-action OIDC flow
+  packages: read    # required by `docker manifest inspect` on ghcr.io private images
 
 concurrency:
   group: rolling-update
diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index cdca043e3..abdacece5 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -17,6 +17,13 @@ sudo tailscale up \
   --accept-routes=false
 ```
 
+`--ssh=false` disables Tailscale SSH, so the node's regular system
+sshd must be running and configured to accept connections on the
+tailnet interface. The workflow uses plain SSH over the tailnet
+(Tailscale is only the network layer); if you rely on Tailscale SSH
+for operator access elsewhere, drop this flag but keep in mind the
+workflow still connects to the system sshd.
+
 Verify the node is reachable by MagicDNS from another tailnet peer:
 
 ```
@@ -75,7 +82,7 @@ friction for previews).
 | `TS_OAUTH_CLIENT_ID`        | Tailscale OAuth client ID from step 3 |
 | `TS_OAUTH_SECRET`           | Tailscale OAuth secret from step 3 |
 | `DEPLOY_SSH_PRIVATE_KEY`    | OpenSSH private key, authorized on every node under the deploy user |
-| `DEPLOY_KNOWN_HOSTS`        | `ssh-keyscan kv01.<tailnet>.ts.net kv02.<tailnet>.ts.net …` output (one host per line) |
+| `DEPLOY_KNOWN_HOSTS`        | `ssh-keyscan -H kv01.<tailnet>.ts.net kv02.<tailnet>.ts.net …` output. Use `-H` to hash hostnames so the secret's contents don't leak the tailnet topology if the runner environment is compromised. |
 
 The SSH key should be ed25519, dedicated to CI (not a reused developer key).
 Regenerate on operator rotation.
@@ -86,8 +93,15 @@ Regenerate on operator rotation.
 |------|-------|---------|
 | `IMAGE_BASE`      | Container image path (no tag)     | `ghcr.io/bootjp/elastickv` |
 | `SSH_USER`        | SSH login on every node           | `bootjp` |
-| `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`) | `n1=kv01,n2=kv02,n3=kv03,n4=kv04,n5=kv05` |
-| `SSH_TARGETS_MAP` | Comma-separated `raftId=ssh-host` | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
+| `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`). The workflow renders this into the script's `NODES` env var. | `n1=kv01,n2=kv02,n3=kv03,n4=kv04,n5=kv05` |
+| `SSH_TARGETS_MAP` | Comma-separated `raftId=ssh-host`. The workflow renders this into the script's `SSH_TARGETS` env var. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
+
+**Why two names?** The workflow uses `NODES_RAFT_MAP` / `SSH_TARGETS_MAP`
+in the `production` environment to keep the GitHub-side names
+distinct from the script-side env var names it hands to
+`rolling-update.sh`. If you run the script by hand from a workstation
+you must export `NODES` and `SSH_TARGETS` directly — the workflow-side
+names are only understood by the workflow's render step.
 
 ## 5. Running a deploy
 
@@ -149,5 +163,5 @@ the tag is a moving tag (`latest`) that the verification step can't
 distinguish from stale. Specify an immutable tag.
 
 ### SSH `Host key verification failed`
-`DEPLOY_KNOWN_HOSTS` is stale. Re-run `ssh-keyscan` against every node and
+`DEPLOY_KNOWN_HOSTS` is stale. Re-run `ssh-keyscan -H` against every node and
 update the secret.

From 894bce93d94678429c8182ae1a276d9484a5225a Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Sat, 25 Apr 2026 00:54:23 +0900
Subject: [PATCH 04/12] fix(workflow,docs): address round-3
 deploy-via-tailscale review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- workflow nodes filter (Codex P2): reject any raft ID in the
  `nodes` input that does not appear in NODES_RAFT_MAP. Previously
  a typo like `n1,n9` silently rolled n1 only; now the workflow
  fails fast with a list of known IDs so the operator sees the
  typo before touching prod.
- runbook section 4 (Gemini Medium x2): GitHub's native environment
  protection rules cannot be made conditional on workflow inputs,
  so the previous "auto-approve dry-run" guidance was wrong.
  Documented the three workable options: accept the prompt for
  dry-runs too (v1 default), split into a second unprotected
  environment, or install a deployment-protection-rule GitHub App.
- runbook section 4 NODES_RAFT_MAP example (Gemini Medium): use full
  MagicDNS FQDNs instead of short hostnames so every node can
  resolve its peers regardless of local DNS search domains.
- runbook section 6 (Gemini Medium): added "If a running workflow is
  cancelled mid-rollout" recovery steps — find the in-flight node
  from logs, finish the recreate by hand, confirm leader, rerun
  scoped. Filed as a tracked gap to teach the workflow per-node
  start-markers in a follow-up.

Not addressed: Gemini HIGH line 187 claiming the workflow file is
missing — the file IS present at .github/workflows/rolling-update.yml
and has been since the first push of this PR. Third time the bot has
flagged this (same finding in rounds 1 and 2); leaving as-is since
responding further would just be repeating the same correction.
---
 .github/workflows/rolling-update.yml | 30 +++++++++++++---
 docs/deploy_via_tailscale_runbook.md | 51 ++++++++++++++++++++++++----
 2 files changed, 71 insertions(+), 10 deletions(-)

diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
index 1e691fde5..852840a1f 100644
--- a/.github/workflows/rolling-update.yml
+++ b/.github/workflows/rolling-update.yml
@@ -111,15 +111,37 @@ jobs:
           fi
           if [[ -n "$NODES_FILTER" ]]; then
             # Filter NODES_RAFT_MAP and SSH_TARGETS_MAP to the requested subset.
+            # Reject any filter ID that does not appear in the map: silently
+            # dropping unknown IDs would let a typo like "n1,n9" proceed as
+            # a one-node rollout of n1 alone, which is a staged-deploy
+            # footgun.
+            IFS=',' read -r -a wanted <<< "$NODES_FILTER"
+            IFS=',' read -r -a entries <<< "$NODES_RAFT_MAP"
+            declare -a known_ids=()
+            for e in "${entries[@]}"; do
+              known_ids+=("${e%%=*}")
+            done
+            unknown=""
+            for w in "${wanted[@]}"; do
+              found=0
+              for k in "${known_ids[@]}"; do
+                if [[ "$k" == "$w" ]]; then found=1; break; fi
+              done
+              if [[ $found -eq 0 ]]; then unknown+="${unknown:+, }$w"; fi
+            done
+            if [[ -n "$unknown" ]]; then
+              echo "::error::nodes filter '$NODES_FILTER' references unknown raft IDs: $unknown. Known IDs: ${known_ids[*]}"
+              exit 1
+            fi
             filter_csv() {
               local all="$1"
               local filter="$2"
               local out=""
-              IFS=',' read -r -a entries <<< "$all"
-              IFS=',' read -r -a wanted <<< "$filter"
-              for e in "${entries[@]}"; do
+              IFS=',' read -r -a list_entries <<< "$all"
+              IFS=',' read -r -a list_wanted <<< "$filter"
+              for e in "${list_entries[@]}"; do
                 key="${e%%=*}"
-                for w in "${wanted[@]}"; do
+                for w in "${list_wanted[@]}"; do
                   if [[ "$key" == "$w" ]]; then
                     out+="${e},"
                     break
diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index abdacece5..1a5fbefb5 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -70,10 +70,23 @@ Copy the client ID and secret; they go into GitHub in the next step.
 Repo → Settings → Environments → New environment: `production`.
 
 ### Required reviewers
-Configure "Required reviewers" on the environment. Non-dry-run deploys will
-pause until one of the reviewers approves. Configure "Deployment protection
-rules" to auto-approve if the workflow input `dry_run == true` (optional; cuts
-friction for previews).
+Configure "Required reviewers" on the environment. **Every run that targets
+this environment pauses for approval** — including dry-runs, because
+GitHub's native environment-protection rules cannot be made conditional on
+workflow inputs. Three ways to handle the dry-run-approval friction:
+
+1. **Accept the prompt for dry-runs too.** A dry-run requires one approver
+   click before it proceeds; still cheap and keeps the policy simple.
+2. **Add a second environment `production-dry-run` without required
+   reviewers** and change the workflow to pick the environment via
+   `environment: ${{ inputs.dry_run && 'production-dry-run' || 'production' }}`.
+   Cleanest but doubles the secrets/vars you must keep in sync.
+3. **Install a deployment-protection-rule GitHub App** (custom or
+   marketplace) that approves runs whose inputs show `dry_run == true`.
+   Most flexible; most setup.
+
+v1 ships with approach 1 (single environment, prompt on every run).
+Approach 2 is the recommended upgrade once the friction becomes annoying.
 
 ### Environment secrets
 
@@ -93,8 +106,8 @@ Regenerate on operator rotation.
 |------|-------|---------|
 | `IMAGE_BASE`      | Container image path (no tag)     | `ghcr.io/bootjp/elastickv` |
 | `SSH_USER`        | SSH login on every node           | `bootjp` |
-| `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`). The workflow renders this into the script's `NODES` env var. | `n1=kv01,n2=kv02,n3=kv03,n4=kv04,n5=kv05` |
-| `SSH_TARGETS_MAP` | Comma-separated `raftId=ssh-host`. The workflow renders this into the script's `SSH_TARGETS` env var. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
+| `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`). Use full MagicDNS FQDNs so every node can resolve the advertised address regardless of local DNS search domains. The workflow renders this into the script's `NODES` env var. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,n3=kv03.<tailnet>.ts.net,n4=kv04.<tailnet>.ts.net,n5=kv05.<tailnet>.ts.net` |
+| `SSH_TARGETS_MAP` | Comma-separated `raftId=ssh-host`. The workflow renders this into the script's `SSH_TARGETS` env var. Usually identical to `NODES_RAFT_MAP` unless SSH access uses a different hostname. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
 
 **Why two names?** The workflow uses `NODES_RAFT_MAP` / `SSH_TARGETS_MAP`
 in the `production` environment to keep the GitHub-side names
@@ -129,6 +142,32 @@ Recommended first-run sequence:
 Re-run the workflow with `image_tag` set to the previous-known-good sha. The
 `nodes` input can target specific nodes if only some carry the bad image.
 
+### If a running workflow is cancelled mid-rollout
+
+GitHub cancelling the job between node steps is the one operational
+hazard that needs manual cleanup.
+
+1. **Look at the last log line from the `Roll cluster` step.** The script
+   logs `[rolling-update] rolling n<id>: docker stop/rm/run ...` before
+   each node recreate. Whatever `n<id>` appears last is the one in
+   flight when the cancel signal landed.
+2. **SSH into that node** over Tailscale and run `docker ps`. If the
+   container is absent or `Exited`, finish the recreate by hand with the
+   docker run arguments the script emitted (which you can see in the
+   workflow log, step `Roll cluster`).
+3. **Confirm the new leader via `raftadmin` or metrics** before re-running
+   the workflow with `nodes:` scoped to the remaining untouched IDs. Do
+   NOT re-run the full rollout if the partial one is still in flight —
+   it will stop the same node you are trying to recover.
+4. **File a ticket** with the log excerpt so we can eventually teach the
+   workflow to set a start-marker on each node and fast-skip completed
+   nodes on re-run.
+
+The script is idempotent for the "container exists and is up" case, so
+re-running the workflow with the same `ref` after confirming the
+interrupted node is healthy is safe — the script will stop+recreate
+each node in turn regardless of whether it was touched before.
+
 ## 7. What the workflow does NOT do (yet)
 
 - **No post-deploy health verification beyond tailnet reachability.** The

From 165116bee650aa70072bec829ed20c1875d3e114 Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Sat, 25 Apr 2026 01:11:04 +0900
Subject: [PATCH 05/12] docs(deploy): round-4 deploy-via-tailscale review fixes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Gemini HIGH (design line 82): switch NODES_RAFT_MAP example to full
  MagicDNS FQDNs so it matches the runbook; bare hostnames resolve
  differently per node.
- Gemini Medium (design line 45): fix YAML — on/workflow_dispatch/inputs
  must be nested, not on a single line, and the fence is labelled yaml.
- Gemini Medium (runbook §3, design §2.3): retract devices:core — not a
  valid Tailscale OAuth scope; note devices:write as the standard one.
- Gemini Medium (runbook §6, lines 153-156): correct the cancelled-job
  log pattern to what the script actually emits (`==> [<id>@<host>]
  start`, scripts/rolling-update.sh:398), not the fictitious
  `[rolling-update] rolling n<id>: ...`.
- Gemini Medium (runbook §6, lines 156-160): clarify that docker run
  stdout/stderr is redirected to /dev/null, so operators reconstruct
  the invocation from the step-level env log, not from the docker-run
  argv.
- Codex P2 (runbook §8 approval troubleshooting): clarify that both
  dry-run and non-dry-run runs pause for approval in v1 because
  `environment: production` is unconditional; reference §4 for the
  second-environment upgrade path.
---
 docs/deploy_via_tailscale_runbook.md          | 38 +++++++++++++------
 ...026_04_24_proposed_deploy_via_tailscale.md | 33 +++++++++-------
 2 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index 1a5fbefb5..ca411ec05 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -58,9 +58,12 @@ Admin console → Settings → OAuth clients → New client:
 
 - Description: `elastickv GitHub Actions deploy`
 - Scopes: `auth_keys` (write). Recent `tailscale/github-action` versions
-  may additionally require `devices:core` (write); enable that if the
-  join step fails with an authorization error. The action's README is
-  the definitive source for current scope requirements.
+  may additionally require `devices:write` (to register and clean up
+  the ephemeral node); enable that if the join step fails with an
+  authorization error. The action's README is the definitive source
+  for current scope requirements. `devices:core` is NOT a valid
+  Tailscale OAuth scope — earlier drafts of this runbook named it and
+  would have produced an auth failure.
 - Tags: `tag:ci-deploy`
 
 Copy the client ID and secret; they go into GitHub in the next step.
@@ -147,14 +150,20 @@ Re-run the workflow with `image_tag` set to the previous-known-good sha. The
 GitHub cancelling the job between node steps is the one operational
 hazard that needs manual cleanup.
 
-1. **Look at the last log line from the `Roll cluster` step.** The script
-   logs `[rolling-update] rolling n<id>: docker stop/rm/run ...` before
-   each node recreate. Whatever `n<id>` appears last is the one in
+1. **Look at the last log line from the `Roll cluster` step.** The
+   script emits `==> [<raft-id>@<host>] start` at the beginning of
+   each per-node recreate (see `scripts/rolling-update.sh:398`).
+   Whichever `<raft-id>` appears in the last such line is the one in
    flight when the cancel signal landed.
 2. **SSH into that node** over Tailscale and run `docker ps`. If the
-   container is absent or `Exited`, finish the recreate by hand with the
-   docker run arguments the script emitted (which you can see in the
-   workflow log, step `Roll cluster`).
+   container is absent or `Exited`, finish the recreate by hand. The
+   `docker run` invocation itself is redirected to `/dev/null` by the
+   script, so the workflow log does NOT contain the full argv. Use
+   the resolved env instead: the step logs `NODES_RAFT_MAP`,
+   `EXTRA_ENV`, `GOMEMLIMIT`, `CONTAINER_MEMORY_LIMIT`, `IMAGE`, and
+   `DATA_DIR` before invoking the script — those are sufficient to
+   reconstruct the same `docker run` you would see if you re-ran with
+   the same inputs.
 3. **Confirm the new leader via `raftadmin` or metrics** before re-running
    the workflow with `nodes:` scoped to the remaining untouched IDs. Do
    NOT re-run the full rollout if the partial one is still in flight —
@@ -185,9 +194,14 @@ each node in turn regardless of whether it was touched before.
 ## 8. Troubleshooting
 
 ### Job pauses indefinitely at "Waiting for approval"
-Expected for non-dry-run deploys — a reviewer from the `production` environment
-must click Approve. Check the "Required reviewers" list in the environment
-settings.
+Expected for **every** run in v1 — `.github/workflows/rolling-update.yml`
+sets `environment: production` unconditionally, so both dry-run and
+non-dry-run executions pause for approval. A reviewer from the
+`production` environment must click Approve. Check the "Required
+reviewers" list in the environment settings. See §4 "GitHub
+environment" for the dry-run-approval alternatives (approach 2: add a
+second `production-dry-run` environment without required reviewers)
+if the friction becomes intolerable.
 
 ### `tailscale ping` fails for a node
 The node may not be running `tailscaled`, not tagged `tag:elastickv-node`, or
diff --git a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
index d69216d3f..7e65d334d 100644
--- a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
+++ b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
@@ -40,14 +40,15 @@ logged in on every node, with SSH access enabled over the tailnet.
 
 ### 2.1 Workflow shape
 
-```
+```yaml
 name: Rolling update
-on: workflow_dispatch:
-  inputs:
-    ref:           # git sha/tag of the image to deploy
-    image_tag:     # defaults to $ref; override only for rollbacks
-    nodes:         # subset of raft IDs; empty = full roll
-    dry_run:       # bool, default TRUE — renders plan but doesn't roll
+on:
+  workflow_dispatch:
+    inputs:
+      ref:           # git sha/tag of the image to deploy
+      image_tag:     # defaults to $ref; override only for rollbacks
+      nodes:         # subset of raft IDs; empty = full roll
+      dry_run:       # bool, default TRUE — renders plan but doesn't roll
 
 jobs:
   deploy:
@@ -79,10 +80,13 @@ Stored in a GitHub `production` environment (not repo-wide):
   entries. Prevents the first-connect TOFU prompt.
 
 **Variables (non-secret):**
-- `NODES_RAFT_MAP` — `n1=kv01,n2=kv02,...` (advertised hostnames as seen
-  from inside the tailnet; the script appends `RAFT_PORT` automatically,
-  so do NOT include a port here).
-- `SSH_TARGETS_MAP` — `n1=kv01.tailnet.ts.net,...` (MagicDNS).
+- `NODES_RAFT_MAP` — `n1=kv01.tailnet.ts.net,n2=kv02.tailnet.ts.net,...`
+  (full MagicDNS FQDNs; bare short names can resolve differently
+  depending on each node's search-domain configuration). The script
+  appends `RAFT_PORT` automatically, so do NOT include a port here.
+  The runbook (`docs/deploy_via_tailscale_runbook.md`) carries the
+  same FQDN convention; keep the two in sync if either changes.
+- `SSH_TARGETS_MAP` — `n1=kv01.tailnet.ts.net,...` (MagicDNS FQDN).
 - `IMAGE_BASE` — `ghcr.io/bootjp/elastickv` (tag is appended from the input).
 - `SSH_USER` — e.g., `bootjp`.
 
@@ -93,8 +97,11 @@ Use OAuth ephemeral nodes (not a long-lived auth key):
 - Create an OAuth client in Tailscale admin console with scope
   `auth_keys` (write) on tag `tag:ci-deploy`. (`tailscale/github-action`
   uses the OAuth client to mint a short-lived auth key on each run;
-  recent action versions may also require `devices:core` — consult the
-  action's README for the current scope list.)
+  recent action versions may also require `devices:write` so the
+  ephemeral node can register and be cleaned up — consult the action's
+  README for the current scope list. Earlier drafts of this doc named
+  `devices:core`, which is not a supported Tailscale OAuth scope and
+  would fail authentication.)
 - Store client ID + secret in GitHub env secrets.
 - `tailscale/github-action@v3` joins the tailnet for the duration of the job
   as an ephemeral tagged node; disconnects automatically on job exit.

From 24383b2db1460fb999d6c50efb91738ca18cd896 Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Sat, 25 Apr 2026 01:33:06 +0900
Subject: [PATCH 06/12] docs(deploy): round-5 deploy-via-tailscale review fixes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Gemini Medium (design §2.6 line 146): the design doc contradicted
  the runbook by claiming dry-runs do NOT need approval. GitHub's
  environment-protection rules cannot be made conditional on workflow
  inputs, so `environment: production` pauses BOTH dry-run and
  non-dry-run executions in v1. Aligned the design-doc wording with
  the runbook and cross-referenced §4 for the second-environment
  upgrade path.
- Gemini Medium (runbook §6 line 164 env list): the list of
  reconstruction vars was incomplete. Listed every env var the
  workflow actually exports (IMAGE, DATA_DIR, RAFT_PORT, REDIS_PORT,
  S3_PORT, ENABLE_S3, NODES, SSH_TARGETS, EXTRA_ENV) and called out
  the script-level defaults for anything not overridden, plus noted
  GOMEMLIMIT / CONTAINER_MEMORY_LIMIT are propagated via EXTRA_ENV
  once PR #617 lands.
- Gemini Medium (runbook §6 line 178 idempotency): corrected the
  "stop+recreate every node regardless" claim. The script
  (scripts/rolling-update.sh:794-798) skips nodes whose running
  image id matches the target AND whose gRPC endpoint is healthy,
  so re-running after a partial roll is safe because already-rolled
  nodes are no-ops, not stops.
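
A minimal sketch of the skip decision described in the last bullet,
for readers without the script open. The function names
(`should_skip_node`, `plan_rollout`) and the "raftId imageId healthy"
input format are hypothetical and are not taken from
scripts/rolling-update.sh; only the rule itself (same image id AND
healthy gRPC endpoint means no-op) comes from the script.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not the script's real function.
# should_skip_node RUNNING_IMAGE_ID TARGET_IMAGE_ID GRPC_HEALTHY
# Returns 0 (skip) only when the node already runs the target image
# AND its gRPC endpoint was reported healthy ("1"); otherwise returns
# non-zero, meaning the node should be rolled.
should_skip_node() {
  local running_id="$1" target_id="$2" grpc_healthy="$3"
  [ "$running_id" = "$target_id" ] && [ "$grpc_healthy" = "1" ]
}

# plan_rollout TARGET_IMAGE_ID
# Reads assumed "raftId runningImageId healthy" lines on stdin and
# prints "<raftId> skip" or "<raftId> roll" for each one.
plan_rollout() {
  local target_id="$1" id running healthy
  while read -r id running healthy; do
    if should_skip_node "$running" "$target_id" "$healthy"; then
      echo "$id skip"
    else
      echo "$id roll"
    fi
  done
}

# Example: n1 already matches and is healthy, n2 is stale, n3 matches
# the image but is unhealthy, so only n1 is passed over.
printf 'n1 sha256:new 1\nn2 sha256:old 1\nn3 sha256:new 0\n' |
  plan_rollout sha256:new
```

Under these assumptions a re-run after a partial roll touches only
the nodes that are stale or unhealthy, which is the property the
corrected runbook text relies on.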

Declining again Gemini HIGH "workflow file missing" — the file IS
in this PR at .github/workflows/rolling-update.yml; this is the
fourth round the bot has flagged its own misread. See prior rounds
for rationale; no change.
---
 docs/deploy_via_tailscale_runbook.md          | 30 ++++++++++++-------
 ...026_04_24_proposed_deploy_via_tailscale.md | 18 +++++++----
 2 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index ca411ec05..82e6f0511 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -158,12 +158,19 @@ hazard that needs manual cleanup.
 2. **SSH into that node** over Tailscale and run `docker ps`. If the
    container is absent or `Exited`, finish the recreate by hand. The
    `docker run` invocation itself is redirected to `/dev/null` by the
-   script, so the workflow log does NOT contain the full argv. Use
-   the resolved env instead: the step logs `NODES_RAFT_MAP`,
-   `EXTRA_ENV`, `GOMEMLIMIT`, `CONTAINER_MEMORY_LIMIT`, `IMAGE`, and
-   `DATA_DIR` before invoking the script — those are sufficient to
-   reconstruct the same `docker run` you would see if you re-ran with
-   the same inputs.
+   script, so the workflow log does NOT contain the full argv. To
+   reconstruct it, read the `Roll cluster` step's rendered
+   environment — the workflow exports `IMAGE`, `DATA_DIR`,
+   `RAFT_PORT`, `REDIS_PORT`, `S3_PORT`, `ENABLE_S3`, `NODES`,
+   `SSH_TARGETS`, and the merged `EXTRA_ENV` before invoking the
+   script. Anything not explicitly set (e.g., `RAFT_PORT` in a
+   minimally-overridden deploy) falls back to the script's default
+   (`RAFT_PORT=50051`, `REDIS_PORT=6379`, `S3_PORT=9000`,
+   `ENABLE_S3=true`). GOMEMLIMIT / CONTAINER_MEMORY_LIMIT (PR #617)
+   are propagated via `EXTRA_ENV` once that PR lands. Together the
+   rendered env + the node's `deploy.env` is enough to reconstruct
+   the same `docker run` you would see if you re-ran with the same
+   inputs.
 3. **Confirm the new leader via `raftadmin` or metrics** before re-running
    the workflow with `nodes:` scoped to the remaining untouched IDs. Do
    NOT re-run the full rollout if the partial one is still in flight —
@@ -172,10 +179,13 @@ hazard that needs manual cleanup.
    workflow to set a start-marker on each node and fast-skip completed
    nodes on re-run.
 
-The script is idempotent for the "container exists and is up" case, so
-re-running the workflow with the same `ref` after confirming the
-interrupted node is healthy is safe — the script will stop+recreate
-each node in turn regardless of whether it was touched before.
+The script is idempotent. `scripts/rolling-update.sh:794-798` skips a
+node when its running image id equals the target image and its gRPC
+endpoint is healthy — an already-rolled node is a no-op, not a
+redundant stop/recreate. Re-running the workflow with the same
+`ref` after confirming the interrupted node is healthy is therefore
+safe: nodes that already match the target image are passed over,
+and only the still-stale one gets recreated.
 
 ## 7. What the workflow does NOT do (yet)
 
diff --git a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
index 7e65d334d..2eb0168f2 100644
--- a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
+++ b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
@@ -141,15 +141,21 @@ unreachable over the tailnet) before touching any live container.
 
 ### 2.6 Production environment approval
 
-Mark the `production` GitHub environment as requiring approval from a list of
-reviewers. A non-dry-run deploy will pause until approved; the dry-run run
-itself does not need approval (it only needs the tailnet join).
+Mark the `production` GitHub environment as requiring approval from a list
+of reviewers. GitHub's native environment-protection rules do NOT support
+conditioning approval on workflow inputs, so **both** dry-run and non-
+dry-run runs will pause for approval when `environment: production` is
+declared unconditionally on the job. That is the v1 policy — simpler,
+one environment, one approver list; see runbook §4 for the dry-run-
+approval alternatives (a second `production-dry-run` environment without
+required reviewers, or a deployment-protection-rule GitHub App).
 
 Alternative: require approval unconditionally and treat the dry-run as a
-"preview" that an approver must ack. Simpler policy, slightly more friction.
+"preview" that an approver must ack. This is the v1 shape by default.
 
-**Recommendation:** approval required for non-dry-run only. Dry-runs are
-cheap and useful.
+**Recommendation:** approval required for every run in v1 (one
+environment). Add the second environment only when the dry-run friction
+becomes annoying.
 
 ### 2.7 Rollback
 

From 3ddcd42e71468a7d9fa1f6cb4e5e5507f2005bf5 Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Sat, 25 Apr 2026 01:53:21 +0900
Subject: [PATCH 07/12] fix(workflow): validate inputs.ref is default branch or
 tag

Codex P1: the workflow hands the checked-out tree a Tailscale
OAuth secret and an SSH key, then executes scripts/rolling-update.sh
from that tree. Anyone who can dispatch runs could previously point
inputs.ref at a branch containing a malicious script modification
and exfiltrate the secrets.

Mitigations:
- New 'Validate ref is default branch or a tag' step rejects any
  ref that is not the default branch (by name or HEAD sha) or an
  existing tag. A sha reachable from elsewhere is still accepted
  (the subsequent checkout does its own verification) but non-
  default branches fail closed with an operator-visible error.
- actions/checkout now sets persist-credentials: false so the
  GITHUB_TOKEN is not left in the runner's git config for the
  deploy script to harvest. The token is still passed explicitly
  via env: to the steps that call the GitHub API or ghcr (the new
  validation step and the ghcr verification step).
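
The decision table in the first mitigation can be sketched as a pure
function over `git ls-remote`-style output. `classify_ref` and its
result strings are hypothetical names, not what the workflow step
uses; the sketch assumes each input line is `<sha><TAB><refname>`,
which is the format `git ls-remote` prints.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow.
# classify_ref REF DEFAULT_BRANCH
# Reads "<sha>\t<refname>" lines on stdin and prints one of:
#   default         REF is the default branch name or its tip sha
#   tag             REF names an existing tag
#   branch          REF names a non-default branch (caller should reject)
#   sha-or-unknown  REF matched no ref name; it may be a bare sha
classify_ref() {
  local ref="$1" default_branch="$2" sha name
  local result="sha-or-unknown"
  while read -r sha name; do
    case "$name" in
      "refs/heads/$default_branch")
        if [ "$ref" = "$default_branch" ] || [ "$ref" = "$sha" ]; then
          result="default"
          break
        fi
        ;;
      "refs/tags/$ref")
        result="tag"
        break
        ;;
      "refs/heads/$ref")
        # Keep scanning: a tag with the same name still wins.
        result="branch"
        ;;
    esac
  done
  echo "$result"
}

# Example with made-up refs: a non-default branch name is classified
# as "branch", which a caller would turn into a hard failure.
printf 'aaa111\trefs/heads/main\nbbb222\trefs/heads/feature-x\n' |
  classify_ref feature-x main
```

A caller would fail the job on `branch`. `sha-or-unknown` only means
the value is neither a ref name nor the default-branch tip; proving
that a sha sits on the default branch would need an ancestry check in
a full clone, for example `git merge-base --is-ancestor`.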
---
 .github/workflows/rolling-update.yml | 34 ++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
index 852840a1f..626044a80 100644
--- a/.github/workflows/rolling-update.yml
+++ b/.github/workflows/rolling-update.yml
@@ -48,10 +48,44 @@ jobs:
     timeout-minutes: 60
 
     steps:
+      # The deploy script (scripts/rolling-update.sh) is executed from the
+      # checkout below, after the tailnet join and SSH key load. If `ref`
+      # were unvalidated, anyone with workflow_dispatch permission could
+      # point it at a fork commit containing a modified script that
+      # harvests the SSH key / Tailscale OAuth secret. Validate that
+      # `ref` resolves to (a) the repository's default branch, or (b) a
+      # tag on the repo, before we hand it any secret. Branches other
+      # than the default are rejected so review-gated default is the only
+      # entry point besides immutable tags.
+      - name: Validate ref is default branch or a tag
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          REF: ${{ inputs.ref }}
+        run: |
+          set -euo pipefail
+          default_branch=$(gh api "repos/${{ github.repository }}" --jq '.default_branch')
+          default_sha=$(gh api "repos/${{ github.repository }}/commits/$default_branch" --jq '.sha')
+          if [[ "$REF" == "$default_branch" || "$REF" == "$default_sha" ]]; then
+            echo "ref is the default branch ($default_branch / $default_sha)"
+            exit 0
+          fi
+          if gh api "repos/${{ github.repository }}/git/refs/tags/$REF" >/dev/null 2>&1; then
+            echo "ref is a tag"
+            exit 0
+          fi
+          # Also accept a sha that is reachable from the default branch's HEAD
+          # so historical default-branch commits remain deployable for rollback.
+          if git -c "http.https://github.com/.extraheader=" ls-remote "https://github.com/${{ github.repository }}.git" | grep -q "^$REF"; then
+            echo "::error::ref '$REF' is not the default branch or a tag. Branches other than '$default_branch' are disallowed to prevent arbitrary-code execution with production secrets."
+            exit 1
+          fi
+          echo "ref '$REF' treated as a sha; checkout will fail if it is not reachable."
+
       - name: Checkout
         uses: actions/checkout@v6
         with:
           ref: ${{ inputs.ref }}
+          persist-credentials: false
 
       - name: Install jq
         run: sudo apt-get install -y --no-install-recommends jq

From b2598fc8e5c09e92f78544844756f31aaaffbadb Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Wed, 24 Jun 2026 02:51:57 +0900
Subject: [PATCH 08/12] ops: harden tailscale rollout workflow

---
 .github/workflows/rolling-update.yml          | 215 ++++++++++++------
 docs/deploy_via_tailscale_runbook.md          |  15 +-
 ...026_04_24_proposed_deploy_via_tailscale.md |  46 +++-
 scripts/rolling-update.sh                     |  62 ++++-
 4 files changed, 246 insertions(+), 92 deletions(-)

diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
index 626044a80..4d94aaf63 100644
--- a/.github/workflows/rolling-update.yml
+++ b/.github/workflows/rolling-update.yml
@@ -8,7 +8,7 @@ on:
   workflow_dispatch:
     inputs:
       ref:
-        description: Git ref (tag or sha) to deploy. Also used as the image tag unless image_tag is set.
+        description: Image tag/ref to deploy. Workflow code is always checked out from the repository default branch.
         required: true
         type: string
       image_tag:
@@ -48,43 +48,24 @@ jobs:
     timeout-minutes: 60
 
     steps:
-      # The deploy script (scripts/rolling-update.sh) is executed from the
-      # checkout below, after the tailnet join and SSH key load. If `ref`
-      # were unvalidated, anyone with workflow_dispatch permission could
-      # point it at a fork commit containing a modified script that
-      # harvests the SSH key / Tailscale OAuth secret. Validate that
-      # `ref` resolves to (a) the repository's default branch, or (b) a
-      # tag on the repo, before we hand it any secret. Branches other
-      # than the default are rejected so review-gated default is the only
-      # entry point besides immutable tags.
-      - name: Validate ref is default branch or a tag
+      # The deploy script is executed after the tailnet join and SSH key load.
+      # Always take that script from the review-gated default branch; the
+      # workflow input only selects the image tag/ref to deploy.
+      - name: Resolve trusted checkout ref
+        id: trusted-ref
         env:
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           REF: ${{ inputs.ref }}
         run: |
           set -euo pipefail
           default_branch=$(gh api "repos/${{ github.repository }}" --jq '.default_branch')
-          default_sha=$(gh api "repos/${{ github.repository }}/commits/$default_branch" --jq '.sha')
-          if [[ "$REF" == "$default_branch" || "$REF" == "$default_sha" ]]; then
-            echo "ref is the default branch ($default_branch / $default_sha)"
-            exit 0
-          fi
-          if gh api "repos/${{ github.repository }}/git/refs/tags/$REF" >/dev/null 2>&1; then
-            echo "ref is a tag"
-            exit 0
-          fi
-          # Also accept a sha that is reachable from the default branch's HEAD
-          # so historical default-branch commits remain deployable for rollback.
-          if git -c "http.https://github.com/.extraheader=" ls-remote "https://github.com/${{ github.repository }}.git" | grep -q "^$REF"; then
-            echo "::error::ref '$REF' is not the default branch or a tag. Branches other than '$default_branch' are disallowed to prevent arbitrary-code execution with production secrets."
-            exit 1
-          fi
-          echo "ref '$REF' treated as a sha; checkout will fail if it is not reachable."
+          echo "checkout_ref=$default_branch" >> "$GITHUB_OUTPUT"
+          echo "deploy ref/image tag: $REF"
 
-      - name: Checkout
+      - name: Checkout trusted deploy script
         uses: actions/checkout@v6
         with:
-          ref: ${{ inputs.ref }}
+          ref: ${{ steps.trusted-ref.outputs.checkout_ref }}
           persist-credentials: false
 
       - name: Install jq
@@ -139,51 +120,147 @@ jobs:
           NODES_FILTER: ${{ inputs.nodes }}
         run: |
           set -euo pipefail
-          if [[ -z "$NODES_RAFT_MAP" || -z "$SSH_TARGETS_MAP" ]]; then
-            echo "::error::NODES_RAFT_MAP or SSH_TARGETS_MAP is not set in the production environment variables"
+          if [[ -z "$NODES_RAFT_MAP" ]]; then
+            echo "::error::NODES_RAFT_MAP is not set in the production environment variables"
+            exit 1
+          fi
+
+          normalize_csv_map() {
+            local all="$1"
+            local out=""
+            local e key value
+            if [[ -z "$all" ]]; then
+              printf '%s' ""
+              return 0
+            fi
+            IFS=',' read -r -a entries <<< "$all"
+            for e in "${entries[@]}"; do
+              e="${e//[[:space:]]/}"
+              [[ -n "$e" ]] || continue
+              if [[ "$e" != *=* ]]; then
+                echo "::error::invalid map entry '$e' (expected raftId=value)" >&2
+                exit 1
+              fi
+              key="${e%%=*}"
+              value="${e#*=}"
+              if [[ -z "$key" || -z "$value" ]]; then
+                echo "::error::invalid map entry '$e' (empty raft ID or value)" >&2
+                exit 1
+              fi
+              out+="${out:+,}${key}=${value}"
+            done
+            printf '%s' "$out"
+          }
+
+          lookup_map() {
+            local key="$1"
+            local all="$2"
+            local e entry_key entry_value
+            [[ -n "$all" ]] || return 1
+            IFS=',' read -r -a entries <<< "$all"
+            for e in "${entries[@]}"; do
+              e="${e//[[:space:]]/}"
+              [[ -n "$e" ]] || continue
+              entry_key="${e%%=*}"
+              entry_value="${e#*=}"
+              if [[ "$entry_key" == "$key" ]]; then
+                printf '%s' "$entry_value"
+                return 0
+              fi
+            done
+            return 1
+          }
+
+          filter_csv() {
+            local all="$1"
+            local filter="$2"
+            local out=""
+            local e key w
+            if [[ -z "$all" ]]; then
+              printf '%s' ""
+              return 0
+            fi
+            IFS=',' read -r -a list_entries <<< "$all"
+            IFS=',' read -r -a list_wanted <<< "$filter"
+            for e in "${list_entries[@]}"; do
+              e="${e//[[:space:]]/}"
+              [[ -n "$e" ]] || continue
+              key="${e%%=*}"
+              for w in "${list_wanted[@]}"; do
+                w="${w//[[:space:]]/}"
+                if [[ "$key" == "$w" ]]; then
+                  out+="${out:+,}$e"
+                  break
+                fi
+              done
+            done
+            printf '%s' "$out"
+          }
+
+          known_ids_csv() {
+            local all="$1"
+            local out=""
+            local e key
+            IFS=',' read -r -a entries <<< "$all"
+            for e in "${entries[@]}"; do
+              e="${e//[[:space:]]/}"
+              [[ -n "$e" ]] || continue
+              key="${e%%=*}"
+              out+="${out:+,}$key"
+            done
+            printf '%s' "$out"
+          }
+
+          materialize_ssh_targets() {
+            local nodes="$1"
+            local ssh_targets="$2"
+            local out=""
+            local e key host target
+            if [[ -z "$nodes" ]]; then
+              printf '%s' ""
+              return 0
+            fi
+            IFS=',' read -r -a entries <<< "$nodes"
+            for e in "${entries[@]}"; do
+              e="${e//[[:space:]]/}"
+              [[ -n "$e" ]] || continue
+              key="${e%%=*}"
+              host="${e#*=}"
+              target="$(lookup_map "$key" "$ssh_targets" || true)"
+              if [[ -z "$target" ]]; then
+                target="$host"
+              fi
+              out+="${out:+,}${key}=${target}"
+            done
+            printf '%s' "$out"
+          }
+
+          NODES_RAFT_MAP="$(normalize_csv_map "$NODES_RAFT_MAP")"
+          SSH_TARGETS_MAP="$(normalize_csv_map "$SSH_TARGETS_MAP")"
+          if [[ -z "$NODES_RAFT_MAP" ]]; then
+            echo "::error::NODES_RAFT_MAP did not contain any nodes"
             exit 1
           fi
+          NODES_FILTER="${NODES_FILTER//[[:space:]]/}"
+
           if [[ -n "$NODES_FILTER" ]]; then
             # Filter NODES_RAFT_MAP and SSH_TARGETS_MAP to the requested subset.
             # Reject any filter ID that does not appear in the map: silently
             # dropping unknown IDs would let a typo like "n1,n9" proceed as
             # a one-node rollout of n1 alone, which is a staged-deploy
             # footgun.
-            IFS=',' read -r -a wanted <<< "$NODES_FILTER"
-            IFS=',' read -r -a entries <<< "$NODES_RAFT_MAP"
-            declare -a known_ids=()
-            for e in "${entries[@]}"; do
-              known_ids+=("${e%%=*}")
-            done
             unknown=""
+            IFS=',' read -r -a wanted <<< "$NODES_FILTER"
             for w in "${wanted[@]}"; do
-              found=0
-              for k in "${known_ids[@]}"; do
-                if [[ "$k" == "$w" ]]; then found=1; break; fi
-              done
-              if [[ $found -eq 0 ]]; then unknown+="${unknown:+, }$w"; fi
+              [[ -n "$w" ]] || continue
+              if ! lookup_map "$w" "$NODES_RAFT_MAP" >/dev/null; then
+                unknown+="${unknown:+, }$w"
+              fi
             done
             if [[ -n "$unknown" ]]; then
-              echo "::error::nodes filter '$NODES_FILTER' references unknown raft IDs: $unknown. Known IDs: ${known_ids[*]}"
+              echo "::error::nodes filter '$NODES_FILTER' references unknown raft IDs: $unknown. Known IDs: $(known_ids_csv "$NODES_RAFT_MAP")"
               exit 1
             fi
-            filter_csv() {
-              local all="$1"
-              local filter="$2"
-              local out=""
-              IFS=',' read -r -a list_entries <<< "$all"
-              IFS=',' read -r -a list_wanted <<< "$filter"
-              for e in "${list_entries[@]}"; do
-                key="${e%%=*}"
-                for w in "${list_wanted[@]}"; do
-                  if [[ "$key" == "$w" ]]; then
-                    out+="${e},"
-                    break
-                  fi
-                done
-              done
-              echo "${out%,}"
-            }
             NODES_RAFT_MAP="$(filter_csv "$NODES_RAFT_MAP" "$NODES_FILTER")"
             SSH_TARGETS_MAP="$(filter_csv "$SSH_TARGETS_MAP" "$NODES_FILTER")"
             if [[ -z "$NODES_RAFT_MAP" ]]; then
@@ -191,6 +268,7 @@ jobs:
               exit 1
             fi
           fi
+          SSH_TARGETS_MAP="$(materialize_ssh_targets "$NODES_RAFT_MAP" "$SSH_TARGETS_MAP")"
           {
             echo "NODES=$NODES_RAFT_MAP"
             echo "SSH_TARGETS=$SSH_TARGETS_MAP"
@@ -228,20 +306,15 @@ jobs:
         env:
           NODES: ${{ steps.render.outputs.NODES }}
           SSH_TARGETS: ${{ steps.render.outputs.SSH_TARGETS }}
-          IMAGE_BASE: ${{ vars.IMAGE_BASE }}
-          IMAGE_TAG: ${{ inputs.image_tag || inputs.ref }}
+          IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
           SSH_USER: ${{ vars.SSH_USER }}
+          DRY_RUN: "true"
+          REF: ${{ inputs.ref }}
         run: |
           set -euo pipefail
-          cat <<EOF
-          ==== DRY RUN — no containers were touched ====
-          image:       ${IMAGE_BASE}:${IMAGE_TAG}
-          SSH user:    ${SSH_USER}
-          NODES:       ${NODES}
-          SSH_TARGETS: ${SSH_TARGETS}
-          ref:         ${{ inputs.ref }}
-          Re-run with dry_run=false to apply.
-          EOF
+          ./scripts/rolling-update.sh --dry-run
+          echo "ref: $REF"
+          echo "Re-run with dry_run=false to apply."
 
       - name: Roll cluster
         if: ${{ !inputs.dry_run }}
diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index 82e6f0511..3c27b78b8 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -110,7 +110,7 @@ Regenerate on operator rotation.
 | `IMAGE_BASE`      | Container image path (no tag)     | `ghcr.io/bootjp/elastickv` |
 | `SSH_USER`        | SSH login on every node           | `bootjp` |
 | `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`). Use full MagicDNS FQDNs so every node can resolve the advertised address regardless of local DNS search domains. The workflow renders this into the script's `NODES` env var. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,n3=kv03.<tailnet>.ts.net,n4=kv04.<tailnet>.ts.net,n5=kv05.<tailnet>.ts.net` |
-| `SSH_TARGETS_MAP` | Comma-separated `raftId=ssh-host`. The workflow renders this into the script's `SSH_TARGETS` env var. Usually identical to `NODES_RAFT_MAP` unless SSH access uses a different hostname. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
+| `SSH_TARGETS_MAP` | Optional comma-separated `raftId=ssh-host`. The workflow renders this into the script's `SSH_TARGETS` env var. Usually identical to `NODES_RAFT_MAP` unless SSH access uses a different hostname. If the variable is empty or an ID is omitted, the workflow falls back to that ID's `NODES_RAFT_MAP` host so reachability checks still cover every rollout node. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
 
 **Why two names?** The workflow uses `NODES_RAFT_MAP` / `SSH_TARGETS_MAP`
 in the `production` environment to keep the GitHub-side names
@@ -125,12 +125,14 @@ Actions tab → "Rolling update" → Run workflow.
 
 Inputs:
 
-- `ref` — the git tag or sha to deploy (also used as the container image tag)
+- `ref` — the image tag/ref to deploy. The workflow code itself is always
+  checked out from the repository default branch.
 - `image_tag` — override only for rollbacks (e.g., deploy tag `v1.2.3` of a
   commit that was also `v1.2.3`)
 - `nodes` — subset of raft IDs, e.g., `n1,n2`. Empty rolls all nodes.
-- `dry_run` — default `true`. Renders the plan and checks reachability without
-  touching containers.
+- `dry_run` — default `true`. Checks reachability and runs
+  `./scripts/rolling-update.sh --dry-run` with the rendered environment,
+  without touching containers.
 
 Recommended first-run sequence:
 
@@ -152,9 +154,8 @@ hazard that needs manual cleanup.
 
 1. **Look at the last log line from the `Roll cluster` step.** The
    script emits `==> [<raft-id>@<host>] start` at the beginning of
-   each per-node recreate (see `scripts/rolling-update.sh:398`).
-   Whichever `<raft-id>` appears in the last such line is the one in
-   flight when the cancel signal landed.
+   each per-node recreate. Whichever `<raft-id>` appears in the last
+   such line is the one in flight when the cancel signal landed.
 2. **SSH into that node** over Tailscale and run `docker ps`. If the
    container is absent or `Exited`, finish the recreate by hand. The
    `docker run` invocation itself is redirected to `/dev/null` by the
diff --git a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
index 2eb0168f2..be9e3492d 100644
--- a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
+++ b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
@@ -131,9 +131,11 @@ With `dry_run: true` (the default):
 
 - Everything up to script invocation runs (checkout, tailnet join, SSH agent
   load, `NODES`/`SSH_TARGETS` render).
-- The script is invoked with `--help` + the rendered env is printed as a
-  collapsed log group.
 - `tailscale ping` is run against each SSH target to confirm reachability.
+- The script is invoked as `DRY_RUN=true ./scripts/rolling-update.sh --dry-run`
+  with the same rendered env the live rollout would receive. It validates the
+  node maps, rollout order, derived service maps, image name, and per-node SSH
+  targets, then prints the plan.
 - The actual `docker stop/rm/run` loop does NOT execute.
 
 This catches the common failure modes (bad secret, bad env mapping, a node
@@ -163,10 +165,35 @@ Rolling back uses the same workflow with `image_tag: <previous-sha>`. The
 script already supports the rollout order env var (`ROLLING_ORDER`) so an
 operator can force-roll only the affected nodes.
 
-**Gap:** there is no "stop mid-rollout" control today. If the workflow is
-cancelled via GitHub UI during a roll, the in-flight node may be mid-recreate.
-`rolling-update.sh` is supposed to be idempotent and crash-safe, but this
-should be verified before we call the workflow production-ready.
+The workflow sets `cancel-in-progress: false`, so a newer run cannot
+automatically interrupt an active rollout. A human can still cancel the job in
+the GitHub UI, and cancellation can land while one node is between `docker rm`
+and the replacement container becoming healthy. The runbook therefore treats
+mid-rollout cancellation as a manual recovery case:
+
+- identify the last `==> [<raft-id>@<host>] start` line in the workflow log,
+- inspect that node over Tailscale and restore/finish the container if needed,
+- run a dry-run with the same image tag after the node is healthy,
+- re-run the workflow against the remaining stale node IDs, or against the full
+  cluster once the interrupted node is confirmed healthy.
+
+The script is intentionally idempotent for a re-run: a node already running the
+target image and passing the gRPC health check is skipped instead of being
+stopped again.
+
+### 2.8 Live cutover and zero-downtime posture
+
+v1 is a controlled rolling restart of the existing Raft cluster, not a
+blue-green deploy. It reduces downtime risk by rolling one node at a time,
+transferring leadership before touching a leader when possible, waiting for
+gRPC health after each node, requiring image existence before the roll, and
+keeping dry-run/reachability checks on the same node map used by the live run.
+
+That is enough for compatible container updates where quorum remains available.
+It is not a universal zero-downtime migration mechanism. Changes that need
+dual-write, request shadowing, schema compatibility windows, or traffic
+switching must use a bridge/proxy or blue-green plan outside this workflow
+before production cutover.
 
 ## 3. Open questions
 
@@ -189,8 +216,9 @@ should be verified before we call the workflow production-ready.
 
 - Automatic deploys on merge to main (needs more test coverage before we'd
   trust it).
-- Blue-green or canary strategies (we don't have the traffic-routing layer
-  for it).
+- Blue-green or canary strategies in this workflow. They remain the recommended
+  mitigation for risky incompatible cutovers, but require a traffic-routing
+  layer outside v1.
 - Metrics-based rollback trigger (watch p99, auto-revert if it jumps).
 - Tailscale SSH (option A above).
 - A shared `deploy` user with restricted sudo.
@@ -199,7 +227,7 @@ should be verified before we call the workflow production-ready.
 
 1. Write `.github/workflows/rolling-update.yml` implementing §2.1.
 2. Document the secrets/variables setup in
-   `docs/operations/deploy_runbook.md` (new).
+   `docs/deploy_via_tailscale_runbook.md`.
 3. Run once with `dry_run: true` on a feature branch to validate secrets
    wiring without touching prod.
 4. Run once with `dry_run: false` targeting a single node (via the `nodes`
diff --git a/scripts/rolling-update.sh b/scripts/rolling-update.sh
index 97e23de69..74ba85dd3 100755
--- a/scripts/rolling-update.sh
+++ b/scripts/rolling-update.sh
@@ -7,7 +7,7 @@ REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
 usage() {
   cat <<'EOF'
 Usage:
-  NODES="n1=raft-1.internal,n2=raft-2.internal,n3=raft-3.internal" ./scripts/rolling-update.sh
+  NODES="n1=raft-1.internal,n2=raft-2.internal,n3=raft-3.internal" ./scripts/rolling-update.sh [--dry-run]
 
 Required environment:
   NODES
@@ -17,6 +17,11 @@ Optional environment:
   ROLLING_UPDATE_ENV_FILE
     Shell env file to source before evaluating the rest of the settings.
 
+  DRY_RUN
+    Set to true, or pass --dry-run, to validate and print the rollout plan
+    without building helpers, copying files, SSHing to nodes, or touching
+    containers.
+
   SSH_TARGETS
     Comma-separated SSH target map when SSH hosts differ from advertised hosts:
     "<raftId>=<ssh-host-or-user@host>,..."
@@ -83,10 +88,24 @@ Notes:
 EOF
 }
 
-if [[ "${1:-}" == "--help" || "${1:-}" == "-h" ]]; then
-  usage
-  exit 0
-fi
+DRY_RUN_ARG=false
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --help|-h)
+      usage
+      exit 0
+      ;;
+    --dry-run)
+      DRY_RUN_ARG=true
+      shift
+      ;;
+    *)
+      echo "unknown argument: $1" >&2
+      usage >&2
+      exit 1
+      ;;
+  esac
+done
 
 if [[ -n "${ROLLING_UPDATE_ENV_FILE:-}" ]]; then
   if [[ ! -f "$ROLLING_UPDATE_ENV_FILE" ]]; then
@@ -125,10 +144,19 @@ SSH_TARGETS="${SSH_TARGETS:-}"
 ROLLING_ORDER="${ROLLING_ORDER:-}"
 RAFT_TO_REDIS_MAP="${RAFT_TO_REDIS_MAP:-}"
 RAFT_TO_S3_MAP="${RAFT_TO_S3_MAP:-}"
+DRY_RUN="${DRY_RUN:-false}"
+if [[ "$DRY_RUN_ARG" == "true" ]]; then
+  DRY_RUN=true
+fi
 # Container OOM defenses. See usage() for rationale. Empty string disables.
 DEFAULT_EXTRA_ENV="${DEFAULT_EXTRA_ENV-GOMEMLIMIT=1800MiB}"
 CONTAINER_MEMORY_LIMIT="${CONTAINER_MEMORY_LIMIT-2500m}"
 
+if [[ "$DRY_RUN" != "true" && "$DRY_RUN" != "false" ]]; then
+  echo "DRY_RUN must be true or false" >&2
+  exit 1
+fi
+
 if [[ -z "$NODES" ]]; then
   echo "NODES is required" >&2
   usage >&2
@@ -309,6 +337,25 @@ derive_raft_to_s3_map() {
   )
 }
 
+print_dry_run_plan() {
+  local node_id node_host ssh_target
+
+  echo "[rolling-update] dry run: no remote commands will be executed"
+  echo "[rolling-update] target image: $IMAGE"
+  echo "[rolling-update] container: $CONTAINER_NAME"
+  echo "[rolling-update] raft engine: $RAFT_ENGINE"
+  echo "[rolling-update] nodes:"
+  for node_id in "${ROLLING_NODE_IDS[@]}"; do
+    node_host="$(node_host_by_id "$node_id")"
+    ssh_target="$(ssh_target_by_id "$node_id")"
+    echo "  - raft_id=$node_id host=$node_host ssh_target=$ssh_target"
+  done
+  echo "[rolling-update] RAFT_TO_REDIS_MAP=$RAFT_TO_REDIS_MAP"
+  if [[ "${ENABLE_S3}" == "true" ]]; then
+    echo "[rolling-update] RAFT_TO_S3_MAP=$RAFT_TO_S3_MAP"
+  fi
+}
+
 ensure_local_raftadmin() {
   if [[ -n "$RAFTADMIN_LOCAL_BIN" ]]; then
     if [[ ! -x "$RAFTADMIN_LOCAL_BIN" ]]; then
@@ -873,6 +920,11 @@ if [[ "${ENABLE_S3}" == "true" && -z "$RAFT_TO_S3_MAP" ]]; then
   RAFT_TO_S3_MAP="$(derive_raft_to_s3_map)"
 fi
 
+if [[ "$DRY_RUN" == "true" ]]; then
+  print_dry_run_plan
+  exit 0
+fi
+
 ensure_local_raftadmin
 ensure_remote_raftadmin_binaries
 

From ee2213ad688a293c741472b03388b956f071d711 Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Wed, 24 Jun 2026 04:20:58 +0900
Subject: [PATCH 09/12] ops: preserve cluster map for subset rollouts

---
 .github/workflows/docker-image.yml   |  4 +-
 .github/workflows/rolling-update.yml | 59 ++++++++++++++++------------
 docs/deploy_via_tailscale_runbook.md | 39 ++++++++++++++----
 3 files changed, 67 insertions(+), 35 deletions(-)
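
Note on the subset handling: this patch stops filtering the node map and
instead passes the requested IDs to the script as `ROLLING_ORDER`. The sketch
below illustrates the idea under simplifying assumptions: the maps are already
normalized, and unknown IDs are silently skipped, whereas the workflow rejects
them earlier in the step. The helper name `subset_ids` and the hostnames are
hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# lookup_map KEY MAP: print the value stored for KEY in a "k=v,k=v" map,
# or return 1 when KEY is absent.
lookup_map() {
  local key="$1" all="$2" e
  local -a entries
  IFS=',' read -r -a entries <<< "$all"
  for e in ${entries[@]+"${entries[@]}"}; do
    if [[ "${e%%=*}" == "$key" ]]; then
      printf '%s' "${e#*=}"
      return 0
    fi
  done
  return 1
}

# subset_ids MAP FILTER: print the IDs from FILTER that exist in MAP,
# comma-separated, in the order the filter lists them.
subset_ids() {
  local all="$1" filter="$2" out="" w
  local -a wanted
  IFS=',' read -r -a wanted <<< "$filter"
  for w in ${wanted[@]+"${wanted[@]}"}; do
    [[ -n "$w" ]] || continue
    if lookup_map "$w" "$all" >/dev/null; then
      out+="${out:+,}$w"
    fi
  done
  printf '%s' "$out"
}

# Hypothetical example values.
NODES="n1=kv01.example.ts.net,n2=kv02.example.ts.net,n3=kv03.example.ts.net"
NODES_FILTER="n3,n1"

ROLLING_ORDER="$(subset_ids "$NODES" "$NODES_FILTER")"
echo "NODES=$NODES"                  # still the full cluster map
echo "ROLLING_ORDER=$ROLLING_ORDER"  # prints: ROLLING_ORDER=n3,n1
```

Keeping `NODES` complete lets the script derive peer and adapter maps for the
whole cluster, while `ROLLING_ORDER` alone limits which nodes are recreated.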

diff --git a/.github/workflows/docker-image.yml b/.github/workflows/docker-image.yml
index 635f1943b..a50c2d444 100644
--- a/.github/workflows/docker-image.yml
+++ b/.github/workflows/docker-image.yml
@@ -40,6 +40,8 @@ jobs:
           platforms: linux/amd64
 #          platforms: linux/amd64,linux/arm64
           push: ${{ github.event_name != 'pull_request' }}
-          tags: ghcr.io/${{ github.REPOSITORY }}:latest
+          tags: |
+            ghcr.io/${{ github.REPOSITORY }}:latest
+            ghcr.io/${{ github.REPOSITORY }}:${{ github.sha }}
 #          cache-from: type=gha
 #          cache-to: type=gha,mode=max
diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
index 4d94aaf63..b7bb1ac01 100644
--- a/.github/workflows/rolling-update.yml
+++ b/.github/workflows/rolling-update.yml
@@ -175,24 +175,19 @@ jobs:
             local all="$1"
             local filter="$2"
             local out=""
-            local e key w
+            local w value
             if [[ -z "$all" ]]; then
               printf '%s' ""
               return 0
             fi
-            IFS=',' read -r -a list_entries <<< "$all"
             IFS=',' read -r -a list_wanted <<< "$filter"
-            for e in "${list_entries[@]}"; do
-              e="${e//[[:space:]]/}"
-              [[ -n "$e" ]] || continue
-              key="${e%%=*}"
-              for w in "${list_wanted[@]}"; do
-                w="${w//[[:space:]]/}"
-                if [[ "$key" == "$w" ]]; then
-                  out+="${out:+,}$e"
-                  break
-                fi
-              done
+            for w in "${list_wanted[@]}"; do
+              w="${w//[[:space:]]/}"
+              [[ -n "$w" ]] || continue
+              value="$(lookup_map "$w" "$all" || true)"
+              if [[ -n "$value" ]]; then
+                out+="${out:+,}${w}=${value}"
+              fi
             done
             printf '%s' "$out"
           }
@@ -243,8 +238,13 @@ jobs:
           fi
           NODES_FILTER="${NODES_FILTER//[[:space:]]/}"
 
+          ROLLING_ORDER="$(known_ids_csv "$NODES_RAFT_MAP")"
           if [[ -n "$NODES_FILTER" ]]; then
-            # Filter NODES_RAFT_MAP and SSH_TARGETS_MAP to the requested subset.
+            # Keep NODES_RAFT_MAP as the full cluster map. rolling-update.sh
+            # derives RAFT_TO_REDIS_MAP / RAFT_TO_S3_MAP and transfer
+            # candidates from NODES, so filtering it for a staged rollout would
+            # start the target node with an incomplete view of the cluster.
+            # The requested subset is passed separately as ROLLING_ORDER.
             # Reject any filter ID that does not appear in the map: silently
             # dropping unknown IDs would let a typo like "n1,n9" proceed as
             # a one-node rollout of n1 alone, which is a staged-deploy
@@ -261,39 +261,44 @@ jobs:
               echo "::error::nodes filter '$NODES_FILTER' references unknown raft IDs: $unknown. Known IDs: $(known_ids_csv "$NODES_RAFT_MAP")"
               exit 1
             fi
-            NODES_RAFT_MAP="$(filter_csv "$NODES_RAFT_MAP" "$NODES_FILTER")"
-            SSH_TARGETS_MAP="$(filter_csv "$SSH_TARGETS_MAP" "$NODES_FILTER")"
-            if [[ -z "$NODES_RAFT_MAP" ]]; then
+            ROLLING_ORDER="$(known_ids_csv "$(filter_csv "$NODES_RAFT_MAP" "$NODES_FILTER")")"
+            if [[ -z "$ROLLING_ORDER" ]]; then
               echo "::error::nodes filter '$NODES_FILTER' matches nothing in NODES_RAFT_MAP"
               exit 1
             fi
           fi
           SSH_TARGETS_MAP="$(materialize_ssh_targets "$NODES_RAFT_MAP" "$SSH_TARGETS_MAP")"
+          ROLLING_SSH_TARGETS="$(filter_csv "$SSH_TARGETS_MAP" "$ROLLING_ORDER")"
           {
             echo "NODES=$NODES_RAFT_MAP"
             echo "SSH_TARGETS=$SSH_TARGETS_MAP"
+            echo "ROLLING_ORDER=$ROLLING_ORDER"
+            echo "ROLLING_SSH_TARGETS=$ROLLING_SSH_TARGETS"
           } >> "$GITHUB_OUTPUT"
           echo "::group::Deploy plan"
           echo "NODES=$NODES_RAFT_MAP"
           echo "SSH_TARGETS=$SSH_TARGETS_MAP"
+          echo "ROLLING_ORDER=$ROLLING_ORDER"
+          echo "ROLLING_SSH_TARGETS=$ROLLING_SSH_TARGETS"
           echo "::endgroup::"
 
-      - name: Tailscale reachability check
+      - name: SSH reachability check
         env:
-          SSH_TARGETS: ${{ steps.render.outputs.SSH_TARGETS }}
+          SSH_TARGETS: ${{ steps.render.outputs.ROLLING_SSH_TARGETS }}
+          SSH_USER: ${{ vars.SSH_USER }}
         run: |
           set -euo pipefail
           IFS=',' read -r -a entries <<< "$SSH_TARGETS"
           failed=0
           for e in "${entries[@]}"; do
-            host="${e##*=}"
-            host="${host%%:*}"
-            # strip user@ if present
-            host="${host##*@}"
-            if tailscale ping --c 2 --timeout 3s "$host" >/dev/null 2>&1; then
-              echo "  ok   $host"
+            target="${e##*=}"
+            if [[ "$target" != *@* ]]; then
+              target="${SSH_USER:-$(id -un)}@$target"
+            fi
+            if ssh -o BatchMode=yes -o ConnectTimeout=10 -o StrictHostKeyChecking=yes "$target" true; then
+              echo "  ok   $target"
             else
-              echo "::error::$host not reachable over tailnet"
+              echo "::error::$target not reachable by SSH over tailnet"
               failed=1
             fi
           done
@@ -306,6 +311,7 @@ jobs:
         env:
           NODES: ${{ steps.render.outputs.NODES }}
           SSH_TARGETS: ${{ steps.render.outputs.SSH_TARGETS }}
+          ROLLING_ORDER: ${{ steps.render.outputs.ROLLING_ORDER }}
           IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
           SSH_USER: ${{ vars.SSH_USER }}
           DRY_RUN: "true"
@@ -321,6 +327,7 @@ jobs:
         env:
           NODES: ${{ steps.render.outputs.NODES }}
           SSH_TARGETS: ${{ steps.render.outputs.SSH_TARGETS }}
+          ROLLING_ORDER: ${{ steps.render.outputs.ROLLING_ORDER }}
           SSH_USER: ${{ vars.SSH_USER }}
           IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
           SSH_STRICT_HOST_KEY_CHECKING: "yes"
diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index 3c27b78b8..e10d42dc2 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -46,11 +46,24 @@ In the Tailscale admin console, add the deploy rule to the tailnet ACL:
     "src":    ["tag:ci-deploy"],
     "dst":    ["tag:elastickv-node:22"],
   },
+  {
+    "action": "accept",
+    "src":    ["tag:elastickv-node"],
+    "dst":    [
+      "tag:elastickv-node:50051", // Raft / raftadmin
+      "tag:elastickv-node:6379",  // Redis adapter, if enabled
+      "tag:elastickv-node:9000",  // S3 adapter, if enabled
+    ],
+  },
 ],
 ```
 
 `tag:ci-deploy` must NOT have access to any other port on the tailnet. The
-deploy workflow only needs SSH.
+deploy workflow only needs SSH. Node-to-node access is separate: every node
+tagged `tag:elastickv-node` must be able to reach the cluster ports advertised
+in `NODES_RAFT_MAP` and the derived adapter maps. Otherwise a restarted node
+can come back with peer addresses it cannot dial, and leader-transfer probes
+can fail mid-roll.
 
 ## 3. Tailscale OAuth client
 
@@ -109,15 +122,16 @@ Regenerate on operator rotation.
 |------|-------|---------|
 | `IMAGE_BASE`      | Container image path (no tag)     | `ghcr.io/bootjp/elastickv` |
 | `SSH_USER`        | SSH login on every node           | `bootjp` |
-| `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`). Use full MagicDNS FQDNs so every node can resolve the advertised address regardless of local DNS search domains. The workflow renders this into the script's `NODES` env var. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,n3=kv03.<tailnet>.ts.net,n4=kv04.<tailnet>.ts.net,n5=kv05.<tailnet>.ts.net` |
+| `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`). Use full MagicDNS FQDNs so every node can resolve the advertised address regardless of local DNS search domains. The workflow always renders the full map into the script's `NODES` env var, even for subset rollouts; the `nodes` input becomes `ROLLING_ORDER` so the script still derives full-cluster peer maps. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,n3=kv03.<tailnet>.ts.net,n4=kv04.<tailnet>.ts.net,n5=kv05.<tailnet>.ts.net` |
 | `SSH_TARGETS_MAP` | Optional comma-separated `raftId=ssh-host`. The workflow renders this into the script's `SSH_TARGETS` env var. Usually identical to `NODES_RAFT_MAP` unless SSH access uses a different hostname. If the variable is empty or an ID is omitted, the workflow falls back to that ID's `NODES_RAFT_MAP` host so reachability checks still cover every rollout node. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
 
 **Why two names?** The workflow uses `NODES_RAFT_MAP` / `SSH_TARGETS_MAP`
 in the `production` environment to keep the GitHub-side names
 distinct from the script-side env var names it hands to
 `rolling-update.sh`. If you run the script by hand from a workstation
-you must export `NODES` and `SSH_TARGETS` directly — the workflow-side
-names are only understood by the workflow's render step.
+you must export `NODES` and `SSH_TARGETS` directly, plus `ROLLING_ORDER`
+when you want a subset rollout — the workflow-side names are only
+understood by the workflow's render step.
 
 ## 5. Running a deploy
 
@@ -145,8 +159,16 @@ Recommended first-run sequence:
 ## 6. Rollback
 
 Re-run the workflow with `image_tag` set to the previous-known-good sha. The
+Docker image workflow publishes both `latest` and the immutable commit-SHA tag
+for each main-branch build, so SHA rollback works without a manual retag. The
 `nodes` input can target specific nodes if only some carry the bad image.
 
+For private GHCR packages, each node must already be logged in to `ghcr.io`
+with a deploy-scoped read token, or the remote `docker pull` will fail even
+though the workflow runner's manifest check succeeded. Keep that credential
+rotation outside this workflow for v1; the workflow only verifies that the tag
+exists from the runner side.
+
 ### If a running workflow is cancelled mid-rollout
 
 GitHub cancelling the job between node steps is the one operational
@@ -190,7 +212,7 @@ and only the still-stale one gets recreated.
 
 ## 7. What the workflow does NOT do (yet)
 
-- **No post-deploy health verification beyond tailnet reachability.** The
+- **No post-deploy health verification beyond SSH reachability.** The
   script itself blocks on `raftadmin` leadership transfer and health-gate
   timeouts, but the workflow does not independently probe Prometheus or
   Redis after the roll. Add this when we have a canonical post-deploy
@@ -214,10 +236,11 @@ environment" for the dry-run-approval alternatives (approach 2: add a
 second `production-dry-run` environment without required reviewers)
 if the friction becomes intolerable.
 
-### `tailscale ping` fails for a node
+### SSH reachability fails for a node
 The node may not be running `tailscaled`, not tagged `tag:elastickv-node`, or
-the tailnet ACL may have drifted. `tailscale status` on the node should show
-the tag; the admin console should show the IP in the `tag:elastickv-node`
+the system `sshd` may not be reachable over the tailnet, for example because the
+ACL blocks it. `tailscale status` on the node should show the tag; the admin
+console should show the IP in the `tag:elastickv-node`
 group.
 
 ### `image ... not found on ghcr.io`

From e96ac64ca776b3783f0fa730cc63c508ea2e0c9f Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Wed, 24 Jun 2026 04:53:29 +0900
Subject: [PATCH 10/12] ops: harden rolling update dispatch settings

---
 .github/workflows/rolling-update.yml | 21 ++++++++++++++++++++-
 docs/deploy_via_tailscale_runbook.md | 12 +++++++++++-
 2 files changed, 31 insertions(+), 2 deletions(-)
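
Reviewer note: the dispatch guard added to the trusted-ref step can be
exercised outside of Actions. The sketch below is a minimal stand-alone
reproduction of that check, not the workflow code itself. The function name
`check_dispatch_ref` is hypothetical, and the default branch is passed in as
an argument instead of being fetched with `gh api`.

```bash
#!/usr/bin/env bash
# Hypothetical stand-alone version of the dispatch guard.
# Arguments: <ref type> <ref name> <default branch>
# Prints the checkout_ref line on success; returns 1 for any other ref.
check_dispatch_ref() {
  local ref_type="$1" ref_name="$2" default_branch="$3"
  if [[ "$ref_type" != "branch" || "$ref_name" != "$default_branch" ]]; then
    echo "::error::rolling-update must be dispatched from '$default_branch' (got ${ref_type}:${ref_name})" >&2
    return 1
  fi
  echo "checkout_ref=$default_branch"
}
```

For example, `check_dispatch_ref tag v1.2.3 main` is rejected, because a tag
that happens to point at the default branch is still not a branch dispatch.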

diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
index b7bb1ac01..cae682dfa 100644
--- a/.github/workflows/rolling-update.yml
+++ b/.github/workflows/rolling-update.yml
@@ -8,7 +8,7 @@ on:
   workflow_dispatch:
     inputs:
       ref:
-        description: Image tag/ref to deploy. Workflow code is always checked out from the repository default branch.
+        description: Image tag/ref to deploy. Start this workflow from the repository default branch.
         required: true
         type: string
       image_tag:
@@ -56,9 +56,16 @@ jobs:
         env:
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           REF: ${{ inputs.ref }}
+          RUN_REF_NAME: ${{ github.ref_name }}
+          RUN_REF_TYPE: ${{ github.ref_type }}
         run: |
           set -euo pipefail
           default_branch=$(gh api "repos/${{ github.repository }}" --jq '.default_branch')
+          if [[ "$RUN_REF_TYPE" != "branch" || "$RUN_REF_NAME" != "$default_branch" ]]; then
+            echo "::error::rolling-update must be dispatched from the trusted default branch '$default_branch' (got ${RUN_REF_TYPE}:${RUN_REF_NAME})"
+            echo "::error::configure the production environment to allow deployments only from the default branch"
+            exit 1
+          fi
           echo "checkout_ref=$default_branch" >> "$GITHUB_OUTPUT"
           echo "deploy ref/image tag: $REF"
 
@@ -314,10 +321,16 @@ jobs:
           ROLLING_ORDER: ${{ steps.render.outputs.ROLLING_ORDER }}
           IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
           SSH_USER: ${{ vars.SSH_USER }}
+          ENABLE_S3: ${{ vars.ENABLE_S3 || 'false' }}
+          S3_CREDENTIALS_FILE: ${{ vars.S3_CREDENTIALS_FILE }}
           DRY_RUN: "true"
           REF: ${{ inputs.ref }}
         run: |
           set -euo pipefail
+          if [[ "$ENABLE_S3" == "true" && -z "$S3_CREDENTIALS_FILE" ]]; then
+            echo "::error::ENABLE_S3=true requires S3_CREDENTIALS_FILE in the production environment"
+            exit 1
+          fi
           ./scripts/rolling-update.sh --dry-run
           echo "ref: $REF"
           echo "Re-run with dry_run=false to apply."
@@ -330,7 +343,13 @@ jobs:
           ROLLING_ORDER: ${{ steps.render.outputs.ROLLING_ORDER }}
           SSH_USER: ${{ vars.SSH_USER }}
           IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
+          ENABLE_S3: ${{ vars.ENABLE_S3 || 'false' }}
+          S3_CREDENTIALS_FILE: ${{ vars.S3_CREDENTIALS_FILE }}
           SSH_STRICT_HOST_KEY_CHECKING: "yes"
         run: |
           set -euo pipefail
+          if [[ "$ENABLE_S3" == "true" && -z "$S3_CREDENTIALS_FILE" ]]; then
+            echo "::error::ENABLE_S3=true requires S3_CREDENTIALS_FILE in the production environment"
+            exit 1
+          fi
           ./scripts/rolling-update.sh
diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index e10d42dc2..b06bc7ce8 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -104,6 +104,14 @@ workflow inputs. Three ways to handle the dry-run-approval friction:
 v1 ships with approach 1 (single environment, prompt on every run).
 Approach 2 is the recommended upgrade once the friction becomes annoying.
 
+### Deployment branch policy
+
+Restrict the `production` environment to deployments from the repository
+default branch only. The workflow also has an early guard that fails when a
+manual dispatch is started from any other branch, but the environment policy is
+the trust boundary because GitHub executes workflow YAML from the selected
+dispatch ref before checkout.
+
 ### Environment secrets
 
 | Name | Value |
@@ -124,6 +132,8 @@ Regenerate on operator rotation.
 | `SSH_USER`        | SSH login on every node           | `bootjp` |
 | `NODES_RAFT_MAP`  | Comma-separated `raftId=host` (no port — the script appends `RAFT_PORT`). Use full MagicDNS FQDNs so every node can resolve the advertised address regardless of local DNS search domains. The workflow always renders the full map into the script's `NODES` env var, even for subset rollouts; the `nodes` input becomes `ROLLING_ORDER` so the script still derives full-cluster peer maps. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,n3=kv03.<tailnet>.ts.net,n4=kv04.<tailnet>.ts.net,n5=kv05.<tailnet>.ts.net` |
 | `SSH_TARGETS_MAP` | Optional comma-separated `raftId=ssh-host`. The workflow renders this into the script's `SSH_TARGETS` env var. Usually identical to `NODES_RAFT_MAP` unless SSH access uses a different hostname. If the variable is empty or an ID is omitted, the workflow falls back to that ID's `NODES_RAFT_MAP` host so reachability checks still cover every rollout node. | `n1=kv01.<tailnet>.ts.net,n2=kv02.<tailnet>.ts.net,...` |
+| `ENABLE_S3`       | `true` to start the S3 adapter, `false` to keep it disabled. The workflow defaults missing values to `false` rather than the script's local default. | `true` |
+| `S3_CREDENTIALS_FILE` | Node-local path to the SigV4 credentials file. Required when `ENABLE_S3=true`; the workflow fails before rollout if it is missing. | `/etc/elastickv/s3-credentials.json` |
 
 **Why two names?** The workflow uses `NODES_RAFT_MAP` / `SSH_TARGETS_MAP`
 in the `production` environment to keep the GitHub-side names
@@ -140,7 +150,7 @@ Actions tab → "Rolling update" → Run workflow.
 Inputs:
 
 - `ref` — the image tag/ref to deploy. The workflow code itself is always
-  checked out from the repository default branch.
+  dispatched and checked out from the repository default branch.
 - `image_tag` — override only for rollbacks (e.g., deploy tag `v1.2.3` of a
   commit that was also `v1.2.3`)
 - `nodes` — subset of raft IDs, e.g., `n1,n2`. Empty rolls all nodes.

From 8fcaac654b2359246c5aaea17c894e708e8dd9f4 Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Wed, 24 Jun 2026 05:06:30 +0900
Subject: [PATCH 11/12] ops: harden deploy workflow review findings

---
 .github/workflows/docker-image.yml            |  7 +++--
 .github/workflows/rolling-update.yml          | 26 ++++++++++++-------
 docs/deploy_via_tailscale_runbook.md          | 22 +++++++++-------
 ...026_04_24_proposed_deploy_via_tailscale.md |  9 +++++--
 4 files changed, 42 insertions(+), 22 deletions(-)
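
Reviewer note: the SSH reachability pre-check now retries inline. The same
bounded-retry pattern can be sketched as a reusable helper, shown below under
stated assumptions: `retry` is a hypothetical name that does not exist in the
workflow or in scripts/rolling-update.sh, and the attempt count and delay are
parameters rather than the hard-coded 6 and 10 used in the workflow.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between failed attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" attempt
  shift 2
  for ((attempt = 1; attempt <= attempts; attempt++)); do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final attempt, as the workflow loop does.
    if ((attempt < attempts)); then
      echo "  wait (attempt $attempt failed; retrying)" >&2
      sleep "$delay"
    fi
  done
  return 1
}
```

With the workflow's values this would read
`retry 6 10 ssh -o BatchMode=yes -o ConnectTimeout=10 -o StrictHostKeyChecking=yes "$target" true`.
Five 10-second sleeps plus up to six connect timeouts put the worst-case wait
for one unreachable node at roughly two minutes.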

diff --git a/.github/workflows/docker-image.yml b/.github/workflows/docker-image.yml
index a50c2d444..3e31e2c05 100644
--- a/.github/workflows/docker-image.yml
+++ b/.github/workflows/docker-image.yml
@@ -32,6 +32,9 @@ jobs:
           registry: ghcr.io
           username: ${{ github.repository_owner }}
           password: ${{ secrets.GITHUB_TOKEN }}
+      - name: Derive image name
+        id: image
+        run: echo "name=ghcr.io/${GITHUB_REPOSITORY,,}" >> "$GITHUB_OUTPUT"
       - name: Build and push
         uses: docker/build-push-action@v7
         with:
@@ -41,7 +44,7 @@ jobs:
 #          platforms: linux/amd64,linux/arm64
           push: ${{ github.event_name != 'pull_request' }}
           tags: |
-            ghcr.io/${{ github.REPOSITORY }}:latest
-            ghcr.io/${{ github.REPOSITORY }}:${{ github.sha }}
+            ${{ steps.image.outputs.name }}:latest
+            ${{ steps.image.outputs.name }}:${{ github.sha }}
 #          cache-from: type=gha
 #          cache-to: type=gha,mode=max
diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
index cae682dfa..b1658973c 100644
--- a/.github/workflows/rolling-update.yml
+++ b/.github/workflows/rolling-update.yml
@@ -70,19 +70,17 @@ jobs:
           echo "deploy ref/image tag: $REF"
 
       - name: Checkout trusted deploy script
-        uses: actions/checkout@v6
+        uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6
         with:
           ref: ${{ steps.trusted-ref.outputs.checkout_ref }}
           persist-credentials: false
 
-      - name: Install jq
-        run: sudo apt-get install -y --no-install-recommends jq
-
       - name: Verify image exists on ghcr.io
         env:
           IMAGE_BASE: ${{ vars.IMAGE_BASE }}
           IMAGE_TAG: ${{ inputs.image_tag || inputs.ref }}
           GHCR_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          ACTOR: ${{ github.actor }}
         run: |
           set -euo pipefail
           if [[ -z "$IMAGE_BASE" ]]; then
@@ -90,14 +88,14 @@ jobs:
             exit 1
           fi
           echo "Checking $IMAGE_BASE:$IMAGE_TAG"
-          echo "$GHCR_TOKEN" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin >/dev/null
+          echo "$GHCR_TOKEN" | docker login ghcr.io -u "$ACTOR" --password-stdin >/dev/null
           if ! docker manifest inspect "$IMAGE_BASE:$IMAGE_TAG" >/dev/null; then
             echo "::error::image $IMAGE_BASE:$IMAGE_TAG not found on ghcr.io"
             exit 1
           fi
 
       - name: Join Tailnet (ephemeral)
-        uses: tailscale/github-action@v3
+        uses: tailscale/github-action@6cae46e2d796f265265cfcf628b72a32b4d7cade # v3
         with:
           oauth-client-id: ${{ secrets.TS_OAUTH_CLIENT_ID }}
           oauth-secret: ${{ secrets.TS_OAUTH_SECRET }}
@@ -302,9 +300,19 @@ jobs:
             if [[ "$target" != *@* ]]; then
               target="${SSH_USER:-$(id -un)}@$target"
             fi
-            if ssh -o BatchMode=yes -o ConnectTimeout=10 -o StrictHostKeyChecking=yes "$target" true; then
-              echo "  ok   $target"
-            else
+            ok=0
+            for attempt in 1 2 3 4 5 6; do
+              if ssh -o BatchMode=yes -o ConnectTimeout=10 -o StrictHostKeyChecking=yes "$target" true; then
+                echo "  ok   $target"
+                ok=1
+                break
+              fi
+              if [[ "$attempt" -lt 6 ]]; then
+                echo "  wait $target (attempt $attempt failed; retrying)"
+                sleep 10
+              fi
+            done
+            if [[ "$ok" -ne 1 ]]; then
               echo "::error::$target not reachable by SSH over tailnet"
               failed=1
             fi
diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index b06bc7ce8..85c174a8e 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -9,7 +9,7 @@ runbook is for operators: what to configure on GitHub and Tailscale so the
 Each cluster node must have `tailscale` installed, logged into the tailnet, and
 tagged so the CI runner's ACL can reach it.
 
-```
+```bash
 # on each kv0X node
 sudo tailscale up \
   --ssh=false \
@@ -18,7 +18,7 @@ sudo tailscale up \
 ```
 
 `--ssh=false` disables Tailscale SSH, so the node's regular system
-sshd must be running and authorised to accept connections on the
+sshd must be running and authorized to accept connections on the
 tailnet interface. The workflow uses plain SSH over the tailnet
 (Tailscale is only the network layer); if you rely on Tailscale SSH
 for operator access elsewhere, drop this flag but keep in mind the
@@ -26,7 +26,7 @@ workflow still connects to the system sshd.
 
 Verify the node is reachable by MagicDNS from another tailnet peer:
 
-```
+```bash
 tailscale status | grep kv0X
 ping kv0X.<tailnet>.ts.net
 ```
@@ -52,6 +52,7 @@ In the Tailscale admin console, add the deploy rule to the tailnet ACL:
     "dst":    [
       "tag:elastickv-node:50051", // Raft / raftadmin
       "tag:elastickv-node:6379",  // Redis adapter, if enabled
+      "tag:elastickv-node:8000",  // DynamoDB adapter, if enabled
       "tag:elastickv-node:9000",  // S3 adapter, if enabled
     ],
   },
@@ -119,7 +120,7 @@ dispatch ref before checkout.
 | `TS_OAUTH_CLIENT_ID`        | Tailscale OAuth client ID from step 3 |
 | `TS_OAUTH_SECRET`           | Tailscale OAuth secret from step 3 |
 | `DEPLOY_SSH_PRIVATE_KEY`    | OpenSSH private key, authorized on every node under the deploy user |
-| `DEPLOY_KNOWN_HOSTS`        | `ssh-keyscan -H kv01.<tailnet>.ts.net kv02.<tailnet>.ts.net …` output. Use `-H` to hash hostnames so the secret's contents don't leak the tailnet topology if the runner environment is compromised. |
+| `DEPLOY_KNOWN_HOSTS`        | `ssh-keyscan -H kv01.<tailnet>.ts.net kv02.<tailnet>.ts.net …` output. Use `-H` to hash hostnames so the secret's contents don't leak the tailnet topology if the runner environment is compromised. Regenerate this secret when the node list changes or if SSH reports `Host key verification failed`. |
 
 The SSH key should be ed25519, dedicated to CI (not a reused developer key).
 Regenerate on operator rotation.
@@ -135,6 +136,11 @@ Regenerate on operator rotation.
 | `ENABLE_S3`       | `true` to start the S3 adapter, `false` to keep it disabled. The workflow defaults missing values to `false` rather than the script's local default. | `true` |
 | `S3_CREDENTIALS_FILE` | Node-local path to the SigV4 credentials file. Required when `ENABLE_S3=true`; the workflow fails before rollout if it is missing. | `/etc/elastickv/s3-credentials.json` |
 
+For private GHCR packages, log every node in to `ghcr.io` with a
+deploy-scoped read token before the first rollout. The workflow's manifest check
+proves the runner can see the image tag; it does not install Docker credentials
+on the remote nodes that execute `docker pull`.
+
 **Why two names?** The workflow uses `NODES_RAFT_MAP` / `SSH_TARGETS_MAP`
 in the `production` environment to keep the GitHub-side names
 distinct from the script-side env var names it hands to
@@ -173,11 +179,9 @@ Docker image workflow publishes both `latest` and the immutable commit-SHA tag
 for each main-branch build, so SHA rollback works without a manual retag. The
 `nodes` input can target specific nodes if only some carry the bad image.
 
-For private GHCR packages, each node must already be logged in to `ghcr.io`
-with a deploy-scoped read token, or the remote `docker pull` will fail even
-though the workflow runner's manifest check succeeded. Keep that credential
-rotation outside this workflow for v1; the workflow only verifies that the tag
-exists from the runner side.
+For private GHCR packages, keep the node-level Docker credential rotation
+outside this workflow for v1; the workflow only verifies that the tag exists
+from the runner side.
 
 ### If a running workflow is cancelled mid-rollout
 
diff --git a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
index be9e3492d..069a819fd 100644
--- a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
+++ b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
@@ -131,12 +131,17 @@ With `dry_run: true` (the default):
 
 - Everything up to script invocation runs (checkout, tailnet join, SSH agent
   load, `NODES`/`SSH_TARGETS` render).
-- `tailscale ping` is run against each SSH target to confirm reachability.
+- An SSH reachability pre-check (`ssh -o BatchMode=yes -o ConnectTimeout=10
+  -o StrictHostKeyChecking=yes <target> true`) is retried against each SSH
+  target. This confirms that network routing, host-key trust, the system sshd,
+  and deploy-key authorization all work; a network-layer ping cannot prove that.
 - The script is invoked as `DRY_RUN=true ./scripts/rolling-update.sh --dry-run`
   with the same rendered env the live rollout would receive. It validates the
   node maps, rollout order, derived service maps, image name, and per-node SSH
   targets, then prints the plan.
-- The actual `docker stop/rm/run` loop does NOT execute.
+- The actual `docker stop/rm/run` loop does NOT execute. Dry-run also skips
+  remote image pulls and any other container side effects; only validation and
+  plan rendering run.
 
 This catches the common failure modes (bad secret, bad env mapping, a node
 unreachable over the tailnet) before touching any live container.

From fdb65de92fe294ae8399e5429225b65de74ee3cf Mon Sep 17 00:00:00 2001
From: "Yoshiaki Ueda (bootjp)" <contact@bootjp.me>
Date: Wed, 24 Jun 2026 05:51:53 +0900
Subject: [PATCH 12/12] ops: harden deploy workflow preflights

---
 .github/workflows/rolling-update.yml          | 72 +++++++++++++++++--
 docs/deploy_via_tailscale_runbook.md          | 38 +++++-----
 ...026_04_24_proposed_deploy_via_tailscale.md | 12 ++--
 3 files changed, 92 insertions(+), 30 deletions(-)
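
Reviewer note: the remote S3 credentials preflight passes the configured path
through `printf '%q'` before embedding it in the SSH command, so a path with
spaces or shell metacharacters reaches the remote `test -r` as a single word.
The sketch below isolates that quoting step. `remote_test_cmd` is a
hypothetical name, and the sketch assumes the remote login shell is
bash-compatible, since `%q` emits bash-style escapes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the command string the preflight would send.
# printf %q escapes the path so the receiving shell parses it as one word.
remote_test_cmd() {
  local quoted
  printf -v quoted '%q' "$1"
  printf 'test -r %s' "$quoted"
}
```

A plain path such as `/etc/elastickv/s3-credentials.json` passes through
unchanged, while a path containing `;` or spaces is escaped instead of being
split into separate remote commands.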

diff --git a/.github/workflows/rolling-update.yml b/.github/workflows/rolling-update.yml
index b1658973c..4c4b0c88b 100644
--- a/.github/workflows/rolling-update.yml
+++ b/.github/workflows/rolling-update.yml
@@ -46,6 +46,22 @@ jobs:
     # protection rules").
     environment: production
     timeout-minutes: 60
+    env:
+      CONTAINER_NAME: ${{ vars.CONTAINER_NAME || 'elastickv' }}
+      DATA_DIR: ${{ vars.DATA_DIR || '/var/lib/elastickv' }}
+      SERVER_ENTRYPOINT: ${{ vars.SERVER_ENTRYPOINT || '/app' }}
+      RAFT_ENGINE: ${{ vars.RAFT_ENGINE || 'etcd' }}
+      RAFT_PORT: ${{ vars.RAFT_PORT || '50051' }}
+      REDIS_PORT: ${{ vars.REDIS_PORT || '6379' }}
+      DYNAMO_PORT: ${{ vars.DYNAMO_PORT || '8000' }}
+      S3_PORT: ${{ vars.S3_PORT || '9000' }}
+      ENABLE_S3: ${{ vars.ENABLE_S3 || 'false' }}
+      S3_REGION: ${{ vars.S3_REGION || 'us-east-1' }}
+      S3_CREDENTIALS_FILE: ${{ vars.S3_CREDENTIALS_FILE }}
+      S3_PATH_STYLE_ONLY: ${{ vars.S3_PATH_STYLE_ONLY || 'true' }}
+      DEFAULT_EXTRA_ENV: ${{ vars.DEFAULT_EXTRA_ENV || 'GOMEMLIMIT=1800MiB' }}
+      EXTRA_ENV: ${{ vars.EXTRA_ENV }}
+      CONTAINER_MEMORY_LIMIT: ${{ vars.CONTAINER_MEMORY_LIMIT || '2500m' }}
 
     steps:
       # The deploy script is executed after the tailnet join and SSH key load.
@@ -58,6 +74,7 @@ jobs:
           REF: ${{ inputs.ref }}
           RUN_REF_NAME: ${{ github.ref_name }}
           RUN_REF_TYPE: ${{ github.ref_type }}
+          RUN_SHA: ${{ github.sha }}
         run: |
           set -euo pipefail
           default_branch=$(gh api "repos/${{ github.repository }}" --jq '.default_branch')
@@ -66,8 +83,9 @@ jobs:
             echo "::error::configure the production environment to allow deployments only from the default branch"
             exit 1
           fi
-          echo "checkout_ref=$default_branch" >> "$GITHUB_OUTPUT"
+          echo "checkout_ref=$RUN_SHA" >> "$GITHUB_OUTPUT"
           echo "deploy ref/image tag: $REF"
+          echo "trusted workflow sha: $RUN_SHA"
 
       - name: Checkout trusted deploy script
         uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6
@@ -321,6 +339,54 @@ jobs:
             exit 1
           fi
 
+      - name: Remote S3 credentials preflight
+        if: ${{ env.ENABLE_S3 == 'true' }}
+        env:
+          SSH_TARGETS: ${{ steps.render.outputs.ROLLING_SSH_TARGETS }}
+          SSH_USER: ${{ vars.SSH_USER }}
+        run: |
+          set -euo pipefail
+          if [[ -z "$S3_CREDENTIALS_FILE" ]]; then
+            echo "::error::ENABLE_S3=true requires S3_CREDENTIALS_FILE in the production environment"
+            exit 1
+          fi
+          printf -v remote_path '%q' "$S3_CREDENTIALS_FILE"
+          IFS=',' read -r -a entries <<< "$SSH_TARGETS"
+          failed=0
+          for e in "${entries[@]}"; do
+            target="${e##*=}"
+            if [[ "$target" != *@* ]]; then
+              target="${SSH_USER:-$(id -un)}@$target"
+            fi
+            if ssh -o BatchMode=yes -o ConnectTimeout=10 -o StrictHostKeyChecking=yes "$target" "test -r $remote_path"; then
+              echo "  ok   $target:$S3_CREDENTIALS_FILE"
+            else
+              echo "::error::$target cannot read S3_CREDENTIALS_FILE=$S3_CREDENTIALS_FILE"
+              failed=1
+            fi
+          done
+          if [[ "$failed" -ne 0 ]]; then
+            exit 1
+          fi
+
+      - name: Log rollout configuration
+        env:
+          NODES: ${{ steps.render.outputs.NODES }}
+          SSH_TARGETS: ${{ steps.render.outputs.SSH_TARGETS }}
+          ROLLING_ORDER: ${{ steps.render.outputs.ROLLING_ORDER }}
+          IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
+        run: |
+          set -euo pipefail
+          echo "::group::Rollout runtime configuration"
+          for name in \
+            IMAGE CONTAINER_NAME DATA_DIR SERVER_ENTRYPOINT RAFT_ENGINE \
+            RAFT_PORT REDIS_PORT DYNAMO_PORT S3_PORT ENABLE_S3 S3_REGION \
+            S3_CREDENTIALS_FILE S3_PATH_STYLE_ONLY DEFAULT_EXTRA_ENV EXTRA_ENV \
+            CONTAINER_MEMORY_LIMIT NODES SSH_TARGETS ROLLING_ORDER; do
+            printf '%s=%s\n' "$name" "${!name-}"
+          done
+          echo "::endgroup::"
+
       - name: Dry-run summary
         if: ${{ inputs.dry_run }}
         env:
@@ -329,8 +395,6 @@ jobs:
           ROLLING_ORDER: ${{ steps.render.outputs.ROLLING_ORDER }}
           IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
           SSH_USER: ${{ vars.SSH_USER }}
-          ENABLE_S3: ${{ vars.ENABLE_S3 || 'false' }}
-          S3_CREDENTIALS_FILE: ${{ vars.S3_CREDENTIALS_FILE }}
           DRY_RUN: "true"
           REF: ${{ inputs.ref }}
         run: |
@@ -351,8 +415,6 @@ jobs:
           ROLLING_ORDER: ${{ steps.render.outputs.ROLLING_ORDER }}
           SSH_USER: ${{ vars.SSH_USER }}
           IMAGE: ${{ vars.IMAGE_BASE }}:${{ inputs.image_tag || inputs.ref }}
-          ENABLE_S3: ${{ vars.ENABLE_S3 || 'false' }}
-          S3_CREDENTIALS_FILE: ${{ vars.S3_CREDENTIALS_FILE }}
           SSH_STRICT_HOST_KEY_CHECKING: "yes"
         run: |
           set -euo pipefail
diff --git a/docs/deploy_via_tailscale_runbook.md b/docs/deploy_via_tailscale_runbook.md
index 85c174a8e..132539734 100644
--- a/docs/deploy_via_tailscale_runbook.md
+++ b/docs/deploy_via_tailscale_runbook.md
@@ -71,13 +71,12 @@ mid-roll.
 Admin console → Settings → OAuth clients → New client:
 
 - Description: `elastickv GitHub Actions deploy`
-- Scopes: `auth_keys` (write). Recent `tailscale/github-action` versions
-  may additionally require `devices:write` (to register and clean up
-  the ephemeral node); enable that if the join step fails with an
-  authorization error. The action's README is the definitive source
-  for current scope requirements. `devices:core` is NOT a valid
-  Tailscale OAuth scope — earlier drafts of this runbook named it and
-  would have produced an auth failure.
+- Scopes: `auth_keys` (write). The pinned `tailscale/github-action`
+  version uses this OAuth client to mint the ephemeral auth key. If the
+  join step fails with a 403 during device registration or cleanup,
+  add the exact Devices scope named by the action README and the
+  Tailscale OAuth UI for that action version; do not guess from older
+  drafts of this runbook.
 - Tags: `tag:ci-deploy`
 
 Copy the client ID and secret; they go into GitHub in the next step.
@@ -196,18 +195,19 @@ hazard that needs manual cleanup.
    container is absent or `Exited`, finish the recreate by hand. The
    `docker run` invocation itself is redirected to `/dev/null` by the
    script, so the workflow log does NOT contain the full argv. To
-   reconstruct it, read the `Roll cluster` step's rendered
-   environment — the workflow exports `IMAGE`, `DATA_DIR`,
-   `RAFT_PORT`, `REDIS_PORT`, `S3_PORT`, `ENABLE_S3`, `NODES`,
-   `SSH_TARGETS`, and the merged `EXTRA_ENV` before invoking the
-   script. Anything not explicitly set (e.g., `RAFT_PORT` in a
-   minimally-overridden deploy) falls back to the script's default
-   (`RAFT_PORT=50051`, `REDIS_PORT=6379`, `S3_PORT=9000`,
-   `ENABLE_S3=true`). GOMEMLIMIT / CONTAINER_MEMORY_LIMIT (PR #617)
-   are propagated via `EXTRA_ENV` once that PR lands. Together the
-   rendered env + the node's `deploy.env` is enough to reconstruct
-   the same `docker run` you would see if you re-ran with the same
-   inputs.
+   reconstruct it, open the `Log rollout configuration` step: it emits
+   the actual `IMAGE`, `CONTAINER_NAME`, `DATA_DIR`, `SERVER_ENTRYPOINT`,
+   `RAFT_ENGINE`, `RAFT_PORT`, `REDIS_PORT`, `DYNAMO_PORT`, `S3_PORT`,
+   `ENABLE_S3`, `S3_REGION`, `S3_CREDENTIALS_FILE`,
+   `S3_PATH_STYLE_ONLY`, `DEFAULT_EXTRA_ENV`, `EXTRA_ENV`,
+   `CONTAINER_MEMORY_LIMIT`, `NODES`, `SSH_TARGETS`, and
+   `ROLLING_ORDER` values used by the following `Roll cluster` step.
+   The workflow sets the same defaults as `scripts/rolling-update.sh`
+   except for `ENABLE_S3`, which defaults to `false` in the workflow
+   unless the `production` environment variable explicitly enables it.
+   Together those logged values plus the node's current Docker state
+   are enough to reconstruct the same `docker run` you would get from
+   re-running the workflow with the same inputs.
 3. **Confirm the new leader via `raftadmin` or metrics** before re-running
    the workflow with `nodes:` scoped to the remaining untouched IDs. Do
    NOT re-run the full rollout if the partial one is still in flight —
diff --git a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
index 069a819fd..c7dc3acd8 100644
--- a/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
+++ b/docs/design/2026_04_24_proposed_deploy_via_tailscale.md
@@ -96,12 +96,12 @@ Use OAuth ephemeral nodes (not a long-lived auth key):
 
 - Create an OAuth client in Tailscale admin console with scope
   `auth_keys` (write) on tag `tag:ci-deploy`. (`tailscale/github-action`
-  uses the OAuth client to mint a short-lived auth key on each run;
-  recent action versions may also require `devices:write` so the
-  ephemeral node can register and be cleaned up — consult the action's
-  README for the current scope list. Earlier drafts of this doc named
-  `devices:core`, which is not a supported Tailscale OAuth scope and
-  would fail authentication.)
+  uses the OAuth client to mint a short-lived auth key on each run.)
+  If the pinned action starts returning 403 during device registration
+  or cleanup, add the exact Devices scope named by the action README
+  and the Tailscale OAuth UI for that action version; keep this doc as
+  the rollout contract, not the authority for future Tailscale scope
+  names.
 - Store client ID + secret in GitHub env secrets.
 - `tailscale/github-action@v3` joins the tailnet for the duration of the job
   as an ephemeral tagged node; disconnects automatically on job exit.