Fix the docker channel's negative-control and token-interop CI steps#11
Conversation
The docker channel's relay wrapper does `exec docker run --rm -i --network host ...`. smoke.sh tears the relay down with kill_tree (SIGKILL), which kills the runtime client but leaves the detached container running -- still bound to the relay's port via --network host. The next smoke.sh run in the same job (the workflow's "Negative control" step after "Smoke") then aborts its preflight with "something is already listening on 127.0.0.1:4443". This was latent until the catalog fix let the docker Smoke step pass, so the Negative control step ran for the first time and hit the leftover container. Fix at the source: name the relay container and reap it by name. - clients/docker/moq-relay: run with --name, and `docker rm -f` any leftover before binding (covers a crashed run whose cleanup never fired). - smoke.sh: export the shared container name; cleanup() removes it after kill_tree (no-op when the runtime isn't installed / non-docker channels). - smoke.sh: wait briefly in the port preflight for the host-network socket to release after the container is reaped, instead of failing instantly. Verified locally (non-docker channel) running Smoke then Negative control back-to-back with no manual port clearing: both pass. The docker path is Linux/--network-host only, so it's validated by CI. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
WalkthroughThe changes introduce a stable container name ( 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@smoke.sh`:
- Around line 134-136: The cleanup condition in the RELAY_CONTAINER removal
block currently only checks if the RELAY_CONTAINER variable is set and if
docker/podman exists on PATH, but does not verify that docker was actually
selected as the relay channel for the current test run. Modify the conditional
to additionally check that the relay channel was actually set to docker before
attempting to remove the container, preventing accidental deletion of unrelated
containers when docker exists on the system but was not used for the relay in
this particular test execution.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8f815c8f-0f9a-49e9-bd2d-bde7bfebd1b1
📒 Files selected for processing (2)
clients/docker/moq-relaysmoke.sh
cleanup() removed the named relay container whenever a container runtime was on PATH, so a non-docker run (cargo/apt/brew/nix) on a host that happens to have docker installed could delete an unrelated container named moq-relay-smoke. Gate the reap on the docker wrapper actually being the relay (RELAY path ends in clients/docker/moq-relay), which also covers the podman override since it shares that wrapper. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Token interop ran `rust` (which needs moq-token-cli on PATH) on every channel, but the docker channel installs no such binary -- it covers the Rust verifier via the moqdev/moq-token-cli image (rust-docker). So every rust generator/ verifier combo failed there with "generator/verifier unavailable". This was latent: the docker job used to fail earlier (Negative control) and skip Token interop entirely. With that fixed, Token interop runs on docker for the first time. Drop plain `rust` from the impl set on the docker channel, keeping js-node, js-bun, and rust-docker (which the run log shows cross-verify). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Background
The scheduled smoke job's docker runner had a chain of latent bugs in its later steps that never surfaced before, because an earlier step always failed first and the rest were skipped. Now that #10 fixed the docker
Smokestep (the oldCatalog.fetchAPI), those steps run for the first time. This PR fixes the two that were broken.1. Relay container leaks across steps (Negative control)
The docker relay wrapper (clients/docker/moq-relay) does
exec docker run --rm -i --network host …. smoke.sh tears the relay down withkill_tree(SIGKILL), which kills the docker client but leaves the container running — still bound to port 4443 via--network host. TheSmokeandNegative controlsteps run consecutively in the same job, so the second run's port preflight saw the leftover container and aborted withsomething is already listening on 127.0.0.1:4443.Fix — reap the container at the source:
clients/docker/moq-relay: run with a stable--name, anddocker rm -fany leftover before binding (covers a crashed run whose cleanup never fired).smoke.sh cleanup(): export the shared container name anddocker rm -fit afterkill_tree— gated on the docker wrapper actually being the relay (RELAYends inclients/docker/moq-relay), so a non-docker run on a host that happens to have docker can't delete an unrelated container of that name. Honors theSMOKE_DOCKER=podmanoverride.smoke.shpreflight: briefly wait for the host-network socket to release after the container is reaped, instead of failing instantly.2.
rusttoken impl unavailable on the docker channel (Token interop)Token interop ran the
rustimpl (which needsmoq-token-clion PATH) on every channel, but the docker channel installs no such binary — it covers the Rust verifier via themoqdev/moq-token-cliimage (rust-docker). So everyrustgenerator/verifier combo failed with "generator/verifier unavailable".Fix: drop plain
rustfrom the impl set on the docker channel (.github/workflows/smoke.yml), keepingjs-node,js-bun, andrust-docker— which the CI run log shows already cross-verify. Other channels are unchanged.Verification
SmokeandNegative controlboth passed (Negative control passed on docker for the first time). Also exercised the shared smoke.sh changes locally (non-docker channel) runningSmokethenNegative controlback-to-back with no manual port clearing: both pass.js-node/js-bun/rust-dockertoken combos all PASS; only therustcombos failed (the bug this commit removes).--network hostand the docker token image are Linux-only, so the docker path is validated by CI.Notes for reviewers
moq-relay-smoke, overridable viaMOQ_RELAY_CONTAINER.🤖 Generated with Claude Code