Skip to content

ci(kubernetes): add kube gateway e2e tests and gated CI workflow #1251

Merged: TaylorMutch merged 5 commits into main from tmutch/kube-e2e-ci-take2 on May 11, 2026

@TaylorMutch
Collaborator

@TaylorMutch TaylorMutch commented May 7, 2026

Summary

Adds a Kubernetes e2e harness (mise run e2e:kubernetes) and a Branch Kubernetes E2E workflow gated on test:e2e-kubernetes, so Helm chart and gateway packaging changes can be exercised end-to-end on demand against a real kind cluster.

Related Issue

N/A — infrastructure follow-up to the earlier kube gateway e2e work.

Changes

Kube e2e harness

  • New e2e/with-kube-gateway.sh wrapper:
    • If OPENSHELL_E2E_KUBE_CONTEXT is set, installs the chart into an ephemeral namespace on the existing context (CI path).
    • Otherwise creates a local k3d cluster via tasks/scripts/helm-k3s-local.sh and tears it down on exit (dev path).
    • Imports locally available gateway/supervisor images (also for existing k3d clusters), helm-installs with ci/values-tls-disabled.yaml, port-forwards svc/openshell, registers a plaintext gateway, and runs the supplied command with OPENSHELL_E2E_DRIVER=kubernetes.
    • Captures pod state, events, gateway logs, and port-forward logs on failure for debugging.
    • Waits for namespace deletion on cleanup so back-to-back runs don't race.
  • New e2e/rust/e2e-kubernetes.sh that builds openshell-cli and runs the Rust e2e tests through the wrapper.
  • New e2e:kubernetes mise task wired up in tasks/test.toml.
  • New .github/workflows/branch-kubernetes-e2e.yml:
    • Triggers on pull-request/* push and workflow_dispatch.
    • Gates via ./.github/actions/pr-gate on test:e2e-kubernetes.
    • Builds gateway and supervisor Docker images via the reusable docker-build.yml workflow.
    • Provisions a kind cluster with helm/kind-action, materializes the kubeconfig at the mise-expected path, side-loads images tagged with ${{ github.sha }}, and runs mise run --no-deps --skip-deps e2e:kubernetes.
  • Extends .github/workflows/e2e-label-help.yml to post the next-step hint when test:e2e-kubernetes is applied.
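
The two wrapper paths described above (existing context in CI, local k3d in development) hinge on whether OPENSHELL_E2E_KUBE_CONTEXT is set. A minimal sketch of that decision follows. The function name and the labels it prints are hypothetical, chosen only for illustration; the actual contents of e2e/with-kube-gateway.sh are not shown on this page.

```shell
# Hypothetical helper: decide which cluster path the wrapper takes.
# Only OPENSHELL_E2E_KUBE_CONTEXT comes from the PR description; the
# function name and the printed labels are made up for this sketch.
select_cluster_mode() {
  if [ -n "${OPENSHELL_E2E_KUBE_CONTEXT:-}" ]; then
    # CI path: reuse the existing context and install the chart
    # into an ephemeral namespace.
    echo "existing-context"
  else
    # Dev path: create a local k3d cluster and tear it down on exit.
    echo "local-k3d"
  fi
}
```

Treating an empty value the same as an unset one (via `${VAR:-}`) is a choice of this sketch, made so that a blank variable in a CI environment does not get mistaken for a context name.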

Testing

  • mise run pre-commit passes
  • Smoke regression run passes locally
  • test:e2e-kubernetes label is applied so the new Branch Kubernetes E2E workflow runs on this PR

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable) — N/A (CI/test infrastructure only)

@copy-pr-bot

copy-pr-bot Bot commented May 7, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@TaylorMutch TaylorMutch force-pushed the tmutch/kube-e2e-ci-take2 branch 6 times, most recently from d55f224 to 3982a35 on May 8, 2026 23:21
Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
Adds a label-gated GitHub Actions workflow that exercises the Helm
chart end-to-end against the Rust e2e suite via `mise run e2e:helm`.

Pipeline:
- pr_metadata gates on the `test:e2e-helm` label via the pr-gate action.
- build-gateway / build-supervisor build and push Docker images using
  the reusable docker-build.yml workflow.
- helm-e2e (bare runner): apt-installs z3 build deps so cargo can
  compile the openshell-policy crate's z3-sys backend, creates a kind
  cluster via helm/kind-action, materializes the kind kubeconfig at the
  path mise's [env] block expects, side-loads the freshly built
  gateway/supervisor images, applies
  deploy/kube/manifests/agent-sandbox.yaml so the
  sandboxes.agents.x-k8s.io CRD and reconciling StatefulSet are in
  place, and finally runs `mise run e2e:helm`.

Also expands the `e2e:helm` task to run the full Rust e2e suite
(matching `e2e:podman`) instead of only the smoke test, with
OPENSHELL_E2E_KUBE_TEST as an opt-in single-test override for local
debugging.
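
The opt-in override could be wired roughly as below. This is a sketch under assumptions: the commit message names OPENSHELL_E2E_KUBE_TEST but does not say how its value is consumed, so passing it to cargo's `--test` flag (which selects one integration test target) is a guess, and the function name is hypothetical.

```shell
# Hypothetical helper: build the extra arguments for `cargo test`.
# OPENSHELL_E2E_KUBE_TEST is from the commit message; treating its value
# as an integration test target name for --test is an assumption.
e2e_test_args() {
  if [ -n "${OPENSHELL_E2E_KUBE_TEST:-}" ]; then
    # Override set: restrict the run to the named test target.
    echo "--test ${OPENSHELL_E2E_KUBE_TEST}"
  else
    # No override: emit nothing, so cargo runs every test target.
    echo ""
  fi
}
```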

Extends the e2e-label-help workflow so applying `test:e2e-helm` posts
the next-step hint pointing at this workflow.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
@TaylorMutch TaylorMutch force-pushed the tmutch/kube-e2e-ci-take2 branch from 3982a35 to 7c6abc5 on May 11, 2026 15:24
Comment thread e2e/rust/tests/host_gateway_alias.rs Outdated
Comment thread tasks/test.toml Outdated
Six tests previously skipped on the kubernetes driver — three in
host_gateway_alias.rs plus the forward_proxy_l7 + graphql_l7 cases —
relied on `host.openshell.internal` reaching either a host process or a
sibling docker container. The Helm chart already supports this via
`server.hostGatewayIP`, but the e2e wrapper never set it.

- with-kube-gateway.sh: auto-detect the host-routable IP (CoreDNS
  `host.k3d.internal` first to handle Docker Desktop, docker network
  gateway as a fallback for kind on Linux CI) and pass it to helm
  install. Also import locally-built images into existing k3d clusters
  and wait for namespace deletion to complete before exit.
- e2e/rust/Cargo.toml + e2e-helm.sh: add an `e2e-host-gateway` feature
  that gates the three test files; docker and podman runs imply it,
  helm runs opt in by default (overridable via
  OPENSHELL_E2E_HELM_FEATURES for remote clusters where the test host
  is unreachable from pods).
- Drop the `skip_if_kube` helper and its callers — feature gating now
  decides whether the tests compile in.

Verified against a local k3d cluster: all six previously-skipped tests
pass, smoke regression intact.
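
The "opt in by default, overridable" behavior can be sketched with a shell default expansion. The variable is called OPENSHELL_E2E_HELM_FEATURES in this commit and is renamed to OPENSHELL_E2E_KUBERNETES_FEATURES later in the PR; the sketch uses the final name. The function name is hypothetical, and the handling of an explicitly empty value is an assumption rather than something the commit message states.

```shell
# Hypothetical helper: resolve the cargo feature list for a kubernetes run.
# The env var and the e2e-host-gateway feature are named in the PR; the
# unset-versus-empty handling is an assumption of this sketch.
e2e_kube_features() {
  # ${VAR-default} (no colon) falls back only when the variable is unset,
  # so exporting it as "" opts out of the host-gateway tests, e.g. for a
  # remote cluster whose pods cannot reach the test host.
  echo "${OPENSHELL_E2E_KUBERNETES_FEATURES-e2e-host-gateway}"
}
```

The printed value would then be handed to cargo's `--features` flag by the runner script.
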
The runner exercises the gateway on Kubernetes; helm is just the
deployment mechanism. Names now describe the target environment.

- mise task: `e2e:helm` -> `e2e:kubernetes`
- script: e2e/rust/e2e-helm.sh -> e2e/rust/e2e-kubernetes.sh
- env var: OPENSHELL_E2E_HELM_FEATURES -> OPENSHELL_E2E_KUBERNETES_FEATURES
- workflow: branch-helm-e2e.yml -> branch-kubernetes-e2e.yml
  (display name "Branch Kubernetes E2E", job "kubernetes-e2e")
- PR gate label: test:e2e-helm -> test:e2e-kubernetes (e2e-label-help
  hint workflow updated to match)

PR #1251 needs the new label applied; the old label can be removed.
@TaylorMutch TaylorMutch added the test:e2e-kubernetes label (Requires Kubernetes end-to-end coverage) and removed the test:e2e-helm label on May 11, 2026
@TaylorMutch TaylorMutch changed the title from "ci(helm): add kube gateway e2e tests and gated CI workflow" to "ci(kubernetes): add kube gateway e2e tests and gated CI workflow" on May 11, 2026
kind dual-stacks its network; the wrapper's awk picked the first
non-empty gateway, which was the IPv6 entry (fc00:...). Sandbox pods
can't reach the test host's IPv4 listener through that, so the L7 and
host-gateway-alias tests failed in CI even though they passed locally
against k3d (where host.k3d.internal in CoreDNS short-circuits the
docker-network fallback).

Restrict the awk filter to IPv4 octets.
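
An IPv4-only filter of the kind described here might look like the following. It is an illustrative sketch, not the wrapper's actual awk program (which this page does not show), and the function name is hypothetical. It assumes the gateway candidates arrive one per line on stdin.

```shell
# Hypothetical helper: print the first IPv4 address among gateway
# candidates read from stdin, one per line. Empty lines and IPv6
# entries (such as kind's fc00:... gateway) do not match and are skipped.
first_ipv4_gateway() {
  awk '/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print; exit }'
}
```

One plausible way to feed it, assuming kind's default docker network name `kind`, is `docker network inspect kind --format '{{range .IPAM.Config}}{{println .Gateway}}{{end}}' | first_ipv4_gateway`. The pattern checks only the dotted-quad shape, not that each octet is at most 255, which is enough to separate IPv4 from IPv6 gateways.
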
@TaylorMutch
Collaborator Author

/ok to test 84b114f

@TaylorMutch TaylorMutch requested a review from drew May 11, 2026 19:32
@TaylorMutch TaylorMutch merged commit 59475aa into main May 11, 2026
45 of 65 checks passed
@TaylorMutch TaylorMutch deleted the tmutch/kube-e2e-ci-take2 branch May 11, 2026 19:57

Labels

test:e2e-kubernetes Requires Kubernetes end-to-end coverage

2 participants