diff --git a/cmd/ephemerd/runtime_windows.go b/cmd/ephemerd/runtime_windows.go
index 6002f23..2035844 100644
--- a/cmd/ephemerd/runtime_windows.go
+++ b/cmd/ephemerd/runtime_windows.go
@@ -51,7 +51,10 @@ func startContainerRuntime(dataDir string, log *slog.Logger, linuxVMEnabled bool
 			DiskSizeGB:          linuxVMDiskSizeGB,
 			DindEnabled:         dindEnabled,
 			DindAllowPrivileged: dindAllowPrivileged,
-			Log:                 log,
+			// Pass the host's data dir so its config.toml rides into the VM
+			// via the boot-initrd tail. See docs/arch/host-config-initrd.md.
+			HostDataDir: dataDir,
+			Log:         log,
 		})
 		if err != nil {
 			log.Warn("Linux VM not started — Linux jobs will not be available on this host", "error", err)
diff --git a/docs/arch/host-config-initrd.md b/docs/arch/host-config-initrd.md
new file mode 100644
index 0000000..f71fcb5
--- /dev/null
+++ b/docs/arch/host-config-initrd.md
@@ -0,0 +1,154 @@
+# Host Config Delivery via Boot-Initrd Tail
+
+> **Status: implemented.** The in-VM ephemerd reads the host's
+> `config.toml`, delivered on every VM boot through the same
+> runtime-generated initrd tail that carries `ephemerd-linux`. Adding a
+> new in-VM-relevant config knob costs zero plumbing: edit the host's
+> config.toml, restart ephemerd, the VM reboots and reads the same TOML.
+
+## Context
+
+Until now, every host-side setting that needed to take effect *inside
+the Linux VM* required its own ad-hoc plumbing across the VM boundary:
+
+1. A field on `vm.LinuxVMConfig` (`DindEnabled`, `DindAllowPrivileged`).
+2. A kernel command-line parameter (`ephemerd.dind=1`,
+   `ephemerd.dind_allow_privileged=1`) appended by
+   `pkg/vm/linuxvm_windows.go`.
+3. A parser for that parameter in the init script
+   (`mage/download/download.go`).
+4. A CLI flag on `ephemerd serve` (`--dind-allow-privileged`) that
+   overrides the in-VM config.
+5. A re-render of the init script + initrd, and a rebuild of the host
+   binary.
+
+PRs #87 (metrics — needed `container_stats_interval` over the boundary
+for the in-VM sampler) and #88 (dind allow-privileged plumbing) both
+had to walk this path. The cost-per-knob is small but real, and the
+pattern doesn't scale.
+
+## The mechanism
+
+ephemerd already rebuilds the boot initrd **on every VM start**:
+`pkg/vm.buildBootInitrd` appends a small gzipped cpio tail containing
+`/assets/ephemerd-linux` to the build-time base initrd. The kernel
+concatenates initrd cpio archives, so files in the appended tail
+override or add to the base. That is how a fresh `go build` of
+ephemerd.exe delivers a new Linux binary into the VM without an
+initrd rebuild.
+
+This feature adds one more file to that tail: the host's
+`config.toml`, when it exists, lands at `/assets/config.toml`. The
+init script stages it to `/etc/ephemerd/config.toml` (mode 0600) and
+passes `--config /etc/ephemerd/config.toml` to the in-VM `ephemerd
+serve`. The in-VM daemon then reads the same TOML the host reads.
+
+Because the tail is regenerated on every VM boot and a VM boot happens
+on every ephemerd service start, "edit config.toml + restart the
+service" is the complete update procedure. Same semantics as the host
+daemon itself.
+
+## Why not a live file share (Plan9 / virtio-fs)
+
+The first draft of this feature exposed the host data dir to the VM as
+a Hyper-V Plan9 share. It failed in two independent ways, the first of
+which took down Linux CI on the dev rig for ~100 minutes:
+
+1. **The HCS document was rejected at VM start** (`HcsStartComputeSystem:
+   HRESULT 0xc0370110`) — the `Plan9` device JSON we constructed did not
+   match the schema HCS expects at creation time. The VM never booted,
+   ephemerd logged a single WARN, and every `[self-hosted linux x64]`
+   job sat queued while the host poll loop skipped them with "OS labels
+   don't match this platform."
+2. **More fundamentally, the guest could never have mounted it.**
+   Hyper-V serves Plan9 shares over **hvsock**, not virtio — there is no
+   virtio-9p device on HCS. Mainline `mount -t 9p` supports
+   `trans=virtio|tcp|fd|...` but has no hvsock transport; LCOW's GCS
+   daemon makes this work by opening an `AF_VSOCK` socket itself and
+   passing the fd via `trans=fd`. Replicating that means a vsock dialer +
+   mount helper in the guest — real machinery, for a file we read
+   exactly once at boot.
+
+A live share buys *continuous* visibility of host files. We need a
+*boot-time snapshot* of one file. The initrd tail already exists, is
+exercised on every boot, has no new kernel or transport surface, and
+fails in exactly one obvious way (file missing → defaults).
+
+virtio-fs is the natural choice on Apple Vz for the Darwin equivalent
+(Vz exposes virtio-fs directly) — that remains the plan for macOS,
+tracked separately.
+
+## Security
+
+- `config.toml` can contain webhook secrets. It is written into the
+  cpio tail with mode 0600 and staged in the VM at
+  `/etc/ephemerd/config.toml` with mode 0600, root-owned. Job
+  containers never see the VM's host rootfs — they get only the bind
+  mounts the runtime hands them.
+- The boot initrd lives at `\vm\linux\initrd` on the host —
+  the same ACL domain as `config.toml` itself, so embedding the config
+  does not widen host-side exposure.
+- The GitHub App private key is **not** carried into the VM:
+  `private_key_path` in config.toml names a file outside the data dir,
+  and only the TOML text crosses the boundary, not referenced files.
+  The in-VM worker (`--containerd-only`) never constructs a GitHub
+  client, so the path string sits inert.
+
+## What the in-VM daemon actually reads
+
+The worker-mode code path dereferences a narrow slice of the config:
+
+- `[dind]` — `enabled`, `allow_privileged`, cache settings.
+- `[runtime.rlimits]` — per-container nofile, etc.
+- `[log]` — log level/format.
+
+Everything else (`[github]`, `[runner.windows]`, `[metrics]`,
+`[vm.linux]`, `[webhook]`, tunnels, repo lists) is parsed into the
+in-memory config but never read in worker mode. Worker mode returns
+before the metrics server, providers, scheduler, and VM-boot blocks in
+`serve()`, so a host config with `[metrics] enabled = true` does NOT
+start a second metrics listener inside the VM. Future changes to
+worker mode should preserve that invariant — the host scrapes in-VM
+container stats via the Dispatch stream (#87) precisely so the VM
+needs no listener of its own.
+
+## Fallback
+
+When `config.toml` doesn't exist on the host (fresh install before
+first write), `buildBootInitrd` skips the entry and the init script
+sees no `/assets/config.toml` — the in-VM daemon runs on its compiled
+defaults plus the kernel-cmdline `ephemerd.dind*` flags from #88,
+which are retained for exactly this case. Once a config exists, the
+two sources agree: the cmdline flags force the same values they always
+did, and `--config` only adds settings the flags don't cover.
+
+## Failure modes worth knowing
+
+- **Host config unreadable** (ACL mishap): treated as missing —
+  defaults + cmdline. The init banner logs `host_config=` empty.
+- **Malformed TOML on the host**: the host daemon itself fails to start
+  first (it parses the same file), so a broken config never reaches a
+  running VM in practice.
+- **Operator edits config.toml while the VM is running**: not picked up
+  until the next VM boot. Restart the ephemerd service.
+- **Secrets rotation**: same story — restart the service; the initrd
+  tail is regenerated with the new file on every boot.
+
+## Lessons recorded
+
+- **Deploying a draft build to the only Linux CI host turns "VM won't
+  boot" into "CI is silently down."** The only symptom was a DEBUG-level
+  skip log. Follow-up worth doing: a WARN (or health-endpoint signal)
+  when Linux-labeled jobs are queued but the Linux dispatcher is
+  unavailable.
+- **HCS document changes need a boot test before deploy.** `0xc0370110`
+  arrives at start time, not at document-build time; nothing in `mage
+  ci` exercises it. A future smoke target that creates + starts a
+  minimal VM would catch this class.
+
+## File pointers
+
+- Tail construction: `pkg/vm/initrd_windows.go` (`buildBootInitrd`)
+- Call site + config path resolution: `pkg/vm/linuxvm_windows.go`
+- VM-side staging: init script in `mage/download/download.go`
+- Field: `vm.LinuxVMConfig.HostDataDir` in `pkg/vm/vm.go`
diff --git a/mage/download/download.go b/mage/download/download.go
index a980e59..26a6116 100644
--- a/mage/download/download.go
+++ b/mage/download/download.go
@@ -1618,8 +1618,27 @@ if [ "$DIND" = "1" ]; then
         DIND_FLAG="$DIND_FLAG --dind-allow-privileged"
     fi
 fi
-echo "ephemerd-init: launching ephemerd-linux (dind=$DIND allow_privileged=$DIND_ALLOW_PRIV)"
+
+# Host config rides in via the runtime-generated initrd tail (the same
+# mechanism that delivers ephemerd-linux — see pkg/vm.buildBootInitrd).
+# When present, copy it into the VM rootfs and point ephemerd at it so
+# every host-side setting (dind.*, runtime.rlimits, future knobs) takes
+# effect on this VM boot with no per-setting plumbing. When absent
+# (fresh install before config.toml exists), the kernel-cmdline flags
+# above keep the in-VM daemon working on defaults.
+# See docs/arch/host-config-initrd.md.
+CONFIG_FLAG=""
+if [ -f /assets/config.toml ]; then
+    mkdir -p /newroot/etc/ephemerd
+    cp /assets/config.toml /newroot/etc/ephemerd/config.toml
+    chmod 600 /newroot/etc/ephemerd/config.toml
+    CONFIG_FLAG="--config /etc/ephemerd/config.toml"
+    echo "ephemerd-init: host config staged at /etc/ephemerd/config.toml"
+fi
+
+echo "ephemerd-init: launching ephemerd-linux (dind=$DIND allow_privileged=$DIND_ALLOW_PRIV host_config=${CONFIG_FLAG:+yes})"
 exec switch_root /newroot /usr/local/bin/ephemerd-linux serve \
+    $CONFIG_FLAG \
     --data-dir /var/lib/ephemerd \
     --containerd-tcp-port "$CONTAINERD_PORT" \
     --containerd-tcp-addr 0.0.0.0 \
diff --git a/pkg/vm/initrd_windows.go b/pkg/vm/initrd_windows.go
index aae13f7..fd7ae44 100644
--- a/pkg/vm/initrd_windows.go
+++ b/pkg/vm/initrd_windows.go
@@ -11,14 +11,17 @@ import (
 )
 
 // buildBootInitrd produces the initrd the VM actually boots with by appending
-// a tiny cpio archive containing /assets/ephemerd-linux to the embedded base
-// initrd. The Linux kernel concatenates initrd cpios into a single initramfs,
-// so files in the appended cpio override (or add to) those in the base. This
-// lets a fresh `go build` of ephemerd.exe deliver a new ephemerd-linux to the
-// VM without any initrd rebuild — the build-time initrd contains only the
-// boot scaffolding (busybox, modules, init script), and the binary itself
-// rides in via the runtime-generated tail.
-func buildBootInitrd(basePath, ephemerdLinuxPath, destPath string) error {
+// a tiny cpio archive containing /assets/ephemerd-linux — and, when
+// hostConfigPath is non-empty and readable, /assets/config.toml — to the
+// embedded base initrd. The Linux kernel concatenates initrd cpios into a
+// single initramfs, so files in the appended cpio override (or add to) those
+// in the base. This lets a fresh `go build` of ephemerd.exe deliver a new
+// ephemerd-linux to the VM without any initrd rebuild, and lets the host's
+// config.toml reach the in-VM daemon on every boot with no per-setting
+// plumbing — the build-time initrd contains only the boot scaffolding
+// (busybox, modules, init script); the binary and config ride in via the
+// runtime-generated tail.
+func buildBootInitrd(basePath, ephemerdLinuxPath, hostConfigPath, destPath string) error {
 	baseData, err := os.ReadFile(basePath)
 	if err != nil {
 		return fmt.Errorf("reading base initrd: %w", err)
@@ -27,6 +30,16 @@ func buildBootInitrd(basePath, ephemerdLinuxPath, destPath string) error {
 	if err != nil {
 		return fmt.Errorf("reading ephemerd-linux: %w", err)
 	}
+	// Host config is best-effort: a missing config.toml (fresh install
+	// before first write, or tests) simply means the in-VM daemon runs on
+	// defaults + kernel-cmdline flags, same as before this feature.
+	var cfgData []byte
+	if hostConfigPath != "" {
+		cfgData, err = os.ReadFile(hostConfigPath)
+		if err != nil {
+			cfgData = nil
+		}
+	}
 
 	var tail bytes.Buffer
 	gw := gzip.NewWriter(&tail)
@@ -39,6 +52,14 @@ func buildBootInitrd(basePath, ephemerdLinuxPath, destPath string) error {
 	if err := writeCPIOEntry(gw, "assets/ephemerd-linux", 0o100755, binData, ""); err != nil {
 		return fmt.Errorf("cpio: ephemerd-linux: %w", err)
 	}
+	if cfgData != nil {
+		// 0600: config.toml can carry webhook secrets. Inside the VM it's
+		// only readable by root, and job containers never see the host
+		// rootfs — but no reason to be sloppy with the mode.
+		if err := writeCPIOEntry(gw, "assets/config.toml", 0o100600, cfgData, ""); err != nil {
+			return fmt.Errorf("cpio: config.toml: %w", err)
+		}
+	}
 	if err := writeCPIOEntry(gw, "TRAILER!!!", 0, nil, ""); err != nil {
 		return fmt.Errorf("cpio: trailer: %w", err)
 	}
diff --git a/pkg/vm/initrd_windows_test.go b/pkg/vm/initrd_windows_test.go
index f7f3c91..6dee370 100644
--- a/pkg/vm/initrd_windows_test.go
+++ b/pkg/vm/initrd_windows_test.go
@@ -87,7 +87,7 @@ func TestBuildBootInitrd_AppendsEphemerdLinux(t *testing.T) {
 	}
 
 	destPath := filepath.Join(dir, "initrd")
-	if err := buildBootInitrd(basePath, binPath, destPath); err != nil {
+	if err := buildBootInitrd(basePath, binPath, "", destPath); err != nil {
 		t.Fatalf("buildBootInitrd: %v", err)
 	}
 
@@ -137,7 +137,7 @@ func TestBuildBootInitrd_MissingBase(t *testing.T) {
 	if err := os.WriteFile(binPath, []byte("data"), 0o755); err != nil {
 		t.Fatalf("writing binary: %v", err)
 	}
-	err := buildBootInitrd(filepath.Join(dir, "missing-base"), binPath, filepath.Join(dir, "out"))
+	err := buildBootInitrd(filepath.Join(dir, "missing-base"), binPath, "", filepath.Join(dir, "out"))
 	if err == nil {
 		t.Error("expected error for missing base initrd")
 	}
@@ -149,12 +149,88 @@ func TestBuildBootInitrd_MissingBinary(t *testing.T) {
 	if err := writeGzippedCPIO(basePath, map[string][]byte{"x": []byte("y")}); err != nil {
 		t.Fatalf("writing base: %v", err)
 	}
-	err := buildBootInitrd(basePath, filepath.Join(dir, "missing-bin"), filepath.Join(dir, "out"))
+	err := buildBootInitrd(basePath, filepath.Join(dir, "missing-bin"), "", filepath.Join(dir, "out"))
 	if err == nil {
 		t.Error("expected error for missing ephemerd-linux")
 	}
 }
 
+func TestBuildBootInitrd_AppendsHostConfig(t *testing.T) {
+	dir := t.TempDir()
+	basePath := filepath.Join(dir, "initrd-base")
+	if err := writeGzippedCPIO(basePath, map[string][]byte{"x": []byte("y")}); err != nil {
+		t.Fatalf("writing base: %v", err)
+	}
+	binPath := filepath.Join(dir, "ephemerd-linux")
+	if err := os.WriteFile(binPath, []byte("elf"), 0o755); err != nil {
+		t.Fatalf("writing binary: %v", err)
+	}
+	cfgPath := filepath.Join(dir, "config.toml")
+	cfgBody := []byte("[dind]\nenabled = true\nallow_privileged = true\n")
+	if err := os.WriteFile(cfgPath, cfgBody, 0o600); err != nil {
+		t.Fatalf("writing config: %v", err)
+	}
+
+	destPath := filepath.Join(dir, "initrd")
+	if err := buildBootInitrd(basePath, binPath, cfgPath, destPath); err != nil {
+		t.Fatalf("buildBootInitrd: %v", err)
+	}
+
+	got, err := os.ReadFile(destPath)
+	if err != nil {
+		t.Fatalf("reading boot initrd: %v", err)
+	}
+	baseData, err := os.ReadFile(basePath)
+	if err != nil {
+		t.Fatalf("reading base: %v", err)
+	}
+	gr, err := gzip.NewReader(bytes.NewReader(got[len(baseData):]))
+	if err != nil {
+		t.Fatalf("appended tail is not gzip: %v", err)
+	}
+	defer func() { _ = gr.Close() }()
+	cpio, err := io.ReadAll(gr)
+	if err != nil {
+		t.Fatalf("reading appended cpio: %v", err)
+	}
+	if !bytes.Contains(cpio, []byte("assets/config.toml")) {
+		t.Error("appended cpio does not contain assets/config.toml path")
+	}
+	if !bytes.Contains(cpio, cfgBody) {
+		t.Error("appended cpio does not contain config body")
+	}
+}
+
+// TestBuildBootInitrd_MissingHostConfigIsNotFatal asserts the no-config
+// path: a fresh install where config.toml doesn't exist yet must still
+// produce a bootable initrd (in-VM daemon runs on defaults + cmdline).
+func TestBuildBootInitrd_MissingHostConfigIsNotFatal(t *testing.T) {
+	dir := t.TempDir()
+	basePath := filepath.Join(dir, "initrd-base")
+	if err := writeGzippedCPIO(basePath, map[string][]byte{"x": []byte("y")}); err != nil {
+		t.Fatalf("writing base: %v", err)
+	}
+	binPath := filepath.Join(dir, "ephemerd-linux")
+	if err := os.WriteFile(binPath, []byte("elf"), 0o755); err != nil {
+		t.Fatalf("writing binary: %v", err)
+	}
+
+	destPath := filepath.Join(dir, "initrd")
+	if err := buildBootInitrd(basePath, binPath, filepath.Join(dir, "nonexistent-config.toml"), destPath); err != nil {
+		t.Fatalf("buildBootInitrd should tolerate a missing host config: %v", err)
+	}
+	got, err := os.ReadFile(destPath)
+	if err != nil {
+		t.Fatalf("reading boot initrd: %v", err)
+	}
+	if bytes.Contains(got[len(got)/2:], []byte("assets/config.toml")) {
+		// Cheap sanity: the tail shouldn't reference a config we never had.
+		// (Scan the back half only — the base could theoretically contain
+		// the string, though our fixture doesn't.)
+		t.Error("initrd tail references assets/config.toml despite missing source")
+	}
+}
+
 // writeGzippedCPIO is a test helper that emits a tiny valid gzipped newc cpio
 // archive containing the given files.
 func writeGzippedCPIO(path string, files map[string][]byte) error {
diff --git a/pkg/vm/linuxvm_windows.go b/pkg/vm/linuxvm_windows.go
index e2e985b..4047e71 100644
--- a/pkg/vm/linuxvm_windows.go
+++ b/pkg/vm/linuxvm_windows.go
@@ -287,13 +287,20 @@ func (l *hypervLinuxVM) extractAssets() error {
 	}
 
 	// Build the final initrd by appending a cpio with /assets/ephemerd-linux
-	// to the base initrd. Cheap (a few file ops + gzip on a single 240MB blob)
-	// and idempotent, so we run it on every start. HCS reads InitRdPath only
-	// when the VM is created, so this must happen before createAndBootVM.
+	// (and the host's config.toml when present) to the base initrd. Cheap
+	// (a few file ops + gzip on a single 240MB blob) and idempotent, so we
+	// run it on every start. HCS reads InitRdPath only when the VM is
+	// created, so this must happen before createAndBootVM. Re-running on
+	// every boot is what gives "edit config.toml + restart ephemerd =
+	// in-VM daemon sees the change" with no per-setting plumbing.
 	bootInitrd := filepath.Join(l.vmDir, "initrd")
 	baseInitrd := filepath.Join(l.vmDir, "initrd-base")
 	ephemerdBin := filepath.Join(l.vmDir, "ephemerd-linux")
-	if err := buildBootInitrd(baseInitrd, ephemerdBin, bootInitrd); err != nil {
+	hostConfig := ""
+	if l.cfg.HostDataDir != "" {
+		hostConfig = filepath.Join(l.cfg.HostDataDir, "config.toml")
+	}
+	if err := buildBootInitrd(baseInitrd, ephemerdBin, hostConfig, bootInitrd); err != nil {
 		return fmt.Errorf("building boot initrd: %w", err)
 	}
 	if err := grantVmFileAccess(bootInitrd); err != nil {
@@ -517,6 +524,10 @@ func (l *hypervLinuxVM) createAndBootVM() error {
 	// /dev/sda as root and find init there, which fails on an unformatted VHDX.
 	// - 8250_core: enable serial UART for console output via named pipe
 	// - ephemerd.*: custom params parsed by our init script
+	// The dind* params are redundant when the host's config.toml rides in
+	// via the initrd tail (the in-VM ephemerd reads the same [dind]
+	// section the host does). Kept as a fallback for the no-config path —
+	// fresh installs where config.toml doesn't exist yet.
 	dindFlag := ""
 	if l.cfg.DindEnabled {
 		dindFlag = " ephemerd.dind=1"
@@ -589,6 +600,7 @@ func (l *hypervLinuxVM) createAndBootVM() error {
 		},
 	}
 
+
 	l.cfg.Log.Info("creating Hyper-V Linux VM",
 		"name", l.vmName,
 		"cpus", l.cfg.CPUs,
diff --git a/pkg/vm/vm.go b/pkg/vm/vm.go
index adfb93f..91b220e 100644
--- a/pkg/vm/vm.go
+++ b/pkg/vm/vm.go
@@ -47,12 +47,21 @@ type LinuxVMConfig struct {
 	DindEnabled bool
 
 	// DindAllowPrivileged forwards the host's dind.allow_privileged setting
-	// to the in-VM ephemerd via the kernel cmdline. Without this, the in-VM
-	// daemon reads its own (minimal) config and Linux defaults to false,
-	// rejecting `docker run --privileged` siblings even when the host
-	// operator explicitly opted in.
+	// to the in-VM ephemerd via the kernel cmdline. Kept as a fallback for
+	// the no-config path (fresh installs where the host's config.toml does
+	// not exist yet); otherwise the config.toml carried in the boot-initrd
+	// tail supplies the same value.
 	DindAllowPrivileged bool
 
+	// HostDataDir is the host's ephemerd data directory. When set, the
+	// config.toml in that directory is appended into the runtime-generated
+	// boot-initrd tail (next to ephemerd-linux) and staged at
+	// /etc/ephemerd/config.toml inside the VM, so any host-side setting
+	// takes effect on the next VM boot without per-setting kernel cmdline
+	// plumbing. A missing config.toml is non-fatal (fresh installs run on
+	// defaults). See docs/arch/host-config-initrd.md.
+	HostDataDir string
+
 	Log *slog.Logger
 }