feat: native multi-platform builds via runner matrix (no QEMU emulation) #435

Open
tgenov wants to merge 41 commits into devcontainers:main from tgenov:main

@tgenov tgenov commented Feb 26, 2026

Problem

The current multi-platform build support (platform input) relies on QEMU emulation through Docker buildx. This works but has significant drawbacks:

  • Performance: Cross-architecture emulation is 5-10x slower than native builds. A linux/arm64 build on an amd64 runner can take 30+ minutes for large images.
  • Reliability: QEMU emulation can produce subtle runtime differences and occasionally fails on complex build steps (e.g., compiling native extensions).
  • Cost: The long build times consume more CI minutes than necessary.

GitHub Actions and Azure DevOps both offer native ARM runners (ubuntu-24.04-arm, ARM64 pool), making emulation unnecessary if the action supports a matrix-based workflow where each platform builds natively on its own runner.

Solution

Note: This PR description was updated to reflect the current implementation. The original approach used platformTag/mergeTag inputs on the existing action. Based on review feedback, the design evolved to use useNativeRunner + dedicated merge actions. See the edit history for the previous direction.

The implementation uses a clean separation of concerns:

  1. useNativeRunner boolean on the existing build action/task: signals that the build is running on a native runner for a single platform. When true, platform must be a single value (e.g., linux/amd64) and the image tag suffix is auto-derived (linux/amd64 -> linux-amd64). The --platform flag is not passed to devcontainer build, relying on the native runner architecture.

  2. Dedicated merge action/task — a separate devcontainers/ci/merge GitHub Action and DevcontainersMerge@0 Azure DevOps task that combines per-platform images into a multi-arch manifest using docker buildx imagetools create.
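
The native-runner rule in item 1 can be sketched roughly as below. This is an illustration under assumptions, not the code in this PR: `BuildInputs` and `resolvePlatformArgs` are hypothetical names, and the way the action actually assembles its `devcontainer build` arguments may differ.

```typescript
// Hypothetical sketch of the useNativeRunner rule; names are illustrative only.
interface BuildInputs {
  platform?: string; // e.g. "linux/amd64" or "linux/amd64,linux/arm64"
  useNativeRunner: boolean;
}

// Returns the extra CLI arguments for the build step.
// Native mode: platform must be a single value and --platform is omitted.
// QEMU mode: platform is forwarded unchanged.
function resolvePlatformArgs(inputs: BuildInputs): string[] {
  const platform = (inputs.platform ?? '').trim();
  if (inputs.useNativeRunner) {
    if (platform === '' || platform.includes(',')) {
      throw new Error(
        `useNativeRunner requires a single platform, got "${platform}"`,
      );
    }
    return []; // rely on the runner's own architecture
  }
  return platform === '' ? [] : ['--platform', platform];
}
```

With `useNativeRunner: true` and `platform: linux/amd64` this yields no extra arguments, while the same platform without the flag is still forwarded as `--platform linux/amd64`, which is what keeps the existing QEMU path unchanged.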

Example Workflow (GitHub Actions)

jobs:
  build:
    strategy:
      matrix:
        include:
          - runner: ubuntu-latest
            platform: linux/amd64
          - runner: ubuntu-24.04-arm
            platform: linux/arm64
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/setup-buildx-action@v3
      - uses: devcontainers/ci@v0.3
        with:
          imageName: ghcr.io/org/repo/devcontainer
          imageTag: latest
          platform: ${{ matrix.platform }}
          useNativeRunner: true
          push: always

  manifest:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/setup-buildx-action@v3
      - uses: devcontainers/ci/merge@v0.3
        with:
          imageName: ghcr.io/org/repo/devcontainer
          platforms: linux/amd64,linux/arm64

Scope of Work

GitHub Action (action.yml)

  • Add useNativeRunner boolean input
  • When useNativeRunner=true: validate single platform, auto-derive tag suffix via platformToTagSuffix(), skip --platform flag, push platform-suffixed images

GitHub Action — Merge (merge/action.yml)

  • New dedicated action for multi-arch manifest creation
  • Accepts imageName, imageTag, platforms inputs
  • Uses docker buildx imagetools create to combine per-platform images

Azure DevOps Task (DevcontainersCi)

  • Add useNativeRunner boolean input
  • Mirror the same native runner logic as the GitHub Action

Azure DevOps Task — Merge (DevcontainersMerge)

  • New dedicated task for multi-arch manifest creation
  • Registered in vss-extension.json

Common

  • Add platformToTagSuffix() to common/src/platform.ts, which converts linux/amd64 -> linux-amd64
  • Update mergeMultiPlatformImages() to accept standard platform format and auto-derive suffixes
  • createMultiPlatformImage in common/src/docker.ts — validates/trims input, includes stderr/stdout in errors
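
A minimal sketch of the suffix derivation described above. The body of `platformToTagSuffix` is a guess based on the stated `linux/amd64 -> linux-amd64` conversion, `perPlatformImages` is a hypothetical helper, and joining tag and suffix with `-` is an assumption; the real helpers in common/src/platform.ts may behave differently.

```typescript
// Illustrative only: the real helper lives in common/src/platform.ts and may differ.
function platformToTagSuffix(platform: string): string {
  // Docker tags cannot contain "/", so every slash becomes a dash.
  return platform.trim().replace(/\//g, '-');
}

// Hypothetical helper: expand a standard platform list into per-platform
// image references. The "<tag>-<suffix>" layout is an assumption.
function perPlatformImages(
  imageName: string,
  imageTag: string,
  platforms: string,
): string[] {
  return platforms
    .split(',')
    .map(p => p.trim())
    .filter(p => p.length > 0)
    .map(p => `${imageName}:${imageTag}-${platformToTagSuffix(p)}`);
}
```

Under these assumptions, `perPlatformImages('ghcr.io/org/repo/devcontainer', 'latest', 'linux/amd64,linux/arm64')` produces one reference per platform, which is the list a merge step would combine.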

Tests

  • Unit tests for platformToTagSuffix, buildImageNames, mergeMultiPlatformImages (16 tests)
  • Unit tests for createMultiPlatformImage (7 tests)

Documentation

  • Update docs/github-action.md with useNativeRunner input + merge action reference
  • Update docs/azure-devops-task.md with useNativeRunner input + merge task reference
  • Update docs/multi-platform-builds.md with "How it works" section, revised examples for both platforms

Design Notes

Why useNativeRunner instead of extending platform?

The existing platform input (e.g., linux/amd64,linux/arm64) triggers a single QEMU-emulated build. useNativeRunner is a mode signal that says "this is a native build, skip --platform" — it can't be folded into platform because platform: linux/amd64 already means "use QEMU via buildx" for existing users. The tag suffix is auto-derived from the platform value, so users don't need to manually compute it.

Why a separate merge action/task?

The original approach overloaded the build action with merge logic via mergeTag, adding conditional complexity to both runMain() and runPost(). Extracting the merge into dedicated actions keeps each action's contract simple and avoids the push-gating concerns that arose in review.

Why docker buildx imagetools create instead of docker manifest?

imagetools create works with remote registry images without pulling them locally, and supports OCI image indexes natively. It is the recommended approach for combining multi-platform images that are already pushed to a registry.
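
For illustration, the command line for such a merge could be assembled as follows. `imagetoolsCreateArgs` is a hypothetical name and this is only a sketch; how the PR's createMultiPlatformImage builds and runs the command is not shown here.

```typescript
// Sketch: build the argument list for `docker buildx imagetools create`.
// The target is the multi-arch tag; sources are the already-pushed
// per-platform images.
function imagetoolsCreateArgs(target: string, sources: string[]): string[] {
  if (sources.length === 0) {
    throw new Error('at least one source image is required');
  }
  return ['buildx', 'imagetools', 'create', '--tag', target, ...sources];
}
```

Passing these arguments to `docker` creates the manifest directly in the registry, so the merge job needs registry credentials but no local copies of the per-platform images.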

Backwards compatibility

When useNativeRunner is false (default), the action behaves exactly as before. The existing platform input with QEMU emulation continues to work unchanged.

| Mode | Inputs | Runner | Build strategy |
|---|---|---|---|
| QEMU (existing) | platform only | Single runner | Cross-arch emulation via buildx |
| Native (new) | platform + useNativeRunner: true | Matrix of native runners | Each runner builds its own arch natively |
| Merge (new) | Separate action/task with platforms | Any runner | Combines per-platform images into multi-arch manifest |

Note

We are already using the fork internally with GitHub Actions. The DevOps task implementation follows the same patterns but has not been tested in an AzDO pipeline.

Add platformTag and mergeTag inputs to support building on native
ARM runners in a matrix strategy, then merging per-platform images
into a multi-arch manifest via docker buildx imagetools create.

This avoids slow QEMU emulation for multi-platform builds by allowing
each matrix job to build natively for its own platform.

The devcontainer CLI rejects --platform without --output. For native
single-platform builds (platformTag set), use type=docker to load
the image into the local daemon for subsequent docker push.

The devcontainer CLI rejects --platform for docker-compose-based
devcontainers. When platformTag is set, the runner is already the
correct native architecture, so --platform is unnecessary.

- Mirror platformTag/mergeTag logic in azdo-task (task.json inputs,
  runMain/runPost in main.ts, createManifest wrapper in docker.ts)
- Add unit tests for createManifest in common/__tests__/docker.test.ts
- Update docs/github-action.md and docs/azure-devops-task.md input tables
- Add native multi-platform builds section to docs/multi-platform-builds.md
  with examples for both GitHub Actions and Azure DevOps Pipelines

@tgenov
Author

tgenov commented Feb 26, 2026

@microsoft-github-policy-service agree

@tgenov tgenov marked this pull request as ready for review February 26, 2026 07:41
@tgenov tgenov requested review from a team and stuartleeks as code owners February 26, 2026 07:41
@abdurriq abdurriq requested a review from Copilot March 27, 2026 14:36

Copilot AI left a comment

Pull request overview

Adds a native (non-QEMU) multi-platform build strategy by splitting per-architecture builds into matrix jobs that push platform-suffixed tags, followed by a final manifest-merge job that publishes a multi-arch tag.

Changes:

  • Introduces platformTag (per-platform tagging/push) and mergeTag (manifest merge) flows for both GitHub Actions and Azure DevOps.
  • Adds a createManifest implementation using docker buildx imagetools create, plus unit tests.
  • Updates docs and action/task metadata to document and expose the new inputs.

Reviewed changes

Copilot reviewed 11 out of 15 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
|---|---|
| action.yml | Exposes new platformTag / mergeTag inputs for the GitHub Action. |
| github-action/src/main.ts | Implements platform-suffixed tagging, merge-only early return, and post-step manifest creation/push behavior. |
| github-action/src/docker.ts | Adds a GitHub Action wrapper for createManifest. |
| github-action/dist/sourcemap-register.js | Updated build artifact. |
| github-action/dist/licenses.txt | Updated build artifact. |
| common/src/docker.ts | Adds shared createManifest implementation via docker buildx imagetools create. |
| common/__tests__/docker.test.ts | Adds unit tests for createManifest. |
| azdo-task/DevcontainersCi/task.json | Exposes new platformTag / mergeTag inputs for the AzDO task. |
| azdo-task/DevcontainersCi/src/main.ts | Mirrors the platform-suffixed tagging and manifest merge flow in AzDO. |
| azdo-task/DevcontainersCi/src/docker.ts | Adds an AzDO wrapper for createManifest. |
| docs/multi-platform-builds.md | Documents the new native matrix strategy and examples. |
| docs/github-action.md | Documents new GitHub Action inputs. |
| docs/azure-devops-task.md | Documents new AzDO task inputs. |

Comment thread docs/multi-platform-builds.md
Comment thread github-action/src/main.ts Outdated
Comment thread github-action/src/main.ts Outdated
Comment thread common/src/docker.ts Outdated
Comment thread azdo-task/DevcontainersCi/src/main.ts Outdated
Comment thread azdo-task/DevcontainersCi/src/main.ts Outdated
Comment thread azdo-task/DevcontainersCi/task.json Outdated
"required": false
},
{
"name": "mergeTag",
Contributor

Nit: Naming could be more clear

Author

Could you clarify what naming you'd prefer? Happy to rename.

Contributor

Perhaps mergePlatformTags?

Comment thread common/src/docker.ts Outdated
Comment thread common/__tests__/docker.test.ts Outdated
@abdurriq
Contributor

@tgenov Thanks for kickstarting this work, it would be very useful to have native builds. I'll be happy to re-review once the above comments are addressed.

tgenov added 11 commits March 28, 2026 21:07
Move the mergeTag block after the push option filtering logic in both
GitHub Action and Azure DevOps implementations. Previously, mergeTag
would bypass all push gating and could publish manifests on PRs or
when push was set to 'never'.
…orm.ts

Extract buildImageNames and mergeMultiPlatformImages helpers to eliminate
duplicated logic between GitHub Action and Azure DevOps implementations.
… platformTag

- Fail early with a clear error if mergeTag is set without push: always,
  preventing silent no-ops when default push filtering skips the manifest.
- Fail early if both mergeTag and platformTag are set on the same step.
- Simplify redundant return logic in mergeTag runPost blocks.
- Add push: always to manifest job examples in docs.
- Remove duplicate "Creating multi-arch manifest" log from AzDO wrapper
  (mergeMultiPlatformImages already logs this).
- Shorten GH Action group header to avoid repeating the log message.
- Update General Notes to reflect GitHub's hosted ARM runners and link
  to the native matrix strategy.
- Add missing Docker login and buildx setup steps to the AzDO native
  multi-platform example.
@tgenov
Author

tgenov commented Mar 28, 2026

Thanks for the review. All (but one) comments addressed. Beyond the requested changes, I also added validation that mergeTag and platformTag cannot be set together, and updated the docs General Notes section to reflect GitHub's hosted ARM runners.

@tgenov tgenov requested a review from Copilot March 28, 2026 20:34
tgenov and others added 2 commits March 28, 2026 23:18
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Comment thread common/__tests__/platform.test.ts Outdated
Comment thread github-action/dist/index.js Outdated
Comment thread docs/github-action.md Outdated
Comment thread azdo-task/DevcontainersCi/src/main.ts Outdated
…test

- Remove github-action/dist/ from tracking (upstream compiles in CI per PR devcontainers#210)
- Fix "GitHub Actions Runner" -> "Azure Pipelines agent" in azure-devops-task.md
- Merge redundant "returns true" test into the existing happy-path test
@tgenov
Author

tgenov commented Apr 11, 2026

@abdurriq

EDIT: Previous version of the comment was way too verbose for this stage of the process.

You're right that this is a lot to track across the two platforms. Both questions connect to the open question in the PR description about polymorphism vs. separation.

On dropping platformTag: it can't be folded into platform because platform: linux/amd64 already means "use QEMU via buildx" for existing users. platformTag is the mode signal that says "this is a native build, skip --platform" and also provides the tag suffix (Docker tags can't contain slashes, so linux/amd64 can't be used directly).

On separating the merge: agreed, that would help. I'd propose extracting the merge step into a new merge/action.yml (and AzDO equivalent). That removes all mergeTag paths from runMain()/runPost(), making the main action cleaner. platformTag stays on the main action since it's still needed there.

Want me to go ahead with that, or do you have a different split in mind?

@abdurriq
Contributor

abdurriq commented Apr 13, 2026

@tgenov Thank you for thinking this through in detail.

> On dropping platformTag: it can't be folded into platform because platform: linux/amd64 already means "use QEMU via buildx" for existing users. platformTag is the mode signal that says "this is a native build, skip --platform" and also provides the tag suffix (Docker tags can't contain slashes, so linux/amd64 can't be used directly).

Could we instead introduce a native boolean, which we then use to determine if it should be native / QEMU? We could then use the auto-derived tag name from platform rather than having a new platformTag (similar to Option 3 in your original comment). Since we will be bumping the major version, won't be doing anything different when native is unset, and since we don't need platform to change (we can do the linux/amd64 -> linux-amd64 transformation automatically), it should be drop-in regardless.

Some checks we should add:

  • If native: true, require platform to be a single platform (no commas) because each job is one arch.
  • (Potentially) Validate runner arch vs requested platform.

> On separating the merge: agreed, that would help. I'd propose extracting the merge step into a new merge/action.yml (and AzDO equivalent). That removes all mergeTag paths from runMain()/runPost(), making the main action cleaner. platformTag stays on the main action since it's still needed there.

That would be great. The merge action should ideally take platforms in the standard format (e.g. linux/amd64,linux/arm64) and do the same derivation / transformation that we do above for platform -> suffix.

tgenov added 3 commits April 13, 2026 15:57
…ept standard platform format

- platformToTagSuffix converts 'linux/amd64' to 'linux-amd64'
- mergeMultiPlatformImages now takes platforms in standard format
  (e.g., 'linux/amd64,linux/arm64') and auto-derives tag suffixes
- Updated and expanded tests
@tgenov
Author

tgenov commented Apr 13, 2026

@abdurriq Quick clarification on the platform + native interaction before I go further.

With native builds, platform changes meaning: it goes from "tell buildx which architectures to emulate" to "label for tag suffix derivation". The --platform flag is not passed to the build, and the value must be single-valued (no commas). That's two different semantics behind the same input depending on a boolean flag.

An alternative: when native: true, skip platform entirely and auto-detect the runner's architecture to derive the suffix. That way platform always means what it means today.

The downside of auto-detection is that the build job becomes implicit about its platform while the merge action still needs an explicit platforms: linux/amd64,linux/arm64 list. So you'd have an asymmetry where the suffix is auto-detected on one side and explicitly declared on the other.

Happy to go either way -- just want to flag the ambiguity before building on it.

@tgenov
Author

tgenov commented Apr 13, 2026

To be clear on the auto-detection risk: uname -m returns values like x86_64 or aarch64, which need a mapping table to convert to Docker platform format. For amd64/arm64 this is unambiguous, but for ARM variants (armv7l could be linux/arm/v7 or linux/arm/v6) it gets unreliable. So auto-detection works for the common case but is not fully general.

So re-using platform reduces the complexity (no platform auto-detection) at the cost of changing the semantics of the field.
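
To make the trade-off concrete, an auto-detection mapping might look like the sketch below. It is hypothetical and not part of this PR: `detectPlatform` and the table are illustrative, and only the two unambiguous values mentioned above are mapped.

```typescript
// Hypothetical mapping from `uname -m` output to a Docker platform string.
// Only unambiguous architectures are listed; armv7l is left out on purpose
// because it could correspond to more than one ARM variant.
const UNAME_TO_PLATFORM = new Map<string, string>([
  ['x86_64', 'linux/amd64'],
  ['aarch64', 'linux/arm64'],
]);

// Returns the Docker platform for a machine string, or undefined when the
// value is unknown or ambiguous and the caller has to be explicit.
function detectPlatform(unameMachine: string): string | undefined {
  return UNAME_TO_PLATFORM.get(unameMachine.trim());
}
```

A caller would still need a fallback for the `undefined` case, which is the extra complexity that re-using platform avoids.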

@abdurriq
Contributor

@tgenov I see, in that case let's change the semantics.

@tgenov tgenov requested a review from Copilot April 14, 2026 19:10

Copilot AI left a comment

Pull request overview

Copilot reviewed 37 out of 39 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

merge/package.json:1

  • The package script invokes ncc build without an entry file, which will cause npm run package (and therefore npm run all) to fail. Update the script to pass the correct entrypoint (e.g., the compiled main file) and ensure the output matches run-main.js expectations (typically dist/index.js).

merge/src/main.ts:1

  • This overwrites more specific failure details that can already be emitted by the underlying createMultiPlatformImage wrapper (which calls core.setFailed(error) on exceptions). Consider propagating/returning the original error and setting a single, specific failure message here (or avoid calling setFailed in the lower-level wrapper so runMain can surface the real error).

Comment thread azdo-task/DevcontainersMerge/package.json
Comment thread action.yml
Comment thread azdo-task/DevcontainersMerge/src/main.ts
createMultiPlatformImage was calling setResult/setFailed with the
detailed error, then main.ts overwrote it with a generic message.
Now docker.ts logs the detail and main.ts is the single source of
truth for task failure status.
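
The layering this commit describes can be sketched generically as below. All names and shapes are hypothetical; the real wrappers call the GitHub Actions and Azure Pipelines libraries, and the exact failure message is an assumption.

```typescript
// Hypothetical result type for the lower-level wrapper.
type MergeResult = { ok: true } | { ok: false; detail: string };

// Lower layer: runs the operation and logs the detail, but never sets the
// overall task status.
function createImage(run: () => void, log: (msg: string) => void): MergeResult {
  try {
    run();
    return { ok: true };
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    log(detail);
    return { ok: false, detail };
  }
}

// Top layer: the only place that marks the task as failed.
function runMain(
  run: () => void,
  log: (msg: string) => void,
  setFailed: (msg: string) => void,
): void {
  const result = createImage(run, log);
  if (!result.ok) {
    setFailed(`Failed to create multi-platform image: ${result.detail}`);
  }
}
```

Because only the top layer reports failure, a detailed error cannot be overwritten by a later generic message.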
@tgenov
Author

tgenov commented Apr 14, 2026

PR description has been updated to reflect the current state of the commit (the original implementation was superseded based on feedback).

3 participants