microsoft · bmehta001 · Jun 17, 2026
diff --git a/.github/workflows/samples-integration-test.yml b/.github/workflows/samples-integration-test.yml
@@ -200,17 +200,6 @@ jobs:
             /p:TreatWarningsAsErrors=false `
             --configuration Release
 
-          # Build WinML SDK package (Windows only)
-          if ($IsWindows) {
-            dotnet pack sdk/cs/src/Microsoft.AI.Foundry.Local.csproj `
-              -o local-packages `
-              /p:Version=0.9.0-dev-20260324 `
-              /p:UseWinML=true `
-              /p:IsPacking=true `
-              /p:TreatWarningsAsErrors=false `
-              --configuration Release
-          }
-
           Write-Host "Local packages:"
           Get-ChildItem local-packages/*.nupkg | ForEach-Object { Write-Host "  $($_.Name)" }
 

diff --git a/.pipelines/v2/sdk_v2-js-pipeline-plan.md b/.pipelines/v2/sdk_v2-js-pipeline-plan.md
@@ -19,7 +19,6 @@ Out of scope (deferred):
 
 - **Linux ARM64.** Cross-cutting ARM64 work item will add this for C++,
   C#, Python, and JS together.
-- **WinML variant.** JS is scoped out by the base plan.
 - **npm publishing.** Pipeline produces the `.tgz` as an artifact only.
 
 ## Supported platforms

diff --git a/.pipelines/v2/sdk_v2-pipeline-plan.md b/.pipelines/v2/sdk_v2-pipeline-plan.md
@@ -39,9 +39,10 @@ are gated separately via `.pipelines/v1/templates/stages-sdk-v1.yml`.
    `/p:FoundryLocalRuntimeVersion=$(sdkVersion)` where `sdkVersion` is the
    same string used to pack the Runtime nupkg. SDK and Runtime always ship
    as a matched pair.
-2. **C# WinML SKU.** `UseWinML=true` flips the `PackageReference` to
-   `Microsoft.AI.Foundry.Local.Runtime.WinML`. The csproj has two
-   `UseWinML`-conditional blocks (one for `.Runtime`, one for `.Runtime.WinML`).
+2. **Single C# package bundles WinML.** The SDK csproj references one
+   `Microsoft.AI.Foundry.Local.Runtime` package, which carries the reg-free
+   WinML 2.x runtime on Windows. There is no separate WinML SKU or
+   `UseWinML` switch.
 3. **Python SDK wheel bundles `foundry_local` directly** at
    `_native/<rid>/foundry_local.{ext}`. No separate `foundry-local-runtime`
    wheel. Trade-off accepted: SDK and runtime upgrade together. Non-SDK
@@ -62,11 +63,11 @@ are gated separately via `.pipelines/v1/templates/stages-sdk-v1.yml`.
    per-track files (`sdkVersion.txt`, `pyVersion.txt`, plus `*.v1.txt`
    counterparts). Both the sdk_v2 and sdk_v1 coordinators read from this
    artifact rather than recomputing a timestamp.
-7. **Python WinML wheel name via custom PEP 517 backend.** The build
-   backend at `sdk_v2/python/_build_backend/__init__.py` wraps
-   `setuptools.build_meta` and rewrites the project name to
-   `foundry-local-sdk-winml` when `FL_PYTHON_PACKAGE_NAME` is set in the
-   environment. The same backend also handles ORT pin rewriting (decision 8).
+7. **Single Python wheel name; custom PEP 517 backend rewrites ORT pins.**
+   One wheel, `foundry-local-sdk`, bundles the WinML runtime on Windows. The
+   build backend at `sdk_v2/python/_build_backend/__init__.py` wraps
+   `setuptools.build_meta` solely to rewrite the ORT/GenAI pins (decision 8);
+   it no longer rewrites the project name.
 8. **Single source of truth for ORT/GenAI versions.** ORT and GenAI versions
    live in `sdk_v2/deps_versions.json`. The file shape is
    `{ "onnxruntime": { "version": "..." }, "onnxruntime-genai": { "version": "..." } }`.
@@ -80,10 +81,10 @@ are gated separately via `.pipelines/v1/templates/stages-sdk-v1.yml`.
      at wheel-build time. If the backend is ever bypassed, `pip install`
      fails fast with "no matching version" (intentional loud failure).
 
-   Bumping ORT/GenAI is a one-file edit per variant.
+   Bumping ORT/GenAI is a one-file edit.
 9. **ORT/GenAI come from public PyPI.** No private feed plumbing required
    for the wheel install path:
-   - `onnxruntime-core` (Windows/macOS, standard + WinML variants)
+   - `onnxruntime-core` (Windows/macOS)
    - `onnxruntime-genai-core` (Windows/macOS)
    - `onnxruntime-gpu` / `onnxruntime-genai-cuda` (Linux)
 
@@ -97,15 +98,21 @@ are gated separately via `.pipelines/v1/templates/stages-sdk-v1.yml`.
     copy the artifact verbatim — they do not re-filter. This keeps the
     "what ships next to `foundry_local`" decision in exactly one place.
 
-    Staging includes all `*.dll`/`*.pdb` (Windows), `*.so`/`*.so.*`/`lib*.so*`
-    (Linux), `*.dylib` (macOS) from the build output bin directory and
-    excludes:
-    - ORT/GenAI (`onnxruntime*`, `Microsoft.Windows.AI.MachineLearning.*`) —
-      provided by pip on the Python side and by NuGet on the C# side.
-    - Test/example binaries (`*_tests.*`, `*_example.*`, `gtest*`,
-      `cmake_test_discovery_*`).
-
-    The step fails loudly if `foundry_local` itself is missing.
+    Each staging step copies an explicit allow-list, not a glob: just the
+    redistributable `foundry_local` library (`.dll` + `.pdb` + `.lib` on
+    Windows, `libfoundry_local.so` / `.dylib` elsewhere). vcpkg statically
+    links the rest of the closure (azure-*/curl/ssl/zlib/spdlog/fmt) into
+    `foundry_local` itself, so nothing else needs to travel. ORT/GenAI are
+    not staged: they are resolved separately at runtime, from pip on the
+    Python side and from NuGet on the C# side.
+
+    On Windows the allow-list also includes the delay-loaded
+    `Microsoft.Windows.AI.MachineLearning.dll` (the reg-free WinML 2.x
+    runtime, ~922 KB, Microsoft-signed). It is the single payload difference
+    that used to distinguish the WinML SKU; bundling it unconditionally is
+    what lets one package serve every consumer.
+
+    Each step fails loudly if its primary library is missing.
 11. **Python runtime ORT discovery.** `lib_loader.py::prepare_native_dependencies()`
     bridges between the in-wheel `foundry_local` and the pip-installed ORT
     packages:
@@ -120,7 +127,7 @@ are gated separately via `.pipelines/v1/templates/stages-sdk-v1.yml`.
 12. **`foundry-local-install` CLI.** Declared via `[project.scripts]` in
     `pyproject.toml`, backed by
     `sdk_v2/python/src/foundry_local_sdk/_native/installer.py`. Flags:
-    `--winml`, `--verbose`. Verifies installed packages via
+    `--verbose`. Verifies installed packages via
     `importlib.util.find_spec` to avoid triggering DLL load during
     verification.
 
@@ -130,18 +137,18 @@ are gated separately via `.pipelines/v1/templates/stages-sdk-v1.yml`.
 .pipelines/v2/
 └── templates/
     ├── stages-sdk-v2.yml             # Coordinator: native + cs + python
-    ├── stages-build-native.yml       # 6 build stages + 2 pack stages (C++)
-    ├── stages-cs.yml                 # C# build + test (variant: base | winml)
-    ├── stages-python.yml             # Python build + test (variant: base | winml)
+    ├── stages-build-native.yml       # 4 build stages + 1 pack stage (C++)
+    ├── stages-cs.yml                 # C# build + test
+    ├── stages-python.yml             # Python build + test
     ├── steps-prefetch-nuget.yml      # ORT/GenAI/WinML NuGet pre-fetch (pwsh + bash)
-    ├── steps-build-windows.yml       # arch: x64 | arm64; useWinml: true | false
+    ├── steps-build-windows.yml       # arch: x64 | arm64 (always bundles WinML)
     ├── steps-build-linux.yml
     ├── steps-build-macos.yml
     ├── steps-build-cs.yml            # restore + build + ESRP-sign + pack + ESRP-sign nupkg
     ├── steps-test-cs.yml             # restore + build + run tests
     ├── steps-build-python.yml        # pass-through copy + python -m build --wheel
     ├── steps-test-python.yml         # install wheel + pytest
-    └── steps-pack-nuget.yml          # Runs sdk_v2/cpp/nuget/pack.py (variant: base | winml)
+    └── steps-pack-nuget.yml          # Runs sdk_v2/cpp/nuget/pack.py
 ```
 
 The repo-shared `.pipelines/templates/checkout-steps.yml` is reused for
@@ -155,32 +162,25 @@ compute_version
    |
    |-- cpp_build_win_x64 ----------+
    |-- cpp_build_win_arm64 --------|
-   |-- cpp_build_linux_x64 --------+--> pack_nuget --+--> cs_build_base --+--> cs_test_win_x64
-   |-- cpp_build_osx_arm64 --------+                 |                    |--> cs_test_linux_x64
-   |                                                 |                    +--> cs_test_osx_arm64
-   |                                                 +--> (cs-sdk-v2-base artifact)
-   |
-   |   +--> py_build_base_win_x64    --> py_test_base_win_x64
-   +-->|--> py_build_base_linux_x64  --> py_test_base_linux_x64
-   |   +--> py_build_base_osx_arm64  --> py_test_base_osx_arm64
+   |-- cpp_build_linux_x64 --------+--> pack_nuget --+--> cs_build --+--> cs_test_win_x64
+   |-- cpp_build_osx_arm64 --------+                 |               |--> cs_test_linux_x64
+   |                                                 |               +--> cs_test_osx_arm64
+   |                                                 +--> (cs-sdk-v2 artifact)
    |
-   |-- cpp_build_win_x64_winml ---+
-   |-- cpp_build_win_arm64_winml -+--> pack_nuget_winml --> cs_build_winml --> cs_test_win_x64_winml
-   |                              |
-   |                              +--> py_build_winml_win_x64 --> py_test_winml_win_x64
-   |                                   py_build_winml_win_arm64  (no test — cross-compile)
+   |   +--> py_build_win_x64    --> py_test_win_x64
+   +-->|--> py_build_linux_x64  --> py_test_linux_x64
+       +--> py_build_osx_arm64  --> py_test_osx_arm64
 ```
 
 * All build stages are independent (`dependsOn: [compute_version]`) and run
   in parallel.
-* Both pack stages run on every build (PR and `main`).
-* WinML and non-WinML build stages link against the same ORT version
-  (`ortVersion`, currently 1.25.x). WinML 2.x is reg-free and uses the
-  standard ORT package, so a separate WinML-aligned ORT pin is no longer
-  required.
-* Tests run on `cpp_build_win_x64`, `cpp_build_win_x64_winml`,
-  `cpp_build_linux_x64`, and `cpp_build_osx_arm64`. The two ARM64 Windows
-  stages cross-compile from an x64 host and skip tests.
+* The pack stage runs on every build (PR and `main`).
+* The Windows native build always bundles the reg-free WinML 2.x runtime,
+  which links against the same `ortVersion` as every other platform — there
+  is no separate WinML-aligned ORT pin or build flavor.
+* Tests run on `cpp_build_win_x64`, `cpp_build_linux_x64`, and
+  `cpp_build_osx_arm64`. The ARM64 Windows stage cross-compiles from an x64
+  host and skips tests.
 
 ## Per-stage artifacts
 
@@ -189,15 +189,12 @@ Published via 1ES `templateContext.outputs` (no manual `PublishPipelineArtifact`
 | Stage                       | Artifact name                  | Contents                                                   |
 |-----------------------------|--------------------------------|------------------------------------------------------------|
 | `compute_version`           | `version-info`                 | `sdkVersion.txt`, `pyVersion.txt`, `flcVersion.txt`        |
-| `cpp_build_win_x64`         | `cpp-native-win-x64`           | `foundry_local.dll`, `foundry_local.pdb` + vcpkg closure   |
+| `cpp_build_win_x64`         | `cpp-native-win-x64`           | `foundry_local.dll`, `.pdb`, `.lib` + `Microsoft.Windows.AI.MachineLearning.dll` |
 | `cpp_build_win_x64`         | `cpp-native-include`           | Public headers (sourced once, from win-x64)                |
-| `cpp_build_win_arm64`       | `cpp-native-win-arm64`         | `foundry_local.dll`, `foundry_local.pdb` + vcpkg closure   |
-| `cpp_build_linux_x64`       | `cpp-native-linux-x64`         | `libfoundry_local.so` + vcpkg closure                      |
-| `cpp_build_osx_arm64`       | `cpp-native-osx-arm64`         | `libfoundry_local.dylib` + vcpkg closure                   |
-| `cpp_build_win_x64_winml`   | `cpp-native-win-x64-winml`     | `foundry_local.dll`, `foundry_local.pdb` (WinML)           |
-| `cpp_build_win_arm64_winml` | `cpp-native-win-arm64-winml`   | `foundry_local.dll`, `foundry_local.pdb` (WinML)           |
-| `cpp_pack_nuget`            | `cpp-nuget`                    | `Microsoft.AI.Foundry.Local.Runtime.<version>.nupkg`       |
-| `cpp_pack_nuget_winml`      | `cpp-nuget-winml`              | `Microsoft.AI.Foundry.Local.Runtime.WinML.<version>.nupkg` |
+| `cpp_build_win_arm64`       | `cpp-native-win-arm64`         | `foundry_local.dll`, `.pdb`, `.lib` + `Microsoft.Windows.AI.MachineLearning.dll` |
+| `cpp_build_linux_x64`       | `cpp-native-linux-x64`         | `libfoundry_local.so`                                      |
+| `cpp_build_osx_arm64`       | `cpp-native-osx-arm64`         | `libfoundry_local.dylib`                                   |
+| `cpp_pack_nuget`            | `cpp-nuget`                    | `Microsoft.AI.Foundry.Local.Runtime.<version>.nupkg` (bundles WinML on Windows) |
 
 ## Versioning
 
@@ -278,13 +275,12 @@ set the env var. SDK *build* stages do not need test-data-shared.
 Test data is fetched on every stage that runs tests:
 
 * `cpp_build_win_x64`
-* `cpp_build_win_x64_winml`
 * `cpp_build_linux_x64`
 * `cpp_build_osx_arm64`
 * All `cs_test_*` and `py_test_*` stages
 
-Skipped on `cpp_build_win_arm64` and `cpp_build_win_arm64_winml`
-(cross-compile, no test execution on the host).
+Skipped on `cpp_build_win_arm64` (cross-compile, no test execution on the
+host).
 
 ## Build commands
 
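The allow-list staging in decision 10 of the plan can be sketched as below. This is a minimal illustration, not the pipeline's actual step: `stage_native`, the `ALLOW_LIST` table, and the directory arguments are hypothetical names, and only the file names come from the plan. Silently skipping a missing secondary file is also an assumption; the plan only states that a missing primary library must fail loudly.

```python
import shutil
from pathlib import Path

# File names are taken from the plan; the table layout is an assumption.
# The first entry per platform is treated as the primary library.
ALLOW_LIST = {
    "windows": [
        "foundry_local.dll",
        "foundry_local.pdb",
        "foundry_local.lib",
        "Microsoft.Windows.AI.MachineLearning.dll",
    ],
    "linux": ["libfoundry_local.so"],
    "macos": ["libfoundry_local.dylib"],
}


def stage_native(platform, bin_dir, staging_dir):
    """Copy only allow-listed files; raise if the primary library is missing."""
    names = ALLOW_LIST[platform]
    bin_dir, staging_dir = Path(bin_dir), Path(staging_dir)

    # Fail loudly before copying anything if the primary library is absent.
    primary = bin_dir / names[0]
    if not primary.is_file():
        raise FileNotFoundError(f"primary library missing: {primary}")

    staging_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for name in names:
        src = bin_dir / name
        # Assumption: secondary files that are absent are skipped, not fatal.
        if src.is_file():
            shutil.copy2(src, staging_dir / name)
            staged.append(name)
    return staged
```

Because the list is explicit, anything else in the build output directory (test binaries, stray vcpkg artifacts) never reaches the artifact, which is the property the plan relies on when downstream stages copy the artifact verbatim.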

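Decision 12 of the plan says `foundry-local-install` verifies packages via `importlib.util.find_spec` so that verification does not trigger a DLL load. A rough sketch of that kind of check follows. The helper name `missing_packages` and the module names in the usage line are hypothetical; the real `installer.py` is not part of this diff.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be located.

    For a top-level name, find_spec only consults the import finders; it
    does not execute the module, so no native library is loaded.
    """
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat a lookup error the same as "not installed".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Illustrative names only, not the installer's real package list.
    print(missing_packages(["json", "surely_not_installed_pkg_xyz"]))
```

Note that this only holds for top-level names: `find_spec` on a dotted name imports the parent package first, which could load native code.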
diff --git a/.pipelines/v2/templates/stages-build-native.yml b/.pipelines/v2/templates/stages-build-native.yml
@@ -1,13 +1,12 @@
 # Native build + pack stages for the Foundry Local C++ SDK.
 #
-# Per-platform stages each produce a `cpp-native-<rid>[-winml]` pipeline
-# artifact. Two pack stages assemble the artifacts into separate NuGet
-# packages on every build (PR and main):
+# Per-platform stages each produce a `cpp-native-<rid>` pipeline artifact. On
+# Windows the build always includes the reg-free WinML 2.x runtime. A single
+# pack stage assembles the artifacts into one NuGet package on every build
+# (PR and main):
 #
-#   pack_nuget        – Microsoft.AI.Foundry.Local.Runtime
-#                       (win-x64, win-arm64, linux-x64, osx-arm64)
-#   pack_nuget_winml  – Microsoft.AI.Foundry.Local.Runtime.WinML
-#                       (win-x64, win-arm64; both built with --use_winml)
+#   pack_nuget  – Microsoft.AI.Foundry.Local.Runtime
+#                 (win-x64, win-arm64, linux-x64, osx-arm64; Windows ships WinML)
 
 parameters:
 - name: buildConfig
@@ -149,81 +148,10 @@ stages:
           runTests: true
 
   # ====================================================================
-  # Windows x64 (WinML) — build + test
-  # The WinML variant links against the WinML-aligned ORT (1.23.x), which
-  # is older than the base ORT and lacks some ops used by cataloged vision
-  # models (e.g. com.microsoft:CausalConvWithState). The C++ VisionFixture
-  # and the C# VisionTests both skip themselves on WinML so the rest of
-  # the suite still runs against this configuration.
-  # ====================================================================
-  - stage: cpp_build_win_x64_winml
-    displayName: 'C++ Native: Windows x64 (WinML)'
-    dependsOn:
-    - compute_version
-    jobs:
-    - job: build
-      pool:
-        name: onnxruntime-Win-CPU-2022
-        os: windows
-      templateContext:
-        inputs:
-        - input: pipelineArtifact
-          artifactName: 'version-info'
-          targetPath: '$(Pipeline.Workspace)/version-info'
-        outputs:
-        - output: pipelineArtifact
-          artifactName: 'cpp-native-win-x64-winml'
-          targetPath: '$(Build.ArtifactStagingDirectory)/native'
-      steps:
-      - template: steps-build-windows.yml
-        parameters:
-          arch: x64
-          buildConfig: ${{ parameters.buildConfig }}
-          ortVersion: ${{ parameters.ortVersion }}
-          genaiVersion: ${{ parameters.genaiVersion }}
-          winmlVersion: ${{ parameters.winmlVersion }}
-          useWinml: true
-          runTests: true
-          stageHeaders: false
-
-  # ====================================================================
-  # Windows ARM64 (WinML) — cross-compile only
-  # ====================================================================
-  - stage: cpp_build_win_arm64_winml
-    displayName: 'C++ Native: Windows ARM64 (WinML)'
-    dependsOn:
-    - compute_version
-    jobs:
-    - job: build
-      pool:
-        name: onnxruntime-Win-CPU-2022
-        os: windows
-      templateContext:
-        inputs:
-        - input: pipelineArtifact
-          artifactName: 'version-info'
-          targetPath: '$(Pipeline.Workspace)/version-info'
-        outputs:
-        - output: pipelineArtifact
-          artifactName: 'cpp-native-win-arm64-winml'
-          targetPath: '$(Build.ArtifactStagingDirectory)/native'
-      steps:
-      - template: steps-build-windows.yml
-        parameters:
-          arch: arm64
-          buildConfig: ${{ parameters.buildConfig }}
-          ortVersion: ${{ parameters.ortVersion }}
-          genaiVersion: ${{ parameters.genaiVersion }}
-          winmlVersion: ${{ parameters.winmlVersion }}
-          useWinml: true
-          runTests: false
-          stageHeaders: false
-
-  # ====================================================================
-  # Pack — base NuGet package (all 4 platforms)
+  # Pack — NuGet package (all 4 platforms; Windows includes WinML)
   # ====================================================================
   - stage: cpp_pack_nuget
-    displayName: 'C++ Native: Pack NuGet (base)'
+    displayName: 'C++ Native: Pack NuGet'
     dependsOn:
     - compute_version
     - cpp_build_win_x64
@@ -261,40 +189,3 @@ stages:
         parameters:
           ortVersion: ${{ parameters.ortVersion }}
           genaiVersion: ${{ parameters.genaiVersion }}
-          variant: base
-
-  # ====================================================================
-  # Pack — WinML NuGet package (Windows x64 + arm64 only)
-  # ====================================================================
-  - stage: cpp_pack_nuget_winml
-    displayName: 'C++ Native: Pack NuGet (WinML)'
-    dependsOn:
-    - compute_version
-    - cpp_build_win_x64_winml
-    - cpp_build_win_arm64_winml
-    jobs:
-    - job: pack
-      pool:
-        name: onnxruntime-Win-CPU-2022
-        os: windows
-      templateContext:
-        inputs:
-        - input: pipelineArtifact
-          artifactName: 'version-info'
-          targetPath: '$(Pipeline.Workspace)/version-info'
-        - input: pipelineArtifact
-          artifactName: 'cpp-native-win-x64-winml'
-          targetPath: '$(Pipeline.Workspace)/cpp-native-win-x64-winml'
-        - input: pipelineArtifact
-          artifactName: 'cpp-native-win-arm64-winml'
-          targetPath: '$(Pipeline.Workspace)/cpp-native-win-arm64-winml'
-        outputs:
-        - output: pipelineArtifact
-          artifactName: 'cpp-nuget-winml'
-          targetPath: '$(Build.ArtifactStagingDirectory)/nuget'
-      steps:
-      - template: steps-pack-nuget.yml
-        parameters:
-          ortVersion: ${{ parameters.ortVersion }}
-          genaiVersion: ${{ parameters.genaiVersion }}
-          variant: winml