Add wasm/CoreCLR libraries tests to outerloop CI #129436
Add a scheduled outerloop leg that runs the slow [OuterLoop] library suites on Browser+CoreCLR (Chrome). There was previously no wasm/CoreCLR coverage in the outerloop pipeline, only CoreCLR desktop and Mono browser_wasm.

To make the slow suites complete, enable the previously-excluded long-running suites and extend the Helix work-item timeouts - both conditionally for the outerloop (and 'all') TestScope, preserving the existing innerloop behavior:

- eng/pipelines/libraries/outerloop.yml: new browser_wasm + coreclr job mirroring the Mono browser_wasm outerloop job.
- src/libraries/tests.proj: gate the long-running suite exclusions behind a new RunLongRunningWasmTests property (true for outerloop/all). Nrbf stays unconditionally excluded.
- src/tasks/HelixTestTasks/ComputeBatchTimeout.cs: parameterize the per-batch timeout (MinutesPerSuite/MinimumMinutes/MaximumMinutes); the defaults reproduce the previous behavior exactly.
- src/libraries/sendtohelix-browser.targets: raise the batch timeout budget for outerloop, keeping the maximum below the AzDO job timeout.
- src/libraries/sendtohelixhelp.proj: extend the non-batched browser+CoreCLR work-item timeout for outerloop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tagging subscribers to this area: @dotnet/area-infrastructure-libraries
Pull request overview
This PR adds missing CI coverage for browser-wasm + CoreCLR by introducing an outerloop leg that runs slower [OuterLoop] library test suites on Chrome, and adjusts Helix batching/work-item timeout behavior so those suites can complete without impacting innerloop defaults.
Changes:
- Add a new outerloop pipeline job for browser_wasm + CoreCLR (Chrome) in libraries outerloop CI.
- Gate previously-excluded long-running browser/CoreCLR suites behind a new `RunLongRunningWasmTests` property enabled for `TestScope=outerloop|all`.
- Parameterize and extend Helix timeout budgeting for batched browser test work-items (outerloop only), plus increase the non-batched browser/CoreCLR work-item timeout for outerloop.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| eng/pipelines/libraries/outerloop.yml | Adds a scheduled outerloop job to run browser-wasm CoreCLR library tests on Chrome with an increased job timeout. |
| src/libraries/tests.proj | Enables long-running browser/CoreCLR suites for outerloop/all via RunLongRunningWasmTests, while keeping the current default behavior for innerloop/local runs. |
| src/tasks/HelixTestTasks/ComputeBatchTimeout.cs | Makes per-batch Helix timeout calculation configurable via new MSBuild-settable parameters. |
| src/libraries/sendtohelix-browser.targets | Plumbs new per-batch timeout knobs into ComputeBatchTimeout, with larger budgets for outerloop/all. |
| src/libraries/sendtohelixhelp.proj | Increases browser/CoreCLR work-item timeout specifically for outerloop/all as a safety net for non-batched paths. |
    // MinutesPerSuite minutes per suite to account for WASM startup overhead + test
    // execution; MinimumMinutes floor handles the heaviest individual suites
    // (e.g. Cryptography ~17m); capped at MaximumMinutes (kept below 24h to prevent
    // hh format wrapping, and intended to stay under the AzDO job timeout).
    int totalMinutes = Math.Min(MaximumMinutes, Math.Max(MinimumMinutes, count * MinutesPerSuite));

/azp run runtime-coreclr outerloop

Azure Pipelines successfully started running 1 pipeline(s).

/azp run runtime-libraries outerloop

No pipelines are associated with this pull request.

/azp run runtime-libraries outerloop

No pipelines are associated with this pull request.

The new browser_wasm + CoreCLR outerloop job was emitted unconditionally, so it ran in all four scheduled libraries-outerloop definitions (the umbrella plus the -windows/-linux/-osx platform splits), executing the slow wasm suites 4x per day. Gate it on includeLinuxOuterloop (browser_wasm builds on a Linux host) so it runs only in the umbrella and -linux definitions, matching how the existing Linux desktop CoreCLR suites are scoped. The job stays PR-runnable via /azp run since it is not gated on isRollingBuild. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

/azp run runtime-libraries-coreclr outerloop-linux

Azure Pipelines successfully started running 1 pipeline(s).

The 6 long-running library suites enabled for the wasm/CoreCLR outerloop job were bin-packed alongside ~9 other suites per Helix batch, so a single slow suite could inflate a shared batch's wall-clock and risk a timeout. Add a SoloItems parameter to the GroupWorkItems task that forces matching work items into their own solo batch (the same path used for oversized items), and wire it up in sendtohelix-browser.targets with the _WasmSoloBatchSuite list mirroring the long-running exclusions in tests.proj. The list is a no-op in innerloop, where those suites are excluded from the test archive. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
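
The solo-batch idea in the commit message above can be sketched in a few lines. This is only an illustration under stated assumptions: `AssignBatches` is a hypothetical helper, not the `GroupWorkItems` task, and the fixed-size chunking used for the shared batches stands in for whatever size-based bin-packing the real task performs. The only behavior taken from the thread is that work items named in the solo list each get their own negative-ID batch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not the real GroupWorkItems task): suites named in
// soloNames each get a unique negative batch ID; all other suites are chunked
// into shared batches of batchSize (an assumed stand-in for bin-packing).
static Dictionary<string, int> AssignBatches(
    IReadOnlyList<string> suites, ISet<string> soloNames, int batchSize)
{
    var batchIds = new Dictionary<string, int>(StringComparer.Ordinal);
    int nextSoloId = -1;   // solo batches count down: -1, -2, ...
    int sharedCount = 0;   // number of non-solo suites placed so far

    foreach (string suite in suites)
    {
        if (soloNames.Contains(suite))
        {
            batchIds[suite] = nextSoloId--;
        }
        else
        {
            batchIds[suite] = sharedCount / batchSize;
            sharedCount++;
        }
    }

    return batchIds;
}

var solo = new HashSet<string>(StringComparer.Ordinal)
{
    "System.Memory.Tests",
    "System.Xml.Linq.xNodeBuilder.Tests",
};

var suites = new[]
{
    "System.Threading.Tasks.Tests",
    "System.Memory.Tests",
    "System.Collections.Concurrent.Tests",
    "System.Xml.Linq.xNodeBuilder.Tests",
    "System.Linq.Parallel.Tests",
};

var batches = AssignBatches(suites, solo, batchSize: 2);
foreach (var (name, id) in batches)
{
    Console.WriteLine($"{id,3}  {name}");
}
```

In this sketch the two solo suites land in batches -1 and -2, so a slow run of either one cannot extend the wall-clock time of the shared batches 0 and 1.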
The slow [OuterLoop] library suites were killed mid-run: the per-suite xharness timeout (WasmXHarnessTestsTimeout) defaulted to 90 min for wasm/CoreCLR, so suites such as System.Memory.Tests and System.Xml.Linq.xNodeBuilder.Tests hit xharness exit code 71 at exactly 5401s, and three batches were still running when the AzDO job timed out.

Coordinate the nested timeout layers (suite <= batch <= job), conditional on outerloop only so innerloop budgets are preserved:

- tests.wasm.targets: per-suite xharness timeout 04:00:00 for outerloop wasm/CoreCLR (innerloop keeps 01:30:00 / 00:30:00).
- sendtohelix-browser.targets: raise the per-batch Helix floor to 270 min (above the 240-min suite timeout, so a solo slow-suite batch is bounded by the suite's own timeout) and the cap to 300 min.
- sendtohelix-browser.targets: gate the solo-batch list on outerloop and add System.Xml.Linq.xNodeBuilder.Tests (an outerloop-slow suite that still runs in innerloop).
- outerloop.yml: raise the AzDO job timeout to 600 min so it outlasts the slowest batch plus build/queue overhead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
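
The ordering constraint (suite <= batch <= job) is easy to check mechanically. The following is a minimal sketch using only the durations quoted in the commit message above; `IsNested` is a hypothetical helper written for this illustration and does not exist in the repository.

```csharp
using System;

// Durations quoted in the commit message above.
var suiteTimeout = TimeSpan.FromHours(4);      // per-suite xharness timeout (04:00:00)
var batchFloor   = TimeSpan.FromMinutes(270);  // per-batch Helix floor
var batchCap     = TimeSpan.FromMinutes(300);  // per-batch Helix cap
var jobTimeout   = TimeSpan.FromMinutes(600);  // AzDO job timeout

// Hypothetical helper: true when every layer fits inside the next one.
static bool IsNested(TimeSpan[] layers)
{
    for (int i = 1; i < layers.Length; i++)
    {
        if (layers[i - 1] > layers[i])
        {
            return false;
        }
    }
    return true;
}

bool outerloopOk = IsNested(new[] { suiteTimeout, batchFloor, batchCap, jobTimeout });
Console.WriteLine($"suite <= batch floor <= batch cap <= job: {outerloopOk}");

// The failure described above: a suite observed at 5401 s had just crossed
// the previous 90-minute (5400 s) per-suite timeout.
var previousSuiteTimeout = TimeSpan.FromMinutes(90);
bool exceededOldTimeout = TimeSpan.FromSeconds(5401) > previousSuiteTimeout;
Console.WriteLine($"5401 s exceeds the old 90-minute timeout: {exceededOldTimeout}");
```

Keeping the batch floor above the suite timeout is what lets a solo slow-suite batch fail on the suite's own timeout rather than on the batch's.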

/azp run runtime-libraries-coreclr outerloop-linux

Azure Pipelines successfully started running 1 pipeline(s).

Switch the WebAssembly CoreCLR outerloop lane from -testscope outerloop to -testscope all so [ConditionalClass] gating is honored on single-threaded wasm.

The outerloop scope adds a '-trait category=OuterLoop' include filter. The wasm test host (xharness) treats a matching include as "always run", short-circuiting the '-notrait category=failing' exclude that backs [ConditionalClass]. As a result, [OuterLoop] tests from classes disabled via [ConditionalClass(IsMultithreadingSupported)] leaked onto single-threaded wasm and hung the job. -testscope all carries no OuterLoop include, so the failing-tagged classes are skipped as intended.

Also gate the few genuinely thread-blocking [OuterLoop] tests that live in non-conditional classes (so [ConditionalClass] cannot cover them), each mirroring already-gated siblings in the same file:

- MethodCoverage.cs: TaskContinuation
- TaskAwaiterTests.cs: GetResult_NotCompleted_BlocksUntilCompletion
- TaskAPMTest.cs: PollUntilCompleteTechnique, WaitOnAsyncWaitHandleTechnique
- TaskSchedulerTests.cs: RunBlockedInjectionTest, RunSynchronizationContextTaskSchedulerTests

Validated locally with WasmTestOnChrome /p:TestScope=all: System.Threading.Tasks.Tests now completes (0 fail, no hang) and System.Collections.Concurrent.Tests passes (0 fail).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
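
The include/exclude interaction described above can be illustrated with a toy model. This is not xharness code: `ShouldRun` is a hypothetical function that encodes only the behavior stated in the commit message (a matching include wins over any exclude), and the rule for tests that match neither filter is an assumption made to keep the sketch complete.

```csharp
using System;
using System.Collections.Generic;

// Toy model only (NOT the real xharness filter implementation).
static bool ShouldRun(ISet<string> testCategories, ISet<string> includes, ISet<string> excludes)
{
    // Behavior described above: a matching include means "always run",
    // so the exclude list is never consulted for this test.
    if (includes.Overlaps(testCategories))
    {
        return true;
    }

    if (excludes.Overlaps(testCategories))
    {
        return false;
    }

    // Assumption for this sketch: with no include filter everything else runs;
    // with an include filter, tests that did not match it are skipped.
    return includes.Count == 0;
}

// An [OuterLoop] test inside a class disabled via [ConditionalClass], which
// the commit message says is backed by the "failing" category.
var disabledOuterLoopTest = new HashSet<string> { "OuterLoop", "failing" };

var none = new HashSet<string>();
var outerLoopInclude = new HashSet<string> { "OuterLoop" };
var failingExclude = new HashSet<string> { "failing" };

// -testscope outerloop: include OuterLoop, exclude failing.
bool runsUnderOuterloop = ShouldRun(disabledOuterLoopTest, outerLoopInclude, failingExclude);

// -testscope all: no OuterLoop include, exclude failing.
bool runsUnderAll = ShouldRun(disabledOuterLoopTest, none, failingExclude);

Console.WriteLine($"outerloop scope runs the disabled test: {runsUnderOuterloop}");
Console.WriteLine($"all scope runs the disabled test:       {runsUnderAll}");
```

Under this model the disabled test leaks into the run with the outerloop scope and is skipped with the all scope, which matches the hang and the fix described in the commit message.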
The wasm/CoreCLR libraries outerloop lane runs with -testscope all, which
honors [ConditionalFact]/[ConditionalTheory]/[ConditionalClass] platform
gates. Several [OuterLoop] tests rely on real multithreading (Parallel.For,
PLINQ degree-of-parallelism, producer/consumer channels, finalizer races)
or a 64-bit address space, and fail or hang on single-threaded 32-bit wasm.
Gate them at the method/class level so they skip where
PlatformDetection.IsMultithreadingSupported (or Is64BitProcess) is false:
- System.Threading.Tasks.Parallel: ParallelState class (TaskReplicator PNSE)
- System.Linq.Parallel: DegreeOfParallelism_{Barrier,Pipelining,
Throttled_Pipelining}; ExchangeTests.Partitioning_{Default,Striped}_Longrunning
(delegate to bodies that already throw SkipTestException, so bare
ConditionalTheory suffices); CancellationParallelQueryCombinationTests
.SequenceEqual_OperationCanceledException; WithCancellationTests
.WithCancellation_DisposedEnumerator_ChannelCancellation_ProducerBlocked
- System.Runtime: GCTests.WaitForPendingFinalizersRaces (finalizer race)
- System.Collections.Immutable: Frozen{Dictionary,Set}Tests
.ToFrozen*_WithExtremelyLargeStrings (~1 GB allocations, 64-bit only)
Verified locally on wasm/Chrome (Release, -testscope all):
- System.Linq.Parallel.Tests: 0 failed (was 66)
- System.Threading.Tasks.Parallel.Tests: 0 failed (was 77)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

/azp run runtime-libraries-coreclr outerloop-linux

Azure Pipelines successfully started running 1 pipeline(s).

…lel OuterLoop on wasm

The initial Browser+CoreCLR outerloop CI run surfaced six [OuterLoop] tests that had never run on any wasm lane before (Mono runs innerloop only). Each fails because of a single-threaded/platform gap, so gate them at the test level (never via tests.proj exclusions):

- TimerFiringTests.Timer_ChangeToDelete_DoesntFire: [Fact] -> [ConditionalFact(IsMultithreadingSupported)] (blocking waits + timer firing need threads), mirroring sibling Timer_CanDisposeSelfInCallback.
- WindowAndCursorProps.Clear_Invoke_Success: + SkipOnPlatform(Browser) (Console.Clear is unsupported on Browser).
- HttpClientTest.Send_TimeoutRequestContent_Throws: extend the existing Android skip to Android | Browser (synchronous Send is unsupported on Browser).
- MetricsTest.ExternalServer_DurationMetrics_Recorded: + SkipOnPlatform(Browser) (uses System.Net.Dns, NameResolution unsupported on Browser).
- ClientWebSocketOptionsTests.Proxy_SetNull_ConnectsSuccessfully: + SkipOnPlatform(Browser), mirroring sibling Proxy_Roundtrips.
- ResponseStreamTest.BrowserHttpHandler_StreamingRequest_Http1Fails: + ActiveIssue(129758, IsNotMonoRuntime) on the existing browser-only ConditionalFact. This browser-only test passes on Mono-wasm but throws TaskCanceledException (100s timeout) instead of HttpRequestException on CoreCLR-wasm, so it is gated off CoreCLR-wasm until the streaming-request-over-HTTP/1 path is fixed (tracked by dotnet#129758).

Also stop System.Linq.Parallel.Tests' [OuterLoop] matrix from running on the Browser+CoreCLR outerloop lane. Under -testscope all the suite expands to ~195k cases that all pass, but it produces a ~72 MB results file (which exceeds xharness's 30 MB Kestrel upload limit) and grows the wasm heap to ~1.3 GB, crashing the browser tab at teardown (xharness exit 71) and failing the work item. PLINQ has no real parallelism on a single thread, so the combinatorial OuterLoop coverage adds little value here.

Skip OuterLoop for this one suite via WithoutCategories, scoped to browser+CoreCLR+outerloop/all (-notrait category=OuterLoop, the same mechanism innerloop already uses); its innerloop tests still run on the lane (locally: 27,654 pass / 0 fail, ~7 MB results, ~156 MB heap, <1 min) and all other lanes are unaffected. The suite is small again, so its Helix solo-batch entry is removed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two [OuterLoop] tests time out (4-hour xharness cap) on the wasm/CoreCLR outerloop lane because they are pathologically slow on the single-threaded wasm interpreter, not because of a defect:

- System.Text.Json StreamTests.ReadTypeFromJsonWithLargeIgnoredProperties_OuterLoop streams 5 GiB of JSON (~1 hour per InlineData case).
- System.Linq.Expressions CompilerTests.CompileDeepTree_NoStackOverflow compiles a 10,000-deep expression tree.

Gate both with [SkipOnPlatform(TestPlatforms.Browser, ...)] so they are excluded on the browser lane while remaining covered on every other platform.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
System.Xml.Linq.xNodeBuilder.Tests is a legacy ModuleCore suite whose entire writer test surface runs through a single [Fact][OuterLoop] RunTests() entry point. On the wasm/CoreCLR outerloop lane it executes and stores results in ~9s, but the CoreCLR wasm runtime then fails to shut down cleanly, so the test harness hangs until the 4-hour xharness timeout fires (exit 71). RunTests() is [OuterLoop], so it never ran on single-threaded wasm before this lane existed (innerloop excludes it). Gate it with [SkipOnPlatform(Browser)] so it is excluded on the browser lane while remaining covered on every other platform. Verified locally (wasm/Chrome, Release, -testscope all): ungated the suite hangs until killed; gated it runs 7/8 cases to a clean WASM EXIT 0 (19 passed, 0 failed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

/azp run runtime-libraries-coreclr outerloop-linux

Azure Pipelines successfully started running 1 pipeline(s).

     public override bool Execute()
     {
         var counts = new Dictionary<string, int>();
         foreach (var item in GroupedItems)
         {
             string bid = item.GetMetadata("BatchId");
             counts.TryGetValue(bid, out int current);
             counts[bid] = current + 1;
         }

         var result = new List<ITaskItem>();
         foreach (var batchId in BatchIds)
         {
             string bid = batchId.ItemSpec;
             int count = counts.GetValueOrDefault(bid, 1);
-            // 20 minutes per suite to account for WASM startup overhead + test execution;
-            // minimum 30 minutes to handle the heaviest individual suites (e.g. Cryptography ~17m)
-            // Cap at 23:59 to prevent hh format wrapping at 24 hours
-            int totalMinutes = Math.Min(1439, Math.Max(30, count * 20));
+            // MinutesPerSuite minutes per suite to account for WASM startup overhead + test
+            // execution; MinimumMinutes floor handles the heaviest individual suites
+            // (e.g. Cryptography ~17m); capped at MaximumMinutes (kept below 24h to prevent
+            // hh format wrapping, and intended to stay under the AzDO job timeout).
+            int totalMinutes = Math.Min(MaximumMinutes, Math.Max(MinimumMinutes, count * MinutesPerSuite));
             var ts = TimeSpan.FromMinutes(totalMinutes);
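
The clamp in the diff above can be tried in isolation. The following is a minimal sketch, not the task itself: `ComputeBatchMinutes` and `FormatTimeout` are hypothetical local functions, the defaults (20 / 30 / 1439) are the values shown in the removed lines, and the 270 / 300 outerloop floor and cap come from the later commit message. The outerloop minutes-per-suite value is not stated in this thread, so the sketch simply reuses 20 as an assumption.

```csharp
using System;

// Same clamp as the diff above: floor at minimumMinutes, cap at maximumMinutes.
static int ComputeBatchMinutes(
    int suiteCount, int minutesPerSuite = 20, int minimumMinutes = 30, int maximumMinutes = 1439)
{
    return Math.Min(maximumMinutes, Math.Max(minimumMinutes, suiteCount * minutesPerSuite));
}

// "hh" is the hours component (0-23), so a value of 24 hours or more would
// wrap; this is why the cap stays at 1439 minutes (23:59).
static string FormatTimeout(int totalMinutes)
{
    return TimeSpan.FromMinutes(totalMinutes).ToString(@"hh\:mm\:ss");
}

// Default (innerloop) budget.
Console.WriteLine(FormatTimeout(ComputeBatchMinutes(1)));    // floor applies: 00:30:00
Console.WriteLine(FormatTimeout(ComputeBatchMinutes(9)));    // 9 * 20 = 180 min: 03:00:00
Console.WriteLine(FormatTimeout(ComputeBatchMinutes(100)));  // cap applies: 23:59:00

// Outerloop budget: floor 270, cap 300 (minutesPerSuite = 20 is an assumption).
Console.WriteLine(FormatTimeout(ComputeBatchMinutes(1, 20, 270, 300)));   // 04:30:00
Console.WriteLine(FormatTimeout(ComputeBatchMinutes(20, 20, 270, 300)));  // 05:00:00
```

With a 270-minute floor and a 300-minute cap, every outerloop batch lands in a narrow 4.5 h to 5 h window regardless of how many suites it holds, so the floor and cap dominate the per-suite term.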
    var soloNames = new HashSet<string>(StringComparer.Ordinal);
    foreach (var solo in SoloItems)
        soloNames.Add(solo.ItemSpec);

    // This ModuleCore entry point runs the full XNodeBuilder writer suite. On CoreCLR wasm
    // the runtime fails to shut down cleanly after it completes, so the test harness hangs
    // until it times out. It is [OuterLoop], so it never ran on single-threaded wasm before;
    // gate it off Browser to keep the lane green while preserving coverage elsewhere.
    [SkipOnPlatform(TestPlatforms.Browser, "CoreCLR wasm runtime hangs at shutdown after this suite completes.")]
    public static void RunTests()

    <_workItemTimeout Condition="'$(_workItemTimeout)' == '' and '$(TargetsAppleMobile)' == 'true'">01:15:00</_workItemTimeout>
    <_workItemTimeout Condition="'$(_workItemTimeout)' == '' and '$(TargetOS)' == 'android'">00:30:00</_workItemTimeout>
    <_workItemTimeout Condition="'$(_workItemTimeout)' == '' and '$(TargetOS)' == 'browser' and '$(RuntimeFlavor)' == 'CoreCLR' and ('$(TestScope)' == 'outerloop' or '$(TestScope)' == 'all')">03:00:00</_workItemTimeout>
    <_workItemTimeout Condition="'$(_workItemTimeout)' == '' and '$(TargetOS)' == 'browser' and '$(RuntimeFlavor)' == 'CoreCLR'">01:30:00</_workItemTimeout>
    <_workItemTimeout Condition="'$(_workItemTimeout)' == '' and '$(TargetOS)' == 'browser'">00:30:00</_workItemTimeout>

/azp run runtime-libraries-coreclr outerloop-linux

Azure Pipelines successfully started running 1 pipeline(s).

Summary
Adds a scheduled outerloop leg that runs the library test suites on Browser + CoreCLR (Chrome), including the slow `[OuterLoop]` tests. Previously the outerloop pipeline had no wasm/CoreCLR coverage, only CoreCLR desktop and Mono `browser_wasm`.

Getting the slow suites to actually complete on single-threaded wasm required three things, all scoped so innerloop behavior is unchanged:

- Enabling the `[OuterLoop]` tests that are normally excluded on wasm.
- Extending the timeouts and Helix batching so the slow suites have time to finish.
- Gating the `[OuterLoop]` tests that genuinely cannot run on a single thread (or need a 64-bit address space) with `[Conditional*]` attributes, so they skip cleanly instead of hanging or failing.

CI pipeline
- `eng/pipelines/libraries/outerloop.yml`: new `browser_wasm` + `coreclr` job (`WasmTestOnChrome`, `Release`, `/maxcpucount:1`, `timeoutInMinutes: 600`). Gated on `includeLinuxOuterloop` (browser_wasm builds on a Linux host) so it runs in the umbrella / `-linux` outerloop definitions only.
- `-testscope all`, not `outerloop`: this is deliberate. The `outerloop` scope adds a `-trait category=OuterLoop` include filter, and the wasm test host (xharness) treats a matching include as "always run", short-circuiting the `-notrait category=failing` exclude. That leaks `[OuterLoop]` tests out of classes disabled via `[ConditionalClass]` (e.g. on `IsMultithreadingSupported`) onto single-threaded wasm, where they block the one thread and hang the job. `-testscope all` carries no OuterLoop include, so the platform gates are honored; it also covers the slow suites' innerloop tests.

Enabling slow suites + timeouts/batching
- `src/libraries/tests.proj`: gate the long-running suite exclusions (`System.Linq`, `System.Memory`, `System.Private.Uri`, `System.Private.Xml`, `System.Runtime.Numerics`, `System.Text.Json`) behind a new `RunLongRunningWasmTests` property (`true` for outerloop/all). `System.Formats.Nrbf` stays unconditionally excluded (test failures, not slowness).
- `src/libraries/sendtohelix-browser.targets`: isolate each slow suite into its own solo Helix batch (via the new `SoloItems`) so one slow suite cannot inflate a shared batch, and raise the per-batch timeout budget for outerloop.
- `src/tasks/HelixTestTasks/GroupWorkItems.cs`: add `SoloItems`: work items named here each get their own negative-ID solo batch regardless of size.
- `src/tasks/HelixTestTasks/ComputeBatchTimeout.cs`: parameterize the per-batch timeout (`MinutesPerSuite`/`MinimumMinutes`/`MaximumMinutes`); defaults reproduce the previous 20-min/30-min-floor/1439-cap behavior exactly.
- `eng/testing/tests.wasm.targets`: outerloop/all wasm CoreCLR gets a `04:00:00` per-suite xharness timeout (innerloop keeps `01:30:00` / `00:30:00`).
- `src/libraries/sendtohelixhelp.proj`: extend the non-batched browser + CoreCLR work-item timeout to `03:00:00` for outerloop/all (safety net for non-batched paths).

The nested timeout layers are kept ordered suite (4h) ≤ batch (5h cap) ≤ AzDO job (10h) so a slow suite is bounded by its own timeout, and the synchronous send-to-Helix wait always fits inside the job.
Test gates (single-threaded / 64-bit only `[OuterLoop]` tests)

Discovered by running each candidate suite on wasm/Chrome under `-testscope all` and gating only the tests that actually fail, at method or class level, never via `tests.proj` exclusions:

| File | Gate |
|---|---|
| `System.Threading.Tasks.Parallel/.../ParallelStateTest.cs` | `[ConditionalClass(IsMultithreadingSupported)]` on `ParallelState` (all 77 throw PNSE in `TaskReplicator.Run`) |
| `System.Linq.Parallel/.../DegreeOfParallelismTests.cs` | `DegreeOfParallelism_{Barrier,Pipelining,Throttled_Pipelining}` → `[ConditionalTheory(IsMultithreadingSupported)]` |
| `System.Linq.Parallel/.../ExchangeTests.cs` | `Partitioning_{Default,Striped}_Longrunning` → bare `[ConditionalTheory]` (delegate into bodies that already `throw SkipTestException`; the bare form keeps `partitions ≤ 1` rows passing and skips the rest) |
| `System.Linq.Parallel/.../CancellationParallelQueryCombinationTests.cs` | `SequenceEqual_OperationCanceledException` → `[ConditionalTheory(IsMultithreadingSupported)]` |
| `System.Linq.Parallel/.../WithCancellationTests.cs` | `WithCancellation_DisposedEnumerator_...ProducerBlocked` → `[ConditionalTheory(IsMultithreadingSupported)]` |
| `System.Runtime/.../System.Threading.Tasks.Tests/MethodCoverage.cs` | `TaskContinuation` → `[ConditionalFact(IsMultithreadingSupported)]` |
| `System.Runtime/.../System.Threading.Tasks.Tests/.../TaskAwaiterTests.cs` | `GetResult_NotCompleted_BlocksUntilCompletion` → `[ConditionalFact(IsMultithreadingSupported)]` |
| `System.Runtime/.../System.Threading.Tasks.Tests/Task/TaskAPMTest.cs` | `WaitUntilCompleteTechnique`, `PollUntilCompleteTechnique` → `[ConditionalTheory(IsMultithreadingSupported)]` |
| `System.Runtime/.../System.Threading.Tasks.Tests/TaskScheduler/TaskSchedulerTests.cs` | `RunBlockedInjectionTest`, `RunSynchronizationContextTaskSchedulerTests` → `[ConditionalFact(IsMultithreadingSupported)]` |
| `System.Runtime/.../System.Runtime.Tests/System/GCTests.cs` | `WaitForPendingFinalizersRaces` → add `IsMultithreadingSupported` to existing gate |
| `System.Collections.Immutable/.../Frozen/FrozenDictionaryTests.cs`, `FrozenSetTests.cs` | `ToFrozen*_WithExtremelyLargeStrings` → `[ConditionalFact(Is64BitProcess)]` (~1 GB strings OOM on 32-bit) |

Local verification (wasm/Chrome, Release, `-testscope all`)

… `ConditionalClass`-gated and honored under `all`).

Notes / things to watch on the first runs

- `System.Net.Http.Functional` and `System.Net.WebSockets.Client` were not run locally (need loopback servers); their unsupported classes are `ConditionalClass`-gated and honored under `all`. The draft run will surface any remaining gaps.

Draft for CI validation.
Note
This pull request was created with assistance from GitHub Copilot.