Move delta record/collect coordination from instrument to series level#8313

Open
jack-berg wants to merge 11 commits into open-telemetry:main from jack-berg:delta-aggregator-handle-coordination-1

Conversation

@jack-berg
Member

Alternative to #8077. #8077 improves performance by striping the AtomicInteger used to coordinate between record and collect threads, but this comes at the cost of single-threaded performance, because selecting the striped AtomicInteger instance is additional work that a single-threaded caller doesn't need.

With this PR, I also increase the number of coordinator AtomicInteger instances to reduce contention, but do so by moving the AtomicInteger from the instrument level (i.e. DeltaSynchronousMetricStorage) to the timeseries level (i.e. AggregatorHandle).

  • Each timeseries gets its own AtomicInteger
  • At record time, the record thread increments / decrements the timeseries' AtomicInteger instance
  • At collect time, the collect thread marks all timeseries AtomicInteger instances for collect, then waits for all active records to complete, then collects

This means there's no additional work in the single-threaded case, but callers recording against multiple series concurrently get to distribute the contention across as many AtomicInteger instances as there are series. You see this manifest in the benchmark diff: the single-threaded cases don't change, and there are significant improvements in the threads=4, temporality=delta, cardinality=100 cases.

One downside of this approach is that multiple threads concurrently recording to a single timeseries are still bottlenecked by a single AtomicInteger. You see this manifest in the benchmark diff: the threads=4, temporality=delta, cardinality=1 cases don't improve.
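The per-series coordination described above can be sketched as follows. This is a minimal, hedged sketch, not the SDK's actual AggregatorHandle: the class name SeriesHandle and the exact state encoding (even = idle, odd = locked for collect, each in-flight recorder adds 2) are assumptions based on the description.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of per-series record/collect coordination.
// Even state = idle, odd state = locked for collect; increments of 2
// count in-flight recorders without disturbing the parity bit.
class SeriesHandle {
    private final AtomicInteger state = new AtomicInteger(0);

    /** Record thread: returns false if the series is locked for collect. */
    boolean tryAcquireForRecord() {
        while (true) {
            int current = state.get();
            if ((current & 1) == 1) {
                return false; // collect in progress; caller must retry elsewhere
            }
            if (state.compareAndSet(current, current + 2)) {
                return true; // registered as an active recorder
            }
        }
    }

    void releaseAfterRecord() {
        state.addAndGet(-2);
    }

    /** Collect thread: flip parity to block new recorders on this series. */
    void lockForCollect() {
        state.incrementAndGet(); // state becomes odd
    }

    /** Collect thread: spin until all in-flight recorders drain. */
    void awaitRecorders() {
        while (state.get() != 1) {
            Thread.yield(); // release the core to a possibly-preempted recorder
        }
    }

    void unlockAfterCollect() {
        state.decrementAndGet(); // back to even/idle
    }
}
```

Because each series owns its own gate, recorders touching different series never contend on the same AtomicInteger, which is exactly where the threads=4/cardinality=100 wins come from.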

Note, the benchmarks below incorporate the changes from #8308, which leads to much less variance/noise than previously.

| Threads | Temporality | Cardinality | Instrument | 0_baseline (ops/s) | 2_aggregator_handle_tracking_minamlist (ops/s) | Δ ops/s | Δ % |
|---|---|---|---|---|---|---|---|
| 1 | DELTA | 1 | COUNTER_SUM | 139,309,882 | 126,772,929 | -12,536,953 | -9.0% |
| 1 | DELTA | 1 | UP_DOWN_COUNTER_SUM | 138,953,746 | 133,244,856 | -5,708,890 | -4.1% |
| 1 | DELTA | 1 | GAUGE_LAST_VALUE | 37,205,058 | 41,952,132 | +4,747,074 | +12.8% |
| 1 | DELTA | 1 | HISTOGRAM_EXPLICIT | 76,174,648 | 72,036,559 | -4,138,089 | -5.4% |
| 1 | DELTA | 1 | HISTOGRAM_BASE2_EXPONENTIAL | 43,956,180 | 41,842,933 | -2,113,247 | -4.8% |
| 1 | DELTA | 100 | COUNTER_SUM | 88,390,625 | 101,415,374 | +13,024,749 | +14.7% |
| 1 | DELTA | 100 | UP_DOWN_COUNTER_SUM | 100,682,536 | 104,228,731 | +3,546,195 | +3.5% |
| 1 | DELTA | 100 | GAUGE_LAST_VALUE | 31,129,986 | 30,153,053 | -976,933 | -3.1% |
| 1 | DELTA | 100 | HISTOGRAM_EXPLICIT | 79,728,720 | 71,241,352 | -8,487,368 | -10.6% |
| 1 | DELTA | 100 | HISTOGRAM_BASE2_EXPONENTIAL | 41,119,222 | 42,712,082 | +1,592,860 | +3.9% |
| 1 | CUMULATIVE | 1 | COUNTER_SUM | 156,207,611 | 156,051,490 | -156,121 | -0.1% |
| 1 | CUMULATIVE | 1 | UP_DOWN_COUNTER_SUM | 164,723,673 | 164,464,216 | -259,457 | -0.2% |
| 1 | CUMULATIVE | 1 | GAUGE_LAST_VALUE | 63,483,186 | 67,329,671 | +3,846,485 | +6.1% |
| 1 | CUMULATIVE | 1 | HISTOGRAM_EXPLICIT | 102,708,149 | 102,593,029 | -115,120 | -0.1% |
| 1 | CUMULATIVE | 1 | HISTOGRAM_BASE2_EXPONENTIAL | 46,786,722 | 47,702,621 | +915,899 | +2.0% |
| 1 | CUMULATIVE | 100 | COUNTER_SUM | 93,234,874 | 92,660,085 | -574,789 | -0.6% |
| 1 | CUMULATIVE | 100 | UP_DOWN_COUNTER_SUM | 102,773,896 | 93,401,286 | -9,372,610 | -9.1% |
| 1 | CUMULATIVE | 100 | GAUGE_LAST_VALUE | 90,948,347 | 87,231,953 | -3,716,394 | -4.1% |
| 1 | CUMULATIVE | 100 | HISTOGRAM_EXPLICIT | 78,204,809 | 77,987,516 | -217,293 | -0.3% |
| 1 | CUMULATIVE | 100 | HISTOGRAM_BASE2_EXPONENTIAL | 44,678,570 | 44,894,584 | +216,014 | +0.5% |
| 4 | DELTA | 1 | COUNTER_SUM | 20,959,707 | 16,469,448 | -4,490,259 | -21.4% |
| 4 | DELTA | 1 | UP_DOWN_COUNTER_SUM | 20,991,865 | 16,210,248 | -4,781,617 | -22.8% |
| 4 | DELTA | 1 | GAUGE_LAST_VALUE | 19,358,172 | 18,777,835 | -580,337 | -3.0% |
| 4 | DELTA | 1 | HISTOGRAM_EXPLICIT | 11,868,012 | 11,091,512 | -776,500 | -6.5% |
| 4 | DELTA | 1 | HISTOGRAM_BASE2_EXPONENTIAL | 10,236,804 | 10,606,743 | +369,939 | +3.6% |
| 4 | DELTA | 100 | COUNTER_SUM | 24,494,401 | 73,211,762 | +48,717,361 | +198.9% |
| 4 | DELTA | 100 | UP_DOWN_COUNTER_SUM | 24,520,580 | 67,246,731 | +42,726,151 | +174.2% |
| 4 | DELTA | 100 | GAUGE_LAST_VALUE | 22,992,790 | 47,030,541 | +24,037,751 | +104.5% |
| 4 | DELTA | 100 | HISTOGRAM_EXPLICIT | 23,253,146 | 56,965,510 | +33,712,364 | +145.0% |
| 4 | DELTA | 100 | HISTOGRAM_BASE2_EXPONENTIAL | 21,707,199 | 57,301,938 | +35,594,739 | +164.0% |
| 4 | CUMULATIVE | 1 | COUNTER_SUM | 79,758,391 | 75,378,143 | -4,380,248 | -5.5% |
| 4 | CUMULATIVE | 1 | UP_DOWN_COUNTER_SUM | 73,199,378 | 73,274,059 | +74,681 | +0.1% |
| 4 | CUMULATIVE | 1 | GAUGE_LAST_VALUE | 30,208,943 | 30,232,185 | +23,242 | +0.1% |
| 4 | CUMULATIVE | 1 | HISTOGRAM_EXPLICIT | 18,515,768 | 18,585,553 | +69,785 | +0.4% |
| 4 | CUMULATIVE | 1 | HISTOGRAM_BASE2_EXPONENTIAL | 18,665,724 | 16,009,460 | -2,656,264 | -14.2% |
| 4 | CUMULATIVE | 100 | COUNTER_SUM | 108,295,650 | 99,551,354 | -8,744,296 | -8.1% |
| 4 | CUMULATIVE | 100 | UP_DOWN_COUNTER_SUM | 107,571,455 | 101,067,120 | -6,504,335 | -6.0% |
| 4 | CUMULATIVE | 100 | GAUGE_LAST_VALUE | 95,965,166 | 97,450,271 | +1,485,105 | +1.5% |
| 4 | CUMULATIVE | 100 | HISTOGRAM_EXPLICIT | 78,957,306 | 76,630,707 | -2,326,599 | -2.9% |
| 4 | CUMULATIVE | 100 | HISTOGRAM_BASE2_EXPONENTIAL | 73,614,813 | 73,852,367 | +237,554 | +0.3% |

@jack-berg jack-berg requested a review from a team as a code owner April 21, 2026 20:37
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;

class CumulativeSynchronousMetricStorage<T extends PointData>
Member Author


With this, the code paths for cumulative vs. delta are different enough that including them in the same file just makes everything harder to read / understand.

Separately, just look how simple the cumulative code paths are. With delta we jump through extra hoops required for record/collect coordination, and despite that still can't match the cumulative performance, which is just dirt simple.

Aggregation aggregation,
MemoryMode memoryMode,
InstrumentValueType instrumentValueType) {
for (int repetition = 0; repetition < 50; repetition++) {
Member Author


TODO: revert or make configurable before merging

Increasing the reps of this test really helps build confidence that experiments are correct.

A useful tool when iterating experiments, but increases the test runtime.

@codecov

codecov Bot commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 98.20359% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.89%. Comparing base (df8063b) to head (d743384).
⚠️ Report is 24 commits behind head on main.

Files with missing lines Patch % Lines
.../internal/state/DeltaSynchronousMetricStorage.java 98.41% 0 Missing and 2 partials ⚠️
...rnal/state/CumulativeSynchronousMetricStorage.java 97.50% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #8313      +/-   ##
============================================
+ Coverage     90.31%   90.89%   +0.58%     
- Complexity     7725     8002     +277     
============================================
  Files           850      900      +50     
  Lines         23259    24087     +828     
  Branches       2364     2402      +38     
============================================
+ Hits          21007    21895     +888     
+ Misses         1528     1454      -74     
- Partials        724      738      +14     


instrument.record(measurement, attr);
}
Uninterruptibles.sleepUninterruptibly(ONE_MICROSECOND);
Thread.yield();
Member Author


TIL that Java 8, 11, and 17 don't have the precision to sleep for periods of less than 1ms, so every 1us sleep was actually sleeping for ~1ms. After I changed the test to run 50 times to increase confidence of correctness, it took so long that it essentially timed out.

So to make it run fast, I switched to a strategy where I start all the record and collect threads, but gate their activity on a latch that indicates the collect thread has started. After each record, the thread yields to allow interleaving. This allows the concurrent collect and record we want to simulate without needing to sleep.
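That strategy can be sketched roughly like this. It is a hedged illustration, not the actual test code: the helper name StressSketch, the recorder count, and the iteration count are all assumptions.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: gate recorder threads on a latch so record and
// collect genuinely overlap, and yield instead of sleeping, since on
// some JDKs sub-millisecond sleeps round up to ~1ms.
class StressSketch {
    static void run(Runnable record, Runnable collect, int recorders) throws InterruptedException {
        CountDownLatch collectStarted = new CountDownLatch(1);
        Thread[] threads = new Thread[recorders];
        for (int i = 0; i < recorders; i++) {
            threads[i] = new Thread(() -> {
                try {
                    collectStarted.await(); // don't record until collect is underway
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < 1_000; j++) {
                    record.run();
                    Thread.yield(); // encourage interleaving with the collect thread
                }
            });
            threads[i].start();
        }
        Thread collector = new Thread(() -> {
            collectStarted.countDown(); // release the recorders
            collect.run();
        });
        collector.start();
        collector.join();
        for (Thread t : threads) {
            t.join();
        }
    }
}
```

The key point is that no thread ever sleeps: the latch forces the overlap the test wants to exercise, and yielding after each record maximizes interleaving without the ~1ms sleep penalty.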

Both spin loops in DeltaSynchronousMetricStorage burned the collect
thread's core while waiting for a recorder that may have been preempted
mid-section. On constrained CPU (single-CPU containers, pinned cores),
the recorder could not be rescheduled until the OS forced a quantum,
multiplied per stuck handle.

Add Thread.yield() in both loops so the collect thread releases its
core. Matches the existing yield in acquireHandleForRecord.
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
@zeitlinger
Member

Reviewed against #8077. Confirms this is Pareto-optimal: #8077's striped ReentrantLocks gave large multi-threaded wins but regressed single-threaded performance by 26-42% on DELTA hot paths. This PR keeps the single-threaded baseline (mostly within ±5%, worst -10.6%) while still capturing the 100-200% multi-threaded wins at cardinality=100. The acknowledged miss at threads=4/cardinality=1 is the right tradeoff: that workload is rare in practice (real instruments have labels), and single-threaded recording is not.

I also compared the design with prometheus/client_java's Buffer.java. It uses the same even/odd AtomicInteger + busy-wait protocol, but stripes per-CPU rather than per-series. That makes sense for client_java because their hot path is histogram.observe() on a single series; it would not fit the OTel goal here (per-CPU striping would re-introduce the single-threaded regression that closed #8077). Per-series striping is the right axis for this workload.

Walked the happens-before chain for the subtle late-recorder case (recorder passes tryAcquireForRecord after awaitRecordersAndUnlock has decremented state back to even). The holder.isLockedForCollect() re-check catches it because the holder gate is locked first and never unlocks within a collection. The comment at line ~660 accurately describes this. Pool growth is bounded because new-series creation is gated during collect, so the previous trim step is correctly removed.
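The late-recorder guard walked through above can be illustrated with a hedged sketch. The names (LateRecorderGuard, holderLockedForCollect, handleState) and the exact gate layout are assumptions simplified from the description; the real code lives in DeltaSynchronousMetricStorage and AggregatorHandle.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the late-recorder guard: the outer holder gate
// is locked before any per-handle gate and stays locked for the whole
// collection, so a recorder that slips past a just-unlocked handle gate
// is still caught by re-checking the holder.
class LateRecorderGuard {
    final AtomicBoolean holderLockedForCollect = new AtomicBoolean(false);
    final AtomicInteger handleState = new AtomicInteger(0); // even = idle

    boolean tryRecord() {
        int current = handleState.get();
        if ((current & 1) == 1 || !handleState.compareAndSet(current, current + 2)) {
            return false; // handle gate closed, or lost the CAS race
        }
        // Re-check the outer gate: the handle may have been unlocked by a
        // collection that is still in progress elsewhere.
        if (holderLockedForCollect.get()) {
            handleState.addAndGet(-2); // back out of the record
            return false;
        }
        return true;
    }
}
```

The ordering matters: because the holder gate is set before any handle gate and cleared after all of them, a recorder that observes an unlocked handle mid-collection must still observe the locked holder on the re-check.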

Issues

1. Busy-spin without yield in the collect-thread loops (most important).
awaitRecordersAndUnlock() (line ~354) and lockForCollectAndAwait() (line ~296) will burn the collect thread's core if a recorder is preempted mid-section. On constrained CPU (1-CPU containers, pinned cores, oversubscribed hosts), the recorder cannot be rescheduled until the OS forces a quantum, multiplied per stuck handle. Stacked fix here: jack-berg#27 — adds Thread.yield() to both loops, consistent with the existing yield in acquireHandleForRecord.

2. getDeltaAggregatorHandle name is misleading.
It has side effects (state += 2, may return null) but reads like a plain getter. Suggest renaming to tryAcquireHandleForRecord.

3. Stress test 50× repetitions.
SynchronousInstrumentStressTest.stressTest now runs 50× per parameter combo. Justified by the algorithm's subtlety, but worth gating behind a system property or moving to a dedicated stress profile so PR CI does not pay for it on every run.

@laurit
Contributor

laurit commented May 8, 2026

Stacked fix here: jack-berg#27 — adds Thread.yield() to both loops, consistent with the existing yield in acquireHandleForRecord.

alternatively could consider Thread.onSpinWait that was introduced in jdk9
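For illustration, a sketch of what the onSpinWait variant could look like (hypothetical method and class name; Thread.onSpinWait() is a JDK 9+ hint that typically lowers to an instruction like x86 PAUSE, easing pipeline and power pressure, but unlike Thread.yield() it does not give up the core to the scheduler):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical spin loop using Thread.onSpinWait (JDK 9+). A CPU-level
// busy-wait hint; on a constrained/oversubscribed host, Thread.yield
// may still be needed to let a preempted recorder run at all.
class SpinWaitSketch {
    static void awaitIdle(AtomicInteger state, int idleValue) {
        while (state.get() != idleValue) {
            Thread.onSpinWait();
        }
    }
}
```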

Yield collect-thread spin loops to unblock preempted recorders
@jack-berg
Member Author

  1. Busy-spin without yield in the collect-thread loops (most important).

Merged

alternatively could consider Thread.onSpinWait that was introduced in jdk9

Will explore separately: #8382

  2. getDeltaAggregatorHandle name is misleading.

Will improve

  3. Stress test 50× repetitions.

#8313 (comment)
