Skip to content

Allow OpenTelemetry*Client.JsonSerializerOptions to control full OTel message formatting#7558

Draft
jozkee wants to merge 2 commits into
mainfrom
otel-chat-honor-jso
Draft

Allow OpenTelemetry*Client.JsonSerializerOptions to control full OTel message formatting#7558
jozkee wants to merge 2 commits into
mainfrom
otel-chat-honor-jso

Conversation

@jozkee

@jozkee jozkee commented Jun 11, 2026

Copy link
Copy Markdown
Member

OpenTelemetry clients hardcode serialization options for gen_ai.input.messages and gen_ai.output.messages, ignoring the user-supplied JsonSerializerOptions.

This PR changes the default JsonSerializerOptions on OpenTelemetryChatClient and OpenTelemetryRealtimeClient/Session from AIJsonUtilities.DefaultOptions to OtelMessageSerializer.DefaultOptions (the previously hardcoded). The provided options now control the entire serialization pipeline, including the final JSON output, not just unknown content type resolution.

Users could now customize formatting (e.g. compact JSON) by copying the default options and overriding the desired properties:

  client.JsonSerializerOptions = new JsonSerializerOptions(client.JsonSerializerOptions) { WriteIndented = false };

Breaking change

Before this fix, the setter on JsonSerializerOptions accepted any valid options object silently, even ones missing the OTel resolver. Those options were only used for a narrow fallback (unknown AIContent subtype resolution), so the missing resolver was harmless; the final serialization always used the hardcoded DefaultOptions.

After this fix, the setter throws ArgumentException if the supplied options don't contain the OtelContext resolver in their TypeInfoResolverChain.


If acceptable,
fixes #7514.

… message formatting

Change the default JsonSerializerOptions on OpenTelemetryChatClient and
OpenTelemetryRealtimeClient/Session from AIJsonUtilities.DefaultOptions
to OtelMessageSerializer.DefaultOptions (OtelContext-based, snake_case,
WriteIndented=true). The provided options now control the entire
serialization pipeline, including the final JSON output, not just
unknown content type resolution.

Users can now customize formatting (e.g. compact JSON) by copying the
default options and overriding the desired properties:

  client.JsonSerializerOptions = new JsonSerializerOptions(
      client.JsonSerializerOptions) { WriteIndented = false };

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jozkee jozkee requested a review from a team as a code owner June 11, 2026 16:37
Copilot AI review requested due to automatic review settings June 11, 2026 16:37

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the OpenTelemetry AI clients’ telemetry serialization so that user-supplied JsonSerializerOptions controls the full JSON formatting of gen_ai.input.messages / gen_ai.output.messages (including the final JSON output), rather than only influencing fallback serialization for unknown content types.

Changes:

  • Switch default serialization options for OpenTelemetry chat + realtime telemetry from AIJsonUtilities.DefaultOptions to OtelMessageSerializer.DefaultOptions.
  • Enforce that assigned JsonSerializerOptions includes the required OTel resolver (throwing ArgumentException otherwise).
  • Add/adjust tests to validate immutability, validation behavior, and WriteIndented formatting control.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
test/Libraries/Microsoft.Extensions.AI.Tests/Realtime/OpenTelemetryRealtimeClientTests.cs Updates tests to provide valid serializer options and adds coverage for immutability/validation/formatting.
test/Libraries/Microsoft.Extensions.AI.Tests/ChatCompletion/OpenTelemetryChatClientTests.cs Updates tests to chain the test resolver correctly and adds coverage for immutability/validation/formatting.
src/Libraries/Microsoft.Extensions.AI/Realtime/OpenTelemetryRealtimeClientSession.cs Uses configured options for all OTel message JSON serialization in realtime session telemetry.
src/Libraries/Microsoft.Extensions.AI/Realtime/OpenTelemetryRealtimeClient.cs Aligns default options + validates assigned options, and flows options into created sessions.
src/Libraries/Microsoft.Extensions.AI/Common/OtelMessageSerializer.cs Adds validation helper and updates serialization to honor provided options end-to-end.
src/Libraries/Microsoft.Extensions.AI/ChatCompletion/OpenTelemetryChatClient.cs Uses configured options for OTel message JSON formatting and validates assigned options.

Comment on lines +519 to 522
private string SerializeMessage(RealtimeOtelMessage message)
{
return JsonSerializer.Serialize(new[] { message }, OtelContext.Default.IEnumerableRealtimeOtelMessage);
return JsonSerializer.Serialize(new[] { message }, _jsonSerializerOptions.GetTypeInfo(typeof(IEnumerable<RealtimeOtelMessage>)));
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed this is a real gap. SerializeMessage/SerializeMessages can run from the non-streaming path before the options are frozen (today they are only made read-only in GetStreamingResponseAsync), so a caller could mutate JsonSerializerOptions after the first serialization and get inconsistent telemetry, and it is not safe under concurrency. Suggest freezing at the point of first serialization (for example call MakeReadOnly() at the top of SerializeMessage/SerializeMessages, or freeze once in the setter) so the options stay immutable for the lifetime of any emitted span.

Comment on lines +525 to 528
private string SerializeMessages(IEnumerable<RealtimeOtelMessage> messages)
{
return JsonSerializer.Serialize(messages, OtelContext.Default.IEnumerableRealtimeOtelMessage);
return JsonSerializer.Serialize(messages, _jsonSerializerOptions.GetTypeInfo(typeof(IEnumerable<RealtimeOtelMessage>)));
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same root cause as the SerializeMessage comment above. SerializeMessages also reads _jsonSerializerOptions without ensuring it is read-only first, so options can be mutated between operations and produce inconsistent output. Whatever freezing approach we pick should cover both methods (freezing once on first use, or in the setter, would handle both at once).

Comment on lines +30 to +33
Throw.ArgumentException(
nameof(options),
"The provided JsonSerializerOptions is missing the required OpenTelemetry type resolver.");
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, the message is not actionable as written. Since the OtelContext resolver is internal, a user has no obvious way to satisfy the requirement starting from a blank JsonSerializerOptions. Suggest pointing them at the supported path, for example: "The provided JsonSerializerOptions is missing the required OpenTelemetry type resolver. Copy the client's existing JsonSerializerOptions (new JsonSerializerOptions(client.JsonSerializerOptions)) and override only the properties you need." That matches the pattern already shown in the new XML doc remarks.

@dotnet-comment-bot

Copy link
Copy Markdown
Collaborator

‼️ Found issues ‼️

Project Coverage Type Expected Actual
Microsoft.Extensions.Diagnostics.Testing Line 99 98.65 🔻
Microsoft.Extensions.Telemetry Line 93 91.95 🔻
Microsoft.Extensions.AI Line 89 88.74 🔻
Microsoft.Extensions.AI Branch 89 88.89 🔻
Microsoft.Extensions.AI.OpenAI Line 75 62.65 🔻
Microsoft.Extensions.AI.OpenAI Branch 75 49.63 🔻
Microsoft.Extensions.DataIngestion.MarkItDown Line 75 4.46 🔻
Microsoft.Extensions.DataIngestion.MarkItDown Branch 75 0 🔻
Microsoft.Extensions.Diagnostics.ResourceMonitoring Line 99 96.03 🔻
Microsoft.Extensions.Diagnostics.ResourceMonitoring Branch 99 94.39 🔻
Microsoft.Extensions.Diagnostics.ResourceMonitoring.Kubernetes Line 99 97.73 🔻
Microsoft.Extensions.ServiceDiscovery.Dns Line 75 69.93 🔻
Microsoft.Extensions.ServiceDiscovery.Abstractions Line 75 42.11 🔻
Microsoft.Extensions.ServiceDiscovery.Abstractions Branch 75 42.86 🔻
Microsoft.Extensions.ServiceDiscovery Line 75 67.81 🔻
Microsoft.Extensions.ServiceDiscovery Branch 75 71.43 🔻
Microsoft.Extensions.ServiceDiscovery.Yarp Line 75 73.85 🔻
Microsoft.Extensions.ServiceDiscovery.Yarp Branch 75 70 🔻
Microsoft.Extensions.VectorData.Abstractions Line 75 37.39 🔻
Microsoft.Extensions.VectorData.Abstractions Branch 75 22.73 🔻

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

Project Expected Actual
Microsoft.Gen.BuildMetadata 97 100
Microsoft.Gen.MetadataExtractor 57 73
Microsoft.Gen.MetricsReports 67 69
Microsoft.Extensions.AI.Abstractions 82 85
Microsoft.Extensions.AI.Evaluation.NLP 0 78
Microsoft.Extensions.Caching.Hybrid 82 84
Microsoft.Extensions.DataIngestion 75 89
Microsoft.Extensions.DataIngestion.Markdig 75 90
Microsoft.Extensions.Http.Resilience 97 100

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=1459916&view=codecoverage-tab

/// Validates that the given options include the required OTel type resolver.
/// </summary>
internal static void ThrowIfMissingOtelResolver(JsonSerializerOptions options)
{

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design question: is the breaking change worth it, or could we avoid it entirely? Rejecting options that lack the resolver is what makes this source/binary breaking, since the setter previously accepted any options object silently. An alternative is to repair instead of reject: if the OtelContext resolver is missing from the supplied options, add it to the TypeInfoResolverChain (cloning first if the instance is read-only) rather than throwing. That keeps previously valid callers working, still honors the user formatting (the #7514 fix), and guarantees conformance because the resolver is always present. The tradeoff is that silently augmenting a caller-provided options object is its own minor surprise. If we deliberately want the strictness, that is fine, but it would be good to call out the breaking change explicitly in the PR description and changelog and confirm it is acceptable for this type.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This begs the question, should we repair also the PropertyNamingPolicy to always be SnakeCaseLower or were you suggesting just repairing the OtelContext and throw if PropertyNamingPolicy != SnakeCaseLower?

@tarekgh tarekgh Jun 12, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question. I dug into this and the short version is that just repairing the OtelContext resolver is not enough, and repairing only PropertyNamingPolicy still leaves gaps.

The key finding is that OtelContext's source-gen settings are not baked into the property metadata when its resolver is reused on a different JsonSerializerOptions instance. The wire format follows the live options. A quick repro with a snake_case context:

  • OtelContext.Default.Options: {"content":"hello","finish_reason":"stop"}
  • fresh options + resolver, PropertyNamingPolicy = null: {"Type":null,"Content":"hello","FinishReason":"stop"}
  • fresh options + resolver, PropertyNamingPolicy = CamelCase: {"type":null,"content":"hello","finishReason":"stop"}
  • default encoder vs UnsafeRelaxedJsonEscaping changes escaping of non-ASCII and < > + characters.

So three settings are conformance critical and all come from the live options: PropertyNamingPolicy (must be SnakeCaseLower, otherwise the gen_ai attribute names are wrong), DefaultIgnoreCondition (WhenWritingNull, otherwise null parts leak in), and Encoder (UnsafeRelaxedJsonEscaping for readable UTF-8). WriteIndented is the only one that is safe to vary, and it is the actual ask in #7514.

That makes full silent repair unattractive: we would have to clone the supplied options (they are read-only), inject the resolver, and force three of the user's settings while keeping only WriteIndented. Since this is a get/set property, get would then return something different from what was set, and we would be silently discarding most of what the caller passed. That is more code and more surprising semantics than the break it avoids.

On whether avoiding the breaking change is worth it: the break is behavioral only (the property signature is unchanged), it fails loudly and early at the setter rather than corrupting data, it lands on a niche audience (only people who set this property today), and the behavior it replaces is the bug we are fixing. So I would lean toward not contorting the design to preserve it, and instead documenting it as an intentional low severity break.

My suggestions, in order of preference:

  1. If an API adjustment is acceptable, make invalid states unrepresentable instead of validating a free-form options object. For example a configure callback Action that receives a clone of DefaultOptions (re-asserting the critical settings after it runs), or a small targeted knob such as a WriteIndented/compact toggle. This removes the repair-vs-throw question entirely and directly serves Allow controlling JSON formatting for gen_ai input/output message telemetry #7514.

  2. If we keep the property, validate and throw, but strengthen the check beyond the resolver to also require PropertyNamingPolicy == SnakeCaseLower (the one that actually corrupts the convention's attribute names), and ideally DefaultIgnoreCondition and Encoder too. Pair it with the actionable "copy the existing options" message.

I would avoid the silent repair path: it is the only option that fully prevents the break, but it costs the most complexity and the least predictable behavior.

@@ -36,8 +50,10 @@ private static JsonSerializerOptions CreateDefaultOptions()
}

internal static string SerializeChatMessages(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we are here: two other OTel middlewares call this same SerializeChatMessages but do not pass an options argument, so they always fall back to DefaultOptions and give the user no way to customize telemetry formatting (the same limitation #7514 describes for chat/realtime):

  • OpenTelemetrySpeechToTextClient (SerializeChatMessages([new(ChatRole.Assistant, response.Contents)]))
  • OpenTelemetryImageGenerator (two call sites, for input and output content)

Neither client exposes a JsonSerializerOptions property at all. Should this PR (or a follow-up) extend the same treatment to them, adding the property and flowing it through as options: _jsonSerializerOptions, so the behavior is consistent across all OTel clients that emit gen_ai message JSON? If it is out of scope here, a tracking issue would be good so they do not drift.

@tarekgh tarekgh left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments/questions. It will be good to address those in addition to copilot comments too before merging. LGTM, otherwise.

/// </summary>
internal static void ThrowIfMissingOtelResolver(JsonSerializerOptions options)
{
if (!options.TypeInfoResolverChain.Any(r => r is OtelContext))

@eiriktsarpalis eiriktsarpalis Jun 12, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this check is a bit too aggressive and I'm not aware of precedent elsewhere. Why do we need OtelContext specifically to present in the chain? It may not matter at all if OtelContext sits at the end of the chain with DefaultJsonTypeInfoResolver at the front (it resolves all types, so the serializer would never fall back to OtelContext).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not matter at all if OtelContext sits at the end of the chain

Good point, it won't hold in that case.

For context, OtelContext is internal and is needed for AOT serialization of OTel message-part POCOs. The goal of the change is to allow some user-options to control serialization, so we can either:

a. Reject user options that wouldn't produce otel-conformant json, that is require snake-case and OtelContext first in the resolver chain, "first" is the simple, enforceable approximation that guarantees correctness at the cost of rejecting some valid-but-unusual chains. The exception message will tell users to copy the existing options (new JsonSerializerOptions(otelClient.JsonSerializerOptions)) and then override the properties they need.

b. Copy a curated set of user knobs onto our fixed options (e.g. WriteIndented, Encoder) and ignore the rest. This never throws, but silently dropping options the user set is its own surprise as @tarekgh mentioned, which is why I leaned toward throwing.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c. Do nothing: accept any options, document the guidance. No validation, no throwing, no repairing. The remarks advise copying the default and overwriting only what you need. If a user passes options that break conformance, that's on them.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a. and c. would be breaking in different ways:

(a) Reject — API/compile/throw break. Previously-accepted options now throw ArgumentException from the setter. That's a source/binary behavioral break: code that compiled and ran fine now fails. Clearly breaking.

(c) Do nothing — behavioral break, not an API break. The setter still accepts everything (no throw, same signature), so nothing breaks at compile/bind time. But options that used to be ignored for the envelope now control it, for example, a user passing new() will see errors in AOT for the missing OtelContext.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, agreed, the check targets the wrong thing. Presence of OtelContext does not guarantee it is used (a DefaultJsonTypeInfoResolver at the front resolves everything first), and it is not necessary either, since any resolver that can resolve the OTel types works. GetTypeInfo already throws on its own when nothing resolves the types, so the guard adds no reliable value. I will drop it. Note too that the conformance critical bits (PropertyNamingPolicy, DefaultIgnoreCondition, Encoder) come from the live options, not the resolver, so if we want to guarantee correct telemetry the better levers are validating those settings or narrowing the API. Happy to go whichever way @eiriktsarpalis and @jozkee prefer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is require snake-case and OtelContext first in the resolver chain

Users can always assign a new JSO to the setter but then modify its settings while it is still mutable. So I don't see how validation at the setter would prevent anything. You might consider running validation when you first try to serialize after the JSO has been marked read-only.

Can I ask why you specifically are looking for OtelContext? I don't think it's a good idea to demand the presence of what amounts to internal implementation detail and users cannot directly control. I would suggest instead to examine specific structural properties of the resultant JsonTypeInfo for the types you're interested in an validate on those.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why you specifically are looking for OtelContext?

As a convenient way of spotting JSO not created with the copy ctor, also for AOT, if we don't do the check, we will get a cryptic InvalidOperationException: "JsonSerializerOptions instance must specify a TypeInfoResolver setting before being marked as read-only."

I would suggest instead to examine specific structural properties of the resultant JsonTypeInfo for the types you're interested in an validate on those

I will give it a try, but I suspect that may be too brittle.

@jozkee jozkee marked this pull request as draft June 12, 2026 23:09
User-supplied JsonSerializerOptions now flow into full OpenTelemetry gen-ai message serialization. To avoid emitting non-conformant JSON, the options are validated and frozen on first use (before MakeReadOnly): the snake_case naming policy must be JsonNamingPolicy.SnakeCaseLower and the built-in OpenTelemetry type resolver must be first in the TypeInfoResolverChain so a reflection-based resolver cannot shadow it. Non-conformant options throw InvalidOperationException with an actionable copy-and-override hint.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow controlling JSON formatting for gen_ai input/output message telemetry

5 participants