Skip to content

fix(eap): Substitute nil trace ID#6079

Open
jjbayer wants to merge 11 commits into
masterfrom
fix/substitute-nil-trace-id
Open

fix(eap): Substitute nil trace ID#6079
jjbayer wants to merge 11 commits into
masterfrom
fix/substitute-nil-trace-id

Conversation

@jjbayer

@jjbayer jjbayer commented Jun 11, 2026

Copy link
Copy Markdown
Member

Nil trace IDs (UUIDs with all bytes zero) have the negative product property that they connect unrelated items into a single trace, and, if there are many of them, have negative partitioning effects (hot partitions).

Let's disallow nil trace IDs everywhere. Instead of treating them as invalid (like we would an invalid UUID), replace them with a random trace ID to avoid any product regression.

Relates to getsentry/sentry-python#6558

@jjbayer jjbayer marked this pull request as ready for review June 11, 2026 12:27
@jjbayer jjbayer requested a review from a team as a code owner June 11, 2026 12:27
Comment thread relay-spans/src/otel_to_sentry_v2.rs
Comment thread relay-event-schema/src/protocol/contexts/trace.rs
@jjbayer jjbayer marked this pull request as draft June 11, 2026 12:47
Comment on lines +1376 to +1377
TraceId::try_from(*event.id.get_or_insert_with(Default::default))
.unwrap_or_else(|_| TraceId::random())

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was another place where nil-trace IDs could originate.

Comment on lines +115 to +136
impl TryFrom<EventId> for TraceId {
type Error = InvalidTraceId;
fn try_from(event_id: EventId) -> Result<Self, Self::Error> {
Self::try_from(event_id.0)
}
}

impl From<Uuid> for TraceId {
fn from(uuid: Uuid) -> Self {
TraceId(uuid)
impl From<TraceId> for Uuid {
fn from(trace_id: TraceId) -> Self {
trace_id.0
}
}

impl From<EventId> for TraceId {
fn from(event_id: EventId) -> Self {
Self::from(event_id.0)
impl fmt::Display for TraceId {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "{}", self.0.as_simple())
}
}

impl From<TraceId> for Uuid {
fn from(trace_id: TraceId) -> Self {
trace_id.0
impl fmt::Debug for TraceId {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "TraceId(\"{}\")", self.0.as_simple())

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the constructors up so they are all in one place. Messes with the diff but makes for better readability over all IMO.

@@ -1,5 +1,5 @@
{
"trace_id": "00000000-0000-0000-0000-000000000000",
"trace_id": "00000000-0000-0000-0000-000000000001",

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests with nil IDs would now fail.

Some((kv.key, Annotated::new(attr_value)))
}));

let trace_id = TraceId::try_from_or_random(otel_link.trace_id.as_slice());

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try_from_or_random is used in other Otel converters, so I thought I might as well use it here.

@jjbayer jjbayer marked this pull request as ready for review June 11, 2026 13:25
Comment thread relay-event-schema/src/protocol/contexts/trace.rs Outdated
Comment thread relay-spans/src/otel_to_sentry_v2.rs
Comment on lines +74 to +75
/// Not a valid UUID.
Uuid,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe no need to leak that there is an underlying uuid id as there are defined rules for trace ids which don't necessarily match a uuid. Could just call it Invalid.

Ok(Self(uuid))
Uuid::from_slice(value)
.map_err(|_| InvalidTraceId::Uuid)
.map(TraceId)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this delegate to TryFrom<Uuid> other wise the nil check is missing?

@jjbayer jjbayer enabled auto-merge June 11, 2026 14:16

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 4b98c24. Configure here.

if uuid.is_nil() {
return Err(InvalidTraceId::Nil);
}
Ok(TraceId(uuid))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Serde nil IDs not substituted

Medium Severity

TryFrom/FromStr now reject all-zero trace IDs, and impl_str_serde deserializes through FromStr. Nil IDs in types like DynamicSamplingContext fail parsing instead of being replaced with a random ID, unlike the FromValue path that substitutes nil and records nil_trace_id.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 4b98c24. Configure here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a follow-up PR I will make parsing more restrictive and always reject on nil IDs. For now, an error in the DSC will hit the error boundary and default to None.

Comment on lines 194 to 202
Some((kv.key, Annotated::new(attr_value)))
}));

let trace_id = TraceId::try_from_or_random(otel_link.trace_id.as_slice());
let span_link = SpanV2Link {
trace_id: Annotated::new(hex::encode(otel_link.trace_id).parse()?),
trace_id,
span_id: SpanId::try_from(otel_link.span_id.as_slice())?.into(),
sampled: (otel_link.flags & W3C_TRACE_CONTEXT_SAMPLED != 0).into(),
attributes: Annotated::new(attributes),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The try_from_or_random function incorrectly generates a "trace_id.invalid" remark for nil OTel trace IDs instead of "nil_trace_id", because it only checks if the slice is empty.
Severity: LOW

Suggested Fix

Modify try_from_or_random to handle the InvalidTraceId::Nil error case specifically. This can be done by changing the error handling to pattern match on the error type, similar to how FromValue::from_value is implemented, to distinguish between a nil ID and a generally invalid one. This will allow it to generate the correct "nil_trace_id" remark.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: relay-spans/src/otel_to_sentry_v2.rs#L194-L202

Potential issue: When processing an OTel span link with a nil trace ID (16 zero bytes),
the `TraceId::try_from_or_random` function is called. The underlying `TryFrom<Uuid> for
TraceId` implementation correctly identifies the nil ID and returns an
`InvalidTraceId::Nil` error. However, the `try_from_or_random` function discards this
specific error information. It then checks if the input slice `is_empty()`, which is
false for a 16-byte nil ID. Consequently, it incorrectly adds a remark of
`"trace_id.invalid"` instead of the expected `"nil_trace_id"`. This prevents a metric
counter, which specifically checks for the `"nil_trace_id"` remark, from being
incremented, creating an observability gap.

Also affects:

  • relay-event-schema/src/protocol/contexts/trace.rs:66~139

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants