fix(eap): Substitute nil trace ID#6079
Conversation
| TraceId::try_from(*event.id.get_or_insert_with(Default::default)) | ||
| .unwrap_or_else(|_| TraceId::random()) |
There was a problem hiding this comment.
This was another place where nil-trace IDs could originate.
| impl TryFrom<EventId> for TraceId { | ||
| type Error = InvalidTraceId; | ||
| fn try_from(event_id: EventId) -> Result<Self, Self::Error> { | ||
| Self::try_from(event_id.0) | ||
| } | ||
| } | ||
|
|
||
| impl From<Uuid> for TraceId { | ||
| fn from(uuid: Uuid) -> Self { | ||
| TraceId(uuid) | ||
| impl From<TraceId> for Uuid { | ||
| fn from(trace_id: TraceId) -> Self { | ||
| trace_id.0 | ||
| } | ||
| } | ||
|
|
||
| impl From<EventId> for TraceId { | ||
| fn from(event_id: EventId) -> Self { | ||
| Self::from(event_id.0) | ||
| impl fmt::Display for TraceId { | ||
| fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | ||
| write!(f, "{}", self.0.as_simple()) | ||
| } | ||
| } | ||
|
|
||
| impl From<TraceId> for Uuid { | ||
| fn from(trace_id: TraceId) -> Self { | ||
| trace_id.0 | ||
| impl fmt::Debug for TraceId { | ||
| fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | ||
| write!(f, "TraceId(\"{}\")", self.0.as_simple()) |
There was a problem hiding this comment.
I moved the constructors up so they are all in one place. Messes with the diff but makes for better readability over all IMO.
| @@ -1,5 +1,5 @@ | |||
| { | |||
| "trace_id": "00000000-0000-0000-0000-000000000000", | |||
| "trace_id": "00000000-0000-0000-0000-000000000001", | |||
There was a problem hiding this comment.
Tests with nil IDs would now fail.
| Some((kv.key, Annotated::new(attr_value))) | ||
| })); | ||
|
|
||
| let trace_id = TraceId::try_from_or_random(otel_link.trace_id.as_slice()); |
There was a problem hiding this comment.
try_from_or_random is used in other Otel converters, so I thought I might as well use it here.
| /// Not a valid UUID. | ||
| Uuid, |
There was a problem hiding this comment.
nit: Maybe no need to leak that there is an underlying uuid id as there are defined rules for trace ids which don't necessarily match a uuid. Could just call it Invalid.
| Ok(Self(uuid)) | ||
| Uuid::from_slice(value) | ||
| .map_err(|_| InvalidTraceId::Uuid) | ||
| .map(TraceId) |
There was a problem hiding this comment.
Shouldn't this delegate to TryFrom<Uuid> other wise the nil check is missing?
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 4b98c24. Configure here.
| if uuid.is_nil() { | ||
| return Err(InvalidTraceId::Nil); | ||
| } | ||
| Ok(TraceId(uuid)) |
There was a problem hiding this comment.
Serde nil IDs not substituted
Medium Severity
TryFrom/FromStr now reject all-zero trace IDs, and impl_str_serde deserializes through FromStr. Nil IDs in types like DynamicSamplingContext fail parsing instead of being replaced with a random ID, unlike the FromValue path that substitutes nil and records nil_trace_id.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 4b98c24. Configure here.
There was a problem hiding this comment.
In a follow-up PR I will make parsing more restrictive and always reject on nil IDs. For now, an error in the DSC will hit the error boundary and default to None.
| Some((kv.key, Annotated::new(attr_value))) | ||
| })); | ||
|
|
||
| let trace_id = TraceId::try_from_or_random(otel_link.trace_id.as_slice()); | ||
| let span_link = SpanV2Link { | ||
| trace_id: Annotated::new(hex::encode(otel_link.trace_id).parse()?), | ||
| trace_id, | ||
| span_id: SpanId::try_from(otel_link.span_id.as_slice())?.into(), | ||
| sampled: (otel_link.flags & W3C_TRACE_CONTEXT_SAMPLED != 0).into(), | ||
| attributes: Annotated::new(attributes), |
There was a problem hiding this comment.
Bug: The try_from_or_random function incorrectly generates a "trace_id.invalid" remark for nil OTel trace IDs instead of "nil_trace_id", because it only checks if the slice is empty.
Severity: LOW
Suggested Fix
Modify try_from_or_random to handle the InvalidTraceId::Nil error case specifically. This can be done by changing the error handling to pattern match on the error type, similar to how FromValue::from_value is implemented, to distinguish between a nil ID and a generally invalid one. This will allow it to generate the correct "nil_trace_id" remark.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: relay-spans/src/otel_to_sentry_v2.rs#L194-L202
Potential issue: When processing an OTel span link with a nil trace ID (16 zero bytes),
the `TraceId::try_from_or_random` function is called. The underlying `TryFrom<Uuid> for
TraceId` implementation correctly identifies the nil ID and returns an
`InvalidTraceId::Nil` error. However, the `try_from_or_random` function discards this
specific error information. It then checks if the input slice `is_empty()`, which is
false for a 16-byte nil ID. Consequently, it incorrectly adds a remark of
`"trace_id.invalid"` instead of the expected `"nil_trace_id"`. This prevents a metric
counter, which specifically checks for the `"nil_trace_id"` remark, from being
incremented, creating an observability gap.
Also affects:
relay-event-schema/src/protocol/contexts/trace.rs:66~139


Nil trace IDs (UUIDs with all bytes zero) have the negative product property that they connect unrelated items into a single trace, and, if there are many of them, have negative partitioning effects (hot partitions).
Let's disallow nil trace IDs everywhere. Instead of treating them as invalid (like we would an invalid UUID), replace them with a random trace ID to avoid any product regression.
Relates to getsentry/sentry-python#6558