feat(outcomes): Have relay generate metric billing outcomes#6066
feat(outcomes): Have relay generate metric billing outcomes#6066klochek wants to merge 3 commits into
Conversation
loewenheim
left a comment
There was a problem hiding this comment.
For the tests, I think it's preferable if they are run with the new feature both on and off. Although I do realize that makes it much more annoying to define what the outcome should be for every combination.
|
|
||
| /// Tracks billing-related outcomes for the list of buckets, adding the | ||
| /// "billing_outcome_accepted" tag to the bucket if that bucket is accepted. | ||
| pub fn track_billing_outcome(&self, scoping: Scoping, buckets: &mut [Bucket]) { |
There was a problem hiding this comment.
Is this intentionally only implemented for spans?
There was a problem hiding this comment.
yes, since this only serves the billing outcomes path.
| PYTEST_N ?= auto | ||
| TEST ?= tests | ||
| test-custom: build setup-venv ## run arbitrary tests | ||
| .venv/bin/pytest -n $(PYTEST_N) $(TEST) -vv | ||
| .PHONY: test-custom |
There was a problem hiding this comment.
Nice. One question (as a makefile novice) do we need PYTEST_N ?= auto on line 78 since it is already on line 73?
There was a problem hiding this comment.
Nope, that's me copy-pasting a little too fast :P Per David's comment, I'll remove this from this PR, and then move the variables up to the top so we don't wind up with them everywhere.
|
|
||
| /// Enable relay billing outcome generation. | ||
| #[serde(rename = "organizations:relay-generate-billing-outcome")] | ||
| GenerateBillingOutcome, |
There was a problem hiding this comment.
I think we would want this just under MinidumpUploads, since now it is in the Deprecated* group.
Also must admit that I am not sure if there is a functional difference these days between "organizations:... and "projects:...
There was a problem hiding this comment.
Yep, ideally it's not in the deprecated group.
these days between "organizations:... and "projects:...
You can also roll out a projects: flag to all projects in an org, but you can't roll out a organizations: flag to a single project. Either is fine here.
| ) # should be kept by dynamic sampling | ||
|
|
||
| outcomes = outcomes_consumer.get_outcomes() | ||
| outcomes = outcomes_consumer.get_aggregated_outcomes(timeout=5) |
There was a problem hiding this comment.
Correct, no timeouts, only counts.
| outcomes = outcomes_consumer.get_aggregated_outcomes(timeout=5) | |
| outcomes = outcomes_consumer.get_aggregated_outcomes(n={HOW MANY INDIVIDUAL MESSAGES ARE EXPECTED}) |
The n= is optional, but speeds up tests as we don't have to wait timeout for no more messages to arrive. A missed message will still fail the test, as we assert empty on teardown.
There was a problem hiding this comment.
I did this because the outcome aggregator is pretty timing sensitive; I was seeing both fully-aggregated results, and dis-aggregated results (the same kinds of spans broken into two buckets due to timing.) I've added a new method to make this faster.
There was a problem hiding this comment.
Isn't the default timeout already 5 seconds?
I did this because the outcome aggregator is pretty timing sensitive;
Tests run with minimal/no aggregation:
"outcomes": {
"batch_size": 1,
"batch_interval": 1,
"aggregator": {
"bucket_interval": 1,
"flush_interval": 0,
},
},
Sounds like there is something else going on we should figure out instead. Especially since this works fine for all other tests.
Have you tried specifying the exact amount of outcomes you want to wait for n=?
There was a problem hiding this comment.
Yes, I did exactly that--sometimes I would get unaggregated results, othertimes I would see some aggregation. I'm not super clear on the 'biased' bit in the aggregator loop, if it means it always prefers the timeout, or just weights things differently, such that handling a message first could be possible. I'll do a few runs with more logging to make sure of what I saw.
|
|
||
| /// Enable relay billing outcome generation. | ||
| #[serde(rename = "organizations:relay-generate-billing-outcome")] | ||
| GenerateBillingOutcome, |
There was a problem hiding this comment.
Yep, ideally it's not in the deprecated group.
these days between "organizations:... and "projects:...
You can also roll out a projects: flag to all projects in an org, but you can't roll out a organizations: flag to a single project. Either is fine here.
| }); | ||
| } | ||
| } | ||
| _ => continue, |
There was a problem hiding this comment.
I'd exhaustively match on BucketSummary so there is a compiler error once a new variant is added. This unfortunately means though you need to move the if into the match body.
|
|
||
| /// Tracks billing-related outcomes for the list of buckets, adding the | ||
| /// "billing_outcome_accepted" tag to the bucket if that bucket is accepted. | ||
| pub fn track_billing_outcome(&self, scoping: Scoping, buckets: &mut [Bucket]) { |
There was a problem hiding this comment.
I assume you opted for this additional method because we need to temporarily add the tag to prevent double counting and longterm we can merge this into track?
The comment in track is now also invalid with that change, not a big deal as long as it eventually gets fixed:
// Never emit accepted outcomes for surrogate metrics.
// These are handled from within Sentry.
if !matches!(outcome, Outcome::Accepted) {There was a problem hiding this comment.
It felt right to keep them separate given the divergence in logic, at least for now. Comment updated.
| let all_categories = [DataCategory::Span, DataCategory::Transaction]; | ||
| let num_categories = if is_segment { 2 } else { 1 }; | ||
| let categories = &all_categories[0..num_categories]; |
There was a problem hiding this comment.
| let all_categories = [DataCategory::Span, DataCategory::Transaction]; | |
| let num_categories = if is_segment { 2 } else { 1 }; | |
| let categories = &all_categories[0..num_categories]; | |
| match is_segment { | |
| true => &[DataCategory::Span, DataCategory::Transaction], | |
| false => &[DataCategory::Span], | |
| } |
| /// Tracks billing-related outcomes for the list of buckets, adding the | ||
| /// "billing_outcome_accepted" tag to the bucket if that bucket is accepted. |
There was a problem hiding this comment.
| /// Tracks billing-related outcomes for the list of buckets, adding the | |
| /// "billing_outcome_accepted" tag to the bucket if that bucket is accepted. | |
| /// Emits accepted outcomes for the provided list of buckets. | |
| /// | |
| /// Additionally adds a marker tag `billing_outcome_accepted` to all buckets for which an outcome | |
| /// has been emitted. |
Nit: billing-related would also include filtered outcomes (see more is_billing on TrackRawOutcome).
|
|
||
| /// Tracks billing-related outcomes for the list of buckets, adding the | ||
| /// "billing_outcome_accepted" tag to the bucket if that bucket is accepted. | ||
| pub fn track_billing_outcome(&self, scoping: Scoping, buckets: &mut [Bucket]) { |
There was a problem hiding this comment.
| pub fn track_billing_outcome(&self, scoping: Scoping, buckets: &mut [Bucket]) { | |
| pub fn track_accepted_outcomes(&self, scoping: Scoping, buckets: &mut [Bucket]) { |
| ) # should be kept by dynamic sampling | ||
|
|
||
| outcomes = outcomes_consumer.get_outcomes() | ||
| outcomes = outcomes_consumer.get_aggregated_outcomes(timeout=5) |
There was a problem hiding this comment.
Correct, no timeouts, only counts.
| outcomes = outcomes_consumer.get_aggregated_outcomes(timeout=5) | |
| outcomes = outcomes_consumer.get_aggregated_outcomes(n={HOW MANY INDIVIDUAL MESSAGES ARE EXPECTED}) |
The n= is optional, but speeds up tests as we don't have to wait timeout for no more messages to arrive. A missed message will still fail the test, as we assert empty on teardown.
| PYTEST_N ?= auto | ||
| TEST ?= tests | ||
| test-custom: build setup-venv ## run arbitrary tests | ||
| .venv/bin/pytest -n $(PYTEST_N) $(TEST) -vv | ||
| .PHONY: test-custom |
There was a problem hiding this comment.
Changes to build systems etc. ideally are split out into separate PRs.
Easier to review, also (which I don't expect to be a problem here), if we have to revert the PR it doesn't also revert these changes.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 76c805b. Configure here.
| let categories = match is_segment { | ||
| true => [DataCategory::Span, DataCategory::Transaction].as_slice(), | ||
| false => [DataCategory::Span].as_slice(), | ||
| }; |
There was a problem hiding this comment.
Bug: The track_accepted_outcome function incorrectly generates a transaction outcome for any segment span by only checking is_segment and ignoring was_transaction, leading to billing inaccuracies.
Severity: HIGH
Suggested Fix
Modify the track_accepted_outcome function to align with the logic in extract_quantities. The function should check for both is_segment and was_transaction being true before emitting a DataCategory::Transaction outcome. This ensures that only segments originating from transactions are counted as such for billing.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: relay-server/src/metrics/outcomes.rs#L80-L83
Potential issue: The `track_accepted_outcome` function has logic that is inconsistent
with `extract_quantities`. It emits a `DataCategory::Transaction` outcome for any span
where `is_segment` is true, but it ignores the `was_transaction` flag. The
`extract_quantities` function, however, requires both `is_segment` and `was_transaction`
to be true to count a transaction. This discrepancy will cause over-billing, as segment
spans that did not originate from a transaction (e.g., from raw span ingestion) will be
incorrectly billed as transactions.
Did we get this right? 👍 / 👎 to inform future reviews.

No description provided.