Skip to content

feat(outcomes): Have relay generate metric billing outcomes#6066

Open
klochek wants to merge 3 commits into
masterfrom
christopherklochek/ingest-932-move-billing-metrics-consumer-logic-to-relay
Open

feat(outcomes): Have relay generate metric billing outcomes#6066
klochek wants to merge 3 commits into
masterfrom
christopherklochek/ingest-932-move-billing-metrics-consumer-logic-to-relay

Conversation

@klochek

@klochek klochek commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

No description provided.

@klochek klochek requested a review from a team as a code owner June 9, 2026 15:46
@linear-code

linear-code Bot commented Jun 9, 2026

Copy link
Copy Markdown

INGEST-932

@loewenheim loewenheim left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the tests, I think it's preferable if they are run with the new feature both on and off. Although I do realize that makes it much more annoying to define what the outcome should be for every combination.

Comment thread relay-server/src/metrics/outcomes.rs Outdated

/// Tracks billing-related outcomes for the list of buckets, adding the
/// "billing_outcome_accepted" tag to the bucket if that bucket is accepted.
pub fn track_billing_outcome(&self, scoping: Scoping, buckets: &mut [Bucket]) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentionally only implemented for spans?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, since this only serves the billing outcomes path.

Comment thread Makefile Outdated
Comment on lines +78 to +82
PYTEST_N ?= auto
TEST ?= tests
test-custom: build setup-venv ## run arbitrary tests
.venv/bin/pytest -n $(PYTEST_N) $(TEST) -vv
.PHONY: test-custom

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. One question (as a makefile novice) do we need PYTEST_N ?= auto on line 78 since it is already on line 73?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, that's me copy-pasting a little too fast :P Per David's comment, I'll remove this from this PR, and then move the variables up to the top so we don't wind up with them everywhere.

Comment thread relay-dynamic-config/src/feature.rs Outdated
Comment on lines +144 to +147

/// Enable relay billing outcome generation.
#[serde(rename = "organizations:relay-generate-billing-outcome")]
GenerateBillingOutcome,

@tobias-wilfert tobias-wilfert Jun 10, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we would want this just under MinidumpUploads, since now it is in the Deprecated* group.

Also must admit that I am not sure if there is a functional difference these days between "organizations:... and "projects:...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, ideally it's not in the deprecated group.

these days between "organizations:... and "projects:...

You can also roll out a projects: flag to all projects in an org, but you can't roll out a organizations: flag to a single project. Either is fine here.

Comment thread tests/integration/test_outcome.py Outdated
) # should be kept by dynamic sampling

outcomes = outcomes_consumer.get_outcomes()
outcomes = outcomes_consumer.get_aggregated_outcomes(timeout=5)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dav1dde you did this a while back (#5908) with the same rational we would not want the timeout here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, no timeouts, only counts.

Suggested change
outcomes = outcomes_consumer.get_aggregated_outcomes(timeout=5)
outcomes = outcomes_consumer.get_aggregated_outcomes(n={HOW MANY INDIVIDUAL MESSAGES ARE EXPECTED})

The n= is optional, but speeds up tests as we don't have to wait timeout for no more messages to arrive. A missed message will still fail the test, as we assert empty on teardown.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this because the outcome aggregator is pretty timing sensitive; I was seeing both fully-aggregated results, and dis-aggregated results (the same kinds of spans broken into two buckets due to timing.) I've added a new method to make this faster.

@Dav1dde Dav1dde Jun 11, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the default timeout already 5 seconds?

I did this because the outcome aggregator is pretty timing sensitive;

Tests run with minimal/no aggregation:

            "outcomes": {
                "batch_size": 1,
                "batch_interval": 1,
                "aggregator": {
                    "bucket_interval": 1,
                    "flush_interval": 0,
                },
            },

Sounds like there is something else going on we should figure out instead. Especially since this works fine for all other tests.

Have you tried specifying the exact amount of outcomes you want to wait for n=?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I did exactly that--sometimes I would get unaggregated results, othertimes I would see some aggregation. I'm not super clear on the 'biased' bit in the aggregator loop, if it means it always prefers the timeout, or just weights things differently, such that handling a message first could be possible. I'll do a few runs with more logging to make sure of what I saw.

Comment thread relay-dynamic-config/src/feature.rs Outdated
Comment on lines +144 to +147

/// Enable relay billing outcome generation.
#[serde(rename = "organizations:relay-generate-billing-outcome")]
GenerateBillingOutcome,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, ideally it's not in the deprecated group.

these days between "organizations:... and "projects:...

You can also roll out a projects: flag to all projects in an org, but you can't roll out a organizations: flag to a single project. Either is fine here.

Comment thread relay-server/src/metrics/outcomes.rs Outdated
});
}
}
_ => continue,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd exhaustively match on BucketSummary so there is a compiler error once a new variant is added. This unfortunately means though you need to move the if into the match body.

Comment thread relay-server/src/metrics/outcomes.rs Outdated

/// Tracks billing-related outcomes for the list of buckets, adding the
/// "billing_outcome_accepted" tag to the bucket if that bucket is accepted.
pub fn track_billing_outcome(&self, scoping: Scoping, buckets: &mut [Bucket]) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you opted for this additional method because we need to temporarily add the tag to prevent double counting and longterm we can merge this into track?

The comment in track is now also invalid with that change, not a big deal as long as it eventually gets fixed:

        // Never emit accepted outcomes for surrogate metrics.
        // These are handled from within Sentry.
        if !matches!(outcome, Outcome::Accepted) {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It felt right to keep them separate given the divergence in logic, at least for now. Comment updated.

Comment thread relay-server/src/metrics/outcomes.rs Outdated
Comment on lines +74 to +76
let all_categories = [DataCategory::Span, DataCategory::Transaction];
let num_categories = if is_segment { 2 } else { 1 };
let categories = &all_categories[0..num_categories];

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let all_categories = [DataCategory::Span, DataCategory::Transaction];
let num_categories = if is_segment { 2 } else { 1 };
let categories = &all_categories[0..num_categories];
match is_segment {
true => &[DataCategory::Span, DataCategory::Transaction],
false => &[DataCategory::Span],
}

Comment thread relay-server/src/metrics/outcomes.rs Outdated
Comment on lines +62 to +63
/// Tracks billing-related outcomes for the list of buckets, adding the
/// "billing_outcome_accepted" tag to the bucket if that bucket is accepted.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Tracks billing-related outcomes for the list of buckets, adding the
/// "billing_outcome_accepted" tag to the bucket if that bucket is accepted.
/// Emits accepted outcomes for the provided list of buckets.
///
/// Additionally adds a marker tag `billing_outcome_accepted` to all buckets for which an outcome
/// has been emitted.

Nit: billing-related would also include filtered outcomes (see more is_billing on TrackRawOutcome).

Comment thread relay-server/src/metrics/outcomes.rs Outdated

/// Tracks billing-related outcomes for the list of buckets, adding the
/// "billing_outcome_accepted" tag to the bucket if that bucket is accepted.
pub fn track_billing_outcome(&self, scoping: Scoping, buckets: &mut [Bucket]) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn track_billing_outcome(&self, scoping: Scoping, buckets: &mut [Bucket]) {
pub fn track_accepted_outcomes(&self, scoping: Scoping, buckets: &mut [Bucket]) {

Comment thread tests/integration/test_outcome.py Outdated
) # should be kept by dynamic sampling

outcomes = outcomes_consumer.get_outcomes()
outcomes = outcomes_consumer.get_aggregated_outcomes(timeout=5)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, no timeouts, only counts.

Suggested change
outcomes = outcomes_consumer.get_aggregated_outcomes(timeout=5)
outcomes = outcomes_consumer.get_aggregated_outcomes(n={HOW MANY INDIVIDUAL MESSAGES ARE EXPECTED})

The n= is optional, but speeds up tests as we don't have to wait timeout for no more messages to arrive. A missed message will still fail the test, as we assert empty on teardown.

Comment thread Makefile Outdated
Comment on lines +78 to +82
PYTEST_N ?= auto
TEST ?= tests
test-custom: build setup-venv ## run arbitrary tests
.venv/bin/pytest -n $(PYTEST_N) $(TEST) -vv
.PHONY: test-custom

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to build systems etc. ideally are split out into separate PRs.

Easier to review, also (which I don't expect to be a problem here), if we have to revert the PR it doesn't also revert these changes.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 76c805b. Configure here.

Comment thread tests/integration/test_attachmentsv2.py Outdated
@klochek klochek requested a review from Dav1dde June 11, 2026 01:36
Comment on lines +80 to +83
let categories = match is_segment {
true => [DataCategory::Span, DataCategory::Transaction].as_slice(),
false => [DataCategory::Span].as_slice(),
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The track_accepted_outcome function incorrectly generates a transaction outcome for any segment span by only checking is_segment and ignoring was_transaction, leading to billing inaccuracies.
Severity: HIGH

Suggested Fix

Modify the track_accepted_outcome function to align with the logic in extract_quantities. The function should check for both is_segment and was_transaction being true before emitting a DataCategory::Transaction outcome. This ensures that only segments originating from transactions are counted as such for billing.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: relay-server/src/metrics/outcomes.rs#L80-L83

Potential issue: The `track_accepted_outcome` function has logic that is inconsistent
with `extract_quantities`. It emits a `DataCategory::Transaction` outcome for any span
where `is_segment` is true, but it ignores the `was_transaction` flag. The
`extract_quantities` function, however, requires both `is_segment` and `was_transaction`
to be true to count a transaction. This discrepancy will cause over-billing, as segment
spans that did not originate from a transaction (e.g., from raw span ingestion) will be
incorrectly billed as transactions.

Did we get this right? 👍 / 👎 to inform future reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants