Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion sentry_sdk/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -1740,7 +1740,7 @@ def match_regex_list(
return False

for item_matcher in regex_list:
if not substring_matching and item_matcher[-1] != "$":
if not substring_matching and (not item_matcher or item_matcher[-1] != "$"):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this fixes the index error, the result that's returned from this change means match_regex_list("anything", [""]) will return True.

An empty string likely represents a typo/mistake on the developers part, and so we should treat this as if [] was passed in.

This would mean that this conditional should look like the following instead:

    for item_matcher in regex_list:
        if not item_matcher:
            return False

        if not substring_matching and item_matcher[-1] != "$":
            item_matcher += "$"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty pattern matches all strings

Medium Severity

The empty-string guard avoids IndexError by treating a falsy item_matcher like a pattern that still gets anchored to "$". With default substring_matching=False, re.search on that pattern matches any non-empty item, so match_regex_list returns True for [""]. Callers such as exclude_beat_tasks then treat every name as matched instead of ignoring the typo like an empty list.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit d42a94a. Configure here.

item_matcher += "$"
Comment on lines +1743 to 1744

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: An empty string "" in regex_list causes match_regex_list to always return True. The fix item_matcher += "$" is incorrect as re.search("$", ...) always matches.
Severity: HIGH

Suggested Fix

Instead of converting an empty item_matcher to "$", the function should explicitly handle an empty string in regex_list to mean "match nothing". This could be done by returning False immediately if item_matcher is empty after processing, or by ensuring the regex pattern for an empty string does not match everything.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: sentry_sdk/utils.py#L1743-L1744

Potential issue: The function `match_regex_list` incorrectly handles an empty string
`""` in `regex_list`. The proposed fix converts an empty `item_matcher` to `"$"` before
performing a regex search. However, `re.search("$", item)` always returns a match
because the `$` metacharacter matches the end-of-string position, which is present in
every string. As a result, if any `regex_list` contains an empty string, the function
will incorrectly return `True` for all items. For example, setting `exclude_beat_tasks =
[""]` in the Celery integration would silently exclude all beat tasks from
instrumentation.

Also affects:

  • sentry_sdk/integrations/celery/beat.py:130

Did we get this right? 👍 / 👎 to inform future reviews.


matched = re.search(item_matcher, item)
Expand Down
8 changes: 8 additions & 0 deletions tests/test_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -562,6 +562,14 @@ def test_match_regex_list(item, regex_list, expected_result):
assert match_regex_list(item, regex_list) == expected_result


def test_match_regex_list_empty_string_pattern():
# An empty-string pattern must not raise IndexError (regression test).
result = match_regex_list("anything", [""])

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than have this be a standalone test, this case can be incorporated into the test above (test_match_regex_list) as one of the parameterized test cases. You'll just need to add the expected outcome as the last element in the array.

This means that last 2 assertions around foo/foobar can also be removed since we already have test cases that check similar behaviour within the array (lines 553 and 554 above)

assert isinstance(result, bool)
assert match_regex_list("foobar", ["foo"]) is False
assert match_regex_list("foo", ["foo"]) is True


@pytest.mark.parametrize(
"version,expected_result",
[
Expand Down
Loading