Skip to content

fix(utils): match_regex_list crashes on an empty-string pattern#6505

Open
devteamaegis wants to merge 1 commit into
getsentry:masterfrom
devteamaegis:fix/match-regex-list-empty-string
Open

fix(utils): match_regex_list crashes on an empty-string pattern#6505
devteamaegis wants to merge 1 commit into
getsentry:masterfrom
devteamaegis:fix/match-regex-list-empty-string

Conversation

@devteamaegis

Copy link
Copy Markdown
Contributor

What's broken

match_regex_list checks item_matcher[-1] != "$" to decide whether to anchor each pattern. With the default substring_matching=False, an empty-string element makes ""[-1] raise:

>>> from sentry_sdk.utils import match_regex_list
>>> match_regex_list("anything", [""])
IndexError: string index out of range

exclude_beat_tasks in the Celery integration flows straight into this with the default mode, so an empty string in that user-supplied list crashes the beat instrumentation.

Why it happens

Indexing item_matcher[-1] on an empty string is out of range.

Fix

Short-circuit on empty: (not item_matcher or item_matcher[-1] != "$"). An empty matcher then becomes "$", which matches end-of-string — a sane result instead of a crash.

Test

Added a case asserting match_regex_list("anything", [""]) returns without raising; existing anchored/substring behavior is unchanged.

Fixes #6504

@devteamaegis devteamaegis requested a review from a team as a code owner June 4, 2026 18:03

@ericapisani ericapisani left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening the issue and pull request to address this.

There's a couple of things we'll need to address before this gets merged - let me know if you have any questions about what I've mentioned below.

Comment thread tests/test_utils.py

def test_match_regex_list_empty_string_pattern():
# An empty-string pattern must not raise IndexError (regression test).
result = match_regex_list("anything", [""])

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than have this be a standalone test, this case can be incorporated into the test above (test_match_regex_list) as one of the parameterized test cases. You'll just need to add the expected outcome as the last element in the array.

This means that last 2 assertions around foo/foobar can also be removed since we already have test cases that check similar behaviour within the array (lines 553 and 554 above)

Comment thread sentry_sdk/utils.py

for item_matcher in regex_list:
if not substring_matching and item_matcher[-1] != "$":
if not substring_matching and (not item_matcher or item_matcher[-1] != "$"):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this fixes the index error, the result that's returned from this change means match_regex_list("anything", [""]) will return True.

An empty string likely represents a typo/mistake on the developers part, and so we should treat this as if [] was passed in.

This would mean that this conditional should look like the following instead:

    for item_matcher in regex_list:
        if not item_matcher:
            return False

        if not substring_matching and item_matcher[-1] != "$":
            item_matcher += "$"

@devteamaegis devteamaegis force-pushed the fix/match-regex-list-empty-string branch from 4043b13 to d42a94a Compare June 11, 2026 19:44

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit d42a94a. Configure here.

Comment thread sentry_sdk/utils.py

for item_matcher in regex_list:
if not substring_matching and item_matcher[-1] != "$":
if not substring_matching and (not item_matcher or item_matcher[-1] != "$"):

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty pattern matches all strings

Medium Severity

The empty-string guard avoids IndexError by treating a falsy item_matcher like a pattern that still gets anchored to "$". With default substring_matching=False, re.search on that pattern matches any non-empty item, so match_regex_list returns True for [""]. Callers such as exclude_beat_tasks then treat every name as matched instead of ignoring the typo like an empty list.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit d42a94a. Configure here.

Comment thread sentry_sdk/utils.py
Comment on lines +1743 to 1744
if not substring_matching and (not item_matcher or item_matcher[-1] != "$"):
item_matcher += "$"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: An empty string "" in regex_list causes match_regex_list to always return True. The fix item_matcher += "$" is incorrect as re.search("$", ...) always matches.
Severity: HIGH

Suggested Fix

Instead of converting an empty item_matcher to "$", the function should explicitly handle an empty string in regex_list to mean "match nothing". This could be done by returning False immediately if item_matcher is empty after processing, or by ensuring the regex pattern for an empty string does not match everything.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: sentry_sdk/utils.py#L1743-L1744

Potential issue: The function `match_regex_list` incorrectly handles an empty string
`""` in `regex_list`. The proposed fix converts an empty `item_matcher` to `"$"` before
performing a regex search. However, `re.search("$", item)` always returns a match
because the `$` metacharacter matches the end-of-string position, which is present in
every string. As a result, if any `regex_list` contains an empty string, the function
will incorrectly return `True` for all items. For example, setting `exclude_beat_tasks =
[""]` in the Celery integration would silently exclude all beat tasks from
instrumentation.

Also affects:

  • sentry_sdk/integrations/celery/beat.py:130

Did we get this right? 👍 / 👎 to inform future reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

match_regex_list raises IndexError when a pattern list contains an empty string

3 participants