Legit Bot Classification for bot-detection-feature #232

Merged. buixor merged 2 commits into master from bot-detection-exclusion-lists on Jun 22, 2026.

Conversation

@buixor (Contributor) commented Jun 22, 2026

Add a first round of common bots to be excluded from the bot detection feature.

Copilot AI review requested due to automatic review settings June 22, 2026 11:55

@sabban (Contributor) left a comment

lgtm

Copilot AI left a comment

Pull request overview

This PR introduces an initial set of “legit bot” definitions under whitelists/benign_bots/legit_bots/ intended to be excluded from a bot-detection feature.

Changes:

  • Added bot identification entries for GPTBot, Googlebot, Bingbot, Applebot, and Amazonbot.
  • Captured bot verification hints using rDNS regexes and/or IP CIDR ranges in per-bot files.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| whitelists/benign_bots/legit_bots/gptbot.json | Adds GPTBot user-agent plus a list of CIDR ranges. |
| whitelists/benign_bots/legit_bots/googlebot.json | Adds Googlebot user-agent plus an rDNS suffix regex. |
| whitelists/benign_bots/legit_bots/bingbot.json | Adds Bingbot-related user-agent value plus an rDNS suffix regex. |
| whitelists/benign_bots/legit_bots/applebot.json | Adds Applebot user-agent plus an rDNS suffix regex. |
| whitelists/benign_bots/legit_bots/amazonbot.json | Adds Amazonbot user-agent plus an rDNS suffix regex. |

Comment on lines +1 to +2
# Googlebot - https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot
{"name":"googlebot","user_agent":"googlebot","rdns":["(^|\\.)googlebot\\.com$"]}
Comment on lines +1 to +2
# Bingbot - https://www.bing.com/webmasters/help/verifying-that-bingbot-is-bingbot-3905dc26
{"name":"bingbot","user_agent":"bingbot|adidxbot|bingpreview","rdns":["(^|\\.)search\\.msn\\.com$"]}
Comment on lines +1 to +2
# Applebot - https://support.apple.com/en-us/119829
{"name":"applebot","user_agent":"applebot","rdns":["(^|\\.)applebot\\.apple\\.com$"]}
Comment on lines +1 to +2
# Amazonbot - https://developer.amazon.com/amazonbot
{"name":"amazonbot","user_agent":"amazonbot","rdns":["(^|\\.)crawl\\.amazonbot\\.amazon$"]}
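For readers unfamiliar with the `rdns` field, here is a minimal Python sketch of how these suffix regexes could be applied to a hostname obtained from a reverse DNS (PTR) lookup. The patterns are copied from the four entries above (the JSON `\\.` decodes to the regex `\.`); the helper name `rdns_matches` and the lowercase/trailing-dot normalization are assumptions for illustration, not the project's actual matching code. The sketch performs no DNS lookups; a real check would typically also confirm that the hostname resolves back to the client IP.

```python
import re

# Patterns copied from the per-bot entries in this PR (after JSON decoding).
RDNS_PATTERNS = {
    "googlebot": r"(^|\.)googlebot\.com$",
    "bingbot": r"(^|\.)search\.msn\.com$",
    "applebot": r"(^|\.)applebot\.apple\.com$",
    "amazonbot": r"(^|\.)crawl\.amazonbot\.amazon$",
}


def rdns_matches(bot: str, hostname: str) -> bool:
    """Hypothetical helper: test a PTR hostname against a bot's rDNS regex."""
    # Assumption: normalize case and drop the trailing dot of a fully
    # qualified name before matching.
    host = hostname.rstrip(".").lower()
    return re.search(RDNS_PATTERNS[bot], host) is not None
```

The `(^|\.)` prefix is what keeps a look-alike such as `evilgooglebot.com` from matching: the suffix must either be the whole hostname or follow a literal dot.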
Comment on lines +1 to +2
# GPTBot (OpenAI) - https://openai.com/gptbot.json (creationTime 2025-10-30; refresh periodically)
{"name":"gptbot","user_agent":"gptbot","ranges":["4.227.36.0/25","20.125.66.80/28","20.171.206.0/24","20.171.207.0/24","52.230.152.0/24","74.7.175.128/25","74.7.227.0/25","74.7.227.128/25","74.7.228.0/25","74.7.230.0/25","74.7.241.0/25","74.7.241.128/25","74.7.242.0/25","74.7.243.128/25","74.7.244.0/25","132.196.86.0/24","172.182.202.0/25","172.182.204.0/24","172.182.207.0/25","172.182.214.0/24","172.182.215.0/24"]}
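The `ranges` variant can be sketched in the same spirit. The Python below assumes the file format seen in this PR (a `#` comment line followed by one JSON object per line) and checks a client IP against the listed CIDR ranges with the standard `ipaddress` module. `load_entry` and `ip_in_ranges` are hypothetical names, the sample is abbreviated to the first three ranges of the GPTBot entry, and skipping `#` lines is an assumption about how the consumer parses these files.

```python
import ipaddress
import json

# Abbreviated copy of gptbot.json: only the first three ranges are kept.
SAMPLE = """# GPTBot (OpenAI) - https://openai.com/gptbot.json
{"name":"gptbot","user_agent":"gptbot","ranges":["4.227.36.0/25","20.125.66.80/28","20.171.206.0/24"]}
"""


def load_entry(text: str) -> dict:
    """Return the first JSON object in the text, skipping '#' comment lines."""
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return json.loads(line)
    raise ValueError("no JSON entry found")


def ip_in_ranges(ip: str, ranges: list[str]) -> bool:
    """True if the IP address falls inside any of the CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)


entry = load_entry(SAMPLE)
```

With this sample, `ip_in_ranges("20.171.206.17", entry["ranges"])` is true, while `4.227.36.200` is rejected because `4.227.36.0/25` only covers host addresses 0 through 127.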
@buixor buixor merged commit 87f7d81 into master Jun 22, 2026
1 check passed