README
Go library to detect bots based on the HTTP request. A "bot" is defined as any request that isn't a regular browser request initiated by the user. This includes things like web crawlers, but also stuff like "preview" renderers and the like.
Bot() accepts an http.Request because it looks at all of the request information, not just the User-Agent. You can use UserAgent() if you only have a User-Agent, but using Bot() is highly recommended.
Import as zgo.at/isbot; API docs: https://pkg.go.dev/zgo.at/isbot
It's not 100% reliable, and there are some known cases where it gets things wrong. See isbot_test.go for a list of test cases.
The performance is pretty good; it turns out that running a few strings.Contains() calls is much faster than a (bot|crawler|search|...) regexp.
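As a rough illustration of the two approaches being compared, here is a minimal sketch using only the standard library. The marker list, the `containsAny` and `matchesRegexp` helpers, and the sample User-Agent strings are hypothetical and are not the rules isbot uses; the sketch only shows that a handful of substring checks can give the same answer as an alternation regexp.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// markers is a hypothetical, deliberately tiny list of substrings; it is not
// the list that isbot uses.
var markers = []string{"bot", "crawler", "search"}

// markerRe is the same alternation written as a regular expression.
var markerRe = regexp.MustCompile(`bot|crawler|search`)

// containsAny reports whether the lowercased ua contains any marker.
func containsAny(ua string) bool {
	ua = strings.ToLower(ua)
	for _, m := range markers {
		if strings.Contains(ua, m) {
			return true
		}
	}
	return false
}

// matchesRegexp performs the same check with the regular expression.
func matchesRegexp(ua string) bool {
	return markerRe.MatchString(strings.ToLower(ua))
}

func main() {
	for _, ua := range []string{
		"Mozilla/5.0 (compatible; ExampleBot/1.0)",
		"Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
	} {
		fmt.Println(containsAny(ua), matchesRegexp(ua))
	}
}
```

To compare the speed of the two on your own data, wrap each helper in a standard testing.B benchmark; the sketch itself makes no timing claim.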
Documentation
Overview
Package isbot attempts to detect HTTP bots.
A "bot" is defined as any request that isn't a regular browser request initiated by the user. This includes things like web crawlers, but also stuff like "preview" renderers and the like.
Constants

    const (
        NoBotKnown       = 0 // Known to not be a bot.
        NoBotNoMatch     = 1 // None of the rules matches, so probably not a bot.
        BotPrefetch      = 2 // Prefetch algorithm.
        BotLink          = 3 // User-Agent contained a URL.
        BotClientLibrary = 4 // Known client library.
        BotKnownBot      = 5 // Known bot.
        BotBoty          = 6 // User-Agent string looks "boty".
        BotShort         = 7 // User-Agent is short or strangely formatted.
    )

    const (
        BotRangeAWS          = 8  // AWS cloud
        BotRangeDigitalOcean = 9  // Digital Ocean
        BotRangeServersCom   = 10 // servers.com
        BotRangeGoogleCloud  = 11 // Google Cloud
        BotRangeHetzner      = 12 // hetzner.de
    )

    const (
        BotJSPhanton   = 150 // Phantom headless browser.
        BotJSNightmare = 151 // Nightmare headless browser.
        BotJSSelenium  = 152 // Selenium headless browser.
        BotJSWebDriver = 153 // Generic WebDriver-based headless browser.
    )
These are never set by isbot, but can be used to send signals from JS; for example:
    var is_bot = function() {
        var w = window, d = document
        if (w.callPhantom || w._phantom || w.phantom)
            return 150
        if (w.__nightmare)
            return 151
        if (d.__selenium_unwrapped || d.__webdriver_evaluate || d.__driver_evaluate)
            return 152
        if (navigator.webdriver)
            return 153
        return 0
    }
Functions
func Bot
Bot checks if this HTTP request looks like a bot.
It returns one of the constants as the reason we think this is a bot.
Note: this assumes that r.RemoteAddr is set to the real IP, and does not check X-Forwarded-For or X-Real-IP.
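If the application runs behind a reverse proxy, one way to satisfy this assumption is to overwrite r.RemoteAddr before the request is checked. The following is a minimal sketch using only the standard library; `realIP` and `trustProxy` are hypothetical helpers, not part of isbot. It stores the bare IP without a port, and it takes the first X-Forwarded-For entry, which is only appropriate when a proxy you trust sets or sanitizes that header. Which entry to trust depends on your proxy setup.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// realIP returns the first entry of an X-Forwarded-For value if it parses as
// an IP address, and fallback otherwise. Hypothetical helper, not part of
// isbot.
func realIP(xff, fallback string) string {
	first, _, _ := strings.Cut(xff, ",")
	if ip := net.ParseIP(strings.TrimSpace(first)); ip != nil {
		return ip.String()
	}
	return fallback
}

// trustProxy overwrites r.RemoteAddr before the next handler runs. Only use
// something like this behind a proxy that you trust to set the header.
func trustProxy(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.RemoteAddr = realIP(r.Header.Get("X-Forwarded-For"), r.RemoteAddr)
		next.ServeHTTP(w, r)
	})
}

func main() {
	fmt.Println(realIP("203.0.113.7, 10.0.0.1", "10.0.0.1:1234"))
	fmt.Println(realIP("", "10.0.0.1:1234"))
}
```

A handler wrapped in trustProxy would then see the forwarded address in r.RemoteAddr when it runs its bot check.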
func IPRange
IPRange checks if this IP address is from a range that should normally never send browser requests, such as AWS and other cloud providers.
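The general technique of testing an address against a set of network ranges can be sketched with the standard net/netip package. This is not the package's implementation: `inRanges` is a hypothetical helper, and the prefixes below are reserved documentation ranges standing in for real cloud-provider ranges.

```go
package main

import (
	"fmt"
	"net/netip"
)

// ranges holds hypothetical example prefixes taken from the reserved
// documentation address blocks; these are not real cloud provider ranges.
var ranges = []netip.Prefix{
	netip.MustParsePrefix("192.0.2.0/24"),
	netip.MustParsePrefix("198.51.100.0/24"),
	netip.MustParsePrefix("2001:db8::/32"),
}

// inRanges reports whether addr parses as an IP address that falls inside one
// of the prefixes above. Unparseable input is treated as "not in range".
func inRanges(addr string) bool {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return false
	}
	ip = ip.Unmap() // Treat IPv4-mapped IPv6 addresses as plain IPv4.
	for _, p := range ranges {
		if p.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	for _, addr := range []string{"192.0.2.10", "8.8.8.8", "2001:db8::1"} {
		fmt.Println(addr, inRanges(addr))
	}
}
```

A linear scan is fine for a few prefixes; a large list of provider ranges would probably call for a sorted or trie-based lookup.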
func IsUserAgent
IsUserAgent reports if this is considered a bot because of the User-Agent header.
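To make the grouping of reason codes concrete, here is a small self-contained sketch. It copies the first block of constants locally so that it compiles without importing the package. The `isBot` and `fromUserAgent` helpers are hypothetical, the `int` parameter type is chosen only for the sketch, and the assumption that codes 3 through 7 are the User-Agent-derived reasons is inferred from the constant comments rather than from the package source.

```go
package main

import "fmt"

// Local copies of the reason codes listed under Constants, so that this
// sketch compiles without importing the package.
const (
	NoBotKnown       = 0
	NoBotNoMatch     = 1
	BotPrefetch      = 2
	BotLink          = 3
	BotClientLibrary = 4
	BotKnownBot      = 5
	BotBoty          = 6
	BotShort         = 7
)

// isBot treats every code other than the two NoBot values as a bot.
func isBot(code int) bool {
	return code != NoBotKnown && code != NoBotNoMatch
}

// fromUserAgent reports whether the code is one of the reasons that, by
// assumption in this sketch, are derived from the User-Agent header.
func fromUserAgent(code int) bool {
	switch code {
	case BotLink, BotClientLibrary, BotKnownBot, BotBoty, BotShort:
		return true
	}
	return false
}

func main() {
	for _, code := range []int{NoBotNoMatch, BotPrefetch, BotKnownBot} {
		fmt.Println(code, isBot(code), fromUserAgent(code))
	}
}
```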
func Prefetch
Prefetch checks if this request is a browser "pre-fetch" request.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Link_prefetching_FAQ
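Browsers typically mark prefetch requests with a request header. The sketch below shows the general idea with the standard library only; `looksLikePrefetch` is a hypothetical helper, and the header names it checks (Sec-Purpose, Purpose, and the X-Moz header described in the FAQ linked above) are assumptions for illustration, not necessarily the set this package inspects.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// looksLikePrefetch reports whether any of a few commonly seen headers marks
// the request as a prefetch. The header list is an assumption made for this
// sketch.
func looksLikePrefetch(h http.Header) bool {
	for _, name := range []string{"Sec-Purpose", "Purpose", "X-Moz"} {
		v := strings.ToLower(strings.TrimSpace(h.Get(name)))
		// HasPrefix also accepts values such as "prefetch;prerender".
		if strings.HasPrefix(v, "prefetch") {
			return true
		}
	}
	return false
}

func main() {
	h := http.Header{}
	fmt.Println(looksLikePrefetch(h))

	h.Set("Sec-Purpose", "prefetch;prerender")
	fmt.Println(looksLikePrefetch(h))
}
```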