README

Go library to detect bots based on the HTTP request. A "bot" is defined as any request that isn't a regular browser request initiated by the user. This includes web crawlers, but also things such as "preview" renderers.

Bot() accepts an http.Request because it looks at all available information, not just the User-Agent. You can use UserAgent() if you only have a User-Agent string, but using Bot() is strongly recommended.

Import as zgo.at/isbot; API docs: https://pkg.go.dev/zgo.at/isbot

It's not 100% reliable, and there are some known cases where it gets things wrong. See isbot_test.go for a list of test cases.

Performance is pretty good; it turns out that running a few strings.Contains() calls is much faster than matching a (bot|crawler|search|...) regexp.
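As a rough illustration of the two approaches being compared, the sketch below checks a User-Agent against a short substring list and against the equivalent regexp. The `markers` list and both function names are assumptions made for this example; the library's real list is longer and its code may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// markers is a hypothetical substring list used only for this sketch.
var markers = []string{"bot", "crawler", "search"}

// markerRe is the same alternation expressed as a regular expression.
var markerRe = regexp.MustCompile(`bot|crawler|search`)

// containsAny reports whether the lowercased ua contains any marker,
// using one strings.Contains call per marker.
func containsAny(ua string) bool {
	ua = strings.ToLower(ua)
	for _, m := range markers {
		if strings.Contains(ua, m) {
			return true
		}
	}
	return false
}

// matchesRegexp answers the same question with the regexp.
func matchesRegexp(ua string) bool {
	return markerRe.MatchString(strings.ToLower(ua))
}

func main() {
	for _, ua := range []string{
		"ExampleBot/1.0 (+https://example.com/bot)",
		"Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
	} {
		fmt.Println(containsAny(ua), matchesRegexp(ua))
	}
}
```

Both functions should agree on every input; to compare their speed on your own data, wrap each one in a testing.B benchmark.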

Documentation

Overview

Package isbot attempts to detect HTTP bots.

A "bot" is defined as any request that isn't a regular browser request initiated by the user. This includes things like web crawlers, but also stuff like "preview" renderers and the like.

Constants

const (
	NoBotKnown           = 0  // Known to not be a bot.
	NoBotNoMatch         = 1  // None of the rules matches, so probably not a bot.
	BotPrefetch          = 2  // Prefetch algorithm
	BotLink              = 3  // User-Agent contained a URL.
	BotClientLibrary     = 4  // Known client library.
	BotKnownBot          = 5  // Known bot.
	BotBoty              = 6  // User-Agent string looks "boty".
	BotShort             = 7  // User-Agent is short or strangely formatted.
	BotRangeAWS          = 8  // AWS cloud
	BotRangeDigitalOcean = 9  // Digital Ocean
	BotRangeServersCom   = 10 // servers.com
	BotRangeGoogleCloud  = 11 // Google Cloud
	BotRangeHetzner      = 12 // hetzner.de
)

const (
	BotJSPhanton   = 150 // Phantom headless browser.
	BotJSNightmare = 151 // Nightmare headless browser.
	BotJSSelenium  = 152 // Selenium headless browser.
	BotJSWebDriver = 153 // Generic WebDriver-based headless browser.
)

These are never set by isbot, but can be used to send signals from JS; for example:

var is_bot = function() {
    var w = window, d = document
    if (w.callPhantom || w._phantom || w.phantom)
        return 150
    if (w.__nightmare)
        return 151
    if (d.__selenium_unwrapped || d.__webdriver_evaluate || d.__driver_evaluate)
        return 152
    if (navigator.webdriver)
        return 153
    return 0
}
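
On the server side, the value reported by the script has to be turned back into a uint8 before it can be compared with the constants or passed to Is(). A minimal sketch, assuming the browser sends the number as a string (for example in a query parameter); `parseSignal` is a hypothetical helper, not part of isbot:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseSignal converts the string reported by the browser into a uint8
// reason code. It returns false for empty, non-numeric, or out-of-range input.
func parseSignal(v string) (uint8, bool) {
	n, err := strconv.ParseUint(v, 10, 8) // bitSize 8 rejects values above 255
	if err != nil {
		return 0, false
	}
	return uint8(n), true
}

func main() {
	for _, v := range []string{"150", "0", "999", "abc"} {
		code, ok := parseSignal(v)
		// A valid code could then be passed to isbot.Is(code).
		fmt.Println(v, code, ok)
	}
}
```

Since the value comes from the client, treat it as a hint rather than as proof.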

Variables

This section is empty.

Functions

func Bot

func Bot(r *http.Request) uint8

Bot checks if this HTTP request looks like a bot.

It returns one of the constants as the reason we think this is a bot.

Note: this assumes that r.RemoteAddr is set to the real IP, and does not check X-Forwarded-For or X-Real-IP.
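
If the application runs behind a reverse proxy, r.RemoteAddr usually holds the proxy's address, so it has to be corrected before calling Bot(). The sketch below shows one way to do that, assuming a proxy you control that always sets X-Real-IP; `withRealIP` and `trustProxy` are hypothetical names, not part of isbot.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
)

// withRealIP returns remoteAddr with its host replaced by realIP, keeping the
// "host:port" shape that net/http uses. It returns remoteAddr unchanged if
// realIP is not a valid IP address or remoteAddr cannot be split.
func withRealIP(remoteAddr, realIP string) string {
	if net.ParseIP(realIP) == nil {
		return remoteAddr
	}
	_, port, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr
	}
	return net.JoinHostPort(realIP, port)
}

// trustProxy is hypothetical middleware that overwrites r.RemoteAddr from the
// X-Real-IP header. Only use it behind a proxy that always sets this header;
// otherwise clients can forge it.
func trustProxy(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.RemoteAddr = withRealIP(r.RemoteAddr, r.Header.Get("X-Real-IP"))
		next.ServeHTTP(w, r) // a handler calling isbot.Bot(r) would run here
	})
}

func main() {
	fmt.Println(withRealIP("127.0.0.1:4321", "203.0.113.7"))
}
```

The helper keeps the original port so the address stays in the format net/http normally produces.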

func IPRange

func IPRange(addr string) uint8

IPRange checks if this IP address is from a range that should normally never send browser requests, such as AWS and other cloud providers.
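
The general technique of matching an address against known provider ranges can be sketched with net/netip. This is not the library's implementation: the reason codes, the `lookup` function, and the prefixes (reserved documentation ranges standing in for real cloud ranges) are all assumptions for illustration, and the sketch expects a bare IP address without a port.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Hypothetical reason codes for this sketch only.
const (
	noMatch      uint8 = 0
	exampleCloud uint8 = 1
)

// ranges maps a reason code to its prefixes. 192.0.2.0/24 and 2001:db8::/32
// are documentation ranges used as placeholders for published cloud ranges.
var ranges = map[uint8][]netip.Prefix{
	exampleCloud: {
		netip.MustParsePrefix("192.0.2.0/24"),
		netip.MustParsePrefix("2001:db8::/32"),
	},
}

// lookup returns the reason code of the first prefix containing addr,
// or noMatch if addr is invalid or outside every range.
func lookup(addr string) uint8 {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return noMatch
	}
	for code, prefixes := range ranges {
		for _, p := range prefixes {
			if p.Contains(ip) {
				return code
			}
		}
	}
	return noMatch
}

func main() {
	fmt.Println(lookup("192.0.2.10"))   // inside the placeholder range
	fmt.Println(lookup("198.51.100.1")) // outside every range
}
```

A linear scan is fine for a handful of prefixes; with thousands of ranges a sorted list or a prefix tree would likely be a better fit.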

func Is

func Is(r uint8) bool

Is reports whether this constant indicates a bot.

func IsNot

func IsNot(r uint8) bool

IsNot is the inverse of Is().

func IsUserAgent

func IsUserAgent(r uint8) bool

IsUserAgent reports if this is considered a bot because of the User-Agent header.

func Prefetch

func Prefetch(h http.Header) bool

Prefetch checks if this request is a browser "pre-fetch" request.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Link_prefetching_FAQ
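
Browsers typically mark prefetch requests with a request header. The sketch below checks three such headers; the exact set the library inspects is not documented here, so both the header list and the `looksLikePrefetch` name are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// looksLikePrefetch reports whether the headers carry a common prefetch hint.
// The header set is an assumption of this sketch; the library may check others.
func looksLikePrefetch(h http.Header) bool {
	if strings.EqualFold(h.Get("X-Moz"), "prefetch") { // sent by Firefox link prefetching
		return true
	}
	if strings.EqualFold(h.Get("Purpose"), "prefetch") { // sent by some Chromium versions
		return true
	}
	// Sec-Purpose may carry extra parameters, e.g. "prefetch;prerender".
	return strings.HasPrefix(strings.ToLower(h.Get("Sec-Purpose")), "prefetch")
}

func main() {
	h := http.Header{}
	h.Set("Sec-Purpose", "prefetch;prerender")
	fmt.Println(looksLikePrefetch(h))
	fmt.Println(looksLikePrefetch(http.Header{}))
}
```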

func UserAgent

func UserAgent(ua string) uint8

UserAgent checks if this User-Agent header looks like a bot.

It returns one of the constants as the reason we think this is a bot.

Types

This section is empty.