goodbot

package module
v0.0.5
Published: Feb 20, 2024 License: MIT Imports: 9 Imported by: 0

README

Good Bot

Good Bot is an open-source Go library that helps web applications distinguish beneficial automated agents, or "good bots", from potentially harmful traffic. Bots play a crucial role on the web, from search engine indexing to social media insights and link previews, so it is worth identifying and welcoming the friendly ones. Good Bot gives Go developers the tools to recognize these agents, so your analytics stay accurate and your services stay optimized.

Features
  • Accurate Bot Recognition: Utilizes a comprehensive database of user-agent strings, IP addresses, DNS verification methods, and more to identify good bots with high precision.
  • High Performance: With embedded data, Good Bot starts quickly and operates efficiently, requiring no external dependencies.
  • Flexibility and Customization: The bot database is easily extendable and supports customization to align with various application requirements. Contributions are highly encouraged!
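The "DNS verification" mentioned above is commonly done as a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check that the hostname belongs to the bot operator's domain, then resolve that hostname back and confirm it yields the same IP. The sketch below illustrates that general technique with the standard library only. It is not taken from Good Bot's source; the `resolver` interface, `verifyByDNS`, `fakeResolver`, and the sample records are all hypothetical names and values chosen for illustration.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/netip"
	"strings"
)

// resolver is the subset of *net.Resolver this sketch needs, so a fake can
// be substituted in tests. Hypothetical; not part of Good Bot.
type resolver interface {
	LookupAddr(ctx context.Context, addr string) ([]string, error)
	LookupHost(ctx context.Context, host string) ([]string, error)
}

var _ resolver = (*net.Resolver)(nil) // the real resolver satisfies it

// hasAllowedSuffix reports whether host equals, or is a subdomain of, one of
// the given domains.
func hasAllowedSuffix(host string, domains []string) bool {
	for _, d := range domains {
		if host == d || strings.HasSuffix(host, "."+d) {
			return true
		}
	}
	return false
}

// verifyByDNS performs a forward-confirmed reverse DNS check for ip.
func verifyByDNS(ctx context.Context, r resolver, ip string, domains []string) (bool, error) {
	want, err := netip.ParseAddr(ip)
	if err != nil {
		return false, err
	}
	names, err := r.LookupAddr(ctx, ip) // reverse (PTR) lookup
	if err != nil {
		return false, err
	}
	for _, name := range names {
		host := strings.ToLower(strings.TrimSuffix(name, ".")) // PTR names often end in a dot
		if !hasAllowedSuffix(host, domains) {
			continue
		}
		addrs, err := r.LookupHost(ctx, host) // forward lookup
		if err != nil {
			continue
		}
		for _, a := range addrs {
			if got, err := netip.ParseAddr(a); err == nil && got == want {
				return true, nil
			}
		}
	}
	return false, nil
}

// fakeResolver serves canned records so the sketch runs without network access.
type fakeResolver struct{ ptr, fwd map[string][]string }

func (f fakeResolver) LookupAddr(_ context.Context, addr string) ([]string, error) {
	return f.ptr[addr], nil
}

func (f fakeResolver) LookupHost(_ context.Context, host string) ([]string, error) {
	return f.fwd[host], nil
}

// verifyFake runs verifyByDNS against illustrative records.
func verifyFake(ip string) bool {
	r := fakeResolver{
		ptr: map[string][]string{
			"66.249.66.1": {"crawl-66-249-66-1.googlebot.com."},
			"203.0.113.7": {"fake.googlebot.com."},         // PTR set by an impostor
			"203.0.113.8": {"googlebot.com.evil.example."}, // lookalike domain
		},
		fwd: map[string][]string{
			"crawl-66-249-66-1.googlebot.com": {"66.249.66.1"},
			"fake.googlebot.com":              {"198.51.100.1"}, // does not point back
		},
	}
	ok, _ := verifyByDNS(context.Background(), r, ip, []string{"googlebot.com", "google.com"})
	return ok
}

func main() {
	fmt.Println(verifyFake("66.249.66.1"))
	fmt.Println(verifyFake("203.0.113.7"))
}
```

Passing `net.DefaultResolver` instead of the fake would run the same check against live DNS. The forward lookup matters because anyone who controls reverse DNS for their own address space can publish a PTR record that names someone else's domain.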
How It Works

Good Bot meticulously analyzes HTTP request headers, verifying user-agent strings and IP addresses against a curated list of known friendly bots. By focusing on valid domain names, CIDR ranges, ASNs, and specific user-agent patterns, Good Bot can accurately classify a bot's intentions, distinguishing between those that enhance your web ecosystem and those that do not.
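To make the idea concrete, here is a minimal sketch of pairing a user-agent pattern with a CIDR range check. It is an illustration of the approach described above, not Good Bot's actual implementation: the `botRule` type, `matchRule`, and `exampleRules` are hypothetical, the pattern is assumed to be a regular expression, and the single range listed is an example value rather than a complete or authoritative list.

```go
package main

import (
	"fmt"
	"net/netip"
	"regexp"
)

// botRule is a hypothetical rule shape, not Good Bot's internal type.
type botRule struct {
	Name      string
	UAPattern *regexp.Regexp // assumption: patterns are regular expressions
	CIDRs     []netip.Prefix
}

// matchRule returns the name of the first rule whose user-agent pattern AND
// one of whose CIDR ranges match the request.
func matchRule(rules []botRule, userAgent, ipAddress string) (string, bool) {
	addr, err := netip.ParseAddr(ipAddress)
	if err != nil {
		return "", false // an unparseable IP never matches
	}
	for _, r := range rules {
		if !r.UAPattern.MatchString(userAgent) {
			continue
		}
		for _, p := range r.CIDRs {
			if p.Contains(addr) {
				return r.Name, true
			}
		}
	}
	return "", false
}

// exampleRules returns a single illustrative rule.
func exampleRules() []botRule {
	return []botRule{{
		Name:      "Googlebot",
		UAPattern: regexp.MustCompile(`(?i)googlebot`),
		CIDRs:     []netip.Prefix{netip.MustParsePrefix("66.249.64.0/19")},
	}}
}

func main() {
	ua := "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

	name, ok := matchRule(exampleRules(), ua, "66.249.66.1")
	fmt.Println(name, ok)

	// Same user agent from an unrelated address: the claim is not trusted.
	_, ok = matchRule(exampleRules(), ua, "203.0.113.7")
	fmt.Println(ok)
}
```

Requiring both conditions is the point: a user-agent string is trivially spoofed, so it only counts when the request also comes from an address the bot operator is known to use.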

Getting Started

To use Good Bot in your Go project, simply add it as a dependency:

go get github.com/rynmccrmck/good-bot
Basic Usage

Here's a quick example of how to use Good Bot to detect whether a request comes from a known good bot:

package main

import (
    "fmt"
    "log"

    goodbot "github.com/rynmccrmck/good-bot"
)

func main() {
    userAgent := "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    ipAddress := "66.249.66.1"

    // CheckBotStatus returns a BotCheckResult and an error.
    result, err := goodbot.CheckBotStatus(userAgent, ipAddress)
    if err != nil {
        log.Fatalf("bot check failed: %v", err)
    }
    if result.BotStatus == goodbot.BotStatusFriendly {
        fmt.Printf("Friendly bot detected: %s\n", result.BotName)
    } else {
        fmt.Println("This bot is not recognized as friendly.")
    }
}
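In a web server, the user agent and IP address usually come from the incoming request. The following sketch shows one possible way to wire a bot check into `net/http` middleware. It uses only the standard library so it runs on its own: `checkFunc`, `tagGoodBots`, `stubCheck`, `probe`, and the `X-Good-Bot` header are hypothetical names, and in a real program the check function would presumably wrap `goodbot.CheckBotStatus`. It also assumes the server is reached directly; behind a reverse proxy, `RemoteAddr` holds the proxy's address.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// checkFunc is a hypothetical stand-in for a bot check.
type checkFunc func(userAgent, ipAddress string) (botName string, friendly bool)

// clientIP strips the port from RemoteAddr. Assumes no proxy in front.
func clientIP(r *http.Request) string {
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr
	}
	return host
}

// tagGoodBots marks responses to recognized bots with an X-Good-Bot header
// (an illustrative header name) and always passes the request on.
func tagGoodBots(check checkFunc, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if name, ok := check(r.UserAgent(), clientIP(r)); ok {
			w.Header().Set("X-Good-Bot", name)
		}
		next.ServeHTTP(w, r)
	})
}

// stubCheck recognizes exactly one address; it replaces the real check here.
func stubCheck(userAgent, ipAddress string) (string, bool) {
	if ipAddress == "66.249.66.1" {
		return "Googlebot", true
	}
	return "", false
}

// probe sends one in-memory request through the middleware and returns the
// X-Good-Bot header of the response.
func probe(check checkFunc, remoteAddr string) string {
	h := tagGoodBots(check, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.RemoteAddr = remoteAddr
	req.Header.Set("User-Agent", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Header().Get("X-Good-Bot")
}

func main() {
	fmt.Printf("%q\n", probe(stubCheck, "66.249.66.1:4711"))
	fmt.Printf("%q\n", probe(stubCheck, "203.0.113.7:4711"))
}
```

The middleware only tags the request rather than blocking unknown traffic, since a status of "not recognized" says nothing about whether the client is harmful.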
Bulk Verifier Tool

Good Bot includes a command-line tool, bulkVerifier, that processes a CSV file and identifies whether each entry (based on user agent and IP address) corresponds to a known good bot. The tool adds two columns to the output CSV: is_good_bot (true/false) and bot_name (the name of the bot if identified).

Installation

Ensure you have Go installed on your system. Clone the repository and navigate to the bulkVerifier directory:

git clone https://github.com/rynmccrmck/good-bot.git
cd good-bot/cmd/bulkVerifier

Build the tool with Go:

go build -o bulkVerifier
Usage

After building the tool, you can run it directly from the command line:

./bulkVerifier <input.csv> <output.csv>
  • <input.csv>: Path to the input CSV file containing the data to be processed. The CSV should have headers, with the first two columns being user_agent and ip_address.
  • <output.csv>: Path where the output CSV file will be saved. This file will include the original data plus two additional columns: is_good_bot and bot_name.
Input File Format

The input CSV file should be formatted with at least two columns: user_agent and ip_address. Here's an example:

user_agent,ip_address
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html),66.249.66.1
Output File Format

The output CSV file will include the same data as the input file, with two additional columns indicating whether each entry is a known good bot and the bot's name if identified:

user_agent,ip_address,is_good_bot,bot_name
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html),66.249.66.1,Yes,Googlebot
Example

To process an input file named requests.csv and save the output to results.csv, use the following command:

./bulkVerifier requests.csv results.csv

This analyzes each row in requests.csv, determines whether the user agent and IP address match a known good bot, and writes every row, together with the two result columns, to results.csv.
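For readers who want to do the same kind of processing inside their own program, the sketch below shows the general shape with `encoding/csv`: copy each row and add the two result columns. It is not the bulkVerifier source. The `classify` type, `annotate`, `annotateString`, and `stubCheck` are hypothetical, and the stub stands in for the real bot check. The sketch writes `true`/`false`, following the column description; the sample output above shows `Yes`, so the real tool's exact encoding may differ.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// classify is a hypothetical stand-in for the real bot check.
type classify func(userAgent, ipAddress string) (botName string, good bool)

// annotate copies CSV data from r to w and adds is_good_bot and bot_name to
// every row. The first two columns must be user_agent and ip_address.
func annotate(r io.Reader, w io.Writer, check classify) error {
	in := csv.NewReader(r)
	out := csv.NewWriter(w)

	header, err := in.Read()
	if err != nil {
		return fmt.Errorf("reading header: %w", err)
	}
	if len(header) < 2 {
		return fmt.Errorf("expected at least 2 columns, got %d", len(header))
	}
	if err := out.Write(append(header, "is_good_bot", "bot_name")); err != nil {
		return err
	}

	for {
		row, err := in.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
		name, good := check(row[0], row[1])
		if err := out.Write(append(row, strconv.FormatBool(good), name)); err != nil {
			return err
		}
	}
	out.Flush()
	return out.Error()
}

// stubCheck treats any user agent containing "Googlebot" as good. A real
// check would also verify the IP address.
func stubCheck(userAgent, ipAddress string) (string, bool) {
	if strings.Contains(userAgent, "Googlebot") {
		return "Googlebot", true
	}
	return "", false
}

// annotateString is a small convenience wrapper for in-memory data.
func annotateString(input string, check classify) (string, error) {
	var sb strings.Builder
	err := annotate(strings.NewReader(input), &sb, check)
	return sb.String(), err
}

func main() {
	input := "user_agent,ip_address\n" +
		"Googlebot/2.1,66.249.66.1\n" +
		"curl/8.0,203.0.113.7\n"
	out, err := annotateString(input, stubCheck)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Streaming row by row keeps memory use flat regardless of file size, which matters for large request logs.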

Contributing

Contributions to Good Bot are welcome! Whether it's enhancing detection logic, reporting bugs, or improving documentation, your input helps make Good Bot better for everyone.

License

Good Bot is released under the MIT License. See the LICENSE file for more information.

Support

For support, please open an issue on our GitHub repository.

Documentation

Overview

Package goodbot provides utilities for network operations such as domain name resolution, ASN lookup, and bot detection mechanisms based on various criteria including IP verification, User-Agent matching, and more. It utilizes external libraries for enhanced functionality like IP to ASN mapping and CIDR checks.

Functions

func IsUserAgentMatch

func IsUserAgentMatch(userAgent, uaPattern string) bool

IsUserAgentMatch checks if the user agent matches the pattern.

Types

type BotCheckResult

type BotCheckResult struct {
	BotStatus BotStatus
	BotName   string
}

func CheckBotStatus

func CheckBotStatus(userAgent, ipAddress string) (BotCheckResult, error)

CheckBotStatus is a convenience function that uses a default BotService instance to check the bot status for the given user agent and IP address.

type BotService

type BotService struct {
	// contains filtered or unexported fields
}

BotService provides methods for bot detection using network utilities.

func NewBotService

func NewBotService(nu NetworkUtils) *BotService

NewBotService creates a new instance of BotService with the provided NetworkUtils implementation.

func (*BotService) CheckBotStatus

func (bs *BotService) CheckBotStatus(ctx context.Context, userAgent, ipAddress string) (BotCheckResult, error)

CheckBotStatus determines the status of a bot based on the given user agent and IP address. It utilizes internal and external checks to classify bots.

type BotStatus

type BotStatus int

const (
	BotStatusUnknown  BotStatus = iota // Bot is not recognized
	BotStatusFriendly                  // Bot is recognized as friendly
)

type NetworkUtils

type NetworkUtils interface {
	GetDomainName(ipAddress string) string
	GetASN(ipAddress string) (string, error)
}

NetworkUtils defines an interface for network-related utilities including domain name resolution and ASN lookup for a given IP address.
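Because NewBotService accepts any NetworkUtils, lookups can be replaced with canned data, for example in tests that must not touch the network. The sketch below shows a minimal in-memory implementation. To compile on its own it redeclares the interface exactly as listed above; `staticNetworkUtils` and `newExampleUtils` are hypothetical names, and the sample hostname and ASN string (including its `AS` prefix) are assumptions about what real lookups might return.

```go
package main

import (
	"errors"
	"fmt"
)

// NetworkUtils is copied from the listing above so this sketch is
// self-contained; real code would use goodbot.NetworkUtils.
type NetworkUtils interface {
	GetDomainName(ipAddress string) string
	GetASN(ipAddress string) (string, error)
}

// staticNetworkUtils is a hypothetical implementation backed by maps.
type staticNetworkUtils struct {
	domains map[string]string // IP -> reverse DNS name
	asns    map[string]string // IP -> ASN
}

// GetDomainName returns the recorded name, or "" if none is known.
func (s staticNetworkUtils) GetDomainName(ipAddress string) string {
	return s.domains[ipAddress]
}

// GetASN returns the recorded ASN, or an error if none is known.
func (s staticNetworkUtils) GetASN(ipAddress string) (string, error) {
	asn, ok := s.asns[ipAddress]
	if !ok {
		return "", errors.New("no ASN recorded for " + ipAddress)
	}
	return asn, nil
}

// Compile-time check that the type satisfies the interface.
var _ NetworkUtils = staticNetworkUtils{}

// newExampleUtils returns illustrative data for one address.
func newExampleUtils() staticNetworkUtils {
	return staticNetworkUtils{
		domains: map[string]string{"66.249.66.1": "crawl-66-249-66-1.googlebot.com"},
		asns:    map[string]string{"66.249.66.1": "AS15169"}, // format is an assumption
	}
}

func main() {
	var nu NetworkUtils = newExampleUtils()

	fmt.Println(nu.GetDomainName("66.249.66.1"))

	if _, err := nu.GetASN("203.0.113.7"); err != nil {
		fmt.Println("lookup failed:", err)
	}
}
```

With the real package, a value like this would be passed to NewBotService and the resulting service's CheckBotStatus method called with a context.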

Directories

Path Synopsis
cmd
Package mocks is a generated GoMock package.
