config

package
v0.0.1
Warning: This package is not in the latest version of its module.
Published: Sep 28, 2025 License: MIT Imports: 0 Imported by: 0

Documentation

Overview

Package config provides configuration for the crawler: the file extensions, domains, TLDs, and query strings it ignores.

Constants

This section is empty.

Variables

var FileExtensions = []string{
	".ai", ".bmp", ".css", ".csv", ".doc", ".gif", ".ico", ".jpeg", ".jpg", ".js", ".pdf", ".png", ".ppsx", ".ps", ".psd", ".svg", ".tif", ".tiff", ".txt", ".xls", ".xml", ".3g2", ".3gp", ".avi", ".flv", ".h264", ".m4v", ".mkv", ".mov", ".mp4", ".mpg", ".mpeg", ".rm", ".swf", ".vob", ".wmv", ".aif", ".cda", ".mid", ".midi", ".mp3", ".mpa", ".ogg", ".wav", ".wma", ".wpl", ".doc", ".docx", ".odt", ".pdf", ".rtf", ".tex", ".txt", ".wks", ".wps", ".wpd", ".xml", ".ods", ".xlr", ".xls", ".xlsx", ".7z", ".arj", ".deb", ".pkg", ".rar", ".rpm", ".gz", ".z", ".zip",
}

FileExtensions lists the file extensions the crawler ignores. A few entries (".doc", ".pdf", ".txt", ".xls", ".xml") appear twice in the list.
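
The package exports no functions, so how the list is applied is up to the caller. A minimal sketch of one possible check, assuming the crawler compares the extension of the URL path against the list; the helper name hasIgnoredExtension, the case-insensitive comparison, and the shortened local copy of the list are all assumptions of this example, not part of the package:

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// fileExtensions is a shortened local stand-in for config.FileExtensions.
var fileExtensions = []string{".css", ".jpg", ".js", ".pdf", ".png", ".zip"}

// hasIgnoredExtension reports whether the path of rawURL ends in one of exts.
// Lowercasing the extension before comparing is an assumption of this sketch.
func hasIgnoredExtension(rawURL string, exts []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	ext := strings.ToLower(path.Ext(u.Path))
	if ext == "" {
		return false
	}
	for _, e := range exts {
		if ext == e {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasIgnoredExtension("https://example.com/report.PDF?dl=1", fileExtensions)) // true
	fmt.Println(hasIgnoredExtension("https://example.com/about", fileExtensions))           // false
}
```

Parsing the URL first keeps the query string out of the comparison, so "report.PDF?dl=1" is still recognized as a PDF.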

var IgnoreDomains = []string{}/* 178 elements not displayed */

IgnoreDomains lists the domains whose links the crawler ignores.
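
A hedged sketch of how a caller might test a link against such a list. The 178 entries are not shown on this page, so the two domains below are placeholders; the helper name isIgnoredDomain and the choice to also match subdomains are assumptions of this example:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ignoreDomains is a placeholder stand-in for config.IgnoreDomains;
// the real entries are not displayed in the documentation.
var ignoreDomains = []string{"ads.example", "tracker.example"}

// isIgnoredDomain reports whether the host of rawURL equals one of the
// listed domains or is a subdomain of one (an assumption of this sketch).
func isIgnoredDomain(rawURL string, domains []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	for _, d := range domains {
		if host == d || strings.HasSuffix(host, "."+d) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isIgnoredDomain("https://cdn.ads.example/x.js", ignoreDomains)) // true
	fmt.Println(isIgnoredDomain("https://notads.example/", ignoreDomains))      // false
}
```

Matching on "." + domain rather than a bare suffix avoids treating "notads.example" as a hit for "ads.example".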

var IgnoreQuery = []string{
	"lang",
	"utm_",
	"ref",
}

IgnoreQuery lists prefixes; queries starting with any of these strings are ignored.
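
The description leaves open whether a match causes the whole URL to be skipped or only the matching parameter to be dropped. A minimal sketch of the second reading, assuming the prefixes are compared against query parameter names; the helper name stripIgnoredQuery and the local copy of the list are hypothetical:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ignoreQuery is a local copy of the values shown for config.IgnoreQuery.
var ignoreQuery = []string{"lang", "utm_", "ref"}

// stripIgnoredQuery removes every query parameter whose name starts with
// one of prefixes and returns the rebuilt URL.
func stripIgnoredQuery(rawURL string, prefixes []string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	for name := range q {
		for _, p := range prefixes {
			if strings.HasPrefix(name, p) {
				q.Del(name)
				break
			}
		}
	}
	u.RawQuery = q.Encode() // Encode sorts the remaining keys
	return u.String(), nil
}

func main() {
	out, err := stripIgnoredQuery("https://example.com/post?id=7&utm_source=feed&lang=en&ref=home", ignoreQuery)
	fmt.Println(out, err) // https://example.com/post?id=7 <nil>
}
```

Because the match is by prefix, "ref" also removes a parameter named "referrer" and "lang" removes "language"; a caller that wants exact names for those two would need a separate check.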

var IgnoreTLD = []string{
	".cn", ".blogspot.com",
}

IgnoreTLD lists TLDs and domain suffixes (such as ".cn" and ".blogspot.com"); pages and links on matching domains are ignored.
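
Since each entry starts with a dot, one plausible way to apply the list is a suffix match on the host name. The sketch below assumes that reading; the helper name hasIgnoredTLD and the local copy of the list are hypothetical:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ignoreTLD is a local copy of the values shown for config.IgnoreTLD.
var ignoreTLD = []string{".cn", ".blogspot.com"}

// hasIgnoredTLD reports whether the host of rawURL ends with one of suffixes.
func hasIgnoredTLD(rawURL string, suffixes []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	for _, s := range suffixes {
		if strings.HasSuffix(host, s) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasIgnoredTLD("https://shop.example.cn/item", ignoreTLD)) // true
	fmt.Println(hasIgnoredTLD("https://myblog.blogspot.com/", ignoreTLD)) // true
	fmt.Println(hasIgnoredTLD("https://example.com/", ignoreTLD))         // false
}
```

Under this reading the bare host "blogspot.com" is not matched, because the entry ".blogspot.com" requires a preceding label.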

Functions

This section is empty.

Types

This section is empty.
