package crawler
v0.0.0-...-03df7be
Published: Mar 3, 2022 License: GPL-3.0 Imports: 7 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Crawl

func Crawl(endpoint config.Endpoint, nThreads int, depth uint8)

Crawl spawns nThreads worker goroutines to make requests to the endpoint. It collects their results and appends them to the set of discovered links, ensuring that no link is followed beyond the supplied depth. If the set has not grown for 10 seconds, crawling stops.
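
A minimal usage sketch. The import paths and the URL field on config.Endpoint are assumptions for illustration; the actual fields of config.Endpoint are not shown on this page.

package main

import (
	"example.com/crawler"        // hypothetical import path
	"example.com/crawler/config" // hypothetical import path
)

func main() {
	// Hypothetical construction: the real fields of config.Endpoint
	// are not documented here.
	ep := config.Endpoint{URL: "https://example.com"}

	// Crawl with 8 workers, following links at most 3 levels deep.
	crawler.Crawl(ep, 8, 3)
}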

func Do

func Do(work chan *Page, results chan *Result)

Do makes an HTTP request to the URL of each Page received on the work channel, parses the response body when it is HTML, looks for "form" and "a" tags, and sends the URLs found in those tags back on the results channel.
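
The package's implementation is not shown here, but a worker matching the description above could look roughly like the following sketch, which uses net/http and golang.org/x/net/html (the traversal details and depth handling are assumptions):

package crawler

import (
	"net/http"

	"golang.org/x/net/html"
)

// doSketch is a hypothetical illustration of the behaviour described
// above, not the package's actual implementation.
func doSketch(work chan *Page, results chan *Result) {
	for page := range work {
		resp, err := http.Get(page.Location)
		if err != nil {
			results <- &Result{Page: page, Err: err}
			continue
		}
		doc, err := html.Parse(resp.Body)
		resp.Body.Close()
		if err != nil {
			results <- &Result{Page: page, Err: err}
			continue
		}
		var walk func(*html.Node)
		walk = func(n *html.Node) {
			if n.Type == html.ElementNode {
				// "a" tags link via href, "form" tags via action.
				var attr string
				switch n.Data {
				case "a":
					attr = "href"
				case "form":
					attr = "action"
				}
				if attr != "" {
					for _, a := range n.Attr {
						if a.Key == attr && a.Val != "" {
							results <- &Result{Page: &Page{
								Location: a.Val,
								Depth:    page.Depth + 1,
							}}
						}
					}
				}
			}
			for c := n.FirstChild; c != nil; c = c.NextSibling {
				walk(c)
			}
		}
		walk(doc)
	}
}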

Types

type Page

type Page struct {
	Location string
	Depth    uint8
}

Page represents a page that is crawled. It includes the depth at which it was discovered.

type Result

type Result struct {
	*Page
	Err error
}

Result is sent back on the results channel by the Do function. It embeds a pointer to the discovered Page, including the depth at which it was found. The Err field reports any error encountered while crawling.
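
A hedged sketch of how the two types flow through Do's channels. The import path, buffer sizes, and the consuming loop are illustrative; termination handling (such as the 10-second idle stop that Crawl applies) is omitted:

package main

import (
	"log"

	"example.com/crawler" // hypothetical import path
)

func main() {
	work := make(chan *crawler.Page, 64)
	results := make(chan *crawler.Result, 64)

	// One worker; Crawl presumably starts nThreads of these.
	go crawler.Do(work, results)

	// Seed the crawl with a root page at depth 0.
	work <- &crawler.Page{Location: "https://example.com", Depth: 0}

	for r := range results {
		if r.Err != nil {
			log.Printf("crawl failed: %v", r.Err)
			continue
		}
		log.Printf("found %s at depth %d", r.Location, r.Depth)
	}
}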
