spider

package
v0.0.0-...-1afa7a4
Warning

This package is not in the latest version of its module.

Published: Nov 13, 2018 | License: GPL-2.0 | Imports: 8 | Imported by: 0

Documentation

Constants

const UserAgent = "wospi - https://github.com/vlad-s/wospi"

UserAgent is the default user agent of the Spider.

Variables

This section is empty.

Functions

This section is empty.

Types

type Options

type Options struct {
	MinLength int    // minimum length of words to be stored by the Spider
	MaxDepth  int    // maximum depth for the crawling
	UserAgent string // user agent used in HTTP requests

	// strict domain matching; if set, www sub-domain won't be added on non-www URL,
	// and non-www domain won't be added on www URL
	StrictDomain bool

	StripResult  bool
	StripCharset []rune
}

Options provides configuration params for the Spider.
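The StrictDomain comment above can be read as a rule for which hosts the crawler treats as in scope. The following is a minimal sketch of that rule, not the package's actual code: `allowedHosts` is a hypothetical helper invented for illustration, and the exact matching logic inside the Spider is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedHosts is a hypothetical helper (not part of the spider package).
// It returns the hosts that would be considered in scope for a start host.
// With strict set, only the exact host is allowed. Otherwise the www/non-www
// counterpart is added as well, mirroring the StrictDomain field comment.
func allowedHosts(host string, strict bool) []string {
	hosts := []string{host}
	if strict {
		return hosts
	}
	if strings.HasPrefix(host, "www.") {
		// www URL: also allow the bare domain.
		return append(hosts, strings.TrimPrefix(host, "www."))
	}
	// non-www URL: also allow the www sub-domain.
	return append(hosts, "www."+host)
}

func main() {
	fmt.Println(allowedHosts("example.com", false))
	fmt.Println(allowedHosts("www.example.com", false))
	fmt.Println(allowedHosts("example.com", true))
}
```

Under this reading, leaving StrictDomain unset widens the crawl to both host variants, while setting it keeps the crawl on exactly the host given in the start URL.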

func (*Options) Default

func (o *Options) Default()

Default applies default values to the Options.

type Spider

type Spider struct {
	*Options
	// contains filtered or unexported fields
}

Spider provides the options and engines for the scraper.
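Because Spider embeds *Options, Go promotes the option fields onto the Spider value itself. The sketch below uses trimmed local stand-ins for the two documented types (the unexported fields are omitted, and the values are arbitrary) to show what that promotion looks like in practice.

```go
package main

import "fmt"

// Options is a trimmed local stand-in for the documented struct.
type Options struct {
	MinLength int
	MaxDepth  int
	UserAgent string
}

// Spider is a local stand-in that embeds *Options, as documented.
type Spider struct {
	*Options
}

func main() {
	s := &Spider{Options: &Options{MaxDepth: 3}}

	// Promoted fields can be read and written directly on the Spider.
	s.MinLength = 5

	// s.MinLength and s.Options.MinLength refer to the same field.
	fmt.Println(s.MaxDepth, s.Options.MinLength)
}
```

Since the embedded value is a pointer, a Spider whose Options pointer is nil would panic on any promoted field access, so the pointer has to be set before the fields are used.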

func New

func New(opts *Options) *Spider

New returns a new Spider.

func (*Spider) Run

func (s *Spider) Run(URL string) error

Run starts crawling and scraping at the provided URL.
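Taken together, the documented signatures suggest a call sequence of Options, Default, New, then Run. The sketch below shows that sequence against local stand-ins so it compiles on its own. The bodies of `Default`, `New`, and `Run` here are assumptions written for illustration (in particular, which defaults are applied and when Run returns an error are not stated in the documentation), and `crawl` is a hypothetical wrapper.

```go
package main

import (
	"errors"
	"fmt"
)

// UserAgent mirrors the documented constant.
const UserAgent = "wospi - https://github.com/vlad-s/wospi"

// Options is a trimmed stand-in for the documented struct.
type Options struct {
	MinLength int
	MaxDepth  int
	UserAgent string
}

// Default is a stand-in: filling only an empty UserAgent is an assumption.
func (o *Options) Default() {
	if o.UserAgent == "" {
		o.UserAgent = UserAgent
	}
}

// Spider is a stand-in that embeds *Options, as documented.
type Spider struct{ *Options }

// New matches the documented signature; storing the pointer is an assumption.
func New(opts *Options) *Spider { return &Spider{Options: opts} }

// Run is a stub with the documented signature. The real method crawls the
// URL; this one only rejects an empty string so the error path can be shown.
func (s *Spider) Run(URL string) error {
	if URL == "" {
		return errors.New("empty URL")
	}
	return nil
}

// crawl is a hypothetical wrapper showing the call order.
func crawl(url string) error {
	opts := &Options{}
	opts.Default()     // apply defaults first...
	opts.MinLength = 4 // ...then override individual fields
	opts.MaxDepth = 2

	s := New(opts)
	if err := s.Run(url); err != nil {
		return fmt.Errorf("spider run on %q: %w", url, err)
	}
	return nil
}

func main() {
	if err := crawl("https://example.com"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Calling Default before setting individual fields keeps the overrides intact whether Default fills only zero values or overwrites every field, which the documentation does not specify.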
