data

package
v0.2.1
Warning

This package is not in the latest version of its module.

Published: Nov 17, 2019 License: MIT Imports: 7 Imported by: 0

Documentation

Overview

Package data provides types appropriate for describing the output of a crawler. All of the fields of these types are exported, since they are intended to be marshalled into some transmission format.

In general, the approach is to define simple concepts and embed them in more complex types when that simplifies the implementation. For instance, an Address object describes a single URL. A Link object describes a link scraped from a webpage. This Link has an Address to which it points, but it also has anchor text, which might be interesting to analyze.
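The composition described above can be sketched as follows. The Address and Link types here are local stand-ins that copy the field layout documented below; they are not imported from the package, and encodeLink is a hypothetical helper. Note that the documented structs hold a named *Address field rather than using Go's anonymous embedding, and JSON is only one possible transmission format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Address and Link are local stand-ins mirroring the documented
// field layout. They are assumptions for illustration only.
type Address struct {
	Full   string
	Scheme string
	Opaque string
	Host   string
	Path   string
	Query  string
}

type Link struct {
	Address  *Address
	Anchor   string
	Href     string
	Nofollow bool
}

// encodeLink (hypothetical helper) marshals a Link to JSON. Because
// every field is exported, encoding/json can see the nested Address.
func encodeLink(l *Link) (string, error) {
	b, err := json.Marshal(l)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	l := &Link{
		Address: &Address{
			Full:   "https://example.com/about",
			Scheme: "https",
			Host:   "example.com",
			Path:   "/about",
		},
		Anchor: "About us",
		Href:   "/about",
	}
	out, err := encodeLink(l)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The nested Address appears as a JSON object inside the Link, so a consumer can analyze the target URL and the anchor text together.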

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Address

type Address struct {
	Full   string
	Scheme string
	Opaque string
	Host   string
	Path   string
	Query  string
}

Address represents the useful parts of a URL we'd like to have available for analysis. It is the basic type which other address-related types embed.

func MakeAddress

func MakeAddress(rawurl string) *Address

func MakeAddressResolved

func MakeAddressResolved(base *Address, rawurl string) *Address
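The page does not show how these constructors are implemented. As a rough sketch, the Address fields line up with fields of the standard library's url.URL, so one plausible approach is to parse with net/url and copy the parts across. The names sketchMakeAddress, sketchMakeAddressResolved, and addressFromURL are hypothetical, and returning nil on a parse error is an assumption, not documented behavior.

```go
package main

import (
	"fmt"
	"net/url"
)

// Address is a local stand-in mirroring the documented struct.
type Address struct {
	Full   string
	Scheme string
	Opaque string
	Host   string
	Path   string
	Query  string
}

// addressFromURL (hypothetical) copies the parts of a parsed URL.
func addressFromURL(u *url.URL) *Address {
	return &Address{
		Full:   u.String(),
		Scheme: u.Scheme,
		Opaque: u.Opaque,
		Host:   u.Host,
		Path:   u.Path,
		Query:  u.RawQuery,
	}
}

// sketchMakeAddress is an assumed equivalent of MakeAddress.
// It returns nil when rawurl cannot be parsed (an assumption).
func sketchMakeAddress(rawurl string) *Address {
	u, err := url.Parse(rawurl)
	if err != nil {
		return nil
	}
	return addressFromURL(u)
}

// sketchMakeAddressResolved is an assumed equivalent of
// MakeAddressResolved: it resolves rawurl against base.Full.
func sketchMakeAddressResolved(base *Address, rawurl string) *Address {
	if base == nil {
		return sketchMakeAddress(rawurl)
	}
	b, err := url.Parse(base.Full)
	if err != nil {
		return nil
	}
	ref, err := url.Parse(rawurl)
	if err != nil {
		return nil
	}
	return addressFromURL(b.ResolveReference(ref))
}

func main() {
	base := sketchMakeAddress("https://example.com/a/b?x=1")
	resolved := sketchMakeAddressResolved(base, "../c?y=2")
	fmt.Println(base.Host, base.Path, base.Query)
	fmt.Println(resolved.Full)
}
```

Resolving against a base is what lets a relative href scraped from a page become an absolute Address.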

type Canonical

type Canonical struct {
	Address *Address
	Href    string
}

func MakeCanonical

func MakeCanonical(base *Address, href string) *Canonical

type Hreflang

type Hreflang struct {
	Address  *Address
	Href     string
	Hreflang string
}

func MakeHreflang

func MakeHreflang(base *Address, href, lang string) *Hreflang

type Link

type Link struct {
	Address  *Address
	Anchor   string
	Href     string
	Nofollow bool
}

func MakeLink

func MakeLink(base *Address, href string, anchor string, nofollow bool) *Link

type Pair

type Pair struct {
	K string
	V string
}

type Result

type Result struct {
	// Crawler state
	Address *Address `json:",omitempty"`
	Depth   int      `mode:"REQUIRED"`

	// Meta
	BodyTextHash string `json:",omitempty"`

	// Content
	Description string
	Title       string
	H1          string
	Robots      string
	Canonical   *Canonical  `json:",omitempty"`
	Links       []*Link     `json:",omitempty"`
	Hreflang    []*Hreflang `json:",omitempty"`

	// Response
	Status     string   `json:",omitempty"`
	StatusCode int      `json:",omitempty"`
	Proto      string   `json:",omitempty"`
	ProtoMajor int      `json:",omitempty"`
	ProtoMinor int      `json:",omitempty"`
	Header     []*Pair  `json:",omitempty"`
	ResolvesTo *Address `json:",omitempty"` // In case of redirect
}
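The struct tags above determine which fields appear when a Result is encoded with encoding/json. The sketch below copies the documented layout into local stand-in types (not imported from the package) and uses a hypothetical encodeResult helper to show the effect: fields tagged omitempty disappear when they hold a zero value, while untagged fields such as Title and Depth are always written. The mode:"REQUIRED" tag is ignored by encoding/json; which encoder reads it is not stated on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins mirroring the documented types.
type Address struct {
	Full, Scheme, Opaque, Host, Path, Query string
}

type Canonical struct {
	Address *Address
	Href    string
}

type Link struct {
	Address  *Address
	Anchor   string
	Href     string
	Nofollow bool
}

type Hreflang struct {
	Address  *Address
	Href     string
	Hreflang string
}

type Pair struct{ K, V string }

type Result struct {
	Address *Address `json:",omitempty"`
	Depth   int      `mode:"REQUIRED"`

	BodyTextHash string `json:",omitempty"`

	Description string
	Title       string
	H1          string
	Robots      string
	Canonical   *Canonical  `json:",omitempty"`
	Links       []*Link     `json:",omitempty"`
	Hreflang    []*Hreflang `json:",omitempty"`

	Status     string   `json:",omitempty"`
	StatusCode int      `json:",omitempty"`
	Proto      string   `json:",omitempty"`
	ProtoMajor int      `json:",omitempty"`
	ProtoMinor int      `json:",omitempty"`
	Header     []*Pair  `json:",omitempty"`
	ResolvesTo *Address `json:",omitempty"`
}

// encodeResult (hypothetical helper) marshals a Result to JSON.
func encodeResult(r *Result) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeResult(&Result{Depth: 2, Title: "Home", StatusCode: 200})
	if err != nil {
		panic(err)
	}
	// Prints: {"Depth":2,"Description":"","Title":"Home","H1":"","Robots":"","StatusCode":200}
	fmt.Println(out)
}
```

One consequence worth noting: a StatusCode of 0 or an empty Links slice is indistinguishable from an absent field in the encoded output.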

func MakeResult

func MakeResult(rawurl string, depth int, resp *http.Response) *Result
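The Response fields of Result (Status, StatusCode, Proto, ProtoMajor, ProtoMinor) share their names with fields of net/http's Response, so they can presumably be copied across directly. How MakeResult fills Header is not documented here; the sketch below shows one possible way to flatten an http.Header into Pair values. headerPairs is a hypothetical helper, and sorting by key is an assumption made only to keep the output deterministic.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
)

// Pair is a local stand-in mirroring the documented key/value type.
type Pair struct {
	K string
	V string
}

// headerPairs (hypothetical helper) emits one Pair per header value.
// Keys are sorted because Go map iteration order is not stable.
func headerPairs(h http.Header) []*Pair {
	keys := make([]string, 0, len(h))
	for k := range h {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var pairs []*Pair
	for _, k := range keys {
		for _, v := range h[k] {
			pairs = append(pairs, &Pair{K: k, V: v})
		}
	}
	return pairs
}

func main() {
	// A hand-built response stands in for one returned by a crawler.
	resp := &http.Response{
		Status:     "200 OK",
		StatusCode: 200,
		Proto:      "HTTP/1.1",
		ProtoMajor: 1,
		ProtoMinor: 1,
		Header: http.Header{
			"Content-Type": {"text/html"},
			"Set-Cookie":   {"a=1", "b=2"},
		},
	}
	fmt.Println(resp.Status, resp.Proto)
	for _, p := range headerPairs(resp.Header) {
		fmt.Printf("%s: %s\n", p.K, p.V)
	}
}
```

A slice of pairs, unlike a map, preserves repeated headers such as Set-Cookie as separate entries and gives the encoded output a fixed shape.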
