descrape

Version: v0.1.5
Published: May 22, 2021 | License: MIT | Imports: 6 | Imported by: 0

README

descrape

A declarative web scraping library for Go (WIP)

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Collection

type Collection struct {
	Name     string
	Selector string
	Fields   []Element
}

type CollectionList

type CollectionList []ElementsMap

type CollectionsMap

type CollectionsMap map[string]CollectionList

type Element

type Element struct {
	Name     string
	Type     string
	Selector string
	Unique   bool
}

type ElementsMap

type ElementsMap map[string]interface{}

type OutFormat added in v0.1.1

type OutFormat string
const (
	JSON OutFormat = "json"
	CSV  OutFormat = "csv"
	XML  OutFormat = "xml"
)
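
The OutFormat constants are plain strings, so a caller that receives a format name from a flag or config file may want to validate it before passing it on. The following is a minimal sketch under that assumption; `parseOutFormat` is a hypothetical helper that is not part of the package, and the type and constants are redeclared locally so the example compiles on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copy of the documented type and constants, redeclared here so the
// sketch is self-contained.
type OutFormat string

const (
	JSON OutFormat = "json"
	CSV  OutFormat = "csv"
	XML  OutFormat = "xml"
)

// parseOutFormat is a hypothetical helper (not provided by the package).
// It normalizes user input and rejects anything that is not one of the
// three documented formats.
func parseOutFormat(s string) (OutFormat, error) {
	switch f := OutFormat(strings.ToLower(strings.TrimSpace(s))); f {
	case JSON, CSV, XML:
		return f, nil
	default:
		return "", fmt.Errorf("unsupported output format %q", s)
	}
}

func main() {
	for _, arg := range []string{"JSON", " csv ", "yaml"} {
		f, err := parseOutFormat(arg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("format:", f)
	}
}
```

Whether the package itself rejects unknown formats in Output or Scrape is not stated in this documentation, so validating up front is a defensive choice rather than a requirement.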

type PageData

type PageData struct {
	PageElements ElementsMap    `json:"page"`
	Collections  CollectionsMap `json:"collections"`
}
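
To illustrate how the struct tags above shape JSON output, here is a small sketch that mirrors the documented types locally and marshals a PageData value with the standard library. The sample keys (`title`, `links`, `text`, `href`) and the helper `encodePageData` are made up for illustration; how the package itself fills these maps or produces its output is not shown in this documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, so the sketch compiles without
// importing the package.
type ElementsMap map[string]interface{}
type CollectionList []ElementsMap
type CollectionsMap map[string]CollectionList

type PageData struct {
	PageElements ElementsMap    `json:"page"`
	Collections  CollectionsMap `json:"collections"`
}

// encodePageData is a hypothetical helper that marshals a PageData value
// to compact JSON using encoding/json.
func encodePageData(p PageData) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Illustrative data only: one page-level element and one collection
	// containing a single item.
	data := PageData{
		PageElements: ElementsMap{"title": "Example"},
		Collections: CollectionsMap{
			"links": CollectionList{
				{"text": "Home", "href": "/"},
			},
		},
	}

	out, err := encodePageData(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Struct fields appear in declaration order; map keys are sorted.
	fmt.Println(out)
}
```

Page-level elements end up under the `page` key, and each named collection becomes a JSON array of objects under `collections`.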

type Scraper added in v0.1.1

type Scraper struct {
	URL          string `yaml:"url"`
	Doc          *goquery.Document
	PageData     PageData
	PageElements []Element `yaml:"page"`
	Collections  []Collection
}

func NewScraper added in v0.1.1

func NewScraper(configData []byte) (*Scraper, error)

NewScraper parses a scraper config and returns a Scraper.

func (*Scraper) Init added in v0.1.2

func (s *Scraper) Init() error

func (*Scraper) Output added in v0.1.1

func (s *Scraper) Output(format OutFormat) (string, error)

func (*Scraper) Scrape added in v0.1.1

func (s *Scraper) Scrape(outputFormat OutFormat) (string, error)
