hiphtml

package module
v0.0.0-...-fe41f36
Warning

This package is not in the latest version of its module.

Published: Jan 16, 2019 License: Unlicense Imports: 4 Imported by: 0

README

HipHTML: Simple HTML for Go

HipHTML is a wrapper around golang.org/x/net/html, providing a layer of abstraction that simplifies web scraping.

At the moment, the API is not stable. The package is still in development and may undergo major refactoring in the near future.

Contributions are welcome!

Documentation

Index

Constants

This section is empty.

Variables

var (
	ErrBegOfDoc       = errors.New("reached beginning of document")
	ErrEndOfDoc       = errors.New("reached end of document")
	ErrNoSuchRelative = errors.New("node does not have requested rel")
)

Parser errors
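
Sentinel errors like these are usually compared with errors.Is. The following is a minimal sketch that uses only the standard library: the cursor type, its next method, and the collect helper are hypothetical stand-ins for a parser, and errEndOfDoc is a local copy that mirrors ErrEndOfDoc. Whether the package's own methods return these exact values at the document boundaries is an assumption based on the variable names.

```go
package main

import (
	"errors"
	"fmt"
)

// errEndOfDoc is a local stand-in that mirrors ErrEndOfDoc from the package.
var errEndOfDoc = errors.New("reached end of document")

// cursor is a hypothetical stand-in for a parser: it walks a slice of tag names.
type cursor struct {
	tags []string
	pos  int
}

// next returns the next tag, or errEndOfDoc once the slice is exhausted.
func (c *cursor) next() (string, error) {
	if c.pos >= len(c.tags) {
		return "", errEndOfDoc
	}
	tag := c.tags[c.pos]
	c.pos++
	return tag, nil
}

// collect drains the cursor and treats errEndOfDoc as normal termination.
// Any other error is wrapped and returned to the caller.
func collect(c *cursor) ([]string, error) {
	var out []string
	for {
		tag, err := c.next()
		if errors.Is(err, errEndOfDoc) {
			return out, nil
		}
		if err != nil {
			return out, fmt.Errorf("walking document: %w", err)
		}
		out = append(out, tag)
	}
}

func main() {
	tags, err := collect(&cursor{tags: []string{"html", "head", "body"}})
	fmt.Println(tags, err)
}
```

Treating the end-of-document error as a loop terminator rather than a failure keeps real errors distinguishable from simply running out of nodes.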

Functions

This section is empty.

Types

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser is an HTML parser.

func NewParser

func NewParser(r io.Reader) (p *Parser, err error)

NewParser returns an HTML parser that reads from an io.Reader.

func (*Parser) Body

func (p *Parser) Body() (*html.Node, error)

Body advances the parser to the body element in the document.

func (*Parser) FirstChild

func (p *Parser) FirstChild() (*html.Node, error)

FirstChild advances the parser to the first child if it exists.

func (*Parser) FirstElementByAtom

func (p *Parser) FirstElementByAtom(a atom.Atom) (*html.Node, error)

FirstElementByAtom advances the parser to the first element in a document with the given atom.

func (*Parser) FirstMeta

func (p *Parser) FirstMeta() (*html.Node, error)

FirstMeta advances the parser to the first meta tag in the document.

func (*Parser) Head

func (p *Parser) Head() (*html.Node, error)

Head advances the parser to the head element in the document.

func (*Parser) LastChild

func (p *Parser) LastChild() (*html.Node, error)

LastChild advances the parser to the last child if it exists.

func (*Parser) Level

func (p *Parser) Level() int

Level returns the current level of the HTML parser.

The <html> tag is fixed to level 1, and nested tags are one level higher than their parent.
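
One way to read the level rule is as a count of ancestors. The sketch below illustrates that reading with a hypothetical node type that keeps only a parent link. It assumes a document root sits above the html element, as it does in trees built by golang.org/x/net/html, and it is not the package's actual implementation.

```go
package main

import "fmt"

// node is a hypothetical stand-in for html.Node, keeping only the parent link.
type node struct {
	tag    string
	parent *node
}

// level counts the ancestors of n. With a document root above <html>,
// <html> has one ancestor (level 1) and each nested tag is one level deeper.
func level(n *node) int {
	depth := 0
	for p := n.parent; p != nil; p = p.parent {
		depth++
	}
	return depth
}

func main() {
	doc := &node{tag: "#document"}
	root := &node{tag: "html", parent: doc}
	body := &node{tag: "body", parent: root}
	para := &node{tag: "p", parent: body}
	for _, n := range []*node{root, body, para} {
		fmt.Printf("%s: level %d\n", n.tag, level(n))
	}
}
```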

func (*Parser) Next

func (p *Parser) Next() (*html.Node, error)

Next advances the parser to the next node if it exists.
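
The documentation does not say what order "next" follows. A natural reading is document order (a pre-order walk of the tree), and the sketch below illustrates that under this assumption. The node type is a hypothetical stand-in that reuses the link names Parent, FirstChild, and NextSibling from html.Node. The helpers appendChild, next, and order are illustrative and not part of the package.

```go
package main

import "fmt"

// node is a hypothetical stand-in exposing the same link names as html.Node.
type node struct {
	Data                            string
	Parent, FirstChild, NextSibling *node
}

// appendChild links child as the last child of parent.
func appendChild(parent, child *node) {
	child.Parent = parent
	if parent.FirstChild == nil {
		parent.FirstChild = child
		return
	}
	last := parent.FirstChild
	for last.NextSibling != nil {
		last = last.NextSibling
	}
	last.NextSibling = child
}

// next returns the document-order successor of n, or nil at the end:
// descend to the first child if there is one, otherwise climb until
// some ancestor (or n itself) has a next sibling.
func next(n *node) *node {
	if n.FirstChild != nil {
		return n.FirstChild
	}
	for cur := n; cur != nil; cur = cur.Parent {
		if cur.NextSibling != nil {
			return cur.NextSibling
		}
	}
	return nil
}

// order lists the Data of every node from start to the end of the document.
func order(start *node) []string {
	var out []string
	for n := start; n != nil; n = next(n) {
		out = append(out, n.Data)
	}
	return out
}

func main() {
	root := &node{Data: "html"}
	head := &node{Data: "head"}
	title := &node{Data: "title"}
	body := &node{Data: "body"}
	para := &node{Data: "p"}
	appendChild(root, head)
	appendChild(head, title)
	appendChild(root, body)
	appendChild(body, para)
	fmt.Println(order(root))
}
```

A variant that skips non-element nodes or matches a specific tag would call next in a loop until the filter accepts a node.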

func (*Parser) NextElement

func (p *Parser) NextElement() (*html.Node, error)

NextElement advances the parser to the next element node if it exists.

func (*Parser) NextElementByAtom

func (p *Parser) NextElementByAtom(a atom.Atom) (*html.Node, error)

NextElementByAtom advances to the next element with the given atom.

func (*Parser) NextMeta

func (p *Parser) NextMeta() (*html.Node, error)

NextMeta advances the parser to the next meta tag in the document.

func (*Parser) NextSibling

func (p *Parser) NextSibling() (*html.Node, error)

NextSibling advances the parser to the next sibling if it exists.

func (*Parser) Node

func (p *Parser) Node() *html.Node

Node returns the current node of the HTML parser.

func (*Parser) Parent

func (p *Parser) Parent() (*html.Node, error)

Parent retreats the parser to the parent if it exists.

func (*Parser) Prev

func (p *Parser) Prev() (*html.Node, error)

Prev retreats the parser to the previous node if it exists.
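
Assuming "previous" means the document-order predecessor, stepping backwards is slightly less obvious than stepping forwards: the predecessor is the deepest last descendant of the previous sibling, or the parent when there is no previous sibling. The sketch below shows this with a hypothetical node type that borrows the link names Parent, LastChild, and PrevSibling from html.Node. The helper names are illustrative and the package may implement this differently.

```go
package main

import "fmt"

// node is a hypothetical stand-in exposing the same link names as html.Node.
type node struct {
	Data                           string
	Parent, LastChild, PrevSibling *node
}

// appendChild links child as the last child of parent.
func appendChild(parent, child *node) {
	child.Parent = parent
	child.PrevSibling = parent.LastChild
	parent.LastChild = child
}

// prev returns the document-order predecessor of n, or nil at the beginning.
func prev(n *node) *node {
	if n.PrevSibling == nil {
		return n.Parent
	}
	// Descend to the deepest last descendant of the previous sibling.
	cur := n.PrevSibling
	for cur.LastChild != nil {
		cur = cur.LastChild
	}
	return cur
}

// reverseOrder lists the Data of every node from start back to the root.
func reverseOrder(start *node) []string {
	var out []string
	for n := start; n != nil; n = prev(n) {
		out = append(out, n.Data)
	}
	return out
}

func main() {
	root := &node{Data: "html"}
	head := &node{Data: "head"}
	title := &node{Data: "title"}
	body := &node{Data: "body"}
	para := &node{Data: "p"}
	appendChild(root, head)
	appendChild(head, title)
	appendChild(root, body)
	appendChild(body, para)
	fmt.Println(reverseOrder(para))
}
```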

func (*Parser) PrevElement

func (p *Parser) PrevElement() (*html.Node, error)

PrevElement retreats the parser to the previous element node if it exists.

func (*Parser) PrevSibling

func (p *Parser) PrevSibling() (*html.Node, error)

PrevSibling retreats the parser to the previous sibling if it exists.

func (*Parser) Reset

func (p *Parser) Reset() *html.Node

Reset retreats the parser to the beginning of the document.
