Documentation ¶
Overview ¶
Package gopherio provides a simple and lightweight way to parse and manipulate HTML in Go. Inspired by cheerio in JavaScript, it gives you a clean API for selecting, traversing, and extracting content from HTML documents.
gopherio is built for developers who need to:
- scrape or query HTML from web pages
- extract text, attributes, or structured data
- modify or inspect DOM trees in memory
It focuses on being fast, minimal, and intuitive, making it a good fit for projects where pulling in a full browser engine would be overkill.
Example usage:
    doc, _ := gopherio.Load(`<html><body><h1>hello</h1></body></html>`)
    title := doc.Find("h1").Text()
    fmt.Println(title) // output: hello
gopherio helps you work with HTML in Go the same way cheerio helps in JS.
Index ¶
- type Document
- type Selection
- func (s *Selection) After(content string)
- func (s *Selection) Append(content string)
- func (s *Selection) Attr(key string) string
- func (s *Selection) Attrs() map[string]string
- func (s *Selection) Before(content string)
- func (s *Selection) Children() *Selection
- func (s *Selection) Clone() *Selection
- func (s *Selection) Contains(text string) *Selection
- func (s *Selection) Each(f func(int, *Selection)) *Selection
- func (s *Selection) Empty() bool
- func (s *Selection) Eq(index int) *Selection
- func (s *Selection) Filter(selector string) *Selection
- func (s *Selection) Find(selector string) *Selection
- func (s *Selection) First() *Selection
- func (s *Selection) Has(selector string) *Selection
- func (s *Selection) Html() string
- func (s *Selection) Last() *Selection
- func (s *Selection) Length() int
- func (s *Selection) Map(f func(int, *Selection) string) []string
- func (s *Selection) Next() *Selection
- func (s *Selection) Nodes() []*html.Node
- func (s *Selection) Not(selector string) *Selection
- func (s *Selection) Parent() *Selection
- func (s *Selection) Prepend(content string)
- func (s *Selection) Prev() *Selection
- func (s *Selection) Remove()
- func (s *Selection) ReplaceWith(content string)
- func (s *Selection) Siblings() *Selection
- func (s *Selection) Text() string
- func (s *Selection) Unwrap()
- func (s *Selection) Wrap(content string)
- func (s *Selection) WrapAll(content string)
- func (s *Selection) WrapInner(content string)
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Document ¶
type Document struct {
// contains filtered or unexported fields
}
func Load ¶
Load parses HTML from one of several sources: a string, a []byte, a URL, or a file. The first parameter can be a string or a []byte. If it is a string, gopherio detects whether it is a raw HTML snippet, a file path, or a URL. For a URL, you can pass optional headers as a map[string]string.
Examples:
    // from raw html string
    doc, _ := gopherio.Load("<html><body><h1>hello</h1></body></html>")

    // from []byte
    data := []byte("<p>world</p>")
    doc, _ := gopherio.Load(data)

    // from file
    doc, _ := gopherio.Load("index.html")

    // from url
    doc, _ := gopherio.Load("https://example.com")

    // from url with headers
    headers := map[string]string{"User-Agent": "gopherio"}
    doc, _ := gopherio.Load("https://example.com", headers)
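The description of Load says a string argument is classified as raw HTML, a file path, or a URL, but it does not say how. The following is a minimal sketch of one plausible heuristic, using only the standard library. The names `sourceKind` and `classify` are hypothetical, and the rules shown are an assumption; gopherio's actual detection may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sourceKind is a hypothetical label for the kind of input a loader received.
type sourceKind string

const (
	kindURL  sourceKind = "url"
	kindHTML sourceKind = "html"
	kindFile sourceKind = "file"
)

// classify guesses what a string input refers to. This is an illustrative
// heuristic only; it is not taken from gopherio's source.
func classify(input string) sourceKind {
	s := strings.TrimSpace(input)
	// An absolute http(s) URL with a host is treated as a URL.
	if u, err := url.Parse(s); err == nil && (u.Scheme == "http" || u.Scheme == "https") && u.Host != "" {
		return kindURL
	}
	// Anything that starts with markup is treated as raw HTML.
	if strings.HasPrefix(s, "<") {
		return kindHTML
	}
	// Everything else is assumed to be a file path.
	return kindFile
}

func main() {
	for _, in := range []string{
		"https://example.com",
		"<p>world</p>",
		"index.html",
	} {
		fmt.Printf("%-24q %s\n", in, classify(in))
	}
}
```

A heuristic like this is ambiguous for inputs such as plain text without tags, which is one reason to pass a []byte when the content is known to be HTML.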
Example (Basic) ¶
    package main

    import (
        "fmt"

        "github.com/AstroX11/gopherio"
    )

    func main() {
        doc, _ := gopherio.Load(`<html><body><h1>hello</h1></body></html>`)
        title := doc.Find("h1").Text()
        fmt.Println(title)
    }

Output:

    hello
Example (Combinators) ¶
    package main

    import (
        "fmt"

        "github.com/AstroX11/gopherio"
    )

    func main() {
        html := `
    <div class="wrap">
      <h1>Header</h1>
      <p class="first">para1</p>
      <p>para2</p>
      <span>end</span>
    </div>`
        doc, _ := gopherio.Load(html)

        fmt.Println(doc.Find("div.wrap > h1").Text())
        fmt.Println(doc.Find("h1 + p").Text())
        fmt.Println(doc.Find("h1 ~ span").Text())
    }

Output:

    Header
    para1
    end
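The difference between the `+` and `~` combinators is easy to miss in the example above, because each query matches a single element. The sketch below models the children of div.wrap as a plain slice and applies the standard CSS meaning of each combinator. The `elem` type and both helper functions are hypothetical and exist only for illustration; the sketch assumes gopherio follows standard CSS semantics.

```go
package main

import "fmt"

// elem is a hypothetical stand-in for one element among its siblings.
type elem struct {
	Tag  string
	Text string
}

// adjacent mimics "a + b": every b whose immediately preceding sibling is an a.
func adjacent(sibs []elem, a, b string) []string {
	var out []string
	for i := 1; i < len(sibs); i++ {
		if sibs[i].Tag == b && sibs[i-1].Tag == a {
			out = append(out, sibs[i].Text)
		}
	}
	return out
}

// general mimics "a ~ b": every b that comes after some a among the same siblings.
func general(sibs []elem, a, b string) []string {
	var out []string
	seenA := false
	for _, e := range sibs {
		if seenA && e.Tag == b {
			out = append(out, e.Text)
		}
		if e.Tag == a {
			seenA = true
		}
	}
	return out
}

func main() {
	// The children of div.wrap from the example above.
	sibs := []elem{{"h1", "Header"}, {"p", "para1"}, {"p", "para2"}, {"span", "end"}}

	fmt.Println(adjacent(sibs, "h1", "p"))   // [para1]
	fmt.Println(general(sibs, "h1", "p"))    // [para1 para2]
	fmt.Println(general(sibs, "h1", "span")) // [end]
}
```

Under these semantics, `h1 + p` selects only the paragraph directly after the heading, while `h1 ~ p` would select both paragraphs.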
Example (Groups) ¶
    package main

    import (
        "fmt"

        "github.com/AstroX11/gopherio"
    )

    func main() {
        html := `<div><h1>head</h1><p class="a">p1</p><p>p2</p><span>done</span></div>`
        doc, _ := gopherio.Load(html)

        nodes := doc.Find("h1, p.a, span")
        for _, n := range nodes.Nodes() {
            fmt.Println(n.Data)
        }
    }

Output:

    h1
    p
    span
Example (Pseudo) ¶
    package main

    import (
        "fmt"

        "github.com/AstroX11/gopherio"
    )

    func main() {
        html := `
    <ul>
      <li>one</li>
      <li>two</li>
      <li>three</li>
      <li>four</li>
    </ul>`
        doc, _ := gopherio.Load(html)

        fmt.Println(doc.Find("li:first-child").Text())
        fmt.Println(doc.Find("li:last-child").Text())
        fmt.Println(doc.Find("li:nth-child(2)").Text())
        fmt.Println(doc.Find("li:nth-of-type(3)").Text())
    }

Output:

    one
    four
    two
    three
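In the list above every sibling is an li, so :nth-child and :nth-of-type pick the same positions. They diverge once siblings have mixed tags. The sketch below shows the standard CSS rule with a plain slice: :nth-child counts every sibling, while :nth-of-type counts only siblings with the same tag. The `elem` type and both functions are hypothetical, and the sketch assumes gopherio follows the standard definitions.

```go
package main

import "fmt"

// elem is a hypothetical stand-in for one element among its siblings.
type elem struct {
	Tag  string
	Text string
}

// nthChild mimics "tag:nth-child(n)": the n-th sibling (1-based) must have the tag.
func nthChild(sibs []elem, tag string, n int) (elem, bool) {
	if n < 1 || n > len(sibs) || sibs[n-1].Tag != tag {
		return elem{}, false
	}
	return sibs[n-1], true
}

// nthOfType mimics "tag:nth-of-type(n)": the n-th sibling (1-based) among those with the tag.
func nthOfType(sibs []elem, tag string, n int) (elem, bool) {
	count := 0
	for _, e := range sibs {
		if e.Tag != tag {
			continue
		}
		count++
		if count == n {
			return e, true
		}
	}
	return elem{}, false
}

func main() {
	sibs := []elem{{"h1", "Header"}, {"p", "para1"}, {"p", "para2"}, {"span", "end"}}

	if e, ok := nthChild(sibs, "p", 2); ok {
		fmt.Println(e.Text) // para1: the second sibling overall is a p
	}
	if e, ok := nthOfType(sibs, "p", 2); ok {
		fmt.Println(e.Text) // para2: the second p among the siblings
	}
}
```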
Example (Selectors) ¶
    package main

    import (
        "fmt"

        "github.com/AstroX11/gopherio"
    )

    func main() {
        html := `
    <div id="main" class="container">
      <h1>Title</h1>
      <p class="intro">Hello</p>
      <a href="link1">Link 1</a>
      <span class="note">Note</span>
    </div>`
        doc, _ := gopherio.Load(html)

        fmt.Println(doc.Find("div#main.container a[href]").Text())
        fmt.Println(doc.Find(".intro").Text())
        fmt.Println(doc.Find("#main .note").Text())
    }

Output:

    Link 1
    Hello
    Note
func (*Document) Doc ¶
Doc returns the underlying *html.Node of the document. This allows direct access to the raw parsed DOM tree, useful when you need to inspect or manipulate the document beyond what Document helpers provide.
func (*Document) Find ¶
Find searches the document tree using CSS-like selectors. It supports:
- tag (div, h1)
- id (#main)
- class (.btn)
- attributes ([href="x"])
- descendant/child chains (div.container a[href])
- compound selectors (div#main.container)
Example:
    doc, _ := gopherio.Load(`<div id="main" class="container"><a href="x">link</a></div>`)
    doc.Find("div.container a[href]").Text() // link
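To make the idea of a compound selector concrete, here is a minimal sketch that splits a string such as div#main.container into its tag, id, and class parts. The `compound` type and `parseCompound` function are hypothetical and are not part of gopherio; attribute selectors, pseudo-classes, and combinators are deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// compound holds the parts of one compound selector such as "div#main.container".
// The type and its fields are hypothetical and exist only for this sketch.
type compound struct {
	Tag     string
	ID      string
	Classes []string
}

// parseCompound splits a compound selector made of a tag, "#id", and ".class" parts.
func parseCompound(sel string) compound {
	var c compound
	start := 0
	// flush classifies the part sel[start:end] and advances start.
	flush := func(end int) {
		part := sel[start:end]
		switch {
		case part == "":
			// Nothing to record.
		case strings.HasPrefix(part, "#"):
			c.ID = part[1:]
		case strings.HasPrefix(part, "."):
			c.Classes = append(c.Classes, part[1:])
		default:
			c.Tag = part
		}
		start = end
	}
	// Every '#' or '.' after the first byte starts a new part.
	for i, r := range sel {
		if i > 0 && (r == '#' || r == '.') {
			flush(i)
		}
	}
	flush(len(sel))
	return c
}

func main() {
	fmt.Printf("%+v\n", parseCompound("div#main.container"))
	fmt.Printf("%+v\n", parseCompound(".btn"))
}
```

An element matches a compound selector only when every recorded part matches, which is why div#main.container is narrower than any of its parts alone.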
type Selection ¶
type Selection struct {
// contains filtered or unexported fields
}
func NewSelection ¶
NewSelection creates a Selection from a slice of html.Node pointers.
func (*Selection) After ¶
After inserts the given HTML immediately after each element in the selection.
Example:
    doc, _ := gopherio.Load(`<div><p>hi</p></div>`)
    doc.Find("p").After("<span>after</span>")
    fmt.Println(doc.Find("div").Html()) // <p>hi</p><span>after</span>
func (*Selection) Append ¶
Append inserts the given HTML as the last child of each element in the selection.
Example:
    doc, _ := gopherio.Load(`<div><p>hi</p></div>`)
    doc.Find("div").Append("<span>world</span>")
    fmt.Println(doc.Find("div").Html()) // <p>hi</p><span>world</span>
func (*Selection) Attr ¶
Attr returns the value of the given attribute from the first node in the selection. If the attribute does not exist or the selection is empty, it returns an empty string.
Example:
    doc, _ := gopherio.Load(`<a href="/x">link</a>`)
    fmt.Println(doc.Find("a").Attr("href")) // /x
func (*Selection) Attrs ¶
Attrs returns all attributes of the first node in the selection as a map. If the selection is empty, it returns an empty map.
Example:
    doc, _ := gopherio.Load(`<a href="/x" id="link1">link</a>`)
    fmt.Println(doc.Find("a").Attrs()) // map[href:/x id:link1]
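Because Attrs returns a Go map, ranging over the result visits attributes in an unspecified order, even though fmt prints maps sorted by key. When a stable order matters, the keys can be sorted first. The sketch below uses a map literal as a stand-in for the value Attrs would return; `sortedKeys` is a hypothetical helper, not part of gopherio.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Stand-in for the map that Attrs would return for <a href="/x" id="link1">.
	attrs := map[string]string{"id": "link1", "href": "/x"}

	for _, k := range sortedKeys(attrs) {
		fmt.Printf("%s=%q\n", k, attrs[k])
	}
	// href="/x"
	// id="link1"
}
```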
func (*Selection) Before ¶
Before inserts the given HTML immediately before each element in the selection.
Example:
    doc, _ := gopherio.Load(`<div><p>hi</p></div>`)
    doc.Find("p").Before("<span>before</span>")
    fmt.Println(doc.Find("div").Html()) // <span>before</span><p>hi</p>
func (*Selection) Children ¶
Children returns all direct child elements of the nodes in the selection.
Example:
    doc, _ := gopherio.Load(`<div><p>one</p><p>two</p></div>`)
    fmt.Println(doc.Find("div").Children().Length()) // 2
func (*Selection) Clone ¶
Clone creates a deep copy of all nodes in the selection and returns them as a new selection.
Example:
    doc, _ := gopherio.Load(`<div><p>hello</p></div>`)
    clone := doc.Find("p").Clone()
    fmt.Println(clone.Text()) // hello
func (*Selection) Contains ¶
Contains reduces the selection to elements that contain the given text.
Example:
    doc, _ := gopherio.Load(`<ul><li>foo</li><li>bar</li></ul>`)
    fmt.Println(doc.Find("li").Contains("bar").Text()) // bar
func (*Selection) Each ¶
Each iterates over the nodes in the selection, executing the callback with the index and node wrapped in a new selection. It returns the original selection for chaining.
Example:
    doc, _ := gopherio.Load(`<ul><li>one</li><li>two</li></ul>`)
    doc.Find("li").Each(func(i int, sel *gopherio.Selection) {
        fmt.Println(i, sel.Text())
    })
    // 0 one
    // 1 two
func (*Selection) Empty ¶
Empty returns true if the selection has no nodes.
Example:
    doc, _ := gopherio.Load(`<div></div>`)
    fmt.Println(doc.Find("p").Empty()) // true
func (*Selection) Eq ¶
Eq returns the element at the specified index as a new selection. If the index is out of range, it returns an empty selection.
Example:
    doc, _ := gopherio.Load(`<ul><li>one</li><li>two</li></ul>`)
    fmt.Println(doc.Find("li").Eq(1).Text()) // two
func (*Selection) Filter ¶
Filter reduces the selection to elements that match the given selector.
Example:
    doc, _ := gopherio.Load(`<ul><li class="x">one</li><li>two</li></ul>`)
    fmt.Println(doc.Find("li").Filter(".x").Text()) // one
func (*Selection) Find ¶
Find searches descendants of the selection using a selector (same as Document.Find).
Example:
    doc, _ := gopherio.Load(`<div><p>one</p><p>two</p></div>`)
    doc.Find("div").Find("p").Each(func(i int, sel *gopherio.Selection) {
        fmt.Println(sel.Text())
    })
func (*Selection) First ¶
First returns the first node in the selection as a new selection. If the selection is empty, it returns an empty selection.
Example:
    doc, _ := gopherio.Load(`<ul><li>one</li><li>two</li></ul>`)
    fmt.Println(doc.Find("li").First().Text()) // one
func (*Selection) Has ¶
Has reduces the selection to elements that have at least one descendant matching the given selector.
Example:
    doc, _ := gopherio.Load(`<div><p>inside</p></div><div>empty</div>`)
    fmt.Println(doc.Find("div").Has("p").Length()) // 1
func (*Selection) Html ¶
Html returns the concatenated inner HTML of all nodes in the selection.
Example:
    doc, _ := gopherio.Load(`<div><b>hi</b></div>`)
    fmt.Println(doc.Find("div").Html()) // <b>hi</b>
func (*Selection) Last ¶
Last returns the last node in the selection as a new selection. If the selection is empty, it returns an empty selection.
Example:
    doc, _ := gopherio.Load(`<ul><li>one</li><li>two</li></ul>`)
    fmt.Println(doc.Find("li").Last().Text()) // two
func (*Selection) Length ¶
Length returns the number of nodes in the selection.
Example:
    doc, _ := gopherio.Load(`<ul><li></li><li></li></ul>`)
    fmt.Println(doc.Find("li").Length()) // 2
func (*Selection) Map ¶
Map applies the callback to each node in the selection and returns a slice of results.
Example:
    doc, _ := gopherio.Load(`<ul><li>one</li><li>two</li></ul>`)
    texts := doc.Find("li").Map(func(i int, sel *gopherio.Selection) string {
        return sel.Text()
    })
    fmt.Println(texts) // [one two]
func (*Selection) Next ¶
Next returns the immediately following sibling elements of the nodes in the selection.
Example:
    doc, _ := gopherio.Load(`<ul><li>one</li><li>two</li></ul>`)
    fmt.Println(doc.Find("li").First().Next().Text()) // two
func (*Selection) Nodes ¶
Nodes returns the underlying slice of *html.Node in the selection. This allows direct access to the raw parsed DOM nodes, useful when you need to inspect or manipulate nodes beyond what Selection helpers provide.
func (*Selection) Not ¶
Not removes elements that match the given selector from the selection.
Example:
    doc, _ := gopherio.Load(`<ul><li class="x">one</li><li>two</li></ul>`)
    fmt.Println(doc.Find("li").Not(".x").Text()) // two
func (*Selection) Parent ¶
Parent returns the parent elements of all nodes in the selection, with duplicates removed.
Example:
    doc, _ := gopherio.Load(`<div><p>hi</p></div>`)
    fmt.Println(doc.Find("p").Parent().Nodes()[0].Data) // div
func (*Selection) Prepend ¶
Prepend inserts the given HTML as the first child of each element in the selection.
Example:
    doc, _ := gopherio.Load(`<div><p>hi</p></div>`)
    doc.Find("div").Prepend("<span>start</span>")
    fmt.Println(doc.Find("div").Html()) // <span>start</span><p>hi</p>
func (*Selection) Prev ¶
Prev returns the immediately preceding sibling elements of the nodes in the selection.
Example:
    doc, _ := gopherio.Load(`<ul><li>one</li><li>two</li></ul>`)
    fmt.Println(doc.Find("li").Last().Prev().Text()) // one
func (*Selection) Remove ¶
func (s *Selection) Remove()
Remove deletes the nodes in the selection from their parent.
Example:
    doc, _ := gopherio.Load(`<div><p>hello</p><p>bye</p></div>`)
    doc.Find("p").First().Remove()
    fmt.Println(doc.Find("p").Text()) // bye
func (*Selection) ReplaceWith ¶
ReplaceWith replaces each element in the selection with the given HTML.
Example:
    doc, _ := gopherio.Load(`<div><p>hi</p></div>`)
    doc.Find("p").ReplaceWith("<span>hello</span>")
    fmt.Println(doc.Find("div").Html()) // <span>hello</span>
func (*Selection) Siblings ¶
Siblings returns all sibling elements of the nodes in the selection (excluding themselves).
Example:
    doc, _ := gopherio.Load(`<ul><li>one</li><li>two</li><li>three</li></ul>`)
    fmt.Println(doc.Find("li").Eq(1).Siblings().Length()) // 2
func (*Selection) Unwrap ¶
func (s *Selection) Unwrap()
Unwrap removes the parent of each element in the selection, leaving the element in the parent's place.
Example:
    doc, _ := gopherio.Load(`<div><section><p>hi</p></section></div>`)
    doc.Find("p").Unwrap()
    fmt.Println(doc.Find("div").Html()) // <p>hi</p>
func (*Selection) Wrap ¶
Wrap wraps each element in the selection inside the given HTML structure.
Example:
    doc, _ := gopherio.Load(`<div><p>hi</p></div>`)
    doc.Find("p").Wrap("<section class='wrap'></section>")
    fmt.Println(doc.Find("div").Html()) // <section class="wrap"><p>hi</p></section>
func (*Selection) WrapAll ¶
WrapAll wraps the entire selection with a single wrapper element.
Example:
    doc, _ := gopherio.Load(`<div><p>a</p><p>b</p></div>`)
    doc.Find("p").WrapAll("<section></section>")
    fmt.Println(doc.Find("div").Html()) // <section><p>a</p><p>b</p></section>