gohtmlutil

Version: v0.0.0-...-461a48c
Published: May 15, 2014 License: MIT Imports: 3 Imported by: 0

README

gohtmlutil

Utilities for HTML Parsing

The package currently implements a single function, Find, which walks the DOM tree to locate a particular node.

Documentation

Overview

gohtmlutil provides helper functions for use with the package `code.google.com/p/go.net/html` (since republished as `golang.org/x/net/html`).

Functions

func Find

func Find(root *html.Node, path string) (node *html.Node, ok bool)

Find locates a particular node in the tree, starting from root. The path is a slash-separated list of tokens, each of which is an element name, a #name, or a .class. An element may be combined with a name or a class, such as div#contentName or ul.listClass. A token may also be prefixed with a count N and an asterisk (e.g. 2*span), which selects the Nth match. The second return value, ok, reports whether a matching node was found.

Example
document := `<html><body><div>
		<span>Some text</span>
		<span name="abc">ABC</span>
		<span class="fancytext">Fancy Text</span>
		</div>
		</body>
		</html>`
root, _ := html.Parse(strings.NewReader(document))

node, _ := Find(root, "html/body/div/#abc")
fmt.Println("Text for #abc is", node.FirstChild.Data)

node, _ = Find(root, "html/body/div/span.fancytext")
fmt.Println("Text for span.fancytext is", node.FirstChild.Data)

node, _ = Find(root, "html/body/div/2*span")
fmt.Println("Text for 2nd span element is", node.FirstChild.Data)

divNode, _ := Find(root, "html/body/div")
node, _ = Find(divNode, "3*span")
fmt.Println("Text for 3rd span element is", node.FirstChild.Data)
Output:

Text for #abc is ABC
Text for span.fancytext is Fancy Text
Text for 2nd span element is ABC
Text for 3rd span element is Fancy Text
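
The path syntax that Find accepts can be illustrated with a small, self-contained parser. This is only a sketch of the token grammar described above, not the package's actual implementation: the names token, parseToken, and parsePath are hypothetical, and it assumes a token carries at most one of #name or .class and that a missing count means the first match.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// token is a hypothetical parsed form of one path segment.
type token struct {
	Count   int    // 1-based match index; 1 when no "N*" prefix is given (assumption)
	Element string // element name, empty if not specified
	Name    string // text after '#', empty if not specified
	Class   string // text after '.', empty if not specified
}

// parseToken splits one segment such as "2*span" or "ul.listClass"
// into its count, element, name, and class parts.
func parseToken(s string) (token, error) {
	t := token{Count: 1}

	// Optional "N*" prefix selects the Nth match.
	if i := strings.Index(s, "*"); i >= 0 {
		n, err := strconv.Atoi(s[:i])
		if err != nil || n < 1 {
			return token{}, fmt.Errorf("bad count in %q", s)
		}
		t.Count = n
		s = s[i+1:]
	}

	// Everything before the first '#' or '.' is the element name.
	if i := strings.IndexAny(s, "#."); i >= 0 {
		t.Element = s[:i]
		switch s[i] {
		case '#':
			t.Name = s[i+1:]
		case '.':
			t.Class = s[i+1:]
		}
	} else {
		t.Element = s
	}

	if t.Element == "" && t.Name == "" && t.Class == "" {
		return token{}, fmt.Errorf("empty token")
	}
	return t, nil
}

// parsePath splits a slash-separated path into tokens.
func parsePath(path string) ([]token, error) {
	var toks []token
	for _, seg := range strings.Split(path, "/") {
		t, err := parseToken(seg)
		if err != nil {
			return nil, err
		}
		toks = append(toks, t)
	}
	return toks, nil
}

func main() {
	toks, err := parsePath("html/body/div/2*span")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range toks {
		fmt.Printf("%+v\n", t)
	}
}
```

A matcher built on such tokens would descend one level of the tree per token, which is consistent with the example above where "3*span" is resolved relative to divNode rather than the document root.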
