xmlquery

package module
v1.1.6 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 6, 2019 License: MIT Imports: 11 Imported by: 0

README

xmlquery

Build Status Coverage Status GoDoc Go Report Card

Overview

xmlquery is an XPath query package for XML document, lets you extract data or evaluate from XML documents by an XPath expression.

Change Logs

2018-12-23

  • added XML output will including comment node. #9

2018-12-03

  • added support attribute name with namespace prefix and XML output. #6

Installation

$ go get github.com/antchfx/xmlquery

Getting Started

Parse a XML from URL.
doc, err := xmlquery.LoadURL("http://www.example.com/sitemap.xml")
Parse a XML from string.
s := `<?xml version="1.0" encoding="utf-8"?><rss version="2.0"></rss>`
doc, err := xmlquery.Parse(strings.NewReader(s))
Parse a XML from io.Reader.
f, err := os.Open("../books.xml")
doc, err := xmlquery.Parse(f)
Find authors of all books in the bookstore.
list := xmlquery.Find(doc, "//book//author")
// or
list := xmlquery.Find(doc, "//author")
Find the second book.
book := xmlquery.FindOne(doc, "//book[2]")
Find all book elements and only get id attribute self. (New Feature)
list := xmlquery.Find(doc,"//book/@id")
Find all books with id is bk104.
list := xmlquery.Find(doc, "//book[@id='bk104']")
Find all books that price less than 5.
list := xmlquery.Find(doc, "//book[price<5]")
Evaluate the total price of all books.
expr, err := xpath.Compile("sum(//book/price)")
price := expr.Evaluate(xmlquery.CreateXPathNavigator(doc)).(float64)
fmt.Printf("total price: %f\n", price)
Evaluate the number of all books element.
expr, err := xpath.Compile("count(//book)")
price := expr.Evaluate(xmlquery.CreateXPathNavigator(doc)).(float64)
Create XML document.
doc := &xmlquery.Node{
	Type: xmlquery.DeclarationNode,
	Data: "xml",
	Attr: []xml.Attr{
		xml.Attr{Name: xml.Name{Local: "version"}, Value: "1.0"},
	},
}
root := &xmlquery.Node{
	Data: "rss",
	Type: xmlquery.ElementNode,
}
doc.FirstChild = root
channel := &xmlquery.Node{
	Data: "channel",
	Type: xmlquery.ElementNode,
}
root.FirstChild = channel
title := &xmlquery.Node{
	Data: "title",
	Type: xmlquery.ElementNode,
}
title_text := &xmlquery.Node{
	Data: "W3Schools Home Page",
	Type: xmlquery.TextNode,
}
title.FirstChild = title_text
channel.FirstChild = title
fmt.Println(doc.OutputXML(true))
// <?xml version="1.0"?><rss><channel><title>W3Schools Home Page</title></channel></rss>

Quick Tutorial

import (
	"github.com/antchfx/xmlquery"
)

func main(){
	s := `<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
  <title>W3Schools Home Page</title>
  <link>https://www.w3schools.com</link>
  <description>Free web building tutorials</description>
  <item>
    <title>RSS Tutorial</title>
    <link>https://www.w3schools.com/xml/xml_rss.asp</link>
    <description>New RSS tutorial on W3Schools</description>
  </item>
  <item>
    <title>XML Tutorial</title>
    <link>https://www.w3schools.com/xml</link>
    <description>New XML tutorial on W3Schools</description>
  </item>
</channel>
</rss>`

	doc, err := xmlquery.Parse(strings.NewReader(s))
	if err != nil {
		panic(err)
	}
	channel := xmlquery.FindOne(doc, "//channel")
	if n := channel.SelectElement("title"); n != nil {
		fmt.Printf("title: %s\n", n.InnerText())
	}
	if n := channel.SelectElement("link"); n != nil {
		fmt.Printf("link: %s\n", n.InnerText())
	}
	for i, n := range xmlquery.Find(doc, "//item/title") {
		fmt.Printf("#%d %s\n", i, n.InnerText())
	}
}

List of supported XPath query packages

Name Description
htmlquery XPath query package for the HTML document
xmlquery XPath query package for the XML document
jsonquery XPath query package for the JSON document

Questions

Please let me know if you have any questions

Documentation

Overview

Package xmlquery provides extract data from XML documents using XPath expression.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func FindEach

func FindEach(top *Node, expr string, cb func(int, *Node))

FindEach searches the html.Node and calls functions cb. Important: this method has deprecated, recommend use for .. = range Find(){}.

func FindEachWithBreak

func FindEachWithBreak(top *Node, expr string, cb func(int, *Node) bool)

FindEachWithBreak functions the same as FindEach but allows you to break the loop by returning false from your callback function, cb. Important: this method has deprecated, recommend use for .. = range Find(){}.

Types

type Node

type Node struct {
	Parent, FirstChild, LastChild, PrevSibling, NextSibling *Node

	Type         NodeType
	Data         string
	Prefix       string
	NamespaceURI string
	Attr         []xml.Attr

	// Application specific field that is never encoded to XML
	Info interface{}
	// contains filtered or unexported fields
}

A Node consists of a NodeType and some Data (tag name for element nodes, content for text) and are part of a tree of Nodes.

func Find

func Find(top *Node, expr string) []*Node

Find searches the Node that matches by the specified XPath expr.

func FindOne

func FindOne(top *Node, expr string) *Node

FindOne searches the Node that matches by the specified XPath expr, and returns first element of matched.

func LoadURL

func LoadURL(url string) (*Node, error)

LoadURL loads the XML document from the specified URL.

func Parse

func Parse(r io.Reader) (*Node, error)

Parse returns the parse tree for the XML from the given Reader.

func (*Node) AddAfter added in v1.1.0

func (n *Node) AddAfter(sibling *Node)

func (*Node) AddBefore added in v1.1.0

func (n *Node) AddBefore(sibling *Node)

func (*Node) AddChild added in v1.1.0

func (n *Node) AddChild(child *Node)

func (*Node) AddSibling added in v1.1.0

func (n *Node) AddSibling(sibling *Node)

func (*Node) AppendAttr added in v1.1.2

func (n *Node) AppendAttr(key, val string)

Useful for the @class HTML attribute.

func (*Node) DelAttr added in v1.1.0

func (n *Node) DelAttr(key string) bool

Returns true if the attribute existed and was deleted; false otherwise.

func (*Node) DeleteMe added in v1.1.0

func (n *Node) DeleteMe()

Dereference this node from others so GC can delete them. Also fixes pointers of other nodes.

func (*Node) GetAttr added in v1.1.0

func (n *Node) GetAttr(key string) (string, bool)

func (*Node) GetAttrWithDefault added in v1.1.0

func (n *Node) GetAttrWithDefault(key, empty string) string

func (*Node) InnerText

func (n *Node) InnerText() string

InnerText returns the text between the start and end tags of the object.

func (*Node) IsEmpty added in v1.1.5

func (n *Node) IsEmpty() bool

Returns true if and only if the node is text consisting only of whitespaces

func (*Node) NthChild added in v1.1.0

func (n *Node) NthChild() int

func (*Node) NthChildOfElem added in v1.1.0

func (n *Node) NthChildOfElem() int

func (*Node) OutputPrettyXML added in v1.1.5

func (n *Node) OutputPrettyXML(self bool) string

Same as OutputXML, but pretty.

func (*Node) OutputXML

func (n *Node) OutputXML(self bool) string

OutputXML returns the text that including tags name.

func (*Node) OutputXMLToWriter added in v1.1.0

func (n *Node) OutputXMLToWriter(output io.Writer, self bool, pretty bool)

Same as OutputXML, but different.

func (*Node) Reparent added in v1.1.2

func (n *Node) Reparent(new_parent *Node)

Inserts a node between this and the old parent.

func (*Node) SelectAttr

func (n *Node) SelectAttr(name string) string

SelectAttr returns the attribute value with the specified name.

func (*Node) SelectElement

func (n *Node) SelectElement(name string) *Node

SelectElement finds child elements with the specified name.

func (*Node) SelectElements

func (n *Node) SelectElements(name string) []*Node

SelectElements finds child elements with the specified name.

func (*Node) SetAttr added in v1.1.0

func (n *Node) SetAttr(key, val string) bool

Returns true if the attribute existed and was altered; false if it was added.

func (*Node) String added in v1.1.0

func (n *Node) String() string

func (*Node) TrimText added in v1.1.5

func (n *Node) TrimText() string

Replaces all whitespaces sequences with a single whitespace and trims both extremities.

type NodeNavigator

type NodeNavigator struct {
	// contains filtered or unexported fields
}

func CreateXPathNavigator

func CreateXPathNavigator(top *Node) *NodeNavigator

CreateXPathNavigator creates a new xpath.NodeNavigator for the specified html.Node.

func (*NodeNavigator) Copy

func (x *NodeNavigator) Copy() xpath.NodeNavigator

func (*NodeNavigator) Current

func (x *NodeNavigator) Current() *Node

func (*NodeNavigator) LocalName

func (x *NodeNavigator) LocalName() string

func (*NodeNavigator) MoveTo

func (x *NodeNavigator) MoveTo(other xpath.NodeNavigator) bool

func (*NodeNavigator) MoveToChild

func (x *NodeNavigator) MoveToChild() bool

func (*NodeNavigator) MoveToFirst

func (x *NodeNavigator) MoveToFirst() bool

func (*NodeNavigator) MoveToNext

func (x *NodeNavigator) MoveToNext() bool

func (*NodeNavigator) MoveToNextAttribute

func (x *NodeNavigator) MoveToNextAttribute() bool

func (*NodeNavigator) MoveToParent

func (x *NodeNavigator) MoveToParent() bool

func (*NodeNavigator) MoveToPrevious

func (x *NodeNavigator) MoveToPrevious() bool

func (*NodeNavigator) MoveToRoot

func (x *NodeNavigator) MoveToRoot()

func (*NodeNavigator) NodeType

func (x *NodeNavigator) NodeType() xpath.NodeType

func (*NodeNavigator) Prefix

func (x *NodeNavigator) Prefix() string

func (*NodeNavigator) String

func (x *NodeNavigator) String() string

func (*NodeNavigator) Value

func (x *NodeNavigator) Value() string

type NodeType

type NodeType uint

A NodeType is the type of a Node.

const (
	// DocumentNode is a document object that, as the root of the document tree,
	// provides access to the entire XML document.
	DocumentNode NodeType = iota
	// DeclarationNode is the document type declaration, indicated by the following
	// tag (for example, <!DOCTYPE...> ).
	DeclarationNode
	// ElementNode is an element (for example, <item> ).
	ElementNode
	// TextNode is the text content of a node.
	TextNode
	// CommentNode a comment (for example, <!-- my comment --> ).
	CommentNode
	// AttributeNode is an attribute of element.
	AttributeNode
)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL