xmlquery

package module

v1.1.6 Published: Apr 6, 2019 License: MIT Imports: 11 Imported by: 0

Repository

github.com/gjvnq/xmlquery

README ¶

xmlquery

Overview

xmlquery is an XPath query package for XML documents. It lets you extract data from, or evaluate expressions against, XML documents using XPath.

Change Logs

2018-12-23

Added: XML output now includes comment nodes. #9

2018-12-03

Added support for attribute names with a namespace prefix and for XML output. #6

Installation

$ go get github.com/antchfx/xmlquery

Getting Started

Parse an XML document from a URL.

doc, err := xmlquery.LoadURL("http://www.example.com/sitemap.xml")

Parse an XML document from a string.

s := `<?xml version="1.0" encoding="utf-8"?><rss version="2.0"></rss>`
doc, err := xmlquery.Parse(strings.NewReader(s))

Parse an XML document from an io.Reader.

f, err := os.Open("../books.xml")
doc, err := xmlquery.Parse(f)

Find authors of all books in the bookstore.

list := xmlquery.Find(doc, "//book//author")
// or
list := xmlquery.Find(doc, "//author")

Find the second book.

book := xmlquery.FindOne(doc, "//book[2]")

Find the `id` attribute of every book element. (New Feature)

list := xmlquery.Find(doc,"//book/@id")

Find all books whose id is bk104.

list := xmlquery.Find(doc, "//book[@id='bk104']")

Find all books with a price less than 5.

list := xmlquery.Find(doc, "//book[price<5]")

Evaluate the total price of all books.

expr, err := xpath.Compile("sum(//book/price)")
price := expr.Evaluate(xmlquery.CreateXPathNavigator(doc)).(float64)
fmt.Printf("total price: %f\n", price)

Evaluate the number of book elements.

expr, err := xpath.Compile("count(//book)")
count := expr.Evaluate(xmlquery.CreateXPathNavigator(doc)).(float64)

Create XML document.

doc := &xmlquery.Node{
	Type: xmlquery.DeclarationNode,
	Data: "xml",
	Attr: []xml.Attr{
		xml.Attr{Name: xml.Name{Local: "version"}, Value: "1.0"},
	},
}
root := &xmlquery.Node{
	Data: "rss",
	Type: xmlquery.ElementNode,
}
doc.FirstChild = root
channel := &xmlquery.Node{
	Data: "channel",
	Type: xmlquery.ElementNode,
}
root.FirstChild = channel
title := &xmlquery.Node{
	Data: "title",
	Type: xmlquery.ElementNode,
}
title_text := &xmlquery.Node{
	Data: "W3Schools Home Page",
	Type: xmlquery.TextNode,
}
title.FirstChild = title_text
channel.FirstChild = title
fmt.Println(doc.OutputXML(true))
// <?xml version="1.0"?><rss><channel><title>W3Schools Home Page</title></channel></rss>

Quick Tutorial

package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/xmlquery"
)

func main(){
	s := `<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
  <title>W3Schools Home Page</title>
  <link>https://www.w3schools.com</link>
  <description>Free web building tutorials</description>
  <item>
    <title>RSS Tutorial</title>
    <link>https://www.w3schools.com/xml/xml_rss.asp</link>
    <description>New RSS tutorial on W3Schools</description>
  </item>
  <item>
    <title>XML Tutorial</title>
    <link>https://www.w3schools.com/xml</link>
    <description>New XML tutorial on W3Schools</description>
  </item>
</channel>
</rss>`

	doc, err := xmlquery.Parse(strings.NewReader(s))
	if err != nil {
		panic(err)
	}
	channel := xmlquery.FindOne(doc, "//channel")
	if n := channel.SelectElement("title"); n != nil {
		fmt.Printf("title: %s\n", n.InnerText())
	}
	if n := channel.SelectElement("link"); n != nil {
		fmt.Printf("link: %s\n", n.InnerText())
	}
	for i, n := range xmlquery.Find(doc, "//item/title") {
		fmt.Printf("#%d %s\n", i, n.InnerText())
	}
}

List of supported XPath query packages

Name	Description
htmlquery	XPath query package for the HTML document
xmlquery	XPath query package for the XML document
jsonquery	XPath query package for the JSON document

Questions

Please let me know if you have any questions.

Documentation ¶

Overview ¶

Package xmlquery extracts data from XML documents using XPath expressions.

Index ¶

func FindEach(top *Node, expr string, cb func(int, *Node))
func FindEachWithBreak(top *Node, expr string, cb func(int, *Node) bool)
type Node
type NodeNavigator
- func CreateXPathNavigator(top *Node) *NodeNavigator
type NodeType

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func FindEach ¶

func FindEach(top *Node, expr string, cb func(int, *Node))

FindEach searches the Node for matches of the XPath expr and calls the function cb for each match. Important: this function is deprecated; use for .. = range Find(){} instead.

func FindEachWithBreak ¶

func FindEachWithBreak(top *Node, expr string, cb func(int, *Node) bool)

FindEachWithBreak works the same as FindEach but lets you break out of the loop by returning false from your callback function, cb. Important: this function is deprecated; use for .. = range Find(){} instead.
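The move from the callback style to a plain range loop can be sketched as follows. To keep the sketch runnable without the package, a slice of strings stands in for the []*Node that Find returns, and eachWithBreak, takeCallback and takeRange are hypothetical names that are not part of xmlquery.

```go
package main

import "fmt"

// eachWithBreak is a hypothetical callback-style iterator in the spirit of
// FindEachWithBreak: it stops as soon as cb returns false.
func eachWithBreak(items []string, cb func(int, string) bool) {
	for i, it := range items {
		if !cb(i, it) {
			return
		}
	}
}

// takeCallback collects items in the callback style and stops once limit
// items have been collected.
func takeCallback(items []string, limit int) []string {
	var out []string
	eachWithBreak(items, func(i int, s string) bool {
		out = append(out, s)
		return len(out) < limit
	})
	return out
}

// takeRange does the same with a range loop, where returning false from
// the callback becomes an ordinary break.
func takeRange(items []string, limit int) []string {
	var out []string
	for _, s := range items {
		out = append(out, s)
		if len(out) >= limit {
			break
		}
	}
	return out
}

func main() {
	titles := []string{"RSS Tutorial", "XML Tutorial", "XPath Tutorial"}
	fmt.Println(takeCallback(titles, 2))
	fmt.Println(takeRange(titles, 2))
}
```

With the real package, the loop in takeRange would range over xmlquery.Find(doc, expr) instead of a string slice.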

Types ¶

type Node ¶

type Node struct {
	Parent, FirstChild, LastChild, PrevSibling, NextSibling *Node

	Type         NodeType
	Data         string
	Prefix       string
	NamespaceURI string
	Attr         []xml.Attr

	// Application specific field that is never encoded to XML
	Info interface{}
	// contains filtered or unexported fields
}

A Node consists of a NodeType and some Data (tag name for element nodes, content for text nodes) and is part of a tree of Nodes.
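How the link fields form a tree can be sketched with a small stand-in type. The node type below is hypothetical: it copies only the five link fields and Data from the struct above so that the example runs without the package, and appendChild, walk and outline are illustrative helpers, not xmlquery functions.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in for xmlquery.Node that keeps only the
// link fields and Data.
type node struct {
	Parent, FirstChild, LastChild, PrevSibling, NextSibling *node

	Data string
}

// appendChild links child as the last child of parent, keeping the parent
// and sibling pointers consistent.
func appendChild(parent, child *node) {
	child.Parent = parent
	child.PrevSibling = parent.LastChild
	if parent.LastChild != nil {
		parent.LastChild.NextSibling = child
	} else {
		parent.FirstChild = child
	}
	parent.LastChild = child
}

// walk visits n and its descendants depth-first, in document order, by
// following FirstChild and then NextSibling.
func walk(n *node, depth int, visit func(n *node, depth int)) {
	visit(n, depth)
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		walk(c, depth+1, visit)
	}
}

// outline renders the tree as one indented line per node.
func outline(root *node) string {
	var sb strings.Builder
	walk(root, 0, func(n *node, depth int) {
		sb.WriteString(strings.Repeat("  ", depth) + n.Data + "\n")
	})
	return sb.String()
}

func main() {
	rss := &node{Data: "rss"}
	channel := &node{Data: "channel"}
	appendChild(rss, channel)
	appendChild(channel, &node{Data: "title"})
	appendChild(channel, &node{Data: "link"})
	fmt.Print(outline(rss))
}
```

The same FirstChild/NextSibling loop should carry over to a real *xmlquery.Node, since the exported link fields have the same names.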

func Find ¶

func Find(top *Node, expr string) []*Node

Find searches the Node and returns all nodes that match the specified XPath expr.

func FindOne ¶

func FindOne(top *Node, expr string) *Node

FindOne searches the Node for matches of the specified XPath expr and returns the first match.

func LoadURL ¶

func LoadURL(url string) (*Node, error)

LoadURL loads the XML document from the specified URL.

func Parse ¶

func Parse(r io.Reader) (*Node, error)

Parse returns the parse tree for the XML from the given Reader.

func (*Node) AddAfter ¶ added in v1.1.0

func (n *Node) AddAfter(sibling *Node)

func (*Node) AddBefore ¶ added in v1.1.0

func (n *Node) AddBefore(sibling *Node)

func (*Node) AddChild ¶ added in v1.1.0

func (n *Node) AddChild(child *Node)

func (*Node) AddSibling ¶ added in v1.1.0

func (n *Node) AddSibling(sibling *Node)

func (*Node) AppendAttr ¶ added in v1.1.2

func (n *Node) AppendAttr(key, val string)

Useful for the @class HTML attribute.

func (*Node) DelAttr ¶ added in v1.1.0

func (n *Node) DelAttr(key string) bool

Returns true if the attribute existed and was deleted; false otherwise.

func (*Node) DeleteMe ¶ added in v1.1.0

func (n *Node) DeleteMe()

Dereference this node from others so GC can delete them. Also fixes pointers of other nodes.

func (*Node) GetAttr ¶ added in v1.1.0

func (n *Node) GetAttr(key string) (string, bool)

func (*Node) GetAttrWithDefault ¶ added in v1.1.0

func (n *Node) GetAttrWithDefault(key, empty string) string

func (*Node) InnerText ¶

func (n *Node) InnerText() string

InnerText returns the text between the start and end tags of the object.

func (*Node) IsEmpty ¶ added in v1.1.5

func (n *Node) IsEmpty() bool

Returns true if and only if the node is a text node consisting only of whitespace.

func (*Node) NthChild ¶ added in v1.1.0

func (n *Node) NthChild() int

func (*Node) NthChildOfElem ¶ added in v1.1.0

func (n *Node) NthChildOfElem() int

func (*Node) OutputPrettyXML ¶ added in v1.1.5

func (n *Node) OutputPrettyXML(self bool) string

Same as OutputXML, but pretty-printed.

func (*Node) OutputXML ¶

func (n *Node) OutputXML(self bool) string

OutputXML returns the node's content as XML text, including tag names.

func (*Node) OutputXMLToWriter ¶ added in v1.1.0

func (n *Node) OutputXMLToWriter(output io.Writer, self bool, pretty bool)

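Same as OutputXML, but writes the result to output instead of returning a string. Judging by the signature, pretty selects pretty-printed output as in OutputPrettyXML.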

func (*Node) Reparent ¶ added in v1.1.2

func (n *Node) Reparent(new_parent *Node)

Inserts a node between this and the old parent.

func (*Node) SelectAttr ¶

func (n *Node) SelectAttr(name string) string

SelectAttr returns the attribute value with the specified name.

func (*Node) SelectElement ¶

func (n *Node) SelectElement(name string) *Node

SelectElement finds the first child element with the specified name.

func (*Node) SelectElements ¶

func (n *Node) SelectElements(name string) []*Node

SelectElements finds child elements with the specified name.

func (*Node) SetAttr ¶ added in v1.1.0

func (n *Node) SetAttr(key, val string) bool

Returns true if the attribute existed and was altered; false if it was added.
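The return convention described above can be sketched on a plain []xml.Attr, which is the type of Node.Attr. The setAttr helper below is hypothetical and is not the package's code; matching on Name.Local alone is an assumption made for the sketch.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// setAttr is a hypothetical helper. It updates the first attribute whose
// local name equals key and reports true; if no such attribute exists it
// appends a new one and reports false.
func setAttr(attrs []xml.Attr, key, val string) ([]xml.Attr, bool) {
	for i := range attrs {
		if attrs[i].Name.Local == key {
			attrs[i].Value = val
			return attrs, true
		}
	}
	return append(attrs, xml.Attr{Name: xml.Name{Local: key}, Value: val}), false
}

func main() {
	attrs := []xml.Attr{{Name: xml.Name{Local: "id"}, Value: "bk101"}}

	// The attribute exists, so it is altered in place.
	attrs, existed := setAttr(attrs, "id", "bk104")
	fmt.Println(existed, attrs[0].Value)

	// The attribute does not exist, so it is added.
	attrs, existed = setAttr(attrs, "lang", "en")
	fmt.Println(existed, len(attrs))
}
```

Returning a bool lets a caller tell an update from an insert without a separate GetAttr lookup.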

func (*Node) String ¶ added in v1.1.0

func (n *Node) String() string

func (*Node) TrimText ¶ added in v1.1.5

func (n *Node) TrimText() string

Replaces every sequence of whitespace with a single space and trims both ends.
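The described normalization can be reproduced with the standard library alone. This is a minimal sketch of the behavior as documented, not the package's implementation, and collapseSpace is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseSpace splits s on runs of whitespace and rejoins the pieces with
// single spaces, which also drops leading and trailing whitespace.
func collapseSpace(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	fmt.Printf("%q\n", collapseSpace("  W3Schools \n\t Home   Page "))
}
```

strings.Fields treats any Unicode whitespace as a separator; whether TrimText uses the same definition of whitespace is not stated in the documentation above.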

type NodeNavigator ¶

type NodeNavigator struct {
	// contains filtered or unexported fields
}

func CreateXPathNavigator ¶

func CreateXPathNavigator(top *Node) *NodeNavigator

CreateXPathNavigator creates a new xpath.NodeNavigator for the specified Node.

func (*NodeNavigator) Copy ¶

func (x *NodeNavigator) Copy() xpath.NodeNavigator

func (*NodeNavigator) Current ¶

func (x *NodeNavigator) Current() *Node

func (*NodeNavigator) LocalName ¶

func (x *NodeNavigator) LocalName() string

func (*NodeNavigator) MoveTo ¶

func (x *NodeNavigator) MoveTo(other xpath.NodeNavigator) bool

func (*NodeNavigator) MoveToChild ¶

func (x *NodeNavigator) MoveToChild() bool

func (*NodeNavigator) MoveToFirst ¶

func (x *NodeNavigator) MoveToFirst() bool

func (*NodeNavigator) MoveToNext ¶

func (x *NodeNavigator) MoveToNext() bool

func (*NodeNavigator) MoveToNextAttribute ¶

func (x *NodeNavigator) MoveToNextAttribute() bool

func (*NodeNavigator) MoveToParent ¶

func (x *NodeNavigator) MoveToParent() bool

func (*NodeNavigator) MoveToPrevious ¶

func (x *NodeNavigator) MoveToPrevious() bool

func (*NodeNavigator) MoveToRoot ¶

func (x *NodeNavigator) MoveToRoot()

func (*NodeNavigator) NodeType ¶

func (x *NodeNavigator) NodeType() xpath.NodeType

func (*NodeNavigator) Prefix ¶

func (x *NodeNavigator) Prefix() string

func (*NodeNavigator) String ¶

func (x *NodeNavigator) String() string

func (*NodeNavigator) Value ¶

func (x *NodeNavigator) Value() string

type NodeType ¶

type NodeType uint

A NodeType is the type of a Node.

const (
	// DocumentNode is a document object that, as the root of the document tree,
	// provides access to the entire XML document.
	DocumentNode NodeType = iota
	// DeclarationNode is a declaration such as the XML declaration
	// (for example, <?xml version="1.0"?> ).
	DeclarationNode
	// ElementNode is an element (for example, <item> ).
	ElementNode
	// TextNode is the text content of a node.
	TextNode
	// CommentNode is a comment (for example, <!-- my comment --> ).
	CommentNode
	// AttributeNode is an attribute of element.
	AttributeNode
)
