xmlx

package module
v0.0.0-...-76f54ee Latest
Warning

This package is not in the latest version of its module.

Published: Dec 1, 2015 License: CC0-1.0 Imports: 12 Imported by: 1

README

XMLX

This package wraps the standard XML library and uses it to build a node tree of any document you load. This allows you to look up nodes forwards and backwards, as well as perform search queries (no xpath support).

Nodes now simply become collections and don't require you to read them in the order in which the xml.Parser finds them.
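The idea of turning the decoder's token stream into a navigable tree can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the `node` type and the `buildTree` helper are hypothetical names, and whitespace trimming of text is an assumption made for brevity.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// node is a hypothetical stand-in for the package's Node type.
type node struct {
	Name     xml.Name
	Value    string
	Parent   *node
	Children []*node
}

// buildTree reads every token of doc and links the elements into a tree
// hanging off an empty root node.
func buildTree(doc string) (*node, error) {
	root := &node{}
	cur := root
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return root, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			// Descend: the new element becomes the current node.
			child := &node{Name: t.Name, Parent: cur}
			cur.Children = append(cur.Children, child)
			cur = child
		case xml.CharData:
			// Assumption: surrounding whitespace is not significant.
			cur.Value += strings.TrimSpace(string(t))
		case xml.EndElement:
			// Ascend back to the parent.
			cur = cur.Parent
		}
	}
}

func main() {
	root, err := buildTree("<car><color>red</color><brand>BMW</brand></car>")
	if err != nil {
		panic(err)
	}
	brand := root.Children[0].Children[1]
	fmt.Println(brand.Name.Local, brand.Value)  // prints "brand BMW"
	fmt.Println(brand.Parent.Children[0].Value) // prints "red"
}
```

Because every node keeps both a `Parent` pointer and a `Children` slice, the tree can be walked in either direction once loading has finished.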

Dependencies

None.

API

The Document currently implements 2 simple search functions which allow you to look for specific nodes.

*document.SelectNode(namespace, name string) *Node;
*document.SelectNodes(namespace, name string) []*Node;

SelectNode() returns the first, single node it finds matching the given name and namespace. SelectNodes() returns a slice containing all the matching nodes.

Note that these search functions can be invoked on individual nodes as well. This allows you to search only a subset of the entire document.

Each node also exposes a number of functions which allow easy access to a node value or an attribute value. They come in various forms to allow transparent conversion to string, bool, and the signed integer, unsigned integer and float types:

*node.S(ns, name string) string
*node.I(ns, name string) int
*node.I8(ns, name string) int8
*node.I16(ns, name string) int16
*node.I32(ns, name string) int32
*node.I64(ns, name string) int64
*node.U(ns, name string) uint
*node.U8(ns, name string) uint8
*node.U16(ns, name string) uint16
*node.U32(ns, name string) uint32
*node.U64(ns, name string) uint64
*node.F32(ns, name string) float32
*node.F64(ns, name string) float64
*node.B(ns, name string) bool

Note that these functions actually consider child nodes for matching names as well as the current node. In effect they first perform a node.SelectNode() and then return the value of the resulting node converted to the appropriate type.

Consider this piece of XML:

<car>
   <color>red</color>
   <brand>BMW</brand>
</car>

Now this code:

node := doc.SelectNode("", "car")
brand := node.S("", "brand")

Even though brand is not the name of node, we still get the right value back (BMW), because node.S() searches through the child nodes when looking for the value if the current node does not match the given namespace and name.
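The "select first, then convert" behaviour described above can be sketched as follows. This is a hypothetical re-implementation for illustration only: the `node` type, the lower-case method names and the extra `doors` element are assumptions, and the real package may differ in detail.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strconv"
)

// node is a hypothetical stand-in for the package's Node type.
type node struct {
	Name     xml.Name
	Value    string
	Children []*node
}

// selectNode returns n itself if it matches, otherwise the first matching
// descendant found depth-first, or nil if nothing matches.
func (n *node) selectNode(ns, name string) *node {
	if n.Name.Space == ns && n.Name.Local == name {
		return n
	}
	for _, c := range n.Children {
		if m := c.selectNode(ns, name); m != nil {
			return m
		}
	}
	return nil
}

// s mimics the string accessor: select, then return the value or "".
func (n *node) s(ns, name string) string {
	if m := n.selectNode(ns, name); m != nil {
		return m.Value
	}
	return ""
}

// i mimics the int accessor: base-10 conversion, 0 on any failure.
func (n *node) i(ns, name string) int {
	v, err := strconv.ParseInt(n.s(ns, name), 10, 0)
	if err != nil {
		return 0
	}
	return int(v)
}

// newCar builds the <car> example by hand; <doors> is a hypothetical addition.
func newCar() *node {
	return &node{Name: xml.Name{Local: "car"}, Children: []*node{
		{Name: xml.Name{Local: "color"}, Value: "red"},
		{Name: xml.Name{Local: "brand"}, Value: "BMW"},
		{Name: xml.Name{Local: "doors"}, Value: "4"},
	}}
}

func main() {
	car := newCar()
	fmt.Println(car.s("", "brand"))  // prints "BMW"
	fmt.Println(car.i("", "doors"))  // prints "4"
	fmt.Println(car.i("", "wheels")) // prints "0": no such node
}
```

Note the trade-off in this design: a missing node and a node whose value really is 0 are indistinguishable to the caller.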

For attributes, we only go through the attributes of the current node this function is invoked on:

*node.As(ns, name string) string
*node.Ai(ns, name string) int
*node.Ai8(ns, name string) int8
*node.Ai16(ns, name string) int16
*node.Ai32(ns, name string) int32
*node.Ai64(ns, name string) int64
*node.Au(ns, name string) uint
*node.Au8(ns, name string) uint8
*node.Au16(ns, name string) uint16
*node.Au32(ns, name string) uint32
*node.Au64(ns, name string) uint64
*node.Af32(ns, name string) float32
*node.Af64(ns, name string) float64
*node.Ab(ns, name string) bool

All of these functions return the zero value of their return type ("", 0 or false) when the specified node or attribute could not be found. No errors are generated.
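A short sketch of what "zero value on a miss" means for attribute lookups, using only the standard library. The helper names (`attrString`, `attrInt`, `hasAttr`, `sampleAttrs`) and the attribute names are hypothetical; they merely imitate the documented behaviour of the `A*` accessors and of `HasAttr`.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strconv"
)

// hasAttr reports whether attrs contains the given namespace/name pair.
func hasAttr(attrs []xml.Attr, ns, name string) bool {
	for _, a := range attrs {
		if a.Name.Space == ns && a.Name.Local == name {
			return true
		}
	}
	return false
}

// attrString looks only at the given attribute list; it never descends
// into child nodes. It returns "" when the attribute is absent.
func attrString(attrs []xml.Attr, ns, name string) string {
	for _, a := range attrs {
		if a.Name.Space == ns && a.Name.Local == name {
			return a.Value
		}
	}
	return ""
}

// attrInt converts the attribute value as base 10 and returns 0 on a miss
// or on a conversion failure.
func attrInt(attrs []xml.Attr, ns, name string) int {
	v, err := strconv.Atoi(attrString(attrs, ns, name))
	if err != nil {
		return 0
	}
	return v
}

// sampleAttrs returns hypothetical attributes: doors="4" count="0".
func sampleAttrs() []xml.Attr {
	return []xml.Attr{
		{Name: xml.Name{Local: "doors"}, Value: "4"},
		{Name: xml.Name{Local: "count"}, Value: "0"},
	}
}

func main() {
	attrs := sampleAttrs()
	fmt.Println(attrInt(attrs, "", "doors"))                                // prints "4"
	fmt.Println(attrInt(attrs, "", "count"), hasAttr(attrs, "", "count"))   // prints "0 true"
	fmt.Println(attrInt(attrs, "", "wheels"), hasAttr(attrs, "", "wheels")) // prints "0 false"
}
```

Since no error is reported, a presence check such as HasAttr is the way to tell a genuine 0 apart from a missing attribute.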

The namespace name specified in the functions above must either match the namespace you expect a node/attr to have, or be the wildcard *. The wildcard makes node searches easier when you do not care which namespace a node has, or whether it has one at all. Node and attribute names may also be supplied as the wildcard *. This allows us to fetch all child nodes for a given namespace, regardless of their names.
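The wildcard rule can be expressed as a small predicate. The sketch below is an assumption about how such matching might look, not the package's code; `matches`, `filter`, `sample` and the Atom namespace used as test data are all hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// atom is a hypothetical namespace used as sample data.
const atom = "http://www.w3.org/2005/Atom"

// matches reports whether n satisfies the namespace and name patterns,
// where "*" in either position matches anything (including "").
func matches(n xml.Name, ns, name string) bool {
	return (ns == "*" || ns == n.Space) && (name == "*" || name == n.Local)
}

// filter returns the names that satisfy both patterns, in input order.
func filter(names []xml.Name, ns, name string) []xml.Name {
	var out []xml.Name
	for _, n := range names {
		if matches(n, ns, name) {
			out = append(out, n)
		}
	}
	return out
}

// sample returns three hypothetical element names.
func sample() []xml.Name {
	return []xml.Name{
		{Space: atom, Local: "title"},
		{Space: "", Local: "title"},
		{Space: atom, Local: "link"},
	}
}

func main() {
	names := sample()
	fmt.Println(len(filter(names, "*", "title"))) // prints "2": any namespace
	fmt.Println(len(filter(names, atom, "*")))    // prints "2": any name in one namespace
	fmt.Println(len(filter(names, "", "title")))  // prints "1": exact match only
}
```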

All numeric type-conversion methods assume base-10 numeric data.

License

This work is subject to the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license. Its contents can be found in the LICENSE file.

Documentation

Overview

This package wraps the standard XML library and uses it to build a node tree of any document you load. This allows you to look up nodes forwards and backwards, as well as perform simple search queries.

Nodes now simply become collections and don't require you to read them in the order in which the xml.Parser finds them.

The Document currently implements 3 search functions which allow you to look for specific nodes.

*xmlx.Document.SelectNode(namespace, name string) *Node;
*xmlx.Document.SelectNodes(namespace, name string) []*Node;
*xmlx.Document.SelectNodesRecursive(namespace, name string) []*Node;

SelectNode() returns the first, single node it finds matching the given name and namespace. SelectNodes() returns a slice containing all the matching nodes (without recursing into matching nodes). SelectNodesRecursive() returns a slice of all matching nodes, including nodes inside other matching nodes.

Note that these search functions can be invoked on individual nodes as well. This allows you to search only a subset of the entire document.
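The difference between the non-recursing and the recursing search shows up only when a matching node contains another matching node. The sketch below illustrates that distinction on a hand-built tree. It is a simplified, hypothetical model: namespaces are omitted, only descendants are examined, and the `node` type and function names are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a simplified, hypothetical tree node; ID exists only to label output.
type node struct {
	Name, ID string
	Children []*node
}

// collect appends matching descendants of n to out. When intoMatches is
// false, the children of a matching node are not searched.
func collect(n *node, name string, intoMatches bool, out *[]*node) {
	for _, c := range n.Children {
		if c.Name == name {
			*out = append(*out, c)
			if !intoMatches {
				continue
			}
		}
		collect(c, name, intoMatches, out)
	}
}

func selectNodes(n *node, name string) []*node {
	var out []*node
	collect(n, name, false, &out)
	return out
}

func selectNodesRecursive(n *node, name string) []*node {
	var out []*node
	collect(n, name, true, &out)
	return out
}

// ids joins the IDs of nodes for compact printing.
func ids(nodes []*node) string {
	parts := make([]string, len(nodes))
	for i, n := range nodes {
		parts[i] = n.ID
	}
	return strings.Join(parts, ",")
}

// sample builds: root > div#a > div#b, and root > p#c > div#d.
func sample() *node {
	return &node{Name: "root", Children: []*node{
		{Name: "div", ID: "a", Children: []*node{{Name: "div", ID: "b"}}},
		{Name: "p", ID: "c", Children: []*node{{Name: "div", ID: "d"}}},
	}}
}

func main() {
	root := sample()
	fmt.Println(ids(selectNodes(root, "div")))          // prints "a,d"
	fmt.Println(ids(selectNodesRecursive(root, "div"))) // prints "a,b,d"
}
```

In this model, node b is skipped by the first search because it sits inside the match a, while d is still found because its parent p is not a match.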

Constants

const (
	NT_ROOT = iota
	NT_DIRECTIVE
	NT_PROCINST
	NT_COMMENT
	NT_TEXT
	NT_ELEMENT
)

Variables

var IndentPrefix = ""

IndentPrefix holds the value for a single indentation level, for those who want indentation in the node.String() and node.Bytes() output. This would normally be set to a single tab, or a number of spaces.

Functions

func EntityToUtf8

func EntityToUtf8(entity string) string

Converts a single numeric HTML entity to a regular Go UTF-8 token.

func Utf8ToEntity

func Utf8ToEntity(entity string) string

Converts a single Go UTF-8 token to an HTML entity.
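To make the two conversions concrete, here is a rough standard-library sketch of what converting between a numeric entity and a UTF-8 character involves. It is not the package's code: the lower-case function names are hypothetical, and both the decimal output form `&#N;` and the "return the input unchanged on failure" behaviour are assumptions, since this page does not specify them.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// entityToUTF8 decodes a numeric entity such as "&#233;" or "&#xE9;".
// Assumption: anything that cannot be parsed is returned unchanged.
func entityToUTF8(entity string) string {
	body := strings.TrimSuffix(strings.TrimPrefix(entity, "&#"), ";")
	base := 10
	if strings.HasPrefix(body, "x") || strings.HasPrefix(body, "X") {
		base = 16
		body = body[1:]
	}
	n, err := strconv.ParseInt(body, base, 32)
	if err != nil {
		return entity
	}
	return string(rune(n))
}

// utf8ToEntity encodes the first character of s as a decimal entity.
// Assumption: empty or invalid input is returned unchanged.
func utf8ToEntity(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError && size <= 1 {
		return s
	}
	return fmt.Sprintf("&#%d;", r)
}

func main() {
	fmt.Println(entityToUTF8("&#233;")) // prints "é"
	fmt.Println(entityToUTF8("&#xE9;")) // prints "é"
	fmt.Println(utf8ToEntity("é"))      // prints "&#233;"
}
```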

Types

type Attr

type Attr struct {
	Name  xml.Name // Attribute namespace and name.
	Value string   // Attribute value.
}

type CharsetFunc

type CharsetFunc func(charset string, input io.Reader) (io.Reader, error)

This signature represents a character encoding conversion routine. It is used to tell the XML decoder how to deal with non-UTF-8 characters.
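A function of this shape might look like the following sketch, which converts ISO-8859-1 input to UTF-8. The local `charsetFunc` type simply mirrors the documented signature, and `latin1` and `decodeText` are hypothetical helpers. The example exercises the function through the standard decoder's `CharsetReader` field, which has the same signature; whether the package forwards its CharsetFunc there is an assumption.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// charsetFunc mirrors the CharsetFunc signature documented above.
type charsetFunc func(charset string, input io.Reader) (io.Reader, error)

// latin1 converts ISO-8859-1 input to UTF-8 and rejects other charsets.
func latin1(charset string, input io.Reader) (io.Reader, error) {
	switch strings.ToLower(charset) {
	case "iso-8859-1", "latin1":
	default:
		return nil, fmt.Errorf("unsupported charset %q", charset)
	}
	raw, err := io.ReadAll(input)
	if err != nil {
		return nil, err
	}
	var out bytes.Buffer
	for _, b := range raw {
		// Each Latin-1 byte is the Unicode code point of the same value.
		out.WriteRune(rune(b))
	}
	return &out, nil
}

// decodeText returns the character data of the document's root element.
func decodeText(doc string, cs charsetFunc) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	dec.CharsetReader = cs
	var v struct {
		Text string `xml:",chardata"`
	}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	return v.Text, nil
}

func main() {
	// \xe9 is the single Latin-1 byte for "é".
	doc := "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>caf\xe9</a>"
	text, err := decodeText(doc, latin1)
	if err != nil {
		panic(err)
	}
	fmt.Println(text) // prints "café"
}
```

Reading the whole input up front keeps the sketch short; a streaming reader would be the better choice for large documents.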

type Document

type Document struct {
	Version     string            // XML version
	Encoding    string            // Encoding found in document. If absent, assumes UTF-8.
	StandAlone  string            // Value of XML doctype's 'standalone' attribute.
	Entity      map[string]string // Mapping of custom entity conversions.
	Root        *Node             // The document's root node.
	SaveDocType bool              // Whether or not to include the XML doctype in saves.
	// contains filtered or unexported fields
}

Document represents a single XML document.

func New

func New() *Document

Create a new, empty XML document instance.

func (*Document) LoadBytes

func (this *Document) LoadBytes(d []byte, charset CharsetFunc) (err error)

Load the contents of this document from the supplied byte slice.

func (*Document) LoadExtendedEntityMap

func (this *Document) LoadExtendedEntityMap()

This loads a rather massive table of non-conventional XML escape sequences, which is needed to make the parser map them to characters properly. It is advised to manually set only those entities needed using the document.Entity map, but if need be, this method can be called to fill the map with the entire set defined on http://www.w3.org/TR/html4/sgml/entities.html
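Why an entity map is needed at all can be seen with the standard decoder, which rejects unknown named entities unless a mapping is supplied. The sketch below sets only the one entity it needs, in the spirit of the advice above. `decodeWithEntities` is a hypothetical helper, and it is an assumption that the document's Entity map ends up in the standard decoder's `Entity` field.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// decodeWithEntities decodes the root element's text, resolving
// non-standard named entities through the supplied map.
func decodeWithEntities(doc string, entities map[string]string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	dec.Entity = entities
	var v struct {
		Text string `xml:",chardata"`
	}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	return v.Text, nil
}

func main() {
	doc := "<p>caf&eacute;</p>"

	// Without a mapping, the HTML entity is a syntax error.
	_, err := decodeWithEntities(doc, nil)
	fmt.Println(err != nil) // prints "true"

	// With just the one mapping we need, decoding succeeds.
	text, err := decodeWithEntities(doc, map[string]string{"eacute": "é"})
	if err != nil {
		panic(err)
	}
	fmt.Println(text) // prints "café"
}
```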

func (*Document) LoadFile

func (this *Document) LoadFile(filename string, charset CharsetFunc) (err error)

Load the contents of this document from the supplied file.

func (*Document) LoadStream

func (this *Document) LoadStream(r io.Reader, charset CharsetFunc) (err error)

Load the contents of this document from the supplied reader.

func (*Document) LoadString

func (this *Document) LoadString(s string, charset CharsetFunc) (err error)

Load the contents of this document from the supplied string.

func (*Document) LoadUri

func (this *Document) LoadUri(uri string, charset CharsetFunc) (err error)

Load the contents of this document from the supplied uri. (calls LoadUriClient with http.DefaultClient)

func (*Document) LoadUriClient

func (this *Document) LoadUriClient(uri string, client *http.Client, charset CharsetFunc) (err error)

Load the contents of this document from the supplied uri using the specified client.

func (*Document) SaveBytes

func (this *Document) SaveBytes() []byte

Save the contents of this document as a byte slice.

func (*Document) SaveFile

func (this *Document) SaveFile(path string) error

Save the contents of this document to the supplied file.

func (*Document) SaveStream

func (this *Document) SaveStream(w io.Writer) (err error)

Save the contents of this document to the supplied writer.

func (*Document) SaveString

func (this *Document) SaveString() string

Save the contents of this document as a string.

func (*Document) SelectNode

func (this *Document) SelectNode(namespace, name string) *Node

Select a single node with the given namespace and name. Returns nil if no matching node was found.

func (*Document) SelectNodes

func (this *Document) SelectNodes(namespace, name string) []*Node

Select all nodes with the given namespace and name, without recursing into the children of those matches. Returns an empty slice if no matches were found.

func (*Document) SelectNodesDirect

func (this *Document) SelectNodesDirect(namespace, name string) []*Node

Select all nodes directly under this document, with the given namespace and name. Returns an empty slice if no matches were found.

func (*Document) SelectNodesRecursive

func (this *Document) SelectNodesRecursive(namespace, name string) []*Node

Select all nodes with the given namespace and name, also recursing into the children of those matches. Returns an empty slice if no matches were found.

func (*Document) SetUserAgent

func (this *Document) SetUserAgent(s string)

Set a custom user agent when making a new request.

func (*Document) String

func (this *Document) String() string

Alias for Document.SaveString(). This one is invoked by anything looking for the standard String() method (e.g. fmt.Printf("%s\n", mydoc)).

type Node

type Node struct {
	Type       byte     // Node type.
	Name       xml.Name // Node namespace and name.
	Children   []*Node  // Child nodes.
	Attributes []*Attr  // Node attributes.
	Parent     *Node    // Parent node.
	Value      string   // Node value.
	Target     string   // procinst field.
}

func NewNode

func NewNode(tid byte) *Node

func (*Node) Ab

func (this *Node) Ab(namespace, name string) bool

Get attribute value as bool

func (*Node) AddChild

func (this *Node) AddChild(t *Node)

Add a child node

func (*Node) Af32

func (this *Node) Af32(namespace, name string) float32

Get attribute value as float32

func (*Node) Af64

func (this *Node) Af64(namespace, name string) float64

Get attribute value as float64

func (*Node) Ai

func (this *Node) Ai(namespace, name string) int

Get attribute value as int

func (*Node) Ai16

func (this *Node) Ai16(namespace, name string) int16

Get attribute value as int16

func (*Node) Ai32

func (this *Node) Ai32(namespace, name string) int32

Get attribute value as int32

func (*Node) Ai64

func (this *Node) Ai64(namespace, name string) int64

Get attribute value as int64

func (*Node) Ai8

func (this *Node) Ai8(namespace, name string) int8

Get attribute value as int8

func (*Node) As

func (this *Node) As(namespace, name string) string

Get attribute value as string

func (*Node) Au

func (this *Node) Au(namespace, name string) uint

Get attribute value as uint

func (*Node) Au16

func (this *Node) Au16(namespace, name string) uint16

Get attribute value as uint16

func (*Node) Au32

func (this *Node) Au32(namespace, name string) uint32

Get attribute value as uint32

func (*Node) Au64

func (this *Node) Au64(namespace, name string) uint64

Get attribute value as uint64

func (*Node) Au8

func (this *Node) Au8(namespace, name string) uint8

Get attribute value as uint8

func (*Node) B

func (this *Node) B(namespace, name string) bool

Get node value as bool

func (*Node) Bytes

func (this *Node) Bytes() []byte

Convert node to appropriate []byte representation based on its @Type. Note that NT_ROOT is a special-case empty node used as the root for a Document. This one has no representation by itself. It merely forwards the String() call to its child nodes.

func (*Node) F32

func (this *Node) F32(namespace, name string) float32

Get node value as float32

func (*Node) F64

func (this *Node) F64(namespace, name string) float64

Get node value as float64

func (*Node) GetValue

func (this *Node) GetValue() string

func (*Node) HasAttr

func (this *Node) HasAttr(namespace, name string) bool

Returns true if this node has the specified attribute. False otherwise.

func (*Node) I

func (this *Node) I(namespace, name string) int

Get node value as int

func (*Node) I16

func (this *Node) I16(namespace, name string) int16

Get node value as int16

func (*Node) I32

func (this *Node) I32(namespace, name string) int32

Get node value as int32

func (*Node) I64

func (this *Node) I64(namespace, name string) int64

Get node value as int64

func (*Node) I8

func (this *Node) I8(namespace, name string) int8

Get node value as int8

func (*Node) RemoveAttr

func (this *Node) RemoveAttr(name string)

func (*Node) RemoveChild

func (this *Node) RemoveChild(t *Node)

Remove a child node

func (*Node) RemoveNameSpace

func (this *Node) RemoveNameSpace()

func (*Node) S

func (this *Node) S(namespace, name string) string

Get node value as string

func (*Node) SelectNode

func (this *Node) SelectNode(namespace, name string) *Node

Select single node by name

func (*Node) SelectNodes

func (this *Node) SelectNodes(namespace, name string) []*Node

Select multiple nodes by name, without recursing into the children of matching nodes.

func (*Node) SelectNodesDirect

func (this *Node) SelectNodesDirect(namespace, name string) []*Node

Select multiple nodes directly under this node, by name.

func (*Node) SelectNodesRecursive

func (this *Node) SelectNodesRecursive(namespace, name string) []*Node

Select multiple nodes by name, also recursing into the children of matching nodes.

func (*Node) SetAttr

func (this *Node) SetAttr(name, value string)

func (*Node) SetValue

func (this *Node) SetValue(val string)

SetValue sets the value of the node to the given parameter. It deletes all children of the node so that the old data is not returned by node.GetValue.

func (*Node) String

func (this *Node) String() (s string)

Convert node to appropriate string representation based on its @Type. Note that NT_ROOT is a special-case empty node used as the root for a Document. This one has no representation by itself. It merely forwards the String() call to its child nodes.

func (*Node) U

func (this *Node) U(namespace, name string) uint

Get node value as uint

func (*Node) U16

func (this *Node) U16(namespace, name string) uint16

Get node value as uint16

func (*Node) U32

func (this *Node) U32(namespace, name string) uint32

Get node value as uint32

func (*Node) U64

func (this *Node) U64(namespace, name string) uint64

Get node value as uint64

func (*Node) U8

func (this *Node) U8(namespace, name string) uint8

Get node value as uint8

func (*Node) Unmarshal

func (this *Node) Unmarshal(obj interface{}) error

This wraps the standard xml.Unmarshal function and supplies this particular node as the content to be unmarshalled.
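Since the work is delegated to the standard xml.Unmarshal, the destination object follows the usual encoding/xml struct-tag rules. The sketch below shows a struct that would fit the car fragment from the README, fed directly to xml.Unmarshal. The `car` type and `parseCar` helper are hypothetical, and it is an assumption that the node is handed over in its serialized XML form.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// car is a hypothetical destination type for the <car> fragment.
type car struct {
	XMLName xml.Name `xml:"car"`
	Color   string   `xml:"color"`
	Brand   string   `xml:"brand"`
}

// parseCar unmarshals a serialized <car> element into a car value.
func parseCar(fragment string) (car, error) {
	var c car
	err := xml.Unmarshal([]byte(fragment), &c)
	return c, err
}

func main() {
	c, err := parseCar("<car><color>red</color><brand>BMW</brand></car>")
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Color, c.Brand) // prints "red BMW"
}
```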
