xmlstream

Version: v0.15.4 (this is not the latest version of the module)
Published: Jan 11, 2023 License: BSD-2-Clause Imports: 10 Imported by: 45

README

mellium.im/xmlstream

An API for manipulating XML streams in Go; io but for XML.

To use it in your project, import it like so:

import "mellium.im/xmlstream"

If you'd like to contribute to the project, see CONTRIBUTING.md.

License

The package may be used under the terms of the BSD 2-Clause License, a copy of which may be found in the LICENSE file. Some code in this package has been copied from Go and is used under the terms of Go's modified BSD license, a copy of which can be found in the LICENSE-GO file.

Unless you explicitly state otherwise, any contribution submitted for inclusion in the work by you shall be licensed as above, without any additional terms or conditions.

Documentation

Overview

Package xmlstream provides an API for streaming, transforming, and otherwise manipulating XML data.

BE ADVISED: The API is unstable and subject to change.

Constants

This section is empty.

Variables

var ErrClosedPipe = errors.New("xmlstream: read/write on closed pipe")

ErrClosedPipe is the error used for read or write operations on a closed pipe.

Functions

func Copy added in v0.3.0

func Copy(dst TokenWriter, src xml.TokenReader) (n int, err error)

Copy consumes an xml.TokenReader and writes its tokens to a TokenWriter until either io.EOF is reached on src or an error occurs. It returns the number of tokens copied and the first error encountered while copying, if any. If an error is returned by the reader or writer, Copy returns it immediately. Since Copy is defined as consuming the stream until the end, io.EOF is not returned.

If src implements the WriterTo interface, the copy is implemented by calling src.WriteXML(dst). Otherwise, if dst implements the ReaderFrom interface, the copy is implemented by calling dst.ReadXML(src).
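
To make the loop concrete, here is a minimal sketch that uses only encoding/xml. The names tokenWriter, copyTokens, and copyString are hypothetical stand-ins for illustration; this is not the package's implementation, and it leaves out the WriterTo and ReaderFrom fast paths.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenWriter mirrors the one-method TokenWriter interface documented in
// this package (hypothetical local copy for the sketch).
type tokenWriter interface {
	EncodeToken(t xml.Token) error
}

// copyTokens is a hypothetical stand-in for Copy: it forwards tokens until
// io.EOF, counts them, and reports reaching EOF as success.
func copyTokens(dst tokenWriter, src xml.TokenReader) (int, error) {
	n := 0
	for {
		tok, err := src.Token()
		if tok != nil {
			if werr := dst.EncodeToken(tok); werr != nil {
				return n, werr
			}
			n++
		}
		if err == io.EOF {
			return n, nil // the end of the stream is the expected outcome
		}
		if err != nil {
			return n, err
		}
	}
}

// copyString round-trips in through copyTokens and returns the encoded
// output together with the number of tokens copied.
func copyString(in string) (string, int, error) {
	var sb strings.Builder
	e := xml.NewEncoder(&sb)
	n, err := copyTokens(e, xml.NewDecoder(strings.NewReader(in)))
	if err != nil {
		return "", n, err
	}
	if err := e.Flush(); err != nil {
		return "", n, err
	}
	return sb.String(), n, nil
}

func main() {
	out, n, err := copyString(`<a><b>hi</b></a>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n, out)
}
```

The sketch writes a token before inspecting the error so that a reader returning a token together with io.EOF does not lose its final token.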

func Fmt

func Fmt(d xml.TokenReader, opts ...FmtOption) xml.TokenReader

Fmt returns a transformer that indents the given XML stream. The default indentation style is to remove non-significant whitespace, start elements on a new line and indent two spaces per level.

Example (Indentation)
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	tokenizer := xmlstream.Fmt(xml.NewDecoder(strings.NewReader(`
<quote>  <p>
                 <!-- Chardata is not indented -->
  How now, my hearts! did you never see the picture
of 'we three'?</p>
</quote>`)), xmlstream.Indent("    "))

	buf := new(bytes.Buffer)
	e := xml.NewEncoder(buf)
	for t, err := tokenizer.Token(); err == nil; t, err = tokenizer.Token() {
		e.EncodeToken(t)
	}
	e.Flush()
	fmt.Println(buf.String())
}
Output:

<quote>
    <p>
        <!-- Chardata is not indented -->

  How now, my hearts! did you never see the picture
of &#39;we three&#39;?
    </p>
</quote>

func Inner added in v0.6.0

func Inner(r xml.TokenReader) xml.TokenReader

Inner returns a new TokenReader that returns nil, io.EOF when it consumes the end element matching the most recent start element already consumed.

func InnerElement added in v0.15.2

func InnerElement(r xml.TokenReader) xml.TokenReader

InnerElement wraps a TokenReader to return nil, io.EOF after returning the end element matching the most recent start element already consumed. It is like Inner except that it returns the end element.
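
The difference between Inner and InnerElement is easiest to see side by side. The sketch below tracks nesting depth with only encoding/xml; innerReader, describe, and demo are hypothetical names, and the code illustrates the documented behavior rather than reproducing the package's implementation.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// innerReader is a hypothetical sketch of the behavior described for Inner
// and InnerElement.
type innerReader struct {
	r       xml.TokenReader
	depth   int  // nesting depth below the element being read
	keepEnd bool // true mimics InnerElement, false mimics Inner
	done    bool
}

func (i *innerReader) Token() (xml.Token, error) {
	if i.done {
		return nil, io.EOF
	}
	tok, err := i.r.Token()
	if err != nil {
		return tok, err
	}
	switch tok.(type) {
	case xml.StartElement:
		i.depth++
	case xml.EndElement:
		if i.depth == 0 {
			// This end element closes the start element consumed earlier.
			i.done = true
			if i.keepEnd {
				return tok, nil
			}
			return nil, io.EOF
		}
		i.depth--
	}
	return tok, nil
}

// describe drains r and lists its tokens in a compact form.
func describe(r xml.TokenReader) string {
	var parts []string
	for {
		tok, err := r.Token()
		if err != nil {
			break
		}
		switch t := tok.(type) {
		case xml.StartElement:
			parts = append(parts, "<"+t.Name.Local+">")
		case xml.EndElement:
			parts = append(parts, "</"+t.Name.Local+">")
		case xml.CharData:
			parts = append(parts, string(t))
		}
	}
	return strings.Join(parts, " ")
}

// demo consumes the outer start element, then reads on through the sketch.
func demo(keepEnd bool) string {
	d := xml.NewDecoder(strings.NewReader(`<a><b>x</b></a><c>y</c>`))
	d.Token() // consume <a>
	return describe(&innerReader{r: d, keepEnd: keepEnd})
}

func main() {
	fmt.Println(demo(false)) // Inner-like: stops before </a>
	fmt.Println(demo(true))  // InnerElement-like: includes </a>
}
```

In both modes the trailing <c> element is left unread on the underlying decoder.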

func InnerReader

func InnerReader(r io.Reader) io.Reader

InnerReader is an io.Reader which attempts to decode an xml.StartElement from the stream on the first call to Read (returning an error if an invalid start token is found) and returns a new reader which only reads the inner XML without parsing it or checking its validity. After the inner XML is read, the end token is parsed and if it does not exist or does not match the original start token an error is returned.

Example
package main

import (
	"io"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	r := xmlstream.InnerReader(strings.NewReader(`<stream:features>
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'>
<required/>
</starttls>
</stream:features>`))
	io.Copy(os.Stdout, r)
}
Output:

<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'>
<required/>
</starttls>

func LimitReader added in v0.5.0

func LimitReader(r xml.TokenReader, n uint) xml.TokenReader

LimitReader returns an xml.TokenReader that reads from r but stops with EOF after n tokens (regardless of the validity of the XML at that point in the stream).

Example
package main

import (
	"encoding/xml"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	e := xml.NewEncoder(os.Stdout)
	var r xml.TokenReader = xml.NewDecoder(strings.NewReader(`<one>One hen</one><two>Two ducks</two>`))

	r = xmlstream.LimitReader(r, 3)

	if _, err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in LimitReader example:", err)
	}
	if err := e.Flush(); err != nil {
		log.Fatal("Error flushing:", err)
	}

}
Output:

<one>One hen</one>

func MultiReader added in v0.1.0

func MultiReader(readers ...xml.TokenReader) xml.TokenReader

MultiReader returns an xml.TokenReader that's the logical concatenation of the provided input readers. They're read sequentially. Once all inputs have returned io.EOF, Token will return io.EOF. If any of the readers return a non-nil, non-EOF error, Token will return that error.

Example
package main

import (
	"encoding/xml"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	e := xml.NewEncoder(os.Stdout)
	e.Indent("", "  ")

	r1 := xml.NewDecoder(strings.NewReader(`<title>Dover Beach</title>`))
	r2 := xml.NewDecoder(strings.NewReader(`<author>Matthew Arnold</author>`))
	r3 := xml.NewDecoder(strings.NewReader(`<incipit>The sea is calm to-night.</incipit>`))

	r := xmlstream.MultiReader(r1, r2, r3)

	if _, err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in MultiReader example:", err)
	}
	if err := e.Flush(); err != nil {
		log.Fatal("Error flushing:", err)
	}
}
Output:

<title>Dover Beach</title>
<author>Matthew Arnold</author>
<incipit>The sea is calm to-night.</incipit>

func Pipe added in v0.1.0

func Pipe() (*PipeReader, *PipeWriter)

Pipe creates a synchronous in-memory pipe of tokens. It can be used to connect code expecting an xml.TokenReader with code expecting an xmlstream.TokenWriter.

Reads and Writes on the pipe are matched one to one. That is, each Write to the PipeWriter blocks until it has satisfied a Read from the corresponding PipeReader.

It is safe to call Read and Write in parallel with each other or with Close. Parallel calls to Read and parallel calls to Write are also safe: the individual calls will be gated sequentially.
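
The one-to-one matching of reads and writes can be pictured as a handoff over an unbuffered channel. The following sketch uses only the standard library; tokenPipe and pipeThrough are hypothetical names, and this is a deliberately simplified model of the idea, not the PipeReader and PipeWriter implementation.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenPipe is a hypothetical, simplified model of a synchronous token
// pipe: both halves share one unbuffered channel.
type tokenPipe struct {
	ch chan xml.Token
}

// EncodeToken blocks until a reader receives the token.
func (p *tokenPipe) EncodeToken(t xml.Token) error {
	p.ch <- xml.CopyToken(t) // copy: decoders may reuse CharData buffers
	return nil
}

// Close signals the end of the stream to the reading side.
func (p *tokenPipe) Close() error {
	close(p.ch)
	return nil
}

// Token blocks until a writer sends a token or closes the pipe.
func (p *tokenPipe) Token() (xml.Token, error) {
	t, ok := <-p.ch
	if !ok {
		return nil, io.EOF
	}
	return t, nil
}

// pipeThrough decodes in on one goroutine and re-encodes it on another,
// with every token crossing the pipe.
func pipeThrough(in string) (string, error) {
	p := &tokenPipe{ch: make(chan xml.Token)}

	go func() {
		defer p.Close()
		d := xml.NewDecoder(strings.NewReader(in))
		for {
			tok, err := d.Token()
			if err != nil {
				return // the sketch stops on io.EOF and on decode errors alike
			}
			p.EncodeToken(tok)
		}
	}()

	var sb strings.Builder
	e := xml.NewEncoder(&sb)
	for {
		tok, err := p.Token()
		if err != nil {
			break
		}
		if err := e.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	if err := e.Flush(); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := pipeThrough(`<msg><body>hello</body></msg>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The sketch has no equivalent of CloseWithError, so a reader that stops early would leave the writing goroutine blocked; the real pipe types provide Close and CloseWithError on both halves for that case.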

func ReadAll added in v0.13.6

func ReadAll(r xml.TokenReader) ([]xml.Token, error)

ReadAll reads from r until an error or io.EOF and returns the tokens it reads. A successful call returns err == nil, not err == io.EOF. Because ReadAll is defined to read from r until io.EOF, it does not treat an io.EOF from Token as an error to be reported.
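
As an illustration of the error convention, the sketch below collects tokens with only encoding/xml. The names readAll and countTokens are hypothetical, and copying each token with xml.CopyToken is an assumption of the sketch, not a statement about what the package does.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// readAll is a hypothetical stand-in for ReadAll: io.EOF ends the loop and
// is reported as success, any other error is returned with the tokens
// gathered so far.
func readAll(r xml.TokenReader) ([]xml.Token, error) {
	var toks []xml.Token
	for {
		tok, err := r.Token()
		if tok != nil {
			// Assumption of this sketch: copy the token so that CharData
			// stays valid after the next call to Token.
			toks = append(toks, xml.CopyToken(tok))
		}
		if err == io.EOF {
			return toks, nil
		}
		if err != nil {
			return toks, err
		}
	}
}

// countTokens reports how many tokens readAll collected from in.
func countTokens(in string) (int, error) {
	toks, err := readAll(xml.NewDecoder(strings.NewReader(in)))
	return len(toks), err
}

func main() {
	n, err := countTokens(`<one>One hen</one><two>Two ducks</two>`)
	fmt.Println(n, err)

	// Mismatched tags make the decoder fail before io.EOF is reached.
	_, err = countTokens(`<a><b></a>`)
	fmt.Println(err != nil)
}
```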

func Skip added in v0.2.0

func Skip(r xml.TokenReader) error

Skip reads tokens until it has consumed the end element matching the most recent start element already consumed. It recurs if it encounters a start element, so it can be used to skip nested structures. It returns nil if it finds an end element at the same nesting level as the start element; otherwise it returns an error describing the problem. Skip does not verify that the start and end elements match.

Example
package main

import (
	"encoding/xml"
	"io"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	e := xml.NewEncoder(os.Stdout)

	r := xml.NewDecoder(strings.NewReader(`<par>I don't like to look out of the windows even—there are so many of those creeping women, and they creep so fast.</par><par>I wonder if they all come out of that wall paper, as I did?</par>`))

	r.Token() // <par>

	if err := xmlstream.Skip(r); err != nil && err != io.EOF {
		log.Fatal("Error in skipping par:", err)
	}
	if _, err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in Skip example:", err)
	}
	if err := e.Flush(); err != nil {
		log.Fatal("Error flushing:", err)
	}

}
Output:

<par>I wonder if they all come out of that wall paper, as I did?</par>

func TeeReader added in v0.12.2

func TeeReader(r xml.TokenReader, w TokenWriter) xml.TokenReader

TeeReader returns a Reader that writes to w what it reads from r. All reads from r performed through it are matched with corresponding writes to w. There is no internal buffering - the write must complete before the read completes. Any error encountered while writing is reported as a read error.
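
A rough sketch of the read-then-write pairing, using only encoding/xml, might look like the following. The names tokenWriter, teeReader, and teeString are hypothetical, and the code illustrates the described behavior rather than the package's implementation.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenWriter mirrors the one-method TokenWriter interface documented in
// this package (hypothetical local copy for the sketch).
type tokenWriter interface {
	EncodeToken(t xml.Token) error
}

// teeReader is a hypothetical sketch: every token read from r is written
// to w before it is handed to the caller.
type teeReader struct {
	r xml.TokenReader
	w tokenWriter
}

func (t teeReader) Token() (xml.Token, error) {
	tok, err := t.r.Token()
	if tok != nil {
		if werr := t.w.EncodeToken(tok); werr != nil {
			return nil, werr // a write error surfaces as a read error
		}
	}
	return tok, err
}

// teeString drains in through the tee and returns the mirrored output
// along with the number of tokens the caller saw.
func teeString(in string) (string, int, error) {
	var sb strings.Builder
	e := xml.NewEncoder(&sb)
	r := teeReader{r: xml.NewDecoder(strings.NewReader(in)), w: e}

	n := 0
	for {
		tok, err := r.Token()
		if tok != nil {
			n++
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", n, err
		}
	}
	if err := e.Flush(); err != nil {
		return "", n, err
	}
	return sb.String(), n, nil
}

func main() {
	out, n, err := teeString(`<a>hi</a>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n, out)
}
```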

func Token added in v0.13.1

func Token(t xml.Token) xml.TokenReader

Token returns a reader that returns the given token and io.EOF, then nil, io.EOF thereafter.

func Unwrap added in v0.1.0

func Unwrap(r xml.TokenReader) (xml.TokenReader, xml.Token, error)

Unwrap reads the next token from the provided TokenReader and, if it is a start element, returns a new TokenReader that skips the corresponding end element. If the token is not a start element the original TokenReader is returned.

Example
package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	var r xml.TokenReader = xml.NewDecoder(strings.NewReader(`<message from="ismene@example.org/dIoK6Wi3"><body>No mind that ever lived stands firm in evil days, but goes astray.</body></message>`))
	e := xml.NewEncoder(os.Stdout)

	r, tok, err := xmlstream.Unwrap(r)
	if err != nil {
		log.Fatal("Error unwrapping:", err)
	}

	fmt.Printf("%s:\n", tok.(xml.StartElement).Name.Local)
	if _, err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in unwrap example:", err)
	}
	if err := e.Flush(); err != nil {
		log.Fatal("Error flushing:", err)
	}

}
Output:

message:
<body>No mind that ever lived stands firm in evil days, but goes astray.</body>

func Wrap added in v0.1.0

func Wrap(r xml.TokenReader, start xml.StartElement) xml.TokenReader

Wrap wraps a token stream in a start element and its corresponding end element.

Example
package main

import (
	"encoding/xml"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	var r xml.TokenReader = xml.NewDecoder(strings.NewReader(`<body>No mind that ever lived stands firm in evil days, but goes astray.</body>`))
	e := xml.NewEncoder(os.Stdout)
	e.Indent("", "  ")

	r = xmlstream.Wrap(r, xml.StartElement{
		Name: xml.Name{Local: "message"},
		Attr: []xml.Attr{
			{Name: xml.Name{Local: "from"}, Value: "ismene@example.org/Fo6Eeb2e"},
		},
	})

	if _, err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in wrap example:", err)
	}
	if err := e.Flush(); err != nil {
		log.Fatal("Error flushing:", err)
	}
}
Output:

<message from="ismene@example.org/Fo6Eeb2e">
  <body>No mind that ever lived stands firm in evil days, but goes astray.</body>
</message>

Types

type DecodeCloser added in v0.15.0

type DecodeCloser interface {
	Decoder
	io.Closer
}

DecodeCloser is the interface that groups Decoder and io.Closer.

type DecodeEncoder added in v0.15.0

type DecodeEncoder interface {
	Decoder
	Encoder
}

DecodeEncoder is the interface that groups the Encoder and Decoder interfaces.

type Decoder added in v0.15.0

type Decoder interface {
	xml.TokenReader
	Decode(v interface{}) error
	DecodeElement(v interface{}, start *xml.StartElement) error
}

Decoder is the interface that groups the Decode, DecodeElement, and Token methods. Decoder is implemented by xml.Decoder.

type EncodeCloser added in v0.15.0

type EncodeCloser interface {
	Encoder
	io.Closer
}

EncodeCloser is the interface that groups Encoder and io.Closer.

type Encoder added in v0.13.5

type Encoder interface {
	TokenWriter
	Encode(v interface{}) error
	EncodeElement(v interface{}, start xml.StartElement) error
}

Encoder is the interface that groups the Encode, EncodeElement, and EncodeToken methods. Encoder is implemented by xml.Encoder.

type Flusher added in v0.13.0

type Flusher interface {
	Flush() error
}

The Flusher interface is implemented by TokenWriters that can flush buffered data to an underlying receiver.

type FmtOption

type FmtOption func(*fmter)

FmtOption is used to configure a formatter's behavior.

func Indent

func Indent(s string) FmtOption

Indent is inserted before XML elements zero or more times according to their nesting depth in the stream. The default indentation is "  " (two ASCII spaces).

func Prefix

func Prefix(s string) FmtOption

Prefix is inserted at the start of every XML element in the stream.

func Suffix added in v0.5.1

func Suffix(s string) FmtOption

Suffix is inserted at the end of every XML element in the stream. If no option is specified the default suffix is '\n'.

type Iter added in v0.15.0

type Iter struct {
	// contains filtered or unexported fields
}

Iter provides a mechanism for iterating over the children of an XML element. Successive calls to the Next method will step through each child, returning its start element (if applicable) and a reader that is limited to the remainder of the child.

func NewIter added in v0.15.0

func NewIter(r xml.TokenReader) *Iter

NewIter returns a new iterator that iterates over the children of the most recent start element already consumed from r.

func (*Iter) Close added in v0.15.0

func (i *Iter) Close() error

Close indicates that we are finished with the given iterator. Calling it multiple times has no effect.

If the underlying TokenReader is also an io.Closer, Close calls the reader's Close method.

func (*Iter) Current added in v0.15.0

func (i *Iter) Current() (*xml.StartElement, xml.TokenReader)

Current returns a reader over the most recent child.

func (*Iter) Err added in v0.15.0

func (i *Iter) Err() error

Err returns the last error encountered by the iterator (if any).

func (*Iter) Next added in v0.15.0

func (i *Iter) Next() bool

Next returns true if there are more items to decode.

type Marshaler added in v0.9.2

type Marshaler interface {
	TokenReader() xml.TokenReader
}

Marshaler is the interface implemented by objects that can marshal themselves into valid XML elements.

type PipeReader added in v0.1.0

type PipeReader struct {
	// contains filtered or unexported fields
}

A PipeReader is the read half of a token pipe.

func (*PipeReader) Close added in v0.1.0

func (r *PipeReader) Close() error

Close closes the PipeReader; subsequent writes to the write half of the pipe will return the error ErrClosedPipe.

func (*PipeReader) CloseWithError added in v0.1.0

func (r *PipeReader) CloseWithError(err error)

CloseWithError closes the PipeReader; subsequent writes to the write half of the pipe will return the error err, or ErrClosedPipe if err is nil.

func (*PipeReader) Token added in v0.1.0

func (r *PipeReader) Token() (t xml.Token, err error)

Token implements the TokenReader interface. It reads a token from the pipe, blocking until a writer arrives or the write end is closed. If the write end is closed with an error, that error is returned as err; otherwise err is io.EOF.

type PipeWriter added in v0.1.0

type PipeWriter struct {
	// contains filtered or unexported fields
}

A PipeWriter is the write half of a token pipe.

func (*PipeWriter) Close added in v0.1.0

func (w *PipeWriter) Close() error

Close closes the PipeWriter; subsequent reads from the read half of the pipe will return a nil token and EOF.

func (*PipeWriter) CloseWithError added in v0.1.0

func (w *PipeWriter) CloseWithError(err error)

CloseWithError closes the PipeWriter; subsequent reads from the read half of the pipe will return no tokens and the error err, or EOF if err is nil.

func (*PipeWriter) EncodeToken added in v0.1.0

func (w *PipeWriter) EncodeToken(t xml.Token) error

EncodeToken implements the TokenWriter interface. It writes a token to the pipe, blocking until a reader has consumed it or the read end is closed. If the read end is closed with an error, that error is returned as err; otherwise err is ErrClosedPipe.

func (*PipeWriter) Flush added in v0.1.0

func (w *PipeWriter) Flush() error

Flush is currently a no-op and always returns nil.

type ReaderFrom added in v0.10.1

type ReaderFrom interface {
	ReadXML(xml.TokenReader) (n int, err error)
}

ReaderFrom reads tokens from r until EOF or error. The return value n is the number of tokens read. Any error except io.EOF encountered during the read is also returned.

The Copy function uses ReaderFrom if available.

type ReaderFunc added in v0.1.0

type ReaderFunc func() (xml.Token, error)

The ReaderFunc type is an adapter that allows ordinary functions to be used as a TokenReader. If f is a function with the appropriate signature, ReaderFunc(f) is a TokenReader that calls f.

Example
package main

import (
	"encoding/xml"
	"io"
	"log"
	"os"

	"mellium.im/xmlstream"
)

func main() {
	state := 0
	start := xml.StartElement{Name: xml.Name{Local: "quote"}}
	d := xmlstream.ReaderFunc(func() (xml.Token, error) {
		switch state {
		case 0:
			state++
			return start, nil
		case 1:
			state++
			return xml.CharData("the rain it raineth every day"), nil
		case 2:
			state++
			return start.End(), nil
		default:
			return nil, io.EOF
		}
	})

	e := xml.NewEncoder(os.Stdout)
	if _, err := xmlstream.Copy(e, d); err != nil {
		log.Fatal("Error in func example:", err)
	}
	if err := e.Flush(); err != nil {
		log.Fatal("Error flushing:", err)
	}
}
Output:

<quote>the rain it raineth every day</quote>

func (ReaderFunc) Token added in v0.1.0

func (f ReaderFunc) Token() (xml.Token, error)

Token calls f.

type TokenReadCloser added in v0.12.3

type TokenReadCloser interface {
	xml.TokenReader
	io.Closer
}

TokenReadCloser is the interface that groups the basic Token and Close methods.

func NopCloser added in v0.13.4

func NopCloser(r xml.TokenReader) TokenReadCloser

NopCloser returns a TokenReadCloser with a no-op Close method wrapping the provided Reader r.

type TokenReadEncoder added in v0.13.5

type TokenReadEncoder interface {
	xml.TokenReader
	Encoder
}

TokenReadEncoder is the interface that groups the Encode, EncodeElement, EncodeToken, and Token methods.

type TokenReadWriteCloser added in v0.8.1

type TokenReadWriteCloser interface {
	xml.TokenReader
	TokenWriter
	io.Closer
}

TokenReadWriteCloser is the interface that groups the basic Token, EncodeToken, and Close methods.

type TokenReadWriter added in v0.8.1

type TokenReadWriter interface {
	xml.TokenReader
	TokenWriter
}

TokenReadWriter is the interface that groups the basic Token and EncodeToken methods.

type TokenWriteCloser added in v0.8.2

type TokenWriteCloser interface {
	TokenWriter
	io.Closer
}

TokenWriteCloser is the interface that groups the basic EncodeToken and Close methods.

type TokenWriteFlushCloser added in v0.13.4

type TokenWriteFlushCloser interface {
	TokenWriter
	io.Closer
	Flusher
}

TokenWriteFlushCloser is the interface that groups the basic EncodeToken, Flush, and Close methods.

type TokenWriteFlusher added in v0.15.1

type TokenWriteFlusher interface {
	TokenWriter
	Flusher
}

TokenWriteFlusher is the interface that groups the basic EncodeToken and Flush methods.

type TokenWriter

type TokenWriter interface {
	EncodeToken(t xml.Token) error
}

TokenWriter is anything that can encode tokens to an XML stream, including an xml.Encoder.

func Discard added in v0.8.0

func Discard() TokenWriter

Discard returns a TokenWriter on which all calls succeed without doing anything.

func MultiWriter added in v0.2.0

func MultiWriter(writers ...TokenWriter) TokenWriter

MultiWriter creates a writer that duplicates its writes to all the provided writers, similar to the Unix tee(1) command. If any of the writers return an error, the MultiWriter immediately returns the error and stops writing.
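
The fan-out with early exit on error can be sketched in a few lines using only encoding/xml. The names tokenWriter, multiWriter, and duplicate are hypothetical; this illustrates the idea and is not the package's implementation.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// tokenWriter mirrors the one-method TokenWriter interface documented in
// this package (hypothetical local copy for the sketch).
type tokenWriter interface {
	EncodeToken(t xml.Token) error
}

// multiWriter is a hypothetical sketch: each token goes to every writer in
// order, and the first error stops the fan-out.
type multiWriter []tokenWriter

func (m multiWriter) EncodeToken(t xml.Token) error {
	for _, w := range m {
		if err := w.EncodeToken(t); err != nil {
			return err
		}
	}
	return nil
}

// duplicate writes one small element through the sketch and returns what
// each of the two underlying encoders produced.
func duplicate() (string, string, error) {
	var a, b strings.Builder
	ea, eb := xml.NewEncoder(&a), xml.NewEncoder(&b)
	w := multiWriter{ea, eb}

	start := xml.StartElement{Name: xml.Name{Local: "note"}}
	for _, tok := range []xml.Token{start, xml.CharData("hi"), start.End()} {
		if err := w.EncodeToken(tok); err != nil {
			return "", "", err
		}
	}
	// Each encoder buffers its output and must be flushed on its own.
	if err := ea.Flush(); err != nil {
		return "", "", err
	}
	if err := eb.Flush(); err != nil {
		return "", "", err
	}
	return a.String(), b.String(), nil
}

func main() {
	first, second, err := duplicate()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(first)
	fmt.Println(second)
}
```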

type Transformer

type Transformer func(src xml.TokenReader) xml.TokenReader

A Transformer returns a new TokenReader that returns transformed tokens read from src.

func Insert added in v0.15.2

func Insert(name xml.Name, m Marshaler) Transformer

Insert adds one XML stream to another just before the close token, matching on the token name. If either component of the name is empty it is considered a wildcard.

func InsertFunc added in v0.15.2

func InsertFunc(f func(start xml.StartElement, level uint64, w TokenWriter) error) Transformer

InsertFunc calls f after writing any start element to the stream. The function can decide based on the passed in StartElement whether to insert any additional tokens into the stream by writing them to w.

func Inspect

func Inspect(f func(t xml.Token)) Transformer

Inspect performs an operation for each token in the stream without transforming the stream in any way.

func Map

func Map(mapping func(t xml.Token) xml.Token) Transformer

Map returns a Transformer that maps the tokens in the input using the given mapping.
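
To show how a mapping function slots into the Transformer shape, here is a sketch built only on encoding/xml. The types transformer and readerFunc are local copies of the signatures documented in this package, while mapTokens and shout are hypothetical names; none of this is the package's implementation.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// transformer and readerFunc are local copies of the documented
// Transformer and ReaderFunc signatures (hypothetical for this sketch).
type transformer func(src xml.TokenReader) xml.TokenReader

type readerFunc func() (xml.Token, error)

func (f readerFunc) Token() (xml.Token, error) { return f() }

// mapTokens is a hypothetical stand-in for Map: it applies mapping to
// every token read from src and passes errors through unchanged.
func mapTokens(mapping func(xml.Token) xml.Token) transformer {
	return func(src xml.TokenReader) xml.TokenReader {
		return readerFunc(func() (xml.Token, error) {
			tok, err := src.Token()
			if tok != nil {
				tok = mapping(tok)
			}
			return tok, err
		})
	}
}

// shout upper-cases all character data in the input and re-encodes it.
func shout(in string) (string, error) {
	upper := mapTokens(func(tok xml.Token) xml.Token {
		if cd, ok := tok.(xml.CharData); ok {
			return xml.CharData(bytes.ToUpper(cd))
		}
		return tok
	})
	r := upper(xml.NewDecoder(strings.NewReader(in)))

	var sb strings.Builder
	e := xml.NewEncoder(&sb)
	for {
		tok, err := r.Token()
		if tok != nil {
			if werr := e.EncodeToken(tok); werr != nil {
				return "", werr
			}
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
	}
	if err := e.Flush(); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := shout(`<p>calm sea</p>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Because a transformer takes a TokenReader and returns one, several of them can be chained by feeding the result of one into the next.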

func Remove

func Remove(f func(t xml.Token) bool) Transformer

Remove returns a Transformer that removes tokens for which f matches.

Example
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	removequote := xmlstream.Remove(func(t xml.Token) bool {
		switch tok := t.(type) {
		case xml.StartElement:
			return tok.Name.Local == "quote"
		case xml.EndElement:
			return tok.Name.Local == "quote"
		}
		return false
	})

	tokenizer := removequote(xml.NewDecoder(strings.NewReader(`
<quote>
  <p>Foolery, sir, does walk about the orb, like the sun; it shines everywhere.</p>
</quote>`)))

	buf := new(bytes.Buffer)
	e := xml.NewEncoder(buf)
	for t, err := tokenizer.Token(); err == nil; t, err = tokenizer.Token() {
		e.EncodeToken(t)
	}
	e.Flush()
	fmt.Println(buf.String())
}
Output:

<p>Foolery, sir, does walk about the orb, like the sun; it shines everywhere.</p>

func RemoveAttr

func RemoveAttr(f func(start xml.StartElement, attr xml.Attr) bool) Transformer

RemoveAttr returns a Transformer that removes attributes from xml.StartElement tokens for which f matches.
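
A sketch of attribute filtering with only encoding/xml follows. The types transformer and readerFunc are local copies of the signatures documented in this package; removeAttr and stripClass are hypothetical names, and the code shows the idea rather than the package's implementation.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// transformer and readerFunc are local copies of the documented
// Transformer and ReaderFunc signatures (hypothetical for this sketch).
type transformer func(src xml.TokenReader) xml.TokenReader

type readerFunc func() (xml.Token, error)

func (f readerFunc) Token() (xml.Token, error) { return f() }

// removeAttr is a hypothetical stand-in for RemoveAttr: start elements are
// rebuilt without the attributes for which f returns true.
func removeAttr(f func(start xml.StartElement, attr xml.Attr) bool) transformer {
	return func(src xml.TokenReader) xml.TokenReader {
		return readerFunc(func() (xml.Token, error) {
			tok, err := src.Token()
			start, ok := tok.(xml.StartElement)
			if !ok {
				return tok, err // not a start element: pass through
			}
			kept := make([]xml.Attr, 0, len(start.Attr))
			for _, a := range start.Attr {
				if !f(start, a) {
					kept = append(kept, a)
				}
			}
			start.Attr = kept
			return start, err
		})
	}
}

// stripClass removes every "class" attribute from the input.
func stripClass(in string) (string, error) {
	t := removeAttr(func(_ xml.StartElement, a xml.Attr) bool {
		return a.Name.Local == "class"
	})
	r := t(xml.NewDecoder(strings.NewReader(in)))

	var sb strings.Builder
	e := xml.NewEncoder(&sb)
	for {
		tok, err := r.Token()
		if tok != nil {
			if werr := e.EncodeToken(tok); werr != nil {
				return "", werr
			}
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
	}
	if err := e.Flush(); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := stripClass(`<p id="1" class="x">hi</p>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```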

func RemoveElement

func RemoveElement(f func(start xml.StartElement) bool) Transformer

RemoveElement returns a Transformer that removes entire elements (and their children) if f matches the element's start token.

Example
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	removeLangEn := xmlstream.RemoveElement(func(start xml.StartElement) bool {
		// TODO: Probably be more specific and actually check the name.
		if len(start.Attr) > 0 && start.Attr[0].Value == "en" {
			return true
		}
		return false
	})

	d := removeLangEn(xml.NewDecoder(strings.NewReader(`
<quote>
<p xml:lang="en">Thus the whirligig of time brings in his revenges.</p>
<p xml:lang="fr">et c’est ainsi que la roue du temps amène les occasions de revanche.</p>
</quote>
`)))

	buf := new(bytes.Buffer)
	e := xml.NewEncoder(buf)
	for t, err := d.Token(); err == nil; t, err = d.Token() {
		e.EncodeToken(t)
	}
	e.Flush()
	fmt.Println(buf.String())
}
Output:

<quote>

<p xml:lang="fr">et c’est ainsi que la roue du temps amène les occasions de revanche.</p>
</quote>

type WriterTo added in v0.10.1

type WriterTo interface {
	WriteXML(TokenWriter) (n int, err error)
}

WriterTo writes tokens to w until there are no more tokens to write or when an error occurs. The return value n is the number of tokens written. Any error encountered during the write is also returned.

The Copy function uses WriterTo if available.

Notes

Bugs

  • Multiple uses of RemoveAttr will iterate over the attr list multiple times.
