xmlstream

v0.8.0
Warning

This package is not in the latest version of its module.
Published: Oct 2, 2017 License: BSD-2-Clause Imports: 10 Imported by: 45

README

mellium.im/xmlstream

An experimental API for manipulating streams of XML data.

import "mellium.im/xmlstream"

License

The package may be used under the terms of the BSD 2-Clause License, a copy of which may be found in the LICENSE file. Some code in this package may have been copied from Go and is used under the terms of Go's modified BSD license, a copy of which can be found in the LICENSE.GO file.

Documentation

Overview

Package xmlstream provides an API for streaming, transforming, and otherwise manipulating XML data.

If you are using Go built from source, you will need a build from 2017-09-13 or later that includes this patch: https://golang.org/cl/38791. When Go 1.10 is released, this package will be modified to use the xml.TokenReader interface, the Go 1.9 shim (which only exists to let godoc.org generate documentation) will be removed, and a 1.0 release will be made.

BE ADVISED: The API is unstable and subject to change.

Constants

This section is empty.

Variables

var ErrClosedPipe = errors.New("xmlstream: read/write on closed pipe")

ErrClosedPipe is the error used for read or write operations on a closed pipe.

Functions

func Copy added in v0.3.0

func Copy(e TokenWriter, d TokenReader) (err error)

Copy consumes a TokenReader and writes its tokens to a TokenWriter. If an error is returned by the reader or writer, Copy returns it immediately. Since Copy is defined as consuming the stream until the end, io.EOF is not returned. If no error would be returned, Copy flushes the TokenWriter when it is done.
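The behavior described above can be sketched with only the standard library. This is an illustration, not the package's implementation: copyTokens, tokenWriter, and render are hypothetical names, and the loop assumes the reader signals the end of the stream with io.EOF.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenWriter mirrors the TokenWriter interface documented in this package.
type tokenWriter interface {
	EncodeToken(t xml.Token) error
	Flush() error
}

// copyTokens is a hypothetical stand-in for Copy: it drains r into w,
// treats io.EOF as normal termination, and flushes w on success.
func copyTokens(w tokenWriter, r xml.TokenReader) error {
	for {
		tok, err := r.Token()
		if tok != nil {
			if werr := w.EncodeToken(tok); werr != nil {
				return werr
			}
		}
		if err == io.EOF {
			return w.Flush()
		}
		if err != nil {
			return err
		}
	}
}

// render copies the XML in s through copyTokens and returns the result.
func render(s string) (string, error) {
	var buf bytes.Buffer
	err := copyTokens(xml.NewEncoder(&buf), xml.NewDecoder(strings.NewReader(s)))
	return buf.String(), err
}

func main() {
	out, err := render(`<a><b>hi</b></a>`)
	fmt.Println(out, err)
}
```

Because the end of the stream is the expected outcome, the caller only has to check for a non-nil error.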

func InnerReader

func InnerReader(r io.Reader) io.Reader

InnerReader is an io.Reader which attempts to decode an xml.StartElement from the stream on the first call to Read (returning an error if an invalid start token is found) and returns a new reader which only reads the inner XML without parsing it or checking its validity. After the inner XML is read, the end token is parsed and if it does not exist or does not match the original start token an error is returned.

Example
package main

import (
	"io"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	r := xmlstream.InnerReader(strings.NewReader(`<stream:features>
<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'>
<required/>
</starttls>
</stream:features>`))
	io.Copy(os.Stdout, r)
}
Output:

<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'>
<required/>
</starttls>

func Pipe added in v0.1.0

func Pipe() (*PipeReader, *PipeWriter)

Pipe creates a synchronous in-memory pipe of tokens. It can be used to connect code expecting a TokenReader with code expecting a TokenWriter.

Calls to Token and EncodeToken on the pipe are matched one to one. That is, each call to EncodeToken on the PipeWriter blocks until it has satisfied a call to Token on the corresponding PipeReader.

It is safe to call Token and EncodeToken in parallel with each other or with Close. Parallel calls to Token and parallel calls to EncodeToken are also safe: the individual calls will be gated sequentially.
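The one-to-one matching of writes and reads can be illustrated with an unbuffered channel. The sketch below is a simplified, hypothetical model (tokenPipe and collect are not part of the package) and omits features of the real pipe such as closing the read half or closing with an error.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
)

// tokenPipe is a hypothetical, simplified token pipe built on an unbuffered
// channel; it is not the package's implementation.
type tokenPipe struct {
	ch chan xml.Token
}

// EncodeToken blocks until a reader receives the token.
func (p *tokenPipe) EncodeToken(t xml.Token) error {
	p.ch <- t
	return nil
}

// Close signals the end of the stream to the reader.
func (p *tokenPipe) Close() { close(p.ch) }

// Token blocks until a writer sends a token or closes the pipe.
func (p *tokenPipe) Token() (xml.Token, error) {
	t, ok := <-p.ch
	if !ok {
		return nil, io.EOF
	}
	return t, nil
}

// collect writes a small element from one goroutine, reads it from another,
// and returns a short description of each token received.
func collect() []string {
	p := &tokenPipe{ch: make(chan xml.Token)}
	start := xml.StartElement{Name: xml.Name{Local: "ping"}}
	go func() {
		defer p.Close()
		p.EncodeToken(start)
		p.EncodeToken(xml.CharData("hello"))
		p.EncodeToken(start.End())
	}()

	var got []string
	for {
		tok, err := p.Token()
		if err != nil {
			return got // io.EOF once the writer has closed the pipe
		}
		switch t := tok.(type) {
		case xml.StartElement:
			got = append(got, "start:"+t.Name.Local)
		case xml.CharData:
			got = append(got, "text:"+string(t))
		case xml.EndElement:
			got = append(got, "end:"+t.Name.Local)
		}
	}
}

func main() {
	fmt.Println(collect())
}
```

The writer must run in its own goroutine: with no buffering, a write in the same goroutine as the reader would block forever.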

func Skip added in v0.2.0

func Skip(r TokenReader) error

Skip reads tokens until it has consumed the end element matching the most recent start element already consumed. It recurs if it encounters a start element, so it can be used to skip nested structures. It returns nil if it finds an end element at the same nesting level as the start element; otherwise it returns an error describing the problem. Skip does not verify that the start and end elements match.

Example
package main

import (
	"encoding/xml"
	"io"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	e := xml.NewEncoder(os.Stdout)

	r := xml.NewDecoder(strings.NewReader(`<par>I don't like to look out of the windows even—there are so many of those creeping women, and they creep so fast.</par><par>I wonder if they all come out of that wall paper, as I did?</par>`))

	r.Token() // <par>

	if err := xmlstream.Skip(r); err != nil && err != io.EOF {
		log.Fatal("Error in skipping par:", err)
	}
	if err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in Skip example:", err)
	}

}
Output:

<par>I wonder if they all come out of that wall paper, as I did?</par>

Types

type FmtOption

type FmtOption func(*fmter)

FmtOption is used to configure a formatter's behavior.

func Indent

func Indent(s string) FmtOption

Indent is inserted before XML elements zero or more times according to their nesting depth in the stream. The default indentation is "  " (two ASCII spaces).

func Prefix

func Prefix(s string) FmtOption

Prefix is inserted at the start of every XML element in the stream.

func Suffix added in v0.5.1

func Suffix(s string) FmtOption

Suffix is inserted at the end of every XML element in the stream. If no option is specified the default suffix is '\n'.
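FmtOption follows the functional options pattern: each option is a function that mutates an unexported configuration value. The sketch below shows the pattern in isolation. The formatter type, its fields, and the lowercase option constructors are hypothetical stand-ins for the unexported fmter type, and the defaults are taken from the descriptions above.

```go
package main

import "fmt"

// formatter is a hypothetical stand-in for the unexported fmter type.
type formatter struct {
	indent, prefix, suffix string
}

// option plays the role of FmtOption in this sketch.
type option func(*formatter)

func indent(s string) option { return func(f *formatter) { f.indent = s } }
func prefix(s string) option { return func(f *formatter) { f.prefix = s } }
func suffix(s string) option { return func(f *formatter) { f.suffix = s } }

// newFormatter starts from the documented defaults (two-space indent,
// newline suffix) and then applies each option in order.
func newFormatter(opts ...option) *formatter {
	f := &formatter{indent: "  ", suffix: "\n"}
	for _, o := range opts {
		o(f)
	}
	return f
}

func main() {
	f := newFormatter(indent("\t"), prefix("> "))
	fmt.Printf("%q %q %q\n", f.indent, f.prefix, f.suffix)
}
```

Options that are not passed leave the corresponding default untouched, which is why Fmt can be called with no options at all.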

type PipeReader added in v0.1.0

type PipeReader struct {
	// contains filtered or unexported fields
}

A PipeReader is the read half of a token pipe.

func (*PipeReader) Close added in v0.1.0

func (r *PipeReader) Close() error

Close closes the PipeReader; subsequent reads from the read half of the pipe will return no tokens and EOF.

func (*PipeReader) CloseWithError added in v0.1.0

func (r *PipeReader) CloseWithError(err error)

CloseWithError closes the PipeReader; subsequent reads from the read half of the pipe will return no tokens and the error err, or EOF if err is nil.

func (*PipeReader) Token added in v0.1.0

func (r *PipeReader) Token() (t xml.Token, err error)

Token implements the TokenReader interface. It reads a token from the pipe, blocking until a writer arrives or the write end is closed. If the write end is closed with an error, that error is returned as err; otherwise err is io.EOF.

type PipeWriter added in v0.1.0

type PipeWriter struct {
	// contains filtered or unexported fields
}

A PipeWriter is the write half of a token pipe.

func (*PipeWriter) Close added in v0.1.0

func (w *PipeWriter) Close() error

Close closes the PipeWriter; subsequent reads from the read half of the pipe will return no tokens and EOF.

func (*PipeWriter) CloseWithError added in v0.1.0

func (w *PipeWriter) CloseWithError(err error)

CloseWithError closes the PipeWriter; subsequent reads from the read half of the pipe will return no tokens and the error err, or EOF if err is nil.

func (*PipeWriter) EncodeToken added in v0.1.0

func (w *PipeWriter) EncodeToken(t xml.Token) error

EncodeToken implements the TokenWriter interface. It writes a token to the pipe, blocking until one or more readers have consumed all the data or the read end is closed. If the read end is closed with an error, that error is returned as err; otherwise err is ErrClosedPipe.

func (*PipeWriter) Flush added in v0.1.0

func (w *PipeWriter) Flush() error

Flush is currently a no-op and always returns nil.

type ReaderFunc added in v0.1.0

type ReaderFunc func() (xml.Token, error)

The ReaderFunc type is an adapter to allow the use of ordinary functions as a TokenReader. If f is a function with the appropriate signature, ReaderFunc(f) is a TokenReader that calls f.

Example
package main

import (
	"encoding/xml"
	"io"
	"log"
	"os"

	"mellium.im/xmlstream"
)

func main() {
	state := 0
	start := xml.StartElement{Name: xml.Name{Local: "quote"}}
	d := xmlstream.ReaderFunc(func() (xml.Token, error) {
		switch state {
		case 0:
			state++
			return start, nil
		case 1:
			state++
			return xml.CharData("the rain it raineth every day"), nil
		case 2:
			state++
			return start.End(), nil
		default:
			return nil, io.EOF
		}
	})

	e := xml.NewEncoder(os.Stdout)
	if err := xmlstream.Copy(e, d); err != nil {
		log.Fatal("Error in func example:", err)
	}
}
Output:

<quote>the rain it raineth every day</quote>

func (ReaderFunc) Token added in v0.1.0

func (f ReaderFunc) Token() (xml.Token, error)

Token calls f.

type TokenReader

type TokenReader = xml.TokenReader

A TokenReader is anything that can decode a stream of XML tokens, including a Decoder. For more information see the documentation for xml.TokenReader.

func Fmt

func Fmt(d TokenReader, opts ...FmtOption) TokenReader

Fmt returns a transformer that indents the given XML stream. The default indentation style is to remove non-significant whitespace, start elements on a new line and indent two spaces per level.

Example (Indentation)
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	tokenizer := xmlstream.Fmt(xml.NewDecoder(strings.NewReader(`
<quote>  <p>
                 <!-- Chardata is not indented -->
  How now, my hearts! did you never see the picture
of 'we three'?</p>
</quote>`)), xmlstream.Indent("    "))

	buf := new(bytes.Buffer)
	e := xml.NewEncoder(buf)
	for t, err := tokenizer.Token(); err == nil; t, err = tokenizer.Token() {
		e.EncodeToken(t)
	}
	e.Flush()
	fmt.Println(buf.String())
}
Output:

<quote>
    <p>
        <!-- Chardata is not indented -->

  How now, my hearts! did you never see the picture
of &#39;we three&#39;?
    </p>
</quote>

func Inner added in v0.6.0

func Inner(r TokenReader) TokenReader

Inner returns a new TokenReader that returns nil, io.EOF when it consumes the end element matching the most recent start element already consumed.
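One way to get this behavior is to count nesting depth and stop at the first end element seen at depth zero. The sketch below uses only encoding/xml; innerReader and innerXML are hypothetical names and this is not the package's implementation.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// innerReader is a hypothetical depth-counting reader: it reports io.EOF once
// it consumes the end element that closes the element the caller is already in.
type innerReader struct {
	r     xml.TokenReader
	depth int
	done  bool
}

func (in *innerReader) Token() (xml.Token, error) {
	if in.done {
		return nil, io.EOF
	}
	tok, err := in.r.Token()
	if err != nil {
		return tok, err
	}
	switch tok.(type) {
	case xml.StartElement:
		in.depth++
	case xml.EndElement:
		if in.depth == 0 {
			// This end element matches the start element consumed earlier.
			in.done = true
			return nil, io.EOF
		}
		in.depth--
	}
	return tok, nil
}

// innerXML consumes the first start element in s and re-encodes everything
// up to, but not including, its matching end element.
func innerXML(s string) (string, error) {
	d := xml.NewDecoder(strings.NewReader(s))
	if _, err := d.Token(); err != nil { // consume the outer start element
		return "", err
	}
	var buf bytes.Buffer
	e := xml.NewEncoder(&buf)
	in := &innerReader{r: d}
	for {
		tok, err := in.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
		if err := e.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	if err := e.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := innerXML(`<msg><body>hi</body><x/></msg><next/>`)
	fmt.Println(out, err)
}
```

The trailing `<next/>` element is left unread in the underlying decoder. Note that xml.Encoder writes the self-closing `<x/>` as `<x></x>`.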

func LimitReader added in v0.5.0

func LimitReader(r TokenReader, n uint) TokenReader

LimitReader returns a TokenReader that reads from r but stops with EOF after n tokens (regardless of the validity of the XML at that point in the stream).

Example
package main

import (
	"encoding/xml"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	e := xml.NewEncoder(os.Stdout)
	var r xml.TokenReader = xml.NewDecoder(strings.NewReader(`<one>One hen</one><two>Two ducks</two>`))

	r = xmlstream.LimitReader(r, 3)

	if err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in LimitReader example:", err)
	}

}
Output:

<one>One hen</one>

func MultiReader added in v0.1.0

func MultiReader(readers ...TokenReader) TokenReader

MultiReader returns a TokenReader that's the logical concatenation of the provided input readers. They're read sequentially. Once all inputs have returned io.EOF, Token will return io.EOF. If any of the readers return a non-nil, non-EOF error, Token will return that error.

Example
package main

import (
	"encoding/xml"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	e := xml.NewEncoder(os.Stdout)
	e.Indent("", "  ")

	r1 := xml.NewDecoder(strings.NewReader(`<title>Dover Beach</title>`))
	r2 := xml.NewDecoder(strings.NewReader(`<author>Matthew Arnold</author>`))
	r3 := xml.NewDecoder(strings.NewReader(`<incipit>The sea is calm to-night.</incipit>`))

	r := xmlstream.MultiReader(r1, r2, r3)

	if err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in MultiReader example:", err)
	}
}
Output:

<title>Dover Beach</title>
<author>Matthew Arnold</author>
<incipit>The sea is calm to-night.</incipit>

func Unwrap added in v0.1.0

func Unwrap(r TokenReader) (TokenReader, xml.Token, error)

Unwrap reads the next token from the provided TokenReader and, if it is a start element, returns a new TokenReader that skips the corresponding end element. If the token is not a start element the original TokenReader is returned.

Example
package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	var r xml.TokenReader = xml.NewDecoder(strings.NewReader(`<message from="ismene@example.org/dIoK6Wi3"><body>No mind that ever lived stands firm in evil days, but goes astray.</body></message>`))
	e := xml.NewEncoder(os.Stdout)

	r, tok, err := xmlstream.Unwrap(r)
	if err != nil {
		log.Fatal("Error unwrapping:", err)
	}

	fmt.Printf("%s:\n", tok.(xml.StartElement).Name.Local)
	if err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in unwrap example:", err)
	}

}
Output:

message:
<body>No mind that ever lived stands firm in evil days, but goes astray.</body>

func Wrap added in v0.1.0

func Wrap(r TokenReader, start xml.StartElement) TokenReader

Wrap wraps a token stream in a start element and its corresponding end element.

Example
package main

import (
	"encoding/xml"
	"log"
	"os"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	var r xml.TokenReader = xml.NewDecoder(strings.NewReader(`<body>No mind that ever lived stands firm in evil days, but goes astray.</body>`))
	e := xml.NewEncoder(os.Stdout)
	e.Indent("", "  ")

	r = xmlstream.Wrap(r, xml.StartElement{
		Name: xml.Name{Local: "message"},
		Attr: []xml.Attr{
			{Name: xml.Name{Local: "from"}, Value: "ismene@example.org/Fo6Eeb2e"},
		},
	})

	if err := xmlstream.Copy(e, r); err != nil {
		log.Fatal("Error in wrap example:", err)
	}
}
Output:

<message from="ismene@example.org/Fo6Eeb2e">
  <body>No mind that ever lived stands firm in evil days, but goes astray.</body>
</message>

type TokenWriter

type TokenWriter interface {
	EncodeToken(t xml.Token) error
	Flush() error
}

TokenWriter is anything that can encode tokens to an XML stream, including an xml.Encoder.

func Discard added in v0.8.0

func Discard() TokenWriter

Discard returns a TokenWriter on which all EncodeToken and Flush calls succeed without doing anything.
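A discarding writer is useful when a stream must be consumed for its side effects, for example to check that it parses. The sketch below shows the idea with the standard library only; discard, tokenWriter, and wellFormed are hypothetical names, not the package's implementation.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenWriter mirrors the TokenWriter interface documented in this package.
type tokenWriter interface {
	EncodeToken(t xml.Token) error
	Flush() error
}

// discard is a hypothetical no-op writer: every call succeeds and does nothing.
type discard struct{}

func (discard) EncodeToken(xml.Token) error { return nil }
func (discard) Flush() error                { return nil }

// wellFormed drains s into the discarding writer and reports any error the
// decoder produced along the way.
func wellFormed(s string) error {
	d := xml.NewDecoder(strings.NewReader(s))
	var w tokenWriter = discard{}
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return w.Flush()
		}
		if err != nil {
			return err
		}
		if err := w.EncodeToken(tok); err != nil {
			return err
		}
	}
}

func main() {
	fmt.Println(wellFormed(`<a><b/></a>`))
	fmt.Println(wellFormed(`<a><b></a>`) != nil)
}
```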

func MultiWriter added in v0.2.0

func MultiWriter(writers ...TokenWriter) TokenWriter

MultiWriter creates a writer that duplicates its writes to all the provided writers, similar to the Unix tee(1) command. If any of the writers return an error, the MultiWriter immediately returns the error and stops writing.
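The fan-out described above can be sketched as a slice of writers that forwards each call in order and stops at the first error. The names multiWriter, tokenWriter, and tee are hypothetical, and this is an illustration rather than the package's implementation.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenWriter mirrors the TokenWriter interface documented in this package.
type tokenWriter interface {
	EncodeToken(t xml.Token) error
	Flush() error
}

// multiWriter is a hypothetical fan-out writer.
type multiWriter []tokenWriter

// EncodeToken forwards t to every writer, stopping at the first error.
func (mw multiWriter) EncodeToken(t xml.Token) error {
	for _, w := range mw {
		if err := w.EncodeToken(t); err != nil {
			return err
		}
	}
	return nil
}

// Flush flushes every writer, stopping at the first error.
func (mw multiWriter) Flush() error {
	for _, w := range mw {
		if err := w.Flush(); err != nil {
			return err
		}
	}
	return nil
}

// tee decodes s once and writes the tokens to two independent buffers.
func tee(s string) (string, string, error) {
	var a, b bytes.Buffer
	w := multiWriter{xml.NewEncoder(&a), xml.NewEncoder(&b)}
	d := xml.NewDecoder(strings.NewReader(s))
	for {
		tok, err := d.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", "", err
		}
		if err := w.EncodeToken(tok); err != nil {
			return "", "", err
		}
	}
	if err := w.Flush(); err != nil {
		return "", "", err
	}
	return a.String(), b.String(), nil
}

func main() {
	a, b, err := tee(`<a><b>x</b></a>`)
	fmt.Println(a, b, err)
}
```

Stopping at the first error means later writers may have seen fewer tokens than earlier ones, so a failed write should be treated as leaving all outputs incomplete.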

type Transformer

type Transformer func(src TokenReader) TokenReader

A Transformer returns a new TokenReader that returns transformed tokens read from src.
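Because a Transformer takes a TokenReader and returns one, transformers compose by simple function application. The sketch below shows this with the standard library only; transformer, readerFunc, chain, dropComments, upperText, and transform are all hypothetical names used for illustration.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// transformer and readerFunc are local stand-ins for Transformer and ReaderFunc.
type transformer func(src xml.TokenReader) xml.TokenReader

type readerFunc func() (xml.Token, error)

func (f readerFunc) Token() (xml.Token, error) { return f() }

// chain is a hypothetical helper that applies transformers left to right.
func chain(ts ...transformer) transformer {
	return func(src xml.TokenReader) xml.TokenReader {
		for _, t := range ts {
			src = t(src)
		}
		return src
	}
}

// dropComments skips comment tokens.
func dropComments(src xml.TokenReader) xml.TokenReader {
	return readerFunc(func() (xml.Token, error) {
		for {
			tok, err := src.Token()
			if _, ok := tok.(xml.Comment); ok && err == nil {
				continue
			}
			return tok, err
		}
	})
}

// upperText upper-cases character data, copying it first.
func upperText(src xml.TokenReader) xml.TokenReader {
	return readerFunc(func() (xml.Token, error) {
		tok, err := src.Token()
		if cd, ok := tok.(xml.CharData); ok {
			tok = xml.CharData(bytes.ToUpper(cd))
		}
		return tok, err
	})
}

// transform runs s through both transformers and re-encodes the result.
func transform(s string) (string, error) {
	r := chain(dropComments, upperText)(xml.NewDecoder(strings.NewReader(s)))
	var buf bytes.Buffer
	e := xml.NewEncoder(&buf)
	for {
		tok, err := r.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
		if err := e.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	if err := e.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := transform(`<p>hi<!-- note --> there</p>`)
	fmt.Println(out, err)
}
```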

func Inspect

func Inspect(f func(t xml.Token)) Transformer

Inspect returns a Transformer that performs an operation for each token in the stream without transforming the stream in any way.
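A typical use is gathering statistics while a stream passes through unchanged. The sketch below counts start elements with a hand-written equivalent; inspect, readerFunc, transformer, and countStarts are hypothetical names, and whether the real Inspect calls f for nil tokens is not specified here, so this version skips them.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// transformer and readerFunc are local stand-ins for Transformer and ReaderFunc.
type transformer func(src xml.TokenReader) xml.TokenReader

type readerFunc func() (xml.Token, error)

func (f readerFunc) Token() (xml.Token, error) { return f() }

// inspect is a hypothetical equivalent of Inspect: it calls f for every
// non-nil token and passes the token and error through untouched.
func inspect(f func(t xml.Token)) transformer {
	return func(src xml.TokenReader) xml.TokenReader {
		return readerFunc(func() (xml.Token, error) {
			tok, err := src.Token()
			if tok != nil {
				f(tok)
			}
			return tok, err
		})
	}
}

// countStarts drains s and returns the number of start elements seen.
func countStarts(s string) (int, error) {
	n := 0
	r := inspect(func(t xml.Token) {
		if _, ok := t.(xml.StartElement); ok {
			n++
		}
	})(xml.NewDecoder(strings.NewReader(s)))
	for {
		_, err := r.Token()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
	}
}

func main() {
	n, err := countStarts(`<a><b/><c>t</c></a>`)
	fmt.Println(n, err)
}
```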

func Map

func Map(mapping func(t xml.Token) xml.Token) Transformer

Map returns a Transformer that maps the tokens in the input using the given mapping.
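As an illustration, a mapping can rename an element by rewriting both its start and end tokens. The sketch below builds a hand-written equivalent on top of encoding/xml; mapTokens, rename, readerFunc, transformer, and renameXML are hypothetical names and not the package's implementation.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// transformer and readerFunc are local stand-ins for Transformer and ReaderFunc.
type transformer func(src xml.TokenReader) xml.TokenReader

type readerFunc func() (xml.Token, error)

func (f readerFunc) Token() (xml.Token, error) { return f() }

// mapTokens is a hypothetical equivalent of Map: it applies mapping to every
// token that was read without an error.
func mapTokens(mapping func(t xml.Token) xml.Token) transformer {
	return func(src xml.TokenReader) xml.TokenReader {
		return readerFunc(func() (xml.Token, error) {
			tok, err := src.Token()
			if err != nil {
				return tok, err
			}
			return mapping(tok), nil
		})
	}
}

// rename returns a mapping that renames elements called from to to.
// Start and end tokens must both be rewritten so the tags still match.
func rename(from, to string) func(xml.Token) xml.Token {
	return func(t xml.Token) xml.Token {
		switch tok := t.(type) {
		case xml.StartElement:
			if tok.Name.Local == from {
				tok.Name.Local = to
			}
			return tok
		case xml.EndElement:
			if tok.Name.Local == from {
				tok.Name.Local = to
			}
			return tok
		}
		return t
	}
}

// renameXML renames <b> elements in s to <strong> and re-encodes the stream.
func renameXML(s string) (string, error) {
	r := mapTokens(rename("b", "strong"))(xml.NewDecoder(strings.NewReader(s)))
	var buf bytes.Buffer
	e := xml.NewEncoder(&buf)
	for {
		tok, err := r.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
		if err := e.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	if err := e.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renameXML(`<p>a <b>bold</b> word</p>`)
	fmt.Println(out, err)
}
```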

func Remove

func Remove(f func(t xml.Token) bool) Transformer

Remove returns a Transformer that removes tokens for which f matches.

Example
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	removequote := xmlstream.Remove(func(t xml.Token) bool {
		switch tok := t.(type) {
		case xml.StartElement:
			return tok.Name.Local == "quote"
		case xml.EndElement:
			return tok.Name.Local == "quote"
		}
		return false
	})

	tokenizer := removequote(xml.NewDecoder(strings.NewReader(`
<quote>
  <p>Foolery, sir, does walk about the orb, like the sun; it shines everywhere.</p>
</quote>`)))

	buf := new(bytes.Buffer)
	e := xml.NewEncoder(buf)
	for t, err := tokenizer.Token(); err == nil; t, err = tokenizer.Token() {
		e.EncodeToken(t)
	}
	e.Flush()
	fmt.Println(buf.String())
}
Output:

<p>Foolery, sir, does walk about the orb, like the sun; it shines everywhere.</p>

func RemoveAttr

func RemoveAttr(f func(start xml.StartElement, attr xml.Attr) bool) Transformer

RemoveAttr returns a Transformer that removes attributes from xml.StartElement tokens if f matches.
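The filtering step can be sketched as a single pass over each start element's attribute list. The code below uses only encoding/xml; removeAttr, readerFunc, transformer, and stripHandlers are hypothetical names, and the predicate that matches attribute names beginning with "on" is chosen purely for illustration.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// transformer and readerFunc are local stand-ins for Transformer and ReaderFunc.
type transformer func(src xml.TokenReader) xml.TokenReader

type readerFunc func() (xml.Token, error)

func (f readerFunc) Token() (xml.Token, error) { return f() }

// removeAttr is a hypothetical equivalent of RemoveAttr: it rebuilds the
// attribute list of each start element, keeping attributes f does not match.
func removeAttr(f func(start xml.StartElement, attr xml.Attr) bool) transformer {
	return func(src xml.TokenReader) xml.TokenReader {
		return readerFunc(func() (xml.Token, error) {
			tok, err := src.Token()
			start, ok := tok.(xml.StartElement)
			if err != nil || !ok {
				return tok, err
			}
			var kept []xml.Attr
			for _, a := range start.Attr {
				if !f(start, a) {
					kept = append(kept, a)
				}
			}
			start.Attr = kept
			return start, nil
		})
	}
}

// stripHandlers removes attributes whose names begin with "on" from s.
func stripHandlers(s string) (string, error) {
	r := removeAttr(func(_ xml.StartElement, a xml.Attr) bool {
		return strings.HasPrefix(a.Name.Local, "on")
	})(xml.NewDecoder(strings.NewReader(s)))
	var buf bytes.Buffer
	e := xml.NewEncoder(&buf)
	for {
		tok, err := r.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
		if err := e.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	if err := e.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := stripHandlers(`<img src="a.png" onclick="x()"/>`)
	fmt.Println(out, err)
}
```

Each application walks the attribute list once, so folding several conditions into one predicate avoids repeated passes over the same attributes.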

func RemoveElement

func RemoveElement(f func(start xml.StartElement) bool) Transformer

RemoveElement returns a Transformer that removes entire elements (and their children) if f matches the element's start token.

Example
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"strings"

	"mellium.im/xmlstream"
)

func main() {
	removeLangEn := xmlstream.RemoveElement(func(start xml.StartElement) bool {
		// TODO: Probably be more specific and actually check the name.
		if len(start.Attr) > 0 && start.Attr[0].Value == "en" {
			return true
		}
		return false
	})

	d := removeLangEn(xml.NewDecoder(strings.NewReader(`
<quote>
<p xml:lang="en">Thus the whirligig of time brings in his revenges.</p>
<p xml:lang="fr">et c’est ainsi que la roue du temps amène les occasions de revanche.</p>
</quote>
`)))

	buf := new(bytes.Buffer)
	e := xml.NewEncoder(buf)
	for t, err := d.Token(); err == nil; t, err = d.Token() {
		e.EncodeToken(t)
	}
	e.Flush()
	fmt.Println(buf.String())
}
Output:

<quote>

<p xml:lang="fr">et c’est ainsi que la roue du temps amène les occasions de revanche.</p>
</quote>

Notes

Bugs

  • Multiple uses of RemoveAttr will iterate over the attr list multiple times.
