strum

Version: v0.2.0, Published: Jan 3, 2022, License: Apache-2.0, Imports: 12, Imported by: 1

Repository

github.com/xdg-go/strum

README ¶

strum – String Unmarshaler

The strum package provides line-oriented text decoding into simple Go variables, slices, and structs.

Splits on whitespace, a delimiter, a regular expression, or a custom tokenizer.
Supports basic primitive types: strings, booleans, ints, uints, floats.
Supports decoding time.Time using the dateparse library.
Supports decoding time.Duration.
Supports encoding.TextUnmarshaler types.
Decodes a line into a single variable, a slice, or a struct.
Decodes all lines into a slice of the above.

Synopsis

	d := strum.NewDecoder(os.Stdin)

	// Decode a line to a single int
	var x int
	err = d.Decode(&x)

	// Decode a line to a slice of int
	var xs []int
	err = d.Decode(&xs)

	// Decode a line to a struct
	type person struct {
		Name string
		Age  int
	}
	var p person
	err = d.Decode(&p)

	// Decode all lines to a slice of struct
	var people []person
	err = d.DecodeAll(&people)

Copyright and License

Licensed under the Apache License, Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Documentation ¶

Overview ¶

Package strum provides a string unmarshaler to tokenize line-oriented text (such as from stdin) and convert tokens into simple Go types.

Tokenization defaults to whitespace-separated fields, but strum supports using delimiters, regular expressions, or a custom tokenizer.

A line with a single token can be unmarshaled into a single variable of any supported type.

A line with multiple tokens can be unmarshaled into a slice or a struct of supported types. It can also be unmarshaled into a single string, in which case tokenization is skipped.

Unmarshaling multiple tokens into a single non-string variable, or more tokens than a struct has fields, results in an error. Too few tokens for the fields in a struct is allowed; the remaining fields are zeroed. When unmarshaling to a slice, decoded values are appended and existing values are left untouched.

strum supports the following types:

strings
booleans (like strconv.ParseBool but case insensitive)
integers (signed and unsigned, all widths)
floats (32-bit and 64-bit)

Additionally, there is special support for certain types:

time.Duration
time.Time
any type implementing encoding.TextUnmarshaler
pointers to supported types (which will auto-instantiate)

For numeric types, all Go literal formats are supported, including base prefixes (`0xff`) and underscores (`1_000_000`) for integers.
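To make "Go literal formats" concrete, the sketch below parses a few integer literals with the standard library's `strconv.ParseInt` and base 0, which infers the base from the prefix and permits underscores. Whether strum calls `strconv` in exactly this way is an assumption; `parseIntLiteral` is a hypothetical helper that only shows which inputs count as valid literals.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseIntLiteral (hypothetical helper) reads s as a Go integer literal.
// Base 0 tells strconv to infer the base from the prefix (0x, 0b, 0o)
// and to accept underscores between digits.
func parseIntLiteral(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

func main() {
	for _, s := range []string{"0xff", "0b1010", "0o17", "1_000_000", "-42"} {
		n, err := parseIntLiteral(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s = %d\n", s, n)
	}
	// Prints:
	// 0xff = 255
	// 0b1010 = 10
	// 0o17 = 15
	// 1_000_000 = 1000000
	// -42 = -42
}
```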

For time.Time, strum detects and parses a wide variety of formats using the github.com/araddon/dateparse library. By default, it favors United States interpretation of MM/DD/YYYY and has time zone semantics equivalent to `time.Parse`. strum allows specifying a custom parser instead.

strum provides `DecodeAll` to unmarshal all lines of input at once.

Example (Synopsis) ¶

package main

import (
	"log"
	"os"

	"github.com/xdg-go/strum"
)

func main() {
	var err error
	d := strum.NewDecoder(os.Stdin)

	// Decode a line to a single int
	var x int
	err = d.Decode(&x)
	if err != nil {
		log.Fatal(err)
	}

	// Decode a line to a slice of int
	var xs []int
	err = d.Decode(&xs)
	if err != nil {
		log.Fatal(err)
	}

	// Decode a line to a struct
	type person struct {
		Name string
		Age  int
	}
	var p person
	err = d.Decode(&p)
	if err != nil {
		log.Fatal(err)
	}

	// Decode all lines to a slice of struct
	var people []person
	err = d.DecodeAll(&people)
	if err != nil {
		log.Fatal(err)
	}
}

Index ¶

func Unmarshal(data []byte, v interface{}) error
type DateParser
type Decoder
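- func NewDecoder(r io.Reader) *Decoder
- func (d *Decoder) Decode(v interface{}) error
- func (d *Decoder) DecodeAll(v interface{}) error
- func (d *Decoder) Tokens() ([]string, error)
- func (d *Decoder) WithDateParser(dp DateParser) *Decoder
- func (d *Decoder) WithSplitOn(sep string) *Decoder
- func (d *Decoder) WithTokenRegexp(re *regexp.Regexp) *Decoder
- func (d *Decoder) WithTokenizer(t Tokenizer) *Decoder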
type Tokenizer

Functions ¶

func Unmarshal ¶ added in v0.1.0

func Unmarshal(data []byte, v interface{}) error

Unmarshal parses the input data as newline-delimited strings and appends the result to the value pointed to by `v`, where `v` must be a pointer to a slice of a type that would be valid for Decode. If `v` points to an uninitialized slice, the slice will be created.

Types ¶

type DateParser ¶ added in v0.1.0

type DateParser func(s string) (time.Time, error)

A DateParser parses a string into a time.Time struct.

type Decoder ¶

type Decoder struct {
	// contains filtered or unexported fields
}

A Decoder converts an input stream into Go types.

func NewDecoder ¶

func NewDecoder(r io.Reader) *Decoder

NewDecoder returns a Decoder that reads from r. By default, the Decoder tokenizes each line with the `strings.Fields` function and parses dates with github.com/araddon/dateparse.ParseAny.

func (*Decoder) Decode ¶

func (d *Decoder) Decode(v interface{}) error

Decode reads the next line of input and stores it in the value pointed to by `v`. It returns `io.EOF` when no more data is available.

Example (Struct) ¶

package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"strings"
	"time"

	"github.com/xdg-go/strum"
)

func main() {
	type person struct {
		Name   string
		Age    int
		Active bool
		Joined time.Time
	}

	lines := []string{
		"John 42 true  2020-03-01T00:00:00Z",
		"Jane 23 false 2022-02-22T00:00:00Z",
	}

	r := bytes.NewBufferString(strings.Join(lines, "\n"))
	d := strum.NewDecoder(r)

	for {
		var p person
		err := d.Decode(&p)
		if err == io.EOF {
			return
		}
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(p)
	}

}

Output:

{John 42 true 2020-03-01 00:00:00 +0000 UTC}
{Jane 23 false 2022-02-22 00:00:00 +0000 UTC}

func (*Decoder) DecodeAll ¶ added in v0.0.2

func (d *Decoder) DecodeAll(v interface{}) error

DecodeAll reads the remaining lines of input into `v`, where `v` must be a pointer to a slice of a type that would be valid for Decode. It works as if `Decode` were called for all lines and the resulting values were appended to the slice. If `v` points to an uninitialized slice, the slice will be created. DecodeAll returns `nil` when EOF is reached.

Example (Ints) ¶

package main

import (
	"bytes"
	"fmt"
	"log"
	"strings"

	"github.com/xdg-go/strum"
)

func main() {
	lines := []string{
		"42",
		"23",
	}

	r := bytes.NewBufferString(strings.Join(lines, "\n"))
	d := strum.NewDecoder(r)

	var xs []int
	err := d.DecodeAll(&xs)
	if err != nil {
		log.Fatalf("decoding error: %v", err)
	}

	for _, x := range xs {
		fmt.Printf("%d\n", x)
	}

}

Output:

42
23

Example (Struct) ¶

package main

import (
	"bytes"
	"fmt"
	"log"
	"strings"
	"time"

	"github.com/xdg-go/strum"
)

func main() {
	type person struct {
		Name   string
		Age    int
		Active bool
		Joined time.Time
	}

	lines := []string{
		"John 42 true  2020-03-01T00:00:00Z",
		"Jane 23 false 2022-02-22T00:00:00Z",
	}

	r := bytes.NewBufferString(strings.Join(lines, "\n"))
	d := strum.NewDecoder(r)

	var people []person
	err := d.DecodeAll(&people)
	if err != nil {
		log.Fatalf("decoding error: %v", err)
	}

	for _, p := range people {
		fmt.Printf("%v\n", p)
	}

}

Output:

{John 42 true 2020-03-01 00:00:00 +0000 UTC}
{Jane 23 false 2022-02-22 00:00:00 +0000 UTC}

func (*Decoder) Tokens ¶

func (d *Decoder) Tokens() ([]string, error)

Tokens consumes a line of input and returns all strings generated by the tokenizer. It is used internally by `Decode`, but is also available for testing or for skipping over a line of input that should not be decoded.

func (*Decoder) WithDateParser ¶ added in v0.1.0

func (d *Decoder) WithDateParser(dp DateParser) *Decoder

WithDateParser modifies a Decoder to use a custom date parsing function.

func (*Decoder) WithSplitOn ¶

func (d *Decoder) WithSplitOn(sep string) *Decoder

WithSplitOn modifies a Decoder to split fields on a separator string.

Example ¶

package main

import (
	"bytes"
	"fmt"
	"io"
	"log"

	"github.com/xdg-go/strum"
)

func main() {
	type person struct {
		Last  string
		First string
	}

	text := "Doe,John"
	r := bytes.NewBufferString(text)

	d := strum.NewDecoder(r).WithSplitOn(",")

	var p person
	err := d.Decode(&p)
	if err != nil && err != io.EOF {
		log.Fatal(err)
	}

	fmt.Println(p)

}

Output:

{Doe John}

func (*Decoder) WithTokenRegexp ¶

func (d *Decoder) WithTokenRegexp(re *regexp.Regexp) *Decoder

WithTokenRegexp modifies a Decoder to use a regular expression to extract tokens. The regular expression is applied with `FindStringSubmatch` to each line of input, so it must encompass an entire line of input. If the line fails to match or if the regular expression has no subexpressions, an error is returned.

Example ¶

package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"regexp"

	"github.com/xdg-go/strum"
)

func main() {
	type jeans struct {
		Color  string
		Waist  int
		Inseam int
	}

	text := "Blue 36x32"
	r := bytes.NewBufferString(text)

	re := regexp.MustCompile(`^(\S+)\s+(\d+)x(\d+)`)
	d := strum.NewDecoder(r).WithTokenRegexp(re)

	var j jeans
	err := d.Decode(&j)
	if err != nil && err != io.EOF {
		log.Fatal(err)
	}

	fmt.Println(j)

}

Output:

{Blue 36 32}
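To check which tokens a pattern would produce before wiring it into a Decoder, the described behavior can be approximated with the standard library alone. The sketch below is an assumption about how submatches map to tokens (each capture group becomes one token), not strum's actual implementation; `tokensFromRegexp` and `jeansRe` are hypothetical names.

```go
package main

import (
	"fmt"
	"log"
	"regexp"
)

// jeansRe is the pattern from the example above, anchored at both ends so it
// covers the whole line.
var jeansRe = regexp.MustCompile(`^(\S+)\s+(\d+)x(\d+)$`)

// tokensFromRegexp (hypothetical helper) returns one token per capture group.
// It reports an error when the pattern has no groups or the line does not
// match, mirroring the two error cases described for WithTokenRegexp.
func tokensFromRegexp(re *regexp.Regexp, line string) ([]string, error) {
	if re.NumSubexp() == 0 {
		return nil, fmt.Errorf("regexp %q has no subexpressions", re)
	}
	m := re.FindStringSubmatch(line)
	if m == nil {
		return nil, fmt.Errorf("line %q does not match %q", line, re)
	}
	return m[1:], nil // m[0] is the full match; the groups follow it
}

func main() {
	tokens, err := tokensFromRegexp(jeansRe, "Blue 36x32")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%q\n", tokens) // prints ["Blue" "36" "32"]
}
```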

func (*Decoder) WithTokenizer ¶

func (d *Decoder) WithTokenizer(t Tokenizer) *Decoder

WithTokenizer modifies a Decoder to use a custom tokenizing function.

type Tokenizer ¶

type Tokenizer func(s string) ([]string, error)

A Tokenizer is a function that breaks an input string into tokens.
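For instance, a tokenizer that honors quoted fields can be built on the standard library's `encoding/csv`. The sketch below is one possible implementation under that assumption; `csvTokenizer` is a hypothetical name, and the code uses only the standard library so it can be run independently of strum.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"strings"
)

// csvTokenizer (hypothetical) matches the Tokenizer signature. It splits one
// line on commas while keeping quoted fields, such as "Doe, John", intact.
func csvTokenizer(s string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(s))
	r.TrimLeadingSpace = true // tolerate a space after each comma
	return r.Read()
}

func main() {
	tokens, err := csvTokenizer(`"Doe, John",42,true`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%q\n", tokens) // prints ["Doe, John" "42" "true"]
}
```

A function like this could be passed as `strum.NewDecoder(r).WithTokenizer(csvTokenizer)`. A plain `WithSplitOn(",")` would instead break the quoted name into two fields.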
