Documentation ¶
Overview ¶
Package strum provides a string unmarshaler to tokenize line-oriented text (such as from stdin) and convert tokens into simple Go types.
Tokenization defaults to whitespace-separated fields, but strum supports using delimiters, regular expressions, or a custom tokenizer.
A line with a single token can be unmarshaled into a single variable of any supported type.
A line with multiple tokens can be unmarshaled into a slice or a struct of supported types. It can also be unmarshaled into a single string, in which case tokenization is skipped.
Unmarshaling multiple tokens into a single non-string variable, or more tokens than a struct has fields, results in an error. Having fewer tokens than struct fields is allowed; the remaining fields are zeroed. When unmarshaling to a slice, decoded values are appended; existing values are untouched.
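The token-count rules above can be illustrated without strum itself. The following is a hand-written sketch, not strum's implementation: `person` and `decodePerson` are hypothetical names, and tokenization is assumed to be plain whitespace splitting.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// person is a hypothetical target type used only for this sketch.
type person struct {
	Name string
	Age  int
}

// decodePerson mimics the documented token-count rules by hand:
// too many tokens is an error, too few leaves the remaining fields zeroed.
func decodePerson(line string) (person, error) {
	var p person
	tokens := strings.Fields(line)
	if len(tokens) > 2 {
		return p, errors.New("too many tokens for struct")
	}
	if len(tokens) > 0 {
		p.Name = tokens[0]
	}
	if len(tokens) > 1 {
		age, err := strconv.Atoi(tokens[1])
		if err != nil {
			return p, err
		}
		p.Age = age
	}
	return p, nil
}

func main() {
	p, err := decodePerson("John 42")
	fmt.Println(p, err) // {John 42} <nil>

	p, err = decodePerson("Jane")
	fmt.Println(p, err) // {Jane 0} <nil>

	_, err = decodePerson("Jim 30 extra")
	fmt.Println(err) // too many tokens for struct
}
```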
strum supports the following types:
- strings
- booleans (like strconv.ParseBool, but case-insensitive)
- integers (signed and unsigned, all widths)
- floats (32-bit and 64-bit)
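As a rough illustration of what case-insensitive boolean parsing means, one possible approach is to lower-case the token before handing it to `strconv.ParseBool`. This is a sketch under that assumption; `parseBoolFold` is a hypothetical helper and not necessarily how strum does it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBoolFold is a hypothetical helper: it lower-cases the token so that
// strconv.ParseBool also accepts spellings such as "TrUe" or "FALSE".
func parseBoolFold(s string) (bool, error) {
	return strconv.ParseBool(strings.ToLower(s))
}

func main() {
	for _, s := range []string{"true", "TrUe", "F", "0", "maybe"} {
		b, err := parseBoolFold(s)
		fmt.Println(s, b, err)
	}
}
```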
Additionally, there is special support for certain types:
- time.Duration
- time.Time
- any type implementing encoding.TextUnmarshaler
- pointers to supported types (which will auto-instantiate)
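To make the last two items concrete, here is a minimal sketch of a type that implements `encoding.TextUnmarshaler`, together with the allocation step that filling a nil pointer requires. The names `level` and `fill` are hypothetical and exist only for illustration; how strum performs these steps internally is not shown in this documentation.

```go
package main

import (
	"encoding"
	"fmt"
	"strings"
)

// level is a hypothetical user-defined type decoded from text.
type level int

const (
	low level = iota
	high
)

// UnmarshalText implements encoding.TextUnmarshaler for *level.
func (l *level) UnmarshalText(text []byte) error {
	switch strings.ToLower(string(text)) {
	case "low":
		*l = low
	case "high":
		*l = high
	default:
		return fmt.Errorf("unknown level %q", text)
	}
	return nil
}

// Compile-time check that *level satisfies the interface.
var _ encoding.TextUnmarshaler = (*level)(nil)

// fill is a hypothetical helper: it allocates *dst when it is nil and then
// unmarshals the token into it.
func fill(dst **level, token string) error {
	if *dst == nil {
		*dst = new(level)
	}
	return (*dst).UnmarshalText([]byte(token))
}

func main() {
	var p *level // starts out nil
	if err := fill(&p, "HIGH"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(*p == high) // true
}
```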
For numeric types, all Go literal formats are supported, including base prefixes (`0xff`) and underscores (`1_000_000`) for integers.
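The standard library already parses integers in Go literal syntax when `strconv.ParseInt` is called with base 0, so the accepted formats can be explored with a short sketch. `parseIntLiteral` is a hypothetical wrapper; whether strum calls `strconv` in exactly this way is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseIntLiteral parses s using Go integer-literal syntax. Base 0 tells
// strconv.ParseInt to infer the base from the prefix and to allow underscores.
func parseIntLiteral(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

func main() {
	for _, s := range []string{"42", "0xff", "0b101", "0o17", "1_000_000"} {
		n, err := parseIntLiteral(s)
		fmt.Println(s, n, err)
	}
}
```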
For time.Time, strum detects and parses a wide variety of formats using the github.com/araddon/dateparse library. By default, it favors the United States interpretation of MM/DD/YYYY and has time zone semantics equivalent to `time.Parse`. strum also allows specifying a custom parser instead.
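The two defaults mentioned above can be illustrated with the standard library alone. The sketch below uses `time.Parse` with explicit layouts rather than dateparse, so it shows the semantics being described and not strum's actual code path; `parseUS` and `parseDayFirst` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"time"
)

// parseUS reads s as MM/DD/YYYY, the United States ordering.
func parseUS(s string) (time.Time, error) {
	return time.Parse("01/02/2006", s)
}

// parseDayFirst reads s as DD/MM/YYYY, shown only for comparison.
func parseDayFirst(s string) (time.Time, error) {
	return time.Parse("02/01/2006", s)
}

func main() {
	us, _ := parseUS("03/04/2021")
	df, _ := parseDayFirst("03/04/2021")

	// With no zone in the input, time.Parse returns a time in UTC.
	fmt.Println(us.Format("2006-01-02"), us.Location()) // 2021-03-04 UTC
	fmt.Println(df.Format("2006-01-02"), df.Location()) // 2021-04-03 UTC
}
```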
strum provides `DecodeAll` to unmarshal all lines of input at once.
Example (Synopsis) ¶
	package main

	import (
		"log"
		"os"

		"github.com/xdg-go/strum"
	)

	func main() {
		var err error

		d := strum.NewDecoder(os.Stdin)

		// Decode a line to a single int
		var x int
		err = d.Decode(&x)
		if err != nil {
			log.Fatal(err)
		}

		// Decode a line to a slice of int
		var xs []int
		err = d.Decode(&xs)
		if err != nil {
			log.Fatal(err)
		}

		// Decode a line to a struct
		type person struct {
			Name string
			Age  int
		}
		var p person
		err = d.Decode(&p)
		if err != nil {
			log.Fatal(err)
		}

		// Decode all lines to a slice of struct
		var people []person
		err = d.DecodeAll(&people)
		if err != nil {
			log.Fatal(err)
		}
	}
Index ¶
- func Unmarshal(data []byte, v interface{}) error
- type DateParser
- type Decoder
- func (d *Decoder) Decode(v interface{}) error
- func (d *Decoder) DecodeAll(v interface{}) error
- func (d *Decoder) Tokens() ([]string, error)
- func (d *Decoder) WithDateParser(dp DateParser) *Decoder
- func (d *Decoder) WithSplitOn(sep string) *Decoder
- func (d *Decoder) WithTokenRegexp(re *regexp.Regexp) *Decoder
- func (d *Decoder) WithTokenizer(t Tokenizer) *Decoder
- type Tokenizer
Functions ¶
func Unmarshal ¶
func Unmarshal(data []byte, v interface{}) error
Types ¶
type DateParser ¶ added in v0.1.0
A DateParser parses a string into a time.Time struct.
type Decoder ¶
type Decoder struct {
// contains filtered or unexported fields
}
A Decoder converts an input stream into Go types.
func NewDecoder ¶
NewDecoder returns a Decoder that reads from r. The default Decoder tokenizes each line with the `strings.Fields` function. The default date parser is github.com/araddon/dateparse.ParseAny.
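Because the default tokenizer is `strings.Fields`, runs of whitespace collapse and leading or trailing whitespace is ignored. A short sketch of that behavior follows; `tokenize` is a hypothetical wrapper used only to name the step.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenize splits a line the way strings.Fields does: on any run of
// whitespace, never producing empty tokens.
func tokenize(line string) []string {
	return strings.Fields(line)
}

func main() {
	fmt.Printf("%q\n", tokenize("  John   42\ttrue ")) // ["John" "42" "true"]
}
```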
func (*Decoder) Decode ¶
func (d *Decoder) Decode(v interface{}) error
Decode reads the next line of input and stores it in the value pointed to by `v`. It returns `io.EOF` when no more data is available.
Example (Struct) ¶
	package main

	import (
		"bytes"
		"fmt"
		"io"
		"log"
		"strings"
		"time"

		"github.com/xdg-go/strum"
	)

	func main() {
		type person struct {
			Name   string
			Age    int
			Active bool
			Joined time.Time
		}

		lines := []string{
			"John 42 true 2020-03-01T00:00:00Z",
			"Jane 23 false 2022-02-22T00:00:00Z",
		}

		r := bytes.NewBufferString(strings.Join(lines, "\n"))
		d := strum.NewDecoder(r)

		for {
			var p person
			err := d.Decode(&p)
			if err == io.EOF {
				return
			}
			if err != nil {
				log.Fatal(err)
			}
			fmt.Println(p)
		}
	}

Output:

	{John 42 true 2020-03-01 00:00:00 +0000 UTC}
	{Jane 23 false 2022-02-22 00:00:00 +0000 UTC}
func (*Decoder) DecodeAll ¶ added in v0.0.2
func (d *Decoder) DecodeAll(v interface{}) error
DecodeAll reads the remaining lines of input into `v`, where `v` must be a pointer to a slice of a type that would be valid for `Decode`. It works as if `Decode` were called for each line and the resulting values were appended to the slice. If `v` points to an uninitialized slice, the slice will be created. DecodeAll returns `nil` when EOF is reached.
Example (Ints) ¶
	package main

	import (
		"bytes"
		"fmt"
		"log"
		"strings"

		"github.com/xdg-go/strum"
	)

	func main() {
		lines := []string{
			"42",
			"23",
		}

		r := bytes.NewBufferString(strings.Join(lines, "\n"))
		d := strum.NewDecoder(r)

		var xs []int
		err := d.DecodeAll(&xs)
		if err != nil {
			log.Fatalf("decoding error: %v", err)
		}

		for _, x := range xs {
			fmt.Printf("%d\n", x)
		}
	}

Output:

	42
	23
Example (Struct) ¶
	package main

	import (
		"bytes"
		"fmt"
		"log"
		"strings"
		"time"

		"github.com/xdg-go/strum"
	)

	func main() {
		type person struct {
			Name   string
			Age    int
			Active bool
			Joined time.Time
		}

		lines := []string{
			"John 42 true 2020-03-01T00:00:00Z",
			"Jane 23 false 2022-02-22T00:00:00Z",
		}

		r := bytes.NewBufferString(strings.Join(lines, "\n"))
		d := strum.NewDecoder(r)

		var people []person
		err := d.DecodeAll(&people)
		if err != nil {
			log.Fatalf("decoding error: %v", err)
		}

		for _, p := range people {
			fmt.Printf("%v\n", p)
		}
	}

Output:

	{John 42 true 2020-03-01 00:00:00 +0000 UTC}
	{Jane 23 false 2022-02-22 00:00:00 +0000 UTC}
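The "decode every line and append" behavior described for DecodeAll can be approximated with the standard library. This is a sketch of the semantics only, limited to one integer per line; `appendInts` is a hypothetical stand-in and does not reflect strum's internals.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// appendInts is a hypothetical stand-in for decoding every line into an int:
// each parsed value is appended to xs, so existing elements are kept.
func appendInts(input string, xs []int) ([]int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		x, err := strconv.Atoi(strings.TrimSpace(sc.Text()))
		if err != nil {
			return xs, err
		}
		xs = append(xs, x)
	}
	// Reaching the end of input is not an error; only scanner failures are reported.
	return xs, sc.Err()
}

func main() {
	var xs []int // nil slice: append allocates it on first use
	xs, err := appendInts("42\n23", xs)
	fmt.Println(xs, err) // [42 23] <nil>
}
```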
func (*Decoder) Tokens ¶
func (d *Decoder) Tokens() ([]string, error)
Tokens consumes a line of input and returns all strings generated by the tokenizer. It is used internally by `Decode`, but is also available for testing or for skipping over a line of input that should not be decoded.
func (*Decoder) WithDateParser ¶ added in v0.1.0
func (d *Decoder) WithDateParser(dp DateParser) *Decoder
WithDateParser modifies a Decoder to use a custom date parsing function.
func (*Decoder) WithSplitOn ¶
func (d *Decoder) WithSplitOn(sep string) *Decoder
WithSplitOn modifies a Decoder to split fields on a separator string.
Example ¶
	package main

	import (
		"bytes"
		"fmt"
		"io"
		"log"

		"github.com/xdg-go/strum"
	)

	func main() {
		type person struct {
			Last  string
			First string
		}

		text := "Doe,John"
		r := bytes.NewBufferString(text)
		d := strum.NewDecoder(r).WithSplitOn(",")

		var p person
		err := d.Decode(&p)
		if err != nil && err != io.EOF {
			log.Fatal(err)
		}
		fmt.Println(p)
	}

Output:

	{Doe John}
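Splitting on a separator string differs from whitespace tokenization in one notable way: adjacent separators yield empty tokens. The sketch below assumes the separator behaves like `strings.Split`, which this documentation does not state explicitly; `splitOn` is a hypothetical wrapper.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOn splits a line on sep. Assumption: this mirrors strings.Split, so
// consecutive separators produce empty tokens rather than being collapsed.
func splitOn(line, sep string) []string {
	return strings.Split(line, sep)
}

func main() {
	fmt.Printf("%q\n", splitOn("Doe,John", ","))  // ["Doe" "John"]
	fmt.Printf("%q\n", splitOn("Doe,,John", ",")) // ["Doe" "" "John"]
}
```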
func (*Decoder) WithTokenRegexp ¶
func (d *Decoder) WithTokenRegexp(re *regexp.Regexp) *Decoder
WithTokenRegexp modifies a Decoder to use a regular expression to extract tokens. The regular expression is applied with `FindStringSubmatch` to each line of input, so it must encompass an entire line of input. If the line fails to match or if the regular expression has no subexpressions, an error is returned.
Example ¶
	package main

	import (
		"bytes"
		"fmt"
		"io"
		"log"
		"regexp"

		"github.com/xdg-go/strum"
	)

	func main() {
		type jeans struct {
			Color  string
			Waist  int
			Inseam int
		}

		text := "Blue 36x32"
		r := bytes.NewBufferString(text)
		re := regexp.MustCompile(`^(\S+)\s+(\d+)x(\d+)`)
		d := strum.NewDecoder(r).WithTokenRegexp(re)

		var j jeans
		err := d.Decode(&j)
		if err != nil && err != io.EOF {
			log.Fatal(err)
		}
		fmt.Println(j)
	}

Output:

	{Blue 36 32}
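The submatch-based tokenization described for WithTokenRegexp can be sketched with the `regexp` package directly. This is an illustration of the documented rules (the capture groups become the tokens; a non-matching line or a pattern without subexpressions is an error), not strum's own code. `submatchTokens`, `jeansTokens`, and the error messages are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// jeansRE captures a color, a waist size, and an inseam from one line.
var jeansRE = regexp.MustCompile(`^(\S+)\s+(\d+)x(\d+)$`)

// submatchTokens returns the capture groups of re for line.
func submatchTokens(re *regexp.Regexp, line string) ([]string, error) {
	if re.NumSubexp() == 0 {
		return nil, errors.New("regexp has no subexpressions")
	}
	m := re.FindStringSubmatch(line)
	if m == nil {
		return nil, errors.New("line does not match regexp")
	}
	// m[0] is the whole match; the capture groups follow it.
	return m[1:], nil
}

// jeansTokens applies the example pattern to one line.
func jeansTokens(line string) ([]string, error) {
	return submatchTokens(jeansRE, line)
}

func main() {
	toks, err := jeansTokens("Blue 36x32")
	fmt.Printf("%q %v\n", toks, err) // ["Blue" "36" "32"] <nil>

	_, err = jeansTokens("Blue 36")
	fmt.Println(err) // line does not match regexp
}
```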
func (*Decoder) WithTokenizer ¶
func (d *Decoder) WithTokenizer(t Tokenizer) *Decoder
WithTokenizer modifies a Decoder to use a custom tokenizing function.
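A custom tokenizer is useful when neither a plain separator nor a single regular expression fits, for example when fields may be quoted. The sketch below shows one such function built on `encoding/csv`. Its signature, `func(string) ([]string, error)`, is an assumption, since the `Tokenizer` declaration is not shown here; `csvTokenizer` is a hypothetical name.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// csvTokenizer splits one line into fields using CSV quoting rules, so a
// quoted field may contain the separator. The func(string) ([]string, error)
// shape is assumed, not taken from strum.
func csvTokenizer(line string) ([]string, error) {
	return csv.NewReader(strings.NewReader(line)).Read()
}

func main() {
	toks, err := csvTokenizer(`"Doe, Jr.",John,42`)
	fmt.Printf("%q %v\n", toks, err) // ["Doe, Jr." "John" "42"] <nil>
}
```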