lexy

v0.5.1
Published: Sep 14, 2024 License: MIT Imports: 7 Imported by: 0

README

lexy

Lexicographical Byte Order Encodings

Lexy is a library for order-preserving lexicographical binary encodings. Most common Go types and user-defined types are supported, and encodings may be ordered differently than a type's natural ordering. Lexy uses generics and requires Go 1.19 or later; it has been tested with Go 1.19 and 1.22. Lexy has no non-test dependencies.

It may be more efficient to use another encoding if lexicographical unsigned byte ordering is not needed. Lexy's primary purpose is to make it easier to use an ordered key-value store, an ordered binary trie, or similar.

The primary interface in lexy is Codec, with this definition (details in Go docs):

type Codec[T any] interface {
    // Append encodes value and appends the encoded bytes to buf,
    // returning the updated buffer.
    Append(buf []byte, value T) []byte

    // Put encodes value into buf,
    // returning buf following what was written.
    Put(buf []byte, value T) []byte

    // Get decodes a value of type T from buf,
    // returning the value and buf following the encoded value.
    Get(buf []byte) (T, []byte)

    // RequiresTerminator returns whether encoded values require
    // a terminator and escaping if more data is written following
    // the encoded value. This is the case for most unbounded types
    // like strings and slices, as well as types whose encodings
    // can be zero bytes.
    RequiresTerminator() bool
}

A typical use might look something like this:

type Word string
type Key []Word
type Value struct {
  // ...
}

// keyCodec is safe for concurrent use.
// The terser functions lexy.SliceOf and lexy.String can be used
// if the types involved are the same as their underlying types,
// string and []string in this case. That would look like this:
//
//   var keyCodec = lexy.SliceOf(lexy.String())
//
var keyCodec = lexy.CastSliceOf[Key](lexy.CastString[Word]())

// lexy could be used here,
// but may be overkill if ordered Values aren't needed.
func EncodeValue(v *Value) ([]byte, error) { /* ... */ }
func DecodeValue(b []byte) (*Value, error) { /* ... */ }

type KeyValueDB struct {
    providerDB *provider.DB
    // ...
}

func (db *KeyValueDB) Put(key Key, value *Value) error {
    // If keyCodec could encode zero bytes, this might be preferable
    //
    //   keyBytes := keyCodec.Append([]byte{}, key)
    //
    keyBytes := keyCodec.Append(nil, key)
    valueBytes, err := EncodeValue(value)
    if err != nil {
        return err
    }
    return db.providerDB.Put(keyBytes, valueBytes)
}

func (db *KeyValueDB) Get(key Key) (*Value, error) {
    keyBytes := keyCodec.Append(nil, key)
    valueBytes, err := db.providerDB.Get(keyBytes)
    if err != nil {
        return nil, err
    }
    return DecodeValue(valueBytes)
}

All Codecs provided by lexy are safe for concurrent use if their delegate Codecs (if any) are.

Codecs do not normally encode a value's type; users must know what is being decoded. This aligns with best practices in Go: types should be known at compile time. A user-defined Codec handling multiple types could be created, but it is not recommended, and it would still require a concrete wrapper type to conform to the Codec[T] interface.

Different Codecs will generally not produce encodings with consistent orderings with respect to each other. For example, the encoding for int8(1) will be lexicographically greater than the encoding for uint8(100).
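The int8/uint8 case can be reproduced with a small standard-library sketch. It assumes the common technique of flipping the sign bit of a signed integer so that negative values sort first as unsigned bytes; lexy's exact byte layout is not shown in this document and may differ. encodeInt8 and encodeUint8 are hypothetical helpers.

```go
package main

import "fmt"

// encodeUint8 is the identity: a single byte already sorts as unsigned.
func encodeUint8(v uint8) byte { return v }

// encodeInt8 flips the sign bit, mapping -128..127 onto 0x00..0xFF
// so that unsigned byte order matches signed numeric order.
func encodeInt8(v int8) byte { return byte(v) ^ 0x80 }

func main() {
	fmt.Printf("int8(-1)   -> 0x%02X\n", encodeInt8(-1))   // 0x7F
	fmt.Printf("int8(1)    -> 0x%02X\n", encodeInt8(1))    // 0x81
	fmt.Printf("uint8(100) -> 0x%02X\n", encodeUint8(100)) // 0x64
	fmt.Println(encodeInt8(1) > encodeUint8(100))          // true
}
```

Each encoding is ordered correctly on its own, but mixing the two in one key space puts int8(1) after uint8(100).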

The Codecs provided by lexy can encode nil to be less than or greater than the encodings for non-nil values, for types that allow nil values.

Lexy provides order-preserving Codecs for the following types.

  • bool
  • uint8 (aka byte), uint16, uint32, uint64
  • int8, int16, int32 (aka rune), int64
  • uint, int (encoded as 64-bit values)
  • float32, float64
  • *big.Int (package math/big)
  • *big.Float (package math/big; does not encode Accuracy)
  • string
  • time.Time (encodes timezone offset, but not its name)
  • time.Duration
  • pointers (also encodes the referent)
  • slices
  • []byte (optimized for byte slices)

Lexy provides Codecs for the following types which either have no natural ordering, or whose natural ordering cannot be preserved while being encoded at full precision.

  • maps
  • complex64, complex128
  • *big.Rat (package math/big)

Lexy provides these additional Codecs.

  • A Codec for types with no value except the zero value, useful for the value types of maps used as sets.
  • A Codec which reverses the lexicographical ordering of another Codec.
  • A Codec which terminates and escapes the encodings of another Codec.

Lexy does not provide Codecs for the following types, but user-defined Codecs are easy to create. See the Go docs for examples.

  • structs
    The inherent limitations of generic types in Go make it impossible to do this in a general way without having a separate parallel set of non-generic codecs. This is not a bad thing; resolving types at compile time is one of the reasons Go is so efficient. Creating a strongly-typed user-defined Codec is a much simpler and safer alternative, and also prevents silently changing an encoding when the data type it encodes is changed.
  • arrays
    While it is possible to create a general Codec for array types, the generics are very messy and it requires using reflection extensively. As is the case for structs, creating a strongly-typed user-defined Codec is a better option.
  • uintptr
    This type has an implementation-specific size, and encoding a pointer without encoding what it points to doesn't make much sense.
  • functions, interfaces, channels

Documentation

Overview

Package lexy defines an API for lexicographically ordered binary encodings. Implementations are provided for most builtin Go data types, and supporting functionality is provided to allow for the creation of user-defined encodings.

The Codec[T] interface defines an encoding, with methods to encode and decode values of type T. Functions returning Codecs for different types constitute the majority of this API. There are two kinds of Codec-returning functions defined by this package: those for which Go can infer the type arguments, and those for which Go cannot. The former have terser names, as in Int16. The latter have names starting with "Cast", as in CastInt16[MyIntType]. These latter functions are only needed when creating a Codec for a type that is not the same as its underlying type. Empty also requires a type argument when used and is the only exception to this naming convention.
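The naming convention follows from how Go handles type arguments. The sketch below redefines two functions with the same names locally to show the pattern; it is a guess at the shape of the API, not lexy's source. Appender is a cut-down, hypothetical stand-in for Codec[T], and Celsius is an example type.

```go
package main

import "fmt"

// Appender is a reduced stand-in for Codec[T], for illustration only.
type Appender[T any] interface {
	Append(buf []byte, value T) []byte
}

// int16Codec works for any type whose underlying type is int16.
type int16Codec[T ~int16] struct{}

func (int16Codec[T]) Append(buf []byte, value T) []byte {
	u := uint16(value) ^ 0x8000 // flip the sign bit so negatives sort first
	return append(buf, byte(u>>8), byte(u))
}

// Int16 has no type parameter, so callers never write one.
func Int16() Appender[int16] { return int16Codec[int16]{} }

// CastInt16 has a type parameter that appears in no argument,
// so Go cannot infer it and callers must supply it explicitly.
func CastInt16[T ~int16]() Appender[T] { return int16Codec[T]{} }

type Celsius int16

func main() {
	plain := Int16()
	typed := CastInt16[Celsius]()
	fmt.Printf("% X\n", plain.Append(nil, -5))          // 7F FB
	fmt.Printf("% X\n", typed.Append(nil, Celsius(-5))) // 7F FB
}
```

Both calls produce the same bytes; the Cast form only changes the Go type the codec accepts and returns.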

All Codecs provided by lexy are safe for concurrent use if their delegate Codecs (if any) are.

All Codecs provided by lexy will order nils first if nil can be encoded. Invoking NilsLast(codec) on a Codec will return a Codec which orders nils last, but only for the pointer, slice, map, []byte, and *big.Int/Float/Rat Codecs provided by lexy.

See [Codec.RequiresTerminator] for details on when escaping and terminating encoded bytes is required.

These Codec-returning functions do not require specifying a type parameter when invoked.

These Codec-returning functions require specifying a type parameter when invoked.

These are implementations of Prefix, used when creating user-defined Codecs that can encode types whose instances can be nil.

Example (Array)

ExampleArray shows how to define a Codec for an array type.

package main

import (
	"bytes"
	"fmt"
	"math"

	"github.com/phiryll/lexy"
)

type Quaternion [4]float64

type quaternionCodec struct{}

var quatCodec lexy.Codec[Quaternion] = quaternionCodec{}

func (quaternionCodec) Append(buf []byte, value Quaternion) []byte {
	for i := range value {
		buf = lexy.Float64().Append(buf, value[i])
	}
	return buf
}

func (quaternionCodec) Put(buf []byte, value Quaternion) []byte {
	for i := range value {
		buf = lexy.Float64().Put(buf, value[i])
	}
	return buf
}

func (quaternionCodec) Get(buf []byte) (Quaternion, []byte) {
	var value Quaternion
	for i := range value {
		value[i], buf = lexy.Float64().Get(buf)
	}
	return value, buf
}

func (quaternionCodec) RequiresTerminator() bool {
	return false
}

// ExampleArray shows how to define a Codec for an array type.
func main() {
	quats := []Quaternion{
		{0.0, 3.4, 2.1, -1.5},
		{-9.3e+10, 7.6, math.Inf(1), 42.0},
	}
	for _, quat := range quats {
		appendBuf := quatCodec.Append(nil, quat)
		putBuf := make([]byte, 4*8)
		quatCodec.Put(putBuf, quat)
		fmt.Println(bytes.Equal(appendBuf, putBuf))
		decoded, _ := quatCodec.Get(appendBuf)
		fmt.Println(decoded)
	}
}
Output:
true
[0 3.4 2.1 -1.5]
true
[-9.3e+10 7.6 +Inf 42]

Example (PointerToStruct)

ExamplePointerToStruct shows how to use pointers for efficiency in a user-defined Codec, to avoid unnecessarily copying large data structures. Note that types in Go other than structs and arrays do not have this problem. Complex numbers, strings, pointers, slices, and maps all have a relatively small footprint when passed by value. The same is true of time.Time and time.Duration instances.

Normally, a Codec[BigStruct] would be defined and Container's Codec would use it as lexy.PointerTo(bigStructCodec). However, calls to a Codec[BigStruct] will pass BigStruct instances by value, even though the wrapping pointer Codec is only copying pointers.

The order isn't relevant for this example, so other fields are not shown.

package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

type BigStruct struct {
	name string
	// ... big fields, inefficient to copy
}

type Container struct {
	big *BigStruct
	// ... other fields, but not large ones
	// If Container were also large, use the same technique
	// to create a Codec[*Container] instead.
}

var (
	PtrToBigStructCodec lexy.Codec[*BigStruct] = ptrToBigStructCodec{}
	ContainerCodec      lexy.Codec[Container]  = containterCodec{}
)

type ptrToBigStructCodec struct{}

func (ptrToBigStructCodec) Append(buf []byte, value *BigStruct) []byte {
	done, buf := lexy.PrefixNilsFirst.Append(buf, value == nil)
	if done {
		return buf
	}
	buf = lexy.TerminatedString().Append(buf, value.name)
	// Append other fields.
	return buf
}

func (ptrToBigStructCodec) Put(buf []byte, value *BigStruct) []byte {
	done, buf := lexy.PrefixNilsFirst.Put(buf, value == nil)
	if done {
		return buf
	}
	buf = lexy.TerminatedString().Put(buf, value.name)
	// Put other fields.
	return buf
}

func (ptrToBigStructCodec) Get(buf []byte) (*BigStruct, []byte) {
	done, buf := lexy.PrefixNilsFirst.Get(buf)
	if done {
		return nil, buf
	}
	name, buf := lexy.TerminatedString().Get(buf)
	// Get other fields.
	return &BigStruct{name /* , other fields ... */}, buf
}

func (ptrToBigStructCodec) RequiresTerminator() bool {
	return false
}

type containterCodec struct{}

func (containterCodec) Append(buf []byte, value Container) []byte {
	buf = PtrToBigStructCodec.Append(buf, value.big)
	// Append other fields.
	return buf
}

func (containterCodec) Put(buf []byte, value Container) []byte {
	buf = PtrToBigStructCodec.Put(buf, value.big)
	// Put other fields.
	// buf = someCodec.Put(buf, someValue)
	return buf
}

func (containterCodec) Get(buf []byte) (Container, []byte) {
	big, buf := PtrToBigStructCodec.Get(buf)
	// Get other fields.
	// someValue, buf := someCodec.Get(buf)
	// ...
	return Container{big /* , other fields ... */}, buf
}

func (containterCodec) RequiresTerminator() bool {
	return false
}

// This is only used for printing output in the example.
func containerEquals(a, b Container) bool {
	if a.big == nil && b.big == nil {
		return true
	}
	if a.big == nil || b.big == nil {
		return false
	}
	return *a.big == *b.big
}

func main() {
	for _, value := range []Container{
		{nil},
		{&BigStruct{""}},
		{&BigStruct{"abc"}},
	} {
		buf := ContainerCodec.Append(nil, value)
		decoded, _ := ContainerCodec.Get(buf)
		fmt.Println(containerEquals(value, decoded))
	}
}
Output:
true
true
true

Example (RangeQuery)

ExampleRangeQuery shows how a range query might be implemented.

package main

import (
	"bytes"
	"fmt"
	"sort"

	"github.com/phiryll/lexy"
)

// BEGIN TOY DB IMPLEMENTATION

type DB struct {
	entries []Entry // kept sorted by Entry.Key
}

type Entry struct {
	Key   []byte
	Value int // value type is unimportant for this example
}

func (db *DB) insert(i int, entry Entry) {
	db.entries = append(db.entries, Entry{nil, 0})
	copy(db.entries[i+1:], db.entries[i:])
	db.entries[i] = entry
}

func (db *DB) search(entry Entry) (int, bool) {
	index := sort.Search(len(db.entries), func(i int) bool {
		return bytes.Compare(entry.Key, db.entries[i].Key) <= 0
	})
	if index < len(db.entries) && bytes.Equal(entry.Key, db.entries[index].Key) {
		return index, true
	}
	return index, false
}

func (db *DB) Put(key []byte, value int) error {
	entry := Entry{key, value}
	if i, found := db.search(entry); found {
		db.entries[i] = entry
	} else {
		db.insert(i, entry)
	}
	return nil
}

// Returns Entries, in order, such that (begin <= entry.Key < end).
func (db *DB) Range(begin, end []byte) ([]Entry, error) {
	a, _ := db.search(Entry{begin, 0})
	b, _ := db.search(Entry{end, 0})
	return db.entries[a:b], nil
}

// END TOY DB IMPLEMENTATION

// BEGIN KEY CODEC

var (
	wordsCodec = lexy.Terminate(lexy.SliceOf(lexy.String()))
	costCodec  = lexy.Int32()
	KeyCodec   = keyCodec{}
)

type keyCodec struct{}

func (keyCodec) Append(buf []byte, key UserKey) []byte {
	buf = costCodec.Append(buf, key.cost)
	return wordsCodec.Append(buf, key.words)
}

func (keyCodec) Put(buf []byte, key UserKey) []byte {
	buf = costCodec.Put(buf, key.cost)
	return wordsCodec.Put(buf, key.words)
}

func (keyCodec) Get(buf []byte) (UserKey, []byte) {
	cost, buf := costCodec.Get(buf)
	words, buf := wordsCodec.Get(buf)
	return UserKey{words, cost}, buf
}

func (keyCodec) RequiresTerminator() bool {
	return false
}

// END KEY CODEC

// BEGIN USER DB ABSTRACTION

type UserKey struct {
	words []string
	cost  int32
}

func (k UserKey) String() string {
	return fmt.Sprintf("{%d, %v}", k.cost, k.words)
}

type UserDB struct {
	realDB DB
}

type UserEntry struct {
	Key   UserKey
	Value int
}

func (db *UserDB) Put(key UserKey, value int) error {
	return db.realDB.Put(KeyCodec.Append(nil, key), value)
}

// Returns Entries, in order, such that (begin <= entry.Key < end).
func (db *UserDB) Range(begin, end UserKey) ([]UserEntry, error) {
	beginBytes := KeyCodec.Append(nil, begin)
	endBytes := KeyCodec.Append(nil, end)
	dbEntries, err := db.realDB.Range(beginBytes, endBytes)
	if err != nil {
		return nil, err
	}
	userEntries := make([]UserEntry, len(dbEntries))
	for i, dbEntry := range dbEntries {
		userKey, _ := KeyCodec.Get(dbEntry.Key)
		userEntries[i] = UserEntry{userKey, dbEntry.Value}
	}
	return userEntries, nil
}

// END USER DB ABSTRACTION

// ExampleRangeQuery shows how a range query might be implemented.
func main() {
	var userDB UserDB
	for _, item := range []struct {
		cost  int32
		words []string
		value int
	}{
		// In sort order for clarity: cost, then words
		{1, []string{"not"}, 0},
		{1, []string{"not", "the"}, 0},
		{1, []string{"not", "the", "end"}, 0},
		{1, []string{"now"}, 0},

		{2, []string{"iffy", "proposal"}, 0},
		{2, []string{"in"}, 0},
		{2, []string{"in", "cahoots"}, 0},
		{2, []string{"in", "sort"}, 0},
		{2, []string{"in", "sort", "order"}, 0},
		{2, []string{"integer", "sort"}, 0},
	} {
		err := userDB.Put(UserKey{item.words, item.cost}, item.value)
		if err != nil {
			panic(err)
		}
	}

	printRange := func(low, high UserKey) {
		fmt.Printf("Range: %s -> %s\n", low.String(), high.String())
		entries, err := userDB.Range(low, high)
		if err != nil {
			panic(err)
		}
		for _, userEntry := range entries {
			fmt.Println(userEntry.Key.String())
		}
	}

	printRange(
		UserKey{[]string{"an"}, -1000},
		UserKey{[]string{"empty", "result"}, 1})
	printRange(
		UserKey{[]string{}, 1},
		UserKey{[]string{"not", "the", "beginning"}, 1})
	printRange(
		UserKey{[]string{"nouns", "are", "words"}, 1},
		UserKey{[]string{"in", "sort", "disorder"}, 2})
}
Output:
Range: {-1000, [an]} -> {1, [empty result]}
Range: {1, []} -> {1, [not the beginning]}
{1, [not]}
{1, [not the]}
Range: {1, [nouns are words]} -> {2, [in sort disorder]}
{1, [now]}
{2, [iffy proposal]}
{2, [in]}
{2, [in cahoots]}
{2, [in sort]}

Example (SchemaChange)

ExampleSchemaChange shows one way to allow for schema changes. The gist of this example is to encode field names as well as field values. This can be done in other ways, and more or less leniently. This is just an example.

Note that different encodings of the same type will generally not be ordered correctly with respect to each other, regardless of the technique used.

Only field values should be encoded if any of the following are true:

  • the schema is expected to never change, or
  • the encoded data will be replaced wholesale if the schema changes, or
  • schema versioning is used (see the schema version example).

The kinds of schema change addressed by this example are:

  • field added
  • field removed
  • field renamed

If a field's type might change, the best option is to use versioning. Otherwise, it would be necessary to encode the field's type before its value, because there's no way to know how to read the value otherwise, and then the type would be the primary sort key for that field. Encoding a value's type is strongly discouraged.

The sort order of encoded data cannot be changed. However, there is nothing wrong with creating multiple Codecs with different orderings for the same type, nor with storing the same data ordered in different ways in the same data store.

package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

// The previous version of the type, used here to create already existing data.
// Unless versioning is being used (see the schema version example),
// this would be the same type as schema, just earlier in the code's history.
// So both would not normally exist at the same time.
type schemaPrevious struct {
	name     string
	lastName string
	count    uint16
}

// The current version of the type.
type schema struct {
	firstName  string // renamed from "name"
	middleName string // added
	lastName   string
	// count      uint16 // removed
}

var (
	nameCodec     = lexy.TerminatedString()
	countCodec    = lexy.Uint16()
	PreviousCodec = previousCodec{}
	SchemaCodec   = schemaCodec{}
)

type previousCodec struct{}

func (previousCodec) Append(buf []byte, value schemaPrevious) []byte {
	buf = nameCodec.Append(buf, "count")
	buf = countCodec.Append(buf, value.count)
	buf = nameCodec.Append(buf, "lastName")
	buf = nameCodec.Append(buf, value.lastName)
	buf = nameCodec.Append(buf, "name")
	return nameCodec.Append(buf, value.name)
}

func (previousCodec) Put(_ []byte, _ schemaPrevious) []byte {
	panic("unused in this example")
}

func (previousCodec) Get(_ []byte) (schemaPrevious, []byte) {
	panic("unused in this example")
}

// Returns true because struct Codecs storing field name/value pairs
// to handle previous versions must be tolerant of missing fields.
// This Codec is essentially a map.
func (previousCodec) RequiresTerminator() bool {
	return true
}

// Other than handling the field changes, this Codec could change the sort order,
// although writing back to the same database index would corrupt the ordering.
// Because Get reads field names first, it is tolerant of field reorderings.
type schemaCodec struct{}

func (schemaCodec) Append(_ []byte, _ schema) []byte {
	panic("unused in this example")
}

func (schemaCodec) Put(_ []byte, _ schema) []byte {
	panic("unused in this example")
}

func (schemaCodec) Get(buf []byte) (schema, []byte) {
	var value schema
	for {
		if len(buf) == 0 {
			return value, buf
		}
		// Declaring these because "x, buf := ..." would declare a new buf,
		// and we need to keep the same buf throughout.
		var field string
		var firstName, middleName, lastName string
		field, buf = nameCodec.Get(buf)
		switch field {
		case "name", "firstName":
			// Field was renamed.
			firstName, buf = nameCodec.Get(buf)
			value.firstName = firstName
		case "middleName":
			// Field was added.
			middleName, buf = nameCodec.Get(buf)
			value.middleName = middleName
		case "lastName":
			lastName, buf = nameCodec.Get(buf)
			value.lastName = lastName
		case "count":
			// Field was removed, but we still need to read the value.
			_, buf = countCodec.Get(buf)
		default:
			// We must stop, we don't know how to proceed.
			panic(fmt.Sprintf("unrecognized field name %q", field))
		}
	}
}

// Returns true because struct Codecs storing field name/value pairs
// to handle previous versions must be tolerant of missing fields.
// This Codec is essentially a map.
func (schemaCodec) RequiresTerminator() bool {
	return true
}

func main() {
	for _, previous := range []schemaPrevious{
		{"Alice", "Jones", 35},
		{"", "Washington", 17},
		{"Cathy", "Spencer", 23},
	} {
		buf := PreviousCodec.Append(nil, previous)
		current, _ := SchemaCodec.Get(buf)
		fmt.Println(previous.name == current.firstName &&
			previous.lastName == current.lastName &&
			current.middleName == "")
	}
}
Output:
true
true
true

Example (SchemaVersion)

ExampleSchemaVersion shows how schema versioning could be implemented. This can be done in other ways, and more or less leniently. This is just an example, and likely a poorly structured one at that.

Note that different encodings of the same type will generally not be ordered correctly with respect to each other, regardless of the technique used.

The sort order of encoded data cannot be changed. However, there is nothing wrong with creating multiple Codecs with different orderings for the same type, nor with storing the same data ordered in different ways in the same data store.

package main

import (
	"fmt"
	"sort"

	"github.com/phiryll/lexy"
)

type schemaVersion1 struct {
	name string
}

type schemaVersion2 struct {
	name     string
	lastName string // added
}

type schemaVersion3 struct {
	name     string
	lastName string
	count    uint16 // added
}

// The current version of the type.
type schemaVersion4 struct {
	firstName  string // renamed from "name"
	middleName string // added
	lastName   string
	// count      uint16 // removed
}

var (
	// Which schema this returns will be updated as new versions are added.
	VersionedCodec lexy.Codec[schemaVersion4] = versionedCodec{}

	// The types of the Codecs can be inferred if using Go 1.21 or later.
	SchemaVersion1Codec lexy.Codec[schemaVersion1] = schemaVersion1Codec{}
	SchemaVersion2Codec lexy.Codec[schemaVersion2] = schemaVersion2Codec{}
	SchemaVersion3Codec lexy.Codec[schemaVersion3] = schemaVersion3Codec{}
	SchemaVersion4Codec lexy.Codec[schemaVersion4] = schemaVersion4Codec{}

	NameCodec  = lexy.TerminatedString()
	CountCodec = lexy.Uint16()
)

type versionedCodec struct{}

func (versionedCodec) Append(buf []byte, value schemaVersion4) []byte {
	buf = lexy.Uint32().Append(buf, 4)
	return SchemaVersion4Codec.Append(buf, value)
}

func (versionedCodec) Put(buf []byte, value schemaVersion4) []byte {
	buf = lexy.Uint32().Put(buf, 4)
	return SchemaVersion4Codec.Put(buf, value)
}

func (versionedCodec) Get(buf []byte) (schemaVersion4, []byte) {
	version, buf := lexy.Uint32().Get(buf)
	switch version {
	case 1:
		v1, newBuf := SchemaVersion1Codec.Get(buf)
		return schemaVersion4{v1.name, "", ""}, newBuf
	case 2:
		v2, newBuf := SchemaVersion2Codec.Get(buf)
		return schemaVersion4{v2.name, "", v2.lastName}, newBuf
	case 3:
		v3, newBuf := SchemaVersion3Codec.Get(buf)
		return schemaVersion4{v3.name, "", v3.lastName}, newBuf
	case 4:
		v4, newBuf := SchemaVersion4Codec.Get(buf)
		return v4, newBuf
	default:
		panic(fmt.Sprintf("unknown schema version: %d", version))
	}
}

func (versionedCodec) RequiresTerminator() bool {
	return false
}

// Version 1

type schemaVersion1Codec struct{}

func (schemaVersion1Codec) Append(buf []byte, value schemaVersion1) []byte {
	return NameCodec.Append(buf, value.name)
}

func (schemaVersion1Codec) Put(buf []byte, value schemaVersion1) []byte {
	return NameCodec.Put(buf, value.name)
}

func (schemaVersion1Codec) Get(buf []byte) (schemaVersion1, []byte) {
	name, buf := NameCodec.Get(buf)
	return schemaVersion1{name}, buf
}

func (schemaVersion1Codec) RequiresTerminator() bool {
	return false
}

// Version 2

type schemaVersion2Codec struct{}

func (schemaVersion2Codec) Append(buf []byte, value schemaVersion2) []byte {
	buf = NameCodec.Append(buf, value.lastName)
	return NameCodec.Append(buf, value.name)
}

func (schemaVersion2Codec) Put(buf []byte, value schemaVersion2) []byte {
	buf = NameCodec.Put(buf, value.lastName)
	return NameCodec.Put(buf, value.name)
}

func (schemaVersion2Codec) Get(buf []byte) (schemaVersion2, []byte) {
	lastName, buf := NameCodec.Get(buf)
	name, buf := NameCodec.Get(buf)
	return schemaVersion2{name, lastName}, buf
}

func (schemaVersion2Codec) RequiresTerminator() bool {
	return false
}

// Version 3

type schemaVersion3Codec struct{}

func (schemaVersion3Codec) Append(buf []byte, value schemaVersion3) []byte {
	buf = CountCodec.Append(buf, value.count)
	buf = NameCodec.Append(buf, value.lastName)
	return NameCodec.Append(buf, value.name)
}

func (schemaVersion3Codec) Put(buf []byte, value schemaVersion3) []byte {
	buf = CountCodec.Put(buf, value.count)
	buf = NameCodec.Put(buf, value.lastName)
	return NameCodec.Put(buf, value.name)
}

func (schemaVersion3Codec) Get(buf []byte) (schemaVersion3, []byte) {
	count, buf := CountCodec.Get(buf)
	lastName, buf := NameCodec.Get(buf)
	name, buf := NameCodec.Get(buf)
	return schemaVersion3{name, lastName, count}, buf
}

func (schemaVersion3Codec) RequiresTerminator() bool {
	return false
}

// Version 4

type schemaVersion4Codec struct{}

func (schemaVersion4Codec) Append(buf []byte, value schemaVersion4) []byte {
	buf = NameCodec.Append(buf, value.lastName)
	buf = NameCodec.Append(buf, value.firstName)
	return NameCodec.Append(buf, value.middleName)
}

func (schemaVersion4Codec) Put(buf []byte, value schemaVersion4) []byte {
	buf = NameCodec.Put(buf, value.lastName)
	buf = NameCodec.Put(buf, value.firstName)
	return NameCodec.Put(buf, value.middleName)
}

func (schemaVersion4Codec) Get(buf []byte) (schemaVersion4, []byte) {
	lastName, buf := NameCodec.Get(buf)
	firstName, buf := NameCodec.Get(buf)
	middleName, buf := NameCodec.Get(buf)
	return schemaVersion4{firstName, middleName, lastName}, buf
}

func (schemaVersion4Codec) RequiresTerminator() bool {
	return false
}

// A helper function for this test, to write older versions.
func writeWithVersion[T any](version uint32, codec lexy.Codec[T], value T) []byte {
	buf := lexy.Uint32().Append(nil, version)
	return codec.Append(buf, value)
}

func main() {
	// Encode data of a bunch of different versions and
	// throw all the encodings into the same slice.
	// Then make sure we can successfully decode them all.
	var encoded [][]byte

	// order: name
	for _, v1 := range []schemaVersion1{
		{"Bob"},
		{"Alice"},
		{"Cathy"},
	} {
		encoded = append(encoded, writeWithVersion(1, SchemaVersion1Codec, v1))
	}

	// order: lastName, name
	for _, v2 := range []schemaVersion2{
		{"Dave", "Thomas"},
		{"Edgar", "James"},
		{"Fiona", "Smith"},
	} {
		encoded = append(encoded, writeWithVersion(2, SchemaVersion2Codec, v2))
	}

	// order: count, lastName, name
	for _, v3 := range []schemaVersion3{
		{"Gloria", "Baker", 6},
		{"Henry", "Washington", 3},
		{"Isabel", "Bardot", 7},
	} {
		encoded = append(encoded, writeWithVersion(3, SchemaVersion3Codec, v3))
	}

	// order: lastName, firstName, middleName
	for _, v4 := range []schemaVersion4{
		{"Kevin", "Alex", "Monroe"},
		{"Jennifer", "Anne", "Monroe"},
		{"Lois", "Elizabeth", "Cassidy"},
	} {
		encoded = append(encoded, VersionedCodec.Append(nil, v4))
	}

	// When the encodings are sorted, they will be in the order:
	// - primary: version
	// - secondary: the encoded order for that version
	// sortableEncodings is defined in the Struct example.
	sort.Sort(sortableEncodings{encoded})

	for _, b := range encoded {
		value, _ := VersionedCodec.Get(b)
		fmt.Printf("%+v\n", value)
	}
}
Output:
{firstName:Alice middleName: lastName:}
{firstName:Bob middleName: lastName:}
{firstName:Cathy middleName: lastName:}
{firstName:Edgar middleName: lastName:James}
{firstName:Fiona middleName: lastName:Smith}
{firstName:Dave middleName: lastName:Thomas}
{firstName:Henry middleName: lastName:Washington}
{firstName:Gloria middleName: lastName:Baker}
{firstName:Isabel middleName: lastName:Bardot}
{firstName:Lois middleName:Elizabeth lastName:Cassidy}
{firstName:Jennifer middleName:Anne lastName:Monroe}
{firstName:Kevin middleName:Alex lastName:Monroe}

Example (Struct)

ExampleStruct shows how to define a typical user-defined Codec. someStructCodec in this example demonstrates an idiomatic Codec definition. The same pattern is used for non-struct types; see the array example for one such use case.

The rules of thumb are:

  • The order in which encoded data is written defines the Codec's ordering. Get should read data in the same order it was written, using the same Codecs. The schema change example has an exception to this.
  • Use lexy.PrefixNilsFirst or lexy.PrefixNilsLast if the value can be nil.
  • Get must panic if it cannot decode a value.
  • Generally use lexy.Terminate when a field's Codec might require it. See [tagsCodec] in this example for a typical usage. It is safe to return false from lexy.Codec.RequiresTerminator if you do this for all encoded fields and the number of fields is fixed.
package main

import (
	"bytes"
	"fmt"
	"math"
	"sort"

	"github.com/phiryll/lexy"
)

type SomeStruct struct {
	size  int32
	score float32
	tags  []string
}

func (s SomeStruct) String() string {
	return fmt.Sprintf("{%d %.2f %#v}", s.size, s.score, s.tags)
}

// All of these are safe for concurrent access.
var (
	// Score sorts high to low.
	negScoreCodec   = lexy.Negate(lexy.Float32())
	tagsCodec       = lexy.Terminate(lexy.SliceOf(lexy.String()))
	SomeStructCodec = someStructCodec{}
)

// Sort order is:
//   - size
//   - score (high to low)
//   - tags
type someStructCodec struct{}

func (someStructCodec) Append(buf []byte, value SomeStruct) []byte {
	buf = lexy.Int32().Append(buf, value.size)
	buf = negScoreCodec.Append(buf, value.score)
	return tagsCodec.Append(buf, value.tags)
}

func (someStructCodec) Put(buf []byte, value SomeStruct) []byte {
	buf = lexy.Int32().Put(buf, value.size)
	buf = negScoreCodec.Put(buf, value.score)
	return tagsCodec.Put(buf, value.tags)
}

func (someStructCodec) Get(buf []byte) (SomeStruct, []byte) {
	size, buf := lexy.Int32().Get(buf)
	score, buf := negScoreCodec.Get(buf)
	tags, buf := tagsCodec.Get(buf)
	return SomeStruct{size, score, tags}, buf
}

func (someStructCodec) RequiresTerminator() bool {
	return false
}

// Only defined to test whether two SomeStructs are equal.
func structsEqual(a, b SomeStruct) bool {
	if a.size != b.size {
		return false
	}
	// NaN != NaN, even when they're the exact same bits.
	if math.Float32bits(a.score) != math.Float32bits(b.score) {
		return false
	}
	if len(a.tags) != len(b.tags) {
		return false
	}
	for i := range a.tags {
		if a.tags[i] != b.tags[i] {
			return false
		}
	}
	return true
}

type sortableEncodings struct {
	b [][]byte
}

var _ sort.Interface = sortableEncodings{nil}

func (s sortableEncodings) Len() int           { return len(s.b) }
func (s sortableEncodings) Less(i, j int) bool { return bytes.Compare(s.b[i], s.b[j]) < 0 }
func (s sortableEncodings) Swap(i, j int)      { s.b[i], s.b[j] = s.b[j], s.b[i] }

// ExampleStruct shows how to define a typical user-defined Codec.
// someStructCodec in this example demonstrates an idiomatic Codec definition.
// The same pattern is used for non-struct types;
// see the array example for one such use case.
//
// The rules of thumb are:
//   - The order in which encoded data is written defines the Codec's ordering.
//     Get should read data in the same order it was written, using the same Codecs.
//     The schema change example has an exception to this.
//   - Use [lexy.PrefixNilsFirst] or [lexy.PrefixNilsLast] if the value can be nil.
//   - Get must panic if it cannot decode a value.
//   - Generally use [lexy.Terminate] when a field's Codec might require it.
//     See [tagsCodec] in this example for a typical usage.
//     It is safe to return false from [lexy.Codec.RequiresTerminator]
//     if you do this for all encoded fields and the number of fields is fixed.
func main() {
	structs := []SomeStruct{
		{1, 5.0, nil},
		{-72, 37.54, []string{"w", "x", "y", "z"}},
		{42, 37.6, []string{"p", "q", "r"}},
		{42, float32(math.Inf(1)), []string{}},
		{-100, 37.54, []string{"a", "b"}},
		{42, 37.54, []string{"a", "b", "a"}},
		{-100, float32(math.NaN()), []string{"cat"}},
		{42, 37.54, nil},
		{153, 37.54, []string{"d"}},
	}

	var encoded [][]byte
	fmt.Println("Round Trip Equals:")
	for _, value := range structs {
		buf := SomeStructCodec.Append(nil, value)
		decoded, _ := SomeStructCodec.Get(buf)
		fmt.Println(structsEqual(value, decoded))
		encoded = append(encoded, buf)
	}

	sort.Sort(sortableEncodings{encoded})
	fmt.Println("Sorted:")
	for _, enc := range encoded {
		decoded, _ := SomeStructCodec.Get(enc)
		fmt.Println(decoded.String())
	}

}
Output:
Round Trip Equals:
true
true
true
true
true
true
true
true
true
Sorted:
{-100 NaN []string{"cat"}}
{-100 37.54 []string{"a", "b"}}
{-72 37.54 []string{"w", "x", "y", "z"}}
{1 5.00 []string(nil)}
{42 +Inf []string{}}
{42 37.60 []string{"p", "q", "r"}}
{42 37.54 []string(nil)}
{42 37.54 []string{"a", "b", "a"}}
{153 37.54 []string{"d"}}

Variables

var (
	// PrefixNilsFirst is the [Prefix] implementation ordering nils first.
	PrefixNilsFirst prefixNilsFirst

	// PrefixNilsLast is the [Prefix] implementation ordering nils last.
	PrefixNilsLast prefixNilsLast
)

Types

type Codec

type Codec[T any] interface {
	// Append encodes value and appends the encoded bytes to buf, returning the updated buffer.
	//
	// If buf is nil and no bytes are appended, Append may return nil.
	Append(buf []byte, value T) []byte

	// Put encodes value into buf, returning buf following what was written.
	//
	// Put will panic if buf is too small, and still may have written some data to buf.
	// Put will write only the bytes that encode value.
	Put(buf []byte, value T) []byte

	// Get decodes a value of type T from buf, returning the value and buf following the encoded value.
	// Get will panic if a value of type T cannot be successfully decoded from buf.
	// If buf is empty and this Codec could encode zero bytes for some value, Get will return that value and buf.
	// Get will not modify buf.
	Get(buf []byte) (T, []byte)

	// RequiresTerminator returns whether encoded values require escaping and a terminator
	// if more data is written following the encoded value.
	// This is the case for most unbounded types like slices and maps,
	// as well as types whose encodings can be zero bytes.
	// Wrapping this Codec with [Terminate] will return a Codec which behaves properly in these situations.
	//
	// For the rest of this doc comment, "requires escaping" is shorthand for
	// "requires escaping and a terminator if more data is written following the encoded value."
	//
	// Codecs that could encode zero bytes, like those for string and [Empty], always require escaping.
	// Codecs that could produce two distinct non-empty encodings with one being a prefix of the other,
	// like those for slices and maps, always require escaping.
	// Codecs that cannot produce two distinct non-empty encodings with one being a prefix of the other,
	// like those for primitive integers and floats, never require escaping.
	// Codecs that always encode to a non-zero fixed number of bytes are a special case of this.
	//
	// The net effect of escaping and terminating is to prevent one encoding from being the prefix of another,
	// while maintaining the same lexicographical ordering.
	RequiresTerminator() bool
}

Codec defines a binary encoding for values of type T. Most of the Codec implementations provided by this package preserve the type's natural ordering, but nothing requires that behavior. Append and Put should produce the same encoded bytes. Get must be able to decode encodings produced by Append and Put. Encoding and decoding should be lossless inverse operations. Exceptions to any of these behaviors are allowed, but should be clearly documented.

All Codecs provided by lexy will order nils first if instances of type T can be nil. Invoking NilsLast(codec) on a Codec will return a Codec which orders nils last, but only for the pointer, slice, map, []byte, and *big.Int/Float/Rat Codecs provided by lexy.

If instances of type T can be nil, implementations should invoke the appropriate method of PrefixNilsFirst or PrefixNilsLast as the first step of encoding or decoding method implementations. See the Prefix docs for example usage idioms.

All Codecs provided by lexy are safe for concurrent use if their delegate Codecs (if any) are.
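
The need for escaping and a terminator, described under RequiresTerminator above, can be seen with a small standard-library sketch. It does not use lexy; concatRaw and compareRaw are hypothetical helpers that write fields back to back the way an unterminated Codec for an unbounded type would.

```go
package main

import (
	"bytes"
	"fmt"
)

// concatRaw writes each field's bytes back to back,
// with no escaping and no terminator between fields.
func concatRaw(fields ...string) []byte {
	var buf []byte
	for _, field := range fields {
		buf = append(buf, field...)
	}
	return buf
}

// compareRaw compares the unterminated encodings of two field lists.
func compareRaw(a, b []string) int {
	return bytes.Compare(concatRaw(a...), concatRaw(b...))
}

func main() {
	// Two different values collapse to the same bytes "abc",
	// so a decoder could not tell where the first field ends.
	fmt.Println(compareRaw([]string{"a", "bc"}, []string{"ab", "c"}))

	// Field by field, ("a", "z") sorts before ("ab", "a") because "a" < "ab".
	// The raw concatenations "az" and "aba" compare the other way around.
	fmt.Println(compareRaw([]string{"a", "z"}, []string{"ab", "a"}))
}
```

The first comparison yields 0 and the second yields 1, which is why a Codec whose encodings can be prefixes of one another should be wrapped with Terminate before more data is written after it.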

func BigFloat

func BigFloat() Codec[*big.Float]

BigFloat returns a Codec for the *big.Float type, with nils ordered first. The encoded order is the numeric value first, precision second, and rounding mode third. Like floats, -Inf, -0.0, +0.0, and +Inf all have a big.Float representation. However, there is no big.Float representation for NaN. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

This Codec is lossy. It does not encode the value's big.Accuracy.

Example
package main

import (
	"fmt"
	"math/big"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.BigFloat()
	var value big.Float
	value.SetString("-1.23456789e+50732")
	buf := codec.Append(nil, &value)
	decoded, _ := codec.Get(buf)
	fmt.Println(value.Cmp(decoded))
}
Output:
0

func BigInt

func BigInt() Codec[*big.Int]

BigInt returns a Codec for the *big.Int type, with nils ordered first. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"
	"math/big"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.BigInt()
	var value big.Int
	value.SetString("-1234567890123456789012345678901234567890", 10)
	buf := codec.Append(nil, &value)
	decoded, _ := codec.Get(buf)
	fmt.Println(decoded)
}
Output:
-1234567890123456789012345678901234567890

func BigRat

func BigRat() Codec[*big.Rat]

BigRat returns a Codec for the *big.Rat type, with nils ordered first. The encoded order is signed numerator first, positive denominator second. Note that this is not the natural ordering for rational numbers. big.Rat will normalize its value to lowest terms. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"
	"math/big"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.BigRat()
	// value will be -823/6 in lowest terms
	var value big.Rat
	var num, denom big.Int
	num.SetString("12345", 10)
	denom.SetString("-90", 10)
	value.SetFrac(&num, &denom)
	buf := codec.Append(nil, &value)
	decoded, _ := codec.Get(buf)
	fmt.Println(decoded)
}
Output:
-823/6
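
The difference between the numerator-first order described for BigRat and the natural order of rationals can be illustrated with math/big alone. This is a sketch of the documented ordering, not lexy's encoding; compareFractions is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math/big"
)

// compareFractions compares a = aNum/aDen with b = bNum/bDen two ways:
// natural is the usual numeric comparison, and numDenom compares the
// signed numerators first and the positive denominators second.
func compareFractions(aNum, aDen, bNum, bDen int64) (natural, numDenom int) {
	a := big.NewRat(aNum, aDen)
	b := big.NewRat(bNum, bDen)
	natural = a.Cmp(b)
	numDenom = a.Num().Cmp(b.Num())
	if numDenom == 0 {
		numDenom = a.Denom().Cmp(b.Denom())
	}
	return natural, numDenom
}

func main() {
	// 1/2 is numerically greater than 1/3, but with equal numerators
	// the smaller denominator sorts first.
	natural, numDenom := compareFractions(1, 2, 1, 3)
	fmt.Println(natural, numDenom)
}
```

Here the natural comparison is 1 while the numerator-first comparison is -1, so code that relies on numeric order of rationals would need a different key.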

func Bool

func Bool() Codec[bool]

Bool returns a Codec for the bool type. The encoded order is false, then true. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Bool()
	buf := codec.Append(nil, true)
	first, _ := codec.Get(buf)
	_ = codec.Put(buf, false)
	second, _ := codec.Get(buf)
	fmt.Printf("%t, %t", first, second)
}
Output:
true, false

func Bytes

func Bytes() Codec[[]byte]

Bytes returns a Codec for the []byte type, with nil slices ordered first. A []byte is written as-is following a nil/non-nil indicator. This Codec is more efficient than Codecs produced by SliceOf(Uint8()), and will allow nil unlike String. This Codec requires escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Bytes()
	buf := codec.Append(nil, []byte{1, 2, 3, 11, 17})
	decoded, _ := codec.Get(buf)
	fmt.Println(decoded)
}
Output:
[1 2 3 11 17]

func CastBool added in v0.5.0

func CastBool[T ~bool]() Codec[T]

CastBool returns a Codec for a type with an underlying type of bool. Other than the underlying type, this is the same as Bool.

func CastBytes added in v0.5.0

func CastBytes[S ~[]byte]() Codec[S]

CastBytes returns a Codec for a type with an underlying type of []byte, with nil slices ordered first. Other than the underlying type, this is the same as Bytes.

func CastFloat32 added in v0.5.0

func CastFloat32[T ~float32]() Codec[T]

CastFloat32 returns a Codec for a type with an underlying type of float32. Other than the underlying type, this is the same as Float32.

func CastFloat64 added in v0.5.0

func CastFloat64[T ~float64]() Codec[T]

CastFloat64 returns a Codec for a type with an underlying type of float64. Other than the underlying type, this is the same as Float64.

func CastInt added in v0.5.0

func CastInt[T ~int]() Codec[T]

CastInt returns a Codec for a type with an underlying type of int. Other than the underlying type, this is the same as Int.

func CastInt8 added in v0.5.0

func CastInt8[T ~int8]() Codec[T]

CastInt8 returns a Codec for a type with an underlying type of int8. Other than the underlying type, this is the same as Int8.

func CastInt16 added in v0.5.0

func CastInt16[T ~int16]() Codec[T]

CastInt16 returns a Codec for a type with an underlying type of int16. Other than the underlying type, this is the same as Int16.

func CastInt32 added in v0.5.0

func CastInt32[T ~int32]() Codec[T]

CastInt32 returns a Codec for a type with an underlying type of int32. Other than the underlying type, this is the same as Int32.

func CastInt64 added in v0.5.0

func CastInt64[T ~int64]() Codec[T]

CastInt64 returns a Codec for a type with an underlying type of int64. Other than the underlying type, this is the same as Int64.

func CastMapOf added in v0.5.0

func CastMapOf[M ~map[K]V, K comparable, V any](keyCodec Codec[K], valueCodec Codec[V]) Codec[M]

CastMapOf returns a Codec for a type with an underlying type of map[K]V, with nil maps ordered first. Other than the underlying type, this is the same as MapOf.

func CastPointerTo added in v0.5.0

func CastPointerTo[P ~*E, E any](elemCodec Codec[E]) Codec[P]

CastPointerTo returns a Codec for a type with an underlying type of *E, with nil pointers ordered first. Other than the underlying type, this is the same as PointerTo.

func CastSliceOf added in v0.5.0

func CastSliceOf[S ~[]E, E any](elemCodec Codec[E]) Codec[S]

CastSliceOf returns a Codec for a type with an underlying type of []E, with nil slices ordered first. Other than the underlying type, this is the same as SliceOf.

func CastString added in v0.5.0

func CastString[T ~string]() Codec[T]

CastString returns a Codec for a type with an underlying type of string. Other than the underlying type, this is the same as String.

func CastUint added in v0.5.0

func CastUint[T ~uint]() Codec[T]

CastUint returns a Codec for a type with an underlying type of uint. Other than the underlying type, this is the same as Uint.

func CastUint8 added in v0.5.0

func CastUint8[T ~uint8]() Codec[T]

CastUint8 returns a Codec for a type with an underlying type of uint8. Other than the underlying type, this is the same as Uint8.

func CastUint16 added in v0.5.0

func CastUint16[T ~uint16]() Codec[T]

CastUint16 returns a Codec for a type with an underlying type of uint16. Other than the underlying type, this is the same as Uint16.

func CastUint32 added in v0.5.0

func CastUint32[T ~uint32]() Codec[T]

CastUint32 returns a Codec for a type with an underlying type of uint32. Other than the underlying type, this is the same as Uint32.

func CastUint64 added in v0.5.0

func CastUint64[T ~uint64]() Codec[T]

CastUint64 returns a Codec for a type with an underlying type of uint64. Other than the underlying type, this is the same as Uint64.

func Complex64

func Complex64() Codec[complex64]

Complex64 returns a Codec for the complex64 type. The encoded order is real part first, imaginary part second, with those parts ordered as documented for Float32. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"
	"math"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Complex64()
	valueReal := float32(math.Inf(1))
	valueImag := float32(5.4321e-12)
	buf := codec.Append(nil, complex(valueReal, valueImag))
	decoded, _ := codec.Get(buf)
	fmt.Println(math.Float32bits(valueReal) == math.Float32bits(real(decoded)))
	fmt.Println(math.Float32bits(valueImag) == math.Float32bits(imag(decoded)))
}
Output:
true
true

func Complex128

func Complex128() Codec[complex128]

Complex128 returns a Codec for the complex128 type. The encoded order is real part first, imaginary part second, with those parts ordered as documented for Float64. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"bytes"
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Complex128()
	v1 := complex(123.5431, 9.87)
	v2 := complex(123.5432, 9.87)
	encodedV1 := codec.Append(nil, v1)
	encodedV2 := codec.Append(nil, v2)
	fmt.Println(bytes.Compare(encodedV1, encodedV2))
}
Output:
-1

func Duration

func Duration() Codec[time.Duration]

Duration returns a Codec for the time.Duration type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"
	"time"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Duration()
	buf := codec.Append(nil, time.Hour*57)
	decoded, _ := codec.Get(buf)
	fmt.Println(decoded)
}
Output:
57h0m0s

func Empty

func Empty[T any]() Codec[T]

Empty returns a Codec that encodes instances of T to zero bytes. Get returns the zero value of T. No method of this Codec will ever fail.

This is useful for empty structs, which are often used as map values. This Codec requires escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"
	"reflect"

	"github.com/phiryll/lexy"
)

func main() {
	type present struct{}
	type set map[uint8]present
	codec := lexy.CastMapOf[set](lexy.Uint8(), lexy.Empty[present]())
	value := set{
		23: present{},
		42: present{},
		59: present{},
		12: present{},
	}
	buf := codec.Append(nil, value)
	decoded, _ := codec.Get(buf)
	fmt.Printf("%T\n", decoded)
	fmt.Printf("%T\n", decoded[0])
	fmt.Println(reflect.DeepEqual(value, decoded))
}
Output:
lexy_test.set
lexy_test.present
true

func Float32

func Float32() Codec[float32]

Float32 returns a Codec for the float32 type. All bits of the value are preserved by this encoding. There are many different bit patterns for NaN, and their encodings will be distinct. No ordering distinction is made between quiet and signaling NaNs. This Codec does not require escaping, as defined by [Codec.RequiresTerminator]. The order of encoded values is:

-NaN
-Inf
negative finite numbers
-0.0
+0.0
positive finite numbers
+Inf
+NaN
Example
package main

import (
	"fmt"
	"math"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Float32()
	value := float32(1.45e-17)
	buf := codec.Append(nil, value)
	decoded, _ := codec.Get(buf)
	fmt.Println(math.Float32bits(value) == math.Float32bits(decoded))
}
Output:
true
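
One well-known way to obtain the float ordering listed above is to rewrite the IEEE 754 bits before storing them big-endian. The following is a minimal sketch of that technique using only the standard library; it is an assumption that lexy works along these lines, and encodeFloat32, decodeFloat32, and isIncreasing are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math"
)

// encodeFloat32 is a sketch of an order-preserving float32 encoding.
func encodeFloat32(value float32) []byte {
	bits := math.Float32bits(value)
	if bits&(1<<31) != 0 {
		bits = ^bits // sign bit set: flip every bit so larger magnitudes sort first
	} else {
		bits |= 1 << 31 // sign bit clear: set it so these sort after all negatives
	}
	buf := make([]byte, 4)
	binary.BigEndian.PutUint32(buf, bits)
	return buf
}

// decodeFloat32 reverses encodeFloat32, restoring the original bit pattern.
func decodeFloat32(buf []byte) float32 {
	bits := binary.BigEndian.Uint32(buf)
	if bits&(1<<31) != 0 {
		bits &^= 1 << 31
	} else {
		bits = ^bits
	}
	return math.Float32frombits(bits)
}

// isIncreasing reports whether the encodings of values are strictly increasing.
func isIncreasing(values []float32) bool {
	for i := 1; i < len(values); i++ {
		if bytes.Compare(encodeFloat32(values[i-1]), encodeFloat32(values[i])) >= 0 {
			return false
		}
	}
	return true
}

func main() {
	values := []float32{
		math.Float32frombits(0xFFC00000), // a NaN with the sign bit set
		float32(math.Inf(-1)),
		-1.5,
		float32(math.Copysign(0, -1)),
		0,
		1.5,
		float32(math.Inf(1)),
		math.Float32frombits(0x7FC00000), // a NaN with the sign bit clear
	}
	fmt.Println(isIncreasing(values))
}
```

Because the transformation is a bijection on the 32 bits, distinct NaN bit patterns keep distinct encodings, which matches the behavior documented for Float32.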

func Float64

func Float64() Codec[float64]

Float64 returns a Codec for the float64 type. Other than handling float64 instances, this function behaves the same as Float32.

Example
package main

import (
	"fmt"
	"math"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Float64()
	value := math.Copysign(math.NaN(), -1.0)
	buf := codec.Append(nil, value)
	decoded, _ := codec.Get(buf)
	fmt.Println(math.Float64bits(value) == math.Float64bits(decoded))
}
Output:
true

func Int

func Int() Codec[int]

Int returns a Codec for the int type. Values are converted to/from int64 and encoded with Int64. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Int()
	buf := codec.Append(nil, -4567890)
	decoded, _ := codec.Get(buf)
	fmt.Println(decoded)
}
Output:
-4567890

func Int8 added in v0.4.0

func Int8() Codec[int8]

Int8 returns a Codec for the int8 type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

func Int16 added in v0.4.0

func Int16() Codec[int16]

Int16 returns a Codec for the int16 type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

func Int32 added in v0.4.0

func Int32() Codec[int32]

Int32 returns a Codec for the int32 type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"bytes"
	"fmt"
	"math"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Int32()
	var encoded [][]byte
	for _, value := range []int32{
		math.MinInt32,
		-1,
		0,
		1,
		math.MaxInt32,
	} {
		buf := codec.Append(nil, value)
		encoded = append(encoded, buf)
	}
	// Verify the encodings are increasing.
	for i, b := range encoded[1:] {
		fmt.Println(bytes.Compare(encoded[i], b))
	}
}
Output:
-1
-1
-1
-1
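
A common technique that gives signed integers this unsigned byte ordering is to flip the sign bit and write the result big-endian. The sketch below shows that technique with the standard library only; whether lexy uses exactly this layout is an assumption, and encodeInt32, decodeInt32, and compareEncoded are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math"
)

// encodeInt32 flips the sign bit so that negative values,
// which have the sign bit set, sort before non-negative ones.
func encodeInt32(value int32) []byte {
	buf := make([]byte, 4)
	binary.BigEndian.PutUint32(buf, uint32(value)^(1<<31))
	return buf
}

// decodeInt32 reverses encodeInt32.
func decodeInt32(buf []byte) int32 {
	return int32(binary.BigEndian.Uint32(buf) ^ (1 << 31))
}

// compareEncoded compares the encodings of a and b as unsigned bytes.
func compareEncoded(a, b int32) int {
	return bytes.Compare(encodeInt32(a), encodeInt32(b))
}

func main() {
	fmt.Println(compareEncoded(math.MinInt32, -1))
	fmt.Println(compareEncoded(-1, 0))
	fmt.Println(compareEncoded(0, math.MaxInt32))
	fmt.Println(decodeInt32(encodeInt32(-4567890)))
}
```

Each comparison prints -1, and the final line prints the original value, since the sign-bit flip is its own inverse.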

func Int64 added in v0.4.0

func Int64() Codec[int64]

Int64 returns a Codec for the int64 type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

func MapOf

func MapOf[K comparable, V any](keyCodec Codec[K], valueCodec Codec[V]) Codec[map[K]V]

MapOf returns a Codec for the map[K]V type, with nil maps ordered first. The encoded order for non-nil maps is empty maps first, with all other maps randomly ordered after. This Codec requires escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"
	"reflect"

	"github.com/phiryll/lexy"
)

func main() {
	type word string
	type count int
	type wordCounts map[word]count
	codec := lexy.CastMapOf[wordCounts](lexy.CastString[word](), lexy.CastInt[count]())
	value := wordCounts{
		"Now":  23,
		"is":   42,
		"the":  59,
		"time": 12,
	}
	buf := codec.Append(nil, value)
	decoded, _ := codec.Get(buf)
	fmt.Printf("%T\n", decoded)
	fmt.Printf("%T\n", decoded["not-found"])
	fmt.Println(reflect.DeepEqual(value, decoded))
}
Output:
lexy_test.wordCounts
lexy_test.count
true

func Negate

func Negate[T any](codec Codec[T]) Codec[T]

Negate returns a Codec reversing the encoded order of codec. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"bytes"
	"fmt"
	"math"

	"github.com/phiryll/lexy"
)

func main() {
	// Exactly the same as the lexy.Int32() example, except negated.
	codec := lexy.Negate(lexy.Int32())
	var encoded [][]byte
	for _, value := range []int32{
		math.MinInt32,
		-1,
		0,
		1,
		math.MaxInt32,
	} {
		buf := codec.Append(nil, value)
		encoded = append(encoded, buf)
	}
	// Verify the encodings are decreasing.
	for i, b := range encoded[1:] {
		fmt.Println(bytes.Compare(encoded[i], b))
	}
}
Output:
1
1
1
1
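
For fixed-width encodings, reversing the byte order comparison can be done by complementing every bit. The sketch below demonstrates that idea and also shows why complementing alone is not enough for variable-length encodings. It is an illustration with hypothetical helpers (complement, encodeUint16, compareNegated), not a description of how Negate is implemented.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// complement returns a copy of enc with every bit flipped.
func complement(enc []byte) []byte {
	out := make([]byte, len(enc))
	for i, b := range enc {
		out[i] = ^b
	}
	return out
}

// encodeUint16 is a fixed-width, big-endian, order-preserving encoding.
func encodeUint16(value uint16) []byte {
	buf := make([]byte, 2)
	binary.BigEndian.PutUint16(buf, value)
	return buf
}

// compareNegated compares the complemented forms of a and b.
func compareNegated(a, b []byte) int {
	return bytes.Compare(complement(a), complement(b))
}

func main() {
	// Fixed width: 1 < 300, and the complemented encodings compare in reverse.
	fmt.Println(compareNegated(encodeUint16(1), encodeUint16(300)))

	// Variable width: "a" is a prefix of "ab", and a prefix still sorts
	// first after complementing, so the order is not reversed.
	fmt.Println(compareNegated([]byte("a"), []byte("ab")))
}
```

The first line prints 1 and the second prints -1. A reversal that also handles prefixes therefore needs encodings where no value is a prefix of another, which is the property that escaping and terminating provide.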

func NilsLast added in v0.5.0

func NilsLast[T any](codec Codec[T]) Codec[T]

NilsLast returns a Codec exactly like codec, but with nils ordered last. NilsLast will panic if codec is not a pointer, slice, map, []byte, or *big.Int/Float/Rat Codec provided by lexy. Codecs returned by Negate and Terminate will cause NilsLast to panic, regardless of the Codec they are wrapping.

func PointerTo

func PointerTo[E any](elemCodec Codec[E]) Codec[*E]

PointerTo returns a Codec for the *E type, with nil pointers ordered first. The encoded order of non-nil values is the same as is produced by elemCodec. This Codec requires escaping if elemCodec does, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.PointerTo(lexy.String())
	value := "abc"
	buf := codec.Append(nil, &value)
	decoded, _ := codec.Get(buf)
	fmt.Println(value == *decoded)
	fmt.Println(&value == decoded)
}
Output:
true
false

func SliceOf

func SliceOf[E any](elemCodec Codec[E]) Codec[[]E]

SliceOf returns a Codec for the []E type, with nil slices ordered first. The encoded order is lexicographical using the encoded order of elemCodec for the elements. This Codec requires escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	type words []string
	codec := lexy.CastSliceOf[words](lexy.String())
	buf := codec.Append(nil, words{"The", "time", "is", "now"})
	decoded, _ := codec.Get(buf)
	fmt.Printf("%T\n", decoded)
	fmt.Println(decoded)
}
Output:
lexy_test.words
[The time is now]

func String

func String() Codec[string]

String returns a Codec for the string type. This Codec requires escaping, as defined by [Codec.RequiresTerminator].

A string is encoded as its bytes. This encoded order may be surprising. A string in Go is essentially an immutable []byte without any text semantics. For a UTF-8 string, the order is the same as the lexicographical order of the Unicode code points. However, even this is not intuitive. For example, 'Z' < 'a'. Collation is locale-dependent, and any ordering could be incorrect in another locale.
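
The byte ordering described above is the same ordering Go uses when comparing strings with <, so it can be previewed without lexy. In this sketch, sortByBytes is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// sortByBytes returns a copy of words sorted by their bytes,
// which is how Go's built-in string comparison orders them.
func sortByBytes(words ...string) []string {
	sorted := append([]string(nil), words...)
	sort.Strings(sorted)
	return sorted
}

func main() {
	// Uppercase ASCII letters sort before lowercase ones,
	// and non-ASCII letters sort after both.
	fmt.Println(sortByBytes("apple", "Zebra", "éclair", "banana"))
}
```

This prints [Zebra apple banana éclair]. If a locale-aware order is needed, the key would have to be derived from a collation key rather than from the raw string.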

Example
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.String()
	buf := codec.Append(nil, "")
	decoded, _ := codec.Get(buf)
	fmt.Printf("%q\n", decoded)
	buf = codec.Append(nil, "Go rocks!")
	decoded, _ = codec.Get(buf)
	fmt.Printf("%q\n", decoded)
}
Output:
""
"Go rocks!"

func Terminate

func Terminate[T any](codec Codec[T]) Codec[T]

Terminate returns a Codec that escapes and terminates the encodings produced by codec, if [Codec.RequiresTerminator] returns true for codec. Otherwise it returns codec.
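
The idea behind escaping and terminating can be sketched with the standard library. The byte values below (0x00 as terminator, 0x01 as escape) and all function names are assumptions made for illustration; the source does not state which values lexy uses.

```go
package main

import (
	"bytes"
	"fmt"
)

// These byte values are assumptions made for this sketch.
const (
	terminator byte = 0x00
	escape     byte = 0x01
)

// appendTerminated appends data to buf, escaping any terminator or escape
// bytes within data, then appends a single terminator.
func appendTerminated(buf, data []byte) []byte {
	for _, b := range data {
		if b == terminator || b == escape {
			buf = append(buf, escape)
		}
		buf = append(buf, b)
	}
	return append(buf, terminator)
}

// readTerminated decodes one field written by appendTerminated, returning
// the unescaped field and the bytes following its terminator.
// It panics if buf is malformed.
func readTerminated(buf []byte) (data, rest []byte) {
	data = []byte{}
	for i := 0; i < len(buf); i++ {
		switch buf[i] {
		case terminator:
			return data, buf[i+1:]
		case escape:
			i++ // skip the escape byte and take the next byte literally
			if i == len(buf) {
				panic("escape byte at end of buffer")
			}
		}
		data = append(data, buf[i])
	}
	panic("missing terminator")
}

// encodeFields writes every field escaped and terminated, one after another.
func encodeFields(fields ...string) []byte {
	var buf []byte
	for _, field := range fields {
		buf = appendTerminated(buf, []byte(field))
	}
	return buf
}

// compareTerminated compares the terminated encodings of two field lists.
func compareTerminated(a, b []string) int {
	return bytes.Compare(encodeFields(a...), encodeFields(b...))
}

func main() {
	// ("a", "z") now sorts before ("ab", "a"), matching field-by-field order.
	fmt.Println(compareTerminated([]string{"a", "z"}, []string{"ab", "a"}))

	field, rest := readTerminated(appendTerminated(nil, []byte{0x00, 0x01, 0x02}))
	fmt.Println(field, len(rest))
}
```

The comparison prints -1, and the round trip prints [0 1 2] 0. Choosing the terminator as the smallest byte and the escape as the next smallest keeps a shorter field ordered before any longer field that extends it.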

func TerminatedBytes added in v0.4.0

func TerminatedBytes() Codec[[]byte]

TerminatedBytes returns a Codec for the []byte type which escapes and terminates the encoded bytes. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

This is a convenience function; it returns the same Codec as Terminate(Bytes()).

func TerminatedString added in v0.4.0

func TerminatedString() Codec[string]

TerminatedString returns a Codec for the string type which escapes and terminates the encoded bytes. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

This is a convenience function; it returns the same Codec as Terminate(String()).

func Time

func Time() Codec[time.Time]

Time returns a Codec for the time.Time type. The encoded order is UTC time first, timezone offset second. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

This Codec is lossy. It encodes the timezone's offset, but not its name. It will therefore lose information about Daylight Saving Time. Timezone names and DST behavior are defined outside of Go's control (as they must be), and time.Time.Zone can return names that will fail with time.LoadLocation in the same program.

Example
package main

import (
	"fmt"
	"time"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Time()
	buf := codec.Append(nil, time.Date(2000, 1, 2, 3, 4, 5, 678_901_234, time.UTC))
	decoded, _ := codec.Get(buf)
	fmt.Println(decoded.Format(time.RFC3339Nano))
}
Output:
2000-01-02T03:04:05.678901234Z
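
The lossiness described for Time can be illustrated with the time package alone. The sketch below assumes only what the text states, that the instant and the zone offset survive while the zone name does not; encodeTime, decodeTime, and roundTripEST are hypothetical helpers and do not reflect lexy's byte layout.

```go
package main

import (
	"fmt"
	"time"
)

// encodeTime captures only what this sketch assumes is kept:
// the instant as Unix nanoseconds and the zone's offset in seconds.
func encodeTime(t time.Time) (unixNanos int64, offsetSeconds int) {
	_, offsetSeconds = t.Zone()
	return t.UnixNano(), offsetSeconds
}

// decodeTime rebuilds a time.Time in an unnamed fixed zone with the same offset.
func decodeTime(unixNanos int64, offsetSeconds int) time.Time {
	return time.Unix(0, unixNanos).In(time.FixedZone("", offsetSeconds))
}

// roundTripEST returns a time in a zone named "EST" and its decoded copy.
func roundTripEST() (original, decoded time.Time) {
	est := time.FixedZone("EST", -5*60*60)
	original = time.Date(2000, 1, 2, 3, 4, 5, 678_901_234, est)
	return original, decodeTime(encodeTime(original))
}

func main() {
	original, decoded := roundTripEST()
	originalName, originalOffset := original.Zone()
	decodedName, decodedOffset := decoded.Zone()

	fmt.Println(original.Equal(decoded))         // same instant
	fmt.Println(originalOffset == decodedOffset) // same offset
	fmt.Println(originalName == decodedName)     // the zone name is lost
}
```

The first two lines print true and the third prints false. An application that needs the zone name would have to store it separately alongside the encoded time.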

func Uint

func Uint() Codec[uint]

Uint returns a Codec for the uint type. Values are converted to/from uint64 and encoded with Uint64. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Uint()
	buf := codec.Append(nil, 4567890)
	decoded, _ := codec.Get(buf)
	fmt.Println(decoded)
}
Output:
4567890

func Uint8 added in v0.4.0

func Uint8() Codec[uint8]

Uint8 returns a Codec for the uint8 type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

func Uint16 added in v0.4.0

func Uint16() Codec[uint16]

Uint16 returns a Codec for the uint16 type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

func Uint32 added in v0.4.0

func Uint32() Codec[uint32]

Uint32 returns a Codec for the uint32 type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example (UnderlyingType)
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	type size uint32
	codec := lexy.CastUint32[size]()
	buf := codec.Append(nil, 123)
	decoded, _ := codec.Get(buf)
	fmt.Printf("Value %d of type %T", decoded, decoded)
}
Output:
Value 123 of type lexy_test.size

func Uint64 added in v0.4.0

func Uint64() Codec[uint64]

Uint64 returns a Codec for the uint64 type. This Codec does not require escaping, as defined by [Codec.RequiresTerminator].

Example
package main

import (
	"fmt"

	"github.com/phiryll/lexy"
)

func main() {
	codec := lexy.Uint64()
	buf := codec.Append(nil, 123)
	decoded, _ := codec.Get(buf)
	fmt.Println(decoded)
}
Output:
123

type Prefix added in v0.5.0

type Prefix interface {
	// Append appends a prefix byte to the end of buf, returning the updated buffer.
	// This is a typical usage:
	//
	//	func (fooCodec) Append(buf []byte, value Foo) []byte {
	//	    done, buf := PrefixNilsFirst.Append(buf, value == nil)
	//	    if done {
	//	        return buf
	//	    }
	//	    // encode and append the non-nil value to buf
	//	}
	Append(buf []byte, isNil bool) (done bool, newBuf []byte)

	// Put sets buf[0] to a prefix byte.
	// This is a typical usage:
	//
	//	func (fooCodec) Put(buf []byte, value Foo) []byte {
	//	    done, buf := PrefixNilsFirst.Put(buf, value == nil)
	//	    if done {
	//	        return buf
	//	    }
	//	    // encode the non-nil value into buf
	//	}
	Put(buf []byte, isNil bool) (done bool, newBuf []byte)

	// Get decodes a prefix byte from buf[0].
	// Get will panic if the prefix byte is invalid.
	// Get will not modify buf.
	// This is a typical usage:
	//
	//	func (c fooCodec) Get(buf []byte) (Foo, []byte) {
	//	    done, buf := PrefixNilsFirst.Get(buf)
	//	    if done {
	//	        return nil, buf
	//	    }
	//	    // decode and return a non-nil value from buf
	//	}
	Get(buf []byte) (done bool, newBuf []byte)
}

A Prefix provides helper methods to handle the initial nil/non-nil prefix byte for Codec implementations that encode types whose instances can be nil. The rest of these comments only pertain to usage by these Codec implementations.

Each Prefix method is a helper for implementing the correspondingly named Codec method. Invoking the Prefix method should be the first action taken by the Codec method, since it allows an early return if the value is nil.

In addition to other return values, every Prefix method returns done, a bool value which is true if and only if the caller should return immediately. If done is true, either there was an error or the value is nil. If done is false, there was no error and the value is non-nil, in which case the caller still needs to process the non-nil value. See the method docs for typical usages.

Prefix is implemented by PrefixNilsFirst and PrefixNilsLast.
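
The mechanics of a nil/non-nil prefix byte can be sketched without lexy. In the sketch below the prefix byte values and every function name are hypothetical; only the (done, newBuf) return shape is taken from the Prefix interface above.

```go
package main

import (
	"bytes"
	"fmt"
)

// Hypothetical prefix byte values for a nils-first ordering.
const (
	prefixNil    byte = 0x02
	prefixNonNil byte = 0x03
)

// appendPrefix mirrors the shape of Prefix.Append.
func appendPrefix(buf []byte, isNil bool) (done bool, newBuf []byte) {
	if isNil {
		return true, append(buf, prefixNil)
	}
	return false, append(buf, prefixNonNil)
}

// getPrefix mirrors the shape of Prefix.Get and panics on an invalid prefix.
func getPrefix(buf []byte) (done bool, newBuf []byte) {
	if len(buf) == 0 {
		panic("missing prefix byte")
	}
	switch buf[0] {
	case prefixNil:
		return true, buf[1:]
	case prefixNonNil:
		return false, buf[1:]
	default:
		panic("invalid prefix byte")
	}
}

// appendBytePtr encodes a *byte as a prefix followed by the byte, if non-nil.
func appendBytePtr(buf []byte, value *byte) []byte {
	done, buf := appendPrefix(buf, value == nil)
	if done {
		return buf
	}
	return append(buf, *value)
}

// getBytePtr decodes a value written by appendBytePtr.
func getBytePtr(buf []byte) (*byte, []byte) {
	done, buf := getPrefix(buf)
	if done {
		return nil, buf
	}
	value := buf[0]
	return &value, buf[1:]
}

// nilSortsFirst reports whether nil encodes before a pointer to the smallest byte.
func nilSortsFirst() bool {
	zero := byte(0)
	return bytes.Compare(appendBytePtr(nil, nil), appendBytePtr(nil, &zero)) < 0
}

func main() {
	fmt.Println(nilSortsFirst())
	value := byte(7)
	decoded, rest := getBytePtr(appendBytePtr(nil, &value))
	fmt.Println(*decoded, len(rest))
}
```

Because the prefix byte is compared before anything else, swapping which of the two prefix values is smaller would be enough to order nils last instead.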
