README
mus-go Serializer
mus-go is a MUS format serializer. However, due to its minimalist design and a wide range of serialization primitives, it can also be used to implement other binary serialization formats (here is an example where mus-go is utilized to implement Protobuf encoding).
To get started quickly, go to the code generator page.
Why mus-go?
It is lightning fast, space efficient and well tested.
Description
- Has a streaming version.
- Can run on both 32 and 64-bit systems.
- Variable-length data types (like string, array, slice, or map) are encoded as: length + data. You can choose the binary representation for both of these parts.
- Supports data versioning.
- Deserialization may fail with one of the following errors: ErrOverflow, ErrNegativeLength, ErrTooSmallByteSlice, ErrWrongFormat.
- Can validate and skip data while unmarshalling.
- Supports pointers.
- Can encode data structures such as graphs or linked lists.
- Supports oneof feature.
- Supports private fields.
- Supports out-of-order deserialization.
- Supports zero allocation deserialization.
Contents
- mus-go Serializer
- Contents
- cmd-stream-go
- musgen-go
- Benchmarks
- How To
- Out of Order Deserialization
- Zero Allocation Deserialization
cmd-stream-go
cmd-stream-go allows executing commands on a server. cmd-stream-go/MUS is about 3 times faster than gRPC/Protobuf.
musgen-go
Writing mus-go code manually can be tedious and error-prone. A better approach
is to use a code generator. It is also incredibly easy to use: just provide a
type and call Generate().
Benchmarks
Why did I create another set of benchmarks? The existing benchmarks have some notable issues: try running them several times, and you'll likely get inconsistent results, making it difficult to determine which serializer is truly faster. That was one of the reasons; beyond that, I basically made them for my own use.
How To
With mus-go, to make a type serializable, you need to implement the Serializer interface:
import "github.com/mus-format/mus-go"
// YourTypeMUS is a MUS serializer for YourType.
var YourTypeMUS = yourTypeMUS{}
// yourTypeMUS implements the mus.Serializer interface.
type yourTypeMUS struct{}
func (s yourTypeMUS) Marshal(v YourType, bs []byte) (n int) {...}
func (s yourTypeMUS) Unmarshal(bs []byte) (v YourType, n int, err error) {...}
func (s yourTypeMUS) Size(v YourType) (size int) {...}
func (s yourTypeMUS) Skip(bs []byte) (n int, err error) {...}
And then use it like this:
var (
value YourType = ...
size = YourTypeMUS.Size(value) // The number of bytes required to serialize the value.
bs = make([]byte, size)
)
n := YourTypeMUS.Marshal(value, bs) // Returns the number of used bytes.
value, n, err := YourTypeMUS.Unmarshal(bs) // Returns the value, the number of
// used bytes and any error encountered.
// Instead of unmarshalling, the value can be skipped (n and err are already
// declared above, so plain assignment is used):
n, err = YourTypeMUS.Skip(bs)
Packages
mus-go offers several encoding options, each of which is in a separate package.
varint
Contains Varint serializers for all uint (uint64, uint32, uint16, uint8, uint), int, float, and byte data types. Example:
package main
import "github.com/mus-format/mus-go/varint"
func main() {
var (
num = 100
size = varint.Int.Size(num)
bs = make([]byte, size)
)
n := varint.Int.Marshal(num, bs)
num, n, err := varint.Int.Unmarshal(bs)
// ...
}
Also includes the PositiveInt serializer (Varint without ZigZag) for positive
int values. It can handle negative values as well, but with lower performance.
raw
Contains Raw serializers for the same byte, uint, int, float, time.Time data types. Example:
package main
import "github.com/mus-format/mus-go/raw"
func main() {
var (
num = 100
size = raw.Int.Size(num)
bs = make([]byte, size)
)
n := raw.Int.Marshal(num, bs)
num, n, err := raw.Int.Unmarshal(bs)
// ...
}
More details about Varint and Raw encodings can be found in the MUS format specification. If in doubt, use Varint.
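As a rough intuition for the "if in doubt, use Varint" advice, the following sketch compares a fixed-width encoding with a variable-length one using only encoding/binary. It is not mus-go code: rawSize and varintSize are hypothetical helpers, and the little-endian byte order used for the fixed-width form is an assumption made for the illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// rawSize encodes v as a fixed-width 64-bit value and returns the number of
// bytes used, which is always 8 regardless of the value.
func rawSize(v uint64) int {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, v) // byte order is an assumption here
	return len(buf)
}

// varintSize encodes v as a variable-length integer and returns the number of
// bytes used, which grows with the magnitude of the value.
func varintSize(v uint64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, v)
}

func main() {
	for _, v := range []uint64{1, 1 << 20, 1 << 63} {
		fmt.Printf("%d: raw=%d bytes, varint=%d bytes\n",
			v, rawSize(v), varintSize(v))
	}
}
```

Small values take fewer bytes as a variable-length integer, while very large values take more than the fixed 8 bytes.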
For time.Time, there are several serializers:
- TimeUnix – encodes a value as a Unix timestamp in seconds.
- TimeUnixMilli – encodes a value as a Unix timestamp in milliseconds.
- TimeUnixMicro – encodes a value as a Unix timestamp in microseconds.
- TimeUnixNano – encodes a value as a Unix timestamp in nanoseconds.
To ensure the deserialized value is in UTC, make sure your TZ environment variable is set to UTC. This can be done as follows:
os.Setenv("TZ", "")
Alternatively, you can use one of the corresponding UTC serializers, e.g.,
TimeUnixUTC, TimeUnixMilliUTC, etc.
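Why the time zone matters can be seen with the standard library alone. The sketch below is not the mus-go implementation: encodeUnix, decodeUnixUTC and formatDecoded are hypothetical helpers that show what a seconds-based round trip does to a time.Time value and how an explicit UTC() call makes the result independent of the TZ setting.

```go
package main

import (
	"fmt"
	"time"
)

// encodeUnix keeps only the whole seconds since the Unix epoch.
func encodeUnix(t time.Time) int64 { return t.Unix() }

// decodeUnixUTC rebuilds the time and normalizes its location to UTC.
// Without the UTC() call, time.Unix returns a value in the local time zone.
func decodeUnixUTC(sec int64) time.Time { return time.Unix(sec, 0).UTC() }

// formatDecoded is a small helper that renders a decoded timestamp.
func formatDecoded(sec int64) string {
	return decodeUnixUTC(sec).Format(time.RFC3339)
}

func main() {
	original := time.Date(2024, time.March, 1, 12, 30, 45, 999, time.UTC)
	sec := encodeUnix(original)
	// The sub-second part of the original value is dropped by the encoding.
	fmt.Println(formatDecoded(sec))
}
```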
ord (ordinary)
Contains serializers/constructors for bool, string, array, byte slice, slice,
map, and pointer types.
Variable-length data types (such as string, array, slice, or map) are encoded
as length + data. You can choose the binary representation for both parts. By
default, the length is encoded using a Varint without ZigZag
(varint.PositiveInt). In this case, the maximum length is limited by the
maximum value of the int type on your system. This works well across different
architectures: for example, an attempt to unmarshal a string that is too long
on a 32-bit system will result in an ErrOverflow.
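The length + data layout can be sketched with the standard library. This is an illustration under stated assumptions rather than the ord package's code: sizeString, marshalString, unmarshalString and the two error values are hypothetical names, and encoding/binary's Uvarint stands in for a Varint without ZigZag.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

var (
	errTooSmall = errors.New("too small byte slice")
	errOverflow = errors.New("overflow")
)

// sizeString returns the number of bytes needed for the length prefix plus data.
func sizeString(s string) int {
	var tmp [binary.MaxVarintLen64]byte
	return binary.PutUvarint(tmp[:], uint64(len(s))) + len(s)
}

// marshalString writes the length prefix followed by the bytes of s.
func marshalString(s string, bs []byte) int {
	if len(bs) < sizeString(s) {
		panic(errTooSmall)
	}
	n := binary.PutUvarint(bs, uint64(len(s)))
	return n + copy(bs[n:], s)
}

// unmarshalString reads the length prefix, checks that it fits into an int on
// the current platform, and then reads exactly that many bytes.
func unmarshalString(bs []byte) (s string, n int, err error) {
	length, n := binary.Uvarint(bs)
	if n == 0 {
		return "", 0, errTooSmall
	}
	if n < 0 || length > math.MaxInt {
		return "", 0, errOverflow
	}
	if uint64(len(bs)-n) < length {
		return "", n, errTooSmall
	}
	end := n + int(length)
	return string(bs[n:end]), end, nil
}

func main() {
	bs := make([]byte, sizeString("hello"))
	marshalString("hello", bs)
	s, n, err := unmarshalString(bs)
	fmt.Println(s, n, err)
}
```

Because the length is checked against math.MaxInt, the same encoded bytes can decode on a 64-bit system and be rejected with an overflow error on a 32-bit one.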
For array, slice, and map types, there are only constructors available to
create a concrete serializer.
Array
Unfortunately, Go does not support generic parameterization of array sizes. As a result, the array serializer constructor looks like this:
package main
import (
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
arrops "github.com/mus-format/mus-go/options/array"
)
func main() {
var (
// The first type parameter of the NewArraySer function represents the array
// type, and the second - the type of the array’s elements.
//
// As for the function parameters, varint.Int specifies the serializer for
// the array’s elements.
ser = ord.NewArraySer[[3]int, int](varint.Int)
// To create an array serializer with the specific length serializer use:
// ser = ord.NewArraySer[[3]int, int](varint.Int, arrops.WithLenSer(lenSer))
arr = [3]int{1, 2, 3}
size = ser.Size(arr)
bs = make([]byte, size)
)
n := ser.Marshal(arr, bs)
arr, n, err := ser.Unmarshal(bs)
// ...
}
Slice
package main
import (
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
slops "github.com/mus-format/mus-go/options/slice"
)
func main() {
var (
// varint.Int specifies the serializer for the slice's elements.
ser = ord.NewSliceSer[int](varint.Int)
// To create a slice serializer with the specific length serializer use:
// ser = ord.NewSliceSer[int](varint.Int, slops.WithLenSer(lenSer))
sl = []int{1, 2, 3}
size = ser.Size(sl)
bs = make([]byte, size)
)
n := ser.Marshal(sl, bs)
sl, n, err := ser.Unmarshal(bs)
// ...
}
Map
package main
import (
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
mapops "github.com/mus-format/mus-go/options/map"
)
func main() {
var (
// varint.Int specifies the serializer for the map’s keys, and ord.String -
// the serializer for the map’s values.
ser = ord.NewMapSer[int, string](varint.Int, ord.String)
// To create a map serializer with the specific length serializer use:
// ser = ord.NewMapSer[int, string](varint.Int, ord.String, mapops.WithLenSer(lenSer))
m = map[int]string{1: "one", 2: "two", 3: "three"}
size = ser.Size(m)
bs = make([]byte, size)
)
n := ser.Marshal(m, bs)
m, n, err := ser.Unmarshal(bs)
// ...
}
unsafe
The unsafe package provides maximum performance, but be careful - it uses an unsafe type conversion. This warning largely applies to the string type because modifying the byte slice after unmarshalling will also change the string’s contents. Here is an example that demonstrates this behavior more clearly.
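The aliasing hazard can be reproduced without mus-go. The sketch below assumes Go 1.20 or newer (for unsafe.String and unsafe.SliceData); bytesToString and aliasingDemo are hypothetical helpers, not functions from the package, and they only illustrate the general zero-copy conversion technique.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets bs as a string without copying. The resulting
// string shares its memory with bs.
func bytesToString(bs []byte) string {
	if len(bs) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(bs), len(bs))
}

// aliasingDemo returns a copied string and a zero-copy string, both created
// from the same byte slice, after that slice has been modified.
func aliasingDemo() (safe, aliased string) {
	bs := []byte("hello")
	safe = string(bs)           // copies the bytes
	aliased = bytesToString(bs) // shares the bytes
	bs[0] = 'j'                 // also changes the aliased string
	return safe, aliased
}

func main() {
	safe, aliased := aliasingDemo()
	fmt.Println(safe, aliased)
}
```

The zero-copy conversion avoids an allocation, which is where the performance gain comes from, but the byte slice must not be reused while the string is still in use.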
Provides serializers for the following data types: byte, bool, string,
byte slice, time.Time and all uint, int, float.
pm (pointer mapping)
Let's consider two pointers initialized with the same value:
var (
str = "hello world"
ptr = &str
ptr1 *string = ptr
ptr2 *string = ptr
)
The pm package preserves pointer equality after unmarshalling (ptr1 == ptr2),
while the ord package does not. This capability enables the serialization of
data structures like graphs or linked lists. You can find corresponding examples
in mus-examples-go.
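One common way to preserve pointer equality is to assign an ID to every distinct pointer and store the pointed-to value only once. The sketch below shows that general idea in memory, without any byte encoding; it is not taken from the pm package, and record, encode, decode and sharedAfterRoundTrip are hypothetical names. Nil pointers are not handled.

```go
package main

import "fmt"

// record is a hypothetical intermediate form: the first time a pointer is seen
// its value is stored under a fresh ID, later occurrences store only the ID.
type record struct {
	id    int
	value string
	first bool
}

// encode replaces every pointer with a record, reusing IDs for equal pointers.
func encode(ptrs []*string) []record {
	ids := make(map[*string]int)
	out := make([]record, 0, len(ptrs))
	for _, p := range ptrs {
		if id, seen := ids[p]; seen {
			out = append(out, record{id: id})
			continue
		}
		id := len(ids)
		ids[p] = id
		out = append(out, record{id: id, value: *p, first: true})
	}
	return out
}

// decode rebuilds the pointers, returning the same pointer for the same ID.
func decode(recs []record) []*string {
	byID := make(map[int]*string)
	out := make([]*string, 0, len(recs))
	for _, r := range recs {
		if r.first {
			v := r.value
			byID[r.id] = &v
		}
		out = append(out, byID[r.id])
	}
	return out
}

// sharedAfterRoundTrip reports whether two aliases of one string still point
// to the same variable after encode and decode.
func sharedAfterRoundTrip() bool {
	str := "hello world"
	ptr := &str
	out := decode(encode([]*string{ptr, ptr}))
	return out[0] == out[1] && *out[0] == str
}

func main() {
	fmt.Println(sharedAfterRoundTrip())
}
```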
Structs Support
mus-go doesn’t support structural data types out of the box, which means you’ll
need to implement the mus.Serializer interface yourself. But that’s not
difficult at all. For example:
package main
import (
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
)
// We will implement the FooMUS serializer for this struct.
type Foo struct {
str string
sl []int
}
// Serializers.
var (
FooMUS = fooMUS{}
// IntSliceMUS is used by the FooMUS serializer.
IntSliceMUS = ord.NewSliceSer[int](varint.Int)
)
// fooMUS implements the mus.Serializer interface.
type fooMUS struct{}
func (s fooMUS) Marshal(v Foo, bs []byte) (n int) {
n = ord.String.Marshal(v.str, bs)
return n + IntSliceMUS.Marshal(v.sl, bs[n:])
}
func (s fooMUS) Unmarshal(bs []byte) (v Foo, n int, err error) {
v.str, n, err = ord.String.Unmarshal(bs)
if err != nil {
return
}
var n1 int
v.sl, n1, err = IntSliceMUS.Unmarshal(bs[n:])
n += n1
return
}
func (s fooMUS) Size(v Foo) (size int) {
size += ord.String.Size(v.str)
return size + IntSliceMUS.Size(v.sl)
}
func (s fooMUS) Skip(bs []byte) (n int, err error) {
n, err = ord.String.Skip(bs)
if err != nil {
return
}
var n1 int
n1, err = IntSliceMUS.Skip(bs[n:])
n += n1
return
}
All you have to do is deconstruct the structure into simpler data types and choose the desired encoding for each. Of course, this requires some effort. But, firstly, the code can be generated, secondly, this approach provides greater flexibility, and thirdly, mus-go stays quite simple, making it easy to implement in other programming languages.
DTS (Data Type metadata Support)
mus-dts-go enables typed data serialization using DTM.
Data Versioning
mus-dts-go can be used to implement data versioning. Here is an example.
MarshallerMUS Interface and MarshalMUS Function
It is often convenient to use the MarshallerMUS interface:
type MarshallerMUS interface {
MarshalMUS(bs []byte) (n int)
SizeMUS() (size int)
}
and the MarshalMUS function:
func MarshalMUS(v MarshallerMUS) (bs []byte) {
bs = make([]byte, v.SizeMUS())
v.MarshalMUS(bs)
return
}
// Foo implements the MarshallerMUS interface.
type Foo struct {...}
...
func main() {
// Foo can now be marshalled with a single function call.
bs := MarshalMUS(Foo{...})
// ...
}
They are already defined in the ext-mus-go module, which also includes the
MarshallerTypedMUS interface and the MarshalTypedMUS function for typed data
serialization (DTM + data).
The full code that uses the MarshalMUS function can be found here.
Interface Serialization (oneof feature)
mus-dts-go will also help to create a serializer for an interface. Example:
import (
"fmt"
"reflect"
dts "github.com/mus-format/mus-dts-go"
ext "github.com/mus-format/ext-mus-go"
)
// Interface to serialize.
type Instruction interface {...}
// Copy implements the Instruction and ext.MarshallerTypedMUS interfaces.
type Copy struct {...}
// MarshalTypedMUS uses CopyDTS.
func (c Copy) MarshalTypedMUS(bs []byte) (n int) {
return CopyDTS.Marshal(c, bs)
}
// SizeTypedMUS uses CopyDTS.
func (c Copy) SizeTypedMUS() (size int) {
return CopyDTS.Size(c)
}
// Insert implements the Instruction and ext.MarshallerTypedMUS interfaces.
type Insert struct {...}
// ...
// instructionMUS implements the mus.Serializer interface.
type instructionMUS struct {}
func (s instructionMUS) Marshal(i Instruction, bs []byte) (n int) {
if m, ok := i.(ext.MarshallerTypedMUS); ok {
return m.MarshalTypedMUS(bs)
}
panic(fmt.Sprintf("%v doesn't implement ext.MarshallerTypedMUS interface",
reflect.TypeOf(i)))
}
func (s instructionMUS) Unmarshal(bs []byte) (i Instruction, n int, err error) {
dtm, n, err := dts.DTMSer.Unmarshal(bs)
if err != nil {
return
}
switch dtm {
case CopyDTM:
return CopyDTS.UnmarshalData(bs[n:])
case InsertDTM:
return InsertDTS.UnmarshalData(bs[n:])
default:
err = ErrUnexpectedDTM
return
}
}
func (s instructionMUS) Size(i Instruction) (size int) {
if m, ok := i.(ext.MarshallerTypedMUS); ok {
return m.SizeTypedMUS()
}
panic(fmt.Sprintf("%v doesn't implement ext.MarshallerTypedMUS interface",
reflect.TypeOf(i)))
}
A full example can be found at mus-examples-go.
Validation
Validation is performed during unmarshalling. A validator is just a function
with the signature func (value Type) error, where Type is the type of the value
to which the validator is applied.
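To show why it helps to validate while unmarshalling, here is a minimal standard-library sketch in which a length validator runs as soon as the length prefix is read and before the payload is touched. It is an illustration under assumptions, not mus-go code: unmarshalValidString, maxLen3 and the error values are hypothetical names, and encoding/binary's Uvarint stands in for the length encoding.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

var (
	errTooSmall       = errors.New("too small byte slice")
	errTooLargeLength = errors.New("too large length")
)

// maxLen3 is a validator: it rejects lengths greater than 3.
func maxLen3(length int) error {
	if length > 3 {
		return errTooLargeLength
	}
	return nil
}

// unmarshalValidString decodes a length-prefixed string. The validator lenVl
// is called right after the length is read, so an invalid value is rejected
// before any payload bytes are read or copied.
func unmarshalValidString(bs []byte, lenVl func(int) error) (s string, n int, err error) {
	length, n := binary.Uvarint(bs)
	if n <= 0 {
		return "", 0, errTooSmall
	}
	if length > math.MaxInt {
		return "", n, errTooLargeLength
	}
	if err = lenVl(int(length)); err != nil {
		return "", n, err
	}
	if len(bs)-n < int(length) {
		return "", n, errTooSmall
	}
	end := n + int(length)
	return string(bs[n:end]), end, nil
}

func main() {
	// The prefix claims 11 bytes; the validator stops decoding immediately.
	_, _, err := unmarshalValidString([]byte{11, 'h'}, maxLen3)
	fmt.Println(err)
}
```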
String
The ord.NewValidStringSer constructor creates a string serializer with a length
validator.
package main
import (
com "github.com/mus-format/common-go"
"github.com/mus-format/mus-go/ord"
strops "github.com/mus-format/mus-go/options/string"
)
func main() {
var (
// Length validator.
lenVl = func(length int) (err error) {
if length > 3 {
err = com.ErrTooLargeLength
}
return
}
ser = ord.NewValidStringSer(strops.WithLenValidator(com.ValidatorFn[int](lenVl)))
// To create a valid string serializer with the specific length serializer
// use:
// ser = ord.NewValidStringSer(strops.WithLenSer(lenSer), ...)
value = "hello world"
size = ser.Size(value)
bs = make([]byte, size)
)
n := ser.Marshal(value, bs)
// Unmarshalling stops when a validator returns an error. As a result, in
// this case, we will receive a length validation error.
value, n, err := ser.Unmarshal(bs)
// ...
}
Slice
The ord.NewValidSliceSer constructor creates a valid slice serializer with
length and element validators.
package main
import (
com "github.com/mus-format/common-go"
"github.com/mus-format/mus-go/ord"
slops "github.com/mus-format/mus-go/options/slice"
)
func main() {
var (
// Length validator.
lenVl = func(length int) (err error) {
if length > 3 {
err = com.ErrTooLargeLength
}
return
}
// Element validator.
elemVl = func(elem string) (err error) {
if elem == "hello" {
err = ErrBadElement
}
return
}
// Each of the validators could be nil.
ser = ord.NewValidSliceSer[string](ord.String,
slops.WithLenValidator[string](com.ValidatorFn[int](lenVl)),
slops.WithElemValidator[string](com.ValidatorFn[string](elemVl)))
// To create a valid slice serializer with the specific length serializer
// use:
// ser = ord.NewValidSliceSer[string](ord.String,
// slops.WithLenSer[string](lenSer), ...)
value = []string{"hello", "world"}
size = ser.Size(value)
bs = make([]byte, size)
)
n := ser.Marshal(value, bs)
// Unmarshalling stops when any of the validators return an error. As a
// result, in this case, we will receive an element validation error.
value, n, err := ser.Unmarshal(bs)
// ...
}
Map
The ord.NewValidMapSer constructor creates a valid map serializer with length,
key and value validators.
package main
import (
com "github.com/mus-format/common-go"
"github.com/mus-format/mus-go/ord"
"github.com/mus-format/mus-go/varint"
mapops "github.com/mus-format/mus-go/options/map"
)
func main() {
var (
// Length validator.
lenVl = func(length int) (err error) {
if length > 3 {
err = com.ErrTooLargeLength
}
return
}
// Key validator.
keyVl = func(key int) (err error) {
if key == 1 {
err = ErrBadKey
}
return
}
// Value validator.
valueVl = func(val string) (err error) {
if val == "hello" {
err = ErrBadValue
}
return
}
// Each of the validators could be nil.
ser = ord.NewValidMapSer[int, string](varint.Int, ord.String,
mapops.WithLenValidator[int, string](com.ValidatorFn[int](lenVl)),
mapops.WithKeyValidator[int, string](com.ValidatorFn[int](keyVl)),
mapops.WithValueValidator[int, string](com.ValidatorFn[string](valueVl)))
// To create a valid map serializer with the specific length serializer
// use:
// ser = ord.NewValidMapSer[int, string](varint.Int, ord.String,
// mapops.WithLenSer[int, string](lenSer), ...)
value = map[int]string{1: "hello", 2: "world"}
size = ser.Size(value)
bs = make([]byte, size)
)
n := ser.Marshal(value, bs)
// Unmarshalling stops when any of the validators return an error. As a
// result, in this case, we will receive a key validation error.
value, n, err := ser.Unmarshal(bs)
// ...
}
Struct
Unmarshalling an invalid structure may stop at the first invalid field, returning a validation error.
package main
import "github.com/mus-format/mus-go/ord"
type fooMUS struct{}
// ...
func (s fooMUS) Unmarshal(bs []byte) (v Foo, n int, err error) {
// Unmarshal the first field.
v.str, n, err = ord.String.Unmarshal(bs)
if err != nil {
return
}
// Validate the first field.
if err = ValidateFieldA(v.str); err != nil {
// The remaining fields are not unmarshalled.
return
}
// ...
}
Out of Order Deserialization
A simple example can be found here.
Zero Allocation Deserialization
Can be achieved using the unsafe package.
Documentation
Variables
var ErrTooSmallByteSlice = errors.New("too small byte slice")
ErrTooSmallByteSlice means that an Unmarshal requires a longer byte slice than was provided.
Types
type Serializer (added in v0.5.0)
type Serializer[T any] interface {
Marshal(t T, bs []byte) (n int)
Unmarshal(bs []byte) (t T, n int, err error)
Size(t T) (size int)
Skip(bs []byte) (n int, err error)
}
Serializer is the interface that groups the Marshal, Unmarshal, Size and Skip methods.
Marshal fills bs with an encoded value, returning the number of used bytes. It should panic if it receives a byte slice that is too small.
Unmarshal parses an encoded value from bs, returning the value, the number of used bytes and any error encountered.
Size method returns the number of bytes needed to encode the value.
Skip skips an encoded value, returning the number of skipped bytes and any error encountered.
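The contract described above can be sketched with a tiny implementer. To keep the example free of external dependencies, it declares a local copy of the Serializer interface instead of importing mus-go; u32Ser, second and errTooSmallByteSlice are hypothetical names, and the fixed-width little-endian layout is an assumption chosen for simplicity. The second function uses Skip to step over the first encoded value and decode the one after it.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Serializer is a local copy of the interface, so that the sketch compiles
// without importing mus-go.
type Serializer[T any] interface {
	Marshal(t T, bs []byte) (n int)
	Unmarshal(bs []byte) (t T, n int, err error)
	Size(t T) (size int)
	Skip(bs []byte) (n int, err error)
}

var errTooSmallByteSlice = errors.New("too small byte slice")

// u32Ser is a hypothetical fixed-width uint32 serializer.
type u32Ser struct{}

func (u32Ser) Marshal(v uint32, bs []byte) int {
	binary.LittleEndian.PutUint32(bs, v) // panics if len(bs) < 4
	return 4
}

func (u32Ser) Unmarshal(bs []byte) (uint32, int, error) {
	if len(bs) < 4 {
		return 0, 0, errTooSmallByteSlice
	}
	return binary.LittleEndian.Uint32(bs), 4, nil
}

func (u32Ser) Size(uint32) int { return 4 }

func (u32Ser) Skip(bs []byte) (int, error) {
	if len(bs) < 4 {
		return 0, errTooSmallByteSlice
	}
	return 4, nil
}

// Compile-time check that u32Ser satisfies the interface.
var _ Serializer[uint32] = u32Ser{}

// second skips the first encoded value and decodes the one after it.
func second(ser Serializer[uint32], bs []byte) (uint32, error) {
	n, err := ser.Skip(bs)
	if err != nil {
		return 0, err
	}
	v, _, err := ser.Unmarshal(bs[n:])
	return v, err
}

func main() {
	ser := u32Ser{}
	bs := make([]byte, ser.Size(7)+ser.Size(42))
	n := ser.Marshal(7, bs)
	ser.Marshal(42, bs[n:])
	v, err := second(ser, bs)
	fmt.Println(v, err)
}
```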