codec

package
v0.0.0-...-c61e583
This package is not in the latest version of its module.

Published: Sep 30, 2013 License: BSD-3-Clause Imports: 15 Imported by: 0

README

Codec

High Performance and Feature-Rich Idiomatic Go Library providing encode/decode support for different serialization formats.

Supported Serialization formats are:

  • msgpack
  • binc

To install:

go get github.com/ugorji/go/codec

Online documentation: [http://godoc.org/github.com/ugorji/go/codec]

The idiomatic Go support is as seen in other encoding packages in the standard library (e.g. json, xml, gob).

Rich Feature Set includes:

  • Simple but extremely powerful and feature-rich API
  • Very High Performance.
    Our extensive benchmarks show us outperforming Gob, Json and Bson by 2-4X. This was achieved by taking extreme care on:
    • managing allocation
    • stack frame size (important due to Go's use of split stacks),
    • reflection use
    • recursion implications
    • zero-copy mode (encoding/decoding to byte slice without using temp buffers)
  • Correct.
    Care was taken to precisely handle corner cases like: overflows, nil maps and slices, nil value in stream, etc.
  • Efficient: avoids copying into temporary byte buffers
    when encoding into or decoding from a byte slice.
  • Standard field renaming via tags
  • Encoding from any value
    (struct, slice, map, primitives, pointers, interface{}, etc)
  • Decoding into pointer to any non-nil typed value
    (struct, slice, map, int, float32, bool, string, reflect.Value, etc)
  • Supports extension functions to handle the encode/decode of custom types
  • Schema-less decoding
    (decode into a pointer to a nil interface{} as opposed to a typed non-nil value).
    Includes Options to configure what specific map or slice type to use when decoding an encoded list or map into a nil interface{}
  • Provides an RPC Server and Client Codec for net/rpc communication protocol.
  • Msgpack Specific:
    • Provides extension functions to handle spec-defined extensions (binary, timestamp)
    • Options to resolve ambiguities in handling raw bytes (as string or []byte)
      during schema-less decoding (decoding into a nil interface{})
    • RPC Server/Client Codec for msgpack-rpc protocol defined at: http://wiki.msgpack.org/display/MSGPACK/RPC+specification

Extension Support

Users can register a function to handle the encoding or decoding of their custom types.

There are no restrictions on what the custom type can be. Extensions can be any type: pointers, structs, custom types based on arrays/slices, strings, etc. Some examples:

type BitSet   []int
type BitSet64 uint64
type UUID     string
type MyStructWithUnexportedFields struct { a int; b bool; c []int; }
type GifImage struct { ... }

Typically, MyStructWithUnexportedFields is encoded as an empty map because it has no exported fields, while UUID will be encoded as a string, etc. However, with extension support, you can encode any of these however you like.

We provide implementations of these functions where the spec has defined an inter-operable format. For msgpack, these are Binary and time.Time. Library users will have to explicitly configure these as seen in the usage below.

Usage

Typical usage model:

var (
  mapStrIntfTyp = reflect.TypeOf(map[string]interface{}(nil))
  sliceByteTyp = reflect.TypeOf([]byte(nil))
  timeTyp = reflect.TypeOf(time.Time{})
)

// create and configure Handle
var (
  bh codec.BincHandle
  mh codec.MsgpackHandle
)

mh.MapType = mapStrIntfTyp

// configure extensions for msgpack, to enable Binary and Time support for tags 0 and 1
mh.AddExt(sliceByteTyp, 0, mh.BinaryEncodeExt, mh.BinaryDecodeExt)
mh.AddExt(timeTyp, 1, mh.TimeEncodeExt, mh.TimeDecodeExt)

// create and use decoder/encoder
var (
  r io.Reader
  w io.Writer
  b []byte
  h = &bh // or &mh to use msgpack
)

dec = codec.NewDecoder(r, h)
dec = codec.NewDecoderBytes(b, h)
err = dec.Decode(&v) 

enc = codec.NewEncoder(w, h)
enc = codec.NewEncoderBytes(&b, h)
err = enc.Encode(v)

//RPC Server
go func() {
    for {
        conn, err := listener.Accept()
        if err != nil {
            return // stop serving if the listener fails
        }
        rpcCodec := codec.GoRpc.ServerCodec(conn, h)
        //OR rpcCodec := codec.MsgpackSpecRpc.ServerCodec(conn, h)
        rpc.ServeCodec(rpcCodec)
    }
}()

//RPC Communication (client side)
conn, err = net.Dial("tcp", "localhost:5555")
rpcCodec := codec.GoRpc.ClientCodec(conn, h)
//OR rpcCodec := codec.MsgpackSpecRpc.ClientCodec(conn, h)
client := rpc.NewClientWithCodec(rpcCodec)

Representative Benchmark Results

A sample run of benchmark using "go test -bi -bench=.":

..............................................
Benchmark: 
	Struct recursive Depth:             1
	ApproxDeepSize Of benchmark Struct: 4786
Benchmark One-Pass Run:
	   msgpack: len: 1564
	      binc: len: 1191
	       gob: len: 1972
	      json: len: 2538
	 v-msgpack: len: 1600
	      bson: len: 3025
..............................................
PASS
Benchmark__Msgpack__Encode	   50000	     61731 ns/op
Benchmark__Msgpack__Decode	   10000	    115947 ns/op
Benchmark__Binc_____Encode	   50000	     64568 ns/op
Benchmark__Binc_____Decode	   10000	    113843 ns/op
Benchmark__Gob______Encode	   10000	    143956 ns/op
Benchmark__Gob______Decode	    5000	    431889 ns/op
Benchmark__Json_____Encode	   10000	    158662 ns/op
Benchmark__Json_____Decode	    5000	    310744 ns/op
Benchmark__Bson_____Encode	   10000	    172905 ns/op
Benchmark__Bson_____Decode	   10000	    228564 ns/op
Benchmark__VMsgpack_Encode	   20000	     81752 ns/op
Benchmark__VMsgpack_Decode	   10000	    160050 ns/op

To run full benchmark suite (including against vmsgpack and bson), see notes in ext_dep_test.go

Documentation

Overview

High Performance, Feature-Rich Idiomatic Go encoding library for msgpack and binc.

Supported Serialization formats are:

  • msgpack
  • binc

To install:

go get github.com/ugorji/go/codec

The idiomatic Go support is as seen in other encoding packages in the standard library (e.g. json, xml, gob).

Rich Feature Set includes:

  • Simple but extremely powerful and feature-rich API
  • Very High Performance.
    Our extensive benchmarks show us outperforming Gob, Json and Bson by 2-4X. This was achieved by taking extreme care on:
    • managing allocation
    • stack frame size (important due to Go's use of split stacks),
    • reflection use
    • recursion implications
    • zero-copy mode (encoding/decoding to byte slice without using temp buffers)
  • Correct. Care was taken to precisely handle corner cases like: overflows, nil maps and slices, nil value in stream, etc.
  • Efficient: avoids copying into temporary byte buffers when encoding into or decoding from a byte slice.
  • Standard field renaming via tags
  • Encoding from any value (struct, slice, map, primitives, pointers, interface{}, etc)
  • Decoding into pointer to any non-nil typed value (struct, slice, map, int, float32, bool, string, reflect.Value, etc)
  • Supports extension functions to handle the encode/decode of custom types
  • Support Go 1.2 encoding.BinaryMarshaler/BinaryUnmarshaler
  • Schema-less decoding (decode into a pointer to a nil interface{} as opposed to a typed non-nil value). Includes Options to configure what specific map or slice type to use when decoding an encoded list or map into a nil interface{}
  • Provides an RPC Server and Client Codec for net/rpc communication protocol.
  • Msgpack Specific:
    • Provides extension functions to handle spec-defined extensions (binary, timestamp)
    • Options to resolve ambiguities in handling raw bytes (as string or []byte) during schema-less decoding (decoding into a nil interface{})
    • RPC Server/Client Codec for msgpack-rpc protocol defined at: http://wiki.msgpack.org/display/MSGPACK/RPC+specification

Extension Support

Users can register a function to handle the encoding or decoding of their custom types.

There are no restrictions on what the custom type can be. Extensions can be any type: pointers, structs, custom types based on arrays/slices, strings, etc. Some examples:

type BitSet   []int
type BitSet64 uint64
type UUID     string
type MyStructWithUnexportedFields struct { a int; b bool; c []int; }
type GifImage struct { ... }

Typically, MyStructWithUnexportedFields is encoded as an empty map because it has no exported fields, while UUID will be encoded as a string, etc. However, with extension support, you can encode any of these however you like.

We provide implementations of these functions where the spec has defined an inter-operable format. For msgpack, these are Binary and time.Time. Library users will have to explicitly configure these as seen in the usage below.

Usage

Typical usage model:

var (
  mapStrIntfTyp = reflect.TypeOf(map[string]interface{}(nil))
  sliceByteTyp = reflect.TypeOf([]byte(nil))
  timeTyp = reflect.TypeOf(time.Time{})
)

// create and configure Handle
var (
  bh codec.BincHandle
  mh codec.MsgpackHandle
)

mh.MapType = mapStrIntfTyp

// configure extensions for msgpack, to enable Binary and Time support for tags 0 and 1
mh.AddExt(sliceByteTyp, 0, mh.BinaryEncodeExt, mh.BinaryDecodeExt)
mh.AddExt(timeTyp, 1, mh.TimeEncodeExt, mh.TimeDecodeExt)

// create and use decoder/encoder
var (
  r io.Reader
  w io.Writer
  b []byte
  h = &bh // or &mh to use msgpack
)

dec = codec.NewDecoder(r, h)
dec = codec.NewDecoderBytes(b, h)
err = dec.Decode(&v)

enc = codec.NewEncoder(w, h)
enc = codec.NewEncoderBytes(&b, h)
err = enc.Encode(v)

//RPC Server
go func() {
    for {
        conn, err := listener.Accept()
        if err != nil {
            return // stop serving if the listener fails
        }
        rpcCodec := codec.GoRpc.ServerCodec(conn, h)
        //OR rpcCodec := codec.MsgpackSpecRpc.ServerCodec(conn, h)
        rpc.ServeCodec(rpcCodec)
    }
}()

//RPC Communication (client side)
conn, err = net.Dial("tcp", "localhost:5555")
rpcCodec := codec.GoRpc.ClientCodec(conn, h)
//OR rpcCodec := codec.MsgpackSpecRpc.ClientCodec(conn, h)
client := rpc.NewClientWithCodec(rpcCodec)

Representative Benchmark Results

A sample run of benchmark using "go test -bi -bench=.":

..............................................
BENCHMARK INIT: 2013-09-30 14:18:26.997930788 -0400 EDT
To run full benchmark comparing encodings (MsgPack, Binc, JSON, GOB, etc), use: "go test -bench=."
Benchmark:
   	Struct recursive Depth:             1
   	ApproxDeepSize Of benchmark Struct: 4694 bytes
Benchmark One-Pass Run:
   	 v-msgpack: len: 1600 bytes
   	      bson: len: 3025 bytes
   	   msgpack: len: 1560 bytes
   	      binc: len: 1187 bytes
   	       gob: len: 1972 bytes
   	      json: len: 2538 bytes
..............................................
PASS
Benchmark__Msgpack__Encode	   50000	     69408 ns/op	   15852 B/op	      84 allocs/op
Benchmark__Msgpack__Decode	   10000	    119152 ns/op	   15542 B/op	     424 allocs/op
Benchmark__Binc_____Encode	   20000	     80940 ns/op	   18033 B/op	      88 allocs/op
Benchmark__Binc_____Decode	   10000	    123617 ns/op	   16363 B/op	     305 allocs/op
Benchmark__Gob______Encode	   10000	    152634 ns/op	   21342 B/op	     238 allocs/op
Benchmark__Gob______Decode	    5000	    424450 ns/op	   83625 B/op	    1842 allocs/op
Benchmark__Json_____Encode	   20000	     83246 ns/op	   13866 B/op	     102 allocs/op
Benchmark__Json_____Decode	   10000	    263762 ns/op	   14166 B/op	     493 allocs/op
Benchmark__Bson_____Encode	   10000	    129876 ns/op	   27722 B/op	     514 allocs/op
Benchmark__Bson_____Decode	   10000	    164583 ns/op	   16478 B/op	     789 allocs/op
Benchmark__VMsgpack_Encode	   50000	     71333 ns/op	   12356 B/op	     343 allocs/op
Benchmark__VMsgpack_Decode	   10000	    161800 ns/op	   20302 B/op	     571 allocs/op
ok  	ugorji.net/codec	27.165s

To run full benchmark suite (including against vmsgpack and bson), see notes in ext_dep_test.go

RPC

RPC Client and Server Codecs are implemented, so the codecs can be used with the standard net/rpc package.

Constants

This section is empty.

Variables

var GoRpc goRpc

GoRpc implements Rpc using the communication protocol defined in net/rpc package.

var MsgpackSpecRpc msgpackSpecRpc

MsgpackSpecRpc implements Rpc using the communication protocol defined in the msgpack spec at http://wiki.msgpack.org/display/MSGPACK/RPC+specification

Functions

This section is empty.

Types

type BincHandle

type BincHandle struct {
	EncodeOptions
	DecodeOptions
	// contains filtered or unexported fields
}

BincHandle is a Handle for the Binc Schema-Free Encoding Format defined at https://github.com/ugorji/binc .

BincHandle currently supports all Binc features with the following EXCEPTIONS:

  • Only integers up to 64 bits of precision are supported. Big integers are unsupported.
  • Only IEEE 754 binary32 and binary64 floats are supported (i.e. Go float32 and float64 types). Extended precision and decimal IEEE 754 floats are unsupported.
  • Only UTF-8 strings are supported. Unicode_Other Binc types (UTF16, UTF32) are currently unsupported.

Note that these EXCEPTIONS are temporary and full support is possible and may happen soon.

func (*BincHandle) AddExt

func (o *BincHandle) AddExt(
	rt reflect.Type,
	tag byte,
	encfn func(reflect.Value) ([]byte, error),
	decfn func(reflect.Value, []byte) error,
) (err error)

AddExt registers an encode and decode function for a reflect.Type. Note that the type must be a named type, and specifically not a pointer or Interface. An error is returned if that is not honored.
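The two function parameters have the shapes shown in the signature above. As a minimal sketch, assuming the library hands decfn a settable reflect.Value (this page does not spell that out), a pair of functions for the BitSet64 type from the README could look like the following. The names encodeBitSet64, decodeBitSet64 and roundTrip are hypothetical, and the 8-byte big-endian layout is only one possible choice.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"reflect"
)

// BitSet64 mirrors the custom type used as an example in the README.
type BitSet64 uint64

// encodeBitSet64 has the shape of AddExt's encfn parameter.
// The 8-byte big-endian layout is an arbitrary choice for this sketch.
func encodeBitSet64(rv reflect.Value) ([]byte, error) {
	if rv.Kind() != reflect.Uint64 {
		return nil, errors.New("encodeBitSet64: value is not uint64-based")
	}
	bs := make([]byte, 8)
	binary.BigEndian.PutUint64(bs, rv.Uint())
	return bs, nil
}

// decodeBitSet64 has the shape of AddExt's decfn parameter.
// Assumption: rv is a settable value of the registered type.
func decodeBitSet64(rv reflect.Value, bs []byte) error {
	if len(bs) != 8 {
		return errors.New("decodeBitSet64: want exactly 8 bytes")
	}
	if rv.Kind() != reflect.Uint64 || !rv.CanSet() {
		return errors.New("decodeBitSet64: need a settable uint64-based value")
	}
	rv.SetUint(binary.BigEndian.Uint64(bs))
	return nil
}

// roundTrip is a hypothetical helper that exercises both functions
// without involving the library at all.
func roundTrip(in BitSet64) (BitSet64, error) {
	bs, err := encodeBitSet64(reflect.ValueOf(in))
	if err != nil {
		return 0, err
	}
	var out BitSet64
	err = decodeBitSet64(reflect.ValueOf(&out).Elem(), bs)
	return out, err
}

func main() {
	out, err := roundTrip(BitSet64(0xA5))
	fmt.Println(out, err)
}
```

Functions like these would then be passed to AddExt together with reflect.TypeOf(BitSet64(0)) and a tag byte of your choosing.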

type DecodeOptions

type DecodeOptions struct {
	// An instance of MapType is used during schema-less decoding of a map in the stream.
	// If nil, we use map[interface{}]interface{}
	MapType reflect.Type
	// An instance of SliceType is used during schema-less decoding of an array in the stream.
	// If nil, we use []interface{}
	SliceType reflect.Type
	// ErrorIfNoField controls whether an error is returned when decoding a map
	// from a codec stream into a struct, and no matching struct field is found.
	ErrorIfNoField bool
}

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

A Decoder reads and decodes an object from an input stream in the codec format.

func NewDecoder

func NewDecoder(r io.Reader, h Handle) *Decoder

NewDecoder returns a Decoder for decoding a stream of bytes from an io.Reader.

For efficiency, Users are encouraged to pass in a memory buffered reader (eg bufio.Reader, bytes.Buffer).

func NewDecoderBytes

func NewDecoderBytes(in []byte, h Handle) *Decoder

NewDecoderBytes returns a Decoder which efficiently decodes directly from a byte slice with zero copying.

func (*Decoder) Decode

func (d *Decoder) Decode(v interface{}) (err error)

Decode decodes the stream from reader and stores the result in the value pointed to by v. v cannot be a nil pointer. v can also be a reflect.Value of a pointer.

Note that a pointer to a nil interface is not a nil pointer. If you do not know what type of stream it is, pass in a pointer to a nil interface. We will decode and store a value in that nil interface.
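The distinction between a nil pointer and a pointer to a nil interface can be checked in plain Go, without the library. In this sketch the helpers isNilPointer, holdsNil and fill are hypothetical; fill only stands in for what a decoder does when it stores a result through the pointer it was given.

```go
package main

import "fmt"

// isNilPointer reports whether the pointer itself is nil.
func isNilPointer(p *interface{}) bool { return p == nil }

// holdsNil reports whether p is usable but the interface it points to is nil.
func holdsNil(p *interface{}) bool { return p != nil && *p == nil }

// fill stands in for a decoder: it stores val through the pointer.
// A nil pointer gives it nowhere to store the result.
func fill(p *interface{}, val interface{}) bool {
	if p == nil {
		return false
	}
	*p = val
	return true
}

func main() {
	var v interface{} // a nil interface
	fmt.Println(isNilPointer(&v), holdsNil(&v))

	fill(&v, int64(42)) // the nil interface now holds a value
	fmt.Println(v)
}
```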

Sample usages:

// Decoding into a non-nil typed value
var f float32
err = codec.NewDecoder(r, handle).Decode(&f)

// Decoding into nil interface
var v interface{}
dec := codec.NewDecoder(r, handle)
err = dec.Decode(&v)

When decoding into a nil interface{}, we will decode into an appropriate value based on the contents of the stream. Numbers are decoded as float64, int64 or uint64. Other values are decoded appropriately (e.g. bool), and configurations exist on the Handle to override defaults (e.g. for MapType, SliceType and how to decode raw bytes).

When decoding into a non-nil interface{} value, the mode of decoding is based on the type of the value. When a value is seen:

  • If an extension is registered for it, call that extension function
  • If it implements BinaryUnmarshaler, call its UnmarshalBinary(data []byte) error
  • Else decode it based on its reflect.Kind

There are some special rules when decoding into containers (slice/array/map/struct). Decode will typically use the stream contents to UPDATE the container.

  • This means that for a struct or map, we just update matching fields or keys.
  • For a slice/array, we just update the first n elements, where n is the length of the stream.
  • However, if decoding into a nil map/slice and the length of the stream is 0, we reset the destination map/slice to be a zero-length non-nil map/slice.
  • Also, if the encoded value is Nil in the stream, then we try to set the container to its "zero" value (e.g. nil for slice/map).
  • Note that a struct can be decoded from an array in the stream, by updating fields as they occur in the struct.
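The update rules for slices and maps listed above can be modelled in plain Go. This is only an illustration of the stated rules, not the library's code: updateSlice and updateMap are hypothetical helpers, the stream is represented as an already-decoded Go value, and two behaviours are assumptions that the text does not confirm (growing a slice when the stream is longer, and adding map keys that were not already present).

```go
package main

import "fmt"

// updateSlice applies the slice rules to dst.
// stream holds the decoded elements; streamNil marks an encoded Nil.
func updateSlice(dst []int, stream []int, streamNil bool) []int {
	if streamNil {
		return nil // Nil in the stream: set the container to its zero value
	}
	if dst == nil && len(stream) == 0 {
		return []int{} // nil destination, empty stream: zero-length, non-nil
	}
	if len(stream) > len(dst) {
		// Assumption: grow the destination so every stream element fits.
		grown := make([]int, len(stream))
		copy(grown, dst)
		dst = grown
	}
	copy(dst, stream) // overwrite only the first len(stream) elements
	return dst
}

// updateMap overwrites the keys present in stream and keeps the rest.
// Assumption: keys missing from dst are added.
func updateMap(dst, stream map[string]int) map[string]int {
	if dst == nil {
		dst = make(map[string]int, len(stream))
	}
	for k, v := range stream {
		dst[k] = v
	}
	return dst
}

func main() {
	fmt.Println(updateSlice([]int{1, 2, 3, 4}, []int{9, 8}, false)) // [9 8 3 4]
	fmt.Println(updateMap(map[string]int{"a": 1, "b": 2}, map[string]int{"b": 20})) // map[a:1 b:20]
}
```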

type EncodeOptions

type EncodeOptions struct {
	// Encode a struct as an array, and not as a map.
	StructToArray bool
}

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

An Encoder writes an object to an output stream in the codec format.

func NewEncoder

func NewEncoder(w io.Writer, h Handle) *Encoder

NewEncoder returns an Encoder for encoding into an io.Writer.

For efficiency, Users are encouraged to pass in a memory buffered writer (eg bufio.Writer, bytes.Buffer).

func NewEncoderBytes

func NewEncoderBytes(out *[]byte, h Handle) *Encoder

NewEncoderBytes returns an encoder for encoding directly and efficiently into a byte slice, without copying into temporary slices.

It will potentially replace the output byte slice pointed to. After encoding, the out parameter contains the encoded contents.
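The reason the parameter is a *[]byte rather than a []byte can be shown with plain Go: when the existing capacity is too small, append allocates a new backing array, and only a pointer lets the callee hand the new slice back. The function writeInto below is a hypothetical stand-in for an encoder, not the library's implementation.

```go
package main

import "fmt"

// writeInto is a hypothetical stand-in for an encoder writing into *out.
// It reuses the existing capacity where possible; when append has to
// allocate, the new slice is stored back through the pointer so the
// caller sees it.
func writeInto(out *[]byte, payload string) {
	*out = append((*out)[:0], payload...)
}

func main() {
	b := make([]byte, 0, 2) // deliberately too small for the payload
	writeInto(&b, "hello")
	fmt.Println(len(b), string(b))
}
```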

func (*Encoder) Encode

func (e *Encoder) Encode(v interface{}) (err error)

Encode writes an object into a stream in the codec format.

Encoding can be configured via the "codec" struct tag for the fields.

The value of the "codec" key in a struct field's tag is the key name, followed by an optional comma and options.

To set an option on all fields (e.g. omitempty on all fields), you can create a field called _struct, and set flags on it.

Struct values "usually" encode as maps. Each exported struct field is encoded unless:

  • the field's codec tag is "-", OR
  • the field is empty and its codec tag specifies the "omitempty" option.

When encoding as a map, the first string in the tag (before the comma) is the map key string to use when encoding.

However, struct values may encode as arrays. This happens when:

  • StructToArray Encode option is set, OR
  • the codec tag on the _struct field sets the "toarray" option

The empty values (for omitempty option) are false, 0, any nil pointer or interface value, and any array, slice, map, or string of length zero.
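The rule above can be written as a small reflect-based check. This is a sketch of the stated rule only; isEmpty is a hypothetical helper and may differ from what the library does internally (for example, treating floating-point zero as empty is an assumption that extends "0" to floats).

```go
package main

import (
	"fmt"
	"reflect"
)

// isEmpty reports whether x counts as empty under the rule quoted above.
func isEmpty(x interface{}) bool {
	v := reflect.ValueOf(x)
	switch v.Kind() {
	case reflect.Invalid:
		return true // a nil interface value
	case reflect.Bool:
		return !v.Bool()
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		return v.Int() == 0
	case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr:
		return v.Uint() == 0
	case reflect.Float32, reflect.Float64:
		return v.Float() == 0 // assumption: "0" covers floats too
	case reflect.Array, reflect.Slice, reflect.Map, reflect.String:
		return v.Len() == 0
	case reflect.Ptr, reflect.Interface:
		return v.IsNil()
	}
	return false // structs and other kinds are not treated as empty here
}

func main() {
	fmt.Println(isEmpty(0), isEmpty(""), isEmpty([]int{1}))
}
```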

Anonymous fields are encoded inline if no struct tag is present. Else they are encoded as regular fields.

Examples:

type MyStruct struct {
    _struct bool    `codec:",omitempty"`   //set omitempty for every field
    Field1 string   `codec:"-"`            //skip this field
    Field2 int      `codec:"myName"`       //Use key "myName" in encode stream
    Field3 int32    `codec:",omitempty"`   //use key "Field3". Omit if empty.
    Field4 bool     `codec:"f4,omitempty"` //use key "f4". Omit if empty.
    ...
}

type MyStruct struct {
    _struct bool    `codec:",omitempty,toarray"`   //set omitempty for every field
                                                   //and encode struct as an array
}
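To make the tag format concrete, the following sketch reads the tags of the first example with the standard reflect package and reports how each would be interpreted under the rules above. parseCodecTag and describe are hypothetical helpers written for this illustration; they are not part of the package and do not reproduce its internal parsing.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// parseCodecTag splits a "codec" tag value into the key name and the
// two options mentioned in this document.
func parseCodecTag(tag string) (name string, omitEmpty, toArray bool) {
	parts := strings.Split(tag, ",")
	name = parts[0]
	for _, opt := range parts[1:] {
		switch opt {
		case "omitempty":
			omitEmpty = true
		case "toarray":
			toArray = true
		}
	}
	return name, omitEmpty, toArray
}

// MyStruct repeats the tagged fields from the first example above.
type MyStruct struct {
	_struct bool   `codec:",omitempty"`
	Field1  string `codec:"-"`
	Field2  int    `codec:"myName"`
	Field3  int32  `codec:",omitempty"`
	Field4  bool   `codec:"f4,omitempty"`
}

// describe lists how each field's tag reads. v must be a struct value.
func describe(v interface{}) []string {
	t := reflect.TypeOf(v)
	var out []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name, omit, _ := parseCodecTag(f.Tag.Get("codec"))
		switch {
		case f.Name == "_struct":
			out = append(out, fmt.Sprintf("_struct: struct-wide options, omitempty=%v", omit))
		case name == "-":
			out = append(out, f.Name+": skipped")
		default:
			if name == "" {
				name = f.Name // no key name in the tag: use the field name
			}
			out = append(out, fmt.Sprintf("%s: key %q, omitempty=%v", f.Name, name, omit))
		}
	}
	return out
}

func main() {
	for _, line := range describe(MyStruct{}) {
		fmt.Println(line)
	}
}
```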

The mode of encoding is based on the type of the value. When a value is seen:

  • If an extension is registered for it, call that extension function
  • If it implements BinaryMarshaler, call its MarshalBinary() (data []byte, err error)
  • Else encode it based on its reflect.Kind

Note that struct field names and keys in map[string]XXX will be treated as symbols. Some formats support symbols (e.g. binc) and will properly encode the string only once in the stream, and use a tag to refer to it thereafter.

type Handle

type Handle interface {
	// contains filtered or unexported methods
}

Handle is the interface for a specific encoding format.

Typically, a Handle is pre-configured before first use, and not modified while in use. Such a pre-configured Handle is safe for concurrent access.

type MsgpackHandle

type MsgpackHandle struct {
	// RawToString controls how raw bytes are decoded into a nil interface{}.
	RawToString bool
	// WriteExt flag supports encoding configured extensions with extension tags.
	// It also controls whether other elements of the new spec are encoded (ie Str8).
	//
	// With WriteExt=false, configured extensions are serialized as raw bytes
	// and Str8 is not encoded.
	//
	// A stream can still be decoded into a typed value, provided an appropriate value
	// is provided, but the type cannot be inferred from the stream. If no appropriate
	// type is provided (e.g. decoding into a nil interface{}), you get back
	// a []byte or string based on the setting of RawToString.
	WriteExt bool

	EncodeOptions
	DecodeOptions
	// contains filtered or unexported fields
}

MsgpackHandle is a Handle for the Msgpack Schema-Free Encoding Format.

func (*MsgpackHandle) AddExt

func (o *MsgpackHandle) AddExt(
	rt reflect.Type,
	tag byte,
	encfn func(reflect.Value) ([]byte, error),
	decfn func(reflect.Value, []byte) error,
) (err error)

AddExt registers an encode and decode function for a reflect.Type. Note that the type must be a named type, and specifically not a pointer or Interface. An error is returned if that is not honored.

func (*MsgpackHandle) TimeDecodeExt

func (_ *MsgpackHandle) TimeDecodeExt(rv reflect.Value, bs []byte) (err error)

TimeDecodeExt decodes a time.Time from the byte slice parameter, and sets it into the reflect value. Configure this to support the Time Extension, e.g. using tag 1.

func (*MsgpackHandle) TimeEncodeExt

func (_ *MsgpackHandle) TimeEncodeExt(rv reflect.Value) (bs []byte, err error)

TimeEncodeExt encodes a time.Time as a byte slice. Configure this to support the Time Extension, e.g. using tag 1.

type MsgpackSpecRpcMultiArgs

type MsgpackSpecRpcMultiArgs []interface{}

MsgpackSpecRpcMultiArgs is a special type which signifies to the MsgpackSpecRpcCodec that the backend RPC service takes multiple arguments, which have been arranged in sequence in the slice.

The Codec then passes it AS-IS to the rpc service (without wrapping it in an array of 1 element).

type Rpc

type Rpc interface {
	ServerCodec(conn io.ReadWriteCloser, h Handle) rpc.ServerCodec
	ClientCodec(conn io.ReadWriteCloser, h Handle) rpc.ClientCodec
}

The Rpc interface provides an RPC Server or Client Codec for rpc communication.
