Documentation ¶
Overview ¶
Package netstring provides robust encoding and decoding of netstrings to and from byte streams.
netstrings are a simple serialization technique originally defined by http://cr.yp.to/netstrings.txt. Typical usage is to exchange messages consisting of a small number of well-defined netstrings. Complex messages with many variables and changing semantics are better suited to more sophisticated encoding schemes such as encoding/json or Protobufs.
Netstrings are of the form: [length] ":" [value] "," where [value] is the payload of interest, [length] is the length of [value] in bytes written as a decimal number, and the colon and comma are leading and trailing delimiters respectively. The string "The Hitchhiker's Guide to the Galaxy - DA." encoded as a netstring is:
"42:The Hitchhiker's Guide to the Galaxy - DA.,"
Binary Values ¶
Storing binary values in netstrings, while possible, is not recommended because binary representations, such as byte order and word size, differ between CPU architectures. Best practice is to convert all binary values to strings prior to encoding.
To assist in this best practice, helper functions are available to encode basic go-types such as ints and floats to netstrings. E.g. the function Encoder.EncodeInt() converts int(2^16) to the netstring:
"5:65536,"
Most of these helpers use strconv.Format* functions to convert binary values to strings and applications are encouraged to use the corresponding strconv.Parse*() functions to decode non-string values back to internal binary. The specifics of each non-string conversion are documented in each helper function.
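The round trip described above can be sketched with the standard library alone. The helper names intToNetstring and textToInt are hypothetical stand-ins for illustration; they are not part of this package and only mirror the strconv.Format*/Parse* pairing.

```go
package main

import (
	"fmt"
	"strconv"
)

// intToNetstring converts n to decimal text and wraps it as a standard
// netstring. It is a hypothetical stand-in for Encoder.EncodeInt.
func intToNetstring(n int64) string {
	text := strconv.FormatInt(n, 10)
	return strconv.Itoa(len(text)) + ":" + text + ","
}

// textToInt converts a decoded netstring value back to an integer.
func textToInt(value string) (int64, error) {
	return strconv.ParseInt(value, 10, 64)
}

func main() {
	fmt.Println(intToNetstring(1 << 16)) // 5:65536,
	n, err := textToInt("65536")
	fmt.Println(n, err)
}
```

Because the value travels as decimal text, the receiver does not need to know the sender's byte order or word size.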
Apart from simple struct support with Marshal() and Unmarshal() there is no support for encoding complex go types such as nested structs, arrays, slices and maps as this is the juncture at which the application might best be served using a more sophisticated encoding scheme as mentioned earlier.
Rigorous Parsing ¶
This package is particularly fastidious about parsing and generating valid netstrings. For example, the specification says that a length can only start with a zero digit if the length field is exactly one byte long - in other words, a zero-length netstring. Many netstring packages nevertheless accept any number of leading zeroes because they use something like the tolerant strconv.Atoi() to convert the length. Not so for this package.
If the Decoder fails for some reason, the parser stays in a permanent error state as resynchronizing to the next netstring after any syntax error is impossible to perform reliably.
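To illustrate the difference between strict and tolerant length parsing, here is a minimal sketch. The function parseLength and its error values are hypothetical and are not this package's parser; the nine-digit cap assumes the same limit as MaximumLength.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

var (
	errNotDigit    = errors.New("length does not start with a digit")
	errBadDigit    = errors.New("length contains a non-digit")
	errLeadingZero = errors.New("non-zero length cannot have a leading zero")
	errTooLong     = errors.New("length has too many digits")
)

// parseLength strictly parses the decimal length field that precedes the colon.
func parseLength(field string) (int, error) {
	if len(field) == 0 || field[0] < '0' || field[0] > '9' {
		return 0, errNotDigit
	}
	if len(field) > 9 { // assumes a limit of 999999999
		return 0, errTooLong
	}
	if field[0] == '0' && len(field) > 1 {
		return 0, errLeadingZero
	}
	n := 0
	for i := 0; i < len(field); i++ {
		c := field[i]
		if c < '0' || c > '9' {
			return 0, errBadDigit
		}
		n = n*10 + int(c-'0')
	}
	return n, nil
}

func main() {
	for _, field := range []string{"0", "42", "007", "+7"} {
		n, err := parseLength(field)
		tolerant, _ := strconv.Atoi(field) // accepts "007" and "+7"
		fmt.Printf("%q strict=%d err=%v atoi=%d\n", field, n, err, tolerant)
	}
}
```

strconv.Atoi converts both "007" and "+7" to 7 without complaint, which is exactly the tolerance a strict parser has to avoid.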
Assembling Messages ¶
Typical usage of netstrings is to assemble a simple message consisting of a small number of netstrings. E.g., if an application wants to transmit Age, Country and Name it could be encoded as these three netstrings:
"2:21,7:Iceland,5:Bjorn,"
and then written to the transmission channel for decoding at the remote end.
To correctly decode the message, the remote end has to know that there are exactly three netstrings in the message and it also has to know the correct order of the values - in this case: Age, Country and Name.
Netstring messages are brittle ¶
As mentioned above, when exchanging messages, both ends have to agree on the number and order of netstrings to be able to correctly encode and decode the message. If a message changes, perhaps because a new value is added, both ends have to be upgraded at the same time. For simple messages and tightly coupled applications, this brittleness is tolerable, but for loosely coupled applications and more complex messages, this brittleness is limiting and unwieldy.
To alleviate this brittleness, this package supports the notion of "keyed" netstrings which provide much greater flexibility in arranging a message.
"Keyed" netstrings ¶
"Keyed" netstrings allow a single byte - known as a "key" - to be associated with each netstring. The "key" only has meaning to the application as this package merely facilitates associating a "key" with each netstring. For example a "key" might define how the application should decode the netstring or it might associate a netstring with a particular field in a struct. Or it might mean something else entirely!
A "keyed" netstring is a simple convention which is nothing more than a regular netstring with the first byte being used as the "key" and subsequent bytes representing the value. For example, the netstring:
"4:dDog,"
can be interpreted as a "keyed" netstring with a key of 'd' and a value of "Dog".
As it's merely a convention, both encoders and decoders need to agree on whether they are exchanging standard netstrings or "keyed" netstrings.
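The convention can be sketched as a small function that splits a decoded netstring value into its key and the remaining bytes. splitKeyed is a hypothetical helper for illustration and is not part of this package.

```go
package main

import (
	"errors"
	"fmt"
)

// splitKeyed treats the first byte of a decoded netstring value as the key
// and returns the remaining bytes as the value.
func splitKeyed(value []byte) (key byte, rest []byte, err error) {
	if len(value) == 0 {
		return 0, nil, errors.New("zero-length netstring has no key")
	}
	k := value[0]
	isLower := k >= 'a' && k <= 'z'
	isUpper := k >= 'A' && k <= 'Z'
	if !isLower && !isUpper {
		return 0, nil, errors.New("key is not in 'a'-'z' or 'A'-'Z'")
	}
	return k, value[1:], nil
}

func main() {
	// "dDog" is the value carried by the netstring "4:dDog,".
	k, v, err := splitKeyed([]byte("dDog"))
	fmt.Printf("%c %s %v\n", k, v, err)
}
```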
The benefit of "keyed" netstrings is that they create a simple typing system such that netstrings can be associated with particular variables and can be serialized in any order or even optionally serialized as part of an aggregate message. In short, "keyed" netstrings are a flexible form of Type-Length-Value encoding.
Using the earlier example of Age, Country and Name, a message with "keyed" netstrings might look like:
"3:a21,8:CIceland,6:nBjorn,"
where the key 'a' means Age, 'C' means Country and 'n' means Name.
or possibly:
"6:nBjorn,3:a21,"
if Country is optional.
Note how "keyed" netstrings no longer need to be serialized in order, nor do they need to be present if optional as compared to positional netstrings.
Another minor benefit of "keyed" netstrings is the ability to differentiate between zero length values and NULL. If the "keyed" netstring is present it implies a value; if the "keyed" netstring is absent it implies a NULL.
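A short sketch of these properties, assuming a hypothetical parseKeyed helper that collects "keyed" netstrings into a map. It uses the tolerant strconv.Atoi purely for brevity and is not how this package parses.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseKeyed collects every "keyed" netstring in msg into a map of key to value.
func parseKeyed(msg string) (map[byte]string, error) {
	fields := make(map[byte]string)
	for len(msg) > 0 {
		colon := strings.IndexByte(msg, ':')
		if colon < 1 {
			return nil, errors.New("missing length or colon")
		}
		n, err := strconv.Atoi(msg[:colon]) // tolerant parse, for brevity only
		if err != nil || n < 1 || n > len(msg)-colon-2 {
			return nil, errors.New("bad length")
		}
		end := colon + 1 + n
		if msg[end] != ',' {
			return nil, errors.New("trailing comma not found")
		}
		value := msg[colon+1 : end]
		fields[value[0]] = value[1:] // first byte is the key
		msg = msg[end+1:]
	}
	return fields, nil
}

func main() {
	// Name arrives before Age, Country is absent, and 'x' is present but empty.
	fields, err := parseKeyed("6:nBjorn,3:a21,1:x,")
	fmt.Println(fields['n'], fields['a'], err)
	_, hasCountry := fields['C']
	empty, hasX := fields['x']
	fmt.Println(hasCountry, hasX, empty == "")
}
```

The map lookup's second result distinguishes an absent key (NULL) from a key that is present with a zero-length value.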
"Keyed" netstrings thus allow greater flexibility in message assembly and disassembly as well as much easier upgrades of messages without having to necessarily synchronize transmitters and receivers.
To ensure "keyed" netstrings remain as strings, a valid "key" must be in the isalpha() character set - that is 'a'-'z' and 'A'-'Z'.
This package imposes no limitations on how "keyed" netstrings are used. An application is free to re-use the same "key" in the same message if it makes sense to do so. Note that this level of flexibility does not apply to the higher level Marshal() and Unmarshal() functions.
Encoder.Marshal and Decoder.Unmarshal are purposely designed with "keyed" netstrings in mind as they encode and decode a simple struct into a message with "keyed" netstrings. There are various rules around how netstring keys are used and what constitutes a simple struct.
End of Message Strategies ¶
When designing a message containing multiple netstrings, the question arises as to how to signify to the remote receiver that they have received all netstrings for that particular message. One strategy already mentioned is to simply agree on the number of netstrings - with the obvious brittleness that imposes.
Another strategy is to create an encapsulating netstring which contains all of the message's netstrings, so that the receiver accepts a single netstring and then decodes its value for the actual payload. Using our earlier example, this is what an encapsulating netstring might look like:
"26:3:a21,8:CIceland,6:nBjorn,,"
with the encapsulating netstring being 26 bytes long in which the value contains three "keyed" netstrings of 'a', 'C' and 'n'.
While this strategy works, one problem is that it requires double handling of each message.
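The length arithmetic of the encapsulating netstring can be checked with a short sketch. The wrap helper is hypothetical and simply applies the [length] ":" [value] "," form.

```go
package main

import (
	"fmt"
	"strconv"
)

// wrap encodes value as a single netstring.
func wrap(value string) string {
	return strconv.Itoa(len(value)) + ":" + value + ","
}

func main() {
	// First handling: encode each "keyed" netstring.
	inner := wrap("a21") + wrap("CIceland") + wrap("nBjorn")
	// Second handling: encode the concatenation as one outer netstring.
	outer := wrap(inner)
	fmt.Println(len(inner)) // 6 + 11 + 9 bytes
	fmt.Println(outer)
}
```

The sender must buffer the whole inner message before it can write the outer length, and the receiver must parse twice, which is the double handling noted above.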
Yet another strategy is to use "keyed" netstrings and designate a particular key as an end-of-message sentinel, such as 'z'. Using our previous example message with Age, Country and Name, the complete message with a trailing end-of-message sentinel of 'z' might look like:
"3:a21,8:CIceland,6:nBjorn,1:z,"
or possibly:
"6:nBjorn,3:a21,1:z,"
such that any application decoding the message knows it has a complete message when the 'z' "key" is returned.
Encoder.Marshal and Decoder.Unmarshal use this end-of-message sentinel strategy.
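The sentinel strategy can be sketched as a loop that reads "keyed" netstrings from a stream until the sentinel key arrives. All three functions below are hypothetical stand-ins written against the standard library only; they use tolerant length parsing for brevity and do not reflect how this package is implemented.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// readNetstring reads one netstring from r and returns its value.
func readNetstring(r *bufio.Reader) ([]byte, error) {
	field, err := r.ReadString(':')
	if err == io.EOF && field != "" {
		return nil, io.ErrUnexpectedEOF // stream ended inside a length field
	}
	if err != nil {
		return nil, err // io.EOF here is a clean end between netstrings
	}
	n, err := strconv.Atoi(strings.TrimSuffix(field, ":"))
	if err != nil || n < 0 || n > 999999999 {
		return nil, errors.New("bad length")
	}
	buf := make([]byte, n+1) // the value plus its trailing comma
	if _, err := io.ReadFull(r, buf); err != nil {
		if err == io.EOF {
			err = io.ErrUnexpectedEOF
		}
		return nil, err
	}
	if buf[n] != ',' {
		return nil, errors.New("trailing comma not found")
	}
	return buf[:n], nil
}

// readMessage collects "keyed" netstrings until the sentinel key eom arrives.
func readMessage(r *bufio.Reader, eom byte) (map[byte]string, error) {
	msg := make(map[byte]string)
	for first := true; ; first = false {
		v, err := readNetstring(r)
		if err == io.EOF && !first {
			return nil, io.ErrUnexpectedEOF // stream ended before the sentinel
		}
		if err != nil {
			return nil, err
		}
		if len(v) == 0 {
			return nil, errors.New("zero-length netstring has no key")
		}
		if v[0] == eom {
			return msg, nil
		}
		msg[v[0]] = string(v[1:])
	}
}

// decodeStream splits a stream into messages, stopping at a clean EOF.
func decodeStream(stream string, eom byte) ([]map[byte]string, error) {
	r := bufio.NewReader(strings.NewReader(stream))
	var msgs []map[byte]string
	for {
		m, err := readMessage(r, eom)
		if err == io.EOF {
			return msgs, nil
		}
		if err != nil {
			return msgs, err
		}
		msgs = append(msgs, m)
	}
}

func main() {
	msgs, err := decodeStream("3:a21,8:CIceland,6:nBjorn,1:z,6:nAstra,1:z,", 'z')
	fmt.Println(len(msgs), err)
	for _, m := range msgs {
		fmt.Println(m['n'], m['a'], m['C'])
	}
}
```

Note that a stream which ends after some netstrings but before the sentinel is reported as an error rather than as a short message.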
Examples ¶
The _examples directory contains a number of programs which demonstrate various features of this package so this section merely contains a few fragments to provide a general idea of idiomatic use.
This example encodes a message into a bytes.Buffer.
var buf bytes.Buffer
enc := netstring.NewEncoder(&buf)
enc.EncodeInt('a', 21)           // Age
enc.EncodeString('C', "Iceland") // Country
enc.EncodeString('n', "Bjorn")   // Name
enc.EncodeBytes('z')             // End-of-message sentinel
fmt.Println(buf.String())        // "3:a21,8:CIceland,6:nBjorn,1:z,"
And this example decodes the same message.
dec := netstring.NewDecoder(&buf)
k, v, e := dec.DecodeKeyed() // k=a, v=21
k, v, e = dec.DecodeKeyed()  // k=C, v=Iceland
k, v, e = dec.DecodeKeyed()  // k=n, v=Bjorn
k, v, e = dec.DecodeKeyed()  // k=z End-Of-Message
A complete implementation of this example is in _examples/compare.go which encodes a simple message with "keyed" netstrings then decodes the message to ensure that the reconstructed values match the originals.
The higher level functions Marshal() and Unmarshal() can be used to exchange complete messages. These functions use "keyed" netstrings with an end-of-message sentinel to package up a complete message from a simple struct.
This example encodes the same message as above using Encoder.Marshal().
type msg struct {
	Age     int    `netstring:"a"`
	Country string `netstring:"C"`
	Name    string `netstring:"n"`
}

var buf bytes.Buffer
enc := netstring.NewEncoder(&buf)
out := &msg{21, "Iceland", "Bjorn"}
enc.Marshal('z', out)
fmt.Println(buf.String()) // "3:a21,8:CIceland,6:nBjorn,1:z,"
And this example decodes the same message using Decoder.Unmarshal().
dec := netstring.NewDecoder(&buf)
in := &msg{}
dec.Unmarshal('z', in)
_examples/client.go and _examples/server.go show how an Encoder and Decoder can be attached to a network connection such that all exchanges across the network are performed with netstrings. These example programs use both the lower level Encode*() and Decode*() functions as well as the higher level Marshal() and Unmarshal() functions.
Index ¶
- Constants
- Variables
- type Decoder
- type Encoder
- func (enc *Encoder) Encode(key Key, val any) error
- func (enc *Encoder) EncodeBool(key Key, val bool) error
- func (enc *Encoder) EncodeByte(key Key, val byte) error
- func (enc *Encoder) EncodeBytes(key Key, val ...[]byte) error
- func (enc *Encoder) EncodeFloat32(key Key, val float32) error
- func (enc *Encoder) EncodeFloat64(key Key, val float64) error
- func (enc *Encoder) EncodeInt(key Key, val int) error
- func (enc *Encoder) EncodeInt32(key Key, val int32) error
- func (enc *Encoder) EncodeInt64(key Key, val int64) error
- func (enc *Encoder) EncodeString(key Key, val string) error
- func (enc *Encoder) EncodeUint(key Key, val uint) error
- func (enc *Encoder) EncodeUint32(key Key, val uint32) error
- func (enc *Encoder) EncodeUint64(key Key, val uint64) error
- func (enc *Encoder) Marshal(eom Key, message any) error
- type Key
Constants ¶
const MaximumLength = 999999999
MaximumLength defines the maximum length of a value in a netstring.
The original specification doesn't actually define a maximum length so this somewhat arbitrary value is defined mostly as a safety margin for CPUs for which the go compiler defines int as int32.
Having said that, the original specification *does* include a code fragment which suggests the same limit so it seems like a good place to start. This limit is slightly less than 2^30, so safe for any int32/uint32 storage.
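The arithmetic behind this limit can be confirmed with a small sketch. The constant maximumLength below is a local copy of the documented value, and the two helper functions are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

const maximumLength = 999999999 // same value as netstring.MaximumLength

// lengthDigits returns how many decimal digits the largest length field needs.
func lengthDigits() int {
	return len(strconv.Itoa(maximumLength))
}

// fitsInt32 reports whether the limit is below both 2^30 and math.MaxInt32.
func fitsInt32() bool {
	return maximumLength < 1<<30 && maximumLength <= math.MaxInt32
}

func main() {
	fmt.Println(lengthDigits(), fitsInt32())
}
```

A practical consequence is that a length field never needs more than nine digits, so a parser can reject longer fields before converting them.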
Variables ¶
var ErrBadMarshalEOM = errors.New(errorPrefix + "End-of-Message Key is invalid")
var ErrBadMarshalTag = errors.New(errorPrefix + "struct tag is not a valid netstring.Key")
var ErrBadMarshalValue = errors.New(errorPrefix + "Marshal only accepts struct{} and *struct{}")
var ErrBadUnmarshalMsg = errors.New(errorPrefix + "Unmarshal only accepts *struct{}")
var ErrColonExpected = errors.New(errorPrefix + "Leading colon delimiter not found after length")
var ErrCommaExpected = errors.New(errorPrefix + "Trailing comma delimeter not found after value")
var ErrInvalidKey = errors.New(errorPrefix + "Key is not in range 'a'-'z' or 'A'-'Z'")
var ErrLeadingZero = errors.New(errorPrefix + "Non-zero length cannot have a leading zero")
var ErrLengthNotDigit = errors.New(errorPrefix + "Length does not start with a digit")
var ErrLengthToLong = errors.New(errorPrefix + "Length contains more bytes than maximum allowed")
var ErrNoKey = errors.New(errorPrefix + "Keyed netstring cannot be NoKey")
var ErrUnsupportedType = errors.New(errorPrefix + "Unsupported go type supplied to Encode()")
var ErrValueToLong = errors.New(errorPrefix + "Length of value is longer than maximum allowed")
var ErrZeroKey = errors.New(errorPrefix + "Keyed netstring is zero length (thus has no key)")
Functions ¶
This section is empty.
Types ¶
type Decoder ¶
type Decoder struct {
// contains filtered or unexported fields
}
Decoder provides a netstring decode capability. A Decoder *must* be constructed with NewDecoder otherwise subsequent calls will panic.
The byte-stream from the io.Reader provided to NewDecoder is expected to contain a pure stream of netstrings. Each netstring can be retrieved via [Decode] and [DecodeKeyed] for standard netstrings and "keyed" netstrings respectively. The sending and receiving applications must agree on all aspects of how these netstrings are interpreted. Typically they will agree on a message structure which is either a fixed number of standard netstrings or a variable number of "keyed" netstrings terminated by an end-of-message sentinel.
[Decode] and [DecodeKeyed] are used to access each decoded netstring as it becomes available and [Unmarshal] is used to decode a complete "message" containing a series of "keyed" netstrings (including an end-of-message sentinel) into a simple struct.
It is often good practice to wrap the input io.Reader in a bufio.Reader as this can improve parsing performance.
If the Decoder detects a malformed netstring, it stops parsing, returns an error and effectively stops all future parsing for that byte stream because once synchronization is lost, it can never be recovered.
Decoder passes io.EOF back to the caller from the io.Reader, but only after all bytes have been consumed in the process of producing netstrings. An application should anticipate io.EOF if the io.Reader constitutes a network connection of some type. Unlike io.Reader, the EOF error is *not* returned in the same call which returns a valid netstring or message.
func NewDecoder ¶
NewDecoder constructs a Decoder which accepts a byte stream via its io.Reader interface and presents decoded netstrings via Decode(), DecodeKeyed() and Unmarshal().
func (*Decoder) Decode ¶ added in v1.0.1
Decode returns the next available netstring. If no more netstrings are available from the supplied io.Reader, io.EOF is returned.
Once an invalid netstring is detected, the byte stream is considered permanently unrecoverable and the same error is returned in perpetuity.
The [DecodeKeyed] function is better suited if the application is using "keyed" netstrings.
func (*Decoder) DecodeKeyed ¶ added in v1.0.1
DecodeKeyed is used when the stream contains "keyed" netstrings created by the Encoder. A "keyed" netstring is simply a netstring where the first byte is a "key" used to categorize the rest of the value. What that categorization means is entirely up to the application.
DecodeKeyed returns the next available netstring, if any, along with the prefix "key". The returned value does *not* include the prefix "key". If no more netstrings are available, io.EOF is returned as the error.
Once an invalid netstring is detected, the byte stream is considered permanently unrecoverable and the same error is returned in perpetuity.
This function returns non-persistent errors if a non-keyed netstring is parsed. A non-keyed netstring is either zero length or the first byte is not an isalpha() key value.
func (*Decoder) Unmarshal ¶ added in v1.0.1
Unmarshal takes incoming "keyed" netstrings and populates "message". Message must be a pointer to a simple struct with the same restrictions as discussed in Marshal.
Each netstring is read via Decoder.DecodeKeyed() until a "keyed" netstring matches "eom". Each netstring is decoded into the field with a "netstring" tag matching the netstring "key".
The end-of-message sentinel, "eom", can be any valid Key excepting netstring.NoKey. When the "eom" netstring is seen, the message is considered fully populated, the "eom" message is discarded and control is returned to the caller.
If "message" is not a pointer to a simple struct an error is returned. Only exported fields with "netstring" tags are considered for incoming "keyed" netstrings. If "message" contains duplicate "netstring" tag values an error is returned.
The "unknown" variable is set with the key of any incoming "keyed" netstring which has no corresponding field in "message". Obviously only one "unknown" is visible to the caller even though there may be multiple occurrences. Since an unknown key may be acceptable to the application, it is left to the caller to decide whether this situation results in an error, an alert to upgrade, or silence.
An example:
type record struct {
	Age         int    `netstring:"a"`
	Country     string `netstring:"c"`
	TLD         []byte `netstring:"t"`
	CountryCode []byte `netstring:"C"`
	Name        string `netstring:"n"`
}

bbuf := bytes.NewBufferString("3:Mr0,3:a22,12:cNew Zealand,3:C64,4:nBob,1:Z,")
dec := netstring.NewDecoder(bbuf)
k, v, e := dec.DecodeKeyed()
if e == nil && k == 'M' && string(v) == "r0" { // Dispatch on message type
	msg := &record{}
	dec.Unmarshal('Z', msg)
}
Note how the first netstring is used to determine which struct to Unmarshal into.
type Encoder ¶
type Encoder struct {
// contains filtered or unexported fields
}
Encoder provides Encode*() functions to encode basic go types as netstrings and write them to the io.Writer. Encoder also provides Marshal() which assembles a complete message from a simple struct as a series of netstrings. An Encoder *must* be constructed with NewEncoder() otherwise subsequent calls will panic.
The first parameter to every Encode*() function is a Key type called "key". It can be either a zero byte (netstring.NoKey, not the character '0'), which causes the Encoder to emit a regular netstring, or any isalpha() value, which causes the Encoder to emit a "keyed" netstring. Any "key" value outside those ranges is invalid and results in an error return. The "key" is tested using netstring.Key.Assess().
The "key" in "keyed" netstrings can be used to categorize the netstring in some meaningful way for the application. In this case the receiving application calls Decoder.DecodeKeyed() to return this "key" and the rest of the netstring as a value.
"Keyed" netstrings simply mean that the "key" byte is the first byte of the netstring with the value, if any, being the following bytes. It's nothing particularly fancy, but it does afford the application significantly more flexibility as described in the general package documentation.
Idiomatic use of Encoder is to supply a network socket to NewEncoder() so that encoded netstrings are automatically written to the network. Similarly, the receiver attaches its network socket to a Decoder and automatically receives decoded netstrings as they arrive.
Almost all error returns will be errors from the underlying io.Writer which tends to mean a Write() to a network socket failed.
func NewEncoder ¶
NewEncoder constructs a netstring encoder. An Encoder *must* be constructed with NewEncoder otherwise subsequent calls will panic.
Each call to an Encode*() function results in a netstring being written to the io.Writer, quite possibly with multiple Write() calls.
func (*Encoder) Encode ¶ added in v1.0.1
Encode is the type-generic function which encodes most simple go types. Encode() uses a go type switch on val.(type) to select the type-specific encoder to call. "key" must pass Key.Assess() otherwise an error is returned.
Be wary of encoding a rune (a single quoted unicode character) with Encode() as the go compiler arranges for a rune to be passed in as an int32 and will thus be encoded as a string representation of its integer value. Recipient applications need to be aware of this conversion if they want to reconstruct the original rune.
A better strategy is to pass unicode characters to Encode() as a string and single bytes should be cast as a byte, e.g. Encode(0, byte('Z')). When in doubt it's best to use type-specific functions such as EncodeByte() and EncodeString().
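The rune pitfall is a property of the go language rather than of this package, and can be demonstrated with a plain type switch. The function describe is a hypothetical stand-in that only reports which case an argument lands in.

```go
package main

import (
	"fmt"
	"strconv"
)

// describe reports how a value passed as any is seen by a type switch.
func describe(val any) string {
	switch v := val.(type) {
	case byte:
		return "byte:" + string([]byte{v})
	case int32: // rune is an alias for int32
		return "int32:" + strconv.FormatInt(int64(v), 10)
	case string:
		return "string:" + v
	default:
		return "unsupported"
	}
}

func main() {
	fmt.Println(describe('Z'))       // an untyped rune constant becomes int32
	fmt.Println(describe(byte('Z'))) // an explicit conversion keeps it a byte
	fmt.Println(describe("Z"))       // a string stays a string
}
```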
func (*Encoder) EncodeBool ¶ added in v1.0.1
EncodeBool encodes a boolean value as a netstring. If key == netstring.NoKey a standard netstring is encoded otherwise a "keyed" netstring is encoded. "key" must pass Key.Assess() otherwise an error is returned.
The accepted strconv shorthands 'T' and 'f' represent true and false respectively. The recommended conversion back to boolean is via strconv.ParseBool().
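On the receiving side, the conversion might look like the following sketch. The helper decodeBool is hypothetical; it simply hands the decoded netstring value to strconv.ParseBool, which accepts these single-letter forms.

```go
package main

import (
	"fmt"
	"strconv"
)

// decodeBool converts a decoded netstring value back to a bool.
func decodeBool(value []byte) (bool, error) {
	return strconv.ParseBool(string(value))
}

func main() {
	for _, v := range []string{"T", "f", "x"} {
		b, err := decodeBool([]byte(v))
		fmt.Println(v, b, err)
	}
}
```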
func (*Encoder) EncodeByte ¶ added in v1.0.1
EncodeByte encodes a single byte as a netstring. "key" must pass Key.Assess() otherwise an error is returned.
func (*Encoder) EncodeBytes ¶ added in v1.0.1
EncodeBytes encodes the variadic arguments as a series of bytes in a single netstring.
This function returns an error if key.Assess() returns an error. If key == netstring.NoKey then a standard netstring is encoded otherwise a "keyed" netstring is encoded.
EncodeBytes is the recommended function to create an end-of-message sentinel for "keyed" netstrings. If, e.g., a "key" of 'z' is the sentinel, then:
EncodeBytes('z')
generates the appropriate "keyed" netstring.
func (*Encoder) EncodeFloat32 ¶ added in v1.0.1
EncodeFloat32 encodes a float32 as a netstring using strconv.FormatFloat with the 'f' format. Recommended conversion back to float32 is via strconv.ParseFloat(). "key" must pass Key.Assess() otherwise an error is returned.
func (*Encoder) EncodeFloat64 ¶ added in v1.0.1
EncodeFloat64 encodes a float64 as a netstring using strconv.FormatFloat with the 'f' format. Recommended conversion back to float64 is via strconv.ParseFloat(). "key" must pass Key.Assess() otherwise an error is returned.
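A sketch of the float round trip follows. The documentation above names the 'f' format but not the precision, so the precision of -1 (shortest text that round-trips) used here is an assumption, and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatFloat32 renders f in 'f' format; the -1 precision is an assumption.
func formatFloat32(f float32) string {
	return strconv.FormatFloat(float64(f), 'f', -1, 32)
}

// formatFloat64 renders f in 'f' format; the -1 precision is an assumption.
func formatFloat64(f float64) string {
	return strconv.FormatFloat(f, 'f', -1, 64)
}

// parseFloat32 converts decoded text back to a float32.
func parseFloat32(s string) (float32, error) {
	v, err := strconv.ParseFloat(s, 32)
	return float32(v), err
}

func main() {
	text := formatFloat32(0.1)
	back, err := parseFloat32(text)
	fmt.Println(text, back == float32(0.1), err)
	fmt.Println(formatFloat64(1.5))
}
```

Passing the matching bit size (32 or 64) to both strconv.FormatFloat and strconv.ParseFloat is what keeps the round trip exact.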
func (*Encoder) EncodeInt ¶ added in v1.0.1
EncodeInt encodes an int as a netstring using strconv.FormatInt. Recommended conversion back to int is via strconv.ParseInt(). "key" must pass Key.Assess() otherwise an error is returned.
func (*Encoder) EncodeInt32 ¶ added in v1.0.1
EncodeInt32 encodes an int32 as a netstring using strconv.FormatInt. "key" must pass Key.Assess() otherwise an error is returned.
func (*Encoder) EncodeInt64 ¶ added in v1.0.1
EncodeInt64 encodes an int64 as a netstring using strconv.FormatInt. Recommended conversion back to int64 is via strconv.ParseInt(). "key" must pass Key.Assess() otherwise an error is returned.
func (*Encoder) EncodeString ¶ added in v1.0.1
EncodeString encodes a string as a netstring. If key == netstring.NoKey a standard netstring is encoded otherwise a "keyed" netstring is encoded. "key" must pass Key.Assess() otherwise an error is returned.
func (*Encoder) EncodeUint ¶ added in v1.0.1
EncodeUint encodes a uint as a netstring using strconv.FormatUint. Recommended conversion back to uint is via strconv.ParseUint(). "key" must pass Key.Assess() otherwise an error is returned.
func (*Encoder) EncodeUint32 ¶ added in v1.0.1
EncodeUint32 encodes a uint32 as a netstring using strconv.FormatUint. Recommended conversion back to uint32 is via strconv.ParseUint(). "key" must pass Key.Assess() otherwise an error is returned.
func (*Encoder) EncodeUint64 ¶ added in v1.0.1
EncodeUint64 encodes a uint64 as a netstring using strconv.FormatUint. Recommended conversion back to uint64 is via strconv.ParseUint(). "key" must pass Key.Assess() otherwise an error is returned.
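The reason to pair the unsigned encoders with strconv.ParseUint rather than strconv.ParseInt shows up at the top of the range. The helper names in this sketch are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// uintToText renders v as decimal text, as strconv.FormatUint does.
func uintToText(v uint64) string {
	return strconv.FormatUint(v, 10)
}

// textToUint64 converts decoded text back to a uint64.
func textToUint64(s string) (uint64, error) {
	return strconv.ParseUint(s, 10, 64)
}

// textToUint32 converts decoded text back to a uint32, rejecting overflow.
func textToUint32(s string) (uint32, error) {
	v, err := strconv.ParseUint(s, 10, 32)
	return uint32(v), err
}

func main() {
	text := uintToText(math.MaxUint64)
	back, err := textToUint64(text)
	fmt.Println(text, back == math.MaxUint64, err)

	// A signed parse cannot hold the largest unsigned values.
	_, err = strconv.ParseInt(text, 10, 64)
	fmt.Println(err != nil)
}
```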
func (*Encoder) Marshal ¶ added in v1.0.1
Marshal takes "message" as a simple struct or a pointer to a simple struct and encodes all exported fields with a "netstring" tag as a series of "keyed" netstrings. If there is no "netstring" tag the field is ignored. The reason the "netstring" tag is required is to supply a netstring key value which assists Unmarshal in locating the appropriate field on the receiving side. Marshal cannot be used to encode standard netstrings.
The "eom" parameter is used to create an end-of-message sentinel "keyed" netstring. It can be any valid Key excepting netstring.NoKey. The sentinel follows the simple struct netstrings with Encoder.EncodeBytes(eom).
There are significant constraints as to what constitutes a valid simple struct. In large part this is because netstrings are ill-suited to support complex messages - use encoding/json or protobufs for those. Candidate fields (i.e. exported with a "netstring" tag) can only be one of the following basic go types: all ints and uints, all floats, strings and byte slices. That's it! Put another way, fields cannot be complex types such as maps, arrays, structs, pointers, etc. Any unsupported field type which has a "netstring" tag returns an error.
The "netstring" tag value must be a valid netstring.Key and each "netstring" tag value must be unique otherwise an error is returned.
Though fields are encoded in the order found in the struct via the "reflect" package, this sequence should not be relied on. Always use the "keyed" values to associate netstrings to fields.
To assist go applications wishing to Unmarshal, it is good practice to use the first netstring as a message type so that the receiving side can select the corresponding struct to Unmarshal into. Having to know the type before seeing the payload is a fundamental issue for all go Unmarshal functions such as json.Unmarshal in that they have to know ahead of time what type of struct the message contains; thus the message type has to effectively precede the message. At least with netstrings that's easy to arrange.
Type and tag checking is performed while encoding so any error return probably leaves the output stream in an indeterminate state.
An example:
type record struct {
	Age         int    `netstring:"a"`
	Country     string `netstring:"c"`
	TLD         []byte `netstring:"t"`
	CountryCode []byte `netstring:"C"`
	Name        string `netstring:"n"`
	Height      uint16 // Ignored - no netstring tag
	dbKey       int64  // Ignored - not exported
}
...
var bbuf bytes.Buffer
enc := netstring.NewEncoder(&bbuf)
enc.EncodeString('M', "r0") // Message type 'r', version zero
r := record{22, "New Zealand", []byte{'n', 'z'}, []byte("64"), "Bob", 173, 42}
enc.Marshal('Z', &r)
fmt.Println(bbuf.String()) // "3:Mr0,3:a22,12:cNew Zealand,3:tnz,3:C64,4:nBob,1:Z,"
Particularly note the preceding message type "r0" and the trailing end-of-message sentinel 'Z'.
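For readers curious how tag-driven encoding of this kind can work, the following is a rough sketch built on the standard "reflect" package. The function marshalSketch and the msg struct are hypothetical; the sketch handles only a few field kinds, skips the duplicate-tag check, and is not this package's implementation of Marshal.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

type msg struct {
	Age     int    `netstring:"a"`
	Country string `netstring:"C"`
	Name    string `netstring:"n"`
	Height  uint16 // no netstring tag, so skipped
}

// marshalSketch encodes tagged, exported fields as "keyed" netstrings and
// appends an end-of-message sentinel.
func marshalSketch(eom byte, message any) (string, error) {
	v := reflect.ValueOf(message)
	if v.Kind() == reflect.Pointer {
		v = v.Elem()
	}
	if v.Kind() != reflect.Struct {
		return "", errors.New("only struct and *struct are accepted")
	}
	var out strings.Builder
	emit := func(key byte, val string) {
		out.WriteString(strconv.Itoa(len(val) + 1)) // +1 for the key byte
		out.WriteByte(':')
		out.WriteByte(key)
		out.WriteString(val)
		out.WriteByte(',')
	}
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("netstring")
		if !ok || !f.IsExported() {
			continue
		}
		if len(tag) != 1 {
			return "", errors.New("tag is not a single-byte key")
		}
		switch fv := v.Field(i); fv.Kind() {
		case reflect.Int, reflect.Int32, reflect.Int64:
			emit(tag[0], strconv.FormatInt(fv.Int(), 10))
		case reflect.String:
			emit(tag[0], fv.String())
		default:
			return "", errors.New("unsupported field type")
		}
	}
	emit(eom, "")
	return out.String(), nil
}

func main() {
	s, err := marshalSketch('z', &msg{Age: 21, Country: "Iceland", Name: "Bjorn", Height: 173})
	fmt.Println(s, err)
}
```

Like the real Marshal, the sketch builds output as it walks the fields, so an error part way through leaves a partial result behind.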
type Key ¶ added in v1.0.1
type Key byte
Key is the byte value provided to the Encoder Encode*() functions to determine whether the encoded netstring is a standard netstring or a "keyed" netstring. Valid values are: NoKey (or 0) for a standard netstring or an isalpha() value ('a'-'z' or 'A'-'Z') for a "keyed" netstring. All other values are invalid. Key is also the type returned by the Decoder functions. Use Key.Assess() to determine the validity and type of a key.
const NoKey Key = 0
NoKey is the special "key" provided to the Encoder.Encode*() functions to indicate that a standard netstring should be encoded.
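The validity rules for a Key described above can be sketched as a single function. The names noKey and assess are local, hypothetical stand-ins; the return shape of (keyed, err) is modeled on the description of Assess in this documentation rather than on its source.

```go
package main

import (
	"errors"
	"fmt"
)

const noKey byte = 0 // stands in for netstring.NoKey

// assess reports whether k is valid and whether it implies a "keyed" netstring.
func assess(k byte) (keyed bool, err error) {
	switch {
	case k == noKey:
		return false, nil // valid, but a standard netstring
	case k >= 'a' && k <= 'z', k >= 'A' && k <= 'Z':
		return true, nil // valid "keyed" netstring
	}
	return false, errors.New("key is not in range 'a'-'z' or 'A'-'Z'")
}

func main() {
	for _, k := range []byte{noKey, 'd', 'Z', '0', '-'} {
		keyed, err := assess(k)
		fmt.Printf("%q keyed=%v err=%v\n", k, keyed, err)
	}
}
```

The character '0' is rejected here, which is why a zero byte and the digit zero must not be confused when passing a key.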
func (Key) Assess ¶ added in v1.0.1
Assess determines whether the Key 'k' is valid or not and whether it implies a standard or "keyed" netstring. NoKey.Assess() returns keyed=false and err=nil which is to say that Assess treats NoKey as valid but it signifies a standard netstring.
"keyed" is set true if 'k' is in the range 'a'-'z' or 'A'-'Z', inclusive.