gogen-avro

MIT licensed. Version 6.5.0.

Generates type-safe Go code based on your Avro schemas, including serializers and deserializers that support Avro's schema evolution rules.

Installation

gogen-avro has two parts: a tool that you install on your system (usually on your GOPATH) to generate code, and a runtime library that the generated code imports.

To install the gogen-avro executable to $GOPATH/bin/ and generate structs, first download the repository:

go get -d github.com/actgardner/gogen-avro/cmd/gogen-avro

Then run:

go install github.com/actgardner/gogen-avro/cmd/gogen-avro

When you compile your schemas, we recommend pinning the gogen-avro tool to a specific SHA using a tool like retool. This ensures that your builds are repeatable.

For the library imports, you should manage the dependency on this repo using dep or a similar tool, like any other library.

Usage

To generate Go source files from one or more Avro schema files, run:

gogen-avro [--containers=false] [--sources-comment=false] [--short-unions=false] [--package=<package name>] <output directory> <avro schema files>

You can also use a go:generate directive in a source file (example):

//go:generate $GOPATH/bin/gogen-avro . primitives.avsc

Note: To generate a single Go package (a single folder) from multiple .avsc files, pass all of them to one gogen-avro invocation. gogen-avro produces a file, primitive.go, that will be overwritten if you run the tool multiple times with different .avsc files and the same output folder.
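
As an illustration of the note above, a directive that compiles two schemas in one invocation might look like the following sketch. The package name schemas and the file names user.avsc and event.avsc are hypothetical placeholders; the --package flag is taken from the usage line above.

```go
// Package schemas is a placeholder package that carries the directive.
// The package name and the .avsc file names below are hypothetical.
package schemas

// Both schemas are passed to a single gogen-avro run, so files shared
// between them are written once rather than overwritten by a second run.
//go:generate $GOPATH/bin/gogen-avro --package=schemas . user.avsc event.avsc
```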

Generated Methods

For each record in the provided schemas, gogen-avro will produce a struct, and the following methods:

New<RecordType>()

A constructor to create a new record struct, with no values set.

New<RecordType>Writer(writer io.Writer, codec container.Codec, recordsPerBlock int64) (*container.Writer, error)

Creates a new container.Writer that writes generated structs to writer in the Avro OCF format. This is the method you want if you're writing Avro to files. codec supports the Identity, Deflate and Snappy encodings, per the Avro spec.

New<RecordType>Reader(reader io.Reader) (<RecordTypeReader>, error)

Creates a new <RecordTypeReader> which reads data in the Avro OCF format into generated structs. This is the method you want if you're reading Avro data from files. It will handle the codec and schema evolution for you based on the OCF headers and the reader schema used to generate the structs.

<RecordType>.Serialize(io.Writer) error

Write the contents of the struct into the given io.Writer in the Avro binary format, with no Avro Object Container File (OCF) framing.

Deserialize<RecordType>(io.Reader) (<RecordType>, error)

Read Avro data from the given io.Reader and deserialize it into the generated struct. This method assumes that the schema used to write the data is identical to the schema used to generate the struct, and that there is no OCF framing. It is also slow, because it re-compiles the bytecode for your type on every call. If you need performance, call compiler.Compile once and then vm.Eval for each record.
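
To make the "Avro binary format, no OCF framing" idea concrete, here is a minimal sketch that uses only the standard library. DemoRecord and its functions are hand-written stand-ins, not gogen-avro output; they only mimic the shape of the Serialize and Deserialize<RecordType> methods described above for a hypothetical record with one long field and one string field. It relies on the fact that Go's signed varint encoding is the same zigzag varint scheme that Avro uses for long values.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// DemoRecord is a hand-written stand-in for a generated struct (hypothetical).
// It models an Avro record with a "long" field followed by a "string" field.
type DemoRecord struct {
	ID   int64
	Name string
}

// writeLong writes v as an Avro long: zigzag encoding, then a base-128 varint.
func writeLong(w io.Writer, v int64) error {
	var buf [binary.MaxVarintLen64]byte
	n := binary.PutVarint(buf[:], v)
	_, err := w.Write(buf[:n])
	return err
}

// Serialize writes the fields in schema order, with no OCF framing.
func (r DemoRecord) Serialize(w io.Writer) error {
	if err := writeLong(w, r.ID); err != nil {
		return err
	}
	// An Avro string is its byte length as a long, then the UTF-8 bytes.
	if err := writeLong(w, int64(len(r.Name))); err != nil {
		return err
	}
	_, err := io.WriteString(w, r.Name)
	return err
}

// DeserializeDemoRecord reads one record written by Serialize.
func DeserializeDemoRecord(r io.Reader) (DemoRecord, error) {
	br := bufio.NewReader(r) // binary.ReadVarint needs an io.ByteReader
	var rec DemoRecord
	id, err := binary.ReadVarint(br)
	if err != nil {
		return rec, err
	}
	n, err := binary.ReadVarint(br)
	if err != nil {
		return rec, err
	}
	if n < 0 {
		return rec, errors.New("negative string length")
	}
	name := make([]byte, n)
	if _, err := io.ReadFull(br, name); err != nil {
		return rec, err
	}
	rec.ID, rec.Name = id, string(name)
	return rec, nil
}

// roundTrip serializes in, keeps a copy of the bytes, and reads them back.
func roundTrip(in DemoRecord) ([]byte, DemoRecord, error) {
	var buf bytes.Buffer
	if err := in.Serialize(&buf); err != nil {
		return nil, DemoRecord{}, err
	}
	encoded := append([]byte(nil), buf.Bytes()...)
	out, err := DeserializeDemoRecord(&buf)
	return encoded, out, err
}

func main() {
	encoded, out, err := roundTrip(DemoRecord{ID: 1, Name: "a"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", encoded) // prints "02 02 61"
	fmt.Println(out.ID, out.Name) // prints "1 a"
}
```

The sketch performs no schema resolution: like the generated Deserialize method, it only works when the reader and writer agree on the field layout.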

Working with Object Container Files (OCF)

An example of how to write a container file can be found in example/container/example.go.

Godocs for the container package

Single-Object Encoding

An example of how to read and write Single-Object encoded messages (for use with Kafka, for instance) can be found in example/single_object/example.go.

Godocs for the soe package
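
For orientation, the Avro specification defines a single-object message as a two-byte marker (C3 01), the 8-byte little-endian CRC-64-AVRO fingerprint of the writer schema, and then the Avro binary body. The sketch below builds and parses that header with the standard library only. The names wrapSOE and unwrapSOE are hypothetical and are not the soe package's API; the fingerprint is passed in rather than computed.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// soeMarker is the two-byte marker that starts every single-object message.
var soeMarker = [2]byte{0xC3, 0x01}

// soeHeaderLen is the marker plus the 8-byte schema fingerprint.
const soeHeaderLen = 10

// wrapSOE prefixes an Avro binary body with a single-object header.
// fingerprint is assumed to be the CRC-64-AVRO fingerprint of the writer
// schema; computing it is out of scope for this sketch.
func wrapSOE(fingerprint uint64, body []byte) []byte {
	msg := make([]byte, soeHeaderLen, soeHeaderLen+len(body))
	copy(msg, soeMarker[:])
	binary.LittleEndian.PutUint64(msg[2:], fingerprint)
	return append(msg, body...)
}

// unwrapSOE validates the header and returns the fingerprint and the body.
func unwrapSOE(msg []byte) (uint64, []byte, error) {
	if len(msg) < soeHeaderLen {
		return 0, nil, errors.New("message shorter than single-object header")
	}
	if msg[0] != soeMarker[0] || msg[1] != soeMarker[1] {
		return 0, nil, errors.New("missing single-object marker")
	}
	fp := binary.LittleEndian.Uint64(msg[2:soeHeaderLen])
	return fp, msg[soeHeaderLen:], nil
}

func main() {
	msg := wrapSOE(0x0102030405060708, []byte{0x02})
	fmt.Printf("% x\n", msg) // prints "c3 01 08 07 06 05 04 03 02 01 02"

	fp, body, err := unwrapSOE(msg)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%#x %d\n", fp, len(body)) // prints "0x102030405060708 1"
}
```

A consumer (for example, one reading from Kafka) can use the fingerprint to look up the writer schema before decoding the body.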

Example

The example directory contains simple example projects with an Avro schema. Once you've installed gogen-avro on your GOPATH, you can install the example projects:

# Build the Go source files from the Avro schema using the generate directive
go generate github.com/actgardner/gogen-avro/example

# Install the example projects on the GOPATH
go install github.com/actgardner/gogen-avro/example/record
go install github.com/actgardner/gogen-avro/example/container

Naming

Gogen-avro converts field and type names to be valid, public Go names by following a few simple steps:

  • removing leading underscore characters (_)
  • upper-casing the first letter of the name

This minimizes the risk that two fields with different Avro names will have the same Go name.
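
The two steps above can be sketched as follows. goName is a hypothetical helper written for illustration, not gogen-avro's own function, and it implements only the two listed steps.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// goName applies the two naming steps to an Avro field or type name.
// It is a hypothetical illustration, not the generator's implementation.
func goName(avroName string) string {
	// Step 1: remove leading underscore characters.
	trimmed := strings.TrimLeft(avroName, "_")
	if trimmed == "" {
		return ""
	}
	// Step 2: upper-case the first letter so the name is exported.
	first, size := utf8.DecodeRuneInString(trimmed)
	return string(unicode.ToUpper(first)) + trimmed[size:]
}

func main() {
	fmt.Println(goName("_user_id")) // prints "User_id"
	fmt.Println(goName("name"))     // prints "Name"
}
```

Under these rules, names that differ only in leading underscores or in the case of the first letter (for example _foo and foo) still map to the same Go name.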

Gogen-avro respects namespaces and aliases when resolving type names. However, generated files will all be placed directly into the package specified by the user. This may cause issues in rare cases where two types have different namespaces but the same name.

Type Conversion

Gogen-avro produces a Go struct which reflects the structure of your Avro schema. Most Go types map neatly onto Avro types:

Avro Type | Go Type | Notes
--- | --- | ---
null | interface{} | This is just a placeholder, nothing is encoded/decoded
boolean | bool |
int, long | int32, int64 |
float, double | float32, float64 |
bytes | []byte |
string | string |
enum | custom type | Generates a type with a constant for each symbol
array<type> | []<type> |
map<type> | custom struct | Generates a struct with a field M, M has the type map[string]<type>
fixed | [<n>]byte | Fixed fields are given a custom type, which is an alias for an appropriately sized byte array
union | custom struct | Unions are handled as a struct with one field per possible type, and an enum field to dictate which field to read
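
As a rough illustration of the enum, map and fixed rows, the sketch below hand-writes types with the shapes the table describes. All names (Color, MapString, MD5, Sample) and the enum's underlying int type are assumptions made for this example; the code gogen-avro actually generates may differ in naming and detail.

```go
package main

import "fmt"

// Color sketches an enum with symbols RED and GREEN: a custom type with one
// constant per symbol. The names and the underlying type are assumptions.
type Color int

const (
	ColorRED   Color = 0
	ColorGREEN Color = 1
)

// MapString sketches map<string>: a struct whose field M holds the Go map.
type MapString struct {
	M map[string]string
}

// MD5 sketches a fixed type of size 16: a custom type over a byte array.
type MD5 [16]byte

// Sample sketches a record that uses the three types above.
type Sample struct {
	Favorite Color
	Labels   MapString
	Checksum MD5
}

func main() {
	s := Sample{
		Favorite: ColorGREEN,
		Labels:   MapString{M: map[string]string{"env": "prod"}},
	}
	s.Checksum[0] = 0xFF
	fmt.Println(s.Favorite, s.Labels.M["env"], len(s.Checksum)) // prints "1 prod 16"
}
```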

Unions are more complicated than primitive types. We generate a struct and an enum whose names are uniquely determined by the types in the union. For a field whose type is ["null", "int"], we generate the following:

type UnionNullInt struct {
	// All the possible types the union could take on
	Null               interface{}
	Int                int32
	// Which field actually has data in it - defaults to the first type in the list, "null"
	UnionType          UnionNullIntTypeEnum
}

type UnionNullIntTypeEnum int

const (
	UnionNullIntTypeEnumNull            UnionNullIntTypeEnum = 0
	UnionNullIntTypeEnumInt             UnionNullIntTypeEnum = 1
)
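
A short usage sketch for this union follows. The type declarations mirror the generated code shown above so that the example compiles on its own; describe is a hypothetical helper that is not part of the generated code.

```go
package main

import "fmt"

// The declarations below mirror the generated union for ["null", "int"].
type UnionNullIntTypeEnum int

const (
	UnionNullIntTypeEnumNull UnionNullIntTypeEnum = 0
	UnionNullIntTypeEnumInt  UnionNullIntTypeEnum = 1
)

type UnionNullInt struct {
	Null      interface{}
	Int       int32
	UnionType UnionNullIntTypeEnum
}

// describe is a hypothetical helper that reads whichever branch is active.
func describe(u UnionNullInt) string {
	switch u.UnionType {
	case UnionNullIntTypeEnumNull:
		return "null"
	case UnionNullIntTypeEnumInt:
		return fmt.Sprintf("int %d", u.Int)
	default:
		return "unknown"
	}
}

func main() {
	var u UnionNullInt       // the zero value selects the first branch, "null"
	fmt.Println(describe(u)) // prints "null"

	// To store an int, set both the value and the discriminator.
	u.Int = 42
	u.UnionType = UnionNullIntTypeEnumInt
	fmt.Println(describe(u)) // prints "int 42"
}
```

Setting Int without updating UnionType leaves the union on its "null" branch, so both fields need to be written together.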

Versioning

Until version 6.0 this project used gopkg.in for versioning of both the code generation tool and library. Older versions are still available on gopkg.in.

Releases from 6.0 onward use semver tags (e.g. v6.0.0), which are compatible with dep and modules. See Releases.

Reporting Issues

When reporting issues, please include your reader and writer schemas and the output of the compiler logs. To enable the logs, add this to one of your source files:

import (
	"github.com/actgardner/gogen-avro/compiler"
)

func init() {
	compiler.LoggingEnabled = true
}

The logs will be printed on stdout.

Thanks

Thanks to LinkedIn's goavro library, for providing the encoders for primitives.

Directories

Path Synopsis
cmd/gogen-avro
compiler Compiler has methods to generate GADGT VM bytecode from Avro schemas
container Container provides a Reader and Writer which serialize and deserialize gogen-avro structs to the Avro Object Container File (OCF) format.
container/avro Code generated by github.com/actgardner/gogen-avro.
example
example/avro Code generated by github.com/actgardner/gogen-avro.
example/container This example shows serializing and deserializing records in an object container file
example/record This example shows serializing and deserializing records as byte buffers without OCF framing
example/single_object This example shows serializing and deserializing records in single-object encoding
generator Utility methods for managing and writing generated code
generator/flat
generator/flat/templates
parser
resolver
schema gogen-avro's internal representation of Avro schemas
schema/canonical
soe Package soe provides convenience methods to read and write Avro Single-Object Encoding headers
test/alias-field Code generated by github.com/actgardner/gogen-avro.
test/alias-field/evolution Code generated by github.com/actgardner/gogen-avro.
test/alias-fixed Code generated by github.com/actgardner/gogen-avro.
test/alias-record Code generated by github.com/actgardner/gogen-avro.
test/alias-record/nested Code generated by github.com/actgardner/gogen-avro.
test/arrays Code generated by github.com/actgardner/gogen-avro.
test/arrays-union Code generated by github.com/actgardner/gogen-avro.
test/avro-java-string Code generated by github.com/actgardner/gogen-avro.
test/complex-arrays Code generated by github.com/actgardner/gogen-avro.
test/complex-arrays-multifile Code generated by github.com/actgardner/gogen-avro.
test/complex-union Code generated by github.com/actgardner/gogen-avro.
test/default-fixed Code generated by github.com/actgardner/gogen-avro.
test/default-union Code generated by github.com/actgardner/gogen-avro.
test/default-union/evolution Code generated by github.com/actgardner/gogen-avro.
test/enum Code generated by github.com/actgardner/gogen-avro.
test/evolve-union Code generated by github.com/actgardner/gogen-avro.
test/evolve-union/evolution Code generated by github.com/actgardner/gogen-avro.
test/fixed Code generated by github.com/actgardner/gogen-avro.
test/go-struct-tags Code generated by github.com/actgardner/gogen-avro.
test/maps Code generated by github.com/actgardner/gogen-avro.
test/maps-union Code generated by github.com/actgardner/gogen-avro.
test/namespace-full Code generated by github.com/actgardner/gogen-avro.
test/namespace-none Code generated by github.com/actgardner/gogen-avro.
test/namespace-short Code generated by github.com/actgardner/gogen-avro.
test/nested Code generated by github.com/actgardner/gogen-avro.
test/nested-maps Code generated by github.com/actgardner/gogen-avro.
test/primitive Code generated by github.com/actgardner/gogen-avro.
test/primitive/evolution Code generated by github.com/actgardner/gogen-avro.
test/recursive Code generated by github.com/actgardner/gogen-avro.
test/recursive-short Code generated by github.com/actgardner/gogen-avro.
test/string Code generated by github.com/actgardner/gogen-avro.
test/union Code generated by github.com/actgardner/gogen-avro.
test/union-root Code generated by github.com/actgardner/gogen-avro.
test/union-root-short Code generated by github.com/actgardner/gogen-avro.
test/union-short Code generated by github.com/actgardner/gogen-avro.
test/union/evolution Code generated by github.com/actgardner/gogen-avro.
vm The GADGT VM implementation and instruction set
vm/types Wrappers for Avro primitive types implementing the methods required by GADGT