pbr

v1.0.0
Published: Mar 21, 2025 License: Apache-2.0 Imports: 4 Imported by: 2

README

Package pbr is a low-level reader for protocol buffer encoded data in Go.
Its main feature is support for lazy/conditional decoding of fields.

This package can improve decoding performance in two ways:

  1. fields can be decoded conditionally, skipping those that are not needed for a specific use case

  2. values can be decoded directly into specific types, or transformed while decoding, so the intermediate state a generic decoder would build can be skipped.

Note: using gogoprotobuf is still faster.
Note: Writing code with this package is like writing an auto-generated protobuf decoder by hand and is very time-consuming. It should be used only in specific cases and for stable protobuf definitions.

Usage

First, the encoded protobuf data is used to initialize a new Message. Then the fields are searched by reading or skipping them.

msg := pbr.New(encodedData)
for msg.Next() {
    switch msg.FieldNumber() {
    case 1: // an int64 type
        v, err := msg.Int64()
        if err != nil {
            // handle
        }
    case 3: // repeated number types can be returned as a slice
        ids, err := msg.RepeatedInt64(nil)
        if err != nil {
            // handle
        }
    case 2: // for more control repeated+packed fields can be read using an iterator
        iter, err := msg.Iterator(nil)
        if err != nil {
            // handle
        }

        userIDs := make([]UserID, 0, iter.Count(pbr.WireTypeVarint))
        for iter.HasNext() {
            v, err := iter.Int64()
            if err != nil {
                // handle
            }

            userIDs = append(userIDs, UserID(v))
        }
    default:
        msg.Skip() // required if value not needed.
    }
}

if msg.Error() != nil {
    // handle
}

After calling Next() you must call an accessor function (Int64(), RepeatedInt64(), Iterator(), etc.) or Skip() to ignore the field. All these functions, including Next() and Skip(), must not be called twice in a row.

Value Accessor Functions

There is an accessor for each of the protobuf scalar value types.

There is a corresponding set of functions for repeated fields, e.g. RepeatedInt64(buf []int64) ([]int64, error). Repeated fields may or may not be packed, so a predefined buffer variable should be passed when called. E.g.:

var ids []int64
msg := pbr.New(encodedData)
for msg.Next() {
    switch msg.FieldNumber() {
    case 1: // repeated int64 field
        var err error
        ids, err = msg.RepeatedInt64(ids)
        if err != nil {
            // handle
        }
    default:
        msg.Skip()
    }
}

if msg.Error() != nil {
    // handle
}

If the ids are 'packed', RepeatedInt64() will be called once. If the ids are simply repeated, RepeatedInt64() will be called N times, once per value, but the resulting slice of ids will be the same.
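To make the packed versus unpacked distinction concrete, here is a minimal standard-library sketch of the two encodings and a decoder that accepts both. It does not use pbr; decodeRepeatedInt64 is a hypothetical helper written for this illustration, and it only handles varint and length-delimited fields.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeRepeatedInt64 appends every int64 value of field `want` to buf,
// accepting both packed (wire type 2) and unpacked (wire type 0) encodings.
// Hypothetical helper: it only understands these two wire types.
func decodeRepeatedInt64(data []byte, want int, buf []int64) ([]int64, error) {
	for len(data) > 0 {
		key, n := binary.Uvarint(data)
		if n <= 0 {
			return buf, fmt.Errorf("bad key")
		}
		data = data[n:]
		field, wireType := int(key>>3), int(key&7)

		switch wireType {
		case 0: // one unpacked value per key
			v, n := binary.Uvarint(data)
			if n <= 0 {
				return buf, fmt.Errorf("bad varint")
			}
			data = data[n:]
			if field == want {
				buf = append(buf, int64(v))
			}
		case 2: // packed block: length prefix, then back-to-back varints
			l, n := binary.Uvarint(data)
			if n <= 0 || uint64(len(data)-n) < l {
				return buf, fmt.Errorf("bad length")
			}
			payload := data[n : n+int(l)]
			data = data[n+int(l):]
			for field == want && len(payload) > 0 {
				v, n := binary.Uvarint(payload)
				if n <= 0 {
					return buf, fmt.Errorf("bad packed varint")
				}
				payload = payload[n:]
				buf = append(buf, int64(v))
			}
		default:
			return buf, fmt.Errorf("unsupported wire type %d", wireType)
		}
	}
	return buf, nil
}

func main() {
	packed := []byte{0x0a, 0x03, 0x01, 0x02, 0x03}         // field 1, one packed block
	unpacked := []byte{0x08, 0x01, 0x08, 0x02, 0x08, 0x03} // field 1, three separate entries

	a, _ := decodeRepeatedInt64(packed, 1, nil)
	b, _ := decodeRepeatedInt64(unpacked, 1, nil)
	fmt.Println(a, b) // [1 2 3] [1 2 3]
}
```

The packed form carries one key and one length for all values, which is why a single call suffices, while the unpacked form repeats the key before every value.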

For more control over the values in a packed, repeated field use an Iterator.
See above for an example.

Decoding Embedded Messages

Embedded messages can be handled recursively, or the raw data can be returned and decoded using a standard/auto-generated proto.Unmarshal function.

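msg := pbr.New(encodedData)
for msg.Next() {
    fn := msg.FieldNumber()
    switch {
    case fn == 1 && needFieldNumber1:
        // use pbr recursively; passing nil allocates a new Message
        embeddedMsg, err := msg.Message(nil)
        if err != nil {
            // handle
        }
        for embeddedMsg.Next() {
            switch embeddedMsg.FieldNumber() {
            case 1:
                // read the value with an accessor, or call embeddedMsg.Skip()
            default:
                embeddedMsg.Skip()
            }
        }
        if embeddedMsg.Error() != nil {
            // handle
        }
    case fn == 2 && needFieldNumber2:
        // if you need the whole message, decode it in the standard way.
        data, err := msg.MessageData()
        if err != nil {
            // handle
        }
        v := &ProtoBufThing{}
        if err := proto.Unmarshal(data, v); err != nil {
            // handle
        }
    default:
        msg.Skip() // every field that is not read must be skipped
    }
}

if msg.Error() != nil {
    // handle
}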

Larger Example

Start with a customer message with embedded orders and items. The goal is to count only the number of items in open orders.

message Customer {
  required int64 id = 1;
  optional string username = 2;
  repeated Order orders = 3;
  repeated int64 favorite_ids = 4 [packed=true];
}

message Order {
  required int64 id = 1;
  required bool open = 2;
  repeated Item items = 3;
}

message Item {
  // a big object
}

Sample Code:

var openCount, itemCount, favoritesCount int
customer := pbr.New(data)
for customer.Next() {
    switch customer.FieldNumber() {
    case 1: // id
        id, err := customer.Int64()
        if err != nil {
            panic(err)
        }
        _ = id // do something or skip this case if not needed
    case 2: // username
        username, err := customer.String()
        if err != nil {
            panic(err)
        }
        _ = username // do something or skip this case if not needed
    case 3: // orders
        var open bool
        var count int
        orderData, _ := customer.MessageData()
        order := pbr.New(orderData)
        for order.Next() {
            switch order.FieldNumber() {
            case 2: // open
                v, _ := order.Bool()
                open = v
            case 3: // item
                count++

                // we're not reading the data but we still need to skip it.
                order.Skip()
            default:
                // required to move past unneeded fields
                order.Skip()
            }
        }

        if open {
            openCount++
            itemCount += count
        }
    case 4: // favorite ids
        iter, err := customer.Iterator(nil)
        if err != nil {
            panic(err)
        }

        // typically this section would only be run once but it is valid
        // protobuf to contain multiple sections of repeated fields that should be concatenated together
        favoritesCount += iter.Count(pbr.WireTypeVarint)
    default:
        // unread fields must be skipped
        customer.Skip()
    }
}

fmt.Printf("Open Orders: %d\n", openCount)
fmt.Printf("Items:       %d\n", itemCount)
fmt.Printf("Favorites:   %d\n", favoritesCount)

// Output:
// Open Orders: 2
// Items:       4
// Favorites:   8

Wire Type Start Group and End Group

Groups are an old protobuf wire type that has been deprecated for a long time. They work like parentheses, but do not contain any information about the length of the data. Therefore, their contents cannot be skipped efficiently. Only the start and end group indicators can be read and skipped like any other field. Doing so reads the data as if the parentheses were not there, so the group's fields appear inline with the fields around them. To get the raw protobuf data within a group, try the following:

var groupFieldNum = 123
var groupData []byte
msg := New(data)
for msg.Next() {
    if msg.FieldNumber() == groupFieldNum && msg.WireType() == WireTypeStartGroup {
        start, end := msg.Index, msg.Index
        for msg.Next() {
            msg.Skip()
            if msg.FieldNumber() == groupFieldNum && msg.WireType() == WireTypeEndGroup {
                break
            }
            end = msg.Index
        }
        // groupData would be the raw protobuf encoded bytes of the fields in the group.
        groupData = msg.Data[start:end]
    }
}

Documentation

Constants

const (
	// WireType describes the encoding method for the next value in the stream.
	WireTypeVarint          = 0
	WireType64bit           = 1
	WireTypeLengthDelimited = 2
	WireTypeStartGroup      = 3 // deprecated by protobuf, not supported
	WireTypeEndGroup        = 4 // deprecated by protobuf, not supported
	WireType32bit           = 5
)
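As a rough illustration of where these values come from: every field on the wire starts with a varint key whose low three bits are the wire type and whose remaining bits are the field number. The sketch below uses only the standard library; splitTag is a hypothetical helper, not part of this package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// splitTag decodes the varint key at the start of buf and returns the field
// number, the wire type, and the number of bytes the key occupied.
// n <= 0 means the key could not be decoded.
func splitTag(buf []byte) (int, int, int) {
	key, n := binary.Uvarint(buf)
	if n <= 0 {
		return 0, 0, n
	}
	return int(key >> 3), int(key & 0x7), n
}

func main() {
	// 0x08 = field 1, varint; 0x1a = field 3, length-delimited;
	// 0x25 = field 4, 32-bit.
	for _, b := range [][]byte{{0x08}, {0x1a}, {0x25}} {
		fn, wt, _ := splitTag(b)
		fmt.Printf("key %#x: field %d, wire type %d\n", b[0], fn, wt)
	}
}
```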

Variables

var (
	// ErrIntOverflow is returned when scanning a varint-encoded integer,
	// the value is found to be too long for the integer type.
	ErrIntOverflow = errors.New("protoscan: integer overflow")
	// ErrInvalidLength is returned when the length is not valid,
	// usually as a result of an invalid type scan.
	ErrInvalidLength = errors.New("protoscan: invalid length")
)
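The following standard-library sketch shows how two such conditions can arise. The error variables and helpers are hypothetical local stand-ins, and the mapping onto them is an assumption; the package's own checks may differ. The second case reads a varint value as if it were a length prefix, which is one form of the "invalid type scan" mentioned above.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// Hypothetical local stand-ins for the package-level errors.
var (
	errIntOverflow   = errors.New("integer overflow")
	errInvalidLength = errors.New("invalid length")
)

// readVarint decodes one varint and reports overflow when the encoding
// carries more than 64 bits of payload.
func readVarint(buf []byte) (uint64, int, error) {
	v, n := binary.Uvarint(buf)
	switch {
	case n < 0:
		return 0, 0, errIntOverflow
	case n == 0: // buffer ended before the last byte of the varint
		return 0, 0, io.ErrUnexpectedEOF
	}
	return v, n, nil
}

// readBytes treats buf as a length-delimited value: a varint length followed
// by that many bytes. A length larger than the remaining data is invalid.
func readBytes(buf []byte) ([]byte, error) {
	l, n, err := readVarint(buf)
	if err != nil {
		return nil, err
	}
	if l > uint64(len(buf)-n) {
		return nil, errInvalidLength
	}
	return buf[n : n+int(l)], nil
}

func main() {
	// Ten bytes whose final byte pushes the value past 64 bits.
	tooLong := []byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x02}
	_, _, err := readVarint(tooLong)
	fmt.Println(err) // integer overflow

	// The varint 150 (0x96 0x01) misread as a length prefix.
	_, err = readBytes([]byte{0x96, 0x01})
	fmt.Println(err) // invalid length
}
```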

Types

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator allows for moving across a packed repeated field in a 'controlled' fashion.

func (*Iterator) Bool

func (b *Iterator) Bool() (bool, error)

Bool is encoded as 0x01 or 0x00 plus the field+type prefix byte. 2 bytes total.

func (*Iterator) Count

func (i *Iterator) Count(wireType int) (count int)

Count returns the total number of values in the given repeated field. The answer depends on the type/encoding of the field: double, float, fixed, sfixed are WireType32bit or WireType64bit, all other types (int, uint, sint) are WireTypeVarint. Any other value will cause the function to panic.
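One way such a count can be computed without decoding any value is sketched below, using only the standard library. countPacked and the lowercase constants are hypothetical names for this illustration; the package's actual implementation is not shown in this document.

```go
package main

import "fmt"

// Local copies of the wire type values listed under Constants.
const (
	wireTypeVarint = 0
	wireType64bit  = 1
	wireType32bit  = 5
)

// countPacked returns the number of values in a packed payload, that is,
// the bytes that follow the length prefix of a packed repeated field.
func countPacked(payload []byte, wireType int) int {
	switch wireType {
	case wireType32bit:
		return len(payload) / 4
	case wireType64bit:
		return len(payload) / 8
	case wireTypeVarint:
		// Every varint ends with exactly one byte whose high bit is clear,
		// so counting those bytes counts the values.
		count := 0
		for _, b := range payload {
			if b < 0x80 {
				count++
			}
		}
		return count
	}
	panic("countPacked: unsupported wire type")
}

func main() {
	varints := []byte{0x01, 0xac, 0x02, 0x7f}                 // 1, 300, 127
	fmt.Println(countPacked(varints, wireTypeVarint))         // 3
	fmt.Println(countPacked(make([]byte, 16), wireType64bit)) // 2
}
```

This is why the wire type has to be supplied by the caller: the payload alone does not say whether its bytes are varints or fixed-width values.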

func (*Iterator) Double

func (b *Iterator) Double() (float64, error)

Double values are encoded as a fixed length of 8 bytes in their IEEE-754 format.

func (*Iterator) FieldNumber

func (i *Iterator) FieldNumber() int

FieldNumber returns the number for the current repeated field. These numbers are defined in the protobuf definition file used to encode the message.

func (*Iterator) Fixed32

func (b *Iterator) Fixed32() (uint32, error)

Fixed32 reads a fixed 4 byte value as a uint32. This proto type is more efficient than uint32 if values are often greater than 2^28.

func (*Iterator) Fixed64

func (b *Iterator) Fixed64() (uint64, error)

Fixed64 reads a fixed 8 byte value as a uint64. This proto type is more efficient than uint64 if values are often greater than 2^56.
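The 2^28 and 2^56 thresholds quoted for Fixed32 and Fixed64 follow from varints carrying 7 bits of payload per byte. The short standard-library check below measures the encoded sizes around both boundaries; varintLen is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// varintLen returns the number of bytes v occupies when encoded as a varint.
func varintLen(v uint64) int {
	var buf [binary.MaxVarintLen64]byte
	return binary.PutUvarint(buf[:], v)
}

func main() {
	fmt.Println(varintLen(1<<28 - 1)) // 4: same size as fixed32
	fmt.Println(varintLen(1 << 28))   // 5: one byte more than fixed32
	fmt.Println(varintLen(1<<56 - 1)) // 8: same size as fixed64
	fmt.Println(varintLen(1 << 56))   // 9: one byte more than fixed64
}
```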

func (*Iterator) Float

func (b *Iterator) Float() (float32, error)

Float values are encoded as a fixed length of 4 bytes in their IEEE-754 format.

func (*Iterator) HasNext

func (i *Iterator) HasNext() bool

HasNext is used in a 'for' loop to read through all the elements. Returns false when all the items have been read. This method does NOT need to be called; reading a value automatically moves the index forward. This behavior is different from Message.Next().

func (*Iterator) Int32

func (b *Iterator) Int32() (int32, error)

Int32 reads a variable-length encoding of up to 4 bytes. This field type is best used if the field only has positive numbers, otherwise use sint32. Note, this field can also be read as an Int64.

func (*Iterator) Int64

func (b *Iterator) Int64() (int64, error)

Int64 reads a variable-length encoding of up to 8 bytes. This field type is best used if the field only has positive numbers, otherwise use sint64.

func (*Iterator) Sfixed32

func (b *Iterator) Sfixed32() (int32, error)

Sfixed32 reads a fixed 4 byte signed value.

func (*Iterator) Sfixed64

func (b *Iterator) Sfixed64() (int64, error)

Sfixed64 reads a fixed 8 byte signed value.

func (*Iterator) Sint32

func (b *Iterator) Sint32() (int32, error)

Sint32 uses variable-length encoding with zig-zag encoding for signed values. This field type more efficiently encodes negative numbers than regular int32s.

func (*Iterator) Sint64

func (b *Iterator) Sint64() (int64, error)

Sint64 uses variable-length encoding with zig-zag encoding for signed values. This field type more efficiently encodes negative numbers than regular int64s.
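Zig-zag encoding interleaves negative and positive values so that small magnitudes of either sign become small unsigned numbers, and therefore short varints. A minimal sketch of the standard mapping follows; the function names are hypothetical.

```go
package main

import "fmt"

// zigzagEncode maps signed values to unsigned ones so that
// 0, -1, 1, -2, 2 become 0, 1, 2, 3, 4.
func zigzagEncode(v int64) uint64 {
	return uint64(v<<1) ^ uint64(v>>63)
}

// zigzagDecode reverses zigzagEncode.
func zigzagDecode(u uint64) int64 {
	return int64(u>>1) ^ -int64(u&1)
}

func main() {
	for _, v := range []int64{0, -1, 1, -2, 2} {
		u := zigzagEncode(v)
		fmt.Println(v, u, zigzagDecode(u))
	}
}
```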

func (*Iterator) Skip

func (i *Iterator) Skip(wireType int, count int)

Skip will move the iterator forward 'count' values without actually reading them. For a new iterator, skipping 'count' values means the next value read is the one at zero-based index 'count'. The correct wireType must be specified: double, float, fixed, sfixed are WireType32bit or WireType64bit, all other types (int, uint, sint) are WireTypeVarint. Any other value will cause the function to panic.

func (*Iterator) Uint32

func (b *Iterator) Uint32() (v uint32, err error)

Uint32 reads a variable-length encoding of up to 4 bytes.

func (*Iterator) Uint64

func (b *Iterator) Uint64() (v uint64, err error)

Uint64 reads a variable-length encoding of up to 8 bytes.

func (*Iterator) Varint32

func (b *Iterator) Varint32() (v uint32, err error)

Varint32 reads up to 32-bits of variable-length encoded data. Note that negative int32 values could still be encoded as 64-bit varints due to their leading 1s.
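The note about negative int32 values can be checked with the standard library alone. In the sketch below, encodeInt32 and decodeInt32 are hypothetical helpers that mimic how protobuf writes an int32 field: the value is sign-extended to 64 bits before it is written as a varint.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeInt32 sign-extends v to 64 bits and writes it as a varint, which is
// how protobuf encodes int32 fields.
func encodeInt32(v int32) []byte {
	var buf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(buf[:], uint64(int64(v)))
	return buf[:n]
}

// decodeInt32 reads the full 64-bit varint and truncates it back to int32.
func decodeInt32(buf []byte) (int32, int) {
	u, n := binary.Uvarint(buf)
	return int32(u), n
}

func main() {
	for _, v := range []int32{1, -1} {
		enc := encodeInt32(v)
		dec, _ := decodeInt32(enc)
		fmt.Printf("%d -> %d byte(s) -> %d\n", v, len(enc), dec)
	}
}
```

A reader that stops after 32 bits of payload would therefore fail on a negative int32 written this way, which is the case the note warns about.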

func (*Iterator) Varint64

func (b *Iterator) Varint64() (v uint64, err error)

Varint64 reads up to 64-bits of variable-length encoded data.

type Message

type Message struct {
	// contains filtered or unexported fields
}

Message is a container for a protobuf message type ready to be scanned.

func New

func New(data []byte) *Message

New creates a new Message scanner for the given encoded protobuf data.

func (*Message) Bool

func (b *Message) Bool() (bool, error)

Bool is encoded as 0x01 or 0x00 plus the field+type prefix byte. 2 bytes total.
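The two-byte total holds when the key fits in a single byte, which is the case for field numbers 1 through 15; larger field numbers need a longer key. The sketch below, using only the standard library, shows both cases. appendBoolField is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// appendBoolField appends a bool field as it appears inside a message:
// the key (field number with wire type 0) followed by a single 0x00 or 0x01.
func appendBoolField(dst []byte, fieldNumber int, v bool) []byte {
	var key [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(key[:], uint64(fieldNumber)<<3) // wire type 0 = varint
	dst = append(dst, key[:n]...)
	if v {
		return append(dst, 0x01)
	}
	return append(dst, 0x00)
}

func main() {
	fmt.Printf("% x\n", appendBoolField(nil, 2, true))  // 10 01
	fmt.Printf("% x\n", appendBoolField(nil, 16, true)) // 80 01 01
}
```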

func (*Message) Bytes

func (m *Message) Bytes() ([]byte, error)

Bytes returns the encoded sequence of bytes.

func (*Message) Double

func (b *Message) Double() (float64, error)

Double values are encoded as a fixed length of 8 bytes in their IEEE-754 format.

func (*Message) Error

func (m *Message) Error() error

Error will return any errors that were encountered during scanning. Errors could be due to reading the incorrect types or forgetting to skip an unused value.

func (*Message) FieldNumber

func (m *Message) FieldNumber() int

FieldNumber returns the number for the current value being scanned. These numbers are defined in the protobuf definition file used to encode the message.

func (*Message) Fixed32

func (b *Message) Fixed32() (uint32, error)

Fixed32 reads a fixed 4 byte value as a uint32. This proto type is more efficient than uint32 if values are often greater than 2^28.

func (*Message) Fixed64

func (b *Message) Fixed64() (uint64, error)

Fixed64 reads a fixed 8 byte value as a uint64. This proto type is more efficient than uint64 if values are often greater than 2^56.

func (*Message) Float

func (b *Message) Float() (float32, error)

Float values are encoded as a fixed length of 4 bytes in their IEEE-754 format.

func (*Message) Int32

func (b *Message) Int32() (int32, error)

Int32 reads a variable-length encoding of up to 4 bytes. This field type is best used if the field only has positive numbers, otherwise use sint32. Note, this field can also be read as an Int64.

func (*Message) Int64

func (b *Message) Int64() (int64, error)

Int64 reads a variable-length encoding of up to 8 bytes. This field type is best used if the field only has positive numbers, otherwise use sint64.

func (*Message) Iterator

func (m *Message) Iterator(iter *Iterator) (*Iterator, error)

Iterator will use the current field. Field must be a packed repeated field.

func (*Message) Message

func (m *Message) Message(msg *Message) (*Message, error)

Message will return a pointer to an embedded message that can then be scanned in kind of a recursive fashion. Will reuse the provided Message object if provided.

func (*Message) MessageData

func (m *Message) MessageData() ([]byte, error)

MessageData returns the encoded data of an embedded message. This data can then be decoded using conventional tools.

func (*Message) Next

func (m *Message) Next() bool

Next will move the scanner to the next value. Should be used in a for loop.

func (*Message) RepeatedBool

func (m *Message) RepeatedBool(buf []bool) ([]bool, error)

RepeatedBool will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedDouble

func (m *Message) RepeatedDouble(buf []float64) ([]float64, error)

RepeatedDouble will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedFixed32

func (m *Message) RepeatedFixed32(buf []uint32) ([]uint32, error)

RepeatedFixed32 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedFixed64

func (m *Message) RepeatedFixed64(buf []uint64) ([]uint64, error)

RepeatedFixed64 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedFloat

func (m *Message) RepeatedFloat(buf []float32) ([]float32, error)

RepeatedFloat will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedInt32

func (m *Message) RepeatedInt32(buf []int32) ([]int32, error)

RepeatedInt32 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedInt64

func (m *Message) RepeatedInt64(buf []int64) ([]int64, error)

RepeatedInt64 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedSfixed32

func (m *Message) RepeatedSfixed32(buf []int32) ([]int32, error)

RepeatedSfixed32 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedSfixed64

func (m *Message) RepeatedSfixed64(buf []int64) ([]int64, error)

RepeatedSfixed64 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedSint32

func (m *Message) RepeatedSint32(buf []int32) ([]int32, error)

RepeatedSint32 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedSint64

func (m *Message) RepeatedSint64(buf []int64) ([]int64, error)

RepeatedSint64 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedUint32

func (m *Message) RepeatedUint32(buf []uint32) ([]uint32, error)

RepeatedUint32 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) RepeatedUint64

func (m *Message) RepeatedUint64(buf []uint64) ([]uint64, error)

RepeatedUint64 will append the repeated value(s) to the buffer. This method supports packed or unpacked encoding.

func (*Message) Reset

func (m *Message) Reset(newData []byte)

Reset will set the index to 0 so the message can be read again. Optionally pass in new data to reuse the Message object.

func (*Message) Sfixed32

func (b *Message) Sfixed32() (int32, error)

Sfixed32 reads a fixed 4 byte signed value.

func (*Message) Sfixed64

func (b *Message) Sfixed64() (int64, error)

Sfixed64 reads a fixed 8 byte signed value.

func (*Message) Sint32

func (b *Message) Sint32() (int32, error)

Sint32 uses variable-length encoding with zig-zag encoding for signed values. This field type more efficiently encodes negative numbers than regular int32s.

func (*Message) Sint64

func (b *Message) Sint64() (int64, error)

Sint64 uses variable-length encoding with zig-zag encoding for signed values. This field type more efficiently encodes negative numbers than regular int64s.

func (*Message) Skip

func (m *Message) Skip()

Skip will move the scanner past the current value if it is not needed. If a value is not parsed this method must be called to move the decoder past the value.
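How far a skip has to move depends entirely on the wire type of the current field. The standard-library sketch below shows that calculation; skipValue is a hypothetical helper, and the package's own Skip may be implemented differently.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// skipValue returns the number of bytes occupied by the value that starts at
// buf[0], given the wire type taken from the field key.
func skipValue(buf []byte, wireType int) (int, error) {
	size := 0
	switch wireType {
	case 0: // varint: runs until the first byte with the high bit clear
		_, n := binary.Uvarint(buf)
		if n <= 0 {
			return 0, errors.New("bad varint")
		}
		size = n
	case 1: // 64-bit
		size = 8
	case 2: // length-delimited: varint length, then that many bytes
		l, n := binary.Uvarint(buf)
		if n <= 0 || l > uint64(len(buf)-n) {
			return 0, errors.New("invalid length")
		}
		size = n + int(l)
	case 5: // 32-bit
		size = 4
	default: // groups (3, 4) carry no length and cannot be skipped this way
		return 0, fmt.Errorf("cannot skip wire type %d", wireType)
	}
	if size > len(buf) {
		return 0, errors.New("value runs past end of buffer")
	}
	return size, nil
}

func main() {
	n, _ := skipValue([]byte{0x03, 'a', 'b', 'c', 0x08}, 2)
	fmt.Println(n) // 4
	n, _ = skipValue([]byte{0xac, 0x02, 0x08}, 0)
	fmt.Println(n) // 2
}
```

Length-delimited values are the cheapest to pass over relative to their size, since an entire embedded message or string is skipped after reading only its length prefix.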

func (*Message) String

func (m *Message) String() (string, error)

String reads a string type. This data will always contain UTF-8 encoded or 7-bit ASCII text.

func (*Message) Uint32

func (b *Message) Uint32() (v uint32, err error)

Uint32 reads a variable-length encoding of up to 4 bytes.

func (*Message) Uint64

func (b *Message) Uint64() (v uint64, err error)

Uint64 reads a variable-length encoding of up to 8 bytes.

func (*Message) Varint32

func (b *Message) Varint32() (v uint32, err error)

Varint32 reads up to 32-bits of variable-length encoded data. Note that negative int32 values could still be encoded as 64-bit varints due to their leading 1s.

func (*Message) Varint64

func (b *Message) Varint64() (v uint64, err error)

Varint64 reads up to 64-bits of variable-length encoded data.

func (*Message) WireType

func (m *Message) WireType() int

WireType returns the 'type' of the data at the current location.
