gcs

package module
v3.0.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 10, 2023 License: ISC Imports: 12 Imported by: 9

README

gcs

Build Status ISC License Doc

Package gcs provides an API for building and using Golomb-coded sets.

A Golomb-Coded Set (GCS) is a space-efficient probabilistic data structure that is used to test set membership with a tunable false positive rate while simultaneously preventing false negatives. In other words, items that are in the set will always match, but items that are not in the set will also sometimes match with the chosen false positive rate.

This package currently implements two different versions for backwards compatibility. Version 1 is deprecated and therefore should no longer be used.

Version 2 is the GCS variation that follows the specification details in DCP0005.

Version 2 sets do not permit empty items (data of zero length) to be added and are parameterized by the following:

  • A parameter B that defines the remainder code bit size
  • A parameter M that defines the false positive rate as 1/M
  • A key for the SipHash-2-4 function
  • The items to include in the set

A comprehensive suite of tests is provided to ensure proper functionality.

GCS use in Decred

GCS is used as a mechanism for storing, transmitting, and committing to per-block filters. Consensus-validating full nodes commit to a single filter for every block and serve the filter to SPV clients that match against the filter locally to determine if the block is potentially relevant. The required parameters for Decred are defined by the blockcf2 package.

For more details, see the Block Filters section of DCP0005.

Installation and Updating

This package is part of the github.com/decred/dcrd/gcs/v2 module. Use the standard go tooling for working with modules to incorporate it.

License

Package blockchain is licensed under the copyfree ISC License.

Documentation

Overview

Package gcs provides an API for building and using a Golomb-coded set filter.

A Golomb-Coded Set (GCS) is a space-efficient probabilistic data structure that is used to test set membership with a tunable false positive rate while simultaneously preventing false negatives. In other words, items that are in the set will always match, but items that are not in the set will also sometimes match with the chosen false positive rate.

This package currently implements two different versions for backwards compatibility. Version 1 is deprecated and therefore should no longer be used.

Version 2 is the GCS variation that follows the specification details in DCP0005: https://github.com/decred/dcps/blob/master/dcp-0005/dcp-0005.mediawiki#golomb-coded-sets.

Version 2 sets do not permit empty items (data of zero length) to be added and are parameterized by the following:

* A parameter `B` that defines the remainder code bit size * A parameter `M` that defines the false positive rate as `1/M` * A key for the SipHash-2-4 function * The items to include in the set

Errors

Errors returned by this package are of type gcs.Error. This allows the caller to programmatically determine the specific error by examining the ErrorKind field of the type asserted gcs.Error while still providing rich error messages with contextual information. See ErrorKind in the package documentation for a full list.

GCS use in Decred

GCS is used as a mechanism for storing, transmitting, and committing to per-block filters. Consensus-validating full nodes commit to a single filter for every block and serve the filter to SPV clients that match against the filter locally to determine if the block is potentially relevant. The required parameters for Decred are defined by the blockcf2 package.

For more details, see the Block Filters section of DCP0005: https://github.com/decred/dcps/blob/master/dcp-0005/dcp-0005.mediawiki#block-filters

Index

Constants

View Source
const (
	// ErrNTooBig signifies that the filter can't handle N items.
	ErrNTooBig = ErrorKind("ErrNTooBig")

	// ErrPTooBig signifies that the filter can't handle `1/2**P`
	// collision probability.
	ErrPTooBig = ErrorKind("ErrPTooBig")

	// ErrBTooBig signifies that the provided Golomb coding bin size is larger
	// than the maximum allowed value.
	ErrBTooBig = ErrorKind("ErrBTooBig")

	// ErrMisserialized signifies a filter was misserialized and is missing the
	// N and/or P parameters of a serialized filter.
	ErrMisserialized = ErrorKind("ErrMisserialized")
)

These constants are used to identify a specific RuleError.

View Source
const KeySize = 16

KeySize is the size of the byte array required for key material for the SipHash keyed hash function.

Variables

This section is empty.

Functions

func MakeHeaderForFilter

func MakeHeaderForFilter(filter *FilterV1, prevHeader *chainhash.Hash) chainhash.Hash

MakeHeaderForFilter makes a filter chain header for a filter, given the filter and the previous filter chain header.

Types

type Error

type Error struct {
	Err         error
	Description string
}

Error identifies an error related to gcs filters. It has full support for errors.Is and errors.As, so the caller can ascertain the specific reason for the error by checking the underlying error.

func (Error) Error

func (e Error) Error() string

Error satisfies the error interface and prints human-readable errors.

func (Error) Unwrap

func (e Error) Unwrap() error

Unwrap returns the underlying wrapped error.

type ErrorKind

type ErrorKind string

ErrorKind identifies a kind of error. It has full support for errors.Is and errors.As, so the caller can directly check against an error kind when determining the reason for an error.

func (ErrorKind) Error

func (e ErrorKind) Error() string

Error satisfies the error interface and prints human-readable errors.

type FilterV1

type FilterV1 struct {
	// contains filtered or unexported fields
}

FilterV1 describes an immutable filter that can be built from a set of data elements, serialized, deserialized, and queried in a thread-safe manner. The serialized form is compressed as a Golomb Coded Set (GCS) along with the number of members of the set. The hash function used is SipHash, a keyed function. The key used in building the filter is required in order to match filter values and is not included in the serialized form.

func FromBytesV1

func FromBytesV1(P uint8, d []byte) (*FilterV1, error)

FromBytesV1 deserializes a version 1 GCS filter from a known P and serialized filter as returned by Bytes().

func NewFilterV1

func NewFilterV1(P uint8, key [KeySize]byte, data [][]byte) (*FilterV1, error)

NewFilterV1 builds a new version 1 GCS filter with a collision probability of 1 / 2^P for the given key and data.

func (*FilterV1) Bytes

func (f *FilterV1) Bytes() []byte

Bytes returns the serialized format of the GCS filter which includes N, but does not include other parameters such as the false positive rate or the key.

func (*FilterV1) Hash

func (f *FilterV1) Hash() chainhash.Hash

Hash returns the BLAKE256 hash of the filter.

func (*FilterV1) Match

func (f *FilterV1) Match(key [KeySize]byte, data []byte) bool

Match checks whether a []byte value is likely (within collision probability) to be a member of the set represented by the filter.

func (*FilterV1) MatchAny

func (f *FilterV1) MatchAny(key [KeySize]byte, data [][]byte) bool

MatchAny checks whether any []byte value is likely (within collision probability) to be a member of the set represented by the filter faster than calling Match() for each value individually.

func (*FilterV1) N

func (f *FilterV1) N() uint32

N returns the size of the data set used to build the filter.

func (*FilterV1) P

func (f *FilterV1) P() uint8

P returns the filter's collision probability as a negative power of 2. For example, a collision probability of 1 / 2^20 is represented as 20.

type FilterV2

type FilterV2 struct {
	// contains filtered or unexported fields
}

FilterV2 describes an immutable filter that can be built from a set of data elements, serialized, deserialized, and queried in a thread-safe manner. The serialized form is compressed as a Golomb Coded Set (GCS) along with the number of members of the set. The hash function used is SipHash, a keyed function. The key used in building the filter is required in order to match filter values and is not included in the serialized form.

Version 2 filters differ from version 1 filters in four ways:

  1. Support for independently specifying the false positive rate and Golomb coding bin size which allows minimizing the filter size
  2. A faster (incompatible with version 1) reduction function
  3. A more compact serialization for the number of members in the set
  4. Deduplication of all hash collisions prior to reducing and serializing the deltas

func FromBytesV2

func FromBytesV2(B uint8, M uint64, d []byte) (*FilterV2, error)

FromBytesV2 deserializes a version 2 GCS filter from a known B, M, and serialized filter as returned by Bytes().

func NewFilterV2

func NewFilterV2(B uint8, M uint64, key [KeySize]byte, data [][]byte) (*FilterV2, error)

NewFilterV2 builds a new GCS filter with the provided tunable parameters that contains every item of the passed data as a member of the set.

B is the tunable bits parameter for constructing the filter that is used as the bin size in the underlying Golomb coding with a value of 2^B. The optimal value of B to minimize the size of the filter for a given false positive rate 1/M is floor(log_2(M) - 0.055256). The maximum allowed value for B is 32. An error will be returned for larger values.

M is the inverse of the target false positive rate for the filter. The optimal value of M to minimize the size of the filter for a given B is ceil(1.497137 * 2^B).

key is a key used in the SipHash function used to hash each data element prior to inclusion in the filter. This helps thwart would be attackers attempting to choose elements that intentionally cause false positives.

The general process for determining optimal parameters for B and M to minimize the size of the filter is to start with the desired false positive rate and calculate B per the aforementioned formula accordingly. Then, if the application permits the false positive rate to be varied, calculate the optimal value of M via the formula provided under the description of M.

func (*FilterV2) B

func (f *FilterV2) B() uint8

B returns the tunable bits parameter that was used to construct the filter. It represents the bin size in the underlying Golomb coding with a value of 2^B.

func (*FilterV2) Bytes

func (f *FilterV2) Bytes() []byte

Bytes returns the serialized format of the GCS filter which includes N, but does not include other parameters such as the false positive rate or the key.

func (*FilterV2) Hash

func (f *FilterV2) Hash() chainhash.Hash

Hash returns the BLAKE256 hash of the filter.

func (*FilterV2) Match

func (f *FilterV2) Match(key [KeySize]byte, data []byte) bool

Match checks whether a []byte value is likely (within collision probability) to be a member of the set represented by the filter.

func (*FilterV2) MatchAny

func (f *FilterV2) MatchAny(key [KeySize]byte, data [][]byte) bool

MatchAny checks whether any []byte value is likely (within collision probability) to be a member of the set represented by the filter faster than calling Match() for each value individually.

func (*FilterV2) N

func (f *FilterV2) N() uint32

N returns the size of the data set used to build the filter.

Directories

Path Synopsis
Package blockcf2 provides functions for building committed filters for blocks using Golomb-coded sets in a way that is useful for light clients such as SPV wallets.
Package blockcf2 provides functions for building committed filters for blocks using Golomb-coded sets in a way that is useful for light clients such as SPV wallets.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL