zetasketch

Version: v0.1.0
Published: Sep 17, 2021 License: Apache-2.0 Imports: 6 Imported by: 0

README

ZetaSketch

A collection of libraries for single-pass, distributed, sublinear-space approximate aggregation and sketching algorithms. Currently: HyperLogLog++; more to come.

Go port of the original Java library https://github.com/google/zetasketch. Copyright 2019 Google LLC, Licensed under the Apache License, Version 2.0.

Documentation

Overview

Package zetasketch provides a collection of libraries for single-pass, distributed, sublinear-space approximate aggregation and sketching algorithms.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Aggregator

type Aggregator interface {
	// Add adds a value.
	Add(v Value)
	// NumValues returns the total number of input values that this aggregator has seen.
	NumValues() int64
	// Merge merges two aggregators.
	Merge(other Aggregator) error

	encoding.BinaryMarshaler
	encoding.BinaryUnmarshaler
}

Aggregator is an interface that wraps a distributed, online aggregation algorithm.
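To illustrate the contract, here is a minimal sketch of a type that satisfies a local copy of the interface and goes through the serialize, ship, and merge cycle that distributed aggregation relies on. The `exactSet`, `u64`, `newExactSet`, and `mergeDemo` names are hypothetical and not part of this package; `exactSet` stores every hash, so it is exact and linear-space, unlike a real sketch. The error returned by `Merge` for a mismatched type is an assumption about how such an interface might be used, not documented behavior.

```go
package main

import (
	"encoding"
	"encoding/binary"
	"errors"
	"fmt"
)

// Value and Aggregator are local copies of the documented interfaces,
// declared here only to keep the example self-contained.
type Value interface{ Sum64() uint64 }

type Aggregator interface {
	Add(v Value)
	NumValues() int64
	Merge(other Aggregator) error
	encoding.BinaryMarshaler
	encoding.BinaryUnmarshaler
}

// u64 is a trivial Value whose hash is the number itself (demo only).
type u64 uint64

func (v u64) Sum64() uint64 { return uint64(v) }

// exactSet is a hypothetical aggregator that stores every distinct hash.
type exactSet struct {
	seen map[uint64]struct{}
	n    int64
}

var _ Aggregator = (*exactSet)(nil) // compile-time interface check

func newExactSet() *exactSet { return &exactSet{seen: make(map[uint64]struct{})} }

func (s *exactSet) Add(v Value) {
	s.seen[v.Sum64()] = struct{}{}
	s.n++
}

func (s *exactSet) NumValues() int64 { return s.n }

// Merge folds other into s; it rejects aggregators of a different type.
func (s *exactSet) Merge(other Aggregator) error {
	o, ok := other.(*exactSet)
	if !ok {
		return errors.New("exactSet: cannot merge a different aggregator type")
	}
	for h := range o.seen {
		s.seen[h] = struct{}{}
	}
	s.n += o.n
	return nil
}

// MarshalBinary writes the value count followed by each stored hash (8 bytes each).
func (s *exactSet) MarshalBinary() ([]byte, error) {
	buf := make([]byte, 8, 8+8*len(s.seen))
	binary.LittleEndian.PutUint64(buf, uint64(s.n))
	var tmp [8]byte
	for h := range s.seen {
		binary.LittleEndian.PutUint64(tmp[:], h)
		buf = append(buf, tmp[:]...)
	}
	return buf, nil
}

// UnmarshalBinary replaces the state of s with the decoded payload.
func (s *exactSet) UnmarshalBinary(data []byte) error {
	if len(data) < 8 || len(data)%8 != 0 {
		return errors.New("exactSet: malformed payload")
	}
	s.n = int64(binary.LittleEndian.Uint64(data))
	s.seen = make(map[uint64]struct{}, len(data)/8-1)
	for i := 8; i < len(data); i += 8 {
		s.seen[binary.LittleEndian.Uint64(data[i:])] = struct{}{}
	}
	return nil
}

// mergeDemo aggregates on two "workers", ships one as bytes, and merges.
func mergeDemo() (int64, int, error) {
	a, b := newExactSet(), newExactSet()
	for i := 1; i <= 5; i++ {
		a.Add(u64(i))
	}
	for i := 4; i <= 8; i++ {
		b.Add(u64(i))
	}
	payload, err := b.MarshalBinary()
	if err != nil {
		return 0, 0, err
	}
	remote := newExactSet()
	if err := remote.UnmarshalBinary(payload); err != nil {
		return 0, 0, err
	}
	if err := a.Merge(remote); err != nil {
		return 0, 0, err
	}
	return a.NumValues(), len(a.seen), nil
}

func main() {
	total, unique, err := mergeDemo()
	fmt.Println(total, unique, err)
}
```

The two workers overlap on the values 4 and 5, so the merged aggregator has seen 10 values in total but only 8 distinct ones.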

type HLL

type HLL struct {
	// contains filtered or unexported fields
}

HLL implements an HLL++ aggregator for estimating cardinalities of multisets.

The precision trades accuracy against memory: a higher precision lowers the estimation error but raises the memory used. The upper bound on the memory required is 2^precision bytes, but less memory is used for smaller cardinalities. The relative error is 1.04 / sqrt(2^precision). A typical value used at Google is 15, which gives an error of about 0.6% while requiring an upper bound of 32 KiB of memory.
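The two formulas above can be evaluated directly to compare precision settings. The following is a small standalone calculation, not part of this package; `relativeError` and `maxMemoryBytes` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
)

// relativeError returns 1.04 / sqrt(2^precision), the documented relative error.
func relativeError(precision uint8) float64 {
	return 1.04 / math.Sqrt(math.Exp2(float64(precision)))
}

// maxMemoryBytes returns 2^precision, the documented upper bound in bytes.
func maxMemoryBytes(precision uint8) uint64 {
	return uint64(1) << precision
}

func main() {
	for _, p := range []uint8{10, 12, 15, 18} {
		fmt.Printf("precision=%d error=%.3f%% max memory=%d KiB\n",
			p, relativeError(p)*100, maxMemoryBytes(p)/1024)
	}
}
```

For precision 15 this gives 1.04 / sqrt(32768), roughly 0.0057, and 32768 bytes, which matches the 0.6% and 32 KiB figures quoted above.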

Note that this aggregator is not designed to be thread safe.
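When several goroutines must feed one aggregator, access has to be serialized by the caller. Below is a minimal sketch of the usual mutex-wrapper pattern. The `counter` type is a hypothetical stand-in for any aggregator that is not safe for concurrent use, and `lockedCounter` and `addConcurrently` are likewise illustrative names, not part of this package.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a stand-in for an aggregator that is not safe for concurrent use.
type counter struct{ n int64 }

func (c *counter) Add(v uint64)     { c.n++ }
func (c *counter) NumValues() int64 { return c.n }

// lockedCounter guards every call to the wrapped aggregator with a mutex.
type lockedCounter struct {
	mu  sync.Mutex
	agg *counter
}

func (l *lockedCounter) Add(v uint64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.agg.Add(v)
}

func (l *lockedCounter) NumValues() int64 {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.agg.NumValues()
}

// addConcurrently feeds one shared aggregator from several goroutines.
func addConcurrently(workers, perWorker int) int64 {
	lc := &lockedCounter{agg: &counter{}}
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				lc.Add(uint64(w*perWorker + i))
			}
		}(w)
	}
	wg.Wait()
	return lc.NumValues()
}

func main() {
	fmt.Println(addConcurrently(8, 1000)) // prints 8000
}
```

An alternative that avoids lock contention is to give each goroutine its own aggregator and combine them with Merge once all of them have finished.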

func NewHLL

func NewHLL(cfg *HLLConfig) *HLL

NewHLL initializes a new HLL++ aggregator.

func (*HLL) Add

func (h *HLL) Add(v Value)

Add adds value v to the aggregator.

func (*HLL) MarshalBinary

func (h *HLL) MarshalBinary() ([]byte, error)

MarshalBinary serializes the aggregator to bytes.

func (*HLL) Merge

func (h *HLL) Merge(other Aggregator) error

Merge merges aggregator other into h.

func (*HLL) NumValues

func (h *HLL) NumValues() int64

NumValues returns the number of values seen.

func (*HLL) Result

func (h *HLL) Result() int64

Result returns an estimate of the number of unique values.

func (*HLL) UnmarshalBinary

func (h *HLL) UnmarshalBinary(data []byte) error

UnmarshalBinary deserializes the aggregator from bytes.

type HLLConfig

type HLLConfig struct {
	// Defaults to 15.
	Precision uint8

	// If no sparse precision is specified, the default is calculated as precision + 5.
	SparsePrecision uint8
}

HLLConfig specifies the configuration parameters for the HLL++ aggregator.
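The field comments describe two defaulting rules. One plausible reading is sketched below, under the assumption that a zero value means "not specified"; the package documentation does not say how an unset field is detected. The `config` struct is a local stand-in for HLLConfig, and `withDefaults` and `defaultPrecision` are hypothetical names.

```go
package main

import "fmt"

// config mirrors the two documented HLLConfig fields (local stand-in).
type config struct {
	Precision       uint8
	SparsePrecision uint8
}

// defaultPrecision is the documented default for Precision.
const defaultPrecision = 15

// withDefaults fills unset fields. Assumption: zero means "not specified".
func withDefaults(c config) config {
	if c.Precision == 0 {
		c.Precision = defaultPrecision
	}
	if c.SparsePrecision == 0 {
		// Documented rule: sparse precision defaults to precision + 5.
		c.SparsePrecision = c.Precision + 5
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", withDefaults(config{}))              // {Precision:15 SparsePrecision:20}
	fmt.Printf("%+v\n", withDefaults(config{Precision: 12})) // {Precision:12 SparsePrecision:17}
}
```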

type Value

type Value interface {
	Sum64() uint64
}

Value is a hashable value.
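Because the interface has a single method, a custom type can act as a Value by returning a 64-bit hash of its contents. Here is a minimal sketch using FNV-1a from the standard library; the `userID` type is hypothetical, and the `Value` interface is redeclared locally to keep the example self-contained. Note the assumption involved: the package's own constructors presumably hash the way the Java library does, so values hashed with a different function may not be comparable with sketches built elsewhere, and a weak hash can skew the estimate.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// Value is a local copy of the documented interface.
type Value interface{ Sum64() uint64 }

// userID is a hypothetical composite key.
type userID struct {
	Tenant string
	ID     uint64
}

// Sum64 hashes both fields with FNV-1a (a swapped-in hash, not the package's own).
func (u userID) Sum64() uint64 {
	h := fnv.New64a()
	h.Write([]byte(u.Tenant))
	h.Write([]byte{0}) // separator so the tenant cannot run into the ID bytes
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], u.ID)
	h.Write(buf[:])
	return h.Sum64()
}

var _ Value = userID{} // compile-time interface check

func main() {
	a := userID{Tenant: "acme", ID: 1}
	b := userID{Tenant: "acme", ID: 2}
	fmt.Println(a.Sum64() == userID{Tenant: "acme", ID: 1}.Sum64()) // true: equal values hash equally
	fmt.Println(a.Sum64() == b.Sum64())                             // false
}
```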

func BinaryValue

func BinaryValue(p []byte) Value

BinaryValue converts a byte slice to a Value.

func StringValue

func StringValue(s string) Value

StringValue converts a string to a Value.

func Uint32Value

func Uint32Value(v uint32) Value

Uint32Value converts a number to a Value.

func Uint64Value

func Uint64Value(v uint64) Value

Uint64Value converts a 64-bit number to a Value.

Directories

Path Synopsis
internal/hash	Package hash implements hashing as done in the zetasketch Java library https://github.com/google/zetasketch/blob/master/java/com/google/zetasketch/internal/hash/.
