bloom

package
v1.9.4-6277238fa1e72d2... Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 10, 2019 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

View Source
var (
	ErrInvalidLengthBloom = fmt.Errorf("proto: negative length found during unmarshaling")
	ErrIntOverflowBloom   = fmt.Errorf("proto: integer overflow")
)

Functions

func FilterSizeForFalsePositiveRate

func FilterSizeForFalsePositiveRate(falsePositiveRate float64, elementCount int) int

FilterSizeForFalsePositiveRate returns the memory footprint for a filter with the given constraints.

Types

type BloomFilter

type BloomFilter struct {
	NumSubhashes uint32 `protobuf:"varint,1,opt,name=num_subhashes,json=numSubhashes,proto3" json:"num_subhashes,omitempty"`
	// TODO: we could make each bucket signed, which would allow us to do some interesting things,
	// but it might make the design a little confusing.
	// Two BloomFilters with identical hash_length and len(buckets) can be summed
	// to produce a new filter which should be identical to running all operations
	// from the original two filters onto a blank filter.
	// Negative bucket values may be useful if we were to add a background
	// reprocessing stage that would iterate over all existing items and re-add
	// them, but would also track live updates to the set.  Due to live updates,
	// some buckets may need to go negative temporarily - but we would lose the
	// guarantee that removing something that wasn't added to the set is an error.
	// Perhaps better to provide a 'BloomFilterDelta' that can later be combined
	// into an existing filter, while still preserving the invariant that all
	// buckets are positive.
	Buckets              []uint32 `protobuf:"varint,2,rep,packed,name=buckets,proto3" json:"buckets,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}

func NewFilterWithFalsePositiveRate

func NewFilterWithFalsePositiveRate(falsePositiveRate float64, elementCount int, maxBytes int) *BloomFilter

NewFilterWithFalsePositiveRate constructs a new bloom filter with the given constraints:

  • 'falsePositiveRate' - the expected false positive rate when the filter is loaded with 'elementCount' elements
  • 'elementCount' - the expected number of elements that will be present in the filter
  • 'maxBytes' - the maximum memory footprint to allocate for this filter.

Given these, the filter will choose a size that will provide the specified constraints. If the memory footprint is capped by 'maxBytes', the falsePositiveRate will increase.

func NewFilterWithSize

func NewFilterWithSize(bytes int, elementCount int) *BloomFilter

NewFilterWithSize constructs a new bloom filter with the given constraints.

  • 'bytes' - the number of bytes to allocate for the filter buckets
  • 'elementCount' - the expected number of elements that will be present in the filter

Given these, the filter will choose the optimal number of hashes to perform to get the best possible false-positive rate.

func (*BloomFilter) Add

func (f *BloomFilter) Add(hash []byte)

Add adds an item to the filter.

func (*BloomFilter) Descriptor

func (*BloomFilter) Descriptor() ([]byte, []int)

func (*BloomFilter) FalsePositiveRate

func (f *BloomFilter) FalsePositiveRate(elementCount int) float64

FalsePositiveRate returns the expected rate of false positives if the filter contains the given number of elements.

func (*BloomFilter) GetBuckets

func (m *BloomFilter) GetBuckets() []uint32

func (*BloomFilter) GetNumSubhashes

func (m *BloomFilter) GetNumSubhashes() uint32

func (*BloomFilter) IsNotPresent

func (f *BloomFilter) IsNotPresent(hash []byte) bool

IsNotPresent returns true if an item corresponding to the given hash has definitely not been added to the set.

func (*BloomFilter) Marshal

func (m *BloomFilter) Marshal() (dAtA []byte, err error)

func (*BloomFilter) MarshalTo

func (m *BloomFilter) MarshalTo(dAtA []byte) (int, error)

func (*BloomFilter) MarshalToSizedBuffer

func (m *BloomFilter) MarshalToSizedBuffer(dAtA []byte) (int, error)

func (*BloomFilter) OverflowRate

func (f *BloomFilter) OverflowRate() float64

OverflowRate iterates over all buckets in the filter and returns the ratio of buckets that are in overflow to the total number of buckets. Once a bucket overflows, it will never return to normal, even if all items corresponding to that bucket are removed - a new filter must be constructed instead.

func (*BloomFilter) ProtoMessage

func (*BloomFilter) ProtoMessage()

func (*BloomFilter) Remove

func (f *BloomFilter) Remove(hash []byte)

Remove removes an item from the filter

func (*BloomFilter) Reset

func (m *BloomFilter) Reset()

func (*BloomFilter) Size

func (m *BloomFilter) Size() (n int)

func (*BloomFilter) String

func (m *BloomFilter) String() string

func (*BloomFilter) Unmarshal

func (m *BloomFilter) Unmarshal(dAtA []byte) error

func (*BloomFilter) UpperBoundCount

func (f *BloomFilter) UpperBoundCount(hash []byte) int

UpperBoundCount returns the maximum number of items in the filter matching the given hash.

func (*BloomFilter) XXX_DiscardUnknown

func (m *BloomFilter) XXX_DiscardUnknown()

func (*BloomFilter) XXX_Marshal

func (m *BloomFilter) XXX_Marshal(b []byte, deterministic bool) ([]byte, error)

func (*BloomFilter) XXX_Merge

func (m *BloomFilter) XXX_Merge(src proto.Message)

func (*BloomFilter) XXX_Size

func (m *BloomFilter) XXX_Size() int

func (*BloomFilter) XXX_Unmarshal

func (m *BloomFilter) XXX_Unmarshal(b []byte) error

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL