hashing

package
v0.0.0-...-bc21918
Published: Nov 12, 2021 License: Apache-2.0, BSD-2-Clause, BSD-3-Clause, + 8 more Imports: 12 Imported by: 0

Documentation

Overview

Package hashing provides utilities for, and an implementation of, a hash table that is more performant than the default Go map implementation by leveraging xxh3 and some custom hash functions.

Index

Constants

const KeyNotFound = -1

KeyNotFound is the constant returned by memo table functions when a key isn't found in the table.

Variables

This section is empty.

Functions

This section is empty.

Types

type BinaryMemoTable

type BinaryMemoTable struct {
	// contains filtered or unexported fields
}

BinaryMemoTable is our hash table for binary data. It uses the BinaryBuilder to construct the actual data in a form that is easy to pass around with minimal copies, and a hash table to keep track of the indexes into the dictionary that is created as we go.

func NewBinaryMemoTable

func NewBinaryMemoTable(mem memory.Allocator, initial, valuesize int) *BinaryMemoTable

NewBinaryMemoTable returns a hash table for Binary data. The passed-in allocator is used for the BinaryBuilder; if it is nil, memory.DefaultAllocator is used. initial and valuesize can be used to pre-allocate the table and reduce allocations: initial is the number of entries to allocate for up front, and valuesize is the starting amount of space allocated for writing the actual binary data.

func (*BinaryMemoTable) CopyFixedWidthValues

func (b *BinaryMemoTable) CopyFixedWidthValues(start, width int, out []byte)

CopyFixedWidthValues exists to cope with the fact that the table doesn't keep track of the fixed width when inserting the null value: the data buffer holds a zero-length byte slice for the null value (if found).

func (*BinaryMemoTable) CopyOffsets

func (b *BinaryMemoTable) CopyOffsets(out []int8)

CopyOffsets copies the list of offsets into the passed-in slice. The offsets are the start and end values of the underlying allocated bytes in the builder for the individual values of the table. out should be sized to at least Size()+1.
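
The relationship between values and offsets can be illustrated with a small standalone sketch. It does not use this package: buildOffsets is a hypothetical helper, and it uses int32 offsets purely for illustration. It only shows why n values need n+1 offsets, which is the reason out should hold at least Size()+1 elements.

```go
package main

import "fmt"

// buildOffsets is a hypothetical helper (not part of this package). It stores
// values back to back in one buffer and records where each one starts.
// For n values it returns n+1 offsets: value i occupies
// data[offsets[i]:offsets[i+1]].
func buildOffsets(values [][]byte) (data []byte, offsets []int32) {
	offsets = make([]int32, 0, len(values)+1)
	for _, v := range values {
		offsets = append(offsets, int32(len(data)))
		data = append(data, v...)
	}
	// The final offset marks the end of the last value.
	offsets = append(offsets, int32(len(data)))
	return data, offsets
}

func main() {
	data, offsets := buildOffsets([][]byte{[]byte("foo"), []byte("barbaz"), []byte("")})
	fmt.Println(offsets)                             // [0 3 9 9]
	fmt.Println(string(data[offsets[1]:offsets[2]])) // barbaz
}
```

Note that the empty value still gets its own offset pair; its start and end are simply equal.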

func (*BinaryMemoTable) CopyOffsetsSubset

func (b *BinaryMemoTable) CopyOffsetsSubset(start int, out []int8)

CopyOffsetsSubset is like CopyOffsets but instead of copying all of the offsets, it gets a subset of the offsets in the table starting at the index provided by "start".

func (*BinaryMemoTable) CopyValues

func (b *BinaryMemoTable) CopyValues(out interface{})

CopyValues copies the raw binary data bytes out. out should be a []byte with at least ValuesSize bytes allocated to copy into.

func (*BinaryMemoTable) CopyValuesSubset

func (b *BinaryMemoTable) CopyValuesSubset(start int, out interface{})

CopyValuesSubset copies the raw binary data bytes out, starting with the value at the index start. out should be a []byte with at least ValuesSize bytes allocated.

func (*BinaryMemoTable) Get

func (b *BinaryMemoTable) Get(val interface{}) (int, bool)

Get returns the index of the specified value in the table or KeyNotFound, and a boolean indicating whether it was found in the table.

func (*BinaryMemoTable) GetNull

func (s *BinaryMemoTable) GetNull() (int, bool)

GetNull returns the index of a null that has been inserted into the table or KeyNotFound. The bool returned will be true if there was a null inserted into the table, and false otherwise.

func (*BinaryMemoTable) GetOrInsert

func (b *BinaryMemoTable) GetOrInsert(val interface{}) (idx int, found bool, err error)

GetOrInsert returns the index of the given value in the table; if the value is not found, it is inserted into the table. The return value 'found' indicates whether the value was already in the table (true) or was inserted (false). Any error encountered is returned in err.
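
The Get and GetOrInsert contract can be sketched in a few lines. This is a simplified, hypothetical model built on Go's built-in map, not the xxh3-based table this package implements; memoSketch and keyNotFound are names invented for the example. It only illustrates how indexes are handed out in first-insertion order and what the found flag means.

```go
package main

import "fmt"

// keyNotFound plays the same role as KeyNotFound in this package.
const keyNotFound = -1

// memoSketch is a hypothetical, simplified memo table for strings.
type memoSketch struct {
	index  map[string]int // value -> dictionary index
	values []string       // values in first-insertion order
}

func newMemoSketch() *memoSketch {
	return &memoSketch{index: make(map[string]int)}
}

// get returns the index of val, or keyNotFound and false if it is absent.
func (m *memoSketch) get(val string) (int, bool) {
	if existing, ok := m.index[val]; ok {
		return existing, true
	}
	return keyNotFound, false
}

// getOrInsert returns the existing index (found == true), or appends val to
// the dictionary and returns its new index (found == false).
func (m *memoSketch) getOrInsert(val string) (idx int, found bool) {
	if existing, ok := m.get(val); ok {
		return existing, true
	}
	idx = len(m.values)
	m.index[val] = idx
	m.values = append(m.values, val)
	return idx, false
}

func main() {
	m := newMemoSketch()
	for _, v := range []string{"a", "b", "a", "c"} {
		idx, found := m.getOrInsert(v)
		fmt.Println(v, idx, found)
	}
	idx, found := m.get("z")
	fmt.Println("z", idx, found)
}
```

The second "a" returns index 0 with found set to true, so repeated values never grow the dictionary.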

func (*BinaryMemoTable) GetOrInsertNull

func (b *BinaryMemoTable) GetOrInsertNull() (idx int, found bool)

GetOrInsertNull retrieves the index of a null in the table or inserts null into the table, returning the index and a boolean indicating if it was found in the table (true) or was inserted (false).
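
One way to picture the null handling is a table that remembers the null's index separately while sharing a single index space with ordinary values. The sketch below is an assumption about how such a scheme could look, not a copy of this package's internals; nullableMemo and its methods are hypothetical names.

```go
package main

import "fmt"

const keyNotFound = -1

// nullableMemo is a hypothetical sketch: values and the null entry share one
// index space, and the null's index is tracked on its own.
type nullableMemo struct {
	index   map[int64]int
	nullIdx int // keyNotFound until a null is inserted
}

func newNullableMemo() *nullableMemo {
	return &nullableMemo{index: make(map[int64]int), nullIdx: keyNotFound}
}

// size counts the distinct values plus the null entry, if present.
func (m *nullableMemo) size() int {
	n := len(m.index)
	if m.nullIdx != keyNotFound {
		n++
	}
	return n
}

// getOrInsert assigns the next free index to a value seen for the first time.
func (m *nullableMemo) getOrInsert(v int64) (int, bool) {
	if idx, ok := m.index[v]; ok {
		return idx, true
	}
	idx := m.size()
	m.index[v] = idx
	return idx, false
}

// getNull reports the null's index, or keyNotFound and false.
func (m *nullableMemo) getNull() (int, bool) {
	return m.nullIdx, m.nullIdx != keyNotFound
}

// getOrInsertNull returns the null's index, inserting the null if needed.
func (m *nullableMemo) getOrInsertNull() (int, bool) {
	if idx, ok := m.getNull(); ok {
		return idx, true
	}
	m.nullIdx = m.size()
	return m.nullIdx, false
}

func main() {
	m := newNullableMemo()
	m.getOrInsert(10)
	idx, found := m.getOrInsertNull()
	fmt.Println(idx, found) // 1 false
	m.getOrInsert(20)
	idx, found = m.getOrInsertNull()
	fmt.Println(idx, found, m.size()) // 1 true 3
}
```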

func (*BinaryMemoTable) Release

func (b *BinaryMemoTable) Release()

Release tells the underlying builder that it can release the allocated memory once the reference count reaches 0. It is safe to call from multiple goroutines simultaneously.

func (*BinaryMemoTable) Reset

func (s *BinaryMemoTable) Reset()

Reset drops all of the data in the table, allowing it to be reused.

func (*BinaryMemoTable) Retain

func (b *BinaryMemoTable) Retain()

Retain increases the reference count. It is safe to call from multiple goroutines simultaneously.
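
The Retain and Release pairing follows the usual atomic reference-counting pattern. The sketch below shows that pattern in isolation, assuming a count that starts at 1; refCounted and its released flag are hypothetical stand-ins, and the real table delegates to its builder rather than setting a flag.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// refCounted is a hypothetical sketch of atomic reference counting.
type refCounted struct {
	refs     int64
	released bool // stand-in for "memory was handed back to the allocator"
}

// newRefCounted starts with one reference held by the creator.
func newRefCounted() *refCounted { return &refCounted{refs: 1} }

// Retain adds a reference. The atomic add makes concurrent calls safe.
func (r *refCounted) Retain() { atomic.AddInt64(&r.refs, 1) }

// Release drops a reference and frees the resource when the count hits 0.
// Only the single call that observes 0 performs the cleanup.
func (r *refCounted) Release() {
	if atomic.AddInt64(&r.refs, -1) == 0 {
		r.released = true
	}
}

func main() {
	r := newRefCounted()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.Retain()
			r.Release()
		}()
	}
	wg.Wait()
	fmt.Println(r.released) // false: the creator still holds a reference
	r.Release()
	fmt.Println(r.released) // true
}
```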

func (*BinaryMemoTable) Size

func (s *BinaryMemoTable) Size() int

Size returns the current size of the memo table including the null value if one has been inserted.

func (*BinaryMemoTable) ValuesSize

func (b *BinaryMemoTable) ValuesSize() int

ValuesSize returns the current total size of all the raw bytes that have been inserted into the memotable so far.

func (*BinaryMemoTable) VisitValues

func (b *BinaryMemoTable) VisitValues(start int, visitFn func([]byte))

VisitValues exists to run the visitFn on each value currently in the hash table.

func (*BinaryMemoTable) WriteOut

func (b *BinaryMemoTable) WriteOut(out []byte)

func (*BinaryMemoTable) WriteOutSubset

func (b *BinaryMemoTable) WriteOutSubset(start int, out []byte)

type Float32HashTable

type Float32HashTable struct {
	// contains filtered or unexported fields
}

Float32HashTable is a hashtable specifically for float32 that is utilized with the MemoTable to generalize interactions for easier implementation of dictionaries without losing performance.

func NewFloat32HashTable

func NewFloat32HashTable(cap uint64) *Float32HashTable

NewFloat32HashTable returns a new hash table for float32 values initialized with the passed-in capacity or 32, whichever is larger.

func (*Float32HashTable) CopyValues

func (h *Float32HashTable) CopyValues(out []float32)

CopyValues copies the values out of the hash table into the passed-in slice, in the order that they were first inserted.

func (*Float32HashTable) CopyValuesSubset

func (h *Float32HashTable) CopyValuesSubset(start int, out []float32)

CopyValuesSubset copies a subset of the values in the hashtable out, starting with the value at start, in the order that they were inserted.
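
A hash table does not store its entries in insertion order, so copying values out in that order has to rely on the memo index recorded with each entry. The sketch below illustrates the idea with a plain map from value to memo index; copyValuesSubset here is a hypothetical free function written for the example, not this package's method.

```go
package main

import "fmt"

// copyValuesSubset is a hypothetical sketch. entries maps each value to the
// memo index it received on first insertion. Every value whose index is at
// least start is written to out at position index-start, so out ends up in
// insertion order regardless of the map's iteration order.
func copyValuesSubset(entries map[float32]int32, start int, out []float32) {
	for val, memoIdx := range entries {
		if idx := int(memoIdx); idx >= start {
			out[idx-start] = val
		}
	}
}

func main() {
	entries := map[float32]int32{1.5: 0, 2.5: 1, 3.5: 2}
	out := make([]float32, 2)
	copyValuesSubset(entries, 1, out)
	fmt.Println(out) // [2.5 3.5]
}
```

The caller is responsible for sizing out to hold every value from start onward, which in this sketch is len(entries)-start elements.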

func (*Float32HashTable) Insert

func (h *Float32HashTable) Insert(e *entryFloat32, v uint64, val float32, memoIdx int32) error

Insert updates the given entry with the provided hash value, payload value and memo index. The entry pointer must have been retrieved via lookup in order to actually insert properly.

func (*Float32HashTable) Lookup

func (h *Float32HashTable) Lookup(v uint64, cmp func(float32) bool) (*entryFloat32, bool)

Lookup retrieves the entry for a given hash value, assuming its payload value returns true when passed to the cmp func. It returns a pointer to the entry for the given hash value and a boolean indicating whether it was found. It is not safe to use the pointer if the bool is false.
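
The Lookup-then-Insert flow can be sketched with a small open-addressing table. Everything here is an assumption made for illustration: the entry layout, the linear probing, the use of hash 0 as an "empty" sentinel, and the idea that a miss yields the empty slot where the value belongs. The sketch also never grows, so it must not be filled completely.

```go
package main

import "fmt"

// entry is a hypothetical slot; h == 0 marks it as empty.
type entry struct {
	h       uint64
	val     int32
	memoIdx int32
}

// table is a fixed-size open-addressing table (capacity is a power of two).
type table struct {
	entries []entry
	mask    uint64
}

func newTable(capacity uint64) *table {
	return &table{entries: make([]entry, capacity), mask: capacity - 1}
}

// fixHash keeps 0 reserved as the empty-slot sentinel.
func fixHash(h uint64) uint64 {
	if h == 0 {
		return 42
	}
	return h
}

// hashInt32 is a simple multiplicative hash used only by this sketch.
func hashInt32(v int32) uint64 { return uint64(uint32(v)) * 0x9E3779B97F4A7C15 }

// lookup probes linearly from h&mask. On a hit it returns the entry and true.
// On a miss it returns the first empty slot and false; that slot holds no
// meaningful payload and is only useful as the target of insert.
func (t *table) lookup(h uint64, cmp func(int32) bool) (*entry, bool) {
	h = fixHash(h)
	for i := h & t.mask; ; i = (i + 1) & t.mask {
		e := &t.entries[i]
		if e.h == 0 {
			return e, false
		}
		if e.h == h && cmp(e.val) {
			return e, true
		}
	}
}

// insert fills the slot previously returned by lookup.
func (t *table) insert(e *entry, h uint64, val, memoIdx int32) {
	e.h, e.val, e.memoIdx = fixHash(h), val, memoIdx
}

func main() {
	tbl := newTable(8)
	next := int32(0)
	for _, v := range []int32{7, 9, 7} {
		h := hashInt32(v)
		e, found := tbl.lookup(h, func(x int32) bool { return x == v })
		if !found {
			tbl.insert(e, h, v, next)
			next++
		}
		fmt.Println(v, e.memoIdx, found)
	}
}
```

Passing the comparison as a callback lets the table match on the payload after the hash matches, which is what distinguishes two different values that happen to share a hash.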

func (*Float32HashTable) Reset

func (h *Float32HashTable) Reset(cap uint64)

Reset drops all of the values in this hash table and re-initializes it with the specified initial capacity as if by calling New, but without having to reallocate the object.

func (*Float32HashTable) VisitEntries

func (h *Float32HashTable) VisitEntries(visit func(*entryFloat32))

VisitEntries will call the passed in function on each *valid* entry in the hash table, a valid entry being one which has had a value inserted into it.

func (*Float32HashTable) WriteOut

func (h *Float32HashTable) WriteOut(out []byte)

func (*Float32HashTable) WriteOutSubset

func (h *Float32HashTable) WriteOutSubset(start int, out []byte)

type Float32MemoTable

type Float32MemoTable struct {
	// contains filtered or unexported fields
}

Float32MemoTable is a wrapper over the appropriate hashtable to provide an interface conforming to the MemoTable interface defined in the encoding package for general interactions regarding dictionaries.

func NewFloat32MemoTable

func NewFloat32MemoTable(num int64) *Float32MemoTable

NewFloat32MemoTable returns a new memotable with num entries pre-allocated to reduce further allocations when inserting.

func (*Float32MemoTable) CopyValues

func (s *Float32MemoTable) CopyValues(out interface{})

CopyValues will copy the values from the memo table out into the passed in slice which must be of the appropriate type.

func (*Float32MemoTable) CopyValuesSubset

func (s *Float32MemoTable) CopyValuesSubset(start int, out interface{})

CopyValuesSubset is like CopyValues but only copies a subset of values starting at the provided start index.

func (*Float32MemoTable) Get

func (s *Float32MemoTable) Get(val interface{}) (int, bool)

Get returns the index of the requested value in the hash table or KeyNotFound along with a boolean indicating if it was found or not.

func (*Float32MemoTable) GetNull

func (s *Float32MemoTable) GetNull() (int, bool)

GetNull returns the index of an inserted null or KeyNotFound along with a bool that will be true if found and false if not.

func (*Float32MemoTable) GetOrInsert

func (s *Float32MemoTable) GetOrInsert(val interface{}) (idx int, found bool, err error)

GetOrInsert will return the index of the specified value in the table, or insert the value into the table and return the new index. found indicates whether or not it already existed in the table (true) or was inserted by this call (false).

func (*Float32MemoTable) GetOrInsertNull

func (s *Float32MemoTable) GetOrInsertNull() (idx int, found bool)

GetOrInsertNull will return the index of the null entry or insert a null entry if one currently doesn't exist. The found value will be true if there was already a null in the table, and false if it inserted one.

func (*Float32MemoTable) Reset

func (s *Float32MemoTable) Reset()

Reset allows this table to be re-used by dumping all the data currently in the table.

func (*Float32MemoTable) Size

func (s *Float32MemoTable) Size() int

Size returns the number of elements currently in the table, including the null entry if one has been inserted.

func (*Float32MemoTable) WriteOut

func (s *Float32MemoTable) WriteOut(out []byte)

func (*Float32MemoTable) WriteOutSubset

func (s *Float32MemoTable) WriteOutSubset(start int, out []byte)

type Float64HashTable

type Float64HashTable struct {
	// contains filtered or unexported fields
}

Float64HashTable is a hashtable specifically for float64 that is utilized with the MemoTable to generalize interactions for easier implementation of dictionaries without losing performance.

func NewFloat64HashTable

func NewFloat64HashTable(cap uint64) *Float64HashTable

NewFloat64HashTable returns a new hash table for float64 values initialized with the passed-in capacity or 32, whichever is larger.

func (*Float64HashTable) CopyValues

func (h *Float64HashTable) CopyValues(out []float64)

CopyValues copies the values out of the hash table into the passed-in slice, in the order that they were first inserted.

func (*Float64HashTable) CopyValuesSubset

func (h *Float64HashTable) CopyValuesSubset(start int, out []float64)

CopyValuesSubset copies a subset of the values in the hashtable out, starting with the value at start, in the order that they were inserted.

func (*Float64HashTable) Insert

func (h *Float64HashTable) Insert(e *entryFloat64, v uint64, val float64, memoIdx int32) error

Insert updates the given entry with the provided hash value, payload value and memo index. The entry pointer must have been retrieved via lookup in order to actually insert properly.

func (*Float64HashTable) Lookup

func (h *Float64HashTable) Lookup(v uint64, cmp func(float64) bool) (*entryFloat64, bool)

Lookup retrieves the entry for a given hash value, assuming its payload value returns true when passed to the cmp func. It returns a pointer to the entry for the given hash value and a boolean indicating whether it was found. It is not safe to use the pointer if the bool is false.

func (*Float64HashTable) Reset

func (h *Float64HashTable) Reset(cap uint64)

Reset drops all of the values in this hash table and re-initializes it with the specified initial capacity as if by calling New, but without having to reallocate the object.

func (*Float64HashTable) VisitEntries

func (h *Float64HashTable) VisitEntries(visit func(*entryFloat64))

VisitEntries will call the passed in function on each *valid* entry in the hash table, a valid entry being one which has had a value inserted into it.

func (*Float64HashTable) WriteOut

func (h *Float64HashTable) WriteOut(out []byte)

func (*Float64HashTable) WriteOutSubset

func (h *Float64HashTable) WriteOutSubset(start int, out []byte)

type Float64MemoTable

type Float64MemoTable struct {
	// contains filtered or unexported fields
}

Float64MemoTable is a wrapper over the appropriate hashtable to provide an interface conforming to the MemoTable interface defined in the encoding package for general interactions regarding dictionaries.

func NewFloat64MemoTable

func NewFloat64MemoTable(num int64) *Float64MemoTable

NewFloat64MemoTable returns a new memotable with num entries pre-allocated to reduce further allocations when inserting.

func (*Float64MemoTable) CopyValues

func (s *Float64MemoTable) CopyValues(out interface{})

CopyValues will copy the values from the memo table out into the passed in slice which must be of the appropriate type.

func (*Float64MemoTable) CopyValuesSubset

func (s *Float64MemoTable) CopyValuesSubset(start int, out interface{})

CopyValuesSubset is like CopyValues but only copies a subset of values starting at the provided start index.

func (*Float64MemoTable) Get

func (s *Float64MemoTable) Get(val interface{}) (int, bool)

Get returns the index of the requested value in the hash table or KeyNotFound along with a boolean indicating if it was found or not.

func (*Float64MemoTable) GetNull

func (s *Float64MemoTable) GetNull() (int, bool)

GetNull returns the index of an inserted null or KeyNotFound along with a bool that will be true if found and false if not.

func (*Float64MemoTable) GetOrInsert

func (s *Float64MemoTable) GetOrInsert(val interface{}) (idx int, found bool, err error)

GetOrInsert will return the index of the specified value in the table, or insert the value into the table and return the new index. found indicates whether or not it already existed in the table (true) or was inserted by this call (false).

func (*Float64MemoTable) GetOrInsertNull

func (s *Float64MemoTable) GetOrInsertNull() (idx int, found bool)

GetOrInsertNull will return the index of the null entry or insert a null entry if one currently doesn't exist. The found value will be true if there was already a null in the table, and false if it inserted one.

func (*Float64MemoTable) Reset

func (s *Float64MemoTable) Reset()

Reset allows this table to be re-used by dumping all the data currently in the table.

func (*Float64MemoTable) Size

func (s *Float64MemoTable) Size() int

Size returns the number of elements currently in the table, including the null entry if one has been inserted.

func (*Float64MemoTable) WriteOut

func (s *Float64MemoTable) WriteOut(out []byte)

func (*Float64MemoTable) WriteOutSubset

func (s *Float64MemoTable) WriteOutSubset(start int, out []byte)

type Int32HashTable

type Int32HashTable struct {
	// contains filtered or unexported fields
}

Int32HashTable is a hashtable specifically for int32 that is utilized with the MemoTable to generalize interactions for easier implementation of dictionaries without losing performance.

func NewInt32HashTable

func NewInt32HashTable(cap uint64) *Int32HashTable

NewInt32HashTable returns a new hash table for int32 values initialized with the passed-in capacity or 32, whichever is larger.

func (*Int32HashTable) CopyValues

func (h *Int32HashTable) CopyValues(out []int32)

CopyValues copies the values out of the hash table into the passed-in slice, in the order that they were first inserted.

func (*Int32HashTable) CopyValuesSubset

func (h *Int32HashTable) CopyValuesSubset(start int, out []int32)

CopyValuesSubset copies a subset of the values in the hashtable out, starting with the value at start, in the order that they were inserted.

func (*Int32HashTable) Insert

func (h *Int32HashTable) Insert(e *entryInt32, v uint64, val int32, memoIdx int32) error

Insert updates the given entry with the provided hash value, payload value and memo index. The entry pointer must have been retrieved via lookup in order to actually insert properly.

func (*Int32HashTable) Lookup

func (h *Int32HashTable) Lookup(v uint64, cmp func(int32) bool) (*entryInt32, bool)

Lookup retrieves the entry for a given hash value, assuming its payload value returns true when passed to the cmp func. It returns a pointer to the entry for the given hash value and a boolean indicating whether it was found. It is not safe to use the pointer if the bool is false.

func (*Int32HashTable) Reset

func (h *Int32HashTable) Reset(cap uint64)

Reset drops all of the values in this hash table and re-initializes it with the specified initial capacity as if by calling New, but without having to reallocate the object.

func (*Int32HashTable) VisitEntries

func (h *Int32HashTable) VisitEntries(visit func(*entryInt32))

VisitEntries will call the passed in function on each *valid* entry in the hash table, a valid entry being one which has had a value inserted into it.

func (*Int32HashTable) WriteOut

func (h *Int32HashTable) WriteOut(out []byte)

func (*Int32HashTable) WriteOutSubset

func (h *Int32HashTable) WriteOutSubset(start int, out []byte)

type Int32MemoTable

type Int32MemoTable struct {
	// contains filtered or unexported fields
}

Int32MemoTable is a wrapper over the appropriate hashtable to provide an interface conforming to the MemoTable interface defined in the encoding package for general interactions regarding dictionaries.

func NewInt32MemoTable

func NewInt32MemoTable(num int64) *Int32MemoTable

NewInt32MemoTable returns a new memotable with num entries pre-allocated to reduce further allocations when inserting.

func (*Int32MemoTable) CopyValues

func (s *Int32MemoTable) CopyValues(out interface{})

CopyValues will copy the values from the memo table out into the passed in slice which must be of the appropriate type.

func (*Int32MemoTable) CopyValuesSubset

func (s *Int32MemoTable) CopyValuesSubset(start int, out interface{})

CopyValuesSubset is like CopyValues but only copies a subset of values starting at the provided start index.

func (*Int32MemoTable) Get

func (s *Int32MemoTable) Get(val interface{}) (int, bool)

Get returns the index of the requested value in the hash table or KeyNotFound along with a boolean indicating if it was found or not.

func (*Int32MemoTable) GetNull

func (s *Int32MemoTable) GetNull() (int, bool)

GetNull returns the index of an inserted null or KeyNotFound along with a bool that will be true if found and false if not.

func (*Int32MemoTable) GetOrInsert

func (s *Int32MemoTable) GetOrInsert(val interface{}) (idx int, found bool, err error)

GetOrInsert will return the index of the specified value in the table, or insert the value into the table and return the new index. found indicates whether or not it already existed in the table (true) or was inserted by this call (false).

func (*Int32MemoTable) GetOrInsertNull

func (s *Int32MemoTable) GetOrInsertNull() (idx int, found bool)

GetOrInsertNull will return the index of the null entry or insert a null entry if one currently doesn't exist. The found value will be true if there was already a null in the table, and false if it inserted one.

func (*Int32MemoTable) Reset

func (s *Int32MemoTable) Reset()

Reset allows this table to be re-used by dumping all the data currently in the table.

func (*Int32MemoTable) Size

func (s *Int32MemoTable) Size() int

Size returns the number of elements currently in the table, including the null entry if one has been inserted.

func (*Int32MemoTable) WriteOut

func (s *Int32MemoTable) WriteOut(out []byte)

func (*Int32MemoTable) WriteOutSubset

func (s *Int32MemoTable) WriteOutSubset(start int, out []byte)

type Int64HashTable

type Int64HashTable struct {
	// contains filtered or unexported fields
}

Int64HashTable is a hashtable specifically for int64 that is utilized with the MemoTable to generalize interactions for easier implementation of dictionaries without losing performance.

func NewInt64HashTable

func NewInt64HashTable(cap uint64) *Int64HashTable

NewInt64HashTable returns a new hash table for int64 values initialized with the passed-in capacity or 32, whichever is larger.

func (*Int64HashTable) CopyValues

func (h *Int64HashTable) CopyValues(out []int64)

CopyValues copies the values out of the hash table into the passed-in slice, in the order that they were first inserted.

func (*Int64HashTable) CopyValuesSubset

func (h *Int64HashTable) CopyValuesSubset(start int, out []int64)

CopyValuesSubset copies a subset of the values in the hashtable out, starting with the value at start, in the order that they were inserted.

func (*Int64HashTable) Insert

func (h *Int64HashTable) Insert(e *entryInt64, v uint64, val int64, memoIdx int32) error

Insert updates the given entry with the provided hash value, payload value and memo index. The entry pointer must have been retrieved via lookup in order to actually insert properly.

func (*Int64HashTable) Lookup

func (h *Int64HashTable) Lookup(v uint64, cmp func(int64) bool) (*entryInt64, bool)

Lookup retrieves the entry for a given hash value, assuming its payload value returns true when passed to the cmp func. It returns a pointer to the entry for the given hash value and a boolean indicating whether it was found. It is not safe to use the pointer if the bool is false.

func (*Int64HashTable) Reset

func (h *Int64HashTable) Reset(cap uint64)

Reset drops all of the values in this hash table and re-initializes it with the specified initial capacity as if by calling New, but without having to reallocate the object.

func (*Int64HashTable) VisitEntries

func (h *Int64HashTable) VisitEntries(visit func(*entryInt64))

VisitEntries will call the passed in function on each *valid* entry in the hash table, a valid entry being one which has had a value inserted into it.

func (*Int64HashTable) WriteOut

func (h *Int64HashTable) WriteOut(out []byte)

func (*Int64HashTable) WriteOutSubset

func (h *Int64HashTable) WriteOutSubset(start int, out []byte)

type Int64MemoTable

type Int64MemoTable struct {
	// contains filtered or unexported fields
}

Int64MemoTable is a wrapper over the appropriate hashtable to provide an interface conforming to the MemoTable interface defined in the encoding package for general interactions regarding dictionaries.

func NewInt64MemoTable

func NewInt64MemoTable(num int64) *Int64MemoTable

NewInt64MemoTable returns a new memotable with num entries pre-allocated to reduce further allocations when inserting.

func (*Int64MemoTable) CopyValues

func (s *Int64MemoTable) CopyValues(out interface{})

CopyValues will copy the values from the memo table out into the passed in slice which must be of the appropriate type.

func (*Int64MemoTable) CopyValuesSubset

func (s *Int64MemoTable) CopyValuesSubset(start int, out interface{})

CopyValuesSubset is like CopyValues but only copies a subset of values starting at the provided start index.

func (*Int64MemoTable) Get

func (s *Int64MemoTable) Get(val interface{}) (int, bool)

Get returns the index of the requested value in the hash table or KeyNotFound along with a boolean indicating if it was found or not.

func (*Int64MemoTable) GetNull

func (s *Int64MemoTable) GetNull() (int, bool)

GetNull returns the index of an inserted null or KeyNotFound along with a bool that will be true if found and false if not.

func (*Int64MemoTable) GetOrInsert

func (s *Int64MemoTable) GetOrInsert(val interface{}) (idx int, found bool, err error)

GetOrInsert will return the index of the specified value in the table, or insert the value into the table and return the new index. found indicates whether or not it already existed in the table (true) or was inserted by this call (false).

func (*Int64MemoTable) GetOrInsertNull

func (s *Int64MemoTable) GetOrInsertNull() (idx int, found bool)

GetOrInsertNull will return the index of the null entry or insert a null entry if one currently doesn't exist. The found value will be true if there was already a null in the table, and false if it inserted one.

func (*Int64MemoTable) Reset

func (s *Int64MemoTable) Reset()

Reset allows this table to be re-used by dumping all the data currently in the table.

func (*Int64MemoTable) Size

func (s *Int64MemoTable) Size() int

Size returns the number of elements currently in the table, including the null entry if one has been inserted.

func (*Int64MemoTable) WriteOut

func (s *Int64MemoTable) WriteOut(out []byte)

func (*Int64MemoTable) WriteOutSubset

func (s *Int64MemoTable) WriteOutSubset(start int, out []byte)
