Documentation
¶
Overview ¶
Package hashing provides utilities for and an implementation of a hash table which is more performant than the default go map implementation by leveraging xxh3 and some custom hash functions.
Index ¶
- Constants
- func Hash(b []byte, alg uint64) uint64
- type BinaryBuilderIFace
- type BinaryMemoTable
- func (b *BinaryMemoTable) CopyFixedWidthValues(start, width int, out []byte)
- func (b *BinaryMemoTable) CopyLargeOffsets(out []int64)
- func (b *BinaryMemoTable) CopyLargeOffsetsSubset(start int, out []int64)
- func (b *BinaryMemoTable) CopyOffsets(out []int32)
- func (b *BinaryMemoTable) CopyOffsetsSubset(start int, out []int32)
- func (b *BinaryMemoTable) CopyValues(out interface{})
- func (b *BinaryMemoTable) CopyValuesSubset(start int, out interface{})
- func (b *BinaryMemoTable) Exists(val []byte) bool
- func (b *BinaryMemoTable) Get(val interface{}) (int, bool)
- func (s *BinaryMemoTable) GetNull() (int, bool)
- func (b *BinaryMemoTable) GetOrInsert(val interface{}) (idx int, found bool, err error)
- func (b *BinaryMemoTable) GetOrInsertBytes(val []byte) (idx int, found bool, err error)
- func (b *BinaryMemoTable) GetOrInsertNull() (idx int, found bool)
- func (b *BinaryMemoTable) InsertOrGet(val []byte) (idx int, found bool, err error)
- func (b *BinaryMemoTable) Release()
- func (s *BinaryMemoTable) Reset()
- func (b *BinaryMemoTable) Retain()
- func (s *BinaryMemoTable) Size() int
- func (BinaryMemoTable) TypeTraits() TypeTraits
- func (b *BinaryMemoTable) Value(i int) []byte
- func (b *BinaryMemoTable) ValuesSize() int
- func (b *BinaryMemoTable) VisitValues(start int, visitFn func([]byte))
- func (b *BinaryMemoTable) WriteOut(out []byte)
- func (b *BinaryMemoTable) WriteOutSubset(start int, out []byte)
- type ByteSlice
- type Float32MemoTable
- type Float64MemoTable
- type HashTable
- func (h *HashTable[T]) CopyValues(out []T)
- func (h *HashTable[T]) CopyValuesSubset(start int, out []T)
- func (h *HashTable[T]) Insert(e *entry[T], v uint64, val T, memoIdx int32) error
- func (h *HashTable[T]) Lookup(v uint64, cmp func(T) bool) (*entry[T], bool)
- func (h *HashTable[T]) Reset(cap uint64)
- func (h *HashTable[T]) VisitEntries(visit func(*entry[T]))
- func (h *HashTable[T]) WriteOut(out []byte)
- func (h *HashTable[T]) WriteOutSubset(start int, out []byte)
- type Int16MemoTable
- type Int32MemoTable
- type Int64MemoTable
- type Int8MemoTable
- type MemoTable
- type MemoTypes
- type NumericMemoTable
- type Table
- func (t *Table[T]) CopyValues(out any)
- func (t *Table[T]) CopyValuesSubset(start int, out any)
- func (t *Table[T]) Exists(val T) bool
- func (t *Table[T]) Get(val any) (int, bool)
- func (t *Table[T]) GetNull() (int, bool)
- func (t *Table[T]) GetOrInsert(val any) (idx int, found bool, err error)
- func (t *Table[T]) GetOrInsertBytes(val []byte) (idx int, found bool, err error)
- func (t *Table[T]) GetOrInsertNull() (idx int, found bool)
- func (t *Table[T]) InsertOrGet(val T) (idx int, found bool, err error)
- func (t *Table[T]) Reset()
- func (t *Table[T]) Size() int
- func (t *Table[T]) TypeTraits() TypeTraits
- func (t *Table[T]) WriteOut(out []byte)
- func (t *Table[T]) WriteOutLE(out []byte)
- func (t *Table[T]) WriteOutSubset(start int, out []byte)
- func (t *Table[T]) WriteOutSubsetLE(start int, out []byte)
- type TypeTraits
- type TypedMemoTable
- type Uint16MemoTable
- type Uint32MemoTable
- type Uint64MemoTable
- type Uint8MemoTable
Constants ¶
const KeyNotFound = -1
KeyNotFound is the constant returned by memo table functions when a key isn't found in the table
Variables ¶
This section is empty.
Functions ¶
Types ¶
type BinaryBuilderIFace ¶
type BinaryMemoTable ¶
type BinaryMemoTable struct {
// contains filtered or unexported fields
}
BinaryMemoTable is our hashtable for binary data using the BinaryBuilder to construct the actual data in an easy to pass around way with minimal copies while using a hash table to keep track of the indexes into the dictionary that is created as we go.
func NewBinaryMemoTable ¶
func NewBinaryMemoTable(initial, valuesize int, bldr BinaryBuilderIFace) *BinaryMemoTable
NewBinaryMemoTable returns a hash table for Binary data, the passed in allocator will be utilized for the BinaryBuilder, if nil then memory.DefaultAllocator will be used. initial and valuesize can be used to pre-allocate the table to reduce allocations. With initial being the initial number of entries to allocate for and valuesize being the starting amount of space allocated for writing the actual binary data.
func (*BinaryMemoTable) CopyFixedWidthValues ¶
func (b *BinaryMemoTable) CopyFixedWidthValues(start, width int, out []byte)
CopyFixedWidthValues exists to cope with the fact that the table doesn't keep track of the fixed width when inserting the null value the databuffer holds a zero length byte slice for the null value (if found)
func (*BinaryMemoTable) CopyLargeOffsets ¶
func (b *BinaryMemoTable) CopyLargeOffsets(out []int64)
CopyLargeOffsets copies the list of offsets into the passed in slice, the offsets being the start and end values of the underlying allocated bytes in the builder for the individual values of the table. out should be at least sized to Size()+1
func (*BinaryMemoTable) CopyLargeOffsetsSubset ¶
func (b *BinaryMemoTable) CopyLargeOffsetsSubset(start int, out []int64)
CopyLargeOffsetsSubset is like CopyOffsets but instead of copying all of the offsets, it gets a subset of the offsets in the table starting at the index provided by "start".
func (*BinaryMemoTable) CopyOffsets ¶
func (b *BinaryMemoTable) CopyOffsets(out []int32)
CopyOffsets copies the list of offsets into the passed in slice, the offsets being the start and end values of the underlying allocated bytes in the builder for the individual values of the table. out should be at least sized to Size()+1
func (*BinaryMemoTable) CopyOffsetsSubset ¶
func (b *BinaryMemoTable) CopyOffsetsSubset(start int, out []int32)
CopyOffsetsSubset is like CopyOffsets but instead of copying all of the offsets, it gets a subset of the offsets in the table starting at the index provided by "start".
func (*BinaryMemoTable) CopyValues ¶
func (b *BinaryMemoTable) CopyValues(out interface{})
CopyValues copies the raw binary data bytes out, out should be a []byte with at least ValuesSize bytes allocated to copy into.
func (*BinaryMemoTable) CopyValuesSubset ¶
func (b *BinaryMemoTable) CopyValuesSubset(start int, out interface{})
CopyValuesSubset copies the raw binary data bytes out starting with the value at the index start, out should be a []byte with at least ValuesSize bytes allocated
func (*BinaryMemoTable) Exists ¶ added in v18.3.0
func (b *BinaryMemoTable) Exists(val []byte) bool
func (*BinaryMemoTable) Get ¶
func (b *BinaryMemoTable) Get(val interface{}) (int, bool)
Get returns the index of the specified value in the table or KeyNotFound, and a boolean indicating whether it was found in the table.
func (*BinaryMemoTable) GetNull ¶
func (s *BinaryMemoTable) GetNull() (int, bool)
GetNull returns the index of a null that has been inserted into the table or KeyNotFound. The bool returned will be true if there was a null inserted into the table, and false otherwise.
func (*BinaryMemoTable) GetOrInsert ¶
func (b *BinaryMemoTable) GetOrInsert(val interface{}) (idx int, found bool, err error)
func (*BinaryMemoTable) GetOrInsertBytes ¶
func (b *BinaryMemoTable) GetOrInsertBytes(val []byte) (idx int, found bool, err error)
GetOrInsertBytes returns the index of the given value in the table, if not found it is inserted into the table. The return value 'found' indicates whether the value was found in the table (true) or inserted (false) along with any possible error.
func (*BinaryMemoTable) GetOrInsertNull ¶
func (b *BinaryMemoTable) GetOrInsertNull() (idx int, found bool)
GetOrInsertNull retrieves the index of a null in the table or inserts null into the table, returning the index and a boolean indicating if it was found in the table (true) or was inserted (false).
func (*BinaryMemoTable) InsertOrGet ¶ added in v18.3.0
func (b *BinaryMemoTable) InsertOrGet(val []byte) (idx int, found bool, err error)
GetOrInsert returns the index of the given value in the table, if not found it is inserted into the table. The return value 'found' indicates whether the value was found in the table (true) or inserted (false) along with any possible error.
func (*BinaryMemoTable) Release ¶
func (b *BinaryMemoTable) Release()
Release is used to tell the underlying builder that it can release the memory allocated when the reference count reaches 0, this is safe to be called from multiple goroutines simultaneously
func (*BinaryMemoTable) Reset ¶
func (s *BinaryMemoTable) Reset()
Reset dumps all of the data in the table allowing it to be reutilized.
func (*BinaryMemoTable) Retain ¶
func (b *BinaryMemoTable) Retain()
Retain increases the ref count, it is safe to call it from multiple goroutines simultaneously.
func (*BinaryMemoTable) Size ¶
func (s *BinaryMemoTable) Size() int
Size returns the current size of the memo table including the null value if one has been inserted.
func (BinaryMemoTable) TypeTraits ¶
func (BinaryMemoTable) TypeTraits() TypeTraits
func (*BinaryMemoTable) Value ¶
func (b *BinaryMemoTable) Value(i int) []byte
func (*BinaryMemoTable) ValuesSize ¶
func (b *BinaryMemoTable) ValuesSize() int
ValuesSize returns the current total size of all the raw bytes that have been inserted into the memotable so far.
func (*BinaryMemoTable) VisitValues ¶
func (b *BinaryMemoTable) VisitValues(start int, visitFn func([]byte))
VisitValues exists to run the visitFn on each value currently in the hash table.
func (*BinaryMemoTable) WriteOut ¶
func (b *BinaryMemoTable) WriteOut(out []byte)
func (*BinaryMemoTable) WriteOutSubset ¶
func (b *BinaryMemoTable) WriteOutSubset(start int, out []byte)
type Float32MemoTable ¶
type Float64MemoTable ¶
type HashTable ¶ added in v18.4.0
type HashTable[T fixedLenMemoTypes] struct {
// contains filtered or unexported fields
}
func NewHashTable ¶ added in v18.4.0
func (*HashTable[T]) CopyValues ¶ added in v18.4.0
func (h *HashTable[T]) CopyValues(out []T)
func (*HashTable[T]) CopyValuesSubset ¶ added in v18.4.0
func (*HashTable[T]) VisitEntries ¶ added in v18.4.0
func (h *HashTable[T]) VisitEntries(visit func(*entry[T]))
func (*HashTable[T]) WriteOutSubset ¶ added in v18.4.0
type Int16MemoTable ¶
type Int32MemoTable ¶
type Int64MemoTable ¶
type Int8MemoTable ¶
type MemoTable ¶
type MemoTable interface { TypeTraits() TypeTraits // Reset drops everything in the table allowing it to be reused Reset() // Size returns the current number of unique values stored in // the table, including whether or not a null value has been // inserted via GetOrInsertNull. Size() int // CopyValues populates out with the values currently in the table, out must // be a slice of the appropriate type for the table type. CopyValues(out any) // CopyValuesSubset is like CopyValues but only copies a subset of values starting // at the indicated index. CopyValuesSubset(start int, out any) // Get returns the index of the table the specified value is, and a boolean indicating // whether or not the value was found in the table. Will panic if val is not the appropriate // type for the underlying table. Get(val interface{}) (int, bool) // GetOrInsert returns the index of the table the specified value is, // and a boolean indicating whether or not the value was found in // the table (if false, the value was inserted). An error is returned // if val is not the appropriate type for the table. GetOrInsert(val interface{}) (idx int, existed bool, err error) // GetOrInsertBytes returns the index of the table the specified value is, // and a boolean indicating whether or not the value was found in // the table (if false, the value was inserted). An error is returned // if val is not the appropriate type for the table. This function is intended to be used by // the BinaryMemoTable to prevent unnecessary allocations of the data when converting from a []byte to interface{}. GetOrInsertBytes(val []byte) (idx int, existed bool, err error) // GetOrInsertNull returns the index of the null value in the table, // inserting one if it hasn't already been inserted. It returns a boolean // indicating if the null value already existed or not in the table. GetOrInsertNull() (idx int, existed bool) // GetNull returns the index of the null value in the table, but does not // insert one if it doesn't already exist. Will return -1 if it doesn't exist // indicated by a false value for the boolean. GetNull() (idx int, exists bool) // WriteOut copies the unique values of the memotable out to the byte slice // provided. Must have allocated enough bytes for all the values. WriteOut(out []byte) // WriteOutSubset is like WriteOut, but only writes a subset of values // starting with the index offset. WriteOutSubset(offset int, out []byte) }
MemoTable interface for hash tables and dictionary encoding.
Values will remember the order they are inserted to generate a valid dictionary.
type NumericMemoTable ¶
type Table ¶ added in v18.4.0
type Table[T fixedLenMemoTypes] struct {
// contains filtered or unexported fields
}
func NewMemoTable ¶ added in v18.4.0
func (*Table[T]) CopyValues ¶ added in v18.4.0
func (*Table[T]) CopyValuesSubset ¶ added in v18.4.0
func (*Table[T]) GetOrInsert ¶ added in v18.4.0
func (*Table[T]) GetOrInsertBytes ¶ added in v18.4.0
func (*Table[T]) GetOrInsertNull ¶ added in v18.4.0
func (*Table[T]) InsertOrGet ¶ added in v18.4.0
func (*Table[T]) TypeTraits ¶ added in v18.4.0
func (t *Table[T]) TypeTraits() TypeTraits