dohash

package module
v1.0.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 13, 2024 License: BSD-3-Clause Imports: 5 Imported by: 1

README

dohash

Hashing any value

Imagine you have a type AType defined like:

type Blo = struct {
	n int
	a [5]byte
}
type AType = struct {
	p string
	arr [3]int
	z Blo
}

This type is comparable and can be used as key, for example, in a map[AType] string. If, however, you want to make your own hash table (for example, a map replacement but with different characteristics or guarantees like a cuckoo hash table), things can get complicated. The normal hash interfaces, like hash.Hash64 hash a slice of bytes which are written to it first (it implements the io.Writer interface) by calling the method Sum64. The method returns the calculated hash. How do we convert the AType to a slice of bytes to hash it? This is involved and slow to do by hand even if it is done only for that concrete type. The ReHash type wraps a regular hash which implements the hash.Hash64 interface making it capable to hash any comparable type. It provides the method SumObj64 to hash any comparable type. It is similar to writing the bytes with the data of the type and then calling Sum64, except you pass the type directly as parameter and it does the work of serializing the data itself. With it, you can hash the content of a variable of type AType directly. Even if you change the type, for example by adding new fields, as long as it stays comparable, the code does not need to change, enabling the construction of generic data types built on hashes.

Note that SumObj64 is not a superset of Sum64, because the former cannot be instantiated to the latter. Sum64 receives a slice of bytes as a parameter, which is not comparable.

There is also a version which hashes non-comparable types (only the subset that is comparable), ReSubHash.

A local copy of the godoc is here.

Generic hash: ReHash

Uses a hash.Hash64 interface to create a type which can hash any comparable value. The hash is non-portable but is implemented trying not to compromise performance. Note that the generic interface itself will have a performance cost.

In contrast to https://github.com/dolthub/maphash it does not obtain the hasher by reflecting on the runtime, but builds one itself by reflecting on the data type, so it is capable of using any hash function provided by the user. The Generic Maphash version which is wired to hash/maphash is more directly comparable with it.

Example:

var s AType	//  AType exists and is comparable
var mh maphash.Hash
hasher := dohash.NewReHash[AType](&mh)
fmt.Printf("hash: %x\n", hasher.SumObj64(s))
hasher.Reset()

The hasher can also be constructed from an example:

var s AType	//  AType exists and is comparable
var mh maphash.Hash
hasher := dohash.NewReHashFromExample(&mh, s)
fmt.Printf("hash: %x\n", hasher.SumObj64(s))
hasher.Reset()

Generic hash for not comparable types: ReSubHash

Behaves exactly like ReHash, but the method SumSubObj64 can deal with any data type, but it will only hash the subset of the data type that is comparable, ignoring the rest (use with care).

Generic maphash: MapHash

Similar to the above but faster because it is hardwired to hash/maphash.

Examples:

var s AType	//  AType exists and is comparable
var mh maphash.Hash
hasher := dohash.NewMapHash[AType]()
//  may call SetSeed or any method of MapHash on hasher
fmt.Printf("hash: %x\n", hasher.SumObj64(s))
hasher.Reset()

or

var s AType	//  AType exists and is comparable
var mh maphash.Hash
hasher := dohash.NewMapHashFromExample(s)
//  may call SetSeed or any method of MapHash on hasher
fmt.Printf("hash: %x\n", hasher.SumObj64(s))
hasher.Reset()

Documentation

Overview

Package to make a hash.Hash64 interface able to hash any value. For comparable types use ReHash and for non comparable types, use ReSubHash, which will be able to hash the comparable subset. All of the types calculate the hash in a non-portable way, that is, the hash obtained may be different on different architectures/systems (this is useful, for example, to write datatypes, like a hash table). It is written to be fast but in clean portable, non-fragile go. An slighly faster version hardwired to hash/maphash is also included. An example of client of this package to implement a cuckoo hash table can be found in cuckoo.

Example
package main

import (
	"fmt"
	"gitlab.eif.urjc.es/paurea/dohash"
	"hash/maphash"
)

func main() {
	type inner = struct {
		p string
		o byte
	}
	type outer = struct {
		i  int
		s  string
		in inner
	}
	s := [2]outer{{2, "bla", inner{"oooo", 'c'}}, {2, "bla", inner{"oooo", 'c'}}}

	var mh maphash.Hash
	dh := dohash.NewReHash[[2]outer](&mh)

	fmt.Printf("%x", dh.SumObj64(s))
	dh.Reset()
}
Output:

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Hasher64

type Hasher64[K comparable] interface {
	SumObj64(key K) (s uint64)
}

Hashes any comparable type.

type MapHash

type MapHash[K comparable] struct {
	maphash.Hash
	// contains filtered or unexported fields
}

Wrapper for hash/maphash.Hash capable of hashing values of type K.

Example
package main

import (
	"fmt"
	"gitlab.eif.urjc.es/paurea/dohash"
)

func main() {
	type inner = struct {
		p string
		o byte
	}
	type outer = struct {
		i  int
		s  string
		in inner
	}
	s := [2]outer{{2, "bla", inner{"oooo", 'c'}}, {2, "bla", inner{"oooo", 'c'}}}

	hm := dohash.NewMapHashFromExample(s)

	fmt.Printf("%x", hm.SumObj64(s))
	hm.Reset()
}
Output:

func NewMapHash

func NewMapHash[K comparable]() (hm *MapHash[K])

Create a MapHash capable of hashing values of type K.

func NewMapHashFromExample

func NewMapHashFromExample[K comparable](val K) (hm *MapHash[K])

Create a MapHash from an example value capable of hashing values of the same type K. The value will only have its type inspected.

func (*MapHash[K]) SumObj64

func (hm *MapHash[K]) SumObj64(key K) (s uint64)

SumObj64 hashes the key and returns the current 64-bit value, which depends on hm's seed and the sequence of bytes added to h since the last call to Hash.Reset or Hash.SetSeed.

type ReHash

type ReHash[K comparable] struct {
	hash.Hash64
	// contains filtered or unexported fields
}

Wraps a hash.Hash64 adding the information needed to be able to provide generic methods which hash any object of type K which has to be comparable.

func NewReHash

func NewReHash[K comparable](h hash.Hash64) (rh *ReHash[K])

Create a ReHash capable of hashing values of type K.

func NewReHashFromExample

func NewReHashFromExample[K comparable](h hash.Hash64, val K) (rh *ReHash[K])

Create a ReHash from an example value capable of hashing values of the same type K. The value will only have its type inspected.

func (*ReHash[K]) SumObj64

func (rh *ReHash[K]) SumObj64(key K) (s uint64)

SumObj64 hashes the key and returns the current 64-bit value, and updates its state. The semantics are similar to that of the underlying hash when hash/Sum64 is called.

type ReSubHash

type ReSubHash[K any] struct {
	hash.Hash64
	// contains filtered or unexported fields
}

Wraps a hash.Hash64 adding the information needed to be able to provide generic methods which hash any object of type K ignoring the parts which are not comparable

func NewSubHash

func NewSubHash[K any](h hash.Hash64) (rh *ReSubHash[K])

Create a ReHash capable of hashing values of type K. The non-comparable fields (even in the subfields of the fields) will be ignored.

func NewSubHashFromExample

func NewSubHashFromExample[K any](h hash.Hash64, val K) (rh *ReSubHash[K])

Create a ReSubHash from an example value capable of hashing values of the same type K. The non-comparable fields (even in the subfields of the fields) will be ignored. The value will only have its type inspected.

func (*ReSubHash[K]) SumSubObj64

func (rh *ReSubHash[K]) SumSubObj64(key K) (s uint64)

Calculate the hash of any type K. See hash.go for details on how this works. The only difference is that this method will ignore any non-comparable fields.

type SubHasher64

type SubHasher64[K any] interface {
	SumSubObj64(key K) (s uint64)
}

Hashes the comparable subset of any type. The non-comparable fields will be ignored, in the fields of the type and in the fields of the fields recursively.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL