hash2

package
Version: v0.0.0-...-df3943b
Warning

This package is not in the latest version of its module.

Published: Jul 8, 2014 License: BSD-3-Clause Imports: 2 Imported by: 0

Documentation

Overview

This package implements a set of utility hash functions.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ComputeMd5Checksum

func ComputeMd5Checksum(data []byte) []byte

This returns the data's MD5 checksum.

WARNING: Do NOT use MD5 in security contexts (i.e., when defending against intentional manipulation of data from untrusted sources); use it only to check data integrity against machine errors.
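
As a rough illustration of what a function with this signature might do, here is a minimal sketch that assumes it simply wraps the standard library's crypto/md5. The name computeMd5Checksum is a hypothetical stand-in, not the package's actual source.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// computeMd5Checksum is a hypothetical stand-in for ComputeMd5Checksum.
// Assumption: the real function is a thin wrapper around crypto/md5.
func computeMd5Checksum(data []byte) []byte {
	sum := md5.Sum(data) // md5.Sum returns a [16]byte array
	return sum[:]        // convert the array to a slice
}

func main() {
	// Prints 5d41402abc4b2a76b9719d911017c592
	fmt.Printf("%x\n", computeMd5Checksum([]byte("hello")))
}
```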

func ConsistentHash

func ConsistentHash(key uint64, numShards uint16) uint16

A space-efficient, permutation-based consistent hashing function. This implementation supports at most (1 << 16) - 1 = 65535 shards.

Implementation details:

Unlike the standard ring-based algorithm (e.g., as described for Dynamo), this algorithm relies on shard permutations to determine the key's shard mapping. The idea is as follows:

1. Assume there exists a set of shard ids, S, which contains every possible
   shard id in the universe (in this case 0 .. 65535).
2. Now suppose A (a subset of S) is the set of available shard ids, and we
   want to find the shard mapping for key K.
3. Use K as the pseudorandom generator's seed, and generate a random
   permutation of S using variable-base permutation encoding (see
   http://stackoverflow.com/questions/1506078/fast-permutation-number-permutation-mapping-algorithms
   for additional details).
4. Ignore all shard ids in the permutation that are not in set A.
5. Finally, use the first remaining shard id as K's shard mapping.

NOTE: Because each key generates a different permutation, the data distribution is generally more uniform than with the standard algorithm. (The standard algorithm works around this issue by adding more points to the ring, which unfortunately uses even more memory.)

Complexity: this algorithm is O(1) in theory (because the maximum shard id is a fixed constant), but O(n) in practice.
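
The permute-then-filter idea described above can be sketched roughly as follows. This is not the package's implementation: it swaps the variable-base permutation encoding for math/rand's Perm, which materializes the whole permutation and is therefore not space efficient. The name permutationHash, the constant maxShards, and the assumption that the available set A is {0, ..., numShards-1} are all illustrative.

```go
package main

import (
	"fmt"
	"math/rand"
)

// maxShards is an assumed universe size |S| for this sketch.
const maxShards = 1 << 16

// permutationHash is a hypothetical illustration of the idea, assuming
// the available set A is {0, ..., numShards-1}.
func permutationHash(key uint64, numShards uint16) uint16 {
	if numShards == 0 {
		panic("numShards must be positive")
	}
	// Step 3: seed a PRNG with the key and permute S.
	rng := rand.New(rand.NewSource(int64(key)))
	perm := rng.Perm(maxShards)
	// Steps 4 and 5: skip ids outside A and return the first one left.
	for _, id := range perm {
		if id < int(numShards) {
			return uint16(id)
		}
	}
	panic("unreachable: A is a non-empty subset of S")
}

func main() {
	// Growing from 5 to 6 shards: a key either keeps its shard or moves
	// to the new shard 5, because its permutation does not change.
	moved := 0
	for key := uint64(0); key < 200; key++ {
		if permutationHash(key, 5) != permutationHash(key, 6) {
			moved++
		}
	}
	fmt.Printf("%d of 200 keys moved when growing from 5 to 6 shards\n", moved)
}
```

The linear scan over the permutation in this sketch is one way to see why the cost is bounded by a constant (the universe size is fixed) yet still grows with the amount of the permutation that must be examined.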

Example:

1. Assume S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, and A = {0, 1, 2, 3, 4}.
2. Now suppose K = 31415 and perm(S, K) = (3, 1, 9, 4, 7, 5, 8, 2, 0, 6).
3. After ignoring S - A, the remaining ids are (3, 1, 4, 2, 0).
4. Therefore, the key belongs to shard 3.
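
The filtering step of this example can be replayed in a few lines of Go. The helper firstAvailable is a hypothetical name used only for illustration; the permutation is hard-coded from the example rather than generated from K.

```go
package main

import "fmt"

// firstAvailable returns the first id in perm that is in the available
// set, and false if no id in perm is available.
func firstAvailable(perm []int, available map[int]bool) (int, bool) {
	for _, id := range perm {
		if available[id] {
			return id, true
		}
	}
	return 0, false
}

func main() {
	// perm(S, K) from the example, hard-coded.
	perm := []int{3, 1, 9, 4, 7, 5, 8, 2, 0, 6}
	available := map[int]bool{0: true, 1: true, 2: true, 3: true, 4: true}

	shard, _ := firstAvailable(perm, available)
	fmt.Println(shard) // 3

	// If shard 3 becomes unavailable, the same permutation yields the
	// next available id, so only keys that mapped to 3 are affected.
	delete(available, 3)
	shard, _ = firstAvailable(perm, available)
	fmt.Println(shard) // 1
}
```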

func ValidateMd5Checksum

func ValidateMd5Checksum(data []byte, sum []byte) bool

This returns true iff the data matches the provided checksum.
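
A minimal sketch of how such a check might look, assuming it recomputes the MD5 digest and compares it byte for byte with the supplied one. The names validateMd5Checksum and md5Of are hypothetical stand-ins, not the package's actual source.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"fmt"
)

// md5Of returns the MD5 digest of data as a slice (illustrative helper).
func md5Of(data []byte) []byte {
	sum := md5.Sum(data)
	return sum[:]
}

// validateMd5Checksum is a hypothetical stand-in for ValidateMd5Checksum.
// Assumption: validation means recomputing the digest and comparing it.
func validateMd5Checksum(data []byte, sum []byte) bool {
	return bytes.Equal(md5Of(data), sum)
}

func main() {
	payload := []byte("some payload")
	sum := md5Of(payload)

	fmt.Println(validateMd5Checksum(payload, sum))                 // true
	fmt.Println(validateMd5Checksum([]byte("corrupted data"), sum)) // false
}
```

As with the checksum itself, a comparison like this only guards against accidental corruption, not deliberate tampering.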

Types

This section is empty.
