Documentation

Overview

    This package implements a set of utility hash functions.

    Index

    Constants

    This section is empty.

    Variables

    This section is empty.

    Functions

    func ComputeMd5Checksum

    func ComputeMd5Checksum(data []byte) []byte

      This returns the data's MD5 checksum.

      WARNING: Do NOT Use MD5 in security contexts (defending against intentional manipulations of data from untrusted sources); use only for checking data integrity against machine errors.

      func ConsistentHash

      func ConsistentHash(key uint64, numShards uint16) uint16

        ConsistentHash is a space efficient permutation-based consistent hashing function. This implementation supports up to a maximum of (1 << 16 - 1), 65535, number of shards.

        Implementation details:

        Unlike the standard ring-based algorithm (e.g., as described in dynamo db), this algorithm relays on shard permutations to determine the key's shard mapping. The idea is as follow:

        1. Assume there exist a set of shard ids, S, which contains every possible
           shard ids in the universe (in this case 0 .. 65535).
        2. Now suppose, A (a subset of S), is the set of available shard ids, and we
           want to find the shard mapping for key, K
        3. Use K as the pseudorandom generator's seed, and generate a random
           permutation of S using variable-base permutation encoding (see
           http://stackoverflow.com/questions/1506078/fast-permutation-number-permutation-mapping-algorithms
           for additional details)
        4. Ignore all shard ids in the permutation that are not in set A
        5. Finally, use the first shard id as K's shard mapping.
        

        NOTE: Because each key generates a different permutation, the data distribution is generally more uniform than the standard algorithm (The standard algorithm works around this issue by adding more points to the ring, which unfortunately uses even more memory).

        Complexity: this algorithm is O(1) in theory (because the max shard id is known), but O(n) in practice.

        Example:

        1. Assume S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, and A = {0, 1, 2, 3, 4}.
        2. Now suppose K = 31415 and perm(S, K) = (3, 1, 9, 4, 7, 5, 8, 2, 0, 6).
        3. After ignoring S - A, the remaining ids are (3, 1, 4, 2, 0)
        4. Therefore, the key belongs to shard 3.
        

        func ValidateMd5Checksum

        func ValidateMd5Checksum(data []byte, sum []byte) bool

          This returns true iff the data matches the provided checksum.

          Types

          This section is empty.