blake2b

Version: v1.7.1 (this package is not in the latest version of its module)
Published: Jan 29, 2021 License: MIT, Apache-2.0 Imports: 3 Imported by: 0

README

BLAKE2b-SIMD

Pure Go implementation of BLAKE2b using SIMD optimizations.

Introduction

This package was initially based on the pure Go BLAKE2b implementation of Dmitry Chestnykh and merged with the (cgo-dependent) AVX-optimized BLAKE2 implementation, which in turn is based on the official implementation. It uses Go's assembler for amd64 architectures, with a Go-only fallback for other architectures.

In addition to AVX there is also support for AVX2 as well as SSE. Best performance is obtained with AVX2, which gives roughly a 4x performance increase, approaching hashing speeds of 1 GB/s on a single core.

Benchmarks

This is a summary of the performance improvements over the pure Go implementation. Full details are shown below.

Technology    Speedup (128K input)
AVX2          3.94x
AVX           3.28x
SSE           2.85x

asm2plan9s

To make it easier to work with AVX2/AVX instructions, a separate tool was developed that converts AVX2/AVX instructions into the corresponding BYTE sequences accepted by Go assembly. See asm2plan9s for more information.

bt2sum

bt2sum is a utility that takes advantage of the BLAKE2b SIMD optimizations to compute checksums using the BLAKE2 Tree hashing mode in so-called 'unlimited fanout' mode.

Technical details

BLAKE2b is a hashing algorithm that operates on 64-bit integer values. The AVX2 version uses the 256-bit wide YMM registers to process four operations in parallel. AVX and SSE use the 128-bit wide XMM registers (two operations in parallel). Below are excerpts from compressAvx2_amd64.s, compressAvx_amd64.s, and compress_generic.go respectively.

    // compressAvx2_amd64.s
    VPADDQ  YMM0,YMM0,YMM1   /* v0 += v4, v1 += v5, v2 += v6, v3 += v7 */

    // compressAvx_amd64.s
    VPADDQ  XMM0,XMM0,XMM2   /* v0 += v4, v1 += v5 */
    VPADDQ  XMM1,XMM1,XMM3   /* v2 += v6, v3 += v7 */

    // compress_generic.go
    v0 += v4
    v1 += v5
    v2 += v6
    v3 += v7
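
The additions above are the first step of the BLAKE2b mixing function G, which is applied to four columns of the state independently. The following is a minimal sketch of that idea in plain Go, not the package's code: `vec4`, `add`, `xorRotr`, `gLanes` and `gScalar` are hypothetical names, and the G steps and rotation counts (32, 24, 16, 63) are taken from RFC 7693. It models one YMM register as four 64-bit lanes and checks that the lane-wise form matches the scalar form.

```go
package main

import (
	"fmt"
	"math/bits"
)

// vec4 models one 256-bit YMM register as four 64-bit lanes.
// Illustration only; the package does this work in assembly.
type vec4 [4]uint64

// add models VPADDQ: four independent wrapping 64-bit additions.
func add(a, b vec4) vec4 {
	for i := range a {
		a[i] += b[i]
	}
	return a
}

// xorRotr XORs two vectors lane by lane, then rotates each lane right by n bits.
func xorRotr(a, b vec4, n int) vec4 {
	for i := range a {
		a[i] = bits.RotateLeft64(a[i]^b[i], -n)
	}
	return a
}

// gLanes applies the mixing function G (RFC 7693) to four columns at once.
func gLanes(a, b, c, d, x, y vec4) (vec4, vec4, vec4, vec4) {
	a = add(add(a, b), x)
	d = xorRotr(d, a, 32)
	c = add(c, d)
	b = xorRotr(b, c, 24)
	a = add(add(a, b), y)
	d = xorRotr(d, a, 16)
	c = add(c, d)
	b = xorRotr(b, c, 63)
	return a, b, c, d
}

// gScalar is the same function on a single column, as generic Go code would do it.
func gScalar(a, b, c, d, x, y uint64) (uint64, uint64, uint64, uint64) {
	a = a + b + x
	d = bits.RotateLeft64(d^a, -32)
	c = c + d
	b = bits.RotateLeft64(b^c, -24)
	a = a + b + y
	d = bits.RotateLeft64(d^a, -16)
	c = c + d
	b = bits.RotateLeft64(b^c, -63)
	return a, b, c, d
}

func main() {
	a, b := vec4{1, 2, 3, 4}, vec4{5, 6, 7, 8}
	c, d := vec4{9, 10, 11, 12}, vec4{13, 14, 15, 16}
	x, y := vec4{17, 18, 19, 20}, vec4{21, 22, 23, 24}

	la, lb, lc, ld := gLanes(a, b, c, d, x, y)
	for i := 0; i < 4; i++ {
		sa, sb, sc, sd := gScalar(a[i], b[i], c[i], d[i], x[i], y[i])
		fmt.Println("lane", i, "matches scalar:", la[i] == sa && lb[i] == sb && lc[i] == sc && ld[i] == sd)
	}
}
```

Because no lane depends on another within a step, each `add` or `xorRotr` call corresponds to work that a single SIMD instruction (or two, with 128-bit registers) can perform.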

Detailed benchmarks

Example performance metrics were generated on an Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz (6 physical cores, 12 logical cores) running Ubuntu GNU/Linux with kernel version 4.4.0-24-generic (vanilla with no optimizations).

AVX2
$ benchcmp go.txt avx2.txt
benchmark                old ns/op     new ns/op     delta
BenchmarkHash64-12       1481          849           -42.67%
BenchmarkHash128-12      1428          746           -47.76%
BenchmarkHash1K-12       6379          2227          -65.09%
BenchmarkHash8K-12       37219         11714         -68.53%
BenchmarkHash32K-12      140716        35935         -74.46%
BenchmarkHash128K-12     561656        142634        -74.60%

benchmark                old MB/s     new MB/s     speedup
BenchmarkHash64-12       43.20        75.37        1.74x
BenchmarkHash128-12      89.64        171.35       1.91x
BenchmarkHash1K-12       160.52       459.69       2.86x
BenchmarkHash8K-12       220.10       699.32       3.18x
BenchmarkHash32K-12      232.87       911.85       3.92x
BenchmarkHash128K-12     233.37       918.93       3.94x

AVX2: Comparison to other hashing techniques
$ go test -bench=Comparison
BenchmarkComparisonMD5-12    	    1000	   1726121 ns/op	 607.48 MB/s
BenchmarkComparisonSHA1-12   	     500	   2005164 ns/op	 522.94 MB/s
BenchmarkComparisonSHA256-12 	     300	   5531036 ns/op	 189.58 MB/s
BenchmarkComparisonSHA512-12 	     500	   3423030 ns/op	 306.33 MB/s
BenchmarkComparisonBlake2B-12	    1000	   1232690 ns/op	 850.64 MB/s

Benchmarks below were generated on a MacBook Pro with a 2.7 GHz Intel Core i7.

AVX
$ benchcmp go.txt avx.txt
benchmark               old ns/op     new ns/op     delta
BenchmarkHash64-8       813           458           -43.67%
BenchmarkHash128-8      766           401           -47.65%
BenchmarkHash1K-8       4881          1763          -63.88%
BenchmarkHash8K-8       36127         12273         -66.03%
BenchmarkHash32K-8      140582        43155         -69.30%
BenchmarkHash128K-8     567850        173246        -69.49%

benchmark               old MB/s     new MB/s     speedup
BenchmarkHash64-8       78.63        139.57       1.78x
BenchmarkHash128-8      166.98       318.73       1.91x
BenchmarkHash1K-8       209.76       580.68       2.77x
BenchmarkHash8K-8       226.76       667.46       2.94x
BenchmarkHash32K-8      233.09       759.29       3.26x
BenchmarkHash128K-8     230.82       756.56       3.28x

SSE
$ benchcmp go.txt sse.txt
benchmark               old ns/op     new ns/op     delta
BenchmarkHash64-8       813           478           -41.21%
BenchmarkHash128-8      766           411           -46.34%
BenchmarkHash1K-8       4881          1870          -61.69%
BenchmarkHash8K-8       36127         12427         -65.60%
BenchmarkHash32K-8      140582        49512         -64.78%
BenchmarkHash128K-8     567850        199040        -64.95%

benchmark               old MB/s     new MB/s     speedup
BenchmarkHash64-8       78.63        133.78       1.70x
BenchmarkHash128-8      166.98       311.23       1.86x
BenchmarkHash1K-8       209.76       547.37       2.61x
BenchmarkHash8K-8       226.76       659.20       2.91x
BenchmarkHash32K-8      233.09       661.81       2.84x
BenchmarkHash128K-8     230.82       658.52       2.85x
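
The delta, MB/s and speedup columns in these tables follow from the ns/op figures and the input size. As a rough check, the sketch below recomputes them for the BenchmarkHash128K-12 row of the AVX2 table; the helper names are hypothetical, and 1 MB is taken as 10^6 bytes, which is the convention of Go's benchmark output.

```go
package main

import "fmt"

// throughputMBps converts bytes hashed per operation and ns/op into MB/s,
// with 1 MB = 1e6 bytes. Bytes per nanosecond equals GB/s, hence the factor 1e3.
func throughputMBps(bytesPerOp int, nsPerOp float64) float64 {
	return float64(bytesPerOp) / nsPerOp * 1e3
}

// speedup returns how many times faster the new timing is than the old one.
func speedup(oldNs, newNs float64) float64 {
	return oldNs / newNs
}

// deltaPercent returns the relative change in ns/op as a percentage.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

func main() {
	// BenchmarkHash128K-12, AVX2 table: 561656 ns/op (Go) vs 142634 ns/op (AVX2).
	const size = 128 * 1024
	const oldNs, newNs = 561656.0, 142634.0

	fmt.Printf("old:     %.2f MB/s\n", throughputMBps(size, oldNs))
	fmt.Printf("new:     %.2f MB/s\n", throughputMBps(size, newNs))
	fmt.Printf("delta:   %.2f%%\n", deltaPercent(oldNs, newNs))
	fmt.Printf("speedup: %.2fx\n", speedup(oldNs, newNs))
}
```

The recomputed values agree with the table to within rounding of the published ns/op figures.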

License

Released under the Apache License v2.0. You can find the complete text in the file LICENSE.

Contributing

Contributions are welcome; please send PRs for any enhancements.

Documentation

Overview

Package blake2b implements the BLAKE2b cryptographic hash function.

Constants

const (
	BlockSize  = 128 // block size of algorithm
	Size       = 64  // maximum digest size
	SaltSize   = 16  // maximum salt size
	PersonSize = 16  // maximum personalization string size
	KeySize    = 64  // maximum size of key
)

Functions

func New

func New(c *Config) (hash.Hash, error)

New returns a new hash.Hash configured with the given Config. Config can be nil, in which case the default configuration is used, which calculates a 64-byte digest. New returns a non-nil error if Config contains invalid parameters.

func New256

func New256() hash.Hash

New256 returns a new hash.Hash computing the BLAKE2b 32-byte checksum.

func New512

func New512() hash.Hash

New512 returns a new hash.Hash computing the BLAKE2b 64-byte checksum.

func NewMAC

func NewMAC(outBytes uint8, key []byte) hash.Hash

NewMAC returns a new hash.Hash computing a BLAKE2b prefix-MAC (Message Authentication Code) of the given size in bytes (up to 64) with the given key (up to 64 bytes in length).

func Sum256

func Sum256(data []byte) (out [32]byte)

Sum256 returns a 32-byte BLAKE2b hash of data.

func Sum512

func Sum512(data []byte) [64]byte

Sum512 returns a 64-byte BLAKE2b hash of data.

Types

type Config

type Config struct {
	Size   uint8  // digest size (if zero, default size of 64 bytes is used)
	Key    []byte // key for prefix-MAC
	Salt   []byte // salt (if < 16 bytes, padded with zeros)
	Person []byte // personalization (if < 16 bytes, padded with zeros)
	Tree   *Tree  // parameters for tree hashing
}

Config is used to configure hash function parameters and keying. All parameters are optional.
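
The field comments and package constants together imply a few rules: a zero Size means 64 bytes, Key may be at most 64 bytes, and Salt and Person may be at most 16 bytes and are zero-padded when shorter. The sketch below illustrates those rules under stated assumptions; `params`, `check`, `effectiveSize` and `pad16` are hypothetical names, and the package's actual validation in New is not shown on this page and may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Maximum sizes, copied from the package constants.
const (
	maxSize   = 64
	maxSalt   = 16
	maxPerson = 16
	maxKey    = 64
)

// params is a hypothetical stand-in for Config, without the Tree field.
type params struct {
	Size   uint8
	Key    []byte
	Salt   []byte
	Person []byte
}

// check returns an error if any field exceeds its documented maximum.
func check(p params) error {
	switch {
	case p.Size > maxSize:
		return errors.New("digest size too large")
	case len(p.Key) > maxKey:
		return errors.New("key too long")
	case len(p.Salt) > maxSalt:
		return errors.New("salt too long")
	case len(p.Person) > maxPerson:
		return errors.New("personalization too long")
	}
	return nil
}

// effectiveSize applies the documented default: zero means 64 bytes.
func effectiveSize(size uint8) uint8 {
	if size == 0 {
		return maxSize
	}
	return size
}

// pad16 copies b into a 16-byte array, leaving the remaining bytes zero.
// Callers are expected to have validated len(b) <= 16 first.
func pad16(b []byte) [16]byte {
	var out [16]byte
	copy(out[:], b)
	return out
}

func main() {
	p := params{Size: 32, Salt: []byte("salt")}
	if err := check(p); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("digest size:", effectiveSize(p.Size))
	fmt.Println("padded salt:", pad16(p.Salt))
}
```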

type Tree

type Tree struct {
	Fanout        uint8  // fanout
	MaxDepth      uint8  // maximal depth
	LeafSize      uint32 // leaf maximal byte length (0 for unlimited)
	NodeOffset    uint64 // node offset (0 for first, leftmost or leaf)
	NodeDepth     uint8  // node depth (0 for leaves)
	InnerHashSize uint8  // inner hash byte length
	IsLastNode    bool   // indicates processing of the last node of layer
}

Tree represents parameters for tree hashing.
