sha256

v0.0.0-...-c378aaa | Published: Apr 3, 2018 | License: Apache-2.0 | Imports: 2 | Imported by: 6

Repository

github.com/1lann/sha256-simd

README

sha256-simd

Accelerate SHA256 computations in pure Go for both Intel (AVX2, AVX, SSE) and ARM (arm64) platforms.

This is a fork of https://github.com/minio/sha256-simd. Changes made:

Optimise for Krist mining by adding comparison operations for parts of the hash against the work.

Introduction

This package is designed as a drop-in replacement for crypto/sha256. For Intel CPUs it has three flavors (AVX2, AVX and SSE), and the fastest one is chosen automatically depending on CPU capabilities. For ARM CPUs with the Cryptography Extensions, it takes advantage of the SHA2 instructions, resulting in a massive performance improvement (about 105x on the Cortex-A53 measured below).
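As a minimal sketch of what "drop-in replacement" means in practice, the example below hashes a byte slice with Sum256. It is written against the standard library's crypto/sha256 so that it runs anywhere; the assumption is that changing the import path to github.com/1lann/sha256-simd leaves the rest of the code unchanged. The helper name hexDigest is hypothetical and not part of either package.

```go
package main

import (
	// Assumption: replacing this import with "github.com/1lann/sha256-simd"
	// should select the accelerated implementation without other changes.
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hexDigest is a hypothetical helper that returns the hex-encoded
// SHA256 checksum of data.
func hexDigest(data []byte) string {
	sum := sha256.Sum256(data) // sum has type [sha256.Size]byte
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(hexDigest([]byte("abc")))
}
```

Because both packages expose the same Sum256 signature, callers that only use the documented functions should not need further edits after switching the import.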

This package uses Golang assembly and as such does not depend on cgo. The Intel versions are based on the implementations as described in "Fast SHA-256 Implementations on Intel Architecture Processors" by J. Guilford et al.

Performance

Below is the speed in MB/s for a single core (ranked fast to slow) as well as the factor of improvement over crypto/sha256 (when applicable).

Processor	Package	Speed	Improvement
1.2 GHz ARM Cortex-A53	minio/sha256-simd (ARM64)	638.2 MB/s	105x
2.4 GHz Intel Xeon CPU E5-2620 v3	minio/sha256-simd (AVX2) (*)	355.0 MB/s	1.88x
2.4 GHz Intel Xeon CPU E5-2620 v3	minio/sha256-simd (AVX)	306.0 MB/s	1.62x
2.4 GHz Intel Xeon CPU E5-2620 v3	minio/sha256-simd (SSE)	298.7 MB/s	1.58x
2.4 GHz Intel Xeon CPU E5-2620 v3	crypto/sha256	189.2 MB/s
1.2 GHz ARM Cortex-A53	crypto/sha256	6.1 MB/s

(*) Measured with the "unrolled"/"demacro-ed" AVX2 version. Due to some Go assembly restrictions, the AVX2 version that uses defines loses about 15% performance. The optimized version is contained in the git history, so for maximum speed run the following after fetching the package: git cat-file blob 586b6e > sha256blockAvx2_amd64.s (or vendor it for your project).

See further down for detailed performance.

Comparison to other hashing techniques

As measured on Intel Xeon (same as above) with AVX2 version:

Method	Package	Speed
BLAKE2B	minio/blake2b-simd	851 MB/s
MD5	crypto/md5	607 MB/s
SHA1	crypto/sha1	522 MB/s
SHA256	minio/sha256-simd	355 MB/s
SHA512	crypto/sha512	306 MB/s

asm2plan9s

In order to be able to work more easily with AVX2/AVX instructions, a separate tool was developed to convert AVX2/AVX instructions into the corresponding BYTE sequence as accepted by Go assembly. See asm2plan9s for more information.

Why and benefits

One of the most performance-sensitive parts of the Minio server (an object storage server compatible with Amazon S3) is the calculation of SHA256 hash sums. For instance, during multipart uploads each uploaded part needs to be verified for data integrity by the server. Likewise, in order to generate pre-signed URLs, checksums must be calculated to ensure their validity.

Other applications that can benefit from enhanced SHA256 performance are deduplication in storage systems, intrusion detection, version control systems, integrity checking, etc.

ARM SHA Extensions

The 64-bit ARMv8 architecture introduced new instructions for SHA1 and SHA2 acceleration as part of the Cryptography Extensions. Below is a small excerpt showing one of the rounds of the SHA256 calculation (for the full code see sha256block_arm64.s).

sha256h    q2, q3, v9.4s
sha256h2   q3, q4, v9.4s
sha256su0  v5.4s, v6.4s
rev32      v8.16b, v8.16b
add        v9.4s, v7.4s, v18.4s
mov        v4.16b, v2.16b
sha256h    q2, q3, v10.4s
sha256h2   q3, q4, v10.4s
sha256su0  v6.4s, v7.4s
sha256su1  v5.4s, v7.4s, v8.4s

Detailed benchmarks

ARM64

Benchmarks generated on a Pine64 equipped with a 1.2 GHz quad-core ARM Cortex-A53.

minio@minio-arm:~/gopath/src/github.com/sha256-simd$ benchcmp golang.txt arm64.txt
benchmark                 old ns/op     new ns/op     delta
BenchmarkHash8Bytes-4     11836         1403          -88.15%
BenchmarkHash1K-4         181143        3138          -98.27%
BenchmarkHash8K-4         1365652       14356         -98.95%
BenchmarkHash1M-4         173192200     1642954       -99.05%

benchmark                 old MB/s     new MB/s     speedup
BenchmarkHash8Bytes-4     0.68         5.70         8.38x
BenchmarkHash1K-4         5.65         326.30       57.75x
BenchmarkHash8K-4         6.00         570.63       95.11x
BenchmarkHash1M-4         6.05         638.23       105.49x
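The two tables are related by simple arithmetic, which the following sketch reproduces. It assumes that BenchmarkHash1M hashes 1 MiB (1<<20 bytes) per operation and that 1 MB is counted as 1e6 bytes; both assumptions are inferred from the figures above, not stated by the benchmark source. The function names are hypothetical.

```go
package main

import "fmt"

// throughputMBps converts a benchmark result (bytes hashed per operation and
// nanoseconds per operation) into MB/s, counting 1 MB as 1e6 bytes.
func throughputMBps(bytesPerOp int64, nsPerOp float64) float64 {
	bytesPerSecond := float64(bytesPerOp) / nsPerOp * 1e9
	return bytesPerSecond / 1e6
}

// speedup is the ratio of the new throughput to the old throughput.
func speedup(oldMBps, newMBps float64) float64 {
	return newMBps / oldMBps
}

func main() {
	const oneMiB = 1 << 20 // assumed input size of BenchmarkHash1M

	// ns/op values taken from the BenchmarkHash1M-4 row above.
	fmt.Printf("old: %.2f MB/s\n", throughputMBps(oneMiB, 173192200))
	fmt.Printf("new: %.2f MB/s\n", throughputMBps(oneMiB, 1642954))

	// Ratio of the MB/s values as reported in the table.
	fmt.Printf("speedup: %.2fx\n", speedup(6.05, 638.23))
}
```

Note that the speedup column follows from the rounded MB/s figures; dividing the raw ns/op values directly gives a slightly different ratio.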

Example performance metrics were generated on Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz - 6 physical cores, 12 logical cores running Ubuntu GNU/Linux with kernel version 4.4.0-24-generic (vanilla with no optimizations).

AVX2

$ benchcmp go.txt avx2.txt
benchmark                  old ns/op     new ns/op     delta
BenchmarkHash8Bytes-12     446           364           -18.39%
BenchmarkHash1K-12         5919          3279          -44.60%
BenchmarkHash8K-12         43791         23655         -45.98%
BenchmarkHash1M-12         5544989       2969305       -46.45%

benchmark                  old MB/s     new MB/s     speedup
BenchmarkHash8Bytes-12     17.93        21.96        1.22x
BenchmarkHash1K-12         172.98       312.27       1.81x
BenchmarkHash8K-12         187.07       346.31       1.85x
BenchmarkHash1M-12         189.10       353.14       1.87x

AVX

$ benchcmp go.txt avx.txt
benchmark                  old ns/op     new ns/op     delta
BenchmarkHash8Bytes-12     446           346           -22.42%
BenchmarkHash1K-12         5919          3701          -37.47%
BenchmarkHash8K-12         43791         27222         -37.84%
BenchmarkHash1M-12         5544989       3426938       -38.20%

benchmark                  old MB/s     new MB/s     speedup
BenchmarkHash8Bytes-12     17.93        23.06        1.29x
BenchmarkHash1K-12         172.98       276.64       1.60x
BenchmarkHash8K-12         187.07       300.93       1.61x
BenchmarkHash1M-12         189.10       305.98       1.62x

SSE

$ benchcmp go.txt sse.txt
benchmark                  old ns/op     new ns/op     delta
BenchmarkHash8Bytes-12     446           362           -18.83%
BenchmarkHash1K-12         5919          3751          -36.63%
BenchmarkHash8K-12         43791         27396         -37.44%
BenchmarkHash1M-12         5544989       3444623       -37.88%

benchmark                  old MB/s     new MB/s     speedup
BenchmarkHash8Bytes-12     17.93        22.05        1.23x
BenchmarkHash1K-12         172.98       272.92       1.58x
BenchmarkHash8K-12         187.07       299.01       1.60x
BenchmarkHash1M-12         189.10       304.41       1.61x

License

Released under the Apache License v2.0. You can find the complete text in the file LICENSE.

Contributing

Contributions are welcome; please send PRs for any enhancements.

Documentation

Index

Constants
func New() hash.Hash
func Sum256(data []byte) [Size]byte
func SumCmp256(data []byte, work uint32) bool
func SumToNum256(data []byte) int64

Constants

const BlockSize = 64

BlockSize - The blocksize of SHA256 in bytes.

const Size = 32

Size - The size of a SHA256 checksum in bytes.

Functions

func New

func New() hash.Hash

New returns a new hash.Hash computing the SHA256 checksum.

func Sum256

func Sum256(data []byte) [Size]byte

Sum256 returns the SHA256 checksum of data. It is a single-call helper for callers that do not need a hash.Hash.

func SumCmp256

func SumCmp256(data []byte, work uint32) bool

SumCmp256 sums data and compares part of the resulting hash against work.
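The exact comparison performed by SumCmp256 is not documented on this page. As a purely illustrative sketch of a "hash against work" check, the code below interprets the first four bytes of the digest as a big-endian uint32 and tests whether that value is at most work. The prefix width, the byte order and the direction of the comparison are all assumptions, and the function names leadingUint32 and meetsWork are hypothetical rather than part of this package.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// leadingUint32 is a hypothetical helper: it returns the first four bytes of
// the SHA256 digest of data, read as a big-endian unsigned integer.
func leadingUint32(data []byte) uint32 {
	sum := sha256.Sum256(data)
	return binary.BigEndian.Uint32(sum[:4])
}

// meetsWork is a hypothetical stand-in for a sum-and-compare check. It
// assumes that "meeting the work" means the digest prefix is <= work.
func meetsWork(data []byte, work uint32) bool {
	return leadingUint32(data) <= work
}

func main() {
	data := []byte("abc")
	fmt.Printf("prefix: %#08x\n", leadingUint32(data))
	fmt.Println(meetsWork(data, 0xffffffff)) // the maximum threshold always passes
}
```

Fusing the sum and the comparison into one call, as the package does, lets a mining loop avoid copying the full digest out for every candidate.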

func SumToNum256

func SumToNum256(data []byte) int64
