hd

package module

v1.0.2 Latest Latest Go to latest Published: Feb 18, 2025 License: MIT Imports: 1 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/nikolaydubina/go-hackers-delight

Links

Open Source Insights

README ¶

go-hackers-delight

An interactive Go showcase of "Hacker's Delight" 2nd Edition by Henrey S.Warren Jr, 2013.[^1][^2]

If you write optimizing compilers or high-performance code, you must read this book. — Guy L. Steele [PhD, ACM Award, AAAI Fellow, Sun Fellow, core Java design]

This is the first book that promises to tell the deep, dark secrets of computer arithmetic, and it delivers in spades. It contains every trick I knew plus many, many more. A godsend for library developers, compiler writers, and lovers of elegant hacks. It deserves a spot on your self right next to Knuth. In the ten years since the first edition came out, it's been absolutely invaluable to my work at Sun and Google. — Joshua Bloch [PhD, Distinguished Engineer at Sun, Chief Java Architect at Google 2004~2012]

Methodology

extensively fuzzing
not using any Go packages, not even standard library
using generics whenever possible
verifying compiled code via https://godbolt.org

Observations

native math.Pow(x, 1./3)[^4] performance is worse than Cbrt algorithm 💥
native math.Pow(x, 1./3)[^4][^5] underflows but Cbrt algorithm is correct 💥
native math.Pow(x, n)[^4] performance is worse than Pow algorithm 💥
native math.Log2(x)[^4] performance is worse than Log2 algorithm 💥
native math.Log10(x)[^4] performance is worse than Log10 algorithm 💥
native math.Log10(x) [^4][^5] overflows for math.MaxUint64 but Log10 algorithm is correct 💥
simplistic LRUCache performance is 2x worse than LRUCache algorithm 💥

Appendix: Benchmarks

$ go test -bench .
goos: darwin
goarch: arm64
pkg: github.com/nikolaydubina/go-hackers-delight
cpu: Apple M3 Max
BenchmarkNoop/---------------------------------16         	1000000000	         0.0000001 ns/op
BenchmarkAbs/basic-16                                     	 1489353	       811.6 ns/op
BenchmarkAbs/Abs-16                                       	 1460946	       818.8 ns/op
BenchmarkAbs/Abs2-16                                      	 1464883	       816.4 ns/op
BenchmarkAbs/Abs3-16                                      	 1443763	       830.9 ns/op
BenchmarkAbs/Abs4-16                                      	 1431141	       882.4 ns/op
BenchmarkAbs/AbsFastMul-16                                	 1494900	       803.1 ns/op
BenchmarkAvg/basic-16                                     	  611107	      1970 ns/op
BenchmarkAvg/AvgFloor-16                                  	  811885	      1603 ns/op
BenchmarkAvg/AvgCeil-16                                   	  609648	      1916 ns/op
BenchmarkCycleThree/basic-16                              	1000000000	         1.105 ns/op
BenchmarkCycleThree/CycleThreeValues-16                   	1000000000	         1.000 ns/op
BenchmarkLeadingZeros/uint32/basic-16                     	 1474354	       816.2 ns/op
BenchmarkLeadingZeros/uint32/LeadingZerosUint32-16        	 1000000	      1174 ns/op
BenchmarkLeadingZeros/uint64/basic-16                     	 1483987	       808.8 ns/op
BenchmarkLeadingZeros/uint64/LeadingZerosUint64-16        	  892615	      1349 ns/op
BenchmarkCompress/Compress-16                             	150361084	         8.064 ns/op
BenchmarkCompress/Compress2-16                            	54269787	        21.47 ns/op
BenchmarkLRU/basic-16                                     	447112422	         2.689 ns/op
BenchmarkLRU/LRUCache-16                                  	625848724	         1.878 ns/op
BenchmarkMul/uint32/basic-16                              	  610756	      1963 ns/op
BenchmarkMul/uint32/MultiplyHighOrder32-16                	  590781	      1851 ns/op
BenchmarkMul/uint64/basic-16                              	  109549	     11017 ns/op
BenchmarkMul/uint64/MultiplyHighOrder64-16                	   71486	     19700 ns/op
BenchmarkDivMod/DivMod/3/basic-16                         	 1483515	       809.9 ns/op
BenchmarkDivMod/DivMod/3/DivMod3Signed-16                 	  630376	      1903 ns/op
BenchmarkDivMod/DivMod/3/DivMod3Signed2-16                	 1000000	      1078 ns/op
BenchmarkDivMod/DivMod/7/basic-16                         	 1470925	       813.2 ns/op
BenchmarkDivMod/DivMod/7/DivMod7Signed-16                 	  617737	      1977 ns/op
BenchmarkDivMod/Div/3/basic-16                            	 1474382	       816.0 ns/op
BenchmarkDivMod/Div/3/Div3Signed-16                       	  855132	      1387 ns/op
BenchmarkDivMod/Div/3/Div3ShiftSigned-16                  	  958530	      1245 ns/op
BenchmarkDivMod/Div/7/basic-16                            	 1480100	       808.9 ns/op
BenchmarkDivMod/Div/7/Div7Signed-16                       	  831228	      1438 ns/op
BenchmarkDivMod/Div/7/Div7ShiftSigned-16                  	  890666	      1347 ns/op
BenchmarkDivMod/Mod/3/basic-16                            	 1494051	       807.6 ns/op
BenchmarkDivMod/Mod/3/Mod3Signed-16                       	  896326	      1306 ns/op
BenchmarkDivMod/Mod/3/Mod3Signed2-16                      	 1491132	       804.5 ns/op
BenchmarkDivMod/Mod/7/basic-16                            	 1489940	       804.5 ns/op
BenchmarkDivMod/Mod/7/Mod7Signed-16                       	  865971	      1394 ns/op
BenchmarkDivMod/Mod/7/Mod7Signed2-16                      	 1000000	      1108 ns/op
BenchmarkDivMod/Mod/10/basic-16                           	 1470753	       816.9 ns/op
BenchmarkDivMod/Mod/10/Mod10Signed-16                     	 1000000	      1113 ns/op
BenchmarkDivMod/DivExact/7/basic-16                       	 1492069	       803.8 ns/op
BenchmarkDivMod/DivExact/7/DivExact7-16                   	 1452621	       820.2 ns/op
BenchmarkDivMod/DivExact/7/Div7Signed-16                  	  763095	      1558 ns/op
BenchmarkDivMod/DivExact/7/Div7ShiftSigned-16             	  869890	      1382 ns/op
BenchmarkCbrt/basic-16                                    	    4669	    260382 ns/op
BenchmarkCbrt/Cbrt-16                                     	    7956	    150951 ns/op
BenchmarkPow/basic-16                                     	    2311	    520212 ns/op
BenchmarkPow/Pow-16                                       	    6334	    188887 ns/op
BenchmarkLog/uint32/2/basic-16                            	   10000	    120539 ns/op
BenchmarkLog/uint32/2/Log2-16                             	   96388	     12463 ns/op
BenchmarkLog/uint32/10/basic-16                           	   13887	     86154 ns/op
BenchmarkLog/uint32/10/Log10-16                           	   54652	     21962 ns/op
BenchmarkLog/uint64/2/basic-16                            	   10000	    120627 ns/op
BenchmarkLog/uint64/2/Log2-16                             	   81048	     15237 ns/op
BenchmarkLog/uint64/10/basic-16                           	   14146	     84683 ns/op
BenchmarkLog/uint64/10/Log10-16                           	   51010	     23544 ns/op
BenchmarkSqrt/basic-16                                    	  134250	      8954 ns/op
BenchmarkSqrt/SqrtNewton-16                               	   18208	     63912 ns/op
BenchmarkSqrt/SqrtBinarySearch-16                         	    7302	    168653 ns/op
BenchmarkSqrt/SqrtShiftAndSubtract-16                     	   10000	    114464 ns/op
BenchmarkCRC32/basic-16                                   	  181860	      6610 ns/op
BenchmarkCRC32/CRC32Basic-16                              	     434	   2778007 ns/op
BenchmarkCRC32/CRC32TableLookup-16                        	    7068	    168972 ns/op
BenchmarkRSqrtFloat32/basic-16                            	  139954	      8663 ns/op
BenchmarkRSqrtFloat32/RSqrtFloat32-16                     	   99153	     12287 ns/op
PASS
ok  	github.com/nikolaydubina/go-hackers-delight	83.522s

[^1]: showcase in C — https://github.com/hcs0/Hackers-Delight [^2]: showcase in Rust — https://github.com/victoryang00/Delight [^4]: we are comparing native float64 result converted to uint32, as there is no better standard function [^5]: which is due to float64 not having enough precision

Documentation ¶

Overview ¶

Chapter 2: Basics ¶

For better performance, it is advisable to avoid branches in simple functions.

Right-to-Left Computability — a function mapping words to words can be implemented with word-parallel `add`, `substract`, `and`, `or` and `not` instructions if and only if each bit of of the result depends only on the bits at and to the right of each input operand. All these operands themselves depend only on right bits, so any of their composition also depends on the right bits.

It can be shown that in the ordinary addition of binary numbers with each bit independently equally likely to be 0 or 1, a carry occurs at each position with probability about 0.5.

If instruction set includes an instruction for each of 16 Boolean functions of two variables, then any Boolean function of three variables can be implemented with four or fewer instructions (and four variables with seven instructions).

Comparison functions are a couple branch-free instructions that store result in the most significant bit.

Overflow means result of operation is too large or too small to fit into the variable. Many hardware supply bits for overflow, but high-lever languages may not have access to it (rejected Go overflow proposal).

Chapter 8: Multiplication ¶

Any multiplication can be decomposed in a summation of left shifts. For example,

13 = 0b1101
x * 13 = 8x + 4x + x
t1 := x << 2
t2 := x << 3
result := t1 + t2 + x

For every multiplication there are multiple paths possible that utilize registers and shifts. These paths can have fewer or more instructions and registers to accomplish this. In general, it is nontrivial to find minimum number instructions required and which these instructions are. Following theorem gives upper bound: "Multiplication of a variable x by an n-bit constant m, m >= 1, can be accomplished with at most n instructions of the type add, subtract, and shift left by any given amount."

Chapter 9: Integer Division ¶

Short Division is division of single word by another (e.g. 32bit / 32bit => 32bit). This is typical division operator available in high level languages.
Long Division is division of multi-word by single word (e.g. 64bit / 32bit => 32bit).

Chapter 10: Integer Division by Constants ¶

On many computers, division is very time consuming and is to be avoided. A value of 20 or more elementary Add instructions is usually needed and execution time is usually large. When the divisor is a constant, the division can be replaced by faster operations.

Certain divisors have repetitive patterns in binary representation which allows to use few (~20) shift, add, subtract instructions to approximate residual and get quotient. It is not trivial how to generate such optimal code.

Chapter 12: Unusual Bases for Number Systems ¶

Standard binary representation of an non-negative integer number is base 2.

(aₙ, ... a₃ a₂, a₁, a₀) = aₙ ⋅ 2ⁿ + ... + a₃ ⋅ 2³ + a₂ ⋅ 2² + a₁ ⋅ 2¹ + a₀ ⋅ 2⁰

However, there are alternative bases, including complex numbers.

(aₙ, ... a₃ a₂, a₁, a₀) = aₙ ⋅ -2ⁿ + ... + a₃ ⋅ -2³ + a₂ ⋅ -2² + a₁ ⋅ -2¹ + a₀ ⋅ -2⁰
(aₙ, ... a₃ a₂, a₁, a₀) = aₙ ⋅ (-1 + 𝑖)ⁿ + ... + a₃ ⋅ (-1 + 𝑖)³ + a₂ ⋅ (-1 + 𝑖)² + a₁ ⋅ (-1 + 𝑖)¹ + a₀ ⋅ (-1 + 𝑖)⁰
(aₙ, ... a₃ a₂, a₁, a₀) = aₙ ⋅ (-1 - 𝑖)ⁿ + ... + a₃ ⋅ (-1 - 𝑖)³ + a₂ ⋅ (-1 - 𝑖)² + a₁ ⋅ (-1 - 𝑖)¹ + a₀ ⋅ (-1 - 𝑖)⁰

Which base is the most efficient? It can be argued that under standard assumptions of hardware design (e.g. "cost of b-state circuit" is proportional to b) the most efficient base is 𝑒. However, it is not practical. Meanwhile, base 2 is only 5% more costly than base 3 and only 6% more costly than base 𝑒. Which is an interesting result, given base 2 is both convenient to use and is theoretically optimal.

Chapter 18: Primes ¶

It is known, that there is no polynomial f(n) that produces a prime for every n.

Index ¶

Constants
Variables
func Abs[T Signed](x T) T
func Abs2[T Signed](x T) T
func Abs3[T Signed](x T) T
func Abs4[T Signed](x T) T
func AbsDiff[T Integer](x, y T) T
func AbsDiff2[T Signed](x, y T) T
func AbsDiffUnsigned(x, y uint32) uint32
func AbsFastMul(x int32) int32
func AddDoubleLength(x [2]uint32, y [2]uint32) [2]uint32
func AddFourUint8(x, y uint32) uint32
func AddTwoUint16(x, y uint32) uint32
func AvgCeil[T Integer](x, y T) T
func AvgFloor[T Integer](x, y T) T
func BitSize32(x int32) uint8
func CLPTwo(x uint32) uint32
func CLPTwo2(x uint32) uint32
func CLPTwo3(x uint32) uint32
func CRC32Basic(data []byte) uint32
func CRC32TableLookup(data []byte) uint32
func CSA[T Unsigned](a, b, c T) (h, l T)
func Cbrt(x uint32) uint32
func CheckBits(u uint32) uint32
func CompareCountOnes(x, y uint32) int
func Compress(x, m uint32) uint32
func Compress2[T Unsigned](x, m T) T
func Correct(pr, u uint32) (cr uint32, errb uint8)
func CountOnes(x uint32) uint32
func CountOnes1(x uint32) uint32
func CountOnes3(x uint32) uint32
func CountOnes4[T Unsigned](x T) T
func CountOnes5(x uint32) uint32
func CountOnesArrayGroup8(A []uint32) uint32
func CycleThreeIdentifier(vs [3]int32, n1, n2 uint8) [2][3]int32
func CycleThreeValues(a, b, c, x int32) int32
func DifferenceOrZero[T Signed](x, y T) T
func DifferenceOrZeroRanges[T int32 | uint32](x, y T) T
func DifferenceOrZeroUnsigned(x, y uint32) uint32
func Div1000ShiftSigned(n int32) int32
func Div1000ShiftUnsigned(n uint32) uint32
func Div100ShiftSigned(n int32) int32
func Div100ShiftUnsigned(n uint32) uint32
func Div10ShiftSigned(n int32) int32
func Div10ShiftUnsigned(n uint32) uint32
func Div11ShiftSigned(n int32) int32
func Div11ShiftUnsigned(n uint32) uint32
func Div12ShiftSigned(n int32) int32
func Div12ShiftUnsigned(n uint32) uint32
func Div13ShiftSigned(n int32) int32
func Div13ShiftUnsigned(n uint32) uint32
func Div3ShiftSigned(n int32) int32
func Div3ShiftUnsigned(n uint32) uint32
func Div3Signed(n int32) int32
func Div5ShiftSigned(n int32) int32
func Div5ShiftUnsigned(n uint32) uint32
func Div5Signed(n int32) int32
func Div6ShiftSigned(n int32) int32
func Div6ShiftUnsigned(n uint32) uint32
func Div7ShiftSigned(n int32) int32
func Div7ShiftUnsigned(n uint32) uint32
func Div7Signed(n int32) int32
func Div9ShiftSigned(n int32) int32
func Div9ShiftUnsigned(n uint32) uint32
func DivExact[T Integer](n, d T) T
func DivExact7(n int32) int32
func DivMod3Signed(n int32) (q, r int32)
func DivMod3Signed2(n int32) (q, r int32)
func DivMod5Signed(n int32) (q, r int32)
func DivMod7Signed(n int32) (q, r int32)
func DivModConstSigned(n, d int32) (q, r int32)
func DivModConstSignedMagic(d int32) (M, s int32)
func DivModConstUnsigned(n, d uint32) (q, r uint32)
func DivModConstUnsignedMagic(d uint32) (M uint32, a, s int32)
func DivModLongUnsigned64b32b(x uint64, y uint32) (q, r uint32)
func DivModLongUnsigned64b32b2(x uint64, y uint32) (q, r uint32)
func DivModMultiWordUnsigned(q, r, u, v []uint16)
func DivModSignedPowTwo(n int32, k int) (q, r int32)
func DivModUnsignedPowTwo(n uint32, k int) (q, r uint32)
func DivModUnsignedSeven(n uint32) (q, r uint32)
func DivModUnsignedThree(n uint32) (q, r uint32)
func DivSignedPowTwo(n int32, k int) int32
func DivSignedPowTwo2(n int32, k int) int32
func DivSignedPowTwo_fixed5(n int32) int32
func DoubleLengthInt32FromUint64(x uint64) [2]uint32
func Equal[T Signed](x, y T) T
func Equal2(x, y int32) int32
func Equal3(x, y int32) int32
func Equal4(x, y int32) int32
func Equal5[T Integer](x, y T) T
func EqualZero[T Signed](x T) T
func EqualZero2(x int32) int32
func EqualZero3(x int32) int32
func EqualZero4[T Integer](x T) T
func EqualZero5[T Integer](x T) T
func ExchangeBitsInRegister[T Unsigned](x, md, mo T, k int) T
func ExchangeBitsInRegisterFast[T Unsigned](x, md, mo T, k int) T
func ExchangeRegisters[T Integer](x, y T) (T, T)
func ExchangeRegistersMasked[T Integer](x, y, m T) (T, T)
func ExchangeRegistersMasked2[T Integer](x, y, m T) (T, T)
func ExchangeRegistersMasked3[T Integer](x, y, m T) (T, T)
func ExchangeRegistersMasked4[T Integer](x, y, m T) (T, T)
func Expand(x, m uint32) uint32
func ExtendSign7(x uint32) uint32
func ExtendSign7Three(x uint32) uint32
func ExtendSign7Two(x uint32) uint32
func FLPTwo(x uint32) uint32
func FLPTwo2(x uint32) uint32
func FLPTwo3(x uint32) uint32
func FLPTwo4(x uint32) uint32
func FLPTwo5(x uint32) uint32
func FindFirstStringOnes(x uint32, n int) int
func FindFirstStringOnes1(x uint32, n int) int
func FindInByte(x uint32, m byte) int
func FindInByteEq(x, y uint32) int
func FirstOneOffDifferentBits[T int32 | uint32](vs [3]T) [3]int8
func FourUint8FromUint32(x uint32) [4]uint8
func FourUint8ToUint32(x [4]uint8) uint32
func GrayCodeFromUint[T Unsigned](n T) T
func GrayCodeToUint16(g uint16) (b uint16)
func GrayCodeToUint32(g uint32) (b uint32)
func GrayCodeToUint64(g uint64) (b uint64)
func HigherEqualZero[T Signed](x T) T
func HigherZero[T Signed](x T) T
func HigherZero2[T Signed](x T) T
func HigherZero3[T Signed](x T) T
func ISIGN[T Signed](x, y T) T
func ISIGN2[T Signed](x, y T) T
func ISIGN3[T Signed](x, y T) T
func ISIGN4[T Signed](x, y T) T
func IncrReversed(x uint32) uint32
func Int64FromNx16b(v []uint16) int64
func IntToNx16b[T int | int32 | int64 | uint | uint32 | uint64](v T) []uint16
func IsAddOverflow[T Signed](x, y T) T
func IsAddOverflow2[T Signed](x, y T) T
func IsAddOverflowUnsigned[T Unsigned](x, y T) T
func IsAddOverflowUnsigned2[T Unsigned](x, y T) T
func IsAddOverflowUnsigned3[T Unsigned](x, y T) bool
func IsAddOverflowUnsigned4[T Unsigned](x, y, sum T) bool
func IsDivExactSigned(n, d int32) bool
func IsDivExactUnsigned(n, d uint32) bool
func IsDivExactUnsignedOdd(n, d uint32) bool
func IsDivOverflow(x, y int32) bool
func IsDivOverflowUnsigned[T Unsigned](x, y T) bool
func IsInRange[T Integer](x, a, b T) bool
func IsInRangeClosedOpen[T Integer](x, a, b T) bool
func IsInRangeOpen[T Integer](x, a, b T) bool
func IsInRangeOpen2[T Integer](x, a, b T) bool
func IsInRangeOpenClosed[T Integer](x, a, b T) bool
func IsInRangePowerTwo[T Unsigned](x T, n int) bool
func IsInRangePowerTwoOffset[T Unsigned](x, a T, n int) bool
func IsMostSignificantSet[T int32 | uint32](x T) bool
func IsMulOverflow(x, y int32) bool
func IsMulOverflowUnsigned(x, y uint32) bool
func IsMulOverflowUnsignedNLZ(x, y uint32) bool
func IsPowerOfTwoBoundaryCrossed[T Unsigned](a, l, b T) bool
func IsPowerOfTwoBoundaryCrossed2[T Unsigned](a, l, b T) bool
func IsPowerOfTwoBoundaryCrossed3[T Unsigned](a, l, b T) bool
func IsPowerOfTwoBoundaryCrossed4[T Unsigned](a, l, b T) bool
func IsSubOverflow[T Signed](x, y T) T
func IsSubOverflow2[T Signed](x, y T) T
func IsSubOverflowUnsigned[T Unsigned](x, y T) T
func IsSubOverflowUnsigned2[T Unsigned](x, y T) T
func IsSubOverflowUnsigned3[T Unsigned](x, y T) bool
func IsSubOverflowUnsigned4[T Unsigned](x, y, sub T) bool
func IsolateRightmostOneBit[T Signed](x T) T
func LeadingZerosEqual[T Unsigned](x, y T) bool
func LeadingZerosLess[T Unsigned](x, y T) bool
func LeadingZerosLessOrEqual[T Unsigned](x, y T) bool
func LeadingZerosUint32(x uint32) uint8
func LeadingZerosUint32BinarySearch(x uint32) uint8
func LeadingZerosUint64(x uint64) uint8
func LenLongestStringOnes[T Unsigned](x T) uint8
func LenShortestStringOnes(x uint32) (p, n uint8)
func Less[T Signed](x, y T) T
func Less2[T Signed](x, y T) T
func Less3[T Signed](x, y T) T
func Less4[T Integer](x, y T) T
func LessOrEqual[T Signed](x, y T) T
func LessOrEqual2[T Signed](x, y T) T
func LessOrEqualUnsigned[T Unsigned](x, y T) T
func LessOrEqualZero[T Signed](x T) T
func LessOrEqualZero2[T Signed](x T) T
func LessUnsigned[T Unsigned](x, y T) T
func LessUnsigned2[T Unsigned](x, y T) T
func LessZero[T Signed](x T) T
func Log10x32(x uint32) uint32
func Log10x64(x uint64) uint64
func Log2(x uint32) uint32
func Log2x64(x uint64) uint64
func LoopDetectionGosper(f func(int) int, x0 int) (μLower, μUpper, λ int)
func Max[T Signed](x, y T) T
func MaxAND(minx, maxx, miny, maxy uint32) uint32
func MaxOR(minx, maxx, miny, maxy uint32) uint32
func MaxRanges[T int32 | uint32](x, y T) T
func Min[T Signed](x, y T) T
func MinAND(minx, maxx, miny, maxy uint32) uint32
func MinOR(minx, maxx, miny, maxy uint32) uint32
func MinRanges[T int32 | uint32](x, y T) T
func Mod10Signed(n int32) int32
func Mod10Unsigned(n uint32) uint32
func Mod3Signed(n int32) int32
func Mod3Signed2(n int32) int32
func Mod3Unsigned(n uint32) uint32
func Mod3Unsigned2(n uint32) uint32
func Mod3Unsigned3(n uint32) uint32
func Mod3Unsigned4(n uint32) uint32
func Mod3Unsigned5(n uint32) uint32
func Mod3Unsigned6(n uint32) uint32
func Mod5Signed(n int32) int32
func Mod5Signed2(n int32) int32
func Mod5Unsigned(n uint32) uint32
func Mod5Unsigned2(n uint32) uint32
func Mod63Unsigned(n uint32) uint32
func Mod63Unsigned2(n uint32) uint32
func Mod7Signed(n int32) int32
func Mod7Signed2(n int32) int32
func Mod7Unsigned(n uint32) uint32
func Mod7Unsigned2(n uint32) uint32
func Mod9Signed(n int32) int32
func Mod9Signed2(n int32) int32
func Mod9Unsigned(n uint32) uint32
func Mod9Unsigned2(n uint32) uint32
func MulMultiWord(w, u, v []uint16)
func MultiplicativeInverseEuclid(d uint32) uint32
func MultiplicativeInverseNewton[T Integer](d T) T
func MultiplyHighOrder32[T uint32 | int32](u, v T) T
func MultiplyHighOrder64[T uint64 | int64](u, v T) T
func NAbs[T Signed](x T) T
func NAbs2[T Signed](x T) T
func NAbs3[T Signed](x T) T
func NewCRC32TableLookupTable() [256]uint32
func NextHigherNumberWithSameNumberOfOnes[T Unsigned](x T) T
func NotEqual[T Signed](x, y T) T
func NotEqual2(x, y int32) int32
func NotEqual3[T Integer](x, y T) T
func NotEqualZero[T Signed](x T) T
func NotEqualZero2(x int32) int32
func NotEqualZero3[T Integer](x T) T
func NotEqualZero4[T int32 | uint32](x T) T
func Parity(x uint32) uint32
func Parity2(x uint32) uint32
func Parity3(x uint32) uint32
func Parity4(x uint32) uint32
func Pow[T Integer](x T, n uint) T
func RSqrtFloat32(x float32) float32
func Reverse8Bit(x uint32) uint32
func ReverseBits(x uint32) uint32
func ReverseBits2(x uint32) uint32
func ReverseBits64Knuth(x uint64) uint64
func ReverseByte(b byte) byte
func RotateLeft(x uint32, n int) uint32
func RotateRight(x uint32, n int) uint32
func RoundDownBlockPowerOfTwo[T Unsigned](x T, k uint8) T
func RoundDownBlockPowerOfTwo2[T Unsigned](x T, k uint8) T
func RoundUpBlockPowerOfTwo[T Unsigned](x T, k uint8) T
func RoundUpBlockPowerOfTwo2[T Unsigned](x T, k uint8) T
func SetBitLastZero[T Signed](x T) T
func SetTrailingOnesWithRightMostOne[T Signed](x T) T
func SetTrailingZeros[T Signed](x T) T
func SetTrailingZeros2[T Signed](x T) T
func SetTrailingZeros3[T Signed](x T) T
func SetTrailingZerosWithRightMostOne[T Signed](x T) T
func SetupCycleThreeValues(a, b, c int32) (c1, c2, c3 int32, n1, n2 uint8)
func SetupCycleThreeValuesN1N2(a, b, c int32) (na, nb, nc int32, n1, n2 uint8)
func ShiftLeftDoubleLength(x [2]uint32, n int) [2]uint32
func ShiftRightSigned32[T int32 | uint32](x T, v int) T
func ShiftRightSigned64[T int64 | uint64](x T, v int) T
func ShiftRightSignedDoubleLength(x [2]uint32, n int) [2]uint32
func ShiftRightSignedFromUnsigned(x uint32, n int) uint32
func ShiftRightSignedFromUnsigned2(x uint32, n int) uint32
func ShiftRightSignedFromUnsigned3(x uint32, n int) uint32
func ShiftRightSignedFromUnsigned4(x uint32, n int) uint32
func ShiftRightSignedFromUnsigned5(x uint32, n int) uint32
func ShiftRightUnsigned32[T int32 | uint32](x T, v int) T
func ShiftRightUnsigned64[T int64 | uint64](x T, v int) T
func ShiftRightUnsignedDoubleLength(x [2]uint32, n int) [2]uint32
func ShuffleInner(x uint32) uint32
func ShuffleOuter(x uint32) uint32
func Sign(x int32) int32
func SqrtBinarySearch(x uint32) uint32
func SqrtNewton(x uint32) uint32
func SqrtShiftAndSubtract(x uint32) uint32
func SubDoubleLength(x [2]uint32, y [2]uint32) [2]uint32
func SubFourUint8(x, y uint32) uint32
func SubTwoUint16(x, y uint32) uint32
func Syndrome(pr, u uint32) uint32
func TrailingZerosUint32(x uint32) uint8
func TransposeMatrix8bx8b(A, B []byte, m, n int)
func TurnOffRightMostBit[T Signed](x T) T
func TurnOffRightmostOnes[T Signed](x T) T
func TurnOffRightmostOnes2[T Signed](x T) T
func TurnOffTrailingOnes[T Signed](x T) T
func TurnOnRightMostBit[T Signed](x T) T
func TurnOnTrailingZeros[T Signed](x T) T
func TwoUint16FromUint32(x uint32) [2]uint16
func TwoUint16ToUint32(x [2]uint16) uint32
func Uint64FromNx16b(v []uint16) uint64
func UnShuffleInner(x uint32) uint32
func UnShuffleOuter(x uint32) uint32
func ZByteL(x uint32) int
func ZByteL1(x uint32) int
func ZByteL64(x uint64) int
type Integer
type LRUCache
- func (c *LRUCache) Hit(i uint8)
- func (c *LRUCache) LeastRecentlyUsed() uint8
type Signed
type Unsigned

Examples ¶

Abs
AbsDiff
AddDoubleLength
AddFourUint8
AddTwoUint16
AvgCeil
AvgFloor
BitSize32 (Bit32)
BitSize32 (Bit6)
BitSize32 (Bit7)
BitSize32 (Zero)
CLPTwo
Cbrt
CheckBits (No_error)
CompareCountOnes (Left_bigger)
CompareCountOnes (Left_smaller)
CompareCountOnes (Same)
Compress
CountOnes
CountOnesArrayGroup8
CycleThreeValues
DivSignedPowTwo_fixed5
Equal (Equal)
Equal (Not_equal)
ExchangeBitsInRegister
ExchangeBitsInRegisterFast
ExchangeRegistersMasked
ExchangeRegistersMasked (Bits)
ExtendSign7 (Extended)
ExtendSign7 (NotExtended)
ExtendSign7Three (Extended)
ExtendSign7Three (NotExtended)
ExtendSign7Two (Extended)
ExtendSign7Two (NotExtended)
FLPTwo
FindFirstStringOnes
FindInByte
FindInByteEq
FirstOneOffDifferentBits
GrayCodeFromUint
ISIGN
IntToNx16b (Small)
IsAddOverflow
IsAddOverflowUnsigned
IsAddOverflowUnsigned4
IsMostSignificantSet (Int32)
IsMostSignificantSet (Uint32)
IsPowerOfTwoBoundaryCrossed (Crossed)
IsPowerOfTwoBoundaryCrossed (Not_crossed)
IsSubOverflow
IsSubOverflowUnsigned
IsSubOverflowUnsigned4
IsolateRightmostOneBit
LeadingZerosUint32
LeadingZerosUint32 (Long)
LenLongestStringOnes
LenShortestStringOnes
LenShortestStringOnes (Zero)
Log2 (Four)
Log2 (Nine)
Log2 (One)
Log2 (Zero)
LoopDetectionGosper
MulMultiWord
NAbs
NextHigherNumberWithSameNumberOfOnes
Parity (Even)
Parity (Odd)
Reverse8Bit
ReverseBits
ReverseBits64Knuth
ReverseByte
RotateLeft
RotateRight
RoundDownBlockPowerOfTwo
RoundUpBlockPowerOfTwo
SetBitLastZero
SetTrailingOnesWithRightMostOne
SetTrailingZeros
SetTrailingZeros2
SetTrailingZeros3
SetTrailingZerosWithRightMostOne
SetupCycleThreeValues
ShiftLeftDoubleLength
ShiftRightSignedDoubleLength
ShiftRightSignedDoubleLength (Negative)
ShiftRightSignedFromUnsigned
ShiftRightSignedFromUnsigned2
ShiftRightSignedFromUnsigned3
ShiftRightSignedFromUnsigned4
ShiftRightSignedFromUnsigned5
ShiftRightUnsignedDoubleLength
ShuffleInner
ShuffleOuter
Sign
SubDoubleLength
SubFourUint8
SubTwoUint16
TrailingZerosUint32
TransposeMatrix8bx8b
TurnOffRightMostBit
TurnOffRightmostOnes
TurnOffRightmostOnes2
TurnOffTrailingOnes
TurnOnRightMostBit
TurnOnTrailingZeros
ZByteL
ZByteL64

Constants ¶

View Source

const SqrtFloat32ErrorRate float32 = 0.035

SqrtFloat32ErrorRate is error rate (abs(true_value - result) / result) that RSqrtFloat32 produces.

Variables ¶

View Source

var CRC32Generator uint32 = 0x4C11DB7

CRC32Generator is Cycle Redundancy Check generator polynomial. Each bit i corresponds to the 1/0 of coefficient at x^i in polynomial. Among others, it is used in following standards: ISO 3309 (HDLC), ANSI X3.66 (ADCCP), ISO/IEC/IEEE 802-3 (Ethernet), MPEG-2, PKZIP, Gzip, Bzip2, POSIX cksum, PNG.

View Source

var CRC32GeneratorReversed uint32 = 0xEDB88320

View Source

var InterestingPrimes = [...]uint{
	(1 << 32) - 5,
}

View Source

var PowerOfTwo = [...]uint{
	1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192,
	16384, 32768, 65536, 131072, 262144, 524288, 1048576, 2097152,
	4194304, 8388608, 16777216, 33554432, 67108864, 134217728, 268435456,
	536870912, 1073741824, 2147483648, 4294967296, 8589934592,
	17179869184, 34359738368, 68719476736, 137438953472, 274877906944,
	549755813888, 1099511627776, 2199023255552, 4398046511104, 8796093022208,
	17592186044416, 35184372088832, 70368744177664, 140737488355328,
	281474976710656, 562949953421312, 1125899906842624, 2251799813685248,
	4503599627370496, 9007199254740992, 18014398509481984, 36028797018963968,
	72057594037927936, 144115188075855872, 288230376151711744, 576460752303423488,
	1152921504606846976, 2305843009213693952, 4611686018427387904, 9223372036854775808,
}

PowerOfTwo that fit into uint64.

Functions ¶

func Abs ¶

func Abs[T Signed](x T) T

Abs can be computed in three or four branch-free instructions.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Print(hd.Abs(-42))
}

Output:

42

func Abs2 ¶

func Abs2[T Signed](x T) T

func Abs3 ¶

func Abs3[T Signed](x T) T

func Abs4 ¶

func Abs4[T Signed](x T) T

func AbsDiff ¶

func AbsDiff[T Integer](x, y T) T

AbsDiff is absolute difference that does not overflow.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Print(hd.AbsDiff(1, 100))
}

Output:

99

func AbsDiff2 ¶

func AbsDiff2[T Signed](x, y T) T

func AbsDiffUnsigned ¶

func AbsDiffUnsigned(x, y uint32) uint32

func AbsFastMul ¶

func AbsFastMul(x int32) int32

AbsFastMul when you have fast instruction for +1/-1 multiplication.

func AddDoubleLength ¶

func AddDoubleLength(x [2]uint32, y [2]uint32) [2]uint32

AddDoubleLength of uint64 numbers encoded in two uint32. This takes nine instructions.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := [2]uint32{310, 410}
	y := [2]uint32{100, 200}
	fmt.Println(hd.AddDoubleLength(x, y))
}

Output:

[410 610]

func AddFourUint8 ¶

func AddFourUint8(x, y uint32) uint32

AddFourUint8 executes in eight branch-free instructions

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := hd.FourUint8ToUint32([4]uint8{1, 2, 254, 4})
	y := hd.FourUint8ToUint32([4]uint8{2, 3, 1, 5})

	sum := hd.AddFourUint8(x, y)

	fmt.Println(hd.FourUint8FromUint32(sum))
}

Output:

[3 5 255 9]

func AddTwoUint16 ¶

func AddTwoUint16(x, y uint32) uint32

AddFourUint8 executes in seven branch-free instructions

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := hd.TwoUint16ToUint32([2]uint16{65535 - 1, 4})
	y := hd.TwoUint16ToUint32([2]uint16{1, 5})

	sum := hd.AddTwoUint16(x, y)

	fmt.Println(hd.TwoUint16FromUint32(sum))
}

Output:

[65535 9]

func AvgCeil ¶

func AvgCeil[T Integer](x, y T) T

AvgCeil does not cause overflow. Rounded up. In Go, when left operand of shift operator is signed int, then arithmetic shift is performed. Arithmetic shift is same thing as signed shift. Since in Go right shift is unsigned, then formula for int32 is the same as for uint32.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Print(hd.AvgCeil[int32](-101, -200))
}

Output:

-150

func AvgFloor ¶

func AvgFloor[T Integer](x, y T) T

AvgFloor does not cause overflow. Rounded down. This is same as unsigned version, but uses signed-shift right. In Go, when left operand of shift operator is signed int, then arithmetic shift is performed. Since in Go right shift is unsigned, then formula for int32 is the same as for uint32.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Print(hd.AvgFloor[int32](-101, -200))
}

Output:

-151

func BitSize32 ¶

func BitSize32(x int32) uint8

BitSize32 returns minimum number of bits requires to represent number in two's complement signed number.

Example (Bit32) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.BitSize32(0xFFFF_FFFF >> 1))
}

Output:

32

Example (Bit6) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.BitSize32(0b0000_1101))
}

Output:

5

Example (Bit7) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.BitSize32(0b0011_1101))
}

Output:

7

Example (Zero) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.BitSize32(0))
}

Output:

0

func CLPTwo ¶

func CLPTwo(x uint32) uint32

CLPTwo is Ceil to nearest Power of Two. Values >= 2^31 will be zero.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.CLPTwo(200))
}

Output:

256

func CLPTwo2 ¶

func CLPTwo2(x uint32) uint32

func CLPTwo3 ¶

func CLPTwo3(x uint32) uint32

CLPTwo3 is alternative branch-free version when NLZ is not available.

func CRC32Basic ¶

func CRC32Basic(data []byte) uint32

CRC32Basic computes CRC 32bit checksum one bit a time.

func CRC32TableLookup ¶

func CRC32TableLookup(data []byte) uint32

CRC32TableLookup uses precomputed values.

func CSA ¶

func CSA[T Unsigned](a, b, c T) (h, l T)

CSA is Carry Save Adder

func Cbrt ¶

func Cbrt(x uint32) uint32

Cbrt is cube root, it starts of with shift based Newton algorithm for Sqrt and modifies it to cuber root

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Cbrt(64))
}

Output:

4

func CheckBits ¶

func CheckBits(u uint32) uint32

CheckBits produces 6bits of parity checks. It is simple Hamming Code based error detection and correction scheme.

Example (No_error) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	v := uint32(0xABCD_EF01)
	p := hd.CheckBits(v)
	u, errb := hd.Correct(p, v)
	fmt.Printf("%0x %v", u, errb)
}

Output:

abcdef01 0

func CompareCountOnes ¶

func CompareCountOnes(x, y uint32) int

Example (Left_bigger) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var x uint32 = 0b0000_0100_0100_01111
	var y uint32 = 0b0000_0100_0100_01110
	fmt.Println(hd.CompareCountOnes(x, y))
}

Output:

1

Example (Left_smaller) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var x uint32 = 0b0000_0100_0100_01110
	var y uint32 = 0b0000_0100_0100_01111
	fmt.Println(hd.CompareCountOnes(x, y))
}

Output:

-1

Example (Same) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var x uint32 = 0b0000_0100_0100_01110
	var y uint32 = 0b0000_0100_0100_01110
	fmt.Println(hd.CompareCountOnes(x, y))
}

Output:

0

func Compress ¶

func Compress(x, m uint32) uint32

Compress (Generalized Extract) selects bits from the x where m is 1 and puts them in order to least significant bits. This uses divide and conquer. This executes in 169 instructions on 64bit RISC. There is also hardware focused version that does not work well in general RISC.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	//          x:   abcd_efgh_ijkl_mnop_qrst_uvwx_yzAB_CDEF
	var x uint32 = 0b1010_1011_0101_1110_0001_0101_0000_1000
	var m uint32 = 0b0000_1111_0011_0011_1010_1010_0101_0101
	// 		  exp:   0000_0000_0000_0000_efgh_klop_qsuw_zBDF
	fmt.Printf("%032b", hd.Compress(x, m))
}

Output:

00000000000000001011011000000000

func Compress2 ¶

func Compress2[T Unsigned](x, m T) T

Compress2 is direct version of Compress. This is 260 RISC instructions in loop brach-free.

func Correct ¶

func Correct(pr, u uint32) (cr uint32, errb uint8)

Correct bits in pr that contain 32 "information" bits stored in ur and 7 "check" bits stored in pr. Check bits are extracted and error is corrected in returned result. This can correct up to 1 wrong bit. This is based on Hamming error correcting code. It returns corrected bits and number of error bits detected (0,1,2). TODO: 2 bits error detection is not working.

func CountOnes ¶

func CountOnes(x uint32) uint32

CountOnes (pop) uses divide and conquer to count number set bits. Folklore says counting ones is important for NSA, but nobody knows why. Applications of this function include Hamming Distance.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.CountOnes5(uint32(0b0000_0100_0100_01110)))
}

Output:

5

func CountOnes1 ¶

func CountOnes1(x uint32) uint32

CountOnes1 is optimized version of divide and conquer that executes in 21 instructions and is branch-free.

func CountOnes3 ¶

func CountOnes3(x uint32) uint32

CountOnes3 it executes in 19 RISC instructions, but works well on machines with two addresses.

func CountOnes4 ¶

func CountOnes4[T Unsigned](x T) T

CountOnes4 is very fast for if number of 1 bits is small.

func CountOnes5 ¶

func CountOnes5(x uint32) uint32

CountOnes5 uses uint64 registers. It works only with uint15.

func CountOnesArrayGroup8 ¶

func CountOnesArrayGroup8(A []uint32) uint32

CountOnesArrayGroup8 utilizes CSA for faster count ones computation. It is executing with 3x less instructions than direct version with CountOnes per uint32.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	vs := []uint32{
		0b0000_0100_0100_01110,
		0b0001_0100_0100_01110,
		0b0000_0100_0100_01110,
		0b0011_0100_0100_01110,
		0b0000_0100_0100_01110,
		0b0111_0100_0100_01110,
		0b1001_0100_0100_01110,
		0b0000_0100_0100_01110,
		0b0101_0100_0100_01110,
		0b0101_0100_0100_01110,
	}
	fmt.Println(hd.CountOnesArrayGroup8(vs))
}

Output:

62

func CycleThreeIdentifier ¶

func CycleThreeIdentifier(vs [3]int32, n1, n2 uint8) [2][3]int32

CycleThreeIdentifier constructs identifier byte from bits stored at positions n1 and n2.

func CycleThreeValues ¶

func CycleThreeValues(a, b, c, x int32) int32

CycleThreeValues is an ingenious and very efficient (8 branch-free instructions) way of cycling in three different constants. It requires pre-computing constants that can be done at compile time. It relies on the fact that among three different constants there are always two bit positions where they differ and each time one-odd-out.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var c int32 = 0b10101 // 21
	var a int32 = 0b11111 // 31
	var b int32 = 0b10100 // 20

	out := []int32{a}
	for range 10 {
		out = append(out, hd.CycleThreeValues(a, b, c, out[len(out)-1]))
	}

	fmt.Println(out)
}

Output:

[31 20 21 31 20 21 31 20 21 31 20]

func DifferenceOrZero ¶

func DifferenceOrZero[T Signed](x, y T) T

DifferenceOrZero (doz) returns x - y if x > y else 0. It computes in 7-10 branch-free RISC instructions. Note, smaller DifferenceOrZero version with comparison is skipped as Go lacks comparison returning int.

func DifferenceOrZeroRanges ¶

func DifferenceOrZeroRanges[T int32 | uint32](x, y T) T

DifferenceOrZeroRanges requires signed x and y be in range [-2^30, 2^30-1] and unsigned x and y be in range [0, 2^31-1]

func DifferenceOrZeroUnsigned ¶

func DifferenceOrZeroUnsigned(x, y uint32) uint32

DifferenceOrZeroUnsigned (dozu) computes in 7-10 branch-free RISC instructions. Note, smaller DOZ version with comparison is skipped as Go lacks comparison returning int.

func Div1000ShiftSigned ¶

func Div1000ShiftSigned(n int32) int32

Div1000ShiftSigned (divs1000)

func Div1000ShiftUnsigned ¶

func Div1000ShiftUnsigned(n uint32) uint32

Div1000ShiftUnsigned (div1000) is 23 instructions.

func Div100ShiftSigned ¶

func Div100ShiftSigned(n int32) int32

Div100ShiftSigned (divs100)

func Div100ShiftUnsigned ¶

func Div100ShiftUnsigned(n uint32) uint32

Div100ShiftUnsigned (div100) is 25 instructions.

func Div10ShiftSigned ¶

func Div10ShiftSigned(n int32) int32

Div10ShiftSigned (divs10)

func Div10ShiftUnsigned ¶

func Div10ShiftUnsigned(n uint32) uint32

Div10ShiftUnsigned (div10)

func Div11ShiftSigned ¶

func Div11ShiftSigned(n int32) int32

Div11ShiftSigned (divs11)

func Div11ShiftUnsigned ¶

func Div11ShiftUnsigned(n uint32) uint32

Div11ShiftUnsigned (div11)

func Div12ShiftSigned ¶

func Div12ShiftSigned(n int32) int32

Div12ShiftSigned (divs12)

func Div12ShiftUnsigned ¶

func Div12ShiftUnsigned(n uint32) uint32

Div12ShiftUnsigned (div12)

func Div13ShiftSigned ¶

func Div13ShiftSigned(n int32) int32

Div13ShiftSigned (divs13)

func Div13ShiftUnsigned ¶

func Div13ShiftUnsigned(n uint32) uint32

Div13ShiftUnsigned (div13)

func Div3ShiftSigned ¶

func Div3ShiftSigned(n int32) int32

Div3ShiftSigned (divs3)

func Div3ShiftUnsigned ¶

func Div3ShiftUnsigned(n uint32) uint32

Div3ShiftUnsigned (divu3) uses shifts and adds to approximate through residual. It is 18 instructions.

func Div3Signed ¶

func Div3Signed(n int32) int32

Div3Signed this computes in 6 instructions. This is 9 or 10 cycles. Meanwhile, divide operation can take 20 cycles.

func Div5ShiftSigned ¶

func Div5ShiftSigned(n int32) int32

Div5ShiftSigned (divs5)

func Div5ShiftUnsigned ¶

func Div5ShiftUnsigned(n uint32) uint32

Div5ShiftUnsigned (divu5a)

func Div5Signed ¶

func Div5Signed(n int32) int32

Div5Signed is similar to Div3Signed, but error terms is too large, and thus it needs slight variation of magic constant and correction with shift right.

func Div6ShiftSigned ¶

func Div6ShiftSigned(n int32) int32

Div6ShiftSigned (divs6)

func Div6ShiftUnsigned ¶

func Div6ShiftUnsigned(n uint32) uint32

Div6ShiftUnsigned (divu6a)

func Div7ShiftSigned ¶

func Div7ShiftSigned(n int32) int32

Div7ShiftSigned (divs7)

func Div7ShiftUnsigned ¶

func Div7ShiftUnsigned(n uint32) uint32

Div7ShiftUnsigned (divu7)

func Div7Signed ¶

func Div7Signed(n int32) int32

Div7Signed is similar to Div3Signed, but error terms is too large, and thus it needs slight variation of magic constant and correction with shift right. The magic is still too large, and thus it needs flipping sign and addition.

func Div9ShiftSigned ¶

func Div9ShiftSigned(n int32) int32

Div9ShiftSigned (divs9)

func Div9ShiftUnsigned ¶

func Div9ShiftUnsigned(n uint32) uint32

Div9ShiftUnsigned (div9)

func DivExact ¶

func DivExact[T Integer](n, d T) T

DivExact uses multiplicative inverse that can be computed at compile time. It relies on theorem "if a and m are relatively prime integers, then there exists an integer ā such that a*ā = 1 (mod m)".

func DivExact7 ¶

func DivExact7(n int32) int32

func DivMod3Signed ¶

func DivMod3Signed(n int32) (q, r int32)

func DivMod3Signed2 ¶

func DivMod3Signed2(n int32) (q, r int32)

DivMod3Signed2 calculates reminder first.

func DivMod5Signed ¶

func DivMod5Signed(n int32) (q, r int32)

func DivMod7Signed ¶

func DivMod7Signed(n int32) (q, r int32)

func DivModConstSigned ¶

func DivModConstSigned(n, d int32) (q, r int32)

DivModConstSigned illustrates division by constant. This code should be generated at compile time depending on the value of compile time constant k and result of MagicSigned execution. The basic trick is to multiply by magic number 2**32/d and then extract leftmost 32 bits of the product.

func DivModConstSignedMagic ¶

func DivModConstSignedMagic(d int32) (M, s int32)

DivModConstSignedMagic computes magic number and shift amount for signed integer division. This values can be computed at compile time. Code using big numbers can be simplified, such as in the case of Python or math.Big, it is no listed here.

func DivModConstUnsigned ¶

func DivModConstUnsigned(n, d uint32) (q, r uint32)

func DivModConstUnsignedMagic ¶

func DivModConstUnsignedMagic(d uint32) (M uint32, a, s int32)

DivModConstUnsignedMagic computes magic for unsigned constant multiplication. This values can be computed at compile time. Code using big numbers can be simplified, such as in the case of Python or math.Big, it is no listed here.

func DivModLongUnsigned64b32b ¶

func DivModLongUnsigned64b32b(x uint64, y uint32) (q, r uint32)

DivModLongUnsigned64b32b (divlu) performs long division of 64-bit unsigned integer by 32-bit unsigned integer. This algorithm is slightly modified to store both lower and higher 32 bits of dividend into 64-bit number. This algorithm uses shift-and-subtract operations. It illustrates how hardware is doing such division. It does not work for overflow cases. This executes in 321 to 385 RISC instructions.

func DivModLongUnsigned64b32b2 ¶

func DivModLongUnsigned64b32b2(x uint64, y uint32) (q, r uint32)

DivModLongUnsigned64b32b2 is alternative version based on multiword division. If overflow, it returns maximum quotient and reminder.

func DivModMultiWordUnsigned ¶

func DivModMultiWordUnsigned(q, r, u, v []uint16)

DivModMultiWordUnsigned is Knuth algorithm for integer division. It stores quotient in q and remainder in r.

func DivModSignedPowTwo ¶

func DivModSignedPowTwo(n int32, k int) (q, r int32)

DivModSignedPowTwo takes 7 instructions, 5 instructions for division and 2 instructions for reminder.

func DivModUnsignedPowTwo ¶

func DivModUnsignedPowTwo(n uint32, k int) (q, r uint32)

func DivModUnsignedSeven ¶

func DivModUnsignedSeven(n uint32) (q, r uint32)

func DivModUnsignedThree ¶

func DivModUnsignedThree(n uint32) (q, r uint32)

func DivSignedPowTwo ¶

func DivSignedPowTwo(n int32, k int) int32

DivSignedPowTwo returns n / 2 ** k for 1 <= k < 31. This is illustration, for performance k should be fixed at compile time. This is four branch-free instructions.

func DivSignedPowTwo2 ¶

func DivSignedPowTwo2(n int32, k int) int32

DivSignedPowTwo2 is alternative version that uses branch.

func DivSignedPowTwo_fixed5 ¶

func DivSignedPowTwo_fixed5(n int32) int32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.DivSignedPowTwo_fixed5(96))
}

Output:

3

func DoubleLengthInt32FromUint64 ¶

func DoubleLengthInt32FromUint64(x uint64) [2]uint32

DoubleLengthInt32FromUint64 unpacks two uint32 from single uint64. This is known as Double-Length arithmetics. This can be implemented in five instructions by using only 31 bits and storing carry in most significant bit. Here is only simplified versions.

func Equal ¶

func Equal[T Signed](x, y T) T

Example (Equal) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Equal(10, 10) >> 31)
}

Output:

-1

Example (Not_equal) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Equal(10, 11) >> 31)
}

Output:

0

func Equal2 ¶

func Equal2(x, y int32) int32

func Equal3 ¶

func Equal3(x, y int32) int32

func Equal4 ¶

func Equal4(x, y int32) int32

func Equal5 ¶

func Equal5[T Integer](x, y T) T

func EqualZero ¶

func EqualZero[T Signed](x T) T

func EqualZero2 ¶

func EqualZero2(x int32) int32

func EqualZero3 ¶

func EqualZero3(x int32) int32

func EqualZero4 ¶

func EqualZero4[T Integer](x T) T

func EqualZero5 ¶

func EqualZero5[T Integer](x T) T

func ExchangeBitsInRegister ¶

func ExchangeBitsInRegister[T Unsigned](x, md, mo T, k int) T

ExchangeBitsInRegister swaps two regions of bits within single register without affecting other bits. For example, swapping B and D regions. B and D should be same size, but A C E and B/D can be different sizes. k is number of bits of C + D md is mask for field D mo is mask for fields A C and E (all fields to not swap) [aaaa bbbb cccc dddd eeee] -> [aaaa dddd cccc bbbb eeee] This is straight forward approach and requires eleven instructions and six cycles.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var xx uint32 = 0xABCDEF12
	var mb uint32 = 0x0000F000
	var mo uint32 = 0xF0FF0FFF
	fmt.Printf("%0X", hd.ExchangeBitsInRegister(xx, mb, mo, 4*3))
}

Output:

AECDBF12

func ExchangeBitsInRegisterFast ¶

func ExchangeBitsInRegisterFast[T Unsigned](x, md, mo T, k int) T

ExchangeBitsInRegisterFast requires eight instructions and five cycles.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var xx uint32 = 0xABCDEF12
	var mb uint32 = 0x0000F000
	var mo uint32 = 0xF0FF0FFF
	fmt.Printf("%0X", hd.ExchangeBitsInRegisterFast(xx, mb, mo, 4*3))
}

Output:

AECDBF12

func ExchangeRegisters ¶

func ExchangeRegisters[T Integer](x, y T) (T, T)

ExchangeRegisters illustrates very old trick on how to exchange two registers without using third one. This is swap operation. Also known as multiple assignment in Go.

func ExchangeRegistersMasked ¶

func ExchangeRegistersMasked[T Integer](x, y, m T) (T, T)

ExchangeRegistersMasked illustrates how to exchange only masked bits between two registers. This can be done in 3 cycles, given parallelism and and-not instructions.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var x uint32 = 0xABCDEF12
	var y uint32 = 0x12345678
	var m uint32 = 0x0F0F0F0F
	x, y = hd.ExchangeRegistersMasked(x, y, m)
	fmt.Printf("\n%0X\n%0X", x, y)
}

Output:

A2C4E618
1B3D5F72

Example (Bits) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var x uint32 = 0b11110000111100001111000011110000
	var y uint32 = 0b00001111000011110000111100001111
	var m uint32 = 0b00001111111100000000000000000000

	x, y = hd.ExchangeRegistersMasked(x, y, m)
	fmt.Printf("\n%032b\n%032b", x, y)
}

Output:

11111111000000001111000011110000
00000000111111110000111100001111

func ExchangeRegistersMasked2 ¶

func ExchangeRegistersMasked2[T Integer](x, y, m T) (T, T)

func ExchangeRegistersMasked3 ¶

func ExchangeRegistersMasked3[T Integer](x, y, m T) (T, T)

ExchangeRegistersMasked3 is heavily using bitwise equality, but in Go there is no such operator, so expanding that notation.

func ExchangeRegistersMasked4 ¶

func ExchangeRegistersMasked4[T Integer](x, y, m T) (T, T)

ExchangeRegistersMasked4 also executes in three cycles, given sufficient instruction parallelism in the CPU.

func Expand ¶

func Expand(x, m uint32) uint32

Expand (Generalized Insert, unpack, scatter, deposit) selects bits from the x where m is 1 and puts them in order to least significant bits. This is reverse of Compress steps. This is 168 RISC instructions, or 200 instructions on 64bit RISC. There is also hardware focused version that does not work well in general RISC.

func ExtendSign7 ¶

func ExtendSign7(x uint32) uint32

ExtendSign8 sign-extends a 8-bit number to a 32-bit number. Sign of 8-bit number is stored in 8th-bit. Sign extension is treating n-th least significant bit as sign bit and copying it to all more significant bits. This is usually implemented with shift-left-logical followed by shift-right-arithmetic, but alternative may be useful if you don't have shift.

Example (Extended) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ExtendSign7(0b1110_1010))
}

Output:

11111111111111111111111111101010

Example (NotExtended) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ExtendSign7(0b0010_1010))
}

Output:

00101010

func ExtendSign7Three ¶

func ExtendSign7Three(x uint32) uint32

ExtendSign7Three is alternative version

Example (Extended) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ExtendSign7Three(0b1110_1010))
}

Output:

11111111111111111111111111101010

Example (NotExtended) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ExtendSign7Three(0b0010_1010))
}

Output:

00101010

func ExtendSign7Two ¶

func ExtendSign7Two(x uint32) uint32

ExtendSign7Two is alternative version. If you know all higher order bits are zero, then `and` can be omitted.

Example (Extended) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ExtendSign7Two(0b1110_1010))
}

Output:

11111111111111111111111111101010

Example (NotExtended) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ExtendSign7Two(0b0010_1010))
}

Output:

00101010

func FLPTwo ¶

func FLPTwo(x uint32) uint32

FLPTwo is Floor to nearest Power of Two. Values >= 2^31 will be zero. This formula works for x > 0.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.FLPTwo(310))
}

Output:

256

func FLPTwo2 ¶

func FLPTwo2(x uint32) uint32

FLPTwo2 is alternative version that works for x == 0 as well.

func FLPTwo3 ¶

func FLPTwo3(x uint32) uint32

func FLPTwo4 ¶

func FLPTwo4(x uint32) uint32

FLPTwo4 is alternative branch-free version when NLZ is not available.

func FLPTwo5 ¶

func FLPTwo5(x uint32) uint32

FLPTwo5 uses simple loop that executes in 4 * NLZ(x) + 3 instructions.

func FindFirstStringOnes ¶

func FindFirstStringOnes(x uint32, n int) int

FindFirstStringOnes (FFStr1) finds index of first string of 1s of length n or 32 if none found. This uses divide and conquer and executes in 3 to 36 RISC instructions. This loop can be effectively unrolled.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.FindFirstStringOnes(0b0000_0000_0000_0000_1100_1110_1111_1000, 5))
}

Output:

24

func FindFirstStringOnes1 ¶

func FindFirstStringOnes1(x uint32, n int) int

FindFirstStringOnes1 is is direct algorithm. For worst case this is not good, it is 178 RISC instructions for 0x5555_5555.

func FindInByte ¶

func FindInByte(x uint32, m byte) int

FindInByte m in x illustrates that to find specific byte you need to XOR with repeated value.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.FindInByte(0x12_00_FF_00, 0xFF))
}

Output:

2

func FindInByteEq ¶

func FindInByteEq(x, y uint32) int

FindInByteEq finds index of first byte that x and y are equal.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.FindInByteEq(0x12_00_F9_00, 0x23_01_F9_00))
}

Output:

2

func FirstOneOffDifferentBits ¶

func FirstOneOffDifferentBits[T int32 | uint32](vs [3]T) [3]int8

FirstOneOffDifferentBits computes 1-based position of first least significant bit that is one-off among three values for corresponding value or -1 if there is no such bit.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var a int32 = 0b11111
	var b int32 = 0b10100
	var c int32 = 0b10101
	fmt.Println(hd.FirstOneOffDifferentBits([3]int32{a, b, c}))
}

Output:

[1 0 -1]

func FourUint8FromUint32 ¶

func FourUint8FromUint32(x uint32) [4]uint8

FourUint8FromUint32 unpacks 4 uint8 from single uint32.

func FourUint8ToUint32 ¶

func FourUint8ToUint32(x [4]uint8) uint32

FourUint8ToUint32 packs 4 uint4 into single uint32.

func GrayCodeFromUint ¶

func GrayCodeFromUint[T Unsigned](n T) T

GrayCodeFromUint converts a binary number to Gray code. Gray Code is sequence of binary numbers where each successive differs only by one bit. It is possible to iterate through all 2ⁿ numbers in single Gray code. There are many versions of Gray Codes, here we use most common one, "reflected binary Gray code". Gray codes named after Frank Gray, physicist at Bell Labs, who invented it in 1930s for TV.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	for i := range 10 {
		fmt.Printf("%08b -> %08b\n", i, hd.GrayCodeFromUint(uint64(i)))
	}
}

Output:

00000000 -> 00000000
00000001 -> 00000001
00000010 -> 00000011
00000011 -> 00000010
00000100 -> 00000110
00000101 -> 00000111
00000110 -> 00000101
00000111 -> 00000100
00001000 -> 00001100
00001001 -> 00001101

func GrayCodeToUint16 ¶

func GrayCodeToUint16(g uint16) (b uint16)

func GrayCodeToUint32 ¶

func GrayCodeToUint32(g uint32) (b uint32)

func GrayCodeToUint64 ¶

func GrayCodeToUint64(g uint64) (b uint64)

func HigherEqualZero ¶

func HigherEqualZero[T Signed](x T) T

func HigherZero ¶

func HigherZero[T Signed](x T) T

func HigherZero2 ¶

func HigherZero2[T Signed](x T) T

func HigherZero3 ¶

func HigherZero3[T Signed](x T) T

func ISIGN ¶

func ISIGN[T Signed](x, y T) T

ISIGN is sign-transfer function, as known in FORTRAN.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.ISIGN(10, -100000), hd.ISIGN(-10, 100000))
}

Output:

-10 10

func ISIGN2 ¶

func ISIGN2[T Signed](x, y T) T

func ISIGN3 ¶

func ISIGN3[T Signed](x, y T) T

func ISIGN4 ¶

func ISIGN4[T Signed](x, y T) T

func IncrReversed ¶

func IncrReversed(x uint32) uint32

IncrReversed value is used in Fast Furier Transform (FFT) when computing reversed and incrementing is used in the loop. However, computing reversed takes 29 instructions, and lookup table for large values of is not practical. Thus storing previous reversed value and incrementing reversed is useful. This executes in 5 RISC instructions.

func Int64FromNx16b ¶

func Int64FromNx16b(v []uint16) int64

func IntToNx16b ¶

func IntToNx16b[T int | int32 | int64 | uint | uint32 | uint64](v T) []uint16

Example (Small) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IntToNx16b(5))
}

Output:

[5]

func IsAddOverflow ¶

func IsAddOverflow[T Signed](x, y T) T

IsAddOverflow sets most significant bit if overflow occurs. This version does not use carry bit and is efficient.

Example ¶

package main

import (
	"fmt"
	"math"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IsAddOverflow(int32(math.MaxInt32), 1)>>31, hd.IsAddOverflow(int32(10), 1)>>31)
}

Output:

-1 0

func IsAddOverflow2 ¶

func IsAddOverflow2[T Signed](x, y T) T

func IsAddOverflowUnsigned ¶

func IsAddOverflowUnsigned[T Unsigned](x, y T) T

Example ¶

package main

import (
	"fmt"
	"math"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IsAddOverflowUnsigned(uint32(math.MaxUint32), 1)>>31, hd.IsAddOverflowUnsigned(uint32(10), 1)>>31)
}

Output:

1 0

func IsAddOverflowUnsigned2 ¶

func IsAddOverflowUnsigned2[T Unsigned](x, y T) T

func IsAddOverflowUnsigned3 ¶

func IsAddOverflowUnsigned3[T Unsigned](x, y T) bool

func IsAddOverflowUnsigned4 ¶

func IsAddOverflowUnsigned4[T Unsigned](x, y, sum T) bool

Example ¶

package main

import (
	"fmt"
	"math"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var x uint32 = 1
	fmt.Println(hd.IsAddOverflowUnsigned4(math.MaxUint32, 1, math.MaxUint32+x), hd.IsAddOverflowUnsigned4(10, 1, 10+x))
}

Output:

true false

func IsDivExactSigned ¶

func IsDivExactSigned(n, d int32) bool

IsDivExactSigned is simplified version with abs. it has single branch. however, it is not minimal number of instructions.

func IsDivExactUnsigned ¶

func IsDivExactUnsigned(n, d uint32) bool

IsDivExactUnsigned is similar to odd version but transforms divisor d0 * 2^k and performs rotate right trick. It has single branch.

func IsDivExactUnsignedOdd ¶

func IsDivExactUnsignedOdd(n, d uint32) bool

IsDivExactUnsignedOdd tests if n is multiple of d, for odd d. It has single branch.

func IsDivOverflow ¶

func IsDivOverflow(x, y int32) bool

IsDivOverflow uses around seven instructions with branches, and it is not easy to improve this.

func IsDivOverflowUnsigned ¶

func IsDivOverflowUnsigned[T Unsigned](x, y T) bool

func IsInRange ¶

func IsInRange[T Integer](x, a, b T) bool

IsInRange illustrates how to perform signed integer bounds checking (both ends inclusive) in single comparison. This is useful for array bounds checking. Using uint conversion, to force unsigned comparison on signed integers, given Go lacks unsigned comparison operator on signed integers.

func IsInRangeClosedOpen ¶

func IsInRangeClosedOpen[T Integer](x, a, b T) bool

IsInRangeClosedOpen: a <= x < b

func IsInRangeOpen ¶

func IsInRangeOpen[T Integer](x, a, b T) bool

IsInRangeOpen: a < x < b

func IsInRangeOpen2 ¶

func IsInRangeOpen2[T Integer](x, a, b T) bool

func IsInRangeOpenClosed ¶

func IsInRangeOpenClosed[T Integer](x, a, b T) bool

IsInRangeOpenClosed: a < x <= b

func IsInRangePowerTwo ¶

func IsInRangePowerTwo[T Unsigned](x T, n int) bool

IsInRangePowerTwo: 0 <= x <= 2^n - 1

func IsInRangePowerTwoOffset ¶

func IsInRangePowerTwoOffset[T Unsigned](x, a T, n int) bool

IsInRangePowerTwoOffset: a <= x <= a + 2^n - 1

func IsMostSignificantSet ¶

func IsMostSignificantSet[T int32 | uint32](x T) bool

Example (Int32) ¶

package main

import (
	"fmt"
	"math"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IsMostSignificantSet(int32(-1)), hd.IsMostSignificantSet(int32(1)), hd.IsMostSignificantSet(int32(math.MaxInt32)))
}

Output:

true false false

Example (Uint32) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IsMostSignificantSet(uint32(0xFFFFFFFF)), hd.IsMostSignificantSet(uint32(10)))
}

Output:

true false

func IsMulOverflow ¶

func IsMulOverflow(x, y int32) bool

func IsMulOverflowUnsigned ¶

func IsMulOverflowUnsigned(x, y uint32) bool

func IsMulOverflowUnsignedNLZ ¶

func IsMulOverflowUnsignedNLZ(x, y uint32) bool

IsMulOverflowUnsignedNLZ counts number of leading zeros to detect overflow.

func IsPowerOfTwoBoundaryCrossed ¶

func IsPowerOfTwoBoundaryCrossed[T Unsigned](a, l, b T) bool

IsPowerOfTwoBoundaryCrossed checks if adding l to a crosses power of block of size b. b size has to be power of two. This and versions bellow are five or six RISC instructions. b has to be power of two and likely to be a constant.

Example (Crossed) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IsPowerOfTwoBoundaryCrossed[uint32](100, 300, 256))
}

Output:

true

Example (Not_crossed) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IsPowerOfTwoBoundaryCrossed[uint32](100, 100, 256))
}

Output:

false

func IsPowerOfTwoBoundaryCrossed2 ¶

func IsPowerOfTwoBoundaryCrossed2[T Unsigned](a, l, b T) bool

func IsPowerOfTwoBoundaryCrossed3 ¶

func IsPowerOfTwoBoundaryCrossed3[T Unsigned](a, l, b T) bool

func IsPowerOfTwoBoundaryCrossed4 ¶

func IsPowerOfTwoBoundaryCrossed4[T Unsigned](a, l, b T) bool

func IsSubOverflow ¶

func IsSubOverflow[T Signed](x, y T) T

IsSubOverflow sets most significant bit if overflow occurs. This version does not use carry bit and is efficient.

Example ¶

package main

import (
	"fmt"
	"math"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IsSubOverflow2(int32(math.MinInt32), 1)>>31, hd.IsSubOverflow2(int32(-10), 1)>>31)
}

Output:

-1 0

func IsSubOverflow2 ¶

func IsSubOverflow2[T Signed](x, y T) T

func IsSubOverflowUnsigned ¶

func IsSubOverflowUnsigned[T Unsigned](x, y T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.IsSubOverflowUnsigned(uint32(0), 1)>>31, hd.IsSubOverflowUnsigned(uint32(10), 1)>>31)
}

Output:

1 0

func IsSubOverflowUnsigned2 ¶

func IsSubOverflowUnsigned2[T Unsigned](x, y T) T

func IsSubOverflowUnsigned3 ¶

func IsSubOverflowUnsigned3[T Unsigned](x, y T) bool

func IsSubOverflowUnsigned4 ¶

func IsSubOverflowUnsigned4[T Unsigned](x, y, sub T) bool

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var x uint32 = 1
	fmt.Println(hd.IsSubOverflowUnsigned4(0, 1, 0-x), hd.IsSubOverflowUnsigned4(10, 1, 10-x))
}

Output:

true false

func IsolateRightmostOneBit ¶

func IsolateRightmostOneBit[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.IsolateRightmostOneBit(0b01011000))
}

Output:

00001000

func LeadingZerosEqual ¶

func LeadingZerosEqual[T Unsigned](x, y T) bool

func LeadingZerosLess ¶

func LeadingZerosLess[T Unsigned](x, y T) bool

func LeadingZerosLessOrEqual ¶

func LeadingZerosLessOrEqual[T Unsigned](x, y T) bool

func LeadingZerosUint32 ¶

func LeadingZerosUint32(x uint32) uint8

LeadingZerosUint32 (nlz) is algorithm from Robert Harley. It consists of 14 instructions branch-free. It uses Julius Goryavsky variation for smaller lookup table size. LeadingZerosUint32 has direct relationship of log2 as well, and can be used to compute it directly. Some instruction sets, such as ARM M1 chips, include single assembly instruction for this operation.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.LeadingZerosUint32(255))
}

Output:

24

Example (Long) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.LeadingZerosUint32(0b00111111111111111111111111101010))
}

Output:

2

func LeadingZerosUint32BinarySearch ¶

func LeadingZerosUint32BinarySearch(x uint32) uint8

func LeadingZerosUint64 ¶

func LeadingZerosUint64(x uint64) uint8

func LenLongestStringOnes ¶

func LenLongestStringOnes[T Unsigned](x T) uint8

LenLongestStringOnes (maxstr1) returns longest length of string of ones. Executes in 131 RISC instructions in worst case. TODO: it has optimization to reduce to 39 RISC instructions in worst case.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.LenLongestStringOnes(uint64(0b0000_0000_0000_0000_1100_1110_1111_1000)))
}

Output:

5

func LenShortestStringOnes ¶

func LenShortestStringOnes(x uint32) (p, n uint8)

LenShortestStringOnes (fminstr1) returns shortest length of string of ones.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.LenShortestStringOnes(0xFF0FF0))
}

Output:

8 8

Example (Zero) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.LenShortestStringOnes(0))
}

Output:

32 0

func Less ¶

func Less[T Signed](x, y T) T

func Less2 ¶

func Less2[T Signed](x, y T) T

func Less3 ¶

func Less3[T Signed](x, y T) T

func Less4 ¶

func Less4[T Integer](x, y T) T

Less4 utilizes the fact that x/2 - y/2 never overflows. Stores result in most significant bit. Exactly same formula works for uint32. This is because Go uses signed shift right for int32 and unsigned shift right for uint32. This takes 6 or seven instructions.

func LessOrEqual ¶

func LessOrEqual[T Signed](x, y T) T

func LessOrEqual2 ¶

func LessOrEqual2[T Signed](x, y T) T

func LessOrEqualUnsigned ¶

func LessOrEqualUnsigned[T Unsigned](x, y T) T

func LessOrEqualZero ¶

func LessOrEqualZero[T Signed](x T) T

func LessOrEqualZero2 ¶

func LessOrEqualZero2[T Signed](x T) T

func LessUnsigned ¶

func LessUnsigned[T Unsigned](x, y T) T

func LessUnsigned2 ¶

func LessUnsigned2[T Unsigned](x, y T) T

func LessZero ¶

func LessZero[T Signed](x T) T

func Log10x32 ¶

func Log10x32(x uint32) uint32

Log10x32 (ilog10) utilizes multiple techniques, but at its core it is number of leading zeroes with adjustments. It is 11 branch free RISC instructions.

func Log10x64 ¶

func Log10x64(x uint64) uint64

Log10x64 (ilog10) utilizes multiple techniques, but at its core it is number of leading zeroes with adjustments. It is 11 branch free RISC instructions.

func Log2 ¶

func Log2(x uint32) uint32

Log2 (ilog2)

Example (Four) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Log2(4))
}

Output:

2

Example (Nine) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Log2(9))
}

Output:

3

Example (One) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Log2(4))
}

Output:

2

Example (Zero) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	// This is special condition. There is a case to extend this definition to be valid.
	// However, it is not universally accepted.
	fmt.Println(hd.Log2(0))
}

Output:

4294967295

func Log2x64 ¶

func Log2x64(x uint64) uint64

Log2x64 (ilog2). Interestingly, it is more accurate than float64 based version, since float64 overflows, but this code does not.

func LoopDetectionGosper ¶

func LoopDetectionGosper(f func(int) int, x0 int) (μLower, μUpper, λ int)

LoopDetectionGosper uses R.W.Gosper algorithm to detect start index of a loop and it's period. loop is defined on sequence: X_n+1 = f(X_n); X_0, X_1, ..., X_μ-1, X_μ, ... X_μ+λ. [HAK #132].

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	f := func(v int) int {
		if v < 100 {
			return v + 1
		}
		return 1
	}
	fmt.Println(hd.LoopDetectionGosper(f, 0))
}

Output:

0 63 100

func Max ¶

func Max[T Signed](x, y T) T

func MaxAND ¶

func MaxAND(minx, maxx, miny, maxy uint32) uint32

MaxAND is max(x|y) for x in [minx, maxx] and y in [miny, maxy]. This gives tighter bound than min(maxx, maxy).

func MaxOR ¶

func MaxOR(minx, maxx, miny, maxy uint32) uint32

MaxOR is max(x|y) for x in [minx, maxx] and y in [miny, maxy]. This gives tighter bound than maxx + maxy.

func MaxRanges ¶

func MaxRanges[T int32 | uint32](x, y T) T

MaxRanges requires signed x and y be in range [-2^30, 2^30-1] and unsigned x and y be in range [0, 2^31-1]

func Min ¶

func Min[T Signed](x, y T) T

func MinAND ¶

func MinAND(minx, maxx, miny, maxy uint32) uint32

MinAND is min(x&y) for x in [minx, maxx] and y in [miny, maxy]. This gives tighter bound than 0.

func MinOR ¶

func MinOR(minx, maxx, miny, maxy uint32) uint32

MinOR is min(x|y) for x in [minx, maxx] and y in [miny, maxy]. This gives tighter bound than max(minx, miny).

func MinRanges ¶

func MinRanges[T int32 | uint32](x, y T) T

MinRanges requires signed x and y be in range [-2^30, 2^30-1] and unsigned x and y be in range [0, 2^31-1]

func Mod10Signed ¶

func Mod10Signed(n int32) int32

Mod10Signed (rems10) uses multiplication method.

func Mod10Unsigned ¶

func Mod10Unsigned(n uint32) uint32

Mod10Unsigned (remu10) is using multiplication method similar to Mod3Unsigned5.

func Mod3Signed ¶

func Mod3Signed(n int32) int32

Mod3Signed (rems3) is similar to unsigned version, but remaps output of it differently.

func Mod3Signed2 ¶

func Mod3Signed2(n int32) int32

Mod3Signed2 (rems3) is using multiply method.

func Mod3Unsigned ¶

func Mod3Unsigned(n uint32) uint32

Mod3Unsigned (remu3) is using CountOnes (pop).

func Mod3Unsigned2 ¶

func Mod3Unsigned2(n uint32) uint32

Mod3Unsigned2 (remu3) is using CountOnes (pop) and table lookup.

func Mod3Unsigned3 ¶

func Mod3Unsigned3(n uint32) uint32

Mod3Unsigned3 (remu3) is using digit summing and in-register lookup.

func Mod3Unsigned4 ¶

func Mod3Unsigned4(n uint32) uint32

Mod3Unsigned4 (remu3) is using digit summing and in-memory lookup.

func Mod3Unsigned5 ¶

func Mod3Unsigned5(n uint32) uint32

Mod3Unsigned5 (remu3) is using the fact that n % 3 == floor(4/3 * n) % 3. This code is discovered in trial and error. This uses only shift and multiply operations. It executes in 13 instructions.

func Mod3Unsigned6 ¶

func Mod3Unsigned6(n uint32) uint32

Mod3Unsigned6 (remu3) is same as Mod3Unsigned5 but expands multiplications to shifts.

func Mod5Signed ¶

func Mod5Signed(n int32) int32

Mod5Signed (rems5) is similar to unsigned version, but remaps output of it differently.

func Mod5Signed2 ¶

func Mod5Signed2(n int32) int32

Mod5Signed2 (rems5) is using multiply method.

func Mod5Unsigned ¶

func Mod5Unsigned(n uint32) uint32

Mod5Unsigned (remu5) is using CountOnes (pop) and in-memory lookup.

func Mod5Unsigned2 ¶

func Mod5Unsigned2(n uint32) uint32

Mod5Unsigned2 (remu5) is using multiplication method similar to Mod3Unsigned5. It executes in 11 instructions.

func Mod63Unsigned ¶

func Mod63Unsigned(n uint32) uint32

Mod63Unsigned (remu63) is mysterious code from Joe Keane that works in 12 instructions.

func Mod63Unsigned2 ¶

func Mod63Unsigned2(n uint32) uint32

Mod63Unsigned2 (remu63) is using multiplication method similar to Mod3Unsigned5. This is not as fast as Joe Keane method, unless there is fast multiplication instruction on the machine.

func Mod7Signed ¶

func Mod7Signed(n int32) int32

Mod7Signed (rems7) is similar to unsigned version, but remaps output of it differently.

func Mod7Signed2 ¶

func Mod7Signed2(n int32) int32

Mod7Signed2 (rems7) uses multiplication method.

func Mod7Unsigned ¶

func Mod7Unsigned(n uint32) uint32

Mod7Unsigned (remu7) is using summing method and in-memory lookup.

func Mod7Unsigned2 ¶

func Mod7Unsigned2(n uint32) uint32

Mod75Unsigned2 (remu7) is using multiplication method similar to Mod3Unsigned5.

func Mod9Signed ¶

func Mod9Signed(n int32) int32

Mod9Signed (rems7) is similar to unsigned version, but remaps output of it differently.

func Mod9Signed2 ¶

func Mod9Signed2(n int32) int32

Mod9Signed2 (rems7) uses multiplication method.

func Mod9Unsigned ¶

func Mod9Unsigned(n uint32) uint32

Mod9Unsigned (remu9) is using summing method and in-memory lookup.

func Mod9Unsigned2 ¶

func Mod9Unsigned2(n uint32) uint32

Mod9Unsigned2 (remu9) is using multiplication method similar to Mod3Unsigned5.

func MulMultiWord ¶

func MulMultiWord(w, u, v []uint16)

MulMultiWord (mulmns) multiplies two multiwords word-wise. w = u * v This does not overflow. We are using uint16 and uint32 to avoid overflow in word multiplication. Most important word can be negative when converted to int16. Refer to routines Int64To4x16b and Int64From4x16b for conversion.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	u := []uint16{1, 2, 3, 4}
	v := []uint16{5, 6, 7}
	w := make([]uint16, len(u)+len(v))

	hd.MulMultiWord(w, u, v)

	fmt.Println(w)
}

Output:

[5 16 34 52 45 28 0]

func MultiplicativeInverseEuclid ¶

func MultiplicativeInverseEuclid(d uint32) uint32

MultiplicativeInverseEuclid uses extended Euclidian algorithm.

func MultiplicativeInverseNewton ¶

func MultiplicativeInverseNewton[T Integer](d T) T

MultiplicativeInverseNewton uses Newton method of multiplicative inverse. It follows from the well-known fact that sequence of xn_1 = xn * (2 - d * xn) converges to 1/d (mod d) given good starting xn. This also works with any inverse module of power of 2. Each iteration doubles number of correct bits.

func MultiplyHighOrder32 ¶

func MultiplyHighOrder32[T uint32 | int32](u, v T) T

MultiplyHighOrder32 (mulhs/mulhu) multiplies two integers and returns the high-order half of the product. This executes in 16 RISC instructions. Go has math/bits.Mul32 that returns higher order bits, however it does uint64 cast and works only for uint32. Remarkably, Go math/bits.Mul64 uses same algorithm as this function but uses 32bit words.

func MultiplyHighOrder64 ¶

func MultiplyHighOrder64[T uint64 | int64](u, v T) T

MultiplyHighOrder64 (mulhs/mulhu) multiplies two integers and returns the high-order half of the product. This executes in 16 RISC instructions. Go has math/bits.Mul64 that returns higher order bits, however it works only for uint64. Remarkably, Go math/bits.Mul64 uses same algorithm as this function but uses 32bit words. Algorithm in math/bits.Mul64 is the same, but it performs better because it does not use generics.

func NAbs ¶

func NAbs[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Print(hd.NAbs(-42))
}

Output:

-42

func NAbs2 ¶

func NAbs2[T Signed](x T) T

func NAbs3 ¶

func NAbs3[T Signed](x T) T

func NewCRC32TableLookupTable ¶

func NewCRC32TableLookupTable() [256]uint32

func NextHigherNumberWithSameNumberOfOnes ¶

func NextHigherNumberWithSameNumberOfOnes[T Unsigned](x T) T

NextHigherNumberWithSameNumberOfOnes is useful in bitsets stored in integers to find next set of same cardinality. This function utilizes the theorem of Right-to-Left Computability.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.NextHigherNumberWithSameNumberOfOnes(uint64(0b0011110000)))
}

Output:

100000111

func NotEqual ¶

func NotEqual[T Signed](x, y T) T

func NotEqual2 ¶

func NotEqual2(x, y int32) int32

func NotEqual3 ¶

func NotEqual3[T Integer](x, y T) T

func NotEqualZero ¶

func NotEqualZero[T Signed](x T) T

func NotEqualZero2 ¶

func NotEqualZero2(x int32) int32

func NotEqualZero3 ¶

func NotEqualZero3[T Integer](x T) T

func NotEqualZero4 ¶

func NotEqualZero4[T int32 | uint32](x T) T

func Parity ¶

func Parity(x uint32) uint32

Parity of number of 1-bits in bit-sequence. This executes in 10 instructions. This is example of "parallel prefix" operation, that very efficient for parallel computing.

Example (Even) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Parity(0b0010_1101))
}

Output:

0

Example (Odd) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Parity(0b0000_1101))
}

Output:

1

func Parity2 ¶

func Parity2(x uint32) uint32

Parity2 executes in 9 instructions. It avoids computing higher-order parity bits that will not be used.

func Parity3 ¶

func Parity3(x uint32) uint32

Parity3 computes parity of uint7 and sets 8th bit to 1/0 based on parity. Here we wrap into same signature as other Parity methods for convenience.

func Parity4 ¶

func Parity4(x uint32) uint32

Parity4 computes parity of uint7 and sets 8th bit to 1/0 based on parity, but flipped. Here we wrap into same signature as other Parity methods for convenience.

func Pow ¶

func Pow[T Integer](x T, n uint) T

Pow (iexp) takes x to the power of n. This utilizes binary expression of exponent. For example: x^13, and binary of 13 = 0b1101 = 0b1000 + 0b0100 + 0b0001 = 8 + 4 + 1. Thus, we can compute exponent in log number of multiplications. However, this is not always optimal code (e.g. n=27 there is better decomposition ((x^3)^3)^3)). Note, this function is not resistant to overflows.

func RSqrtFloat32 ¶

func RSqrtFloat32(x float32) float32

RSqrtFloat32 (rsqrt) algorithm is fast floating point approximation of reciprocal square root (1 / sqrt). It does not work +/- Inf, NaN, denormals, negative numbers. This algorithm caused some buzz in early 2000s. It produces result with 3.5% error rate.

func Reverse8Bit ¶

func Reverse8Bit(x uint32) uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.Reverse8Bit(0b1101_0101))
}

Output:

10101011

func ReverseBits ¶

func ReverseBits(x uint32) uint32

ReverseBits (rev) reverses bits in x using divide and conquer.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("0x%X", hd.ReverseBits(0x01234567))
}

Output:

0xE6A2C480

func ReverseBits2 ¶

func ReverseBits2(x uint32) uint32

ReverseBits2 uses Knuth algorithm. It is using 21 RISC instructions.

func ReverseBits64Knuth ¶

func ReverseBits64Knuth(x uint64) uint64

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("0x%X", hd.ReverseBits64Knuth(0x01234567_89ABCDEF))
}

Output:

0xF7B3D1D1E6A2C480

func ReverseByte ¶

func ReverseByte(b byte) byte

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%0b", hd.ReverseByte(0b1101_0101))
}

Output:

10101011

func RotateLeft ¶

func RotateLeft(x uint32, n int) uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%032b", hd.RotateLeft(0b11101111111100001111111111111101, 8))
}

Output:

11110000111111111111110111101111

func RotateRight ¶

func RotateRight(x uint32, n int) uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%032b", hd.RotateRight(0b11101111111100001111111111111101, 8))
}

Output:

11111101111011111111000011111111

func RoundDownBlockPowerOfTwo ¶

func RoundDownBlockPowerOfTwo[T Unsigned](x T, k uint8) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.RoundDownBlockPowerOfTwo[uint](257, 8))
}

Output:

256

func RoundDownBlockPowerOfTwo2 ¶

func RoundDownBlockPowerOfTwo2[T Unsigned](x T, k uint8) T

func RoundUpBlockPowerOfTwo ¶

func RoundUpBlockPowerOfTwo[T Unsigned](x T, k uint8) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.RoundUpBlockPowerOfTwo[uint](210, 8))
}

Output:

256

func RoundUpBlockPowerOfTwo2 ¶

func RoundUpBlockPowerOfTwo2[T Unsigned](x T, k uint8) T

func SetBitLastZero ¶

func SetBitLastZero[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.SetBitLastZero(0b10100111))
}

Output:

00001000

func SetTrailingOnesWithRightMostOne ¶

func SetTrailingOnesWithRightMostOne[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.SetTrailingOnesWithRightMostOne(0b01010111))
}

Output:

00001111

func SetTrailingZeros ¶

func SetTrailingZeros[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.SetTrailingZeros(0b01011000))
}

Output:

00000111

func SetTrailingZeros2 ¶

func SetTrailingZeros2[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.SetTrailingZeros2(0b01011000))
}

Output:

00000111

func SetTrailingZeros3 ¶

func SetTrailingZeros3[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.SetTrailingZeros3(0b01011000))
}

Output:

00000111

func SetTrailingZerosWithRightMostOne ¶

func SetTrailingZerosWithRightMostOne[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.SetTrailingZerosWithRightMostOne(0b01011000))
}

Output:

00001111

func SetupCycleThreeValues ¶

func SetupCycleThreeValues(a, b, c int32) (c1, c2, c3 int32, n1, n2 uint8)

SetupCycleThreeValues is final routine that prepares constants for cycling. This can be computed at compile time.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	var c int32 = 0b10101 // 21
	var a int32 = 0b11111 // 31
	var b int32 = 0b10100 // 20
	fmt.Println(hd.SetupCycleThreeValues(a, b, c))
}

Output:

-11 10 21 1 0

func SetupCycleThreeValuesN1N2 ¶

func SetupCycleThreeValuesN1N2(a, b, c int32) (na, nb, nc int32, n1, n2 uint8)

SetupCycleThreeValuesN1N2 rearranges a,b,c in order required for cycling and defines n1 and n2 positions.

func ShiftLeftDoubleLength ¶

func ShiftLeftDoubleLength(x [2]uint32, n int) [2]uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := [2]uint32{0b1111, 0b1111}
	y := hd.ShiftLeftDoubleLength(x, 3)
	fmt.Printf("%032b_%032b", y[1], y[0])
}

Output:

00000000000000000000000001111000_00000000000000000000000001111000

func ShiftRightSigned32 ¶

func ShiftRightSigned32[T int32 | uint32](x T, v int) T

ShiftRightUnsigned32 does arithmetic shift right that Go is missing. This compiles to single RISC instruction and is going to be inlined.

func ShiftRightSigned64 ¶

func ShiftRightSigned64[T int64 | uint64](x T, v int) T

ShiftRightUnsigned64 does arithmetic shift right that Go is missing. This compiles to single RISC instruction and is going to be inlined.

func ShiftRightSignedDoubleLength ¶

func ShiftRightSignedDoubleLength(x [2]uint32, n int) [2]uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := [2]uint32{0b1111, 0b1111}
	y := hd.ShiftRightSignedDoubleLength(x, 3)
	fmt.Printf("%032b_%032b", y[1], y[0])
}

Output:

00000000000000000000000000000001_11100000000000000000000000000001

Example (Negative) ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := [2]uint32{0b1111, 0b10000000000000000000000000000001}
	y := hd.ShiftRightSignedDoubleLength(x, 3)
	fmt.Printf("%032b_%032b", y[1], y[0])
}

Output:

11110000000000000000000000000000_00100000000000000000000000000001

func ShiftRightSignedFromUnsigned ¶

func ShiftRightSignedFromUnsigned(x uint32, n int) uint32

ShiftRightSignedFromUnsigned computes signed shift from unsigned shift. This compiles to five or six RISC instructions. If n is fixed, then this compiles to three or four RISC instructions.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned(0b11111111111111111111111111101010, 2))
}

Output:

11111111111111111111111111111010

func ShiftRightSignedFromUnsigned2 ¶

func ShiftRightSignedFromUnsigned2(x uint32, n int) uint32

ShiftRightSignedFromUnsigned2 is alternative version. If n is fixed, then this compiles to three or four RISC instructions.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned2(0b11111111111111111111111111101010, 2))
}

Output:

11111111111111111111111111111010

func ShiftRightSignedFromUnsigned3 ¶

func ShiftRightSignedFromUnsigned3(x uint32, n int) uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned3(0b11111111111111111111111111101010, 2))
}

Output:

11111111111111111111111111111010

func ShiftRightSignedFromUnsigned4 ¶

func ShiftRightSignedFromUnsigned4(x uint32, n int) uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned4(0b11111111111111111111111111101010, 2))
}

Output:

11111111111111111111111111111010

func ShiftRightSignedFromUnsigned5 ¶

func ShiftRightSignedFromUnsigned5(x uint32, n int) uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned5(0b11111111111111111111111111101010, 2))
}

Output:

11111111111111111111111111111010

func ShiftRightUnsigned32 ¶

func ShiftRightUnsigned32[T int32 | uint32](x T, v int) T

ShiftRightUnsigned32 does logical shift right that Go is missing. This compiles to single RISC instruction and is going to be inlined.

func ShiftRightUnsigned64 ¶

func ShiftRightUnsigned64[T int64 | uint64](x T, v int) T

ShiftRightUnsigned64 does logical shift right that Go is missing. This compiles to single RISC instruction and is going to be inlined.

func ShiftRightUnsignedDoubleLength ¶

func ShiftRightUnsignedDoubleLength(x [2]uint32, n int) [2]uint32

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := [2]uint32{0b1111, 0b1111}
	y := hd.ShiftRightUnsignedDoubleLength(x, 3)
	fmt.Printf("%032b_%032b", y[1], y[0])
}

Output:

00000000000000000000000000000001_11100000000000000000000000000001

func ShuffleInner ¶

func ShuffleInner(x uint32) uint32

ShuffleInner performs perfect inner shuffle of bits.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	// each symbol is a bit
	// in: abcd_efgh_ijkl_mnop_ABCD_EFGH_IJKL_MNPO
	// out: AaBb_CcDd_EeFf_GgHh_IiJj_KkLl_MmNn_OoPp
	fmt.Printf("%032b", hd.ShuffleInner(0b1111_1111_1111_1111_0000_0000_0000_0000))
}

Output:

01010101010101010101010101010101

func ShuffleOuter ¶

func ShuffleOuter(x uint32) uint32

ShuffleOuter performs perfect outer shuffle of bits. This function is used in cryptography. This is 32 RISC instructions.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	// each symbol is a bit
	// in: abcd_efgh_ijkl_mnop_ABCD_EFGH_IJKL_MNPO
	// out: aAbB_cCdD_eEfF_gGhH_iI_jJ_kKlL_mMnN_oOpP
	fmt.Printf("%032b", hd.ShuffleOuter(0b1111_1111_0000_0000_0000_0000_1111_1111))
}

Output:

10101010101010100101010101010101

func Sign ¶

func Sign(x int32) int32

Sign can be calculated in four RISC instructions with comparison operators. Note, there is five and four instruction formula, but in Go it would not fit, since we need explicit logical right shift and conversion of full signed to unsigned bits. Note, there is three instruction formula using comparison operators, but in Go strong typing booleans not converted to ints, which would require branches and more instructions.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.Sign(-10), hd.Sign(10), hd.Sign(0))
}

Output:

-1 1 0

func SqrtBinarySearch ¶

func SqrtBinarySearch(x uint32) uint32

SqrtBinarySearch (isqrt) is similar to Newton method with estimate, but performs fully estimation.

func SqrtNewton ¶

func SqrtNewton(x uint32) uint32

SqrtNewton (isqrt) converges quadratically on series of better approximations. This version uses lookup table coded in branches that is used for better starting values.

func SqrtShiftAndSubtract ¶

func SqrtShiftAndSubtract(x uint32) uint32

SqrtShiftAndSubtract (isqrt) is similar to some hardware implementation due to its effective use of shift instructions.

func SubDoubleLength ¶

func SubDoubleLength(x [2]uint32, y [2]uint32) [2]uint32

SubDoubleLength of uint64 numbers encoded in two uint32. This takes eight instructions.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := [2]uint32{310, 405}
	y := [2]uint32{100, 200}
	fmt.Println(hd.SubDoubleLength(x, y))
}

Output:

[210 205]

func SubFourUint8 ¶

func SubFourUint8(x, y uint32) uint32

SubFourUint8 executes in eight branch-free instructions

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := hd.FourUint8ToUint32([4]uint8{2, 3, 254, 17})
	y := hd.FourUint8ToUint32([4]uint8{2, 1, 253, 0})

	sum := hd.SubFourUint8(x, y)

	fmt.Println(hd.FourUint8FromUint32(sum))
}

Output:

[0 2 1 17]

func SubTwoUint16 ¶

func SubTwoUint16(x, y uint32) uint32

SubFourUint8 executes in seven branch-free instructions

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	x := hd.TwoUint16ToUint32([2]uint16{65535 - 1, 5})
	y := hd.TwoUint16ToUint32([2]uint16{65535 - 5, 1})

	sum := hd.SubTwoUint16(x, y)

	fmt.Println(hd.TwoUint16FromUint32(sum))
}

Output:

[4 4]

func Syndrome ¶

func Syndrome(pr, u uint32) uint32

Syndrome in Hamming based error checking code

func TrailingZerosUint32 ¶

func TrailingZerosUint32(x uint32) uint8

TrailingZerosUint32 (ntz) uses John Reiser variant of David Seal method. Among applications of TrailingZerosUint32 is R.W.Gosper Loop Detection Algorithm.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.TrailingZerosUint32(0b0000_1101_0000))
}

Output:

4

func TransposeMatrix8bx8b ¶

func TransposeMatrix8bx8b(A, B []byte, m, n int)

TransposeMatrix8bx8b (transpose8) uses divide and conquer method. Each matrix i,j value is encoded in i-th byte and j-th bit of a byte. This is 2x fewer calculating RISC instructions than straightforward masking method. This is 21 calculating RISC instructions. A: m rows * n columns is 1-based coordinates of source submatrix. B: n rows * m columns is 1-based coordinates of destination submatrix.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	A := []byte{
		0b10101011,
		0b01011110,
		0b00010101,
		0b00001000,
		0b00010000,
		0b00110000,
		0b01010000,
		0b10010000,
	}
	B := make([]byte, 8)

	hd.TransposeMatrix8bx8b(A, B, 1, 1)

	for _, q := range B {
		fmt.Printf("%08b\n", q)
	}
}

func TurnOffRightMostBit ¶

func TurnOffRightMostBit[T Signed](x T) T

TurnOffRightMostBit this can be used to test if integer is power of 2

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.TurnOffRightMostBit(0b01011000))
}

Output:

01010000

func TurnOffRightmostOnes ¶

func TurnOffRightmostOnes[T Signed](x T) T

TurnOffRightmostOnes this can be sued to determine if a nonnegative integer is of the for 2^j - 2^k for some j >= k >= 0

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.TurnOffRightmostOnes(0b01011100))
}

Output:

01000000

func TurnOffRightmostOnes2 ¶

func TurnOffRightmostOnes2[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.TurnOffRightmostOnes2(0b01011100))
}

Output:

01000000

func TurnOffTrailingOnes ¶

func TurnOffTrailingOnes[T Signed](x T) T

TurnOffTrailingOnes this can be used to test if integer if of the form 2^n - 1

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.TurnOffTrailingOnes(0b10100111))
}

Output:

10100000

func TurnOnRightMostBit ¶

func TurnOnRightMostBit[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.TurnOnRightMostBit(0b10100111))
}

Output:

10101111

func TurnOnTrailingZeros ¶

func TurnOnTrailingZeros[T Signed](x T) T

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Printf("%08b", hd.TurnOnTrailingZeros(0b10101000))
}

Output:

10101111

func TwoUint16FromUint32 ¶

func TwoUint16FromUint32(x uint32) [2]uint16

TwoUint16FromUint32 unpacks two uint16 from single uint32.

func TwoUint16ToUint32 ¶

func TwoUint16ToUint32(x [2]uint16) uint32

TwoUint16ToUint32 packs two uint16 into single uint32.

func Uint64FromNx16b ¶

func Uint64FromNx16b(v []uint16) uint64

func UnShuffleInner ¶

func UnShuffleInner(x uint32) uint32

UnShuffleInner is reverse of ShuffleInner.

func UnShuffleOuter ¶

func UnShuffleOuter(x uint32) uint32

UnShuffleOuter is reverse of ShuffleOuter.

func ZByteL ¶

func ZByteL(x uint32) int

ZByteL finds index of left most zero byte. This is branch-free code in 11 RISC instructions. This is useful for `strlen` in C strings, which use 0 byte for string termination.

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.ZByteL(0x12_00_FF_00))
}

Output:

1

func ZByteL1 ¶

func ZByteL1(x uint32) int

ZByteL1 is direct algorithm.

func ZByteL64 ¶

func ZByteL64(x uint64) int

Example ¶

package main

import (
	"fmt"

	hd "github.com/nikolaydubina/go-hackers-delight"
)

func main() {
	fmt.Println(hd.ZByteL64(0x11_12_00_00_FF_FF_00_11))
}

Output:

2

Types ¶

type Integer ¶

type Integer interface {
	Signed | Unsigned
}

type LRUCache ¶

type LRUCache struct {
	// contains filtered or unexported fields
}

LRUCache is eight-way set associative cache with least-recently-used replacement policy that uses reference matrix method. The whole structure fits into single 64bit register. Internally, least significant byte of uint64 holds row 0 of reference matrix.

func (*LRUCache) Hit ¶

func (c *LRUCache) Hit(i uint8)

Hit value i as most recently used. This is five or six instructions on 64bit RISC. Values of i should be in [0, 7].

func (*LRUCache) LeastRecentlyUsed ¶

func (c *LRUCache) LeastRecentlyUsed() uint8

type Signed ¶

type Signed interface {
	int | int8 | int16 | int32 | int64
}

type Unsigned ¶

type Unsigned interface {
	uint | uint8 | uint16 | uint32 | uint64
}

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL