Documentation
¶
Overview ¶
Chapter 2: Basics ¶
For better performance, it is advisable to avoid branches in simple functions.
Right-to-Left Computability — a function mapping words to words can be implemented with word-parallel `add`, `substract`, `and`, `or` and `not` instructions if and only if each bit of of the result depends only on the bits at and to the right of each input operand. All these operands themselves depend only on right bits, so any of their composition also depends on the right bits.
It can be shown that in the ordinary addition of binary numbers with each bit independently equally likely to be 0 or 1, a carry occurs at each position with probability about 0.5.
If instruction set includes an instruction for each of 16 Boolean functions of two variables, then any Boolean function of three variables can be implemented with four or fewer instructions (and four variables with seven instructions).
Comparison functions are a couple branch-free instructions that store result in the most significant bit.
Overflow means result of operation is too large or too small to fit into the variable. Many hardware supply bits for overflow, but high-lever languages may not have access to it (rejected Go overflow proposal).
Chapter 8: Multiplication ¶
Any multiplication can be decomposed in a summation of left shifts. For example,
- 13 = 0b1101
- x * 13 = 8x + 4x + x
- t1 := x << 2
- t2 := x << 3
- result := t1 + t2 + x
For every multiplication there are multiple paths possible that utilize registers and shifts. These paths can have fewer or more instructions and registers to accomplish this. In general, it is nontrivial to find minimum number instructions required and which these instructions are. Following theorem gives upper bound: "Multiplication of a variable x by an n-bit constant m, m >= 1, can be accomplished with at most n instructions of the type add, subtract, and shift left by any given amount."
Chapter 9: Integer Division ¶
- Short Division is division of single word by another (e.g. 32bit / 32bit => 32bit). This is typical division operator available in high level languages.
- Long Division is division of multi-word by single word (e.g. 64bit / 32bit => 32bit).
Chapter 10: Integer Division by Constants ¶
On many computers, division is very time consuming and is to be avoided. A value of 20 or more elementary Add instructions is usually needed and execution time is usually large. When the divisor is a constant, the division can be replaced by faster operations.
Certain divisors have repetitive patterns in binary representation which allows to use few (~20) shift, add, subtract instructions to approximate residual and get quotient. It is not trivial how to generate such optimal code.
Chapter 12: Unusual Bases for Number Systems ¶
Standard binary representation of an non-negative integer number is base 2.
- (aₙ, ... a₃ a₂, a₁, a₀) = aₙ ⋅ 2ⁿ + ... + a₃ ⋅ 2³ + a₂ ⋅ 2² + a₁ ⋅ 2¹ + a₀ ⋅ 2⁰
However, there are alternative bases, including complex numbers.
- (aₙ, ... a₃ a₂, a₁, a₀) = aₙ ⋅ -2ⁿ + ... + a₃ ⋅ -2³ + a₂ ⋅ -2² + a₁ ⋅ -2¹ + a₀ ⋅ -2⁰
- (aₙ, ... a₃ a₂, a₁, a₀) = aₙ ⋅ (-1 + 𝑖)ⁿ + ... + a₃ ⋅ (-1 + 𝑖)³ + a₂ ⋅ (-1 + 𝑖)² + a₁ ⋅ (-1 + 𝑖)¹ + a₀ ⋅ (-1 + 𝑖)⁰
- (aₙ, ... a₃ a₂, a₁, a₀) = aₙ ⋅ (-1 - 𝑖)ⁿ + ... + a₃ ⋅ (-1 - 𝑖)³ + a₂ ⋅ (-1 - 𝑖)² + a₁ ⋅ (-1 - 𝑖)¹ + a₀ ⋅ (-1 - 𝑖)⁰
Which base is the most efficient? It can be argued that under standard assumptions of hardware design (e.g. "cost of b-state circuit" is proportional to b) the most efficient base is 𝑒. However, it is not practical. Meanwhile, base 2 is only 5% more costly than base 3 and only 6% more costly than base 𝑒. Which is an interesting result, given base 2 is both convenient to use and is theoretically optimal.
Chapter 18: Primes ¶
It is known, that there is no polynomial f(n) that produces a prime for every n.
Index ¶
- Constants
- Variables
- func Abs[T Signed](x T) T
- func Abs2[T Signed](x T) T
- func Abs3[T Signed](x T) T
- func Abs4[T Signed](x T) T
- func AbsDiff[T Integer](x, y T) T
- func AbsDiff2[T Signed](x, y T) T
- func AbsDiffUnsigned(x, y uint32) uint32
- func AbsFastMul(x int32) int32
- func AddDoubleLength(x [2]uint32, y [2]uint32) [2]uint32
- func AddFourUint8(x, y uint32) uint32
- func AddTwoUint16(x, y uint32) uint32
- func AvgCeil[T Integer](x, y T) T
- func AvgFloor[T Integer](x, y T) T
- func BitSize32(x int32) uint8
- func CLPTwo(x uint32) uint32
- func CLPTwo2(x uint32) uint32
- func CLPTwo3(x uint32) uint32
- func CRC32Basic(data []byte) uint32
- func CRC32TableLookup(data []byte) uint32
- func CSA[T Unsigned](a, b, c T) (h, l T)
- func Cbrt(x uint32) uint32
- func CheckBits(u uint32) uint32
- func CompareCountOnes(x, y uint32) int
- func Compress(x, m uint32) uint32
- func Compress2[T Unsigned](x, m T) T
- func Correct(pr, u uint32) (cr uint32, errb uint8)
- func CountOnes(x uint32) uint32
- func CountOnes1(x uint32) uint32
- func CountOnes3(x uint32) uint32
- func CountOnes4[T Unsigned](x T) T
- func CountOnes5(x uint32) uint32
- func CountOnesArrayGroup8(A []uint32) uint32
- func CycleThreeIdentifier(vs [3]int32, n1, n2 uint8) [2][3]int32
- func CycleThreeValues(a, b, c, x int32) int32
- func DifferenceOrZero[T Signed](x, y T) T
- func DifferenceOrZeroRanges[T int32 | uint32](x, y T) T
- func DifferenceOrZeroUnsigned(x, y uint32) uint32
- func Div1000ShiftSigned(n int32) int32
- func Div1000ShiftUnsigned(n uint32) uint32
- func Div100ShiftSigned(n int32) int32
- func Div100ShiftUnsigned(n uint32) uint32
- func Div10ShiftSigned(n int32) int32
- func Div10ShiftUnsigned(n uint32) uint32
- func Div11ShiftSigned(n int32) int32
- func Div11ShiftUnsigned(n uint32) uint32
- func Div12ShiftSigned(n int32) int32
- func Div12ShiftUnsigned(n uint32) uint32
- func Div13ShiftSigned(n int32) int32
- func Div13ShiftUnsigned(n uint32) uint32
- func Div3ShiftSigned(n int32) int32
- func Div3ShiftUnsigned(n uint32) uint32
- func Div3Signed(n int32) int32
- func Div5ShiftSigned(n int32) int32
- func Div5ShiftUnsigned(n uint32) uint32
- func Div5Signed(n int32) int32
- func Div6ShiftSigned(n int32) int32
- func Div6ShiftUnsigned(n uint32) uint32
- func Div7ShiftSigned(n int32) int32
- func Div7ShiftUnsigned(n uint32) uint32
- func Div7Signed(n int32) int32
- func Div9ShiftSigned(n int32) int32
- func Div9ShiftUnsigned(n uint32) uint32
- func DivExact[T Integer](n, d T) T
- func DivExact7(n int32) int32
- func DivMod3Signed(n int32) (q, r int32)
- func DivMod3Signed2(n int32) (q, r int32)
- func DivMod5Signed(n int32) (q, r int32)
- func DivMod7Signed(n int32) (q, r int32)
- func DivModConstSigned(n, d int32) (q, r int32)
- func DivModConstSignedMagic(d int32) (M, s int32)
- func DivModConstUnsigned(n, d uint32) (q, r uint32)
- func DivModConstUnsignedMagic(d uint32) (M uint32, a, s int32)
- func DivModLongUnsigned64b32b(x uint64, y uint32) (q, r uint32)
- func DivModLongUnsigned64b32b2(x uint64, y uint32) (q, r uint32)
- func DivModMultiWordUnsigned(q, r, u, v []uint16)
- func DivModSignedPowTwo(n int32, k int) (q, r int32)
- func DivModUnsignedPowTwo(n uint32, k int) (q, r uint32)
- func DivModUnsignedSeven(n uint32) (q, r uint32)
- func DivModUnsignedThree(n uint32) (q, r uint32)
- func DivSignedPowTwo(n int32, k int) int32
- func DivSignedPowTwo2(n int32, k int) int32
- func DivSignedPowTwo_fixed5(n int32) int32
- func DoubleLengthInt32FromUint64(x uint64) [2]uint32
- func Equal[T Signed](x, y T) T
- func Equal2(x, y int32) int32
- func Equal3(x, y int32) int32
- func Equal4(x, y int32) int32
- func Equal5[T Integer](x, y T) T
- func EqualZero[T Signed](x T) T
- func EqualZero2(x int32) int32
- func EqualZero3(x int32) int32
- func EqualZero4[T Integer](x T) T
- func EqualZero5[T Integer](x T) T
- func ExchangeBitsInRegister[T Unsigned](x, md, mo T, k int) T
- func ExchangeBitsInRegisterFast[T Unsigned](x, md, mo T, k int) T
- func ExchangeRegisters[T Integer](x, y T) (T, T)
- func ExchangeRegistersMasked[T Integer](x, y, m T) (T, T)
- func ExchangeRegistersMasked2[T Integer](x, y, m T) (T, T)
- func ExchangeRegistersMasked3[T Integer](x, y, m T) (T, T)
- func ExchangeRegistersMasked4[T Integer](x, y, m T) (T, T)
- func Expand(x, m uint32) uint32
- func ExtendSign7(x uint32) uint32
- func ExtendSign7Three(x uint32) uint32
- func ExtendSign7Two(x uint32) uint32
- func FLPTwo(x uint32) uint32
- func FLPTwo2(x uint32) uint32
- func FLPTwo3(x uint32) uint32
- func FLPTwo4(x uint32) uint32
- func FLPTwo5(x uint32) uint32
- func FindFirstStringOnes(x uint32, n int) int
- func FindFirstStringOnes1(x uint32, n int) int
- func FindInByte(x uint32, m byte) int
- func FindInByteEq(x, y uint32) int
- func FirstOneOffDifferentBits[T int32 | uint32](vs [3]T) [3]int8
- func FourUint8FromUint32(x uint32) [4]uint8
- func FourUint8ToUint32(x [4]uint8) uint32
- func GrayCodeFromUint[T Unsigned](n T) T
- func GrayCodeToUint16(g uint16) (b uint16)
- func GrayCodeToUint32(g uint32) (b uint32)
- func GrayCodeToUint64(g uint64) (b uint64)
- func HigherEqualZero[T Signed](x T) T
- func HigherZero[T Signed](x T) T
- func HigherZero2[T Signed](x T) T
- func HigherZero3[T Signed](x T) T
- func ISIGN[T Signed](x, y T) T
- func ISIGN2[T Signed](x, y T) T
- func ISIGN3[T Signed](x, y T) T
- func ISIGN4[T Signed](x, y T) T
- func IncrReversed(x uint32) uint32
- func Int64FromNx16b(v []uint16) int64
- func IntToNx16b[T int | int32 | int64 | uint | uint32 | uint64](v T) []uint16
- func IsAddOverflow[T Signed](x, y T) T
- func IsAddOverflow2[T Signed](x, y T) T
- func IsAddOverflowUnsigned[T Unsigned](x, y T) T
- func IsAddOverflowUnsigned2[T Unsigned](x, y T) T
- func IsAddOverflowUnsigned3[T Unsigned](x, y T) bool
- func IsAddOverflowUnsigned4[T Unsigned](x, y, sum T) bool
- func IsDivExactSigned(n, d int32) bool
- func IsDivExactUnsigned(n, d uint32) bool
- func IsDivExactUnsignedOdd(n, d uint32) bool
- func IsDivOverflow(x, y int32) bool
- func IsDivOverflowUnsigned[T Unsigned](x, y T) bool
- func IsInRange[T Integer](x, a, b T) bool
- func IsInRangeClosedOpen[T Integer](x, a, b T) bool
- func IsInRangeOpen[T Integer](x, a, b T) bool
- func IsInRangeOpen2[T Integer](x, a, b T) bool
- func IsInRangeOpenClosed[T Integer](x, a, b T) bool
- func IsInRangePowerTwo[T Unsigned](x T, n int) bool
- func IsInRangePowerTwoOffset[T Unsigned](x, a T, n int) bool
- func IsMostSignificantSet[T int32 | uint32](x T) bool
- func IsMulOverflow(x, y int32) bool
- func IsMulOverflowUnsigned(x, y uint32) bool
- func IsMulOverflowUnsignedNLZ(x, y uint32) bool
- func IsPowerOfTwoBoundaryCrossed[T Unsigned](a, l, b T) bool
- func IsPowerOfTwoBoundaryCrossed2[T Unsigned](a, l, b T) bool
- func IsPowerOfTwoBoundaryCrossed3[T Unsigned](a, l, b T) bool
- func IsPowerOfTwoBoundaryCrossed4[T Unsigned](a, l, b T) bool
- func IsSubOverflow[T Signed](x, y T) T
- func IsSubOverflow2[T Signed](x, y T) T
- func IsSubOverflowUnsigned[T Unsigned](x, y T) T
- func IsSubOverflowUnsigned2[T Unsigned](x, y T) T
- func IsSubOverflowUnsigned3[T Unsigned](x, y T) bool
- func IsSubOverflowUnsigned4[T Unsigned](x, y, sub T) bool
- func IsolateRightmostOneBit[T Signed](x T) T
- func LeadingZerosEqual[T Unsigned](x, y T) bool
- func LeadingZerosLess[T Unsigned](x, y T) bool
- func LeadingZerosLessOrEqual[T Unsigned](x, y T) bool
- func LeadingZerosUint32(x uint32) uint8
- func LeadingZerosUint32BinarySearch(x uint32) uint8
- func LeadingZerosUint64(x uint64) uint8
- func LenLongestStringOnes[T Unsigned](x T) uint8
- func LenShortestStringOnes(x uint32) (p, n uint8)
- func Less[T Signed](x, y T) T
- func Less2[T Signed](x, y T) T
- func Less3[T Signed](x, y T) T
- func Less4[T Integer](x, y T) T
- func LessOrEqual[T Signed](x, y T) T
- func LessOrEqual2[T Signed](x, y T) T
- func LessOrEqualUnsigned[T Unsigned](x, y T) T
- func LessOrEqualZero[T Signed](x T) T
- func LessOrEqualZero2[T Signed](x T) T
- func LessUnsigned[T Unsigned](x, y T) T
- func LessUnsigned2[T Unsigned](x, y T) T
- func LessZero[T Signed](x T) T
- func Log10x32(x uint32) uint32
- func Log10x64(x uint64) uint64
- func Log2(x uint32) uint32
- func Log2x64(x uint64) uint64
- func LoopDetectionGosper(f func(int) int, x0 int) (μLower, μUpper, λ int)
- func Max[T Signed](x, y T) T
- func MaxAND(minx, maxx, miny, maxy uint32) uint32
- func MaxOR(minx, maxx, miny, maxy uint32) uint32
- func MaxRanges[T int32 | uint32](x, y T) T
- func Min[T Signed](x, y T) T
- func MinAND(minx, maxx, miny, maxy uint32) uint32
- func MinOR(minx, maxx, miny, maxy uint32) uint32
- func MinRanges[T int32 | uint32](x, y T) T
- func Mod10Signed(n int32) int32
- func Mod10Unsigned(n uint32) uint32
- func Mod3Signed(n int32) int32
- func Mod3Signed2(n int32) int32
- func Mod3Unsigned(n uint32) uint32
- func Mod3Unsigned2(n uint32) uint32
- func Mod3Unsigned3(n uint32) uint32
- func Mod3Unsigned4(n uint32) uint32
- func Mod3Unsigned5(n uint32) uint32
- func Mod3Unsigned6(n uint32) uint32
- func Mod5Signed(n int32) int32
- func Mod5Signed2(n int32) int32
- func Mod5Unsigned(n uint32) uint32
- func Mod5Unsigned2(n uint32) uint32
- func Mod63Unsigned(n uint32) uint32
- func Mod63Unsigned2(n uint32) uint32
- func Mod7Signed(n int32) int32
- func Mod7Signed2(n int32) int32
- func Mod7Unsigned(n uint32) uint32
- func Mod7Unsigned2(n uint32) uint32
- func Mod9Signed(n int32) int32
- func Mod9Signed2(n int32) int32
- func Mod9Unsigned(n uint32) uint32
- func Mod9Unsigned2(n uint32) uint32
- func MulMultiWord(w, u, v []uint16)
- func MultiplicativeInverseEuclid(d uint32) uint32
- func MultiplicativeInverseNewton[T Integer](d T) T
- func MultiplyHighOrder32[T uint32 | int32](u, v T) T
- func MultiplyHighOrder64[T uint64 | int64](u, v T) T
- func NAbs[T Signed](x T) T
- func NAbs2[T Signed](x T) T
- func NAbs3[T Signed](x T) T
- func NewCRC32TableLookupTable() [256]uint32
- func NextHigherNumberWithSameNumberOfOnes[T Unsigned](x T) T
- func NotEqual[T Signed](x, y T) T
- func NotEqual2(x, y int32) int32
- func NotEqual3[T Integer](x, y T) T
- func NotEqualZero[T Signed](x T) T
- func NotEqualZero2(x int32) int32
- func NotEqualZero3[T Integer](x T) T
- func NotEqualZero4[T int32 | uint32](x T) T
- func Parity(x uint32) uint32
- func Parity2(x uint32) uint32
- func Parity3(x uint32) uint32
- func Parity4(x uint32) uint32
- func Pow[T Integer](x T, n uint) T
- func RSqrtFloat32(x float32) float32
- func Reverse8Bit(x uint32) uint32
- func ReverseBits(x uint32) uint32
- func ReverseBits2(x uint32) uint32
- func ReverseBits64Knuth(x uint64) uint64
- func ReverseByte(b byte) byte
- func RotateLeft(x uint32, n int) uint32
- func RotateRight(x uint32, n int) uint32
- func RoundDownBlockPowerOfTwo[T Unsigned](x T, k uint8) T
- func RoundDownBlockPowerOfTwo2[T Unsigned](x T, k uint8) T
- func RoundUpBlockPowerOfTwo[T Unsigned](x T, k uint8) T
- func RoundUpBlockPowerOfTwo2[T Unsigned](x T, k uint8) T
- func SetBitLastZero[T Signed](x T) T
- func SetTrailingOnesWithRightMostOne[T Signed](x T) T
- func SetTrailingZeros[T Signed](x T) T
- func SetTrailingZeros2[T Signed](x T) T
- func SetTrailingZeros3[T Signed](x T) T
- func SetTrailingZerosWithRightMostOne[T Signed](x T) T
- func SetupCycleThreeValues(a, b, c int32) (c1, c2, c3 int32, n1, n2 uint8)
- func SetupCycleThreeValuesN1N2(a, b, c int32) (na, nb, nc int32, n1, n2 uint8)
- func ShiftLeftDoubleLength(x [2]uint32, n int) [2]uint32
- func ShiftRightSigned32[T int32 | uint32](x T, v int) T
- func ShiftRightSigned64[T int64 | uint64](x T, v int) T
- func ShiftRightSignedDoubleLength(x [2]uint32, n int) [2]uint32
- func ShiftRightSignedFromUnsigned(x uint32, n int) uint32
- func ShiftRightSignedFromUnsigned2(x uint32, n int) uint32
- func ShiftRightSignedFromUnsigned3(x uint32, n int) uint32
- func ShiftRightSignedFromUnsigned4(x uint32, n int) uint32
- func ShiftRightSignedFromUnsigned5(x uint32, n int) uint32
- func ShiftRightUnsigned32[T int32 | uint32](x T, v int) T
- func ShiftRightUnsigned64[T int64 | uint64](x T, v int) T
- func ShiftRightUnsignedDoubleLength(x [2]uint32, n int) [2]uint32
- func ShuffleInner(x uint32) uint32
- func ShuffleOuter(x uint32) uint32
- func Sign(x int32) int32
- func SqrtBinarySearch(x uint32) uint32
- func SqrtNewton(x uint32) uint32
- func SqrtShiftAndSubtract(x uint32) uint32
- func SubDoubleLength(x [2]uint32, y [2]uint32) [2]uint32
- func SubFourUint8(x, y uint32) uint32
- func SubTwoUint16(x, y uint32) uint32
- func Syndrome(pr, u uint32) uint32
- func TrailingZerosUint32(x uint32) uint8
- func TransposeMatrix8bx8b(A, B []byte, m, n int)
- func TurnOffRightMostBit[T Signed](x T) T
- func TurnOffRightmostOnes[T Signed](x T) T
- func TurnOffRightmostOnes2[T Signed](x T) T
- func TurnOffTrailingOnes[T Signed](x T) T
- func TurnOnRightMostBit[T Signed](x T) T
- func TurnOnTrailingZeros[T Signed](x T) T
- func TwoUint16FromUint32(x uint32) [2]uint16
- func TwoUint16ToUint32(x [2]uint16) uint32
- func Uint64FromNx16b(v []uint16) uint64
- func UnShuffleInner(x uint32) uint32
- func UnShuffleOuter(x uint32) uint32
- func ZByteL(x uint32) int
- func ZByteL1(x uint32) int
- func ZByteL64(x uint64) int
- type Integer
- type LRUCache
- type Signed
- type Unsigned
Examples ¶
- Abs
- AbsDiff
- AddDoubleLength
- AddFourUint8
- AddTwoUint16
- AvgCeil
- AvgFloor
- BitSize32 (Bit32)
- BitSize32 (Bit6)
- BitSize32 (Bit7)
- BitSize32 (Zero)
- CLPTwo
- Cbrt
- CheckBits (No_error)
- CompareCountOnes (Left_bigger)
- CompareCountOnes (Left_smaller)
- CompareCountOnes (Same)
- Compress
- CountOnes
- CountOnesArrayGroup8
- CycleThreeValues
- DivSignedPowTwo_fixed5
- Equal (Equal)
- Equal (Not_equal)
- ExchangeBitsInRegister
- ExchangeBitsInRegisterFast
- ExchangeRegistersMasked
- ExchangeRegistersMasked (Bits)
- ExtendSign7 (Extended)
- ExtendSign7 (NotExtended)
- ExtendSign7Three (Extended)
- ExtendSign7Three (NotExtended)
- ExtendSign7Two (Extended)
- ExtendSign7Two (NotExtended)
- FLPTwo
- FindFirstStringOnes
- FindInByte
- FindInByteEq
- FirstOneOffDifferentBits
- GrayCodeFromUint
- ISIGN
- IntToNx16b (Small)
- IsAddOverflow
- IsAddOverflowUnsigned
- IsAddOverflowUnsigned4
- IsMostSignificantSet (Int32)
- IsMostSignificantSet (Uint32)
- IsPowerOfTwoBoundaryCrossed (Crossed)
- IsPowerOfTwoBoundaryCrossed (Not_crossed)
- IsSubOverflow
- IsSubOverflowUnsigned
- IsSubOverflowUnsigned4
- IsolateRightmostOneBit
- LeadingZerosUint32
- LeadingZerosUint32 (Long)
- LenLongestStringOnes
- LenShortestStringOnes
- LenShortestStringOnes (Zero)
- Log2 (Four)
- Log2 (Nine)
- Log2 (One)
- Log2 (Zero)
- LoopDetectionGosper
- MulMultiWord
- NAbs
- NextHigherNumberWithSameNumberOfOnes
- Parity (Even)
- Parity (Odd)
- Reverse8Bit
- ReverseBits
- ReverseBits64Knuth
- ReverseByte
- RotateLeft
- RotateRight
- RoundDownBlockPowerOfTwo
- RoundUpBlockPowerOfTwo
- SetBitLastZero
- SetTrailingOnesWithRightMostOne
- SetTrailingZeros
- SetTrailingZeros2
- SetTrailingZeros3
- SetTrailingZerosWithRightMostOne
- SetupCycleThreeValues
- ShiftLeftDoubleLength
- ShiftRightSignedDoubleLength
- ShiftRightSignedDoubleLength (Negative)
- ShiftRightSignedFromUnsigned
- ShiftRightSignedFromUnsigned2
- ShiftRightSignedFromUnsigned3
- ShiftRightSignedFromUnsigned4
- ShiftRightSignedFromUnsigned5
- ShiftRightUnsignedDoubleLength
- ShuffleInner
- ShuffleOuter
- Sign
- SubDoubleLength
- SubFourUint8
- SubTwoUint16
- TrailingZerosUint32
- TransposeMatrix8bx8b
- TurnOffRightMostBit
- TurnOffRightmostOnes
- TurnOffRightmostOnes2
- TurnOffTrailingOnes
- TurnOnRightMostBit
- TurnOnTrailingZeros
- ZByteL
- ZByteL64
Constants ¶
const SqrtFloat32ErrorRate float32 = 0.035
SqrtFloat32ErrorRate is error rate (abs(true_value - result) / result) that RSqrtFloat32 produces.
Variables ¶
var CRC32Generator uint32 = 0x4C11DB7
CRC32Generator is Cycle Redundancy Check generator polynomial. Each bit i corresponds to the 1/0 of coefficient at x^i in polynomial. Among others, it is used in following standards: ISO 3309 (HDLC), ANSI X3.66 (ADCCP), ISO/IEC/IEEE 802-3 (Ethernet), MPEG-2, PKZIP, Gzip, Bzip2, POSIX cksum, PNG.
var CRC32GeneratorReversed uint32 = 0xEDB88320
var InterestingPrimes = [...]uint{
(1 << 32) - 5,
}
var PowerOfTwo = [...]uint{
1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192,
16384, 32768, 65536, 131072, 262144, 524288, 1048576, 2097152,
4194304, 8388608, 16777216, 33554432, 67108864, 134217728, 268435456,
536870912, 1073741824, 2147483648, 4294967296, 8589934592,
17179869184, 34359738368, 68719476736, 137438953472, 274877906944,
549755813888, 1099511627776, 2199023255552, 4398046511104, 8796093022208,
17592186044416, 35184372088832, 70368744177664, 140737488355328,
281474976710656, 562949953421312, 1125899906842624, 2251799813685248,
4503599627370496, 9007199254740992, 18014398509481984, 36028797018963968,
72057594037927936, 144115188075855872, 288230376151711744, 576460752303423488,
1152921504606846976, 2305843009213693952, 4611686018427387904, 9223372036854775808,
}
PowerOfTwo that fit into uint64.
Functions ¶
func Abs ¶
func Abs[T Signed](x T) T
Abs can be computed in three or four branch-free instructions.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Print(hd.Abs(-42)) }
Output: 42
func AbsDiff ¶
func AbsDiff[T Integer](x, y T) T
AbsDiff is absolute difference that does not overflow.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Print(hd.AbsDiff(1, 100)) }
Output: 99
func AbsDiffUnsigned ¶
func AbsFastMul ¶
AbsFastMul when you have fast instruction for +1/-1 multiplication.
func AddDoubleLength ¶
AddDoubleLength of uint64 numbers encoded in two uint32. This takes nine instructions.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := [2]uint32{310, 410} y := [2]uint32{100, 200} fmt.Println(hd.AddDoubleLength(x, y)) }
Output: [410 610]
func AddFourUint8 ¶
AddFourUint8 executes in eight branch-free instructions
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := hd.FourUint8ToUint32([4]uint8{1, 2, 254, 4}) y := hd.FourUint8ToUint32([4]uint8{2, 3, 1, 5}) sum := hd.AddFourUint8(x, y) fmt.Println(hd.FourUint8FromUint32(sum)) }
Output: [3 5 255 9]
func AddTwoUint16 ¶
AddFourUint8 executes in seven branch-free instructions
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := hd.TwoUint16ToUint32([2]uint16{65535 - 1, 4}) y := hd.TwoUint16ToUint32([2]uint16{1, 5}) sum := hd.AddTwoUint16(x, y) fmt.Println(hd.TwoUint16FromUint32(sum)) }
Output: [65535 9]
func AvgCeil ¶
func AvgCeil[T Integer](x, y T) T
AvgCeil does not cause overflow. Rounded up. In Go, when left operand of shift operator is signed int, then arithmetic shift is performed. Arithmetic shift is same thing as signed shift. Since in Go right shift is unsigned, then formula for int32 is the same as for uint32.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Print(hd.AvgCeil[int32](-101, -200)) }
Output: -150
func AvgFloor ¶
func AvgFloor[T Integer](x, y T) T
AvgFloor does not cause overflow. Rounded down. This is same as unsigned version, but uses signed-shift right. In Go, when left operand of shift operator is signed int, then arithmetic shift is performed. Since in Go right shift is unsigned, then formula for int32 is the same as for uint32.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Print(hd.AvgFloor[int32](-101, -200)) }
Output: -151
func BitSize32 ¶
BitSize32 returns minimum number of bits requires to represent number in two's complement signed number.
Example (Bit32) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.BitSize32(0xFFFF_FFFF >> 1)) }
Output: 32
Example (Bit6) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.BitSize32(0b0000_1101)) }
Output: 5
Example (Bit7) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.BitSize32(0b0011_1101)) }
Output: 7
Example (Zero) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.BitSize32(0)) }
Output: 0
func CLPTwo ¶
CLPTwo is Ceil to nearest Power of Two. Values >= 2^31 will be zero.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.CLPTwo(200)) }
Output: 256
func CRC32Basic ¶
CRC32Basic computes CRC 32bit checksum one bit a time.
func CRC32TableLookup ¶
CRC32TableLookup uses precomputed values.
func Cbrt ¶
Cbrt is cube root, it starts of with shift based Newton algorithm for Sqrt and modifies it to cuber root
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Cbrt(64)) }
Output: 4
func CheckBits ¶
CheckBits produces 6bits of parity checks. It is simple Hamming Code based error detection and correction scheme.
Example (No_error) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { v := uint32(0xABCD_EF01) p := hd.CheckBits(v) u, errb := hd.Correct(p, v) fmt.Printf("%0x %v", u, errb) }
Output: abcdef01 0
func CompareCountOnes ¶
Example (Left_bigger) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var x uint32 = 0b0000_0100_0100_01111 var y uint32 = 0b0000_0100_0100_01110 fmt.Println(hd.CompareCountOnes(x, y)) }
Output: 1
Example (Left_smaller) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var x uint32 = 0b0000_0100_0100_01110 var y uint32 = 0b0000_0100_0100_01111 fmt.Println(hd.CompareCountOnes(x, y)) }
Output: -1
Example (Same) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var x uint32 = 0b0000_0100_0100_01110 var y uint32 = 0b0000_0100_0100_01110 fmt.Println(hd.CompareCountOnes(x, y)) }
Output: 0
func Compress ¶
Compress (Generalized Extract) selects bits from the x where m is 1 and puts them in order to least significant bits. This uses divide and conquer. This executes in 169 instructions on 64bit RISC. There is also hardware focused version that does not work well in general RISC.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { // x: abcd_efgh_ijkl_mnop_qrst_uvwx_yzAB_CDEF var x uint32 = 0b1010_1011_0101_1110_0001_0101_0000_1000 var m uint32 = 0b0000_1111_0011_0011_1010_1010_0101_0101 // exp: 0000_0000_0000_0000_efgh_klop_qsuw_zBDF fmt.Printf("%032b", hd.Compress(x, m)) }
Output: 00000000000000001011011000000000
func Compress2 ¶
func Compress2[T Unsigned](x, m T) T
Compress2 is direct version of Compress. This is 260 RISC instructions in loop brach-free.
func Correct ¶
Correct bits in pr that contain 32 "information" bits stored in ur and 7 "check" bits stored in pr. Check bits are extracted and error is corrected in returned result. This can correct up to 1 wrong bit. This is based on Hamming error correcting code. It returns corrected bits and number of error bits detected (0,1,2). TODO: 2 bits error detection is not working.
func CountOnes ¶
CountOnes (pop) uses divide and conquer to count number set bits. Folklore says counting ones is important for NSA, but nobody knows why. Applications of this function include Hamming Distance.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.CountOnes5(uint32(0b0000_0100_0100_01110))) }
Output: 5
func CountOnes1 ¶
CountOnes1 is optimized version of divide and conquer that executes in 21 instructions and is branch-free.
func CountOnes3 ¶
CountOnes3 it executes in 19 RISC instructions, but works well on machines with two addresses.
func CountOnes4 ¶
func CountOnes4[T Unsigned](x T) T
CountOnes4 is very fast for if number of 1 bits is small.
func CountOnes5 ¶
CountOnes5 uses uint64 registers. It works only with uint15.
func CountOnesArrayGroup8 ¶
CountOnesArrayGroup8 utilizes CSA for faster count ones computation. It is executing with 3x less instructions than direct version with CountOnes per uint32.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { vs := []uint32{ 0b0000_0100_0100_01110, 0b0001_0100_0100_01110, 0b0000_0100_0100_01110, 0b0011_0100_0100_01110, 0b0000_0100_0100_01110, 0b0111_0100_0100_01110, 0b1001_0100_0100_01110, 0b0000_0100_0100_01110, 0b0101_0100_0100_01110, 0b0101_0100_0100_01110, } fmt.Println(hd.CountOnesArrayGroup8(vs)) }
Output: 62
func CycleThreeIdentifier ¶
CycleThreeIdentifier constructs identifier byte from bits stored at positions n1 and n2.
func CycleThreeValues ¶
CycleThreeValues is an ingenious and very efficient (8 branch-free instructions) way of cycling in three different constants. It requires pre-computing constants that can be done at compile time. It relies on the fact that among three different constants there are always two bit positions where they differ and each time one-odd-out.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var c int32 = 0b10101 // 21 var a int32 = 0b11111 // 31 var b int32 = 0b10100 // 20 out := []int32{a} for range 10 { out = append(out, hd.CycleThreeValues(a, b, c, out[len(out)-1])) } fmt.Println(out) }
Output: [31 20 21 31 20 21 31 20 21 31 20]
func DifferenceOrZero ¶
func DifferenceOrZero[T Signed](x, y T) T
DifferenceOrZero (doz) returns x - y if x > y else 0. It computes in 7-10 branch-free RISC instructions. Note, smaller DifferenceOrZero version with comparison is skipped as Go lacks comparison returning int.
func DifferenceOrZeroRanges ¶
DifferenceOrZeroRanges requires signed x and y be in range [-2^30, 2^30-1] and unsigned x and y be in range [0, 2^31-1]
func DifferenceOrZeroUnsigned ¶
DifferenceOrZeroUnsigned (dozu) computes in 7-10 branch-free RISC instructions. Note, smaller DOZ version with comparison is skipped as Go lacks comparison returning int.
func Div1000ShiftUnsigned ¶
Div1000ShiftUnsigned (div1000) is 23 instructions.
func Div100ShiftUnsigned ¶
Div100ShiftUnsigned (div100) is 25 instructions.
func Div3ShiftUnsigned ¶
Div3ShiftUnsigned (divu3) uses shifts and adds to approximate through residual. It is 18 instructions.
func Div3Signed ¶
Div3Signed this computes in 6 instructions. This is 9 or 10 cycles. Meanwhile, divide operation can take 20 cycles.
func Div5Signed ¶
Div5Signed is similar to Div3Signed, but error terms is too large, and thus it needs slight variation of magic constant and correction with shift right.
func Div7Signed ¶
Div7Signed is similar to Div3Signed, but error terms is too large, and thus it needs slight variation of magic constant and correction with shift right. The magic is still too large, and thus it needs flipping sign and addition.
func DivExact ¶
func DivExact[T Integer](n, d T) T
DivExact uses multiplicative inverse that can be computed at compile time. It relies on theorem "if a and m are relatively prime integers, then there exists an integer ā such that a*ā = 1 (mod m)".
func DivMod3Signed ¶
func DivMod3Signed2 ¶
DivMod3Signed2 calculates reminder first.
func DivMod5Signed ¶
func DivMod7Signed ¶
func DivModConstSigned ¶
DivModConstSigned illustrates division by constant. This code should be generated at compile time depending on the value of compile time constant k and result of MagicSigned execution. The basic trick is to multiply by magic number 2**32/d and then extract leftmost 32 bits of the product.
func DivModConstSignedMagic ¶
DivModConstSignedMagic computes magic number and shift amount for signed integer division. This values can be computed at compile time. Code using big numbers can be simplified, such as in the case of Python or math.Big, it is no listed here.
func DivModConstUnsigned ¶
func DivModConstUnsignedMagic ¶
DivModConstUnsignedMagic computes magic for unsigned constant multiplication. This values can be computed at compile time. Code using big numbers can be simplified, such as in the case of Python or math.Big, it is no listed here.
func DivModLongUnsigned64b32b ¶
DivModLongUnsigned64b32b (divlu) performs long division of 64-bit unsigned integer by 32-bit unsigned integer. This algorithm is slightly modified to store both lower and higher 32 bits of dividend into 64-bit number. This algorithm uses shift-and-subtract operations. It illustrates how hardware is doing such division. It does not work for overflow cases. This executes in 321 to 385 RISC instructions.
func DivModLongUnsigned64b32b2 ¶
DivModLongUnsigned64b32b2 is alternative version based on multiword division. If overflow, it returns maximum quotient and reminder.
func DivModMultiWordUnsigned ¶
func DivModMultiWordUnsigned(q, r, u, v []uint16)
DivModMultiWordUnsigned is Knuth algorithm for integer division. It stores quotient in q and remainder in r.
func DivModSignedPowTwo ¶
DivModSignedPowTwo takes 7 instructions, 5 instructions for division and 2 instructions for reminder.
func DivModUnsignedPowTwo ¶
func DivModUnsignedSeven ¶
func DivModUnsignedThree ¶
func DivSignedPowTwo ¶
DivSignedPowTwo returns n / 2 ** k for 1 <= k < 31. This is illustration, for performance k should be fixed at compile time. This is four branch-free instructions.
func DivSignedPowTwo2 ¶
DivSignedPowTwo2 is alternative version that uses branch.
func DivSignedPowTwo_fixed5 ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.DivSignedPowTwo_fixed5(96)) }
Output: 3
func DoubleLengthInt32FromUint64 ¶
DoubleLengthInt32FromUint64 unpacks two uint32 from single uint64. This is known as Double-Length arithmetics. This can be implemented in five instructions by using only 31 bits and storing carry in most significant bit. Here is only simplified versions.
func Equal ¶
func Equal[T Signed](x, y T) T
Example (Equal) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Equal(10, 10) >> 31) }
Output: -1
Example (Not_equal) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Equal(10, 11) >> 31) }
Output: 0
func EqualZero2 ¶
func EqualZero3 ¶
func EqualZero4 ¶
func EqualZero4[T Integer](x T) T
func EqualZero5 ¶
func EqualZero5[T Integer](x T) T
func ExchangeBitsInRegister ¶
ExchangeBitsInRegister swaps two regions of bits within single register without affecting other bits. For example, swapping B and D regions. B and D should be same size, but A C E and B/D can be different sizes. k is number of bits of C + D md is mask for field D mo is mask for fields A C and E (all fields to not swap) [aaaa bbbb cccc dddd eeee] -> [aaaa dddd cccc bbbb eeee] This is straight forward approach and requires eleven instructions and six cycles.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var xx uint32 = 0xABCDEF12 var mb uint32 = 0x0000F000 var mo uint32 = 0xF0FF0FFF fmt.Printf("%0X", hd.ExchangeBitsInRegister(xx, mb, mo, 4*3)) }
Output: AECDBF12
func ExchangeBitsInRegisterFast ¶
ExchangeBitsInRegisterFast requires eight instructions and five cycles.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var xx uint32 = 0xABCDEF12 var mb uint32 = 0x0000F000 var mo uint32 = 0xF0FF0FFF fmt.Printf("%0X", hd.ExchangeBitsInRegisterFast(xx, mb, mo, 4*3)) }
Output: AECDBF12
func ExchangeRegisters ¶
func ExchangeRegisters[T Integer](x, y T) (T, T)
ExchangeRegisters illustrates very old trick on how to exchange two registers without using third one. This is swap operation. Also known as multiple assignment in Go.
func ExchangeRegistersMasked ¶
func ExchangeRegistersMasked[T Integer](x, y, m T) (T, T)
ExchangeRegistersMasked illustrates how to exchange only masked bits between two registers. This can be done in 3 cycles, given parallelism and and-not instructions.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var x uint32 = 0xABCDEF12 var y uint32 = 0x12345678 var m uint32 = 0x0F0F0F0F x, y = hd.ExchangeRegistersMasked(x, y, m) fmt.Printf("\n%0X\n%0X", x, y) }
Output: A2C4E618 1B3D5F72
Example (Bits) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var x uint32 = 0b11110000111100001111000011110000 var y uint32 = 0b00001111000011110000111100001111 var m uint32 = 0b00001111111100000000000000000000 x, y = hd.ExchangeRegistersMasked(x, y, m) fmt.Printf("\n%032b\n%032b", x, y) }
Output: 11111111000000001111000011110000 00000000111111110000111100001111
func ExchangeRegistersMasked2 ¶
func ExchangeRegistersMasked2[T Integer](x, y, m T) (T, T)
func ExchangeRegistersMasked3 ¶
func ExchangeRegistersMasked3[T Integer](x, y, m T) (T, T)
ExchangeRegistersMasked3 is heavily using bitwise equality, but in Go there is no such operator, so expanding that notation.
func ExchangeRegistersMasked4 ¶
func ExchangeRegistersMasked4[T Integer](x, y, m T) (T, T)
ExchangeRegistersMasked4 also executes in three cycles, given sufficient instruction parallelism in the CPU.
func Expand ¶
Expand (Generalized Insert, unpack, scatter, deposit) selects bits from the x where m is 1 and puts them in order to least significant bits. This is reverse of Compress steps. This is 168 RISC instructions, or 200 instructions on 64bit RISC. There is also hardware focused version that does not work well in general RISC.
func ExtendSign7 ¶
ExtendSign8 sign-extends a 8-bit number to a 32-bit number. Sign of 8-bit number is stored in 8th-bit. Sign extension is treating n-th least significant bit as sign bit and copying it to all more significant bits. This is usually implemented with shift-left-logical followed by shift-right-arithmetic, but alternative may be useful if you don't have shift.
Example (Extended) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ExtendSign7(0b1110_1010)) }
Output: 11111111111111111111111111101010
Example (NotExtended) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ExtendSign7(0b0010_1010)) }
Output: 00101010
func ExtendSign7Three ¶
ExtendSign7Three is alternative version
Example (Extended) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ExtendSign7Three(0b1110_1010)) }
Output: 11111111111111111111111111101010
Example (NotExtended) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ExtendSign7Three(0b0010_1010)) }
Output: 00101010
func ExtendSign7Two ¶
ExtendSign7Two is alternative version. If you know all higher order bits are zero, then `and` can be omitted.
Example (Extended) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ExtendSign7Two(0b1110_1010)) }
Output: 11111111111111111111111111101010
Example (NotExtended) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ExtendSign7Two(0b0010_1010)) }
Output: 00101010
func FLPTwo ¶
FLPTwo is Floor to nearest Power of Two. Values >= 2^31 will be zero. This formula works for x > 0.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.FLPTwo(310)) }
Output: 256
func FindFirstStringOnes ¶
FindFirstStringOnes (FFStr1) finds index of first string of 1s of length n or 32 if none found. This uses divide and conquer and executes in 3 to 36 RISC instructions. This loop can be effectively unrolled.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.FindFirstStringOnes(0b0000_0000_0000_0000_1100_1110_1111_1000, 5)) }
Output: 24
func FindFirstStringOnes1 ¶
FindFirstStringOnes1 is is direct algorithm. For worst case this is not good, it is 178 RISC instructions for 0x5555_5555.
func FindInByte ¶
FindInByte m in x illustrates that to find specific byte you need to XOR with repeated value.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.FindInByte(0x12_00_FF_00, 0xFF)) }
Output: 2
func FindInByteEq ¶
FindInByteEq finds index of first byte that x and y are equal.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.FindInByteEq(0x12_00_F9_00, 0x23_01_F9_00)) }
Output: 2
func FirstOneOffDifferentBits ¶
FirstOneOffDifferentBits computes 1-based position of first least significant bit that is one-off among three values for corresponding value or -1 if there is no such bit.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var a int32 = 0b11111 var b int32 = 0b10100 var c int32 = 0b10101 fmt.Println(hd.FirstOneOffDifferentBits([3]int32{a, b, c})) }
Output: [1 0 -1]
func FourUint8FromUint32 ¶
FourUint8FromUint32 unpacks 4 uint8 from single uint32.
func FourUint8ToUint32 ¶
FourUint8ToUint32 packs 4 uint4 into single uint32.
func GrayCodeFromUint ¶
func GrayCodeFromUint[T Unsigned](n T) T
GrayCodeFromUint converts a binary number to Gray code. Gray Code is sequence of binary numbers where each successive differs only by one bit. It is possible to iterate through all 2ⁿ numbers in single Gray code. There are many versions of Gray Codes, here we use most common one, "reflected binary Gray code". Gray codes named after Frank Gray, physicist at Bell Labs, who invented it in 1930s for TV.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { for i := range 10 { fmt.Printf("%08b -> %08b\n", i, hd.GrayCodeFromUint(uint64(i))) } }
Output: 00000000 -> 00000000 00000001 -> 00000001 00000010 -> 00000011 00000011 -> 00000010 00000100 -> 00000110 00000101 -> 00000111 00000110 -> 00000101 00000111 -> 00000100 00001000 -> 00001100 00001001 -> 00001101
func GrayCodeToUint16 ¶
func GrayCodeToUint32 ¶
func GrayCodeToUint64 ¶
func HigherEqualZero ¶
func HigherEqualZero[T Signed](x T) T
func HigherZero ¶
func HigherZero[T Signed](x T) T
func HigherZero2 ¶
func HigherZero2[T Signed](x T) T
func HigherZero3 ¶
func HigherZero3[T Signed](x T) T
func ISIGN ¶
func ISIGN[T Signed](x, y T) T
ISIGN is sign-transfer function, as known in FORTRAN.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.ISIGN(10, -100000), hd.ISIGN(-10, 100000)) }
Output: -10 10
func IncrReversed ¶
IncrReversed value is used in Fast Furier Transform (FFT) when computing reversed and incrementing is used in the loop. However, computing reversed takes 29 instructions, and lookup table for large values of is not practical. Thus storing previous reversed value and incrementing reversed is useful. This executes in 5 RISC instructions.
func Int64FromNx16b ¶
func IntToNx16b ¶
Example (Small) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IntToNx16b(5)) }
Output: [5]
func IsAddOverflow ¶
func IsAddOverflow[T Signed](x, y T) T
IsAddOverflow sets most significant bit if overflow occurs. This version does not use carry bit and is efficient.
Example ¶
package main import ( "fmt" "math" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IsAddOverflow(int32(math.MaxInt32), 1)>>31, hd.IsAddOverflow(int32(10), 1)>>31) }
Output: -1 0
func IsAddOverflow2 ¶
func IsAddOverflow2[T Signed](x, y T) T
func IsAddOverflowUnsigned ¶
func IsAddOverflowUnsigned[T Unsigned](x, y T) T
Example ¶
package main import ( "fmt" "math" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IsAddOverflowUnsigned(uint32(math.MaxUint32), 1)>>31, hd.IsAddOverflowUnsigned(uint32(10), 1)>>31) }
Output: 1 0
func IsAddOverflowUnsigned2 ¶
func IsAddOverflowUnsigned2[T Unsigned](x, y T) T
func IsAddOverflowUnsigned3 ¶
func IsAddOverflowUnsigned4 ¶
Example ¶
package main import ( "fmt" "math" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var x uint32 = 1 fmt.Println(hd.IsAddOverflowUnsigned4(math.MaxUint32, 1, math.MaxUint32+x), hd.IsAddOverflowUnsigned4(10, 1, 10+x)) }
Output: true false
func IsDivExactSigned ¶
IsDivExactSigned is simplified version with abs. it has single branch. however, it is not minimal number of instructions.
func IsDivExactUnsigned ¶
IsDivExactUnsigned is similar to odd version but transforms divisor d0 * 2^k and performs rotate right trick. It has single branch.
func IsDivExactUnsignedOdd ¶
IsDivExactUnsignedOdd tests if n is multiple of d, for odd d. It has single branch.
func IsDivOverflow ¶
IsDivOverflow uses around seven instructions with branches, and it is not easy to improve this.
func IsDivOverflowUnsigned ¶
func IsInRange ¶
IsInRange illustrates how to perform signed integer bounds checking (both ends inclusive) in single comparison. This is useful for array bounds checking. Using uint conversion, to force unsigned comparison on signed integers, given Go lacks unsigned comparison operator on signed integers.
func IsInRangeClosedOpen ¶
IsInRangeClosedOpen: a <= x < b
func IsInRangeOpen2 ¶
func IsInRangeOpenClosed ¶
IsInRangeOpenClosed: a < x <= b
func IsInRangePowerTwo ¶
IsInRangePowerTwo: 0 <= x <= 2^n - 1
func IsInRangePowerTwoOffset ¶
IsInRangePowerTwoOffset: a <= x <= a + 2^n - 1
func IsMostSignificantSet ¶
Example (Int32) ¶
package main import ( "fmt" "math" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IsMostSignificantSet(int32(-1)), hd.IsMostSignificantSet(int32(1)), hd.IsMostSignificantSet(int32(math.MaxInt32))) }
Output: true false false
Example (Uint32) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IsMostSignificantSet(uint32(0xFFFFFFFF)), hd.IsMostSignificantSet(uint32(10))) }
Output: true false
func IsMulOverflow ¶
func IsMulOverflowUnsigned ¶
func IsMulOverflowUnsignedNLZ ¶
IsMulOverflowUnsignedNLZ counts number of leading zeros to detect overflow.
func IsPowerOfTwoBoundaryCrossed ¶
IsPowerOfTwoBoundaryCrossed checks if adding l to a crosses power of block of size b. b size has to be power of two. This and versions bellow are five or six RISC instructions. b has to be power of two and likely to be a constant.
Example (Crossed) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IsPowerOfTwoBoundaryCrossed[uint32](100, 300, 256)) }
Output: true
Example (Not_crossed) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IsPowerOfTwoBoundaryCrossed[uint32](100, 100, 256)) }
Output: false
func IsSubOverflow ¶
func IsSubOverflow[T Signed](x, y T) T
IsSubOverflow sets most significant bit if overflow occurs. This version does not use carry bit and is efficient.
Example ¶
package main import ( "fmt" "math" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IsSubOverflow2(int32(math.MinInt32), 1)>>31, hd.IsSubOverflow2(int32(-10), 1)>>31) }
Output: -1 0
func IsSubOverflow2 ¶
func IsSubOverflow2[T Signed](x, y T) T
func IsSubOverflowUnsigned ¶
func IsSubOverflowUnsigned[T Unsigned](x, y T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.IsSubOverflowUnsigned(uint32(0), 1)>>31, hd.IsSubOverflowUnsigned(uint32(10), 1)>>31) }
Output: 1 0
func IsSubOverflowUnsigned2 ¶
func IsSubOverflowUnsigned2[T Unsigned](x, y T) T
func IsSubOverflowUnsigned3 ¶
func IsSubOverflowUnsigned4 ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var x uint32 = 1 fmt.Println(hd.IsSubOverflowUnsigned4(0, 1, 0-x), hd.IsSubOverflowUnsigned4(10, 1, 10-x)) }
Output: true false
func IsolateRightmostOneBit ¶
func IsolateRightmostOneBit[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.IsolateRightmostOneBit(0b01011000)) }
Output: 00001000
func LeadingZerosEqual ¶
func LeadingZerosLess ¶
func LeadingZerosLessOrEqual ¶
func LeadingZerosUint32 ¶
LeadingZerosUint32 (nlz) is algorithm from Robert Harley. It consists of 14 instructions branch-free. It uses Julius Goryavsky variation for smaller lookup table size. LeadingZerosUint32 has direct relationship of log2 as well, and can be used to compute it directly. Some instruction sets, such as ARM M1 chips, include single assembly instruction for this operation.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.LeadingZerosUint32(255)) }
Output: 24
Example (Long) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.LeadingZerosUint32(0b00111111111111111111111111101010)) }
Output: 2
func LeadingZerosUint64 ¶
func LenLongestStringOnes ¶
LenLongestStringOnes (maxstr1) returns longest length of string of ones. Executes in 131 RISC instructions in worst case. TODO: it has optimization to reduce to 39 RISC instructions in worst case.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.LenLongestStringOnes(uint64(0b0000_0000_0000_0000_1100_1110_1111_1000))) }
Output: 5
func LenShortestStringOnes ¶
LenShortestStringOnes (fminstr1) returns shortest length of string of ones.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.LenShortestStringOnes(0xFF0FF0)) }
Output: 8 8
Example (Zero) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.LenShortestStringOnes(0)) }
Output: 32 0
func Less4 ¶
func Less4[T Integer](x, y T) T
Less4 utilizes the fact that x/2 - y/2 never overflows. Stores result in most significant bit. Exactly same formula works for uint32. This is because Go uses signed shift right for int32 and unsigned shift right for uint32. This takes 6 or seven instructions.
func LessOrEqual ¶
func LessOrEqual[T Signed](x, y T) T
func LessOrEqual2 ¶
func LessOrEqual2[T Signed](x, y T) T
func LessOrEqualUnsigned ¶
func LessOrEqualUnsigned[T Unsigned](x, y T) T
func LessOrEqualZero ¶
func LessOrEqualZero[T Signed](x T) T
func LessOrEqualZero2 ¶
func LessOrEqualZero2[T Signed](x T) T
func LessUnsigned ¶
func LessUnsigned[T Unsigned](x, y T) T
func LessUnsigned2 ¶
func LessUnsigned2[T Unsigned](x, y T) T
func Log10x32 ¶
Log10x32 (ilog10) utilizes multiple techniques, but at its core it is number of leading zeroes with adjustments. It is 11 branch free RISC instructions.
func Log10x64 ¶
Log10x64 (ilog10) utilizes multiple techniques, but at its core it is number of leading zeroes with adjustments. It is 11 branch free RISC instructions.
func Log2 ¶
Log2 (ilog2)
Example (Four) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Log2(4)) }
Output: 2
Example (Nine) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Log2(9)) }
Output: 3
Example (One) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Log2(4)) }
Output: 2
Example (Zero) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { // This is special condition. There is a case to extend this definition to be valid. // However, it is not universally accepted. fmt.Println(hd.Log2(0)) }
Output: 4294967295
func Log2x64 ¶
Log2x64 (ilog2). Interestingly, it is more accurate than float64 based version, since float64 overflows, but this code does not.
func LoopDetectionGosper ¶
LoopDetectionGosper uses R.W.Gosper algorithm to detect start index of a loop and it's period. loop is defined on sequence: X_n+1 = f(X_n); X_0, X_1, ..., X_μ-1, X_μ, ... X_μ+λ. [HAK #132].
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { f := func(v int) int { if v < 100 { return v + 1 } return 1 } fmt.Println(hd.LoopDetectionGosper(f, 0)) }
Output: 0 63 100
func MaxAND ¶
MaxAND is max(x|y) for x in [minx, maxx] and y in [miny, maxy]. This gives tighter bound than min(maxx, maxy).
func MaxOR ¶
MaxOR is max(x|y) for x in [minx, maxx] and y in [miny, maxy]. This gives tighter bound than maxx + maxy.
func MaxRanges ¶
MaxRanges requires signed x and y be in range [-2^30, 2^30-1] and unsigned x and y be in range [0, 2^31-1]
func MinAND ¶
MinAND is min(x&y) for x in [minx, maxx] and y in [miny, maxy]. This gives tighter bound than 0.
func MinOR ¶
MinOR is min(x|y) for x in [minx, maxx] and y in [miny, maxy]. This gives tighter bound than max(minx, miny).
func MinRanges ¶
MinRanges requires signed x and y be in range [-2^30, 2^30-1] and unsigned x and y be in range [0, 2^31-1]
func Mod10Unsigned ¶
Mod10Unsigned (remu10) is using multiplication method similar to Mod3Unsigned5.
func Mod3Signed ¶
Mod3Signed (rems3) is similar to unsigned version, but remaps output of it differently.
func Mod3Unsigned ¶
Mod3Unsigned (remu3) is using CountOnes (pop).
func Mod3Unsigned2 ¶
Mod3Unsigned2 (remu3) is using CountOnes (pop) and table lookup.
func Mod3Unsigned3 ¶
Mod3Unsigned3 (remu3) is using digit summing and in-register lookup.
func Mod3Unsigned4 ¶
Mod3Unsigned4 (remu3) is using digit summing and in-memory lookup.
func Mod3Unsigned5 ¶
Mod3Unsigned5 (remu3) is using the fact that n % 3 == floor(4/3 * n) % 3. This code is discovered in trial and error. This uses only shift and multiply operations. It executes in 13 instructions.
func Mod3Unsigned6 ¶
Mod3Unsigned6 (remu3) is same as Mod3Unsigned5 but expands multiplications to shifts.
func Mod5Signed ¶
Mod5Signed (rems5) is similar to unsigned version, but remaps output of it differently.
func Mod5Unsigned ¶
Mod5Unsigned (remu5) is using CountOnes (pop) and in-memory lookup.
func Mod5Unsigned2 ¶
Mod5Unsigned2 (remu5) is using multiplication method similar to Mod3Unsigned5. It executes in 11 instructions.
func Mod63Unsigned ¶
Mod63Unsigned (remu63) is mysterious code from Joe Keane that works in 12 instructions.
func Mod63Unsigned2 ¶
Mod63Unsigned2 (remu63) is using multiplication method similar to Mod3Unsigned5. This is not as fast as Joe Keane method, unless there is fast multiplication instruction on the machine.
func Mod7Signed ¶
Mod7Signed (rems7) is similar to unsigned version, but remaps output of it differently.
func Mod7Unsigned ¶
Mod7Unsigned (remu7) is using summing method and in-memory lookup.
func Mod7Unsigned2 ¶
Mod75Unsigned2 (remu7) is using multiplication method similar to Mod3Unsigned5.
func Mod9Signed ¶
Mod9Signed (rems7) is similar to unsigned version, but remaps output of it differently.
func Mod9Unsigned ¶
Mod9Unsigned (remu9) is using summing method and in-memory lookup.
func Mod9Unsigned2 ¶
Mod9Unsigned2 (remu9) is using multiplication method similar to Mod3Unsigned5.
func MulMultiWord ¶
func MulMultiWord(w, u, v []uint16)
MulMultiWord (mulmns) multiplies two multiwords word-wise. w = u * v This does not overflow. We are using uint16 and uint32 to avoid overflow in word multiplication. Most important word can be negative when converted to int16. Refer to routines Int64To4x16b and Int64From4x16b for conversion.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { u := []uint16{1, 2, 3, 4} v := []uint16{5, 6, 7} w := make([]uint16, len(u)+len(v)) hd.MulMultiWord(w, u, v) fmt.Println(w) }
Output: [5 16 34 52 45 28 0]
func MultiplicativeInverseEuclid ¶
MultiplicativeInverseEuclid uses extended Euclidian algorithm.
func MultiplicativeInverseNewton ¶
func MultiplicativeInverseNewton[T Integer](d T) T
MultiplicativeInverseNewton uses Newton method of multiplicative inverse. It follows from the well-known fact that sequence of xn_1 = xn * (2 - d * xn) converges to 1/d (mod d) given good starting xn. This also works with any inverse module of power of 2. Each iteration doubles number of correct bits.
func MultiplyHighOrder32 ¶
MultiplyHighOrder32 (mulhs/mulhu) multiplies two integers and returns the high-order half of the product. This executes in 16 RISC instructions. Go has math/bits.Mul32 that returns higher order bits, however it does uint64 cast and works only for uint32. Remarkably, Go math/bits.Mul64 uses same algorithm as this function but uses 32bit words.
func MultiplyHighOrder64 ¶
MultiplyHighOrder64 (mulhs/mulhu) multiplies two integers and returns the high-order half of the product. This executes in 16 RISC instructions. Go has math/bits.Mul64 that returns higher order bits, however it works only for uint64. Remarkably, Go math/bits.Mul64 uses same algorithm as this function but uses 32bit words. Algorithm in math/bits.Mul64 is the same, but it performs better because it does not use generics.
func NAbs ¶
func NAbs[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Print(hd.NAbs(-42)) }
Output: -42
func NewCRC32TableLookupTable ¶
func NewCRC32TableLookupTable() [256]uint32
func NextHigherNumberWithSameNumberOfOnes ¶
func NextHigherNumberWithSameNumberOfOnes[T Unsigned](x T) T
NextHigherNumberWithSameNumberOfOnes is useful in bitsets stored in integers to find next set of same cardinality. This function utilizes the theorem of Right-to-Left Computability.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.NextHigherNumberWithSameNumberOfOnes(uint64(0b0011110000))) }
Output: 100000111
func NotEqualZero ¶
func NotEqualZero[T Signed](x T) T
func NotEqualZero2 ¶
func NotEqualZero3 ¶
func NotEqualZero3[T Integer](x T) T
func NotEqualZero4 ¶
func Parity ¶
Parity of number of 1-bits in bit-sequence. This executes in 10 instructions. This is example of "parallel prefix" operation, that very efficient for parallel computing.
Example (Even) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Parity(0b0010_1101)) }
Output: 0
Example (Odd) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Parity(0b0000_1101)) }
Output: 1
func Parity2 ¶
Parity2 executes in 9 instructions. It avoids computing higher-order parity bits that will not be used.
func Parity3 ¶
Parity3 computes parity of uint7 and sets 8th bit to 1/0 based on parity. Here we wrap into same signature as other Parity methods for convenience.
func Parity4 ¶
Parity4 computes parity of uint7 and sets 8th bit to 1/0 based on parity, but flipped. Here we wrap into same signature as other Parity methods for convenience.
func Pow ¶
Pow (iexp) takes x to the power of n. This utilizes binary expression of exponent. For example: x^13, and binary of 13 = 0b1101 = 0b1000 + 0b0100 + 0b0001 = 8 + 4 + 1. Thus, we can compute exponent in log number of multiplications. However, this is not always optimal code (e.g. n=27 there is better decomposition ((x^3)^3)^3)). Note, this function is not resistant to overflows.
func RSqrtFloat32 ¶
RSqrtFloat32 (rsqrt) algorithm is fast floating point approximation of reciprocal square root (1 / sqrt). It does not work +/- Inf, NaN, denormals, negative numbers. This algorithm caused some buzz in early 2000s. It produces result with 3.5% error rate.
func Reverse8Bit ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.Reverse8Bit(0b1101_0101)) }
Output: 10101011
func ReverseBits ¶
ReverseBits (rev) reverses bits in x using divide and conquer.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("0x%X", hd.ReverseBits(0x01234567)) }
Output: 0xE6A2C480
func ReverseBits2 ¶
ReverseBits2 uses Knuth algorithm. It is using 21 RISC instructions.
func ReverseBits64Knuth ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("0x%X", hd.ReverseBits64Knuth(0x01234567_89ABCDEF)) }
Output: 0xF7B3D1D1E6A2C480
func ReverseByte ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%0b", hd.ReverseByte(0b1101_0101)) }
Output: 10101011
func RotateLeft ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%032b", hd.RotateLeft(0b11101111111100001111111111111101, 8)) }
Output: 11110000111111111111110111101111
func RotateRight ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%032b", hd.RotateRight(0b11101111111100001111111111111101, 8)) }
Output: 11111101111011111111000011111111
func RoundDownBlockPowerOfTwo ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.RoundDownBlockPowerOfTwo[uint](257, 8)) }
Output: 256
func RoundUpBlockPowerOfTwo ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.RoundUpBlockPowerOfTwo[uint](210, 8)) }
Output: 256
func RoundUpBlockPowerOfTwo2 ¶
func SetBitLastZero ¶
func SetBitLastZero[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.SetBitLastZero(0b10100111)) }
Output: 00001000
func SetTrailingOnesWithRightMostOne ¶
func SetTrailingOnesWithRightMostOne[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.SetTrailingOnesWithRightMostOne(0b01010111)) }
Output: 00001111
func SetTrailingZeros ¶
func SetTrailingZeros[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.SetTrailingZeros(0b01011000)) }
Output: 00000111
func SetTrailingZeros2 ¶
func SetTrailingZeros2[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.SetTrailingZeros2(0b01011000)) }
Output: 00000111
func SetTrailingZeros3 ¶
func SetTrailingZeros3[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.SetTrailingZeros3(0b01011000)) }
Output: 00000111
func SetTrailingZerosWithRightMostOne ¶
func SetTrailingZerosWithRightMostOne[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.SetTrailingZerosWithRightMostOne(0b01011000)) }
Output: 00001111
func SetupCycleThreeValues ¶
SetupCycleThreeValues is final routine that prepares constants for cycling. This can be computed at compile time.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { var c int32 = 0b10101 // 21 var a int32 = 0b11111 // 31 var b int32 = 0b10100 // 20 fmt.Println(hd.SetupCycleThreeValues(a, b, c)) }
Output: -11 10 21 1 0
func SetupCycleThreeValuesN1N2 ¶
SetupCycleThreeValuesN1N2 rearranges a,b,c in order required for cycling and defines n1 and n2 positions.
func ShiftLeftDoubleLength ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := [2]uint32{0b1111, 0b1111} y := hd.ShiftLeftDoubleLength(x, 3) fmt.Printf("%032b_%032b", y[1], y[0]) }
Output: 00000000000000000000000001111000_00000000000000000000000001111000
func ShiftRightSigned32 ¶
ShiftRightUnsigned32 does arithmetic shift right that Go is missing. This compiles to single RISC instruction and is going to be inlined.
func ShiftRightSigned64 ¶
ShiftRightUnsigned64 does arithmetic shift right that Go is missing. This compiles to single RISC instruction and is going to be inlined.
func ShiftRightSignedDoubleLength ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := [2]uint32{0b1111, 0b1111} y := hd.ShiftRightSignedDoubleLength(x, 3) fmt.Printf("%032b_%032b", y[1], y[0]) }
Output: 00000000000000000000000000000001_11100000000000000000000000000001
Example (Negative) ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := [2]uint32{0b1111, 0b10000000000000000000000000000001} y := hd.ShiftRightSignedDoubleLength(x, 3) fmt.Printf("%032b_%032b", y[1], y[0]) }
Output: 11110000000000000000000000000000_00100000000000000000000000000001
func ShiftRightSignedFromUnsigned ¶
ShiftRightSignedFromUnsigned computes signed shift from unsigned shift. This compiles to five or six RISC instructions. If n is fixed, then this compiles to three or four RISC instructions.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned(0b11111111111111111111111111101010, 2)) }
Output: 11111111111111111111111111111010
func ShiftRightSignedFromUnsigned2 ¶
ShiftRightSignedFromUnsigned2 is alternative version. If n is fixed, then this compiles to three or four RISC instructions.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned2(0b11111111111111111111111111101010, 2)) }
Output: 11111111111111111111111111111010
func ShiftRightSignedFromUnsigned3 ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned3(0b11111111111111111111111111101010, 2)) }
Output: 11111111111111111111111111111010
func ShiftRightSignedFromUnsigned4 ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned4(0b11111111111111111111111111101010, 2)) }
Output: 11111111111111111111111111111010
func ShiftRightSignedFromUnsigned5 ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.ShiftRightSignedFromUnsigned5(0b11111111111111111111111111101010, 2)) }
Output: 11111111111111111111111111111010
func ShiftRightUnsigned32 ¶
ShiftRightUnsigned32 does logical shift right that Go is missing. This compiles to single RISC instruction and is going to be inlined.
func ShiftRightUnsigned64 ¶
ShiftRightUnsigned64 does logical shift right that Go is missing. This compiles to single RISC instruction and is going to be inlined.
func ShiftRightUnsignedDoubleLength ¶
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := [2]uint32{0b1111, 0b1111} y := hd.ShiftRightUnsignedDoubleLength(x, 3) fmt.Printf("%032b_%032b", y[1], y[0]) }
Output: 00000000000000000000000000000001_11100000000000000000000000000001
func ShuffleInner ¶
ShuffleInner performs perfect inner shuffle of bits.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { // each symbol is a bit // in: abcd_efgh_ijkl_mnop_ABCD_EFGH_IJKL_MNPO // out: AaBb_CcDd_EeFf_GgHh_IiJj_KkLl_MmNn_OoPp fmt.Printf("%032b", hd.ShuffleInner(0b1111_1111_1111_1111_0000_0000_0000_0000)) }
Output: 01010101010101010101010101010101
func ShuffleOuter ¶
ShuffleOuter performs perfect outer shuffle of bits. This function is used in cryptography. This is 32 RISC instructions.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { // each symbol is a bit // in: abcd_efgh_ijkl_mnop_ABCD_EFGH_IJKL_MNPO // out: aAbB_cCdD_eEfF_gGhH_iI_jJ_kKlL_mMnN_oOpP fmt.Printf("%032b", hd.ShuffleOuter(0b1111_1111_0000_0000_0000_0000_1111_1111)) }
Output: 10101010101010100101010101010101
func Sign ¶
Sign can be calculated in four RISC instructions with comparison operators. Note, there is five and four instruction formula, but in Go it would not fit, since we need explicit logical right shift and conversion of full signed to unsigned bits. Note, there is three instruction formula using comparison operators, but in Go strong typing booleans not converted to ints, which would require branches and more instructions.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.Sign(-10), hd.Sign(10), hd.Sign(0)) }
Output: -1 1 0
func SqrtBinarySearch ¶
SqrtBinarySearch (isqrt) is similar to Newton method with estimate, but performs fully estimation.
func SqrtNewton ¶
SqrtNewton (isqrt) converges quadratically on series of better approximations. This version uses lookup table coded in branches that is used for better starting values.
func SqrtShiftAndSubtract ¶
SqrtShiftAndSubtract (isqrt) is similar to some hardware implementation due to its effective use of shift instructions.
func SubDoubleLength ¶
SubDoubleLength of uint64 numbers encoded in two uint32. This takes eight instructions.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := [2]uint32{310, 405} y := [2]uint32{100, 200} fmt.Println(hd.SubDoubleLength(x, y)) }
Output: [210 205]
func SubFourUint8 ¶
SubFourUint8 executes in eight branch-free instructions
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := hd.FourUint8ToUint32([4]uint8{2, 3, 254, 17}) y := hd.FourUint8ToUint32([4]uint8{2, 1, 253, 0}) sum := hd.SubFourUint8(x, y) fmt.Println(hd.FourUint8FromUint32(sum)) }
Output: [0 2 1 17]
func SubTwoUint16 ¶
SubFourUint8 executes in seven branch-free instructions
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { x := hd.TwoUint16ToUint32([2]uint16{65535 - 1, 5}) y := hd.TwoUint16ToUint32([2]uint16{65535 - 5, 1}) sum := hd.SubTwoUint16(x, y) fmt.Println(hd.TwoUint16FromUint32(sum)) }
Output: [4 4]
func TrailingZerosUint32 ¶
TrailingZerosUint32 (ntz) uses John Reiser variant of David Seal method. Among applications of TrailingZerosUint32 is R.W.Gosper Loop Detection Algorithm.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.TrailingZerosUint32(0b0000_1101_0000)) }
Output: 4
func TransposeMatrix8bx8b ¶
TransposeMatrix8bx8b (transpose8) uses divide and conquer method. Each matrix i,j value is encoded in i-th byte and j-th bit of a byte. This is 2x fewer calculating RISC instructions than straightforward masking method. This is 21 calculating RISC instructions. A: m rows * n columns is 1-based coordinates of source submatrix. B: n rows * m columns is 1-based coordinates of destination submatrix.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { A := []byte{ 0b10101011, 0b01011110, 0b00010101, 0b00001000, 0b00010000, 0b00110000, 0b01010000, 0b10010000, } B := make([]byte, 8) hd.TransposeMatrix8bx8b(A, B, 1, 1) for _, q := range B { fmt.Printf("%08b\n", q) } }
Output: 10000001 01000010 10000100 01101111 11010000 01100000 11000000 10100000
func TurnOffRightMostBit ¶
func TurnOffRightMostBit[T Signed](x T) T
TurnOffRightMostBit this can be used to test if integer is power of 2
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.TurnOffRightMostBit(0b01011000)) }
Output: 01010000
func TurnOffRightmostOnes ¶
func TurnOffRightmostOnes[T Signed](x T) T
TurnOffRightmostOnes this can be sued to determine if a nonnegative integer is of the for 2^j - 2^k for some j >= k >= 0
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.TurnOffRightmostOnes(0b01011100)) }
Output: 01000000
func TurnOffRightmostOnes2 ¶
func TurnOffRightmostOnes2[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.TurnOffRightmostOnes2(0b01011100)) }
Output: 01000000
func TurnOffTrailingOnes ¶
func TurnOffTrailingOnes[T Signed](x T) T
TurnOffTrailingOnes this can be used to test if integer if of the form 2^n - 1
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.TurnOffTrailingOnes(0b10100111)) }
Output: 10100000
func TurnOnRightMostBit ¶
func TurnOnRightMostBit[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.TurnOnRightMostBit(0b10100111)) }
Output: 10101111
func TurnOnTrailingZeros ¶
func TurnOnTrailingZeros[T Signed](x T) T
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Printf("%08b", hd.TurnOnTrailingZeros(0b10101000)) }
Output: 10101111
func TwoUint16FromUint32 ¶
TwoUint16FromUint32 unpacks two uint16 from single uint32.
func TwoUint16ToUint32 ¶
TwoUint16ToUint32 packs two uint16 into single uint32.
func Uint64FromNx16b ¶
func UnShuffleInner ¶
UnShuffleInner is reverse of ShuffleInner.
func UnShuffleOuter ¶
UnShuffleOuter is reverse of ShuffleOuter.
func ZByteL ¶
ZByteL finds index of left most zero byte. This is branch-free code in 11 RISC instructions. This is useful for `strlen` in C strings, which use 0 byte for string termination.
Example ¶
package main import ( "fmt" hd "github.com/nikolaydubina/go-hackers-delight" ) func main() { fmt.Println(hd.ZByteL(0x12_00_FF_00)) }
Output: 1
Types ¶
type LRUCache ¶
type LRUCache struct {
// contains filtered or unexported fields
}
LRUCache is eight-way set associative cache with least-recently-used replacement policy that uses reference matrix method. The whole structure fits into single 64bit register. Internally, least significant byte of uint64 holds row 0 of reference matrix.
func (*LRUCache) Hit ¶
Hit value i as most recently used. This is five or six instructions on 64bit RISC. Values of i should be in [0, 7].
func (*LRUCache) LeastRecentlyUsed ¶
Source Files
¶
- 00_bit_packing.go
- 00_constants.go
- 00_shift.go
- 00_types.go
- 02-00.go
- 02-01_manipulating_rightmost_bits.go
- 02-01_next_higher_number_with_same_num_ones.go
- 02-04_abs.go
- 02-05_avg.go
- 02-06_sign.go
- 02-12_comparison.go
- 02-13_overflow.go
- 02-15_rotate_shifts.go
- 02-16_double_length_add_sub.go
- 02-17_double_length_shifts.go
- 02-18_multibyte_arithmetics.go
- 02-19_doz_max_min.go
- 02-20_exchanging_registers.go
- 02-21_alternating_values.go
- 03-00_power_of_two_boundaries.go
- 04-00_arithmetic_bounds.go
- 04-03_arithmetic_bounds_propagating_logical.go
- 05-01_counting_ones.go
- 05-02_parity.go
- 05-03_counting_zeros.go
- 06-00_searching_words.go
- 07-01_reversing_bits_and_bytes.go
- 07-02_shuffle.go
- 07-03_transposing_bit_matrix.go
- 07-04_compress.go
- 07-09_lru.go
- 08-00_multiplication.go
- 09-00_integer_division.go
- 10-00_integer_division_by_constant.go
- 10-01_integer_division_by_constant_signed.go
- 10-08_integer_division_by_constant_unsigned.go
- 10-16_integer_division_by_constant_exact.go
- 10-17_integer_division_by_constant_shifts.go
- 10-19_integer_division_by_constant_reminder.go
- 11-00_elementary_functions.go
- 11-01_sqrt.go
- 12-00_unusual_bases_number_systems.go
- 13-00_gray_code.go
- 14-00_cyclic_redundancy_check.go
- 15-00_error_correcting_codes.go
- 17-00_floating_point.go
- 18-00_primes.go