Documentation ¶
Overview ¶
Package kernels provides Go wrappers for custom HIP kernels, loaded at runtime via purego dlopen. Build libhipkernels.so first: cd internal/hip/kernels && make. No build tags are required; call kernels.Available() to check at runtime whether the library can be loaded.
Index ¶
- func Add(a, b, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func AddScalar(a unsafe.Pointer, scalar float32, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func Available() bool
- func Div(a, b, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func DivScalar(a unsafe.Pointer, scalar float32, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func Exp(a, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func Fill(data unsafe.Pointer, value float32, n int, s unsafe.Pointer) error
- func FlashAttentionForward(Q, K, V, O unsafe.Pointer, batch, heads, seqLen, headDim int, causal bool, ...) error
- func Log(a, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func Mul(a, b, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func MulScalar(a unsafe.Pointer, scalar float32, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func Pow(base, exp, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func Rsqrt(a, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func Softmax(input, output unsafe.Pointer, outer, inner, axisSize int, s unsafe.Pointer) error
- func Sqrt(a, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func Sub(a, b, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func SumAxis(input, output unsafe.Pointer, outer, inner, axisSize int, s unsafe.Pointer) error
- func Tanh(a, c unsafe.Pointer, n int, s unsafe.Pointer) error
- func TanhPrime(a, upstream, c unsafe.Pointer, n int, s unsafe.Pointer) error
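The reduction-style entry points (Softmax, SumAxis) describe the tensor as an outer/inner/axisSize decomposition rather than an explicit shape. A plausible reading of that layout (an assumption, not stated by the package docs) is the usual flattening where element (o, k, i) lives at index (o*axisSize + k)*inner + i. A host-side sketch of what SumAxis would compute under that assumption:

```go
package main

import "fmt"

// sumAxisCPU is a pure-Go reference for the assumed kernels.SumAxis layout:
// input holds outer*axisSize*inner elements, output holds outer*inner, and
// element (o, k, i) of input sits at index (o*axisSize+k)*inner + i.
func sumAxisCPU(input []float32, outer, inner, axisSize int) []float32 {
	out := make([]float32, outer*inner)
	for o := 0; o < outer; o++ {
		for k := 0; k < axisSize; k++ {
			for i := 0; i < inner; i++ {
				out[o*inner+i] += input[(o*axisSize+k)*inner+i]
			}
		}
	}
	return out
}

func main() {
	// A [2, 3, 2] tensor reduced over the middle axis:
	// outer=2, axisSize=3, inner=2.
	in := []float32{
		1, 2, 3, 4, 5, 6, // outer block 0
		10, 20, 30, 40, 50, 60, // outer block 1
	}
	fmt.Println(sumAxisCPU(in, 2, 2, 3)) // → [9 12 90 120]
}
```

Softmax presumably normalizes over the same middle axis, with the same indexing.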
- type KernelLib
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Available ¶
func Available() bool
Available returns true if the HIP kernel library is loadable.
func FlashAttentionForward ¶
func FlashAttentionForward(Q, K, V, O unsafe.Pointer, batch, heads, seqLen, headDim int, causal bool, stream unsafe.Pointer) error
FlashAttentionForward computes scaled dot-product attention using a fused tiled kernel. All tensors are in [batch, heads, seq_len, head_dim] layout. When causal is true, an upper-triangular mask is applied.
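For reference, the math the fused kernel is documented to compute is O = softmax(QKᵀ/√headDim)·V per (batch, head) slice, with rows masked above the diagonal when causal is true. A naive host-side sketch, assuming row-major [batch, heads, seq_len, head_dim] storage (the actual kernel is tiled and fused; this is only the semantics):

```go
package main

import (
	"fmt"
	"math"
)

// attentionCPU is a naive CPU reference for scaled dot-product attention:
// O = softmax(Q Kᵀ / √headDim) V per (batch, head), with an upper-triangular
// mask when causal is true. Tensors are row-major [batch, heads, seqLen, headDim].
func attentionCPU(Q, K, V []float32, batch, heads, seqLen, headDim int, causal bool) []float32 {
	O := make([]float32, len(Q))
	scale := 1 / math.Sqrt(float64(headDim))
	for bh := 0; bh < batch*heads; bh++ {
		base := bh * seqLen * headDim
		for i := 0; i < seqLen; i++ {
			// Scores of query row i against every key row j.
			scores := make([]float64, seqLen)
			maxS := math.Inf(-1)
			for j := 0; j < seqLen; j++ {
				if causal && j > i {
					scores[j] = math.Inf(-1)
					continue
				}
				var dot float64
				for d := 0; d < headDim; d++ {
					dot += float64(Q[base+i*headDim+d]) * float64(K[base+j*headDim+d])
				}
				scores[j] = dot * scale
				if scores[j] > maxS {
					maxS = scores[j]
				}
			}
			// Numerically stable softmax over the row.
			var sum float64
			for j := range scores {
				scores[j] = math.Exp(scores[j] - maxS)
				sum += scores[j]
			}
			// Output row i is the softmax-weighted sum of value rows.
			for d := 0; d < headDim; d++ {
				var acc float64
				for j := 0; j < seqLen; j++ {
					acc += scores[j] / sum * float64(V[base+j*headDim+d])
				}
				O[base+i*headDim+d] = float32(acc)
			}
		}
	}
	return O
}

func main() {
	// batch=1, heads=1, seqLen=2, headDim=1, causal:
	// row 0 attends only to itself; row 1 averages both value rows.
	Q := []float32{1, 1}
	K := []float32{1, 1}
	V := []float32{3, 5}
	fmt.Println(attentionCPU(Q, K, V, 1, 1, 2, 1, true)) // → [3 4]
}
```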
Types ¶