Documentation ¶
Overview ¶
Package rocblas provides low-level bindings for the AMD rocBLAS library, loaded at runtime via purego dlopen. No build tags are required; call rocblas.Available() to check runtime availability.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Available ¶
func Available() bool
Available returns true if rocBLAS is loadable on this machine. The result is cached after the first call.
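The caching behavior described above can be illustrated with sync.Once. This is a minimal sketch of the general pattern, not the package's actual implementation: the availability type and its probe field are hypothetical stand-ins for whatever check the package performs when it tries to load rocBLAS.

```go
package main

import (
	"fmt"
	"sync"
)

// availability caches the result of a one-time probe.
// Hypothetical type for illustration; not part of package rocblas.
type availability struct {
	once  sync.Once
	ok    bool
	probe func() bool // hypothetical: would attempt to load the library
}

// Available runs the probe on the first call and returns the cached
// result on every later call.
func (a *availability) Available() bool {
	a.once.Do(func() { a.ok = a.probe() })
	return a.ok
}

func main() {
	calls := 0
	av := &availability{probe: func() bool {
		calls++
		return false // pretend the library could not be loaded
	}}

	first := av.Available()
	second := av.Available()
	fmt.Println(first, second, calls) // prints "false false 1"
}
```

Because the result is cached, a caller can check availability on every code path without paying for repeated load attempts.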
func Sgemm ¶
func Sgemm(h *Handle, m, n, k int, alpha float32, a unsafe.Pointer, b unsafe.Pointer, beta float32, c unsafe.Pointer) error
Sgemm performs single-precision general matrix multiplication.
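This function handles the row-major to column-major conversion internally. rocBLAS assumes column-major storage, whereas callers pass row-major buffers. A row-major m x n matrix occupies memory exactly as a column-major n x m matrix does, so rocBLAS sees every row-major buffer as the transpose of the matrix the caller means.
To compute row-major C = A * B (m x n = m x k * k x n), Sgemm calls rocblas_sgemm with B as the first operand and A as the second, and with m and n swapped. In column-major terms rocBLAS then computes B^T * A^T = (A * B)^T = C^T, an n x m result. Read back as row-major, that buffer is exactly the m x n matrix C, so no explicit transpose or copy is needed.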
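The operand swap can be checked on the CPU without rocBLAS. The sketch below uses a plain-Go column-major routine, colMajorGemm, as a stand-in for rocblas_sgemm. Both helper names and the leading-dimension arguments are assumptions made for illustration; the source does not say which leading dimensions Sgemm passes.

```go
package main

import "fmt"

// colMajorGemm is a plain-Go stand-in for a column-major SGEMM
// (hypothetical helper, not part of package rocblas). It computes
// C = alpha*A*B + beta*C, where A is m x k, B is k x n and C is m x n,
// all stored column-major with leading dimensions lda, ldb and ldc.
func colMajorGemm(m, n, k int, alpha float32, a []float32, lda int,
	b []float32, ldb int, beta float32, c []float32, ldc int) {
	for j := 0; j < n; j++ {
		for i := 0; i < m; i++ {
			var sum float32
			for p := 0; p < k; p++ {
				sum += a[i+p*lda] * b[p+j*ldb]
			}
			c[i+j*ldc] = alpha*sum + beta*c[i+j*ldc]
		}
	}
}

// rowMajorGemm computes row-major C = alpha*A*B + beta*C, with A of size
// m x k and B of size k x n, by passing the operands to the column-major
// routine in swapped order.
func rowMajorGemm(m, n, k int, alpha float32, a, b []float32, beta float32, c []float32) {
	// Row-major B (k x n) is column-major B^T (n x k), leading dimension n.
	// Row-major A (m x k) is column-major A^T (k x m), leading dimension k.
	// The column-major product B^T * A^T is n x m, which is C^T, and its
	// buffer read as row-major is C.
	colMajorGemm(n, m, k, alpha, b, n, a, k, beta, c, n)
}

func main() {
	// A is 2 x 3 and B is 3 x 2, both row-major.
	a := []float32{1, 2, 3, 4, 5, 6}
	b := []float32{7, 8, 9, 10, 11, 12}
	c := make([]float32, 4)

	rowMajorGemm(2, 2, 3, 1, a, b, 0, c)
	fmt.Println(c) // prints "[58 64 139 154]"
}
```

The result matches the row-major product computed by hand: the first row is (58, 64) and the second is (139, 154).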
Types ¶
type Handle ¶
type Handle struct {
// contains filtered or unexported fields
}
Handle wraps a rocBLAS handle.
func CreateHandle ¶
CreateHandle creates a new rocBLAS context handle.
type RocBLASLib ¶
type RocBLASLib struct {
// contains filtered or unexported fields
}
RocBLASLib holds the dlopen handles and resolved function pointers for the rocBLAS functions. All function pointers are resolved via dlsym when Open() is called. Calls go through cuda.Ccall, which uses the platform-specific zero-CGo mechanism.
func Lib ¶
func Lib() *RocBLASLib
Lib returns the global RocBLASLib instance, or nil if rocBLAS is not available.
func Open ¶
func Open() (*RocBLASLib, error)
Open loads librocblas via dlopen and resolves all rocBLAS function pointers via dlsym. Returns an error if rocBLAS is not available.