Documentation ¶
Overview ¶
Package functional provides stateless, tensor-in tensor-out wrappers for common neural-network operations. Unlike the graph-aware layer types in sibling packages (e.g. layers/normalization), the functions here carry no parameter state, do no graph registration, and have no backward pass — they are pure forward-only computations suitable for scripting, testing, and composition.
Index ¶
- func GELU[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func GELUBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func LayerNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (*tensor.TensorNumeric[T], error)
- func LayerNormBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (dInput, dScale, dBias *tensor.TensorNumeric[T], err error)
- func Linear[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (*tensor.TensorNumeric[T], error)
- func LinearBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (dInput, dWeight, dBias *tensor.TensorNumeric[T], err error)
- func MLPBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (dInput, dWeight1, dBias1, dWeight2, dBias2 *tensor.TensorNumeric[T], err error)
- func MultiHeadAttention[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (*tensor.TensorNumeric[T], error)
- func MultiHeadAttentionBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (dQ, dK, dV *tensor.TensorNumeric[T], err error)
- func RMSNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (*tensor.TensorNumeric[T], error)
- func ReLU[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func SiLU[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func Sigmoid[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func Softmax[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x *tensor.TensorNumeric[T], ...) (*tensor.TensorNumeric[T], error)
- func SoftmaxBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GELU ¶
func GELU[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
GELU applies the Gaussian Error Linear Unit activation using the tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715*x^3))). All arithmetic is routed through Engine[T] primitives.
func GELUBackward ¶
func GELUBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], dOutput, input *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
GELUBackward computes the gradient of the GELU activation.
dOutput: gradient from upstream
input: original input to GELU
Returns: dInput (same shape as input)

Using the tanh approximation GELU(x) = 0.5 * x * (1 + tanh(u)) where u = sqrt(2/pi) * (x + 0.044715*x^3), the derivative is: GELU'(x) = 0.5*(1+tanh(u)) + 0.5*x*(1-tanh^2(u))*sqrt(2/pi)*(1+3*0.044715*x^2)
func LayerNorm ¶
func LayerNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x, scale, bias *tensor.TensorNumeric[T], eps T) (*tensor.TensorNumeric[T], error)
LayerNorm applies layer normalization to x using the provided scale (gamma) and bias (beta) tensors. Normalization is performed over the last dimension.
output = (x - mean) / sqrt(variance + eps) * scale + bias
func LayerNormBackward ¶
func LayerNormBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], dOutput, input, scale *tensor.TensorNumeric[T], eps T) (dInput, dScale, dBias *tensor.TensorNumeric[T], err error)
LayerNormBackward computes gradients for layer normalization.
dOutput: gradient from upstream [*, features]
input: original input [*, features]
scale: gamma [features]
eps: epsilon used in forward
Returns: dInput [*, features], dScale [features], dBias [features]
func Linear ¶
func Linear[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x, weight *tensor.TensorNumeric[T], bias *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Linear computes y = x @ weight^T + bias. If bias is nil, computes y = x @ weight^T.
x: [*, in_features], weight: [out_features, in_features], bias: [out_features]
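Note the weight layout: rows are output features, so the input is multiplied against weight transposed. A plain-slice sketch of the same computation (illustrative helper, not the package API):

```go
package main

import "fmt"

// linear computes y = x @ weight^T + bias for
// x [batch, in], weight [out, in], bias [out]; a nil bias is skipped.
func linear(x, weight [][]float64, bias []float64) [][]float64 {
	y := make([][]float64, len(x))
	for b := range x {
		y[b] = make([]float64, len(weight))
		for o, row := range weight {
			sum := 0.0
			for i, w := range row {
				sum += x[b][i] * w
			}
			if bias != nil {
				sum += bias[o]
			}
			y[b][o] = sum
		}
	}
	return y
}

func main() {
	x := [][]float64{{1, 2}}         // [batch=1, in=2]
	w := [][]float64{{3, 4}, {5, 6}} // [out=2, in=2]
	b := []float64{0.5, -0.5}
	fmt.Println(linear(x, w, b)) // [[11.5 16.5]]
}
```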
func LinearBackward ¶
func LinearBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], dOutput, input, weight *tensor.TensorNumeric[T]) (dInput, dWeight, dBias *tensor.TensorNumeric[T], err error)
LinearBackward computes gradients for y = x @ weight^T + bias.
dOutput: gradient from upstream [batch, out_features]
input: original input [batch, in_features]
weight: weight matrix [out_features, in_features]
Returns: dInput [batch, in_features], dWeight [out_features, in_features], dBias [out_features]
func MLPBackward ¶
func MLPBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], dOutput, input, weight1, bias1, weight2, bias2, hidden, activated *tensor.TensorNumeric[T], activation string) (dInput, dWeight1, dBias1, dWeight2, dBias2 *tensor.TensorNumeric[T], err error)
MLPBackward computes gradients for a 2-layer MLP: y = Linear2(activation(Linear1(x))).
dOutput: gradient from upstream [batch, out_features]
input: original input [batch, in_features]
weight1: [hidden, in_features], bias1: [hidden]
weight2: [out_features, hidden], bias2: [out_features]
hidden: output of Linear1 (pre-activation) [batch, hidden]
activated: output after activation [batch, hidden]
activation: "relu" or "gelu"
Returns: dInput, dWeight1, dBias1, dWeight2, dBias2
func MultiHeadAttention ¶
func MultiHeadAttention[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], q, k, v *tensor.TensorNumeric[T], nHeads int) (*tensor.TensorNumeric[T], error)
MultiHeadAttention computes multi-head scaled dot-product attention.
q, k, v: [seq_len, d_model]
nHeads: number of attention heads
Returns: [seq_len, d_model]
func MultiHeadAttentionBackward ¶
func MultiHeadAttentionBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], dOutput, q, k, v *tensor.TensorNumeric[T], nHeads int) (dQ, dK, dV *tensor.TensorNumeric[T], err error)
MultiHeadAttentionBackward computes gradients for multi-head scaled dot-product attention.
dOutput: gradient from upstream [seq_len, d_model]
q, k, v: original inputs [seq_len, d_model]
nHeads: number of attention heads
Returns: dQ, dK, dV [seq_len, d_model]
func RMSNorm ¶
func RMSNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x, scale *tensor.TensorNumeric[T], eps T) (*tensor.TensorNumeric[T], error)
RMSNorm applies root-mean-square normalization to x using the provided scale (gain) tensor. Normalization is performed over the last dimension.
output = x * rsqrt(mean(x^2) + eps) * scale
func ReLU ¶
func ReLU[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
ReLU applies the Rectified Linear Unit activation: max(0, x).
func SiLU ¶
func SiLU[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
SiLU applies the Sigmoid Linear Unit (SiLU / Swish) activation: x * sigmoid(x). All arithmetic is routed through Engine[T] primitives.
func Sigmoid ¶
func Sigmoid[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Sigmoid applies the sigmoid activation: exp(x) / (1 + exp(x)). All arithmetic is routed through Engine[T] primitives.
func Softmax ¶
func Softmax[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x *tensor.TensorNumeric[T], axis int) (*tensor.TensorNumeric[T], error)
Softmax applies the softmax function along the given axis.
func SoftmaxBackward ¶
func SoftmaxBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], dOutput, softmaxOutput *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
SoftmaxBackward computes the gradient of the softmax function.
dOutput: gradient from upstream [*, features]
softmaxOutput: output of softmax forward pass [*, features] (already computed s_i values)
Returns: dInput [*, features]
For each row: dInput_i = s_i * (dOutput_i - sum_j(dOutput_j * s_j))
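The row formula can be sketched directly on slices (illustrative helper, not the package API). Because each softmax row sums to 1, the resulting gradient row always sums to 0:

```go
package main

import "fmt"

// softmaxBackward applies the per-row formula from the doc comment:
// dInput_i = s_i * (dOutput_i - sum_j(dOutput_j * s_j))
func softmaxBackward(dOutput, s []float64) []float64 {
	dot := 0.0
	for j := range s {
		dot += dOutput[j] * s[j]
	}
	dInput := make([]float64, len(s))
	for i := range s {
		dInput[i] = s[i] * (dOutput[i] - dot)
	}
	return dInput
}

func main() {
	s := []float64{0.1, 0.2, 0.7} // a softmax output row (sums to 1)
	d := softmaxBackward([]float64{1, 0, 0}, s)
	sum := 0.0
	for _, v := range d {
		sum += v
	}
	// The gradient row sums to 0 since the softmax outputs sum to 1.
	fmt.Println(d, "sum =", sum)
}
```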
Types ¶
This section is empty.