functional

package
v1.41.0
Published: Apr 4, 2026 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Overview

Package functional provides stateless, tensor-in tensor-out wrappers for common neural-network operations. Unlike the graph-aware layer types in sibling packages (e.g. layers/normalization), the functions here carry no parameter state, do no graph registration, and have no backward pass — they are pure forward-only computations suitable for scripting, testing, and composition.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func GELU

GELU applies the Gaussian Error Linear Unit activation using the tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715*x^3))). All arithmetic is routed through Engine[T] primitives.
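The tanh approximation above can be sketched in plain Go over float64 values (illustrative names, not the package API, which routes this arithmetic through Engine[T]):

```go
package main

import (
	"fmt"
	"math"
)

// geluTanh is a plain-float64 sketch of the tanh-approximated GELU:
// 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715*x^3))).
func geluTanh(x float64) float64 {
	u := math.Sqrt(2/math.Pi) * (x + 0.044715*x*x*x)
	return 0.5 * x * (1 + math.Tanh(u))
}

func main() {
	// GELU passes through the origin and approaches the identity for large x.
	fmt.Println(geluTanh(0.0), geluTanh(3.0), geluTanh(-3.0))
}
```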

func GELUBackward

func GELUBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T],
	dOutput, input *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

GELUBackward computes the gradient of the GELU activation.

dOutput: gradient from upstream
input: original input to GELU
Returns: dInput (same shape as input)

Using the tanh approximation GELU(x) = 0.5 * x * (1 + tanh(u)) where u = sqrt(2/pi) * (x + 0.044715*x^3), the derivative is: GELU'(x) = 0.5*(1+tanh(u)) + 0.5*x*(1-tanh^2(u))*sqrt(2/pi)*(1+3*0.044715*x^2)
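As a sanity check, the derivative formula above can be sketched on scalars and compared against a central finite difference of the forward pass (illustrative names, not the package API):

```go
package main

import (
	"fmt"
	"math"
)

// geluTanh: forward pass, 0.5 * x * (1 + tanh(u)).
func geluTanh(x float64) float64 {
	u := math.Sqrt(2/math.Pi) * (x + 0.044715*x*x*x)
	return 0.5 * x * (1 + math.Tanh(u))
}

// geluGrad: the closed-form derivative quoted above.
func geluGrad(x float64) float64 {
	c := math.Sqrt(2 / math.Pi)
	t := math.Tanh(c * (x + 0.044715*x*x*x))
	return 0.5*(1+t) + 0.5*x*(1-t*t)*c*(1+3*0.044715*x*x)
}

func main() {
	// The central finite difference should closely agree with the closed form.
	x, h := 0.7, 1e-5
	fd := (geluTanh(x+h) - geluTanh(x-h)) / (2 * h)
	fmt.Println(geluGrad(x), fd)
}
```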

func LayerNorm

func LayerNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T],
	x, scale, bias *tensor.TensorNumeric[T], eps T) (*tensor.TensorNumeric[T], error)

LayerNorm applies layer normalization to x using the provided scale (gamma) and bias (beta) tensors. Normalization is performed over the last dimension.

output = (x - mean) / sqrt(variance + eps) * scale + bias
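The formula above can be sketched for a single row (the last dimension) in plain Go; the package version broadcasts this over all leading dimensions. Names here are illustrative, not the package API:

```go
package main

import (
	"fmt"
	"math"
)

// layerNorm1D normalizes one row: (x - mean) / sqrt(variance + eps) * scale + bias.
func layerNorm1D(x, scale, bias []float64, eps float64) []float64 {
	n := float64(len(x))
	mean := 0.0
	for _, v := range x {
		mean += v
	}
	mean /= n
	variance := 0.0
	for _, v := range x {
		d := v - mean
		variance += d * d
	}
	variance /= n
	inv := 1 / math.Sqrt(variance+eps)
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = (v-mean)*inv*scale[i] + bias[i]
	}
	return out
}

func main() {
	// With unit scale and zero bias, the output has zero mean and unit variance.
	fmt.Println(layerNorm1D([]float64{1, 2, 3}, []float64{1, 1, 1}, []float64{0, 0, 0}, 1e-5))
}
```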

func LayerNormBackward

func LayerNormBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T],
	dOutput, input, scale *tensor.TensorNumeric[T], eps T) (dInput, dScale, dBias *tensor.TensorNumeric[T], err error)

LayerNormBackward computes gradients for layer normalization.

dOutput: gradient from upstream [*, features]
input: original input [*, features]
scale: gamma [features]
eps: epsilon used in forward
Returns: dInput [*, features], dScale [features], dBias [features]

func Linear

func Linear[T tensor.Numeric](ctx context.Context, engine compute.Engine[T],
	x, weight *tensor.TensorNumeric[T], bias *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

Linear computes y = x @ weight^T + bias. If bias is nil, computes y = x @ weight^T.

x: [*, in_features]
weight: [out_features, in_features]
bias: [out_features]
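The shapes above can be sketched for a single row of x in plain Go (illustrative names, not the package API):

```go
package main

import "fmt"

// linearRow computes y = x @ weight^T + bias for one row.
// x: [in], weight: [out][in], bias: [out] (nil means no bias term).
func linearRow(x []float64, weight [][]float64, bias []float64) []float64 {
	out := make([]float64, len(weight))
	for o, row := range weight {
		sum := 0.0
		for i, w := range row {
			sum += x[i] * w // dot product of x with weight row o
		}
		if bias != nil {
			sum += bias[o]
		}
		out[o] = sum
	}
	return out
}

func main() {
	y := linearRow([]float64{1, 2}, [][]float64{{1, 0}, {0, 1}, {1, 1}}, []float64{1, 1, 1})
	fmt.Println(y) // [2 3 4]
}
```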

func LinearBackward

func LinearBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T],
	dOutput, input, weight *tensor.TensorNumeric[T]) (dInput, dWeight, dBias *tensor.TensorNumeric[T], err error)

LinearBackward computes gradients for y = x @ weight^T + bias.

dOutput: gradient from upstream [batch, out_features]
input: original input [batch, in_features]
weight: weight matrix [out_features, in_features]
Returns: dInput [batch, in_features], dWeight [out_features, in_features], dBias [out_features]
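For a batch of one row, the three gradients reduce to dInput = dOutput @ weight, dWeight = outer(dOutput, input), and dBias = dOutput, which can be sketched in plain Go (illustrative names, not the package API):

```go
package main

import "fmt"

// linearBackward1 computes the gradients of y = x @ weight^T + bias
// for a single-row batch. dOut: [out], input: [in], weight: [out][in].
func linearBackward1(dOut, input []float64, weight [][]float64) (dIn, dBias []float64, dWeight [][]float64) {
	dIn = make([]float64, len(input))
	dWeight = make([][]float64, len(dOut))
	for o, g := range dOut {
		dWeight[o] = make([]float64, len(input))
		for i, v := range input {
			dIn[i] += g * weight[o][i] // dInput = dOutput @ weight
			dWeight[o][i] = g * v      // dWeight = outer(dOutput, input)
		}
	}
	dBias = append([]float64(nil), dOut...) // dBias = dOutput for batch size 1
	return
}

func main() {
	dIn, dBias, dW := linearBackward1([]float64{5}, []float64{1, 2}, [][]float64{{3, 4}})
	fmt.Println(dIn, dBias, dW) // [15 20] [5] [[5 10]]
}
```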

func MLPBackward

func MLPBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T],
	dOutput, input, weight1, bias1, weight2, bias2, hidden, activated *tensor.TensorNumeric[T],
	activation string) (dInput, dWeight1, dBias1, dWeight2, dBias2 *tensor.TensorNumeric[T], err error)

MLPBackward computes gradients for a 2-layer MLP: y = Linear2(activation(Linear1(x))).

dOutput: gradient from upstream [batch, out_features]
input: original input [batch, in_features]
weight1: [hidden, in_features], bias1: [hidden]
weight2: [out_features, hidden], bias2: [out_features]
hidden: output of Linear1 (pre-activation) [batch, hidden]
activated: output after activation [batch, hidden]
activation: "relu" or "gelu"
Returns: dInput, dWeight1, dBias1, dWeight2, dBias2
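The chain rule behind this can be sketched with every quantity reduced to a scalar and a relu activation; the package version does the same with matrices (illustrative names, not the package API):

```go
package main

import (
	"fmt"
	"math"
)

// mlpBackwardScalar walks the chain rule for y = w2*relu(w1*x + b1) + b2.
func mlpBackwardScalar(dOut, x, w1, b1, w2 float64) (dX, dW1, dB1, dW2, dB2 float64) {
	h := w1*x + b1      // hidden: output of Linear1 (pre-activation)
	a := math.Max(0, h) // activated
	dW2 = dOut * a      // Linear2 weight sees the activated value
	dB2 = dOut
	dA := dOut * w2 // gradient flowing back into the activation
	dH := 0.0
	if h > 0 { // relu passes gradient only where h > 0
		dH = dA
	}
	dW1 = dH * x
	dB1 = dH
	dX = dH * w1
	return
}

func main() {
	fmt.Println(mlpBackwardScalar(1, 2, 3, 1, 5))
}
```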

func MultiHeadAttention

func MultiHeadAttention[T tensor.Numeric](ctx context.Context, engine compute.Engine[T],
	q, k, v *tensor.TensorNumeric[T], nHeads int) (*tensor.TensorNumeric[T], error)

MultiHeadAttention computes multi-head scaled dot-product attention.

q, k, v: [seq_len, d_model]
nHeads: number of attention heads
Returns: [seq_len, d_model]
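The core of each head is plain scaled dot-product attention; MultiHeadAttention splits d_model across nHeads, runs this per head, and concatenates the results. A single-head sketch in plain Go (illustrative names, not the package API):

```go
package main

import (
	"fmt"
	"math"
)

// attention1Head computes softmax(q k^T / sqrt(d)) v for one head.
// q, k, v: [seq][d] matrices.
func attention1Head(q, k, v [][]float64) [][]float64 {
	scaleInv := 1 / math.Sqrt(float64(len(q[0])))
	out := make([][]float64, len(q))
	for i := range q {
		// scores_ij = (q_i · k_j) / sqrt(d), then a stable softmax over j.
		w := make([]float64, len(k))
		maxv := math.Inf(-1)
		for j := range k {
			for t := range q[i] {
				w[j] += q[i][t] * k[j][t]
			}
			w[j] *= scaleInv
			maxv = math.Max(maxv, w[j])
		}
		sum := 0.0
		for j := range w {
			w[j] = math.Exp(w[j] - maxv)
			sum += w[j]
		}
		// Output row i is the attention-weighted mix of the v rows.
		out[i] = make([]float64, len(v[0]))
		for j := range v {
			for t := range v[j] {
				out[i][t] += w[j] / sum * v[j][t]
			}
		}
	}
	return out
}

func main() {
	id := [][]float64{{1, 0}, {0, 1}}
	fmt.Println(attention1Head(id, id, id))
}
```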

func MultiHeadAttentionBackward

func MultiHeadAttentionBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T],
	dOutput, q, k, v *tensor.TensorNumeric[T], nHeads int) (dQ, dK, dV *tensor.TensorNumeric[T], err error)

MultiHeadAttentionBackward computes gradients for multi-head scaled dot-product attention.

dOutput: gradient from upstream [seq_len, d_model]
q, k, v: original inputs [seq_len, d_model]
nHeads: number of attention heads
Returns: dQ, dK, dV [seq_len, d_model]

func RMSNorm

func RMSNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T],
	x, scale *tensor.TensorNumeric[T], eps T) (*tensor.TensorNumeric[T], error)

RMSNorm applies root-mean-square normalization to x using the provided scale (gain) tensor. Normalization is performed over the last dimension.

output = x * rsqrt(mean(x^2) + eps) * scale
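The formula above can be sketched for one row in plain Go; unlike LayerNorm it divides by the root mean square without centering, and has no bias term. Names here are illustrative, not the package API:

```go
package main

import (
	"fmt"
	"math"
)

// rmsNorm1D normalizes one row: x * rsqrt(mean(x^2) + eps) * scale.
func rmsNorm1D(x, scale []float64, eps float64) []float64 {
	ms := 0.0
	for _, v := range x {
		ms += v * v
	}
	ms /= float64(len(x))
	r := 1 / math.Sqrt(ms+eps)
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = v * r * scale[i]
	}
	return out
}

func main() {
	// With unit scale, the output row has mean square ~1.
	fmt.Println(rmsNorm1D([]float64{3, 4}, []float64{1, 1}, 1e-6))
}
```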

func ReLU

ReLU applies the Rectified Linear Unit activation: max(0, x).

func SiLU

SiLU applies the Sigmoid Linear Unit (SiLU / Swish) activation: x * sigmoid(x). All arithmetic is routed through Engine[T] primitives.

func Sigmoid

func Sigmoid[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T],
	x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

Sigmoid applies the sigmoid activation: exp(x) / (1 + exp(x)). All arithmetic is routed through Engine[T] primitives.
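Evaluating exp(x) / (1 + exp(x)) literally overflows for large positive x; a sketch in plain Go of the same function in a numerically stable form, branching on the sign (an assumption for illustration, not necessarily how the package computes it):

```go
package main

import (
	"fmt"
	"math"
)

// sigmoid evaluates the logistic function stably: for x >= 0 use the
// equivalent 1/(1+exp(-x)) so exp never overflows; for x < 0 use the
// documented exp(x)/(1+exp(x)), where exp(x) <= 1.
func sigmoid(x float64) float64 {
	if x >= 0 {
		return 1 / (1 + math.Exp(-x))
	}
	e := math.Exp(x)
	return e / (1 + e)
}

func main() {
	fmt.Println(sigmoid(-2), sigmoid(0), sigmoid(2))
}
```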

func Softmax

func Softmax[T tensor.Numeric](ctx context.Context, engine compute.Engine[T],
	x *tensor.TensorNumeric[T], axis int) (*tensor.TensorNumeric[T], error)

Softmax applies the softmax function along the given axis.
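A single-row sketch in plain Go, with the usual max subtraction so exp never overflows (illustrative names, not the package API):

```go
package main

import (
	"fmt"
	"math"
)

// softmax1D computes exp(x_i - max(x)) / sum_j exp(x_j - max(x));
// subtracting the max leaves the result unchanged but keeps exp bounded.
func softmax1D(x []float64) []float64 {
	maxv := x[0]
	for _, v := range x[1:] {
		maxv = math.Max(maxv, v)
	}
	sum := 0.0
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = math.Exp(v - maxv)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

func main() {
	fmt.Println(softmax1D([]float64{1, 2, 3}))
}
```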

func SoftmaxBackward

func SoftmaxBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T],
	dOutput, softmaxOutput *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

SoftmaxBackward computes the gradient of the softmax function.

dOutput: gradient from upstream [*, features]
softmaxOutput: output of the softmax forward pass [*, features] (the already-computed s_i values)
Returns: dInput [*, features]

For each row: dInput_i = s_i * (dOutput_i - sum_j(dOutput_j * s_j))
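The per-row formula above is a Jacobian-vector product that never materializes the full softmax Jacobian; a plain-Go sketch (illustrative names, not the package API):

```go
package main

import "fmt"

// softmaxBackward1D computes dInput_i = s_i * (dOut_i - sum_j(dOut_j * s_j))
// for one row, where s is the saved softmax output.
func softmaxBackward1D(dOut, s []float64) []float64 {
	dot := 0.0
	for j := range s {
		dot += dOut[j] * s[j]
	}
	dIn := make([]float64, len(s))
	for i := range s {
		dIn[i] = s[i] * (dOut[i] - dot)
	}
	return dIn
}

func main() {
	fmt.Println(softmaxBackward1D([]float64{1, 0}, []float64{0.5, 0.5})) // [0.25 -0.25]
}
```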

Types

This section is empty.
