Documentation ¶
Overview ¶
Package functional provides stateless, tensor-in tensor-out wrappers for common neural-network operations. Unlike the graph-aware layer types in sibling packages (e.g. layers/normalization), the functions here carry no parameter state, do no graph registration, and have no backward pass — they are pure forward-only computations suitable for scripting, testing, and composition.
Index ¶
- func GELU[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func GELUBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func LayerNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (*tensor.TensorNumeric[T], error)
- func LayerNormBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (dInput, dScale, dBias *tensor.TensorNumeric[T], err error)
- func Linear[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (*tensor.TensorNumeric[T], error)
- func LinearBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (dInput, dWeight, dBias *tensor.TensorNumeric[T], err error)
- func MLPBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (dInput, dWeight1, dBias1, dWeight2, dBias2 *tensor.TensorNumeric[T], err error)
- func MultiHeadAttention[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (*tensor.TensorNumeric[T], error)
- func MultiHeadAttentionBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (dQ, dK, dV *tensor.TensorNumeric[T], err error)
- func RMSNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ...) (*tensor.TensorNumeric[T], error)
- func ReLU[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func SiLU[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func Sigmoid[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
- func Softmax[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x *tensor.TensorNumeric[T], ...) (*tensor.TensorNumeric[T], error)
- func SoftmaxBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], ...) (*tensor.TensorNumeric[T], error)
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GELU ¶
func GELU[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
GELU applies the Gaussian Error Linear Unit activation using the tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715*x^3))). All arithmetic is routed through Engine[T] primitives.
func GELUBackward ¶
func GELUBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], dOutput, input *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
GELUBackward computes the gradient of the GELU activation.
dOutput: gradient from upstream
input: original input to GELU
Returns: dInput (same shape as input)

Using the tanh approximation GELU(x) = 0.5 * x * (1 + tanh(u)) where u = sqrt(2/pi) * (x + 0.044715*x^3), the derivative is: GELU'(x) = 0.5*(1+tanh(u)) + 0.5*x*(1-tanh^2(u))*sqrt(2/pi)*(1+3*0.044715*x^2)
func LayerNorm ¶
func LayerNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x, scale, bias *tensor.TensorNumeric[T], eps T) (*tensor.TensorNumeric[T], error)
LayerNorm applies layer normalization to x using the provided scale (gamma) and bias (beta) tensors. Normalization is performed over the last dimension.
output = (x - mean) / sqrt(variance + eps) * scale + bias
func LayerNormBackward ¶
func LayerNormBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], dOutput, input, scale *tensor.TensorNumeric[T], eps T) (dInput, dScale, dBias *tensor.TensorNumeric[T], err error)
LayerNormBackward computes gradients for layer normalization.
dOutput: gradient from upstream [*, features]
input: original input [*, features]
scale: gamma [features]
eps: epsilon used in forward
Returns: dInput [*, features], dScale [features], dBias [features]
func Linear ¶
func Linear[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x, weight *tensor.TensorNumeric[T], bias *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Linear computes y = x @ weight^T + bias. If bias is nil, computes y = x @ weight^T.
x: [*, in_features], weight: [out_features, in_features], bias: [out_features]
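Note the weight layout: rows are output features, so the input is multiplied against weight transposed. A plain-slice sketch of the same computation (illustrative helper, not the package API):

```go
package main

import "fmt"

// linear computes y = x @ weight^T + bias for
// x [batch, in], weight [out, in], bias [out]; a nil bias is skipped.
func linear(x, weight [][]float64, bias []float64) [][]float64 {
	y := make([][]float64, len(x))
	for b := range x {
		y[b] = make([]float64, len(weight))
		for o, row := range weight {
			sum := 0.0
			for i, w := range row {
				sum += x[b][i] * w
			}
			if bias != nil {
				sum += bias[o]
			}
			y[b][o] = sum
		}
	}
	return y
}

func main() {
	x := [][]float64{{1, 2}}         // [batch=1, in=2]
	w := [][]float64{{3, 4}, {5, 6}} // [out=2, in=2]
	b := []float64{0.5, -0.5}
	fmt.Println(linear(x, w, b)) // [[11.5 16.5]]
}
```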
func LinearBackward ¶
func LinearBackward[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], dOutput, input, weight *tensor.TensorNumeric[T]) (dInput, dWeight, dBias *tensor.TensorNumeric[T], err error)
LinearBackward computes gradients for y = x @ weight^T + bias.
dOutput: gradient from upstream [batch, out_features]
input: original input [batch, in_features]
weight: weight matrix [out_features, in_features]
Returns: dInput [batch, in_features], dWeight [out_features, in_features], dBias [out_features]
func MLPBackward ¶
func MLPBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], dOutput, input, weight1, bias1, weight2, bias2, hidden, activated *tensor.TensorNumeric[T], activation string) (dInput, dWeight1, dBias1, dWeight2, dBias2 *tensor.TensorNumeric[T], err error)
MLPBackward computes gradients for a 2-layer MLP: y = Linear2(activation(Linear1(x))).
dOutput: gradient from upstream [batch, out_features]
input: original input [batch, in_features]
weight1: [hidden, in_features], bias1: [hidden]
weight2: [out_features, hidden], bias2: [out_features]
hidden: output of Linear1 (pre-activation) [batch, hidden]
activated: output after activation [batch, hidden]
activation: "relu" or "gelu"
Returns: dInput, dWeight1, dBias1, dWeight2, dBias2
func MultiHeadAttention ¶
func MultiHeadAttention[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], q, k, v *tensor.TensorNumeric[T], nHeads int) (*tensor.TensorNumeric[T], error)
MultiHeadAttention computes multi-head scaled dot-product attention.
q, k, v: [seq_len, d_model]
nHeads: number of attention heads
Returns: [seq_len, d_model]
func MultiHeadAttentionBackward ¶
func MultiHeadAttentionBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], dOutput, q, k, v *tensor.TensorNumeric[T], nHeads int) (dQ, dK, dV *tensor.TensorNumeric[T], err error)
MultiHeadAttentionBackward computes gradients for multi-head scaled dot-product attention.
dOutput: gradient from upstream [seq_len, d_model]
q, k, v: original inputs [seq_len, d_model]
nHeads: number of attention heads
Returns: dQ, dK, dV [seq_len, d_model]
func RMSNorm ¶
func RMSNorm[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x, scale *tensor.TensorNumeric[T], eps T) (*tensor.TensorNumeric[T], error)
RMSNorm applies root-mean-square normalization to x using the provided scale (gain) tensor. Normalization is performed over the last dimension.
output = x * rsqrt(mean(x^2) + eps) * scale
func ReLU ¶
func ReLU[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
ReLU applies the Rectified Linear Unit activation: max(0, x).
func SiLU ¶
func SiLU[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
SiLU applies the Sigmoid Linear Unit (SiLU / Swish) activation: x * sigmoid(x). All arithmetic is routed through Engine[T] primitives.
func Sigmoid ¶
func Sigmoid[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], x *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Sigmoid applies the sigmoid activation: exp(x) / (1 + exp(x)). All arithmetic is routed through Engine[T] primitives.
func Softmax ¶
func Softmax[T tensor.Numeric](ctx context.Context, engine compute.Engine[T], x *tensor.TensorNumeric[T], axis int) (*tensor.TensorNumeric[T], error)
Softmax applies the softmax function along the given axis.
func SoftmaxBackward ¶
func SoftmaxBackward[T tensor.Float](ctx context.Context, engine compute.Engine[T], ops numeric.Arithmetic[T], dOutput, softmaxOutput *tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
SoftmaxBackward computes the gradient of the softmax function.
dOutput: gradient from upstream [*, features]
softmaxOutput: output of softmax forward pass [*, features] (already computed s_i values)
Returns: dInput [*, features]
For each row: dInput_i = s_i * (dOutput_i - sum_j(dOutput_j * s_j))
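The row formula can be sketched directly on slices (illustrative helper, not the package API). Because each softmax row sums to 1, the resulting gradient row always sums to 0:

```go
package main

import "fmt"

// softmaxBackward applies the per-row formula from the doc comment:
// dInput_i = s_i * (dOutput_i - sum_j(dOutput_j * s_j))
func softmaxBackward(dOutput, s []float64) []float64 {
	dot := 0.0
	for j := range s {
		dot += dOutput[j] * s[j]
	}
	dInput := make([]float64, len(s))
	for i := range s {
		dInput[i] = s[i] * (dOutput[i] - dot)
	}
	return dInput
}

func main() {
	s := []float64{0.1, 0.2, 0.7} // a softmax output row (sums to 1)
	d := softmaxBackward([]float64{1, 0, 0}, s)
	sum := 0.0
	for _, v := range d {
		sum += v
	}
	// The gradient row sums to 0 since the softmax outputs sum to 1.
	fmt.Println(d, "sum =", sum)
}
```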
Types ¶
This section is empty.