loss

package
v1.38.1
Published: Mar 30, 2026 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Overview

Package loss provides loss function implementations for training neural networks.

Stability: beta

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func QuantileLoss added in v1.5.0

func QuantileLoss[T tensor.Numeric](preds, targets *tensor.TensorNumeric[T], quantiles []float32) (float32, error)

QuantileLoss computes the pinball (quantile regression) loss. preds has shape [batch, num_quantiles], targets has shape [batch], and quantiles is a slice of quantile levels (e.g., 0.1, 0.5, 0.9).

For each quantile q and sample i:

error = target_i - pred_i_q
loss_q = q * error   if error >= 0
loss_q = (q-1) * error   if error < 0

Returns the mean loss over all samples and quantiles.
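The documented formula can be restated on plain slices. This is an illustrative sketch of the pinball loss math, not the package's tensor-based implementation; the function name `pinballLoss` is hypothetical.

```go
package main

import "fmt"

// pinballLoss restates the documented formula on plain slices. preds holds
// one row of num_quantiles predictions per sample; targets holds one value
// per sample. The result is the mean over all samples and quantiles.
func pinballLoss(preds [][]float32, targets []float32, quantiles []float32) float32 {
	var sum float32
	var n int
	for i, row := range preds {
		for j, q := range quantiles {
			e := targets[i] - row[j]
			if e >= 0 {
				sum += q * e
			} else {
				sum += (q - 1) * e
			}
			n++
		}
	}
	return sum / float32(n)
}

func main() {
	// One sample with target 1; three quantile heads predict 0, 1, and 2.
	fmt.Println(pinballLoss([][]float32{{0, 1, 2}}, []float32{1}, []float32{0.1, 0.5, 0.9}))
}
```

Note how over-prediction at a low quantile and under-prediction at a high quantile are both penalized lightly, which is what pushes each head toward its quantile level.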

func SharpeLoss added in v1.5.0

func SharpeLoss[T tensor.Numeric](weights, returns_ *tensor.TensorNumeric[T]) (float32, error)

SharpeLoss computes the negative Sharpe ratio as a differentiable loss for portfolio optimization.

weights has shape [batch, num_assets] — interpreted as portfolio weights (softmax-normalized internally to ensure long-only, sum-to-one constraint). returns_ has shape [batch, num_assets] — per-asset log returns for each time step in the batch.

Portfolio return for time step i = sum_j(w_j * r_ij)
Sharpe = mean(portfolio_returns) / std(portfolio_returns)
SharpeLoss = -Sharpe (to minimize)

Types

type BCELoss added in v1.8.0

type BCELoss[T tensor.Numeric] struct {
	// contains filtered or unexported fields
}

BCELoss calculates binary cross-entropy loss between predictions and targets.

BCE(y, p) = -[y*log(p) + (1-y)*log(1-p)]

Predictions are clamped to [eps, 1-eps] for numerical stability.

func NewBCELoss added in v1.8.0

func NewBCELoss[T tensor.Numeric](engine compute.Engine[T], ops numeric.Arithmetic[T]) *BCELoss[T]

NewBCELoss creates a new binary cross-entropy loss function.

func (*BCELoss[T]) Attributes added in v1.8.0

func (b *BCELoss[T]) Attributes() map[string]interface{}

Attributes returns the attributes of the BCELoss function.

func (*BCELoss[T]) Backward added in v1.8.0

func (b *BCELoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], inputs ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)

Backward computes the gradients for BCELoss with respect to predictions. Gradient: -(y/p - (1-y)/(1-p)) / N, chained with upstream dOut.

func (*BCELoss[T]) Forward added in v1.8.0

func (b *BCELoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

Forward computes the mean binary cross-entropy loss.

func (*BCELoss[T]) OpType added in v1.8.0

func (b *BCELoss[T]) OpType() string

OpType returns the operation type of the BCELoss function.

func (*BCELoss[T]) OutputShape added in v1.8.0

func (b *BCELoss[T]) OutputShape() []int

OutputShape returns the output shape of the BCELoss function.

func (*BCELoss[T]) Parameters added in v1.8.0

func (b *BCELoss[T]) Parameters() []*graph.Parameter[T]

Parameters returns the parameters of the BCELoss function.

type CorrLoss added in v0.2.1

type CorrLoss[T tensor.Numeric] struct {
	// contains filtered or unexported fields
}

CorrLoss computes -PearsonCorrelation(predictions, targets) as a differentiable scalar loss. Minimizing this loss maximizes the Pearson correlation between predictions and targets. Since Numerai targets are rank-normalized, Pearson closely approximates Spearman rank correlation.

Forward: loss = -sum(p_c * t_c) / (sqrt(sum(p_c^2) * sum(t_c^2)) + eps)

where p_c = p - mean(p), t_c = t - mean(t)

Backward: grad_i = -(t_c_i / denom - corr * p_c_i / sum_pp) * dOut

All tensor operations use the engine, keeping data on GPU when available. Only scalar intermediate values (means, sums) are read back to CPU.
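The forward formula can be restated on plain slices. This is an illustrative sketch; the `corrLoss` name and the eps value are assumptions, and the package computes this with engine-backed tensor ops instead.

```go
package main

import (
	"fmt"
	"math"
)

// corrLoss computes -sum(p_c * t_c) / (sqrt(sum(p_c^2) * sum(t_c^2)) + eps),
// where p_c and t_c are the mean-centered predictions and targets.
func corrLoss(preds, targets []float64) float64 {
	const eps = 1e-8 // illustrative stability constant
	n := float64(len(preds))
	var mp, mt float64
	for i := range preds {
		mp += preds[i]
		mt += targets[i]
	}
	mp, mt = mp/n, mt/n
	var spt, spp, stt float64
	for i := range preds {
		pc, tc := preds[i]-mp, targets[i]-mt
		spt += pc * tc
		spp += pc * pc
		stt += tc * tc
	}
	return -spt / (math.Sqrt(spp*stt) + eps)
}

func main() {
	// Perfectly correlated inputs drive the loss to its minimum of -1.
	fmt.Println(corrLoss([]float64{1, 2, 3}, []float64{2, 4, 6}))
}
```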

func NewCorrLoss added in v0.2.1

func NewCorrLoss[T tensor.Numeric](engine compute.Engine[T], ops numeric.Arithmetic[T]) *CorrLoss[T]

NewCorrLoss creates a new correlation loss function.

func (*CorrLoss[T]) Attributes added in v0.2.1

func (c *CorrLoss[T]) Attributes() map[string]any

Attributes returns nil (no configurable attributes).

func (*CorrLoss[T]) Backward added in v0.2.1

func (c *CorrLoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], inputs ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)

Backward computes the gradient of -PearsonCorrelation with respect to predictions. Returns [dPredictions, dTargets(zeros)].

func (*CorrLoss[T]) Forward added in v0.2.1

func (c *CorrLoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

Forward computes -PearsonCorrelation(predictions, targets).

func (*CorrLoss[T]) OpType added in v0.2.1

func (c *CorrLoss[T]) OpType() string

OpType returns "CorrLoss".

func (*CorrLoss[T]) OutputShape added in v0.2.1

func (c *CorrLoss[T]) OutputShape() []int

OutputShape returns [1] (scalar loss).

func (*CorrLoss[T]) Parameters added in v0.2.1

func (c *CorrLoss[T]) Parameters() []*graph.Parameter[T]

Parameters returns nil (no trainable parameters).

type CrossEntropyLoss

type CrossEntropyLoss[T tensor.Numeric] struct {
	// contains filtered or unexported fields
}

CrossEntropyLoss computes the cross-entropy loss.

func NewCrossEntropyLoss

func NewCrossEntropyLoss[T tensor.Numeric](engine compute.Engine[T]) *CrossEntropyLoss[T]

NewCrossEntropyLoss creates a new CrossEntropyLoss layer.

func (*CrossEntropyLoss[T]) Attributes added in v0.2.1

func (cel *CrossEntropyLoss[T]) Attributes() map[string]interface{}

Attributes returns the attributes of the CrossEntropyLoss layer.

func (*CrossEntropyLoss[T]) Backward

func (cel *CrossEntropyLoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], _ ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)

Backward computes the gradients for CrossEntropyLoss.

func (*CrossEntropyLoss[T]) Forward

func (cel *CrossEntropyLoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

Forward computes the cross-entropy loss. Inputs: predictions (logits as T), targets (labels as T that will be converted to int indices).

func (*CrossEntropyLoss[T]) OpType added in v0.2.1

func (cel *CrossEntropyLoss[T]) OpType() string

OpType returns the operation type of the CrossEntropyLoss layer.

func (*CrossEntropyLoss[T]) OutputShape

func (cel *CrossEntropyLoss[T]) OutputShape() []int

OutputShape returns the output shape of the loss (a scalar).

func (*CrossEntropyLoss[T]) Parameters

func (cel *CrossEntropyLoss[T]) Parameters() []*graph.Parameter[T]

Parameters returns an empty slice as CrossEntropyLoss has no trainable parameters.

type Loss

type Loss[T tensor.Numeric] interface {
	// Forward computes the loss and its gradient.
	Forward(ctx context.Context, predictions, targets *tensor.TensorNumeric[T]) (T, *tensor.TensorNumeric[T], error)
}

Loss defines the interface for loss functions.

type MSE

type MSE[T tensor.Numeric] struct {
	// contains filtered or unexported fields
}

MSE calculates the mean squared error between predictions and targets.
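The computation can be restated on plain slices; an illustrative sketch of what the type computes, not the package's engine-backed implementation.

```go
package main

import "fmt"

// mse computes the mean of squared differences between predictions
// and targets.
func mse(preds, targets []float64) float64 {
	var sum float64
	for i := range preds {
		d := preds[i] - targets[i]
		sum += d * d
	}
	return sum / float64(len(preds))
}

func main() {
	fmt.Println(mse([]float64{1, 2}, []float64{0, 0})) // (1 + 4) / 2 = 2.5
}
```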

func NewMSE

func NewMSE[T tensor.Numeric](engine compute.Engine[T], ops numeric.Arithmetic[T]) *MSE[T]

NewMSE creates a new MSE loss function.

func (*MSE[T]) Attributes added in v0.2.1

func (m *MSE[T]) Attributes() map[string]interface{}

Attributes returns the attributes of the MSE loss function.

func (*MSE[T]) Backward

func (m *MSE[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], inputs ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)

Backward computes the gradients for MSE with respect to inputs. Returns gradients in the order of inputs: [dPredictions, dTargets(nil)].

func (*MSE[T]) Forward

func (m *MSE[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

Forward computes the loss value.

func (*MSE[T]) OpType added in v0.2.1

func (m *MSE[T]) OpType() string

OpType returns the operation type of the MSE loss function.

func (*MSE[T]) OutputShape added in v0.2.1

func (m *MSE[T]) OutputShape() []int

OutputShape returns the output shape of the MSE loss function.

func (*MSE[T]) Parameters added in v0.2.1

func (m *MSE[T]) Parameters() []*graph.Parameter[T]

Parameters returns the parameters of the MSE loss function.

type RoutingContrastive added in v1.29.0

type RoutingContrastive[T tensor.Numeric] struct {
	// contains filtered or unexported fields
}

RoutingContrastive computes an auxiliary contrastive loss over routing scores from SparseRoutedAttention. It encourages routing diversity (different heads attend to different document regions) and routing specificity (each head specializes on a subset of documents).

Input: routing scores tensor with shape [batch, numHeads, seqLen].

The loss is the mean pairwise cosine similarity between head routing distributions. Minimizing this pushes heads apart so they specialize on different sequence regions.

Loss = scale * mean(cosineSim(head_i, head_j)) for all i < j.

func NewRoutingContrastive added in v1.29.0

func NewRoutingContrastive[T tensor.Numeric](engine compute.Engine[T], ops numeric.Arithmetic[T], scale float64) *RoutingContrastive[T]

NewRoutingContrastive creates a new contrastive routing loss. scale controls the loss magnitude (default recommendation: 0.01).

func (*RoutingContrastive[T]) Attributes added in v1.29.0

func (rc *RoutingContrastive[T]) Attributes() map[string]interface{}

Attributes returns the layer configuration.

func (*RoutingContrastive[T]) Backward added in v1.29.0

func (rc *RoutingContrastive[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], inputs ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)

Backward computes gradients of the contrastive routing loss with respect to the routing scores input.

d(loss)/d(scores[b,h,s]) = scale / (numPairs * batch) * sum over j != h of d(cosineSim(h,j))/d(scores[b,h,s])

where d(cos(a,b))/d(a_i) = b_i / (|a|*|b|) - cos(a,b) * a_i / |a|^2

func (*RoutingContrastive[T]) Forward added in v1.29.0

func (rc *RoutingContrastive[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)

Forward computes the contrastive routing loss.

Inputs: exactly one tensor of shape [batch, numHeads, seqLen]. Returns a scalar loss tensor of shape [1].

func (*RoutingContrastive[T]) OpType added in v1.29.0

func (rc *RoutingContrastive[T]) OpType() string

OpType returns the operation type identifier.

func (*RoutingContrastive[T]) OutputShape added in v1.29.0

func (rc *RoutingContrastive[T]) OutputShape() []int

OutputShape returns the output shape of the loss (scalar).

func (*RoutingContrastive[T]) Parameters added in v1.29.0

func (rc *RoutingContrastive[T]) Parameters() []*graph.Parameter[T]

Parameters returns nil (no trainable parameters).
