Documentation
Overview ¶
Package loss provides loss function implementations for training neural networks.
Stability: beta
Index ¶
- func QuantileLoss[T tensor.Numeric](preds, targets *tensor.TensorNumeric[T], quantiles []float32) (float32, error)
- func SharpeLoss[T tensor.Numeric](weights, returns_ *tensor.TensorNumeric[T]) (float32, error)
- type BCELoss
- func (b *BCELoss[T]) Attributes() map[string]interface{}
- func (b *BCELoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], ...) ([]*tensor.TensorNumeric[T], error)
- func (b *BCELoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
- func (b *BCELoss[T]) OpType() string
- func (b *BCELoss[T]) OutputShape() []int
- func (b *BCELoss[T]) Parameters() []*graph.Parameter[T]
- type CorrLoss
- func (c *CorrLoss[T]) Attributes() map[string]any
- func (c *CorrLoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], ...) ([]*tensor.TensorNumeric[T], error)
- func (c *CorrLoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
- func (c *CorrLoss[T]) OpType() string
- func (c *CorrLoss[T]) OutputShape() []int
- func (c *CorrLoss[T]) Parameters() []*graph.Parameter[T]
- type CrossEntropyLoss
- func (cel *CrossEntropyLoss[T]) Attributes() map[string]interface{}
- func (cel *CrossEntropyLoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], ...) ([]*tensor.TensorNumeric[T], error)
- func (cel *CrossEntropyLoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
- func (cel *CrossEntropyLoss[T]) OpType() string
- func (cel *CrossEntropyLoss[T]) OutputShape() []int
- func (cel *CrossEntropyLoss[T]) Parameters() []*graph.Parameter[T]
- type Loss
- type MSE
- func (m *MSE[T]) Attributes() map[string]interface{}
- func (m *MSE[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], ...) ([]*tensor.TensorNumeric[T], error)
- func (m *MSE[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
- func (m *MSE[T]) OpType() string
- func (m *MSE[T]) OutputShape() []int
- func (m *MSE[T]) Parameters() []*graph.Parameter[T]
- type RoutingContrastive
- func (rc *RoutingContrastive[T]) Attributes() map[string]interface{}
- func (rc *RoutingContrastive[T]) Backward(_ context.Context, _ types.BackwardMode, _ *tensor.TensorNumeric[T], ...) ([]*tensor.TensorNumeric[T], error)
- func (rc *RoutingContrastive[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
- func (rc *RoutingContrastive[T]) OpType() string
- func (rc *RoutingContrastive[T]) OutputShape() []int
- func (rc *RoutingContrastive[T]) Parameters() []*graph.Parameter[T]
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func QuantileLoss ¶ added in v1.5.0
func QuantileLoss[T tensor.Numeric](preds, targets *tensor.TensorNumeric[T], quantiles []float32) (float32, error)
QuantileLoss computes the pinball (quantile regression) loss. preds has shape [batch, num_quantiles], targets has shape [batch], and quantiles is a slice of quantile levels (e.g., 0.1, 0.5, 0.9).
For each quantile q and sample i:
error  = target_i - pred_i_q
loss_q = q * error        if error >= 0
loss_q = (q - 1) * error  if error < 0
Returns the mean loss over all samples and quantiles.
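The per-quantile rule above can be sketched on plain float32 slices (illustrative only; the function name and slice-based signature here are not the package API, which operates on *tensor.TensorNumeric[T]):

```go
package main

import "fmt"

// pinballLoss re-implements the pinball (quantile regression) math on
// plain slices. preds is [batch][numQuantiles], targets is [batch].
func pinballLoss(preds [][]float32, targets []float32, quantiles []float32) float32 {
	var sum float32
	for i, row := range preds {
		for q, p := range row {
			err := targets[i] - p
			if err >= 0 {
				sum += quantiles[q] * err
			} else {
				sum += (quantiles[q] - 1) * err // err < 0, so this term is positive
			}
		}
	}
	// Mean over all samples and quantiles.
	return sum / float32(len(preds)*len(quantiles))
}

func main() {
	preds := [][]float32{{0.9, 1.0, 1.1}}
	fmt.Println(pinballLoss(preds, []float32{1.0}, []float32{0.1, 0.5, 0.9}))
}
```

Under-prediction is penalized by q and over-prediction by (1 - q), which is what makes each output column converge to its quantile level rather than the mean.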
func SharpeLoss ¶ added in v1.5.0
func SharpeLoss[T tensor.Numeric](weights, returns_ *tensor.TensorNumeric[T]) (float32, error)
SharpeLoss computes the negative Sharpe ratio as a differentiable loss for portfolio optimization.
weights has shape [batch, num_assets] — interpreted as portfolio weights (softmax-normalized internally to ensure long-only, sum-to-one constraint). returns_ has shape [batch, num_assets] — per-asset log returns for each time step in the batch.
Portfolio return for time step i = sum_j(w_j * r_ij)
Sharpe = mean(portfolio_returns) / std(portfolio_returns)
SharpeLoss = -Sharpe (to minimize)
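A minimal sketch of this computation on plain float64 slices (the function name and signature here are illustrative, not the package API; the tensor version is differentiable and stays on the engine):

```go
package main

import (
	"fmt"
	"math"
)

// sharpeLoss: weights is [batch][numAssets] raw scores, returns is
// [batch][numAssets] per-asset log returns for each time step.
func sharpeLoss(weights, returns [][]float64) float64 {
	port := make([]float64, len(weights))
	for i := range weights {
		// Softmax-normalize the weights row: long-only, sums to one.
		maxW := math.Inf(-1)
		for _, w := range weights[i] {
			maxW = math.Max(maxW, w)
		}
		exp := make([]float64, len(weights[i]))
		var z float64
		for j, w := range weights[i] {
			exp[j] = math.Exp(w - maxW)
			z += exp[j]
		}
		// Portfolio return for step i = sum_j(w_j * r_ij).
		for j := range exp {
			port[i] += (exp[j] / z) * returns[i][j]
		}
	}
	var mean float64
	for _, r := range port {
		mean += r
	}
	mean /= float64(len(port))
	var variance float64
	for _, r := range port {
		variance += (r - mean) * (r - mean)
	}
	std := math.Sqrt(variance / float64(len(port)))
	return -mean / std // negate so minimizing the loss maximizes Sharpe
}

func main() {
	w := [][]float64{{0, 0}, {0, 0}} // equal raw scores -> 0.5/0.5 weights
	r := [][]float64{{0.01, 0.03}, {0.02, 0.04}}
	fmt.Println(sharpeLoss(w, r))
}
```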
Types ¶
type BCELoss ¶ added in v1.8.0
BCELoss calculates binary cross-entropy loss between predictions and targets.
BCE(y, p) = -[y*log(p) + (1-y)*log(1-p)]
Predictions are clamped to [eps, 1-eps] for numerical stability.
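The formula, including the clamp, can be sketched on float64 slices (illustrative names and a hypothetical eps value; the package version operates on tensors):

```go
package main

import (
	"fmt"
	"math"
)

// bce computes mean binary cross-entropy. Predictions are clamped to
// [eps, 1-eps] before the logs, so p = 0 or p = 1 never produces -Inf.
func bce(targets, preds []float64) float64 {
	const eps = 1e-7 // assumed value; the package may use a different eps
	var sum float64
	for i, y := range targets {
		p := math.Min(math.Max(preds[i], eps), 1-eps)
		sum += -(y*math.Log(p) + (1-y)*math.Log(1-p))
	}
	return sum / float64(len(targets))
}

func main() {
	// Confident, correct predictions give a small loss.
	fmt.Println(bce([]float64{1, 0}, []float64{0.9, 0.1}))
}
```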
func NewBCELoss ¶ added in v1.8.0
NewBCELoss creates a new binary cross-entropy loss function.
func (*BCELoss[T]) Attributes ¶ added in v1.8.0
Attributes returns the attributes of the BCELoss function.
func (*BCELoss[T]) Backward ¶ added in v1.8.0
func (b *BCELoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], inputs ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)
Backward computes the gradients for BCELoss with respect to predictions. Gradient: -(y/p - (1-y)/(1-p)) / N, chained with upstream dOut.
func (*BCELoss[T]) Forward ¶ added in v1.8.0
func (b *BCELoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Forward computes the mean binary cross-entropy loss.
func (*BCELoss[T]) OpType ¶ added in v1.8.0
OpType returns the operation type of the BCELoss function.
func (*BCELoss[T]) OutputShape ¶ added in v1.8.0
OutputShape returns the output shape of the BCELoss function.
func (*BCELoss[T]) Parameters ¶ added in v1.8.0
Parameters returns the parameters of the BCELoss function.
type CorrLoss ¶ added in v0.2.1
CorrLoss computes -PearsonCorrelation(predictions, targets) as a differentiable scalar loss. Minimizing this loss maximizes the Pearson correlation between predictions and targets. Since Numerai targets are rank-normalized, Pearson closely approximates Spearman rank correlation.
Forward: loss = -sum(p_c * t_c) / (sqrt(sum(p_c^2) * sum(t_c^2)) + eps)
where p_c = p - mean(p), t_c = t - mean(t)
Backward: grad_i = -(t_c_i / denom - corr * p_c_i / sum_pp) * dOut
All tensor operations use the engine, keeping data on GPU when available. Only scalar intermediate values (means, sums) are read back to CPU.
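The Forward formula above can be checked with a slice-based sketch (illustrative only; the package version keeps the tensors on the engine and only reads scalars back):

```go
package main

import (
	"fmt"
	"math"
)

// corrLoss computes -PearsonCorrelation(preds, targets) on float64 slices:
// loss = -sum(p_c * t_c) / (sqrt(sum(p_c^2) * sum(t_c^2)) + eps).
func corrLoss(preds, targets []float64) float64 {
	const eps = 1e-8 // assumed value for the stabilizing eps
	n := float64(len(preds))
	var mp, mt float64
	for i := range preds {
		mp += preds[i]
		mt += targets[i]
	}
	mp /= n
	mt /= n
	var spt, spp, stt float64
	for i := range preds {
		pc, tc := preds[i]-mp, targets[i]-mt // centered p_c, t_c
		spt += pc * tc
		spp += pc * pc
		stt += tc * tc
	}
	return -spt / (math.Sqrt(spp*stt) + eps)
}

func main() {
	// Perfectly correlated inputs: the loss approaches -1.
	fmt.Println(corrLoss([]float64{1, 2, 3}, []float64{2, 4, 6}))
}
```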
func NewCorrLoss ¶ added in v0.2.1
func NewCorrLoss[T tensor.Numeric](engine compute.Engine[T], ops numeric.Arithmetic[T]) *CorrLoss[T]
NewCorrLoss creates a new correlation loss function.
func (*CorrLoss[T]) Attributes ¶ added in v0.2.1
Attributes returns nil (no configurable attributes).
func (*CorrLoss[T]) Backward ¶ added in v0.2.1
func (c *CorrLoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], inputs ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)
Backward computes the gradient of -PearsonCorrelation with respect to predictions. Returns [dPredictions, dTargets(zeros)].
func (*CorrLoss[T]) Forward ¶ added in v0.2.1
func (c *CorrLoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Forward computes -PearsonCorrelation(predictions, targets).
func (*CorrLoss[T]) OutputShape ¶ added in v0.2.1
OutputShape returns [1] (scalar loss).
func (*CorrLoss[T]) Parameters ¶ added in v0.2.1
Parameters returns nil (no trainable parameters).
type CrossEntropyLoss ¶
CrossEntropyLoss computes the cross-entropy loss.
func NewCrossEntropyLoss ¶
func NewCrossEntropyLoss[T tensor.Numeric](engine compute.Engine[T]) *CrossEntropyLoss[T]
NewCrossEntropyLoss creates a new CrossEntropyLoss layer.
func (*CrossEntropyLoss[T]) Attributes ¶ added in v0.2.1
func (cel *CrossEntropyLoss[T]) Attributes() map[string]interface{}
Attributes returns the attributes of the CrossEntropyLoss layer.
func (*CrossEntropyLoss[T]) Backward ¶
func (cel *CrossEntropyLoss[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], _ ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)
Backward computes the gradients for CrossEntropyLoss.
func (*CrossEntropyLoss[T]) Forward ¶
func (cel *CrossEntropyLoss[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Forward computes the cross-entropy loss. Inputs: predictions (logits as T), targets (labels as T that will be converted to int indices).
func (*CrossEntropyLoss[T]) OpType ¶ added in v0.2.1
func (cel *CrossEntropyLoss[T]) OpType() string
OpType returns the operation type of the CrossEntropyLoss layer.
func (*CrossEntropyLoss[T]) OutputShape ¶
func (cel *CrossEntropyLoss[T]) OutputShape() []int
OutputShape returns the output shape of the loss (a scalar).
func (*CrossEntropyLoss[T]) Parameters ¶
func (cel *CrossEntropyLoss[T]) Parameters() []*graph.Parameter[T]
Parameters returns an empty slice as CrossEntropyLoss has no trainable parameters.
type Loss ¶
type Loss[T tensor.Numeric] interface {
	// Forward computes the loss and its gradient.
	Forward(ctx context.Context, predictions, targets *tensor.TensorNumeric[T]) (T, *tensor.TensorNumeric[T], error)
}
Loss defines the interface for loss functions.
type MSE ¶
MSE calculates the mean squared error between predictions and targets.
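For reference, mean squared error and its usual gradient, 2*(p - t)/N, sketched on float64 slices (illustrative names; the gradient formula is the standard one and is an assumption about this implementation, not quoted from it):

```go
package main

import "fmt"

// mse returns mean((p - t)^2) over the batch.
func mse(preds, targets []float64) float64 {
	var sum float64
	for i := range preds {
		d := preds[i] - targets[i]
		sum += d * d
	}
	return sum / float64(len(preds))
}

// mseGrad returns dLoss/dPreds = 2*(p - t)/N, chained with upstream dOut.
func mseGrad(preds, targets []float64, dOut float64) []float64 {
	n := float64(len(preds))
	grad := make([]float64, len(preds))
	for i := range preds {
		grad[i] = dOut * 2 * (preds[i] - targets[i]) / n
	}
	return grad
}

func main() {
	fmt.Println(mse([]float64{1, 2}, []float64{0, 0}))
}
```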
func (*MSE[T]) Attributes ¶ added in v0.2.1
Attributes returns the attributes of the MSE loss function.
func (*MSE[T]) Backward ¶
func (m *MSE[T]) Backward(ctx context.Context, _ types.BackwardMode, dOut *tensor.TensorNumeric[T], inputs ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)
Backward computes the gradients for MSE with respect to inputs. Returns gradients in the order of inputs: [dPredictions, dTargets(nil)].
func (*MSE[T]) Forward ¶
func (m *MSE[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Forward computes the loss value.
func (*MSE[T]) OutputShape ¶ added in v0.2.1
OutputShape returns the output shape of the MSE loss function.
func (*MSE[T]) Parameters ¶ added in v0.2.1
Parameters returns the parameters of the MSE loss function.
type RoutingContrastive ¶ added in v1.29.0
RoutingContrastive computes an auxiliary contrastive loss over routing scores from SparseRoutedAttention. It encourages routing diversity (different heads attend to different document regions) and routing specificity (each head specializes on a subset of documents).
Input: routing scores tensor with shape [batch, numHeads, seqLen].
The loss is the mean pairwise cosine similarity between head routing distributions. Minimizing this pushes heads apart so they specialize on different sequence regions.
Loss = scale * mean(cosineSim(head_i, head_j)) for all i < j.
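The pairwise-cosine loss can be sketched for a single batch element (the function name and [numHeads][seqLen] slice layout are illustrative; the package version takes a [batch, numHeads, seqLen] tensor and averages over the batch as well):

```go
package main

import (
	"fmt"
	"math"
)

// routingContrastive: scale * mean(cosineSim(head_i, head_j)) over all
// head pairs i < j, for one batch element's routing scores.
func routingContrastive(scores [][]float64, scale float64) float64 {
	cos := func(a, b []float64) float64 {
		var dot, na, nb float64
		for i := range a {
			dot += a[i] * b[i]
			na += a[i] * a[i]
			nb += b[i] * b[i]
		}
		return dot / (math.Sqrt(na) * math.Sqrt(nb))
	}
	var sum float64
	var pairs int
	for i := 0; i < len(scores); i++ {
		for j := i + 1; j < len(scores); j++ {
			sum += cos(scores[i], scores[j])
			pairs++
		}
	}
	return scale * sum / float64(pairs)
}

func main() {
	// Orthogonal head distributions: similarity 0, so the loss is 0.
	fmt.Println(routingContrastive([][]float64{{1, 0}, {0, 1}}, 0.01))
	// Identical heads: similarity 1, so the loss is the full scale.
	fmt.Println(routingContrastive([][]float64{{1, 0}, {1, 0}}, 0.01))
}
```

Minimizing this value drives the pairwise similarities down, which is exactly the "push heads apart" behavior described above.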
func NewRoutingContrastive ¶ added in v1.29.0
func NewRoutingContrastive[T tensor.Numeric](engine compute.Engine[T], ops numeric.Arithmetic[T], scale float64) *RoutingContrastive[T]
NewRoutingContrastive creates a new contrastive routing loss. scale controls the loss magnitude (default recommendation: 0.01).
func (*RoutingContrastive[T]) Attributes ¶ added in v1.29.0
func (rc *RoutingContrastive[T]) Attributes() map[string]interface{}
Attributes returns the layer configuration.
func (*RoutingContrastive[T]) Backward ¶ added in v1.29.0
func (rc *RoutingContrastive[T]) Backward(_ context.Context, _ types.BackwardMode, _ *tensor.TensorNumeric[T], inputs ...*tensor.TensorNumeric[T]) ([]*tensor.TensorNumeric[T], error)
Backward computes gradients of the contrastive routing loss with respect to the routing scores input.
d(loss)/d(scores[b,h,s]) = scale / (numPairs*batch) *
sum over j!=h of d(cosineSim(h,j))/d(scores[b,h,s])
where d(cos(a,b))/d(a_i) = (b_i / (|a|*|b|)) - cos(a,b) * (a_i / |a|^2)
func (*RoutingContrastive[T]) Forward ¶ added in v1.29.0
func (rc *RoutingContrastive[T]) Forward(ctx context.Context, inputs ...*tensor.TensorNumeric[T]) (*tensor.TensorNumeric[T], error)
Forward computes the contrastive routing loss.
Inputs: exactly one tensor of shape [batch, numHeads, seqLen]. Returns a scalar loss tensor of shape [1].
func (*RoutingContrastive[T]) OpType ¶ added in v1.29.0
func (rc *RoutingContrastive[T]) OpType() string
OpType returns the operation type identifier.
func (*RoutingContrastive[T]) OutputShape ¶ added in v1.29.0
func (rc *RoutingContrastive[T]) OutputShape() []int
OutputShape returns the output shape of the loss (scalar).
func (*RoutingContrastive[T]) Parameters ¶ added in v1.29.0
func (rc *RoutingContrastive[T]) Parameters() []*graph.Parameter[T]
Parameters returns nil (no trainable parameters).