Documentation
Overview ¶
Experimental — this package is not yet wired into the main framework.
Package crossasset provides a cross-attention model for multi-source feature processing. Each source attends to features of all other sources via scaled dot-product multi-head attention, enabling the model to learn inter-source dependencies. This is useful for scenarios where multiple correlated data sources (e.g., different financial instruments or sensor streams) must be jointly analyzed.
The model architecture applies cross-attention layers where each source computes queries from its own features and keys/values from all sources' features. Layer normalization and residual connections stabilize training.
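The per-source attention step described above can be sketched in plain Go. This single-head, identity-projection version is illustrative only: the real model learns query/key/value projections and stacks NLayers of these blocks with layer normalization and residual connections.

```go
package main

import (
	"fmt"
	"math"
)

// crossAttend is an illustrative single-head version of the layer described
// above: source i derives its query from its own features, while keys and
// values come from every source. Projections are omitted (identity) for
// brevity; the real model learns them.
func crossAttend(features [][]float64) [][]float64 {
	n := len(features)
	d := len(features[0])
	scale := 1.0 / math.Sqrt(float64(d))

	out := make([][]float64, n)
	for i := 0; i < n; i++ {
		// Scaled dot-product scores of source i's query against all keys.
		scores := make([]float64, n)
		for j := 0; j < n; j++ {
			var dot float64
			for k := 0; k < d; k++ {
				dot += features[i][k] * features[j][k]
			}
			scores[j] = dot * scale
		}
		// Softmax over the attended (j) dimension, shifted for stability.
		max := scores[0]
		for _, s := range scores[1:] {
			if s > max {
				max = s
			}
		}
		var sum float64
		for j := range scores {
			scores[j] = math.Exp(scores[j] - max)
			sum += scores[j]
		}
		// Output is the attention-weighted sum of all value vectors.
		out[i] = make([]float64, d)
		for j := 0; j < n; j++ {
			w := scores[j] / sum
			for k := 0; k < d; k++ {
				out[i][k] += w * features[j][k]
			}
		}
	}
	return out
}

func main() {
	feats := [][]float64{{1, 0}, {0, 1}, {1, 1}}
	out := crossAttend(feats)
	fmt.Println(len(out), len(out[0])) // 3 2 (same shape as the input here)
}
```

In the full model the output dimension is d_model rather than features_per_source, because the value projection maps into the model space before the residual connection.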
Index ¶
- type Config
- type Direction
- type Model
- func (m *Model) AttentionWeights(features [][]float64) ([][]float64, error)
- func (m *Model) Forward(features [][]float64) ([][]float64, error)
- func (m *Model) Predict(features [][]float64) ([]int, []float64, error)
- func (m *Model) Save(path string) error
- func (m *Model) Train(data [][][]float64, labels [][]int, tc TrainConfig) error
- func (m *Model) TrainGPU(data [][][]float64, labels [][]int, tc TrainConfig, _ compute.Engine[float32]) (*TrainResult, error)
- type TrainConfig
- type TrainResult
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	NSources          int
	FeaturesPerSource int
	DModel            int
	NHeads            int
	NLayers           int
	DropoutRate       float64
	LearningRate      float64
}
Config holds the configuration for a cross-asset attention model.
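As an illustration, a Config might be populated as below. The field values are placeholders, not documented defaults, and for multi-head attention DModel is typically chosen to be divisible by NHeads.

```go
// Illustrative values only; the package does not document defaults.
cfg := crossasset.Config{
	NSources:          4, // e.g. four correlated instruments
	FeaturesPerSource: 16,
	DModel:            64, // typically divisible by NHeads
	NHeads:            4,
	NLayers:           2,
	DropoutRate:       0.1,
	LearningRate:      1e-3,
}
```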
type Model ¶
type Model struct {
	// contains filtered or unexported fields
}
Model implements a cross-attention model for multi-source features.
func LoadModel ¶ added in v1.44.0
LoadModel loads a previously saved crossasset model from the given path.
func (*Model) AttentionWeights ¶
func (m *Model) AttentionWeights(features [][]float64) ([][]float64, error)
AttentionWeights computes the attention weight matrix showing how much each source attends to each other source. Returns [n_sources][n_sources] where result[i][j] is how much source i attends to source j. Weights sum to 1 across the attended (j) dimension.
func (*Model) Forward ¶
func (m *Model) Forward(features [][]float64) ([][]float64, error)
Forward processes features through the cross-attention model. features shape: [n_sources][features_per_source]. Returns: [n_sources][d_model].
func (*Model) Predict ¶
func (m *Model) Predict(features [][]float64) ([]int, []float64, error)
Predict returns per-source direction and confidence. features shape: [n_sources][features_per_source]. Returns: directions [n_sources], confidences [n_sources].
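A common decision rule matching this return shape is: take the class with the highest probability as the direction and report that probability as the confidence. predictOne below is a hypothetical helper sketching that rule, not a package function.

```go
package main

import "fmt"

// predictOne sketches a per-source decision rule: the direction is the class
// with the highest probability and the confidence is that probability.
// Codes follow the label convention used by Train: 0=Long, 1=Short, 2=Flat.
func predictOne(probs []float64) (dir int, conf float64) {
	for i, p := range probs {
		if p > conf {
			dir, conf = i, p
		}
	}
	return dir, conf
}

func main() {
	dir, conf := predictOne([]float64{0.2, 0.7, 0.1})
	fmt.Println(dir, conf) // 1 0.7 (Short with 70% confidence)
}
```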
func (*Model) Train ¶
func (m *Model) Train(data [][][]float64, labels [][]int, tc TrainConfig) error
Train trains the model on the given data. data shape: [n_samples][n_sources][features_per_source]. labels shape: [n_samples][n_sources] with values in {0=Long, 1=Short, 2=Flat}.
func (*Model) TrainGPU ¶ added in v1.39.0
func (m *Model) TrainGPU(data [][][]float64, labels [][]int, tc TrainConfig, _ compute.Engine[float32]) (*TrainResult, error)
TrainGPU trains the model using the CPU full-backprop path with AdamW.
The ztensor GPU engine has stability issues on Grace Hopper unified memory (CUDA launch timeouts, illegal memory access from arena tensor recycling). Until those are resolved, TrainGPU delegates to the proven CPU Train() path which uses AdamW and full backpropagation through all layers.
The engine parameter is accepted for API compatibility but not used. Training accuracy matches PyTorch (~75% on COIN walk-forward validation).
type TrainConfig ¶
TrainConfig holds training hyperparameters.
type TrainResult ¶ added in v1.39.0
type TrainResult struct {
	Losses        []float64 // per-epoch average loss
	FinalAccuracy float64   // fraction of correct predictions on the last epoch
}
TrainResult holds outcomes from GPU training.