model

package
v0.0.0-...-4326643

Warning: This package is not in the latest version of its module.
Published: Apr 30, 2026 License: MIT Imports: 19 Imported by: 0

Documentation

Rendered for darwin/amd64

Overview

Package model provides weight loading and model serialization for gorch.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CausalLMLoss

func CausalLMLoss(model *nn.GPT, tokens []int) *g.Tensor

CausalLMLoss runs the model forward on tokens[:n-1], where n = len(tokens), and returns the next-token cross-entropy loss against tokens[1:]. The returned tensor is a scalar with autograd attached, ready for Backward() followed by an optimiser step.

This is a thin convenience for the standard LM training pattern where the input and target sequences differ by a one-position shift. It does not handle batching across sequences (the rest of gorch operates one sequence at a time), so for multi-sequence training, sum or average losses across sequences in the caller.

All parameters created by nn.NewGPT already have requires_grad=true, and LoadGPT2 preserves that, so fine-tuning a pretrained model needs nothing more than a loop that computes this loss, calls Backward(), and then calls optim.Step.
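The one-position shift and the loss it feeds can be illustrated without gorch. The sketch below is self-contained and does not use the package's API: shiftForCausalLM and meanCrossEntropy are hypothetical names, plain slices stand in for tensors, and averaging over positions is an assumption (the documentation does not say whether CausalLMLoss sums or averages).

```go
package main

import (
	"fmt"
	"math"
)

// shiftForCausalLM splits a token sequence into model inputs and next-token
// targets that differ by a one-position shift.
func shiftForCausalLM(tokens []int) (inputs, targets []int) {
	if len(tokens) < 2 {
		return nil, nil // need at least one (input, target) pair
	}
	return tokens[:len(tokens)-1], tokens[1:]
}

// meanCrossEntropy averages -log softmax(logits[i])[targets[i]] over positions.
// Averaging (rather than summing) is an assumption of this sketch.
func meanCrossEntropy(logits [][]float64, targets []int) float64 {
	total := 0.0
	for i, row := range logits {
		// Log-sum-exp with max subtraction for numerical stability.
		maxV := row[0]
		for _, v := range row {
			if v > maxV {
				maxV = v
			}
		}
		sum := 0.0
		for _, v := range row {
			sum += math.Exp(v - maxV)
		}
		logZ := maxV + math.Log(sum)
		total += logZ - row[targets[i]]
	}
	return total / float64(len(logits))
}

func main() {
	inputs, targets := shiftForCausalLM([]int{0, 1, 2, 3})
	fmt.Println(inputs, targets) // [0 1 2] [1 2 3]

	// Uniform logits over a 4-token vocabulary: the loss equals ln(4).
	logits := [][]float64{{0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 0}}
	fmt.Printf("%.4f\n", meanCrossEntropy(logits, targets)) // 1.3863
}
```

For multi-sequence training, the caller would compute one such loss per sequence and sum or average them, as the paragraph above describes.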

func DownloadGPT2

func DownloadGPT2(modelName, dir string) error

DownloadGPT2 downloads model files for a HuggingFace GPT-2 model.

func ExportLinearToONNX

func ExportLinearToONNX(l *nn.Linear, batchSize int, path string) error

ExportLinearToONNX is a convenience wrapper for a single Linear layer (for tests and minimal models).

func ExportSequentialToONNX

func ExportSequentialToONNX(seq *nn.Sequential, inputShape []int, path string) error

ExportSequentialToONNX serialises a Sequential of supported layers to an ONNX model file at path. inputShape is the static shape of the input tensor — typically (batch, features) for MLPs or (batch, channels, H, W) for CNNs. A symbolic batch dimension is emitted as -1 so downstream tools handle dynamic batch.

Supported layers:

  • nn.Linear → Gemm (with transB=1)
  • nn.ReLUModule → Relu
  • nn.SigmoidModule → Sigmoid
  • nn.TanhModule → Tanh
  • nn.Conv2d → Conv
  • nn.MaxPool2d → MaxPool
  • nn.Flatten → Flatten (axis=1)

Returns an error if a layer type is unsupported.
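As an illustration of the first mapping, ONNX Gemm with transB=1 computes Y = X·Wᵀ + b. The sketch below shows that arithmetic on plain slices. It assumes the weight is laid out as (out, in), which is what transB=1 implies but is not stated in this documentation; gemmTransB is a hypothetical helper, not part of gorch.

```go
package main

import "fmt"

// gemmTransB computes Y = X * W^T + b, the operation ONNX Gemm performs
// when transB=1. x is (batch, in), w is (out, in), b has length out.
func gemmTransB(x, w [][]float32, b []float32) [][]float32 {
	y := make([][]float32, len(x))
	for i, row := range x {
		y[i] = make([]float32, len(w))
		for o, wRow := range w {
			acc := b[o]
			for k, v := range row {
				acc += v * wRow[k]
			}
			y[i][o] = acc
		}
	}
	return y
}

func main() {
	x := [][]float32{{1, 2}}                 // (batch=1, in=2)
	w := [][]float32{{3, 4}, {5, 6}, {7, 8}} // (out=3, in=2)
	b := []float32{0.5, 0.5, 0.5}
	fmt.Println(gemmTransB(x, w, b)) // [[11.5 17.5 23.5]]
}
```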

func Generate

func Generate(model *nn.GPT, tokenIDs []int, maxNewTokens int) []int

Generate produces token IDs autoregressively from a GPT model, starting from tokenIDs and generating up to maxNewTokens new tokens. It uses greedy decoding (argmax at each step).
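Greedy decoding can be sketched independently of the model. In the example below, nextLogits is a hypothetical stand-in for a forward pass that returns logits for the next position, and toyModel is a made-up five-token "model". Returning the prompt followed by the new tokens is an assumption of the sketch, not a documented property of Generate.

```go
package main

import "fmt"

// argmax returns the index of the largest logit (the first one on ties).
func argmax(logits []float32) int {
	best := 0
	for i, v := range logits {
		if v > logits[best] {
			best = i
		}
	}
	return best
}

// greedyDecode appends maxNewTokens tokens to a copy of prompt, picking the
// argmax of the next-position logits at every step.
func greedyDecode(nextLogits func([]int) []float32, prompt []int, maxNewTokens int) []int {
	out := append([]int(nil), prompt...)
	for i := 0; i < maxNewTokens; i++ {
		out = append(out, argmax(nextLogits(out)))
	}
	return out
}

// toyModel is a hypothetical 5-token model that always prefers
// (last token + 1) mod 5.
func toyModel(ids []int) []float32 {
	logits := make([]float32, 5)
	logits[(ids[len(ids)-1]+1)%5] = 1
	return logits
}

func main() {
	fmt.Println(greedyDecode(toyModel, []int{3}, 4)) // [3 4 0 1 2]
}
```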

func GenerateText

func GenerateText(model *nn.GPT, tok *BPETokenizer, prompt string, cfg GenerateConfig) string

GenerateText produces text from a prompt using the given model and config.

func GenerateWithConfig

func GenerateWithConfig(model *nn.GPT, tokenIDs []int, cfg GenerateConfig) []int

GenerateWithConfig generates tokens with temperature, top-k, and top-p sampling.
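The three sampling controls are commonly combined as follows: scale logits by the temperature, keep the k most likely tokens, keep the smallest prefix whose cumulative probability reaches p, renormalise, then sample. The sketch below follows that common recipe. The order of the filters and the names filteredProbs and sampleIndex are assumptions, and gorch's implementation may differ in detail. It requires temperature > 0; the greedy case (temperature 0) would be handled separately.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// filteredProbs applies temperature, then top-k, then top-p filtering and
// returns a renormalised distribution. topK == 0 or topP == 0 disables that
// filter. temperature must be > 0.
func filteredProbs(logits []float64, temperature float64, topK int, topP float64) []float64 {
	n := len(logits)
	maxV := logits[0]
	for _, v := range logits {
		if v > maxV {
			maxV = v
		}
	}
	probs := make([]float64, n)
	sum := 0.0
	for i, v := range logits {
		probs[i] = math.Exp((v - maxV) / temperature)
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}

	// Token indices sorted by descending probability.
	order := make([]int, n)
	for i := range order {
		order[i] = i
	}
	sort.SliceStable(order, func(a, b int) bool { return probs[order[a]] > probs[order[b]] })

	keep := n
	if topK > 0 && topK < keep {
		keep = topK
	}
	if topP > 0 {
		cum := 0.0
		for rank := 0; rank < keep; rank++ {
			cum += probs[order[rank]]
			if cum >= topP {
				keep = rank + 1 // smallest prefix reaching the threshold
				break
			}
		}
	}

	out := make([]float64, n)
	kept := 0.0
	for _, idx := range order[:keep] {
		out[idx] = probs[idx]
		kept += probs[idx]
	}
	for i := range out {
		out[i] /= kept
	}
	return out
}

// sampleIndex draws an index by inverse CDF; u must be uniform in [0, 1).
func sampleIndex(probs []float64, u float64) int {
	cum, last := 0.0, 0
	for i, p := range probs {
		if p == 0 {
			continue
		}
		cum += p
		last = i
		if u < cum {
			return i
		}
	}
	return last // guards against rounding leaving cum slightly below 1
}

func main() {
	probs := filteredProbs([]float64{2.0, 1.0, 0.5, -1.0}, 0.8, 3, 0.9)
	fmt.Println(probs)
	fmt.Println(sampleIndex(probs, rand.Float64()))
}
```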

func LoadGPT2

func LoadGPT2(dir string, cfg GPT2Config) (*nn.GPT, error)

LoadGPT2 loads a pretrained GPT-2 model from safetensors. Handles the GPT-2 Conv1D convention (transposed weights) and fused QKV.
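The two conventions mentioned here can be sketched on flat row-major slices. HuggingFace GPT-2 checkpoints store Conv1D weights as (in, out), so a layer expecting (out, in) needs a transpose, and the attention projection is a single fused (dim, 3*dim) matrix whose column blocks are Q, K, and V. The helpers transpose and splitQKV below are hypothetical illustrations and do not show how LoadGPT2 is written.

```go
package main

import "fmt"

// transpose converts a row-major (rows, cols) matrix into a row-major
// (cols, rows) matrix.
func transpose(data []float32, rows, cols int) []float32 {
	out := make([]float32, len(data))
	for r := 0; r < rows; r++ {
		for c := 0; c < cols; c++ {
			out[c*rows+r] = data[r*cols+c]
		}
	}
	return out
}

// splitQKV splits a fused row-major (dim, 3*dim) weight into three (dim, dim)
// blocks: columns [0, dim) are Q, [dim, 2*dim) are K, [2*dim, 3*dim) are V.
func splitQKV(fused []float32, dim int) (q, k, v []float32) {
	for r := 0; r < dim; r++ {
		row := fused[r*3*dim : (r+1)*3*dim]
		q = append(q, row[:dim]...)
		k = append(k, row[dim:2*dim]...)
		v = append(v, row[2*dim:]...)
	}
	return q, k, v
}

func main() {
	fmt.Println(transpose([]float32{1, 2, 3, 4, 5, 6}, 2, 3)) // [1 4 2 5 3 6]

	fused := []float32{
		1, 2, 3, 4, 5, 6,
		7, 8, 9, 10, 11, 12,
	}
	q, k, v := splitQKV(fused, 2)
	fmt.Println(q, k, v) // [1 2 7 8] [3 4 9 10] [5 6 11 12]
}
```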

func LoadGPT2Verbose

func LoadGPT2Verbose(dir string, cfg GPT2Config) (*nn.GPT, error)

LoadGPT2Verbose is like LoadGPT2 but prints a one-line summary to log after a successful load. Existing tools that rely on the old log line can switch to this; new code should prefer LoadGPT2 (silent).

func LoadModelWeights

func LoadModelWeights(path string, params []*g.Tensor, nameMap map[string]int) error

LoadModelWeights loads a safetensors file and copies its weights into params. nameMap maps safetensors tensor names to indices into params.

func SaveModelWeights

func SaveModelWeights(path string, params []*g.Tensor, nameMap map[int]string) error

SaveModelWeights saves model parameters to a safetensors file. nameMap maps indices into params to the tensor names written to the file.

func SaveSafetensors

func SaveSafetensors(path string, tensors map[string]*g.Tensor) error

SaveSafetensors saves tensors to a .safetensors file.

Types

type BPETokenizer

type BPETokenizer struct {
	Encoder    map[string]int    // token string → ID
	Decoder    map[int]string    // ID → token string
	BPERanks   map[[2]string]int // merge pair → priority rank
	VocabSize  int
	ByteEncode map[byte]rune // byte → unicode char mapping
	ByteDecode map[rune]byte // unicode char → byte mapping
}

BPETokenizer implements byte-pair encoding tokenization. Compatible with GPT-2/GPT-NeoX style vocab.json + merges.txt.
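The role of BPERanks can be shown with a simplified merge loop: repeatedly find the adjacent pair with the lowest rank and fuse it, until no ranked pair remains. This sketch merges one occurrence at a time and skips the byte-to-unicode mapping and pre-tokenisation that a GPT-2 tokenizer also performs; bpeMerge is a hypothetical name, not a method of BPETokenizer.

```go
package main

import "fmt"

// bpeMerge repeatedly merges the adjacent pair with the lowest rank
// (leftmost on ties) until no adjacent pair appears in ranks.
func bpeMerge(symbols []string, ranks map[[2]string]int) []string {
	word := append([]string(nil), symbols...)
	for len(word) > 1 {
		best, bestRank := -1, 0
		for i := 0; i+1 < len(word); i++ {
			r, ok := ranks[[2]string{word[i], word[i+1]}]
			if ok && (best < 0 || r < bestRank) {
				best, bestRank = i, r
			}
		}
		if best < 0 {
			break // no mergeable pair left
		}
		merged := word[best] + word[best+1]
		// The inner append copies the tail before the outer append overwrites it.
		word = append(word[:best], append([]string{merged}, word[best+2:]...)...)
	}
	return word
}

func main() {
	ranks := map[[2]string]int{
		{"l", "o"}:  0,
		{"lo", "w"}: 1,
		{"e", "r"}:  2,
	}
	fmt.Println(bpeMerge([]string{"l", "o", "w", "e", "r"}, ranks)) // [low er]
}
```

Each resulting symbol would then be looked up in a table like Encoder to obtain its token ID.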

func LoadTokenizer

func LoadTokenizer(vocabPath, mergesPath string) (*BPETokenizer, error)

LoadTokenizer loads a BPE tokenizer from vocab.json and merges.txt.

func (*BPETokenizer) Decode

func (t *BPETokenizer) Decode(ids []int) string

Decode converts token IDs back to text.

func (*BPETokenizer) Encode

func (t *BPETokenizer) Encode(text string) []int

Encode converts text to token IDs.

func (*BPETokenizer) EncodeBatch

func (t *BPETokenizer) EncodeBatch(texts []string) [][]int

EncodeBatch encodes texts in parallel using GOMAXPROCS workers. Output order matches input order. The tokenizer's read-only state (Encoder, ByteEncode, BPE merges) makes per-text Encode safe for concurrent use.

For inputs of more than a few short strings this is a noticeable win over a sequential loop in caller code, especially when the caller is embedding many small chunks (RAG, retrieval, classification).
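One way to get order-preserving parallel encoding is a fixed worker pool in which each worker writes its result into the slot matching the input index. The sketch below shows that pattern. It is not the package's implementation, and encodeBatch and byteEncode are hypothetical stand-ins for EncodeBatch and Encode.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// encodeBatch runs encode over texts using GOMAXPROCS workers. Each job is an
// index, and its result is stored at out[index], so output order matches
// input order without a sorting step.
func encodeBatch(texts []string, encode func(string) []int) [][]int {
	out := make([][]int, len(texts))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < runtime.GOMAXPROCS(0); w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				out[i] = encode(texts[i]) // distinct slot per job: no data race
			}
		}()
	}
	for i := range texts {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

// byteEncode is a hypothetical encoder that emits one ID per byte. It only
// reads its input, so it is safe to call concurrently.
func byteEncode(s string) []int {
	ids := make([]int, len(s))
	for i := 0; i < len(s); i++ {
		ids[i] = int(s[i])
	}
	return ids
}

func main() {
	fmt.Println(encodeBatch([]string{"a", "bc", ""}, byteEncode)) // [[97] [98 99] []]
}
```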

type GPT2Config

type GPT2Config struct {
	VocabSize int
	Dim       int
	NumHeads  int
	NumLayers int
	MaxSeq    int
}

GPT2Config holds GPT-2 architecture parameters.

func GPT2Small

func GPT2Small() GPT2Config

GPT2Small returns the config for openai-community/gpt2 (124M params).

func TinyStories1M

func TinyStories1M() GPT2Config

TinyStories1M returns the config for roneneldan/TinyStories-1M.

type GenerateConfig

type GenerateConfig struct {
	MaxNewTokens int     // maximum tokens to generate
	Temperature  float32 // 0 = greedy, >0 = sample with temperature
	TopK         int     // 0 = disabled, >0 = sample from top-K
	TopP         float32 // 0 = disabled, >0 = nucleus sampling threshold
	StopToken    int     // -1 = disabled, otherwise stop at this token
	UseKVCache   bool    // true = incremental decoding via KV cache
}

GenerateConfig controls text generation behavior.

func DefaultGenerateConfig

func DefaultGenerateConfig() GenerateConfig

DefaultGenerateConfig returns sensible defaults for text generation.

func GreedyConfig

func GreedyConfig(maxTokens int) GenerateConfig

GreedyConfig returns config for deterministic greedy decoding.

type KVCache

type KVCache struct {
	Keys    [][][]float32 // [layer][head] → flat (seqSoFar * headDim)
	Values  [][][]float32 // [layer][head] → flat (seqSoFar * headDim)
	Layers  int
	Heads   int
	HeadDim int
	SeqLen  int // number of tokens cached so far
}

KVCache stores precomputed key-value pairs for efficient autoregressive generation. This avoids recomputing attention for all previous tokens.

func NewKVCache

func NewKVCache(numLayers, numHeads, headDim int) *KVCache

NewKVCache creates an empty KV cache for a model.

func (*KVCache) Append

func (kv *KVCache) Append(layer, head int, key, value []float32)

Append adds new key-value vectors for one token to the cache.

func (*KVCache) GetKeys

func (kv *KVCache) GetKeys(layer, head int) []float32

GetKeys returns all cached keys for a layer/head as (seqLen, headDim).

func (*KVCache) GetValues

func (kv *KVCache) GetValues(layer, head int) []float32

GetValues returns all cached values for a layer/head.

func (*KVCache) Len

func (kv *KVCache) Len() int

Len returns the number of tokens cached.

func (*KVCache) Reset

func (kv *KVCache) Reset()

Reset clears the cache.
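The flat (seqSoFar * headDim) layout means that the vector for token t of one (layer, head) pair occupies the range [t*headDim, (t+1)*headDim). The sketch below models a single layer/head slot with its own small type. flatCache and its methods are hypothetical and only illustrate the indexing, not KVCache itself.

```go
package main

import "fmt"

// flatCache holds the keys of one (layer, head) pair back to back, so token t
// occupies keys[t*headDim : (t+1)*headDim].
type flatCache struct {
	keys    []float32
	headDim int
}

// add appends the key vector of one new token.
func (c *flatCache) add(key []float32) { c.keys = append(c.keys, key...) }

// tokens reports how many tokens are cached.
func (c *flatCache) tokens() int { return len(c.keys) / c.headDim }

// keyAt returns the key vector of token t as a view into the flat slice.
func (c *flatCache) keyAt(t int) []float32 {
	return c.keys[t*c.headDim : (t+1)*c.headDim]
}

// scores computes the dot product of query q with every cached key. With a
// cache, each decoding step does this for the newest query only instead of
// recomputing keys for all earlier tokens.
func (c *flatCache) scores(q []float32) []float32 {
	out := make([]float32, c.tokens())
	for t := range out {
		for d, kv := range c.keyAt(t) {
			out[t] += q[d] * kv
		}
	}
	return out
}

func main() {
	c := &flatCache{headDim: 2}
	c.add([]float32{1, 2})
	c.add([]float32{3, 4})
	fmt.Println(c.tokens(), c.keyAt(1), c.scores([]float32{1, 1})) // 2 [3 4] [3 7]
}
```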

type ONNXFile

type ONNXFile struct {
	Tensors  map[string]*g.Tensor // initializer name → tensor
	Names    []string             // initializer names in file order
	Nodes    []ONNXNodeInfo       // graph node summary (for inspection)
	IRVer    int64
	Producer string
}

ONNXFile is a partial parse of an ONNX model. We only decode the pieces gorch can act on today — initializer tensors keyed by name — so users can load weights from any ONNX producer without implementing the full graph spec. Op nodes are recorded as a flat list for inspection but not executed.

func LoadONNX

func LoadONNX(path string) (*ONNXFile, error)

LoadONNX reads an ONNX file and returns its initializer tensors. Supports float32 (raw_data or float_data) and int64 (raw_data), which covers files written by this package's ONNX exporters. Any other dtype causes an error to be returned, so silent dtype mismatches don't propagate downstream.
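For the raw_data case, ONNX stores tensor elements as fixed-width little-endian bytes, so decoding float32 values is a matter of reading four bytes at a time. The helper decodeFloat32LE below is a hypothetical sketch of that step and does not parse the surrounding protobuf message.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// decodeFloat32LE interprets raw as consecutive little-endian float32 values.
// It rejects inputs whose length is not a multiple of 4 instead of silently
// truncating them.
func decodeFloat32LE(raw []byte) ([]float32, error) {
	if len(raw)%4 != 0 {
		return nil, fmt.Errorf("raw_data length %d is not a multiple of 4", len(raw))
	}
	out := make([]float32, len(raw)/4)
	for i := range out {
		out[i] = math.Float32frombits(binary.LittleEndian.Uint32(raw[i*4:]))
	}
	return out, nil
}

func main() {
	// 0x3F800000 is 1.0 and 0x40000000 is 2.0, stored least significant byte first.
	vals, err := decodeFloat32LE([]byte{0x00, 0x00, 0x80, 0x3F, 0x00, 0x00, 0x00, 0x40})
	fmt.Println(vals, err) // [1 2] <nil>
}
```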

type ONNXNodeInfo

type ONNXNodeInfo struct {
	OpType string
	Name   string
	Inputs []string
	Output []string
}

ONNXNodeInfo is a lightweight summary of a graph node for inspection. We don't reconstruct execution from these.

type SafetensorsFile

type SafetensorsFile struct {
	Tensors map[string]*g.Tensor
	Names   []string // ordered tensor names
}

SafetensorsFile represents a loaded safetensors file.

func LoadSafetensors

func LoadSafetensors(path string) (*SafetensorsFile, error)

LoadSafetensors loads a .safetensors file and returns all tensors.

Safetensors format:

  • 8 bytes: little-endian uint64 header length
  • N bytes: JSON header mapping tensor name → {dtype, shape, data_offsets}
  • Remaining: raw tensor data
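The layout above can be read with the standard library alone. The sketch below parses the length prefix and JSON header from an in-memory blob. tensorInfo, readHeader, parseBlob, and buildBlob are hypothetical names for illustration, and a real loader would also bound the header length before allocating.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/json"
	"fmt"
	"io"
)

// tensorInfo matches one tensor entry of the JSON header.
type tensorInfo struct {
	DType   string `json:"dtype"`
	Shape   []int  `json:"shape"`
	Offsets [2]int `json:"data_offsets"`
}

// readHeader reads the 8-byte little-endian length and the JSON header,
// leaving r positioned at the start of the raw tensor data.
func readHeader(r io.Reader) (map[string]tensorInfo, error) {
	var n uint64
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return nil, err
	}
	buf := make([]byte, n) // a production loader should cap n first
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	raw := map[string]json.RawMessage{}
	if err := json.Unmarshal(buf, &raw); err != nil {
		return nil, err
	}
	out := map[string]tensorInfo{}
	for name, msg := range raw {
		if name == "__metadata__" {
			continue // optional free-form metadata, not a tensor
		}
		var ti tensorInfo
		if err := json.Unmarshal(msg, &ti); err != nil {
			return nil, err
		}
		out[name] = ti
	}
	return out, nil
}

// parseBlob is a convenience wrapper for in-memory data.
func parseBlob(b []byte) (map[string]tensorInfo, error) {
	return readHeader(bytes.NewReader(b))
}

// buildBlob assembles a tiny safetensors blob holding one F32 tensor "w".
func buildBlob() []byte {
	header := []byte(`{"w":{"dtype":"F32","shape":[2],"data_offsets":[0,8]}}`)
	var b bytes.Buffer
	binary.Write(&b, binary.LittleEndian, uint64(len(header)))
	b.Write(header)
	b.Write(make([]byte, 8)) // two zero-valued float32 elements
	return b.Bytes()
}

func main() {
	h, err := parseBlob(buildBlob())
	fmt.Println(h["w"].DType, h["w"].Shape, h["w"].Offsets, err) // F32 [2] [0 8] <nil>
}
```

The data_offsets pair is relative to the start of the raw data region, which begins immediately after the header.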

Supports F32, F16 (converted to F32), and BF16 (converted to F32).
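Both conversions are bit manipulations. BF16 is the upper 16 bits of an F32, so widening is a left shift; F16 has a 5-bit exponent with bias 15 that has to be re-biased to 127, with zero, subnormal, and Inf/NaN handled separately. The functions bf16ToF32 and f16ToF32 below are a hypothetical sketch of the standard conversions, not the package's own code.

```go
package main

import (
	"fmt"
	"math"
)

// bf16ToF32 widens a bfloat16 bit pattern: it is the top half of a float32.
func bf16ToF32(b uint16) float32 {
	return math.Float32frombits(uint32(b) << 16)
}

// f16ToF32 converts an IEEE 754 half-precision bit pattern to float32.
func f16ToF32(h uint16) float32 {
	sign := uint32(h>>15) << 31
	exp := uint32(h>>10) & 0x1F
	frac := uint32(h) & 0x3FF
	switch {
	case exp == 0 && frac == 0:
		return math.Float32frombits(sign) // signed zero
	case exp == 0:
		// Subnormal half: value = frac * 2^-24, exactly representable in float32.
		v := float32(frac) * (1.0 / (1 << 24))
		if sign != 0 {
			v = -v
		}
		return v
	case exp == 0x1F:
		return math.Float32frombits(sign | 0x7F800000 | frac<<13) // Inf or NaN
	default:
		// Re-bias the exponent from 15 to 127 and widen the mantissa.
		return math.Float32frombits(sign | (exp+112)<<23 | frac<<13)
	}
}

func main() {
	fmt.Println(bf16ToF32(0x3F80), bf16ToF32(0xC000)) // 1 -2
	fmt.Println(f16ToF32(0x3C00), f16ToF32(0xC000))   // 1 -2
}
```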

Streams tensor data: only the JSON header and one tensor's raw bytes are alive at any time, plus the running set of decoded F32 tensors. For a 622 MB file this drops peak transient RSS from ~1.24 GB (raw bytes + decoded floats both alive) to roughly the size of the largest single tensor + decoded total. See issue #10.

type SafetensorsHeader

type SafetensorsHeader struct {
	DType   string `json:"dtype"`
	Shape   []int  `json:"shape"`
	Offsets [2]int `json:"data_offsets"`
}

SafetensorsHeader represents the metadata for one tensor in a safetensors file.

type SimpleTokenizer

type SimpleTokenizer struct {
	CharToID map[byte]int
	IDToChar map[int]byte
	VocabSz  int
}

SimpleTokenizer is a minimal character-level tokenizer for testing. Maps each unique byte to a token ID.

func NewSimpleTokenizer

func NewSimpleTokenizer(text string) *SimpleTokenizer

NewSimpleTokenizer creates a character-level tokenizer from a text corpus.

func (*SimpleTokenizer) Decode

func (t *SimpleTokenizer) Decode(ids []int) string

func (*SimpleTokenizer) Encode

func (t *SimpleTokenizer) Encode(text string) []int

func (*SimpleTokenizer) VocabSize

func (t *SimpleTokenizer) VocabSize() int
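A byte-level tokenizer of this kind fits in a few lines. The sketch below is a hypothetical reimplementation, not SimpleTokenizer's source: it assumes IDs are assigned in ascending byte order and that bytes missing from the corpus are skipped during encoding, neither of which is documented above.

```go
package main

import (
	"fmt"
	"sort"
)

// charTokenizer maps each distinct byte of a corpus to a token ID.
type charTokenizer struct {
	charToID map[byte]int
	idToChar map[int]byte
}

// newCharTokenizer assigns IDs to the distinct bytes of text in ascending
// byte order (an assumption of this sketch).
func newCharTokenizer(text string) *charTokenizer {
	seen := map[byte]bool{}
	var uniq []byte
	for i := 0; i < len(text); i++ {
		if !seen[text[i]] {
			seen[text[i]] = true
			uniq = append(uniq, text[i])
		}
	}
	sort.Slice(uniq, func(a, b int) bool { return uniq[a] < uniq[b] })

	t := &charTokenizer{charToID: map[byte]int{}, idToChar: map[int]byte{}}
	for id, c := range uniq {
		t.charToID[c] = id
		t.idToChar[id] = c
	}
	return t
}

// encode converts text to IDs, skipping bytes that were not in the corpus.
func (t *charTokenizer) encode(text string) []int {
	var ids []int
	for i := 0; i < len(text); i++ {
		if id, ok := t.charToID[text[i]]; ok {
			ids = append(ids, id)
		}
	}
	return ids
}

// decode converts IDs back to text, skipping unknown IDs.
func (t *charTokenizer) decode(ids []int) string {
	var out []byte
	for _, id := range ids {
		if c, ok := t.idToChar[id]; ok {
			out = append(out, c)
		}
	}
	return string(out)
}

func main() {
	tok := newCharTokenizer("hello")
	ids := tok.encode("hello")
	fmt.Println(ids, tok.decode(ids)) // [1 0 2 2 3] hello
}
```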

Directories

Path Synopsis
Package mythos implements the OpenMythos recurrent-depth transformer architecture in gorch.
