model

package

v0.0.0-...-4326643 Published: Apr 30, 2026 License: MIT Imports: 19 Imported by: 0

Repository

github.com/vinq1911/gorch

Documentation ¶

Overview ¶

Package model provides weight loading and model serialization for gorch.

Index ¶

func CausalLMLoss(model *nn.GPT, tokens []int) *g.Tensor
func DownloadGPT2(modelName, dir string) error
func ExportLinearToONNX(l *nn.Linear, batchSize int, path string) error
func ExportSequentialToONNX(seq *nn.Sequential, inputShape []int, path string) error
func Generate(model *nn.GPT, tokenIDs []int, maxNewTokens int) []int
func GenerateText(model *nn.GPT, tok *BPETokenizer, prompt string, cfg GenerateConfig) string
func GenerateWithConfig(model *nn.GPT, tokenIDs []int, cfg GenerateConfig) []int
func LoadGPT2(dir string, cfg GPT2Config) (*nn.GPT, error)
func LoadGPT2Verbose(dir string, cfg GPT2Config) (*nn.GPT, error)
func LoadModelWeights(path string, params []*g.Tensor, nameMap map[string]int) error
func SaveModelWeights(path string, params []*g.Tensor, nameMap map[int]string) error
func SaveSafetensors(path string, tensors map[string]*g.Tensor) error
type BPETokenizer
- func LoadTokenizer(vocabPath, mergesPath string) (*BPETokenizer, error)
- func (t *BPETokenizer) Decode(ids []int) string
- func (t *BPETokenizer) Encode(text string) []int
- func (t *BPETokenizer) EncodeBatch(texts []string) [][]int
type GPT2Config
- func GPT2Small() GPT2Config
- func TinyStories1M() GPT2Config
type GenerateConfig
- func DefaultGenerateConfig() GenerateConfig
- func GreedyConfig(maxTokens int) GenerateConfig
type KVCache
- func NewKVCache(numLayers, numHeads, headDim int) *KVCache
- func (kv *KVCache) Append(layer, head int, key, value []float32)
- func (kv *KVCache) GetKeys(layer, head int) []float32
- func (kv *KVCache) GetValues(layer, head int) []float32
- func (kv *KVCache) Len() int
- func (kv *KVCache) Reset()
type ONNXFile
- func LoadONNX(path string) (*ONNXFile, error)
type ONNXNodeInfo
type SafetensorsFile
- func LoadSafetensors(path string) (*SafetensorsFile, error)
type SafetensorsHeader
type SimpleTokenizer
- func NewSimpleTokenizer(text string) *SimpleTokenizer
- func (t *SimpleTokenizer) Decode(ids []int) string
- func (t *SimpleTokenizer) Encode(text string) []int
- func (t *SimpleTokenizer) VocabSize() int

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func CausalLMLoss ¶

func CausalLMLoss(model *nn.GPT, tokens []int) *g.Tensor

CausalLMLoss runs the model forward on tokens[:n-1] and returns the next-token cross-entropy loss against tokens[1:]. The returned tensor is a scalar with autograd attached, ready for Backward() followed by an optimiser step.

This is a thin convenience for the standard LM training pattern where the input and target sequences differ by a one-position shift. It does not handle batching across sequences (the rest of gorch operates one sequence at a time), so for multi-sequence training, sum or average losses across sequences in the caller.

All parameters created by nn.NewGPT already have requires_grad=true, and LoadGPT2 preserves that, so fine-tuning a pretrained model needs nothing more than a loop that calls CausalLMLoss, Backward(), and optim.Step.
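The one-position shift and the next-token cross-entropy described above can be sketched without gorch. This is a minimal illustration, not the package's implementation: shift and crossEntropy are hypothetical helpers, the all-zero logits stand in for a real forward pass, and averaging over positions is an assumption (the documentation does not say whether the loss is a mean or a sum).

```go
package main

import (
	"fmt"
	"math"
)

// shift splits a token sequence into model inputs and next-token targets.
// It needs at least two tokens.
func shift(tokens []int) (inputs, targets []int) {
	return tokens[:len(tokens)-1], tokens[1:]
}

// crossEntropy returns the mean negative log-likelihood of targets.
// logits[i] holds one unnormalised score per vocabulary entry for the
// token that follows position i. Averaging is an assumption of this sketch.
func crossEntropy(logits [][]float64, targets []int) float64 {
	total := 0.0
	for i, row := range logits {
		// Log-sum-exp with the max subtracted for numerical stability.
		maxv := row[0]
		for _, v := range row {
			maxv = math.Max(maxv, v)
		}
		sum := 0.0
		for _, v := range row {
			sum += math.Exp(v - maxv)
		}
		total += maxv + math.Log(sum) - row[targets[i]]
	}
	return total / float64(len(logits))
}

func main() {
	inputs, targets := shift([]int{3, 1, 4, 1})
	// Stand-in for the model forward pass: all-zero logits over 5 tokens.
	logits := make([][]float64, len(inputs))
	for i := range logits {
		logits[i] = make([]float64, 5)
	}
	fmt.Println(inputs, targets)
	fmt.Printf("%.4f\n", crossEntropy(logits, targets))
}
```

With uniform logits over five tokens, every position contributes ln 5, so the printed loss is about 1.6094.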

func DownloadGPT2 ¶

func DownloadGPT2(modelName, dir string) error

DownloadGPT2 downloads model files for a HuggingFace GPT-2 model.

func ExportLinearToONNX ¶

func ExportLinearToONNX(l *nn.Linear, batchSize int, path string) error

ExportLinearToONNX is a convenience wrapper for a single Linear layer (for tests and minimal models).

func ExportSequentialToONNX ¶

func ExportSequentialToONNX(seq *nn.Sequential, inputShape []int, path string) error

ExportSequentialToONNX serialises a Sequential of supported layers to an ONNX model file at path. inputShape is the static shape of the input tensor, typically (batch, features) for MLPs or (batch, channels, H, W) for CNNs. The batch dimension is emitted as a symbolic -1 so downstream tools treat the batch size as dynamic.

Supported layers:

nn.Linear → Gemm (with transB=1)
nn.ReLUModule → Relu
nn.SigmoidModule → Sigmoid
nn.TanhModule → Tanh
nn.Conv2d → Conv
nn.MaxPool2d → MaxPool
nn.Flatten → Flatten (axis=1)

Returns an error if a layer type is unsupported.

func Generate ¶

func Generate(model *nn.GPT, tokenIDs []int, maxNewTokens int) []int

Generate autoregressively produces token IDs from a GPT model, starting from tokenIDs. It uses greedy decoding (argmax at each step) rather than sampling, so the output is deterministic.
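Greedy decoding can be sketched as a loop that repeatedly appends the highest-scoring next token. The names below (logitsFn, cycleModel, greedy) are hypothetical, the toy model replaces a real forward pass, and returning the prompt followed by the new tokens is an assumption; the documentation does not state whether Generate includes the prompt in its result.

```go
package main

import "fmt"

// logitsFn is a hypothetical stand-in for a model forward pass: given the
// tokens so far, it returns one score per vocabulary entry for the next token.
type logitsFn func(ctx []int) []float32

// argmax returns the index of the largest score (first one on ties).
func argmax(xs []float32) int {
	best := 0
	for i, x := range xs {
		if x > xs[best] {
			best = i
		}
	}
	return best
}

// greedy appends maxNew tokens to a copy of prompt, always taking the argmax.
func greedy(next logitsFn, prompt []int, maxNew int) []int {
	out := append([]int(nil), prompt...)
	for i := 0; i < maxNew; i++ {
		out = append(out, argmax(next(out)))
	}
	return out
}

// cycleModel is a toy model over 4 tokens that favours (last token + 1) mod 4.
func cycleModel(ctx []int) []float32 {
	scores := make([]float32, 4)
	scores[(ctx[len(ctx)-1]+1)%4] = 1
	return scores
}

func main() {
	fmt.Println(greedy(cycleModel, []int{2}, 5)) // [2 3 0 1 2 3]
}
```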

func GenerateText ¶

func GenerateText(model *nn.GPT, tok *BPETokenizer, prompt string, cfg GenerateConfig) string

GenerateText produces text from a prompt using the given model and config.

func GenerateWithConfig ¶

func GenerateWithConfig(model *nn.GPT, tokenIDs []int, cfg GenerateConfig) []int

GenerateWithConfig generates tokens with temperature, top-k, and top-p sampling.
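How temperature, top-k, and top-p typically combine into one sampling distribution can be sketched as follows. This is an illustration under stated assumptions, not gorch's code: filteredProbs and sample are hypothetical helpers, temperature must be positive here (the zero-means-greedy case is left out), and applying top-k before top-p is one common ordering that the documentation does not confirm.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// filteredProbs turns logits into a sampling distribution.
// temperature must be > 0; topK <= 0 and topP <= 0 disable the filters.
func filteredProbs(logits []float32, temperature float32, topK int, topP float32) []float64 {
	n := len(logits)
	t := float64(temperature)

	// Softmax over logits/temperature, with the max subtracted for stability.
	maxv := math.Inf(-1)
	for _, l := range logits {
		maxv = math.Max(maxv, float64(l)/t)
	}
	probs := make([]float64, n)
	sum := 0.0
	for i, l := range logits {
		probs[i] = math.Exp(float64(l)/t - maxv)
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}

	// Token indices ordered by descending probability.
	order := make([]int, n)
	for i := range order {
		order[i] = i
	}
	sort.SliceStable(order, func(a, b int) bool { return probs[order[a]] > probs[order[b]] })

	// Top-k keeps the k most likely tokens; top-p then keeps the smallest
	// prefix of those whose cumulative probability reaches topP.
	keep := n
	if topK > 0 && topK < keep {
		keep = topK
	}
	if topP > 0 {
		cum := 0.0
		for rank := 0; rank < keep; rank++ {
			cum += probs[order[rank]]
			if cum >= float64(topP) {
				keep = rank + 1
				break
			}
		}
	}

	// Zero out everything else and renormalise the survivors.
	out := make([]float64, n)
	kept := 0.0
	for _, idx := range order[:keep] {
		out[idx] = probs[idx]
		kept += probs[idx]
	}
	for i := range out {
		out[i] /= kept
	}
	return out
}

// sample draws one index from probs by inverse-CDF sampling.
func sample(probs []float64, rng *rand.Rand) int {
	r := rng.Float64()
	cum := 0.0
	for i, p := range probs {
		cum += p
		if r < cum {
			return i
		}
	}
	return len(probs) - 1
}

func main() {
	probs := filteredProbs([]float32{2, 1, 0, -1}, 1.0, 2, 0)
	fmt.Printf("%.3f\n", probs) // [0.731 0.269 0.000 0.000]
	rng := rand.New(rand.NewSource(1))
	fmt.Println(sample(probs, rng))
}
```

Lower temperatures sharpen the softmax toward the argmax, which is why a temperature of 0 is commonly treated as a switch to greedy decoding.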

func LoadGPT2 ¶

func LoadGPT2(dir string, cfg GPT2Config) (*nn.GPT, error)

LoadGPT2 loads a pretrained GPT-2 model from safetensors. Handles the GPT-2 Conv1D convention (transposed weights) and fused QKV.
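HuggingFace GPT-2 checkpoints store projection weights in the Conv1D layout (in, out) and fuse the query, key, and value projections into a single matrix of shape (dim, 3*dim). The sketch below shows one way to split and transpose such a matrix. transpose and splitQKV are hypothetical helpers, and the assumption that the target Linear layer wants (out, in) weights is not confirmed by this documentation.

```go
package main

import "fmt"

// transpose returns the (cols, rows) transpose of a row-major (rows, cols) matrix.
func transpose(w []float32, rows, cols int) []float32 {
	out := make([]float32, len(w))
	for r := 0; r < rows; r++ {
		for c := 0; c < cols; c++ {
			out[c*rows+r] = w[r*cols+c]
		}
	}
	return out
}

// splitQKV splits a fused row-major (dim, 3*dim) weight into three
// (dim, dim) matrices: columns [0,dim) are Q, [dim,2*dim) K, [2*dim,3*dim) V.
func splitQKV(fused []float32, dim int) (q, k, v []float32) {
	q = make([]float32, dim*dim)
	k = make([]float32, dim*dim)
	v = make([]float32, dim*dim)
	for r := 0; r < dim; r++ {
		row := fused[r*3*dim : (r+1)*3*dim]
		copy(q[r*dim:(r+1)*dim], row[:dim])
		copy(k[r*dim:(r+1)*dim], row[dim:2*dim])
		copy(v[r*dim:(r+1)*dim], row[2*dim:])
	}
	return q, k, v
}

func main() {
	dim := 2
	// Fused (2, 6) weight: each row is [q q k k v v].
	fused := []float32{
		1, 2, 3, 4, 5, 6,
		7, 8, 9, 10, 11, 12,
	}
	q, k, v := splitQKV(fused, dim)
	fmt.Println(q, k, v)                // [1 2 7 8] [3 4 9 10] [5 6 11 12]
	fmt.Println(transpose(q, dim, dim)) // [1 7 2 8]
}
```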

func LoadGPT2Verbose ¶

func LoadGPT2Verbose(dir string, cfg GPT2Config) (*nn.GPT, error)

LoadGPT2Verbose is like LoadGPT2 but prints a one-line summary to log after a successful load. Existing tools that rely on the old log line can switch to this; new code should prefer LoadGPT2 (silent).

func LoadModelWeights ¶

func LoadModelWeights(path string, params []*g.Tensor, nameMap map[string]int) error

LoadModelWeights loads a safetensors file and copies its weights into params. nameMap maps each safetensors tensor name to the index of the corresponding parameter in params.

func SaveModelWeights ¶

func SaveModelWeights(path string, params []*g.Tensor, nameMap map[int]string) error

SaveModelWeights saves model parameters to a safetensors file. nameMap maps each index in params to the tensor name written to the file.

func SaveSafetensors ¶

func SaveSafetensors(path string, tensors map[string]*g.Tensor) error

SaveSafetensors saves tensors to a .safetensors file.

Types ¶

type BPETokenizer ¶

type BPETokenizer struct {
	Encoder    map[string]int    // token string → ID
	Decoder    map[int]string    // ID → token string
	BPERanks   map[[2]string]int // merge pair → priority rank
	VocabSize  int
	ByteEncode map[byte]rune // byte → unicode char mapping
	ByteDecode map[rune]byte // unicode char → byte mapping
}

BPETokenizer implements byte-pair encoding tokenization. Compatible with GPT-2/GPT-NeoX style vocab.json + merges.txt.

func LoadTokenizer ¶

func LoadTokenizer(vocabPath, mergesPath string) (*BPETokenizer, error)

LoadTokenizer loads a BPE tokenizer from vocab.json and merges.txt.

func (*BPETokenizer) Decode ¶

func (t *BPETokenizer) Decode(ids []int) string

Decode converts token IDs back to text.

func (*BPETokenizer) Encode ¶

func (t *BPETokenizer) Encode(text string) []int

Encode converts text to token IDs.

func (*BPETokenizer) EncodeBatch ¶

func (t *BPETokenizer) EncodeBatch(texts []string) [][]int

EncodeBatch encodes texts in parallel using GOMAXPROCS workers. Output order matches input order. The tokenizer's read-only state (Encoder, ByteEncode, BPE merges) makes per-text Encode safe for concurrent use.

For inputs of more than a few short strings this is a noticeable win over a sequential loop in caller code, especially when the caller is embedding many small chunks (RAG, retrieval, classification).
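The order-preserving parallel pattern described above can be sketched with a worker pool that writes each result into its own slot. encodeBatch and byteEncode are hypothetical stand-ins, not the BPETokenizer methods; the only requirement is that the encode function is safe for concurrent use.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// encodeBatch runs encode over texts with GOMAXPROCS workers.
// Output order matches input order because result i is stored at index i.
func encodeBatch(texts []string, encode func(string) []int) [][]int {
	out := make([][]int, len(texts))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < runtime.GOMAXPROCS(0); w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one worker, so no lock is needed.
				out[i] = encode(texts[i])
			}
		}()
	}
	for i := range texts {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

// byteEncode is a toy encoder that maps every byte to its numeric value.
func byteEncode(s string) []int {
	ids := make([]int, len(s))
	for i := 0; i < len(s); i++ {
		ids[i] = int(s[i])
	}
	return ids
}

func main() {
	fmt.Println(encodeBatch([]string{"ab", "", "c"}, byteEncode)) // [[97 98] [] [99]]
}
```

Handing out indices over a channel, rather than pre-splitting the slice, keeps workers busy when texts differ widely in length.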

type GPT2Config ¶

type GPT2Config struct {
	VocabSize int
	Dim       int
	NumHeads  int
	NumLayers int
	MaxSeq    int
}

GPT2Config holds GPT-2 architecture parameters.

func GPT2Small ¶

func GPT2Small() GPT2Config

GPT2Small returns the config for openai-community/gpt2 (124M params).

func TinyStories1M ¶

func TinyStories1M() GPT2Config

TinyStories1M returns the config for roneneldan/TinyStories-1M.

type GenerateConfig ¶

type GenerateConfig struct {
	MaxNewTokens int     // maximum tokens to generate
	Temperature  float32 // 0 = greedy, >0 = sample with temperature
	TopK         int     // 0 = disabled, >0 = sample from top-K
	TopP         float32 // 0 = disabled, >0 = nucleus sampling threshold
	StopToken    int     // -1 = disabled, otherwise stop at this token
	UseKVCache   bool    // true = incremental decoding via KV cache
}

GenerateConfig controls text generation behavior.

func DefaultGenerateConfig ¶

func DefaultGenerateConfig() GenerateConfig

DefaultGenerateConfig returns sensible defaults for text generation.

func GreedyConfig ¶

func GreedyConfig(maxTokens int) GenerateConfig

GreedyConfig returns config for deterministic greedy decoding.

type KVCache ¶

type KVCache struct {
	Keys    [][][]float32 // [layer][head] → flat (seqSoFar * headDim)
	Values  [][][]float32 // [layer][head] → flat (seqSoFar * headDim)
	Layers  int
	Heads   int
	HeadDim int
	SeqLen  int // number of tokens cached so far
}

KVCache stores precomputed key-value pairs for efficient autoregressive generation. This avoids recomputing attention for all previous tokens.
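The flat [layer][head] layout noted in the struct comments can be sketched as below. kvCache and its methods are hypothetical and only mirror the documented shape; in particular, deriving the cached length from the buffer size is an assumption of this sketch, and how gorch maintains SeqLen is not described here.

```go
package main

import "fmt"

// kvCache mirrors the documented layout: one flat buffer per (layer, head)
// holding seqSoFar * headDim floats, with tokens stored back to back.
type kvCache struct {
	keys, values [][][]float32
	headDim      int
}

func newKVCache(layers, heads, headDim int) *kvCache {
	c := &kvCache{headDim: headDim}
	c.keys = make([][][]float32, layers)
	c.values = make([][][]float32, layers)
	for l := range c.keys {
		c.keys[l] = make([][]float32, heads)
		c.values[l] = make([][]float32, heads)
	}
	return c
}

// append adds one token's key and value vectors (each of length headDim).
func (c *kvCache) append(layer, head int, key, value []float32) {
	c.keys[layer][head] = append(c.keys[layer][head], key...)
	c.values[layer][head] = append(c.values[layer][head], value...)
}

// keyAt returns token t's key vector as a view into the flat buffer.
func (c *kvCache) keyAt(layer, head, t int) []float32 {
	return c.keys[layer][head][t*c.headDim : (t+1)*c.headDim]
}

// seqLen derives the number of cached tokens from the buffer length.
func (c *kvCache) seqLen(layer, head int) int {
	return len(c.keys[layer][head]) / c.headDim
}

func main() {
	c := newKVCache(1, 1, 2)
	c.append(0, 0, []float32{1, 2}, []float32{10, 20})
	c.append(0, 0, []float32{3, 4}, []float32{30, 40})
	fmt.Println(c.seqLen(0, 0), c.keyAt(0, 0, 1)) // 2 [3 4]
}
```

During incremental decoding only the newest token's key and value are computed and appended; attention for that token then reads all earlier rows from the cache.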

func NewKVCache ¶

func NewKVCache(numLayers, numHeads, headDim int) *KVCache

NewKVCache creates an empty KV cache for a model.

func (*KVCache) Append ¶

func (kv *KVCache) Append(layer, head int, key, value []float32)

Append adds new key-value vectors for one token to the cache.

func (*KVCache) GetKeys ¶

func (kv *KVCache) GetKeys(layer, head int) []float32

GetKeys returns all cached keys for a layer/head as (seqLen, headDim).

func (*KVCache) GetValues ¶

func (kv *KVCache) GetValues(layer, head int) []float32

GetValues returns all cached values for a layer/head.

func (*KVCache) Len ¶

func (kv *KVCache) Len() int

Len returns the number of tokens cached.

func (*KVCache) Reset ¶

func (kv *KVCache) Reset()

Reset clears the cache.

type ONNXFile ¶

type ONNXFile struct {
	Tensors  map[string]*g.Tensor // initializer name → tensor
	Names    []string             // initializer names in file order
	Nodes    []ONNXNodeInfo       // graph node summary (for inspection)
	IRVer    int64
	Producer string
}

ONNXFile is a partial parse of an ONNX model. We decode only the pieces gorch can act on today (initializer tensors keyed by name), so users can load weights from any ONNX producer without implementing the full graph spec. Op nodes are recorded as a flat list for inspection but not executed.

func LoadONNX ¶

func LoadONNX(path string) (*ONNXFile, error)

LoadONNX reads an ONNX file and returns its initializer tensors. Supports float32 (raw_data or float_data) and int64 (raw_data) initializers, including files written by this package's ONNX exporters. Any other dtype causes an error, so silent dtype mismatches don't propagate downstream.

type ONNXNodeInfo ¶

type ONNXNodeInfo struct {
	OpType string
	Name   string
	Inputs []string
	Output []string
}

ONNXNodeInfo is a lightweight summary of a graph node for inspection. We don't reconstruct execution from these.

type SafetensorsFile ¶

type SafetensorsFile struct {
	Tensors map[string]*g.Tensor
	Names   []string // ordered tensor names
}

SafetensorsFile represents a loaded safetensors file.

func LoadSafetensors ¶

func LoadSafetensors(path string) (*SafetensorsFile, error)

LoadSafetensors loads a .safetensors file and returns all tensors.

Safetensors format:

8 bytes: little-endian uint64 header length
N bytes: JSON header mapping tensor name → {dtype, shape, data_offsets}
Remaining: raw tensor data

Supports F32, F16 (converted to F32), and BF16 (converted to F32).

Streams tensor data: only the JSON header and one tensor's raw bytes are alive at any time, plus the running set of decoded F32 tensors. For a 622 MB file this drops peak transient RSS from ~1.24 GB (raw bytes + decoded floats both alive) to roughly the size of the largest single tensor + decoded total. See issue #10.
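The F16 and BF16 conversions mentioned above follow directly from the bit layouts: BF16 is the upper 16 bits of an IEEE 754 float32, while F16 needs its exponent rebiased and its subnormals handled. The helpers below are a hypothetical sketch of those conversions, not gorch's decoder.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// bf16ToF32 widens a bfloat16 by placing it in the high half of a float32.
func bf16ToF32(b uint16) float32 {
	return math.Float32frombits(uint32(b) << 16)
}

// f16ToF32 converts an IEEE 754 half-precision value to float32.
func f16ToF32(h uint16) float32 {
	sign := uint32(h>>15) << 31
	exp := uint32(h>>10) & 0x1f
	frac := uint32(h) & 0x3ff
	switch {
	case exp == 0 && frac == 0:
		return math.Float32frombits(sign) // signed zero
	case exp == 0:
		// Subnormal half: value is frac * 2^-24.
		v := float32(frac) * (1.0 / 16777216.0)
		if sign != 0 {
			v = -v
		}
		return v
	case exp == 0x1f:
		return math.Float32frombits(sign | 0x7f800000 | frac<<13) // Inf or NaN
	default:
		// Rebias the exponent from 15 to 127 and widen the fraction.
		return math.Float32frombits(sign | (exp+112)<<23 | frac<<13)
	}
}

// decodeBF16 decodes little-endian BF16 bytes into float32 values.
func decodeBF16(raw []byte) []float32 {
	out := make([]float32, len(raw)/2)
	for i := range out {
		out[i] = bf16ToF32(binary.LittleEndian.Uint16(raw[2*i:]))
	}
	return out
}

func main() {
	fmt.Println(decodeBF16([]byte{0x80, 0x3f, 0x40, 0xc0})) // [1 -3]
	fmt.Println(f16ToF32(0x3c00), f16ToF32(0xc000))         // 1 -2
}
```

Decoding one tensor's bytes at a time, as in decodeBF16, is what lets a streaming loader release each raw buffer before reading the next.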

type SafetensorsHeader ¶

type SafetensorsHeader struct {
	DType   string `json:"dtype"`
	Shape   []int  `json:"shape"`
	Offsets [2]int `json:"data_offsets"`
}

SafetensorsHeader represents the metadata for one tensor in a safetensors file.
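Reading the length-prefixed JSON header can be sketched with the standard library alone. tensorInfo copies the field tags of SafetensorsHeader; readHeader, readHeaderBytes, and buildFile are hypothetical helpers for illustration. The safetensors format also allows an optional "__metadata__" entry of free-form strings, which this sketch skips.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/json"
	"fmt"
	"io"
)

// tensorInfo has the same JSON shape as SafetensorsHeader.
type tensorInfo struct {
	DType   string `json:"dtype"`
	Shape   []int  `json:"shape"`
	Offsets [2]int `json:"data_offsets"`
}

// readHeader reads the 8-byte little-endian length and the JSON header.
// A production reader would also bound n before allocating.
func readHeader(r io.Reader) (map[string]tensorInfo, error) {
	var n uint64
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return nil, fmt.Errorf("read header length: %w", err)
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, fmt.Errorf("read header: %w", err)
	}
	raw := map[string]json.RawMessage{}
	if err := json.Unmarshal(buf, &raw); err != nil {
		return nil, fmt.Errorf("parse header: %w", err)
	}
	out := make(map[string]tensorInfo, len(raw))
	for name, msg := range raw {
		if name == "__metadata__" {
			continue // optional string map, not a tensor entry
		}
		var ti tensorInfo
		if err := json.Unmarshal(msg, &ti); err != nil {
			return nil, fmt.Errorf("parse entry %q: %w", name, err)
		}
		out[name] = ti
	}
	return out, nil
}

// readHeaderBytes is a convenience wrapper for in-memory files.
func readHeaderBytes(b []byte) (map[string]tensorInfo, error) {
	return readHeader(bytes.NewReader(b))
}

// buildFile assembles a tiny safetensors-shaped file for the demo.
func buildFile(headerJSON string, data []byte) []byte {
	out := make([]byte, 8, 8+len(headerJSON)+len(data))
	binary.LittleEndian.PutUint64(out, uint64(len(headerJSON)))
	out = append(out, headerJSON...)
	return append(out, data...)
}

func main() {
	hdr := `{"__metadata__":{"format":"pt"},"w":{"dtype":"F32","shape":[2,2],"data_offsets":[0,16]}}`
	h, err := readHeaderBytes(buildFile(hdr, make([]byte, 16)))
	if err != nil {
		panic(err)
	}
	fmt.Println(h["w"].DType, h["w"].Shape, h["w"].Offsets) // F32 [2 2] [0 16]
}
```

The data_offsets pair is relative to the start of the raw tensor data, that is, to the first byte after the header.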

type SimpleTokenizer ¶

type SimpleTokenizer struct {
	CharToID map[byte]int
	IDToChar map[int]byte
	VocabSz  int
}

SimpleTokenizer is a minimal character-level tokenizer for testing. Maps each unique byte to a token ID.
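A byte-level tokenizer of this kind can be sketched in a few lines. charTokenizer is a hypothetical stand-in; assigning IDs in sorted byte order and skipping bytes that were not in the corpus are both assumptions, since the documentation specifies neither.

```go
package main

import (
	"fmt"
	"sort"
)

// charTokenizer maps each unique byte of a corpus to a token ID.
type charTokenizer struct {
	toID   map[byte]int
	toChar map[int]byte
}

// newCharTokenizer assigns IDs in ascending byte order (an assumption).
func newCharTokenizer(corpus string) *charTokenizer {
	seen := map[byte]bool{}
	var uniq []byte
	for i := 0; i < len(corpus); i++ {
		if !seen[corpus[i]] {
			seen[corpus[i]] = true
			uniq = append(uniq, corpus[i])
		}
	}
	sort.Slice(uniq, func(a, b int) bool { return uniq[a] < uniq[b] })
	t := &charTokenizer{toID: map[byte]int{}, toChar: map[int]byte{}}
	for id, b := range uniq {
		t.toID[b] = id
		t.toChar[id] = b
	}
	return t
}

// encode maps bytes to IDs, skipping bytes absent from the corpus.
func (t *charTokenizer) encode(text string) []int {
	ids := make([]int, 0, len(text))
	for i := 0; i < len(text); i++ {
		if id, ok := t.toID[text[i]]; ok {
			ids = append(ids, id)
		}
	}
	return ids
}

// decode maps IDs back to bytes, skipping unknown IDs.
func (t *charTokenizer) decode(ids []int) string {
	out := make([]byte, 0, len(ids))
	for _, id := range ids {
		if b, ok := t.toChar[id]; ok {
			out = append(out, b)
		}
	}
	return string(out)
}

func main() {
	tok := newCharTokenizer("hello")
	ids := tok.encode("hello")
	fmt.Println(ids, tok.decode(ids), len(tok.toID)) // [1 0 2 2 3] hello 4
}
```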

func NewSimpleTokenizer ¶

func NewSimpleTokenizer(text string) *SimpleTokenizer

NewSimpleTokenizer creates a character-level tokenizer from a text corpus.

func (*SimpleTokenizer) Decode ¶

func (t *SimpleTokenizer) Decode(ids []int) string

func (*SimpleTokenizer) Encode ¶

func (t *SimpleTokenizer) Encode(text string) []int

func (*SimpleTokenizer) VocabSize ¶

func (t *SimpleTokenizer) VocabSize() int

Directories ¶

Path	Synopsis
mythos	Package mythos implements the OpenMythos recurrent-depth transformer architecture in gorch.
