gonet

package module
v0.0.0-...-5b2c132
Published: May 11, 2026 License: MIT Imports: 9 Imported by: 0

README

gonet

Motivation

gonet is a neural network library written in Go as a project for learning, experimentation, and fun. It is not intended for production use. The goal is to make neural-network internals easier to inspect by implementing the core pieces directly: fully connected layers, embeddings, attention, normalization, losses, training loops, and automatic differentiation.

The repository contains two independent implementations:

  • The root gonet package builds dynamic computation graphs and performs reverse-mode automatic differentiation (a short sketch follows this list).
  • The arrimpl package contains an array-based fully connected network with explicit forward and backward propagation, meant as a cross-check for the graph implementation.
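
For a feel of the computation-graph API, here is a minimal sketch of building a graph and differentiating through it. The import path is a placeholder, and using NewInputNode for leaf values is an assumption based on the index below:

package main

import (
	"fmt"

	gonet "example.com/gonet" // placeholder import path; the real module path is not shown here
)

func main() {
	// Build the graph y = a*b + c.
	a := gonet.NewInputNode(2, "a")
	b := gonet.NewInputNode(3, "b")
	c := gonet.NewInputNode(1, "c")
	y := gonet.Plus(gonet.Multiply(a, b), c)

	// One forward pass, then reverse-mode backpropagation.
	y.ForwardBackward()

	fmt.Println(y.V()) // value of y: 2*3 + 1 = 7
	fmt.Println(a.G()) // gradient dy/da = b = 3
}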

Examples

The examples/ directory contains small demonstration projects that exercise different parts of the framework:

  • Binary classifier - trains a small neural-net binary classifier inspired by Karpathy's micrograd introduction. It can run with either the computation-graph implementation or the array-based MLP.
  • Digit OCR - trains a digit classifier on sklearn digits or MNIST-style data, again with both graph and array-based training modes.
  • Word embedding - trains a tiny word embedding model and shows how similar words can converge toward similar learned vectors.
  • Makemore neural bigram - implements a character-level neural bigram language model, with both one-hot linear and embedding-based variants.
  • Makemore neural quadgram - implements an MLP character language model in the style of Bengio et al. 2003, using multiple previous characters as context.
  • Makemore WaveNet - builds a character language model with a WaveNet-like hierarchical structure.
  • Makemore decoder-only transformer - trains a small character-level GPT-style model with masked self-attention, multi-head attention, attention blocks, and token generation.

Decoder-Only Transformer Highlight

The decoder-only transformer example is currently the most sophisticated one in this repository. It demonstrates that this small Go deep-learning framework can express and train a relatively complex model: token embeddings, positional/context handling, masked self-attention, multi-head attention, stacked transformer blocks, and autoregressive character generation.
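
For orientation, constructing the whole model reduces to a single call. A minimal sketch, with illustrative hyperparameters (not taken from the example itself) and a placeholder import path:

package main

import gonet "example.com/gonet" // placeholder import path

func main() {
	// All hyperparameter values below are illustrative assumptions.
	model := gonet.DecoderOnlyTransformer(
		65, // vocabSize: distinct characters in the corpus
		32, // ctxLen: context window length, in tokens
		4,  // layerNum: stacked transformer blocks
		4,  // headNum: attention heads per block
		64, // embDim: embedding dimension
	)

	// model satisfies Model, so it can be handed to Train together
	// with samples, a util.TrainConfig, and a loss function.
	_ = model
}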

The example is still primarily educational. Performance is not competitive with production deep-learning frameworks (e.g., PyTorch), but that is also the point: the model is implemented in a way that keeps the mechanics visible and hackable instead of hiding them behind highly optimized tensor kernels.

Documentation

Overview

Package gonet provides a computation-graph-based implementation of the neural-network forward and backward propagation algorithms.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func NodeValues

func NodeValues(ns []*Node) []float64

func SingleLinear

func SingleLinear(inputSize int, bias bool) *singleLinear

func Train

func Train(model Model, samples []util.Sample, cfg *util.TrainConfig, lf LossFunction) time.Duration

Types

type E2ELoss

type E2ELoss func([]util.Sample) *Node

func TrainLossFunc

func TrainLossFunc(model FeedForwarder, lf LossFunction) E2ELoss

type E2EPredictLoss

type E2EPredictLoss func([]util.Sample) float64

func PredictLossFunc

func PredictLossFunc(model FeedForwarder, lf LossFunction) E2EPredictLoss

type Embedding

type Embedding struct {
	// contains filtered or unexported fields
}

func NewEmbedding

func NewEmbedding(vocabSize, dim int, initNorm bool) *Embedding

func (*Embedding) E

func (e *Embedding) E(index int) []*Node

func (*Embedding) EmbeddingFeed

func (e *Embedding) EmbeddingFeed(in []*Node) (out []*Node)

func (*Embedding) S

func (e *Embedding) S(index int) string

func (*Embedding) Sub

func (e *Embedding) Sub(i, j int) (diff []float64)

func (*Embedding) UnembeddingFeed

func (e *Embedding) UnembeddingFeed(in, bias []*Node) (out []*Node)
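
A minimal sketch of the Embedding accessors. The meaning of initNorm and the exact semantics of Sub are assumptions read off the signatures:

package main

import (
	"fmt"

	gonet "example.com/gonet" // placeholder import path
)

func main() {
	// vocabSize=100 tokens, dim=8 values per embedding; initNorm is
	// assumed to request normalized initialization.
	emb := gonet.NewEmbedding(100, 8, true)

	vec := emb.E(42)      // assumed: the 8 embedding nodes for token index 42
	diff := emb.Sub(1, 2) // assumed: element-wise difference of embeddings 1 and 2

	fmt.Println(len(vec), len(diff)) // 8 8
}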

type FeedForwarder

type FeedForwarder interface {
	Feed([]*Node) []*Node
}

type Layer

type Layer interface {
	FeedForwarder
	Parameters() []util.Parameter
	Name() string
}

func AttentionBlockLayer

func AttentionBlockLayer(embDim, headNum int, buildAttention func(int, int) Layer) Layer

func EmbeddingLayer

func EmbeddingLayer(vocabSize, dim int) Layer

func EmbeddingLayerFrom

func EmbeddingLayerFrom(emb *Embedding) Layer

func KQVLayer

func KQVLayer(embDim, headSize int) Layer

func LayerNormLayer

func LayerNormLayer(dim int) Layer

func LinearLayer

func LinearLayer(fanIn, fanOut int, bias bool) Layer

func MaskedSelfAttentionLayer

func MaskedSelfAttentionLayer(embDim, headSize int) Layer

func MultiHeadAttentionLayer

func MultiHeadAttentionLayer(embDim, headNum int, buildAttention func(int, int) Layer) Layer
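
Because MaskedSelfAttentionLayer has the signature func(int, int) Layer, it can be passed directly as the buildAttention callback. A minimal sketch with illustrative dimensions:

package main

import gonet "example.com/gonet" // placeholder import path

func main() {
	// embDim=64 split across headNum=4 heads; presumably each head
	// receives headSize = embDim/headNum, though that detail is an
	// assumption about the wiring.
	attn := gonet.MultiHeadAttentionLayer(64, 4, gonet.MaskedSelfAttentionLayer)
	_ = attn
}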

func PositionalEmbeddingLayer

func PositionalEmbeddingLayer(vocabSize, ctxLen, dim int) Layer

func ReluLayer

func ReluLayer() Layer

func SigmoidLayer

func SigmoidLayer() Layer

func SoftmaxLayer

func SoftmaxLayer(t float64) Layer

func TanhLayer

func TanhLayer() Layer

func UnembeddingLayer

func UnembeddingLayer(dim, vocabSize int, bias bool) Layer

func UnembeddingLayerFrom

func UnembeddingLayerFrom(emb *Embedding, bias bool) Layer

type LossFunction

type LossFunction func(actual, predicted []*Node) *Node

type Model

type Model interface {
	Layer
	util.Predictor
}

func DecoderOnlyTransformer

func DecoderOnlyTransformer(vocabSize, ctxLen, layerNum, headNum, embDim int) Model

func EmbeddingModel

func EmbeddingModel(vocabSize, dim int) Model

func LinearModel

func LinearModel(fanIn, fanOut int, bias bool) Model

func SequentialModel

func SequentialModel(layers ...Layer) Model
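
A minimal sketch of composing a small MLP from the layer constructors above. The import path is a placeholder and the naming format passed to NewInputNodeBatch is a guess:

package main

import (
	"fmt"

	gonet "example.com/gonet" // placeholder import path
)

func main() {
	// A small MLP: 3 inputs -> 4 hidden units (tanh) -> 1 output.
	mlp := gonet.SequentialModel(
		gonet.LinearLayer(3, 4, true), // fanIn=3, fanOut=4, with bias
		gonet.TanhLayer(),
		gonet.LinearLayer(4, 1, true),
	)

	// "x%d" is a guess at NewInputNodeBatch's naming convention.
	xs := gonet.NewInputNodeBatch(3, "x%d", false)
	out := mlp.Feed(xs)
	fmt.Println(len(out)) // 1 output node
}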

type Node

type Node struct {
	// contains filtered or unexported fields
}

func CrossEntropyLoss

func CrossEntropyLoss(actual, predicted []*Node) *Node

CrossEntropyLoss already takes the softmax activation into account, so the values in the `predicted` nodes are logits rather than probabilities. To use the cross-entropy function without softmax fused into it, use RawCrossEntropyLoss instead. Also note that, unlike RawCrossEntropyLoss, the values in the `actual` nodes are ground-truth class indexes, so expect `actual` (indexes) and `predicted` (logits) to be of different lengths.
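
A minimal sketch of the shape convention described above, with a made-up class count and a placeholder import path:

package main

import gonet "example.com/gonet" // placeholder import path

func main() {
	// predicted: one logit node per class (4 classes here, made up).
	logits := gonet.NewInputNodeBatch(4, "logit%d", false)

	// actual: ground-truth class indexes, not probabilities, so the
	// slice is shorter than the logits. One sample whose true class is 2.
	target := []*gonet.Node{gonet.NewInputNodeNoGrad(2, "target")}

	loss := gonet.CrossEntropyLoss(target, logits)
	loss.ForwardBackward() // gradients reach the logit nodes
}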

func DotProduct

func DotProduct(left, right []*Node) *Node

func Identity

func Identity(n *Node) *Node

func LayerNorm

func LayerNorm(xs, gamma, beta []*Node, eps float64) (ys []*Node)

func Linear

func Linear(ws, xs []*Node, bias *Node) *Node

func MaskedAttention

func MaskedAttention(ks, qs, vs [][]*Node) []*Node

func MaxMarginLoss

func MaxMarginLoss(actual, predicted []*Node) *Node

func Mean

func Mean(xs ...*Node) *Node

func MeanVariance

func MeanVariance(xs ...*Node) (mean, variance *Node)

func Multiply

func Multiply(prev ...*Node) *Node

func NewInputNode

func NewInputNode(v float64, name string) *Node

func NewInputNodeBatch

func NewInputNodeBatch(size int, nameFmt string, noGrad bool) []*Node

func NewInputNodeNoGrad

func NewInputNodeNoGrad(v float64, name string) *Node

func NewNode

func NewNode(v float64, name string) *Node

func Normalize

func Normalize(eps float64, xs ...*Node) (ys []*Node)

func Plus

func Plus(prev ...*Node) *Node

func RawCrossEntropyLoss

func RawCrossEntropyLoss(actual, predicted []*Node) *Node

RawCrossEntropyLoss defines the cross-entropy loss function. `actual` represents the actual probability of each predefined class and should contain exactly one non-vanishing entry, with value 1, meaning that class is the one observed in the data-set sample (hence its probability equals 1).
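
For contrast with the fused CrossEntropyLoss, a minimal sketch of the one-hot convention. That `predicted` should hold probabilities (e.g. the output of Softmax) is an assumption:

package main

import gonet "example.com/gonet" // placeholder import path

func main() {
	// Probabilities from an explicit softmax (temperature 1).
	logits := gonet.NewInputNodeBatch(3, "logit%d", false)
	probs := gonet.Softmax(1.0, logits...)

	// One-hot target: class 1 is the observed one, so its entry is 1.
	actual := []*gonet.Node{
		gonet.NewInputNodeNoGrad(0, "y0"),
		gonet.NewInputNodeNoGrad(1, "y1"),
		gonet.NewInputNodeNoGrad(0, "y2"),
	}

	loss := gonet.RawCrossEntropyLoss(actual, probs)
	loss.ForwardBackward()
}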

func Relu

func Relu(prev *Node) *Node

func ResidualSumSquaredLoss

func ResidualSumSquaredLoss(actual, predicted []*Node) *Node

ResidualSumSquaredLoss is the Residual Sum of Squares (RSS), also known as the Sum of Squared Errors (SSE).

func Sigmoid

func Sigmoid(prev *Node) *Node

func Softmax

func Softmax(t float64, prev ...*Node) []*Node

Softmax also accepts a parameter `t`, which is sometimes called temperature.
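
A minimal sketch of the effect of t. The usual convention softmax(x/t) and the per-node Forward semantics are assumptions:

package main

import (
	"fmt"

	gonet "example.com/gonet" // placeholder import path
)

func main() {
	logits := []*gonet.Node{
		gonet.NewInputNode(1, "a"),
		gonet.NewInputNode(2, "b"),
		gonet.NewInputNode(3, "c"),
	}

	// Assuming softmax(x/t): t < 1 sharpens the distribution toward
	// the largest logit, t > 1 flattens it toward uniform.
	sharp := gonet.Softmax(0.5, logits...)
	flat := gonet.Softmax(2.0, logits...)

	// Forward is assumed to evaluate each node's subgraph.
	for _, n := range append(sharp, flat...) {
		n.Forward()
	}
	fmt.Println(gonet.NodeValues(sharp))
	fmt.Println(gonet.NodeValues(flat))
}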

func Tanh

func Tanh(prev *Node) *Node

func VectorAdd

func VectorAdd(left, right []*Node) (out []*Node)

func (*Node) Backward

func (n *Node) Backward()

func (*Node) Forward

func (n *Node) Forward()

func (*Node) ForwardBackward

func (n *Node) ForwardBackward()

func (*Node) G

func (n *Node) G() float64

func (*Node) Learn

func (n *Node) Learn(delta float64)

func (*Node) Name

func (n *Node) Name() string

func (*Node) SetName

func (n *Node) SetName(name string)

func (*Node) SetV

func (n *Node) SetV(v float64)

func (*Node) String

func (n *Node) String() string

func (*Node) V

func (n *Node) V() float64

func (*Node) ZeroG

func (n *Node) ZeroG()

type Sample

type Sample struct {
	X []*Node
	Y []*Node
}

func NewSample

func NewSample(inputSize, outputSize int, noGrad bool) *Sample

func (*Sample) Update

func (s *Sample) Update(us util.Sample)

type SampleBatch

type SampleBatch []*Sample

func NewSampleBatch

func NewSampleBatch(inputSize, outputSize, batchSize int) SampleBatch

func (SampleBatch) Update

func (sb SampleBatch) Update(samples []util.Sample)

Directories

Path	Synopsis
arrimpl	Package arrimpl provides an array-based implementation of a Fully Connected Neural Network, which is much faster for large and deep networks.
examples
	digit_ocr	command
	makemore	Package makemore defines the common utilities that are shared by all "makemore" examples.
	word_embedding	command
util	Package util ...
