llama

package
v1.4.0
Published: Apr 28, 2023 License: MIT Imports: 16 Imported by: 1

Documentation

Index

Constants

View Source
const (
	LLAMA_FILE_VERSION           = 1
	LLAMA_FILE_MAGIC             = 0x67676a74 // 'ggjt' in hex
	LLAMA_FILE_MAGIC_OLD         = 0x67676d66 // 'ggmf' in hex
	LLAMA_FILE_MAGIC_UNVERSIONED = 0x67676d6c // 'ggml' pre-versioned files
)

Variables

This section is empty.

Functions

func Colorize

func Colorize(format string, opts ...interface{}) (n int, err error)

Colorize prints colored text to the console.
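The package's implementation isn't shown here, but console colorization of this kind is usually done with ANSI escape codes. The sketch below (the `colorize` helper and its color table are illustrative, not this package's API) shows the idea:

```go
package main

import "fmt"

// ansi maps a color name to its ANSI SGR escape sequence.
var ansi = map[string]string{
	"red":   "\x1b[31m",
	"green": "\x1b[32m",
	"reset": "\x1b[0m",
}

// colorize wraps the formatted text in the given ANSI color, prints it
// to stdout, and returns fmt.Printf's byte count and error.
func colorize(color, format string, args ...interface{}) (int, error) {
	return fmt.Printf(ansi[color]+format+ansi["reset"], args...)
}

func main() {
	colorize("green", "loaded %d layers\n", 32)
}
```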

func Eval

func Eval(
	lctx *Context,
	vocab *ml.Vocab,
	model *Model,
	tokens []uint32,
	pastCount uint32,
	params *ModelParams,
) error

Eval runs one inference iteration over the LLaMA model.

lctx = model context with all LLaMA data
tokens = new batch of tokens to process
pastCount = the context size so far
params = all other parameters, like max threads allowed, etc.

func ExtractTokens

func ExtractTokens(r *ring.Ring, count int) []uint32

ExtractTokens extracts a slice of count tokens from the ring buffer.

func Resize

func Resize(slice []float32, size int) []float32

Resize is a safe replacement for C++ std::vector::resize(): https://go.dev/play/p/VlQ7N75E5AD
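A plausible implementation with std::vector::resize semantics — preserve existing elements, zero-fill any growth — could look like the sketch below (an assumption based on the doc comment, not the package's exact code):

```go
package main

import "fmt"

// resizeSlice returns a slice of exactly size elements: existing values
// are copied over and any extra elements are zero-initialized, mirroring
// C++ std::vector::resize. The input slice is left untouched.
func resizeSlice(slice []float32, size int) []float32 {
	out := make([]float32, size) // zero-filled by make
	copy(out, slice)             // copies min(len(slice), size) elements
	return out
}

func main() {
	a := []float32{1, 2, 3}
	fmt.Println(resizeSlice(a, 5)) // [1 2 3 0 0]
	fmt.Println(resizeSlice(a, 2)) // [1 2]
}
```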

func ResizeInplace

func ResizeInplace(slice *[]float32, size int)

NB! ResizeInplace does not clear the underlying array when resizing: https://go.dev/play/p/DbK4dFqwrZn
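The warning matters because reslicing within capacity in Go exposes whatever values are already sitting in the backing array. A minimal sketch of that behavior (hypothetical code, not the package's actual implementation):

```go
package main

import "fmt"

// resizeInplaceSketch grows or shrinks *slice to size. When capacity
// allows, it only reslices, so previously written (stale) values in the
// underlying array become visible again instead of being zeroed.
func resizeInplaceSketch(slice *[]float32, size int) {
	if size <= cap(*slice) {
		*slice = (*slice)[:size] // no clearing: stale data may reappear
		return
	}
	grown := make([]float32, size)
	copy(grown, *slice)
	*slice = grown
}

func main() {
	s := make([]float32, 0, 4)
	s = append(s, 7, 8, 9)
	s = s[:1] // shrink: 8 and 9 still live in the backing array
	resizeInplaceSketch(&s, 3)
	fmt.Println(s) // the stale values are back: [7 8 9]
}
```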

func SampleTopPTopK

func SampleTopPTopK(
	logits []float32,
	lastNTokens *ring.Ring,
	lastNTokensSize uint32,
	topK uint32,
	topP float32,
	temp float32,
	repeatPenalty float32,
) uint32

SampleTopPTopK samples the next token given the logits for each vocabulary entry:

  • consider only the top K tokens
  • from them, consider only the top tokens with cumulative probability > P
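These two filtering steps can be sketched as a standalone function (the name below is hypothetical, the softmax placement is an assumption, and the package's temperature and repeat-penalty handling are omitted):

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// topKTopPCandidates returns the indices of tokens that survive
// top-K followed by top-P (nucleus) filtering of the given logits.
func topKTopPCandidates(logits []float32, topK int, topP float32) []int {
	// Sort token indices by logit, highest first.
	idx := make([]int, len(logits))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(a, b int) bool { return logits[idx[a]] > logits[idx[b]] })

	// Keep only the top K tokens.
	if topK < len(idx) {
		idx = idx[:topK]
	}

	// Softmax over the surviving logits.
	var sum float64
	probs := make([]float64, len(idx))
	for i, id := range idx {
		probs[i] = math.Exp(float64(logits[id]))
		sum += probs[i]
	}

	// Keep tokens until the cumulative probability exceeds topP.
	var cum float64
	keep := 0
	for i := range idx {
		cum += probs[i] / sum
		keep = i + 1
		if cum > float64(topP) {
			break
		}
	}
	return idx[:keep]
}

func main() {
	logits := []float32{1.0, 3.0, 2.0, 0.0}
	fmt.Println(topKTopPCandidates(logits, 3, 0.9)) // [1 2]
}
```

A sampler would then draw the next token from this reduced candidate set according to the renormalized probabilities.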

Types

type Context

type Context struct {
	Logits    []float32 // decode output 2D array [tokensCount][vocabSize]
	Embedding []float32 // input embedding 1D array [embdSize]
	MLContext *ml.Context
	// contains filtered or unexported fields
}

Context is the context of the model.

func NewContext

func NewContext(model *Model, params *ModelParams) *Context

NewContext creates a new context.

func (*Context) ReleaseContext added in v1.4.0

func (ctx *Context) ReleaseContext()

type ContextParams

type ContextParams struct {
	CtxSize    uint32 // text context
	PartsCount int    // -1 for default
	Seed       int    // RNG seed, 0 for random
	LogitsAll  bool   // the llama_eval() call computes all logits, not just the last one
	VocabOnly  bool   // only load the vocabulary, no weights
	UseLock    bool   // force system to keep model in RAM
	Embedding  bool   // embedding mode only
}

ContextParams are the parameters for the context, mirroring the C++ struct llama_context_params.

type HParams

type HParams struct {
	// contains filtered or unexported fields
}

HParams are the hyperparameters of the model (LLaMA-7B commented as example).

type KVCache

type KVCache struct {
	K *ml.Tensor
	V *ml.Tensor

	N uint32 // number of tokens currently in the cache
}

KVCache is a key-value cache for the self attention.

type Layer

type Layer struct {
	// contains filtered or unexported fields
}

Layer is a single layer of the model.

type Model

type Model struct {
	Type ModelType
	// contains filtered or unexported fields
}

Model is the representation of any NN model (and LLaMA too).

func LoadModel

func LoadModel(fileName string, params *ModelParams, silent bool) (*ml.Vocab, *Model, error)

LoadModel loads a model's weights from a file. See convert-pth-to-ggml.py for details on the format.

func NewModel

func NewModel(params *ModelParams) *Model

NewModel creates a new model with default hyperparameters.

type ModelParams added in v1.2.0

type ModelParams struct {
	Model  string // model path
	Prompt string

	MaxThreads int

	UseAVX  bool
	UseNEON bool

	Seed         int
	PredictCount uint32 // new tokens to predict
	RepeatLastN  uint32 // last n tokens to penalize
	PartsCount   int    // amount of model parts (-1 = determine from model dimensions)
	CtxSize      uint32 // context size
	BatchSize    uint32 // batch size for prompt processing
	KeepCount    uint32

	TopK          uint32  // 40
	TopP          float32 // 0.95
	Temp          float32 // 0.80
	RepeatPenalty float32 // 1.10

	InputPrefix string   // string to prefix user inputs with
	Antiprompt  []string // string upon seeing which more user input is prompted

	MemoryFP16   bool // use f16 instead of f32 for memory kv
	RandomPrompt bool // do not randomize prompt if none provided
	UseColor     bool // use color to distinguish generations and inputs
	Interactive  bool // interactive mode

	Embedding        bool // get only sentence embedding
	InteractiveStart bool // wait for user input immediately

	Instruct   bool // instruction mode (used for Alpaca models)
	IgnoreEOS  bool // do not stop generating after eos
	Perplexity bool // compute perplexity over the prompt
	UseMLock   bool // use mlock to keep model in memory
	MemTest    bool // compute maximum memory usage

	VerbosePrompt bool
}

type ModelType

type ModelType uint8

ModelType is the type of the model.

const (
	MODEL_UNKNOWN ModelType = iota
	MODEL_7B
	MODEL_13B
	MODEL_30B
	MODEL_65B
)

Available LLaMA model sizes.
