multimodel

package
v1.15.0 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 24, 2026 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Overview

Package multimodel provides a ModelManager that loads and unloads models on demand, evicting the least-recently-used model when a load would exceed the GPU memory budget.


Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	// MaxGPUMemoryBytes is the memory budget. When loading a new model would
	// exceed this limit, the least-recently-used model is evicted first.
	MaxGPUMemoryBytes int64

	// PreloadModels lists model IDs to load eagerly at creation time.
	PreloadModels []string
}

Config controls the ModelManager behavior.
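
As a rough illustration of how a Config might be filled in, the sketch below redeclares the struct locally so that it compiles on its own; real code would use the package's type. The helper exampleConfig, the gib constant, the 20 GiB budget, and the model ID "embedder" are all hypothetical.

```go
package main

import "fmt"

// Config mirrors the struct documented above so this sketch is
// self-contained; it is a local copy, not the package's own type.
type Config struct {
	MaxGPUMemoryBytes int64
	PreloadModels     []string
}

// gib is the number of bytes in one gibibyte (hypothetical helper).
const gib = int64(1) << 30

// exampleConfig returns a hypothetical configuration: a 20 GiB budget
// with one model loaded eagerly.
func exampleConfig() Config {
	return Config{
		MaxGPUMemoryBytes: 20 * gib,
		PreloadModels:     []string{"embedder"}, // hypothetical model ID
	}
}

func main() {
	cfg := exampleConfig()
	fmt.Println(cfg.MaxGPUMemoryBytes) // prints 21474836480
}
```

Expressing the budget as a multiple of a named constant keeps the unit visible at the call site, since the field itself is a plain int64 of bytes.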

type ModelLoader

type ModelLoader interface {
	Load(ctx context.Context, modelID string) (io.Closer, int64, error)
}

ModelLoader loads a model by ID and returns its handle and estimated size in bytes.

type ModelManager

type ModelManager struct {
	// contains filtered or unexported fields
}

ModelManager manages a set of loaded models and evicts the least-recently-used one when the GPU memory budget would be exceeded. It is safe for concurrent use.

func NewModelManager

func NewModelManager(loader ModelLoader, cfg Config) (*ModelManager, error)

NewModelManager creates a ModelManager. If cfg.PreloadModels is non-empty, those models are loaded eagerly. An error is returned if any preload fails.

func (*ModelManager) Close

func (m *ModelManager) Close() error

Close unloads all models and releases resources.

func (*ModelManager) Get

func (m *ModelManager) Get(ctx context.Context, modelID string) (io.Closer, error)

Get returns the model for the given ID, loading it if necessary. If loading would exceed the memory budget, the least-recently-used model is evicted first. Get is safe for concurrent callers.
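
The eviction rule described above can be sketched with a doubly linked list and a map. This is an illustration of LRU eviction against a byte budget, not the package's actual implementation: lruBudget, entry, and touch are hypothetical names, and the sketch omits locking and the Close calls that a real manager would make on evicted handles. It also admits a model larger than the whole budget after evicting everything else; the documentation does not say how the package handles that case.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one loaded model in this sketch.
type entry struct {
	id   string
	size int64
}

// lruBudget is a hypothetical tracker, not the package's internals.
type lruBudget struct {
	max   int64
	used  int64
	order *list.List               // front = most recently used
	items map[string]*list.Element // model ID -> node in order
}

func newLRUBudget(max int64) *lruBudget {
	return &lruBudget{max: max, order: list.New(), items: map[string]*list.Element{}}
}

// touch marks id as used, admitting it with the given size if absent.
// It returns the IDs evicted to make room, oldest first.
func (b *lruBudget) touch(id string, size int64) []string {
	if el, ok := b.items[id]; ok {
		b.order.MoveToFront(el)
		return nil
	}
	var evicted []string
	// Evict from the back until the new model fits or nothing is left.
	for b.used+size > b.max && b.order.Len() > 0 {
		e := b.order.Remove(b.order.Back()).(entry)
		delete(b.items, e.id)
		b.used -= e.size
		evicted = append(evicted, e.id)
	}
	b.items[id] = b.order.PushFront(entry{id: id, size: size})
	b.used += size
	return evicted
}

func main() {
	b := newLRUBudget(10)
	b.touch("a", 4)
	b.touch("b", 4)
	b.touch("a", 4)              // "a" becomes most recently used
	fmt.Println(b.touch("c", 4)) // prints [b]
	fmt.Println(b.used)          // prints 8
}
```

Moving an entry to the front on every access is what makes the back of the list the least-recently-used candidate, so both lookup and eviction stay constant time per model.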

func (*ModelManager) Loaded

func (m *ModelManager) Loaded() []string

Loaded returns the IDs of all currently loaded models.

func (*ModelManager) Unload

func (m *ModelManager) Unload(modelID string) error

Unload explicitly removes a model by ID, freeing its resources.

func (*ModelManager) UsedBytes

func (m *ModelManager) UsedBytes() int64

UsedBytes returns the current estimated GPU memory usage.
