Documentation ¶
Overview ¶
Package multimodel provides a ModelManager that loads and unloads models on demand. When loading a model would exceed the GPU memory budget, the least-recently-used (LRU) model is evicted first.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	// MaxGPUMemoryBytes is the memory budget. When loading a new model would
	// exceed this limit, the least-recently-used model is evicted first.
	MaxGPUMemoryBytes int64

	// PreloadModels lists model IDs to load eagerly at creation time.
	PreloadModels []string
}
Config controls the behavior of a ModelManager.
type ModelLoader ¶
ModelLoader loads a model by ID and returns its handle and estimated size in bytes.
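The declaration of ModelLoader is not reproduced on this page. As a rough illustration only, a loader matching the description above might look like the following sketch. The names sketchLoader, fakeModel, and newStaticLoader, and the (handle, size, error) return shape, are assumptions made for this example and may not match the package's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeModel is a hypothetical stand-in for a real model handle.
type fakeModel struct {
	ID string
}

// sketchLoader is an assumed shape for a loader: given a model ID, it returns
// a handle and an estimated size in bytes. The real ModelLoader may differ.
type sketchLoader func(modelID string) (handle *fakeModel, sizeBytes int64, err error)

// newStaticLoader builds a loader backed by a fixed table of model sizes.
func newStaticLoader(sizes map[string]int64) sketchLoader {
	return func(modelID string) (*fakeModel, int64, error) {
		size, ok := sizes[modelID]
		if !ok {
			return nil, 0, errors.New("unknown model: " + modelID)
		}
		return &fakeModel{ID: modelID}, size, nil
	}
}

func main() {
	load := newStaticLoader(map[string]int64{"small": 1 << 30, "large": 8 << 30})

	m, size, err := load("small")
	fmt.Println(m.ID, size, err)

	_, _, err = load("missing")
	fmt.Println(err)
}
```

Returning the estimated size alongside the handle lets the caller account for memory without inspecting the model itself.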
type ModelManager ¶
type ModelManager struct {
// contains filtered or unexported fields
}
ModelManager manages a set of loaded models, evicting the least-recently-used model when the GPU memory budget would be exceeded. It is safe for concurrent use.
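The documentation states that the manager is safe for concurrent use but does not say how this is achieved. One common approach, sketched below under that caveat, is to guard shared state with a single mutex. The safeRegistry type and hammer helper are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// safeRegistry is a hypothetical registry guarded by a single mutex.
type safeRegistry struct {
	mu     sync.Mutex
	loaded map[string]bool
	loads  int // number of times a load actually ran
}

// get "loads" id on first use. Holding the lock across the load keeps the
// sketch simple: concurrent callers for the same id trigger only one load.
func (r *safeRegistry) get(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.loaded[id] {
		return
	}
	r.loaded[id] = true
	r.loads++
}

// hammer calls get for the same id from n goroutines and returns the number
// of loads that ran.
func hammer(n int) int {
	r := &safeRegistry{loaded: make(map[string]bool)}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.get("shared")
		}()
	}
	wg.Wait()
	return r.loads
}

func main() {
	fmt.Println(hammer(16))
}
```

Holding one lock across a slow load serializes all callers. A real implementation might instead use per-model locking so that unrelated models can load in parallel.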
func NewModelManager ¶
func NewModelManager(loader ModelLoader, cfg Config) (*ModelManager, error)
NewModelManager creates a ModelManager. If cfg.PreloadModels is non-empty, those models are loaded eagerly. An error is returned if any preload fails.
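A minimal sketch of eager preloading that stops at the first failure, in the spirit of the behavior described above. The preload helper, the loadFunc type, and demoLoad are hypothetical. Whether the real constructor unloads already-loaded models after a failed preload is not specified on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// loadFunc is a hypothetical loader: it returns the estimated size of the
// model in bytes, or an error if the model cannot be loaded.
type loadFunc func(modelID string) (int64, error)

// preload loads each ID in order and stops at the first failure, wrapping the
// error with the ID that caused it. On success it returns the size of every
// loaded model.
func preload(load loadFunc, ids []string) (map[string]int64, error) {
	loaded := make(map[string]int64, len(ids))
	for _, id := range ids {
		size, err := load(id)
		if err != nil {
			return nil, fmt.Errorf("preload %q: %w", id, err)
		}
		loaded[id] = size
	}
	return loaded, nil
}

var errNotFound = errors.New("model not found")

// demoLoad knows about a single model, "encoder".
func demoLoad(modelID string) (int64, error) {
	if modelID == "encoder" {
		return 512, nil
	}
	return 0, errNotFound
}

func main() {
	_, err := preload(demoLoad, []string{"encoder", "decoder"})
	fmt.Println(err)
}
```

Wrapping the error with the failing ID tells the caller which entry in the preload list to fix.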
func (*ModelManager) Close ¶
func (m *ModelManager) Close() error
Close unloads all models and releases resources.
func (*ModelManager) Get ¶
Get returns the model for the given ID, loading it if necessary. If loading would exceed the memory budget, the least-recently-used model is evicted first. Get is safe for concurrent callers.
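The eviction policy described above can be sketched with a doubly linked list and a map. This is an illustration only, not the package's implementation. The lruBudget type is hypothetical, is not safe for concurrent use, and assumes the model size is known before loading. Rejecting a model larger than the whole budget is also an assumption, since this page does not say how that case is handled.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one resident model in the sketch.
type entry struct {
	id   string
	size int64
}

// lruBudget is a hypothetical, single-goroutine model of the eviction policy.
type lruBudget struct {
	max   int64
	used  int64
	order *list.List               // front = most recently used
	items map[string]*list.Element // id -> element in order
}

func newLRUBudget(max int64) *lruBudget {
	return &lruBudget{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

// get marks id as most recently used, "loading" it with the given size on a
// miss. Before loading, it evicts from the back until the new model fits.
// It returns the IDs evicted by this call.
func (c *lruBudget) get(id string, size int64) (evicted []string, err error) {
	if el, ok := c.items[id]; ok {
		c.order.MoveToFront(el)
		return nil, nil
	}
	if size > c.max {
		return nil, fmt.Errorf("model %q (%d bytes) exceeds budget of %d bytes", id, size, c.max)
	}
	for c.used+size > c.max {
		back := c.order.Back()
		victim := back.Value.(entry)
		c.order.Remove(back)
		delete(c.items, victim.id)
		c.used -= victim.size
		evicted = append(evicted, victim.id)
	}
	c.items[id] = c.order.PushFront(entry{id: id, size: size})
	c.used += size
	return evicted, nil
}

func main() {
	c := newLRUBudget(10)
	c.get("a", 4)
	c.get("b", 4)
	c.get("a", 4)               // touch a, so b becomes least recently used
	evicted, _ := c.get("c", 4) // 4+4+4 > 10, so b is evicted
	fmt.Println(evicted, c.used)
}
```

Pairing the list with a map makes both the recency update and the eviction of the oldest entry constant-time operations.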
func (*ModelManager) Loaded ¶
func (m *ModelManager) Loaded() []string
Loaded returns the IDs of all currently loaded models.
func (*ModelManager) Unload ¶
func (m *ModelManager) Unload(modelID string) error
Unload explicitly removes a model by ID, freeing its resources.
func (*ModelManager) UsedBytes ¶
func (m *ModelManager) UsedBytes() int64
UsedBytes returns the current estimated GPU memory usage in bytes.