Documentation ¶
Index ¶
- Constants
- func ModelDir() string
- func PreferredModelFormat(runtime string, supportedFormats []mv1.ModelFormat) (mv1.ModelFormat, error)
- type Client
- type Manager
- func (m *Manager) GetLLMAddress(modelID string) (string, error)
- func (m *Manager) Initialize(ctx context.Context, apiReader client.Reader, namespace string) error
- func (m *Manager) ListInProgressModels() []string
- func (m *Manager) ListSyncedModelIDs(ctx context.Context) []string
- func (m *Manager) PullModel(ctx context.Context, modelID string) error
- func (m *Manager) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error)
- func (m *Manager) SetupWithManager(mgr ctrl.Manager) error
- type ScalerRegisterer
Constants ¶
const RuntimeNameOllama = "ollama"
RuntimeNameOllama is the name of the Ollama runtime.
const RuntimeNameVLLM = "vllm"
RuntimeNameVLLM is the name of the vLLM runtime.
Variables ¶
This section is empty.
Functions ¶
func ModelDir ¶ added in v0.251.0
func ModelDir() string
ModelDir returns the directory where models are stored.
func PreferredModelFormat ¶ added in v0.301.0
func PreferredModelFormat(runtime string, supportedFormats []mv1.ModelFormat) (mv1.ModelFormat, error)
PreferredModelFormat returns the preferred model format for the given runtime, chosen from supportedFormats.
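The selection rule is not documented here. As a rough sketch of one plausible shape for such a function, the example below walks a per-runtime preference list and returns the first entry that also appears in the supported formats. Everything in it is an assumption made for illustration: `ModelFormat` is a local stand-in for `mv1.ModelFormat`, `formatA` and `formatB` are placeholder values, and the preference order is invented.

```go
package main

import (
	"errors"
	"fmt"
)

// ModelFormat is a hypothetical stand-in for mv1.ModelFormat.
type ModelFormat string

// Placeholder formats; the real enum values are not shown in this document.
const (
	formatA ModelFormat = "format-a"
	formatB ModelFormat = "format-b"
)

// preferences maps a runtime name to an invented preference order.
var preferences = map[string][]ModelFormat{
	"ollama": {formatA, formatB},
	"vllm":   {formatB, formatA},
}

// preferredFormat returns the first format in the runtime's preference
// order that is also present in supported.
func preferredFormat(runtime string, supported []ModelFormat) (ModelFormat, error) {
	order, ok := preferences[runtime]
	if !ok {
		return "", fmt.Errorf("unknown runtime %q", runtime)
	}
	for _, want := range order {
		for _, got := range supported {
			if got == want {
				return want, nil
			}
		}
	}
	return "", errors.New("no supported format matches the runtime's preferences")
}

func main() {
	got, err := preferredFormat("vllm", []ModelFormat{formatA, formatB})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("preferred:", got)
}
```

Returning an error rather than a zero value keeps the "nothing matched" case distinct from a valid format, which matches the `(mv1.ModelFormat, error)` return shape of the documented function.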
Types ¶
type Client ¶
type Client interface {
	GetAddress(name string) string
	DeployRuntime(ctx context.Context, modelID string) (types.NamespacedName, error)
}
Client is the interface for managing runtimes.
func NewOllamaClient ¶
func NewOllamaClient(
	k8sClient client.Client,
	namespace string,
	rconfig config.RuntimeConfig,
	oconfig config.OllamaConfig,
) Client
NewOllamaClient creates a new Ollama runtime client.
func NewVLLMClient ¶ added in v0.249.0
func NewVLLMClient(
	k8sClient client.Client,
	namespace string,
	rconfig config.RuntimeConfig,
	modelContextLengths map[string]int,
	modelClient modelClient,
) Client
NewVLLMClient creates a new vLLM runtime client.
type Manager ¶
type Manager struct {
// contains filtered or unexported fields
}
Manager manages runtimes.
func NewManager ¶
func NewManager(
	k8sClient client.Client,
	rtClient Client,
	autoscaler ScalerRegisterer,
) *Manager
NewManager creates a new runtime manager.
func (*Manager) GetLLMAddress ¶
func (m *Manager) GetLLMAddress(modelID string) (string, error)
GetLLMAddress returns the address of the LLM runtime serving the given model ID.
func (*Manager) Initialize ¶
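func (m *Manager) Initialize(ctx context.Context, apiReader client.Reader, namespace string) error
Initialize initializes ready and pending runtimes. This function is not safe for concurrent use.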
func (*Manager) ListInProgressModels ¶
func (m *Manager) ListInProgressModels() []string
ListInProgressModels returns the list of models that are in progress.
func (*Manager) ListSyncedModelIDs ¶
func (m *Manager) ListSyncedModelIDs(ctx context.Context) []string
ListSyncedModelIDs returns the list of models that are synced.
type ScalerRegisterer ¶ added in v0.303.0
type ScalerRegisterer interface {
	Register(modelID string, target types.NamespacedName)
	Unregister(target types.NamespacedName)
}
ScalerRegisterer is an interface for registering and unregistering scalers.
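To illustrate the contract, here is a small sketch of an in-memory implementation that only tracks which model each target belongs to. It is not the package's autoscaler. `NamespacedName` is a local stand-in for `types.NamespacedName` so the example compiles on its own, and `memoryRegisterer`, `newMemoryRegisterer`, and `modelFor` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// NamespacedName is a local stand-in for
// k8s.io/apimachinery/pkg/types.NamespacedName.
type NamespacedName struct {
	Namespace string
	Name      string
}

// ScalerRegisterer mirrors the documented interface using the stand-in type.
type ScalerRegisterer interface {
	Register(modelID string, target NamespacedName)
	Unregister(target NamespacedName)
}

// memoryRegisterer is a hypothetical implementation that records
// registrations in a map guarded by a mutex.
type memoryRegisterer struct {
	mu      sync.Mutex
	targets map[NamespacedName]string
}

// Compile-time check that memoryRegisterer satisfies ScalerRegisterer.
var _ ScalerRegisterer = (*memoryRegisterer)(nil)

func newMemoryRegisterer() *memoryRegisterer {
	return &memoryRegisterer{targets: map[NamespacedName]string{}}
}

// Register associates target with modelID, replacing any earlier entry.
func (r *memoryRegisterer) Register(modelID string, target NamespacedName) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.targets[target] = modelID
}

// Unregister removes target; removing an unknown target is a no-op.
func (r *memoryRegisterer) Unregister(target NamespacedName) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.targets, target)
}

// modelFor reports the model registered for target, if any.
func (r *memoryRegisterer) modelFor(target NamespacedName) (string, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	id, ok := r.targets[target]
	return id, ok
}

func main() {
	r := newMemoryRegisterer()
	target := NamespacedName{Namespace: "default", Name: "runtime-my-model"}

	r.Register("my-model", target)
	id, ok := r.modelFor(target)
	fmt.Println(id, ok)

	r.Unregister(target)
	_, ok = r.modelFor(target)
	fmt.Println(ok)
}
```

Keying the map by target rather than by model ID follows the shape of Unregister, which receives only the target.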