Documentation
Index
- func ChatTemplate(modelName string) (string, error)
- func IsAWQQuantizedModel(modelName string) bool
- func ModelFilePath(modelDir, modelName string) string
- type Manager
- func (m *Manager) CreateNewModelOfGGUF(modelName string, spec *manager.ModelSpec) error
- func (m *Manager) DownloadAndCreateNewModel(modelName string, resp *mv1.GetBaseModelPathResponse) error
- func (m *Manager) IsReady() (bool, string)
- func (m *Manager) Run() error
- func (m *Manager) UpdateModelTemplateToLatest(modelName string) error
- func (m *Manager) WaitForReady() error
Constants
This section is empty.
Variables
This section is empty.
Functions
func ChatTemplate (added in v0.253.0)
ChatTemplate returns the chat template for the given model.
func IsAWQQuantizedModel (added in v0.272.0)
IsAWQQuantizedModel returns true if the model name identifies an AWQ-quantized model.
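The documentation does not say how the name is inspected. A minimal sketch of one plausible approach follows, assuming that AWQ models carry an "awq" token somewhere in their name. The helper looksAWQQuantized and the model names are hypothetical and are not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// looksAWQQuantized is a hypothetical helper, not the package's
// IsAWQQuantizedModel. It assumes (without confirmation from the docs)
// that AWQ-quantized models contain "awq" in their name, in any letter case.
func looksAWQQuantized(modelName string) bool {
	return strings.Contains(strings.ToLower(modelName), "awq")
}

func main() {
	// Both model names are made up for illustration.
	for _, name := range []string{"example-org-llama-awq", "example-org-llama"} {
		fmt.Printf("%s: %v\n", name, looksAWQQuantized(name))
	}
}
```

A substring check like this is cheap but can give false positives for names that merely happen to contain those letters, so the real function may well apply a stricter rule.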
func ModelFilePath (added in v0.253.0)
ModelFilePath returns the file path of the model.
Types
type Manager
type Manager struct {
// contains filtered or unexported fields
}
Manager manages the Ollama service.
TODO(kenji): Refactor this class once we completely switch to the one-model-per-pod implementation where inference-manager-engine doesn't directly run vLLM or Ollama.
func (*Manager) CreateNewModelOfGGUF (added in v0.273.0)
CreateNewModelOfGGUF creates a new model with the given name and spec that uses a GGUF model file.
func (*Manager) DownloadAndCreateNewModel (added in v0.273.0)
func (m *Manager) DownloadAndCreateNewModel(modelName string, resp *mv1.GetBaseModelPathResponse) error
DownloadAndCreateNewModel downloads the model from the given path and creates a new model.
func (*Manager) IsReady (added in v0.212.0)
IsReady returns true if the processor is ready. If not, it returns a message describing why it is not ready.
func (*Manager) UpdateModelTemplateToLatest (added in v0.222.0)
UpdateModelTemplateToLatest updates the model template to the latest version.
func (*Manager) WaitForReady
WaitForReady waits for the vLLM service to be ready.