Documentation
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ChatTemplate ¶ added in v0.253.0
ChatTemplate returns the chat template for the given model.
func IsAWQQuantizedModel ¶ added in v0.272.0
IsAWQQuantizedModel reports whether the model name identifies an AWQ-quantized model.
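The signature and matching rule of IsAWQQuantizedModel are not shown on this page. As a rough illustration of a name-based check, the sketch below assumes that AWQ models carry an "-awq" suffix, compared case-insensitively. The helper looksLikeAWQ and the suffix rule are hypothetical and are not taken from the package.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeAWQ is a hypothetical stand-in, not the package's
// IsAWQQuantizedModel. It assumes AWQ-quantized models are identified
// by an "-awq" suffix on the model name, ignoring case.
func looksLikeAWQ(modelName string) bool {
	return strings.HasSuffix(strings.ToLower(modelName), "-awq")
}

func main() {
	names := []string{
		"TheBloke/Llama-2-7B-Chat-AWQ",
		"meta-llama/Llama-2-7b-chat-hf",
	}
	for _, name := range names {
		fmt.Printf("%s: %v\n", name, looksLikeAWQ(name))
	}
}
```

A suffix check is cheap but only works when model names follow the convention; the real function may use a different rule.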
func ModelFilePath ¶ added in v0.253.0
func ModelFilePath(modelDir, modelName string, format mv1.ModelFormat) (string, error)
ModelFilePath returns the file path of the model.
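The page does not describe how the path is derived from the directory, name, and format. The sketch below shows one plausible layout only: the modelFormat type, its constants, the "model.gguf" file name, and the directory structure are all assumptions made for illustration, not the package's actual behavior or the real mv1.ModelFormat values.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// modelFormat is a hypothetical stand-in for mv1.ModelFormat.
type modelFormat int

const (
	formatUnspecified modelFormat = iota
	formatGGUF
	formatHuggingFace
)

// sketchModelFilePath is a hypothetical helper. It assumes a GGUF model
// is a single file inside a per-model directory, and that a Hugging Face
// model is the per-model directory itself.
func sketchModelFilePath(modelDir, modelName string, format modelFormat) (string, error) {
	switch format {
	case formatGGUF:
		return filepath.Join(modelDir, modelName, "model.gguf"), nil
	case formatHuggingFace:
		return filepath.Join(modelDir, modelName), nil
	default:
		return "", fmt.Errorf("unsupported model format: %d", format)
	}
}

func main() {
	p, err := sketchModelFilePath("/models", "my-model", formatGGUF)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p)
}
```

Returning an error for an unrecognized format mirrors the (string, error) shape of the documented signature.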
func PreferredModelFormat ¶ added in v0.284.0
func PreferredModelFormat(resp *mv1.GetBaseModelPathResponse) (mv1.ModelFormat, error)
PreferredModelFormat returns the preferred model format.
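The selection rule is not documented here. As a sketch of what "preferred" could mean, the example below picks GGUF when it is among the available formats and otherwise falls back to the first one listed. The modelFormat type, the plain slice input, and the GGUF-first preference are assumptions for illustration; the real function takes a *mv1.GetBaseModelPathResponse and may apply a different rule.

```go
package main

import (
	"errors"
	"fmt"
)

// modelFormat is a hypothetical stand-in for mv1.ModelFormat.
type modelFormat string

const (
	formatGGUF        modelFormat = "GGUF"
	formatHuggingFace modelFormat = "HUGGING_FACE"
)

// sketchPreferredFormat is a hypothetical helper. It assumes GGUF is
// preferred when available and the first listed format is used otherwise.
func sketchPreferredFormat(available []modelFormat) (modelFormat, error) {
	if len(available) == 0 {
		return "", errors.New("no model formats available")
	}
	for _, f := range available {
		if f == formatGGUF {
			return f, nil
		}
	}
	return available[0], nil
}

func main() {
	f, err := sketchPreferredFormat([]modelFormat{formatHuggingFace, formatGGUF})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("preferred format:", f)
}
```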
Types ¶
type Manager ¶
type Manager struct {
// contains filtered or unexported fields
}
Manager manages the Ollama service.
TODO(kenji): Refactor this class once we completely switch to the one-model-per-pod implementation where inference-manager-engine doesn't directly run vLLM or Ollama.
func (*Manager) CreateNewModelOfGGUF ¶ added in v0.273.0
CreateNewModelOfGGUF creates a new model with the given name and spec that uses a GGUF model file.
func (*Manager) DownloadAndCreateNewModel ¶ added in v0.273.0
func (m *Manager) DownloadAndCreateNewModel(ctx context.Context, modelName string, resp *mv1.GetBaseModelPathResponse) error
DownloadAndCreateNewModel downloads the model from the given path and creates a new model.
func (*Manager) UpdateModelTemplateToLatest ¶ added in v0.222.0
UpdateModelTemplateToLatest updates the model template to the latest version.