vllm

package
v0.290.0 Latest
Warning

This package is not in the latest version of its module.

Published: Sep 5, 2024 License: Apache-2.0 Imports: 10 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ChatTemplate added in v0.253.0

func ChatTemplate(modelName string) (string, error)

ChatTemplate returns the chat template for the given model.

func IsAWQQuantizedModel added in v0.272.0

func IsAWQQuantizedModel(modelName string) bool

IsAWQQuantizedModel returns true if the model name is an AWQ quantized model.
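Since IsAWQQuantizedModel takes only a model name, the check is presumably name-based. The following is a minimal sketch of one way such a check could work. The function isAWQQuantizedModel and its rule (a case-insensitive "-awq" suffix) are assumptions for illustration; the actual rule used by the package is not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// isAWQQuantizedModel is a hypothetical stand-in for IsAWQQuantizedModel.
// Assumption: AWQ-quantized models carry an "-awq" suffix in their name
// (compared case-insensitively). The real package may use another rule.
func isAWQQuantizedModel(modelName string) bool {
	return strings.HasSuffix(strings.ToLower(modelName), "-awq")
}

func main() {
	// Both model names are made up for the example.
	for _, name := range []string{"example-model-awq", "example-model"} {
		fmt.Printf("%s: awq=%t\n", name, isAWQQuantizedModel(name))
	}
}
```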

func ModelFilePath added in v0.253.0

func ModelFilePath(modelDir, modelName string, format mv1.ModelFormat) (string, error)

ModelFilePath returns the file path of the model.

func PreferredModelFormat added in v0.284.0

func PreferredModelFormat(resp *mv1.GetBaseModelPathResponse) (mv1.ModelFormat, error)

PreferredModelFormat returns the preferred model format.

Types

type Manager

type Manager struct {
	// contains filtered or unexported fields
}

Manager manages the Ollama service.

TODO(kenji): Refactor this type once we completely switch to the one-model-per-pod implementation, where inference-manager-engine doesn't directly run vLLM or Ollama.

func New

func New(modelDir string, s3Client s3Client) *Manager

New returns a new Manager.

func (*Manager) CreateNewModelOfGGUF added in v0.273.0

func (m *Manager) CreateNewModelOfGGUF(modelName string, spec *ollama.ModelSpec) error

CreateNewModelOfGGUF creates a new model with the given name and spec that uses a GGUF model file.

func (*Manager) DownloadAndCreateNewModel added in v0.273.0

func (m *Manager) DownloadAndCreateNewModel(ctx context.Context, modelName string, resp *mv1.GetBaseModelPathResponse) error

DownloadAndCreateNewModel downloads the model from the given path and creates a new model.

func (*Manager) UpdateModelTemplateToLatest added in v0.222.0

func (m *Manager) UpdateModelTemplateToLatest(modelName string) error

UpdateModelTemplateToLatest updates the model template to the latest.
