vllm

package
v0.275.0 Latest
Warning

This package is not in the latest version of its module.
Published: Aug 31, 2024 License: Apache-2.0 Imports: 12 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ChatTemplate added in v0.253.0

func ChatTemplate(modelName string) (string, error)

ChatTemplate returns the chat template for the given model.

func IsAWQQuantizedModel added in v0.272.0

func IsAWQQuantizedModel(modelName string) bool

IsAWQQuantizedModel reports whether the given model name denotes an AWQ-quantized model.
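
The documentation does not say how the model name is inspected. As a minimal sketch, assuming a naming convention in which AWQ-quantized models carry an `-awq` suffix (an assumption, not confirmed by this page), a name-based check could look like the following. The function `isAWQName` and the model names are hypothetical stand-ins, not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// isAWQName is a hypothetical stand-in for IsAWQQuantizedModel.
// It assumes AWQ models are marked with an "-awq" suffix; the real
// function may use a different rule.
func isAWQName(modelName string) bool {
	return strings.HasSuffix(strings.ToLower(modelName), "-awq")
}

func main() {
	for _, name := range []string{"example-model-awq", "example-model"} {
		fmt.Printf("%s: %t\n", name, isAWQName(name))
	}
}
```

A check of this kind only looks at the name, so it cannot detect a quantized model whose name does not follow the assumed convention.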

func ModelFilePath added in v0.253.0

func ModelFilePath(modelDir, modelName string) string

ModelFilePath returns the file path of the model named modelName under the directory modelDir.
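
The exact on-disk layout is not documented here. As a rough sketch, assuming the model file sits directly under the model directory with the model name as its file name (an assumption), the path could be built with `filepath.Join`. The function `modelFilePath` and the example values are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// modelFilePath is a hypothetical stand-in for ModelFilePath.
// It assumes the file lives at <modelDir>/<modelName>; the real
// function may add an extension or subdirectory.
func modelFilePath(modelDir, modelName string) string {
	return filepath.Join(modelDir, modelName)
}

func main() {
	fmt.Println(modelFilePath("/models", "example-model"))
}
```

Using `filepath.Join` rather than string concatenation cleans up duplicate separators and uses the separator of the host operating system.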

Types

type Manager

type Manager struct {
	// contains filtered or unexported fields
}

Manager manages the vLLM service.

TODO(kenji): Refactor this type once we completely switch to the one-model-per-pod implementation, where inference-manager-engine doesn't directly run vLLM or Ollama.

func New

func New(c *config.Config, modelDir string, s3Client s3Client) *Manager

New returns a new Manager.

func (*Manager) CreateNewModelOfGGUF added in v0.273.0

func (m *Manager) CreateNewModelOfGGUF(modelName string, spec *manager.ModelSpec) error

CreateNewModelOfGGUF creates a new model with the given name and spec that uses a GGUF model file.

func (*Manager) DownloadAndCreateNewModel added in v0.273.0

func (m *Manager) DownloadAndCreateNewModel(modelName string, resp *mv1.GetBaseModelPathResponse) error

DownloadAndCreateNewModel downloads the model from the given path and creates a new model.

func (*Manager) IsReady added in v0.212.0

func (m *Manager) IsReady() (bool, string)

IsReady returns true if the processor is ready. If not, it returns a message describing why it is not ready.

func (*Manager) Run

func (m *Manager) Run() error

Run starts the vLLM service.

func (*Manager) UpdateModelTemplateToLatest added in v0.222.0

func (m *Manager) UpdateModelTemplateToLatest(modelName string) error

UpdateModelTemplateToLatest updates the template of the given model to the latest version.

func (*Manager) WaitForReady

func (m *Manager) WaitForReady() error

WaitForReady waits for the vLLM service to be ready.
