infprocessor

package
v0.220.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 13, 2024 License: Apache-2.0 Imports: 11 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type EngineStatus added in v0.215.0

type EngineStatus struct {
	RegisteredModelIDs []string      `json:"registeredModelIds"`
	InProgressModelIDs []string      `json:"inProgressModelIds"`
	Tasks              []*TaskStatus `json:"tasks"`
}

EngineStatus is the status of an engine.

type P

type P struct {
	// contains filtered or unexported fields
}

P processes inference tasks.

func NewP

func NewP(queue *TaskQueue, engineRouter engineRouter) *P

NewP creates a new processor.

func (*P) AddOrUpdateEngineStatus added in v0.115.0

func (p *P) AddOrUpdateEngineStatus(
	srv engineCommunicator,
	engineStatus *v1.EngineStatus,
	clusterInfo *auth.ClusterInfo,
)

AddOrUpdateEngineStatus adds or updates the engine status.

func (*P) DumpStatus added in v0.215.0

func (p *P) DumpStatus() *Status

DumpStatus dumps the status of the processor.

func (*P) MaxInProgressTaskDuration added in v0.163.0

func (p *P) MaxInProgressTaskDuration() time.Duration

MaxInProgressTaskDuration returns the maximum duration of in-progress tasks.

func (*P) NumEnginesByTenantID added in v0.167.0

func (p *P) NumEnginesByTenantID() map[string]int

NumEnginesByTenantID returns the number of engines by tenant ID.

func (*P) NumInProgressTasks added in v0.163.0

func (p *P) NumInProgressTasks() int

NumInProgressTasks returns the number of in-progress tasks.

func (*P) NumQueuedTasks added in v0.163.0

func (p *P) NumQueuedTasks() int32

NumQueuedTasks returns the number of queued tasks.

func (*P) ProcessTaskResult added in v0.121.0

func (p *P) ProcessTaskResult(
	taskResult *v1.TaskResult,
	clusterInfo *auth.ClusterInfo,
) error

ProcessTaskResult processes the task result.

func (*P) RemoveEngine added in v0.135.0

func (p *P) RemoveEngine(engineID string, clusterInfo *auth.ClusterInfo)

RemoveEngine removes the engine.

func (*P) Run

func (p *P) Run(ctx context.Context) error

Run runs the processor.

type Status added in v0.215.0

type Status struct {
	Tenants map[string]*TenantStatus `json:"tenants"`
}

Status is the status of the processor.

type Task

type Task struct {
	ID       string
	TenantID string

	Req    *v1.CreateChatCompletionRequest
	Header http.Header
	RespCh chan *http.Response
	ErrCh  chan error

	EngineID string

	CreatedAt time.Time
	// contains filtered or unexported fields
}

Task is an inference task. TODO(kenji): Consider preserving the request context as well.

func (*Task) WaitForCompletion

func (t *Task) WaitForCompletion(ctx context.Context) (*http.Response, error)

WaitForCompletion waits for the completion of the task.

type TaskQueue

type TaskQueue struct {
	// contains filtered or unexported fields
}

TaskQueue is a queue for inference tasks.

func NewTaskQueue

func NewTaskQueue() *TaskQueue

NewTaskQueue creates a new task queue.

func (*TaskQueue) Dequeue

func (q *TaskQueue) Dequeue(ctx context.Context) (*Task, error)

Dequeue removes a task from the queue.

func (*TaskQueue) Enqueue

func (q *TaskQueue) Enqueue(t *Task)

Enqueue inserts a task into the queue.

type TaskStatus added in v0.215.0

type TaskStatus struct {
	ID      string `json:"id"`
	ModelID string `json:"modelId"`
}

TaskStatus is the status of a task.

type TenantStatus added in v0.215.0

type TenantStatus struct {
	Engines map[string]*EngineStatus `json:"engines"`
}

TenantStatus is the status of a tenant.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL