adaptive

package
v1.26.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 27, 2026 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Overview

Package adaptive implements an adaptive batch scheduler that dynamically adjusts batch size based on queue depth and latency targets to maximize throughput while meeting latency SLOs. (Stability: beta)

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Batcher

type Batcher struct {
	// contains filtered or unexported fields
}

Batcher dynamically adjusts batch size based on queue depth and latency.

func New

func New(config Config, handler Handler) *Batcher

New creates a new adaptive batcher.

func (*Batcher) BatchSize

func (b *Batcher) BatchSize() int

BatchSize returns the current dynamically-adjusted batch size.

func (*Batcher) LatencyEMA

func (b *Batcher) LatencyEMA() float64

LatencyEMA returns the current exponential moving average latency in ms.

func (*Batcher) Start

func (b *Batcher) Start()

Start begins the batch collection loop.

func (*Batcher) Stop

func (b *Batcher) Stop()

Stop gracefully shuts down the batcher.

func (*Batcher) Submit

func (b *Batcher) Submit(ctx context.Context, req Request) (Result, error)

Submit enqueues a request and blocks until it completes or the context expires.

type Config

type Config struct {
	// MinBatchSize is the smallest batch the batcher will form (default 1).
	MinBatchSize int
	// MaxBatchSize is the largest batch the batcher will form (default 32).
	MaxBatchSize int
	// TargetLatencyMS is the latency SLO in milliseconds. When the
	// exponential moving average of batch latency exceeds this target,
	// the batcher decreases the batch size.
	TargetLatencyMS float64
	// QueueTimeoutMS is how long (in ms) the batcher waits to fill a batch
	// before dispatching whatever it has collected (default 50).
	QueueTimeoutMS float64
}

Config controls the adaptive batcher.

type Handler

type Handler func(ctx context.Context, reqs []Request) []Result

Handler processes a batch of requests and returns results.

type Request

type Request struct {
	ID     string
	Tokens []int
}

Request represents an incoming inference request.

type Result

type Result struct {
	RequestID string
	Output    []int
	Err       error
}

Result holds the output for a completed request.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL