Documentation ¶
Overview ¶
Package registry provides a bbolt-backed model version registry for tracking, activating, and managing model versions used by the serving layer.
Index ¶
- type ABConfig
- type ABRouter
- type ABStats
- type CanaryConfig
- type CanaryController
- func (c *CanaryController) CurrentWeight() float64
- func (c *CanaryController) Promote() error
- func (c *CanaryController) RecordFailure()
- func (c *CanaryController) RecordSuccess()
- func (c *CanaryController) Rollback() error
- func (c *CanaryController) Start(ctx context.Context) error
- func (c *CanaryController) Step()
- func (c *CanaryController) Stop() error
- func (c *CanaryController) SuccessRate() float64
- type LatencyStats
- type MetricsStore
- type ModelVersion
- type Registry
- type ShadowConfig
- type ShadowMetrics
- type ShadowResult
- type ShadowRunner
- type VersionMetrics
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type ABRouter ¶
type ABRouter struct {
// contains filtered or unexported fields
}
ABRouter routes requests between a champion and challenger model using deterministic hashing for consistent assignment.
func NewABRouter ¶
NewABRouter creates an ABRouter with the given configuration.
func (*ABRouter) Route ¶
Route returns the model ID to use for the given request. The assignment is deterministic: the same requestID always maps to the same model.
func (*ABRouter) UpdateWeights ¶
UpdateWeights atomically updates the challenger traffic weight. It returns an error if weight is not in [0, 1].
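The hashing scheme behind Route is not shown on this page. As a minimal sketch of deterministic assignment, the example below assumes FNV-1a hashing of the request ID into a bucket in [0, 1), which is then compared against the challenger weight. The names `bucket` and `route` are hypothetical and are not part of the package.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucket maps a request ID to a stable value in [0, 1).
// FNV-1a is an assumption; the package's actual hash is not documented.
func bucket(requestID string) float64 {
	h := fnv.New32a()
	h.Write([]byte(requestID)) // Write on a hash.Hash never returns an error
	return float64(h.Sum32()) / float64(1<<32)
}

// route picks the challenger when the request's bucket falls below weight,
// so a weight of 0 sends nothing to the challenger and 1 sends everything.
func route(requestID, championID, challengerID string, weight float64) string {
	if bucket(requestID) < weight {
		return challengerID
	}
	return championID
}

func main() {
	for _, id := range []string{"req-1", "req-2", "req-3"} {
		fmt.Println(id, route(id, "champion", "challenger", 0.2))
	}
}
```

Because the bucket depends only on the request ID, raising the weight moves additional requests to the challenger without reshuffling the ones already assigned to it.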
type CanaryConfig ¶
type CanaryConfig struct {
// ModelID identifies the model version being canaried.
ModelID string
// InitialWeight is the starting traffic weight (0.0–1.0).
InitialWeight float64
// MaxWeight is the maximum weight before requiring explicit promotion.
MaxWeight float64
// StepSize is the weight increment applied each step.
StepSize float64
// StepInterval is the duration between automatic weight increases.
StepInterval time.Duration
// SuccessThreshold is the minimum success rate (0.0–1.0) required to step up.
SuccessThreshold float64
}
CanaryConfig configures a canary release rollout.
type CanaryController ¶
type CanaryController struct {
// contains filtered or unexported fields
}
CanaryController manages gradual traffic ramp-up for a model version. It automatically increases the traffic weight when the observed success rate meets or exceeds the configured threshold.
func NewCanaryController ¶
func NewCanaryController(cfg CanaryConfig) *CanaryController
NewCanaryController creates a CanaryController with the given configuration.
func (*CanaryController) CurrentWeight ¶
func (c *CanaryController) CurrentWeight() float64
CurrentWeight returns the current traffic weight.
func (*CanaryController) Promote ¶
func (c *CanaryController) Promote() error
Promote sets the canary weight to 1.0, directing all traffic to this version.
func (*CanaryController) RecordFailure ¶
func (c *CanaryController) RecordFailure()
RecordFailure increments the failure counter. It is safe for concurrent use.
func (*CanaryController) RecordSuccess ¶
func (c *CanaryController) RecordSuccess()
RecordSuccess increments the success counter. It is safe for concurrent use.
func (*CanaryController) Rollback ¶
func (c *CanaryController) Rollback() error
Rollback sets the canary weight to 0.0, removing all traffic from this version.
func (*CanaryController) Start ¶
func (c *CanaryController) Start(ctx context.Context) error
Start begins a background goroutine that periodically steps up the canary weight when the success rate meets the threshold.
func (*CanaryController) Step ¶
func (c *CanaryController) Step()
Step performs a single canary step: if the success rate meets the threshold and weight is below max, it increases weight by StepSize (capped at MaxWeight).
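The step rule described above can be sketched as follows. The `canary` type and its fields are hypothetical stand-ins for the controller's unexported state, and the sketch ignores locking for brevity.

```go
package main

import (
	"fmt"
	"math"
)

// canary is a hypothetical stand-in for CanaryController's internal state.
type canary struct {
	weight, maxWeight, stepSize, threshold float64
	successes, failures                    int64
}

// successRate returns successes / total, or 0 when nothing was recorded.
func (c *canary) successRate() float64 {
	total := c.successes + c.failures
	if total == 0 {
		return 0
	}
	return float64(c.successes) / float64(total)
}

// step raises weight by stepSize, capped at maxWeight, but only when the
// success rate meets the threshold and weight is still below the cap.
func (c *canary) step() {
	if c.successRate() >= c.threshold && c.weight < c.maxWeight {
		c.weight = math.Min(c.weight+c.stepSize, c.maxWeight)
	}
}

func main() {
	c := &canary{weight: 0.25, maxWeight: 0.625, stepSize: 0.25, threshold: 0.9}
	c.successes, c.failures = 9, 1
	c.step()
	fmt.Println(c.weight)
	c.step() // the second step is capped at maxWeight
	fmt.Println(c.weight)
}
```

Under these assumptions, a canary that has recorded no requests has a success rate of 0 and therefore does not advance unless the threshold is also 0.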
func (*CanaryController) Stop ¶
func (c *CanaryController) Stop() error
Stop halts the background stepping goroutine.
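Start and Stop together describe a background loop with an explicit shutdown. One common way to build such a loop in Go is a ticker combined with a cancellable context, sketched below. The `stepper` type is hypothetical, and the real controller may be structured differently.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// stepper is a hypothetical stand-in for the controller's background loop.
type stepper struct {
	cancel context.CancelFunc
	wg     sync.WaitGroup
}

// start launches a goroutine that calls step once per interval until the
// parent context is cancelled or stop is called.
func (s *stepper) start(ctx context.Context, interval time.Duration, step func()) {
	ctx, s.cancel = context.WithCancel(ctx)
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				step()
			}
		}
	}()
}

// stop cancels the loop and waits for the goroutine to exit.
func (s *stepper) stop() {
	s.cancel()
	s.wg.Wait()
}

// demo runs the loop until step has been observed three times, then stops it.
func demo() int {
	var s stepper
	fired := make(chan struct{}, 1)
	s.start(context.Background(), time.Millisecond, func() {
		select {
		case fired <- struct{}{}: // notify the observer
		default: // never block the loop if nobody is listening
		}
	})
	n := 0
	for n < 3 {
		<-fired
		n++
	}
	s.stop()
	return n
}

func main() {
	fmt.Println(demo())
}
```

Waiting on the WaitGroup in `stop` ensures that no step runs after `stop` returns, which matters if the caller goes on to call Promote or Rollback.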
func (*CanaryController) SuccessRate ¶
func (c *CanaryController) SuccessRate() float64
SuccessRate returns the ratio of successes to total requests. It returns 0 if no requests have been recorded.
type LatencyStats ¶
LatencyStats holds latency percentiles in milliseconds.
type MetricsStore ¶
type MetricsStore struct {
// contains filtered or unexported fields
}
MetricsStore is an in-memory, thread-safe store of per-model performance metrics.
func NewMetricsStore ¶
func NewMetricsStore() *MetricsStore
NewMetricsStore returns a ready-to-use MetricsStore.
func (*MetricsStore) All ¶
func (s *MetricsStore) All() []VersionMetrics
All returns metrics for every registered model.
func (*MetricsStore) GetMetrics ¶
func (s *MetricsStore) GetMetrics(modelID string) (VersionMetrics, bool)
GetMetrics computes and returns the current metrics for the given model.
func (*MetricsStore) Record ¶
func (s *MetricsStore) Record(modelID string, latencyMs float64, isError bool)
Record appends a latency sample and increments counters for the given model.
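How GetMetrics derives percentiles from the recorded samples is not specified here. The sketch below assumes the nearest-rank method over all samples and uses a hypothetical `store` type that mirrors the Record signature.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"sync"
)

// store is a hypothetical, simplified stand-in for MetricsStore.
type store struct {
	mu      sync.Mutex
	samples map[string][]float64 // latency samples in ms, per model
	errors  map[string]int64     // error count, per model
}

func newStore() *store {
	return &store{samples: map[string][]float64{}, errors: map[string]int64{}}
}

// record appends a latency sample and counts the error, if any.
func (s *store) record(modelID string, latencyMs float64, isError bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.samples[modelID] = append(s.samples[modelID], latencyMs)
	if isError {
		s.errors[modelID]++
	}
}

// percentile returns the nearest-rank p-th percentile (0 < p <= 100).
// ok is false when the model has no samples.
func (s *store) percentile(modelID string, p float64) (v float64, ok bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	src := s.samples[modelID]
	if len(src) == 0 {
		return 0, false
	}
	sorted := append([]float64(nil), src...) // copy so recorded order is untouched
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1], true
}

// errorRate returns errors / requests, or 0 when there are no requests.
func (s *store) errorRate(modelID string) float64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := len(s.samples[modelID])
	if n == 0 {
		return 0
	}
	return float64(s.errors[modelID]) / float64(n)
}

func main() {
	s := newStore()
	for i, ms := range []float64{40, 10, 30, 20} {
		s.record("ranker-v2", ms, i == 0)
	}
	p50, _ := s.percentile("ranker-v2", 50)
	fmt.Println(p50, s.errorRate("ranker-v2"))
}
```

Keeping every sample makes the percentiles exact but lets memory grow with traffic; a production store might bound the window or use a sketch data structure instead.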
func (*MetricsStore) Reset ¶
func (s *MetricsStore) Reset(modelID string)
Reset clears all recorded data for the given model.
type ModelVersion ¶
type ModelVersion struct {
ID string `json:"id"`
Name string `json:"name"`
Version string `json:"version"`
Path string `json:"path"`
Format string `json:"format"`
CreatedAt time.Time `json:"created_at"`
Metrics map[string]float64 `json:"metrics,omitempty"`
Active bool `json:"active"`
}
ModelVersion describes a single registered model version.
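The struct tags indicate how a ModelVersion looks when encoded as JSON; whether the registry stores versions in exactly this form is not stated here. The sketch below copies the struct so that it compiles on its own, and every field value in `sample` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ModelVersion is copied from the documented struct so the example is
// self-contained.
type ModelVersion struct {
	ID        string             `json:"id"`
	Name      string             `json:"name"`
	Version   string             `json:"version"`
	Path      string             `json:"path"`
	Format    string             `json:"format"`
	CreatedAt time.Time          `json:"created_at"`
	Metrics   map[string]float64 `json:"metrics,omitempty"`
	Active    bool               `json:"active"`
}

// encode marshals a version to JSON text.
func encode(mv ModelVersion) (string, error) {
	b, err := json.Marshal(mv)
	return string(b), err
}

// sample returns an illustrative version; all values are hypothetical.
func sample() ModelVersion {
	return ModelVersion{
		ID:        "ranker-v2",
		Name:      "ranker",
		Version:   "2.0.0",
		Path:      "/models/ranker-v2.onnx",
		Format:    "onnx",
		CreatedAt: time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC),
	}
}

func main() {
	s, err := encode(sample())
	if err != nil {
		panic(err)
	}
	// The nil Metrics map is omitted because of the omitempty tag;
	// Active is always written, even when false.
	fmt.Println(s)
}
```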
type Registry ¶
type Registry struct {
// contains filtered or unexported fields
}
Registry is a bbolt-backed store for model versions.
func NewRegistry ¶
NewRegistry opens (or creates) a bbolt database at dbPath and returns a Registry.
func (*Registry) Activate ¶
Activate marks the version with the given id as active and deactivates all other versions that share the same Name.
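The activation rule (at most one active version per Name) can be illustrated with an in-memory map in place of the bbolt store. The `version` type and `activate` function are hypothetical, and returning errNotFound for an unknown id is an assumption, since only Delete and GetActive are documented as returning it.

```go
package main

import (
	"errors"
	"fmt"
)

// version mirrors the ModelVersion fields that matter for activation.
type version struct {
	ID, Name string
	Active   bool
}

// errNotFound stands in for the package's unexported error of the same name.
var errNotFound = errors.New("model version not found")

// activate marks id as active and clears Active on every other version
// that shares its Name. Versions with a different Name are left untouched.
func activate(versions map[string]*version, id string) error {
	target, ok := versions[id]
	if !ok {
		return errNotFound // assumption: unknown ids are reported this way
	}
	for _, v := range versions {
		if v.Name == target.Name {
			v.Active = v.ID == id
		}
	}
	return nil
}

func main() {
	versions := map[string]*version{
		"ranker-v1": {ID: "ranker-v1", Name: "ranker", Active: true},
		"ranker-v2": {ID: "ranker-v2", Name: "ranker"},
		"embed-v1":  {ID: "embed-v1", Name: "embed", Active: true},
	}
	if err := activate(versions, "ranker-v2"); err != nil {
		fmt.Println("activate failed:", err)
		return
	}
	for _, id := range []string{"ranker-v1", "ranker-v2", "embed-v1"} {
		fmt.Println(id, versions[id].Active)
	}
}
```

In a transactional store such as bbolt, the lookup and both flag changes would normally happen inside a single read-write transaction so that readers never observe two active versions for one name.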
func (*Registry) Delete ¶
Delete removes the model version with the given id. It returns errNotFound if the id does not exist.
func (*Registry) GetActive ¶
func (r *Registry) GetActive(name string) (*ModelVersion, error)
GetActive returns the currently active version for the given model name. It returns errNotFound if no active version exists.
func (*Registry) List ¶
func (r *Registry) List(name string) ([]ModelVersion, error)
List returns all versions registered under the given model name.
func (*Registry) Register ¶
func (r *Registry) Register(mv ModelVersion) error
Register stores a new model version. It returns an error if a version with the same ID already exists.
type ShadowConfig ¶
type ShadowConfig struct {
ChampionID string
ChallengerID string
SampleRate float64 // 0.0–1.0: fraction of requests that also run the challenger.
}
ShadowConfig configures shadow mode inference where a challenger model runs alongside the champion model on a sample of requests.
type ShadowMetrics ¶
ShadowMetrics tracks aggregate shadow inference counters.
type ShadowResult ¶
type ShadowResult struct {
ChampionOutput []float32
ChallengerOutput []float32
LatencyDelta time.Duration // challenger_latency - champion_latency
Sampled bool
}
ShadowResult holds the output of a shadow inference run.
type ShadowRunner ¶
type ShadowRunner struct {
// contains filtered or unexported fields
}
ShadowRunner executes shadow mode inference.
func NewShadowRunner ¶
func NewShadowRunner(cfg ShadowConfig) *ShadowRunner
NewShadowRunner creates a ShadowRunner with the given configuration.
func (*ShadowRunner) Metrics ¶
func (s *ShadowRunner) Metrics() ShadowMetrics
Metrics returns aggregate shadow inference counters.
func (*ShadowRunner) RunShadow ¶
func (s *ShadowRunner) RunShadow(ctx context.Context, input []float32, inferFn func(modelID string, input []float32) ([]float32, error)) (*ShadowResult, error)
RunShadow runs the champion model and, for sampled requests, the challenger model concurrently. The champion result is always returned; if the challenger fails, its error is logged and does not affect the returned result.
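A rough sketch of this control flow is shown below. It makes several assumptions: sampling uses `math/rand`, a champion error is returned to the caller, the sketch waits for the challenger before returning, and the context parameter is omitted for brevity. The names `runShadow`, `result`, `timed`, and `demoInfer` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"math/rand"
	"time"
)

type inferFunc func(modelID string, input []float32) ([]float32, error)

// result mirrors the documented ShadowResult fields.
type result struct {
	ChampionOutput   []float32
	ChallengerOutput []float32
	LatencyDelta     time.Duration // challenger latency - champion latency
	Sampled          bool
}

// timed holds one model's output, error, and wall-clock latency.
type timed struct {
	out []float32
	err error
	dur time.Duration
}

func run(modelID string, input []float32, infer inferFunc) timed {
	start := time.Now()
	out, err := infer(modelID, input)
	return timed{out, err, time.Since(start)}
}

// runShadow always runs the champion and, for a sampled fraction of calls,
// runs the challenger in a separate goroutine at the same time.
func runShadow(championID, challengerID string, sampleRate float64, input []float32, infer inferFunc) (*result, error) {
	sampled := rand.Float64() < sampleRate
	var ch chan timed
	if sampled {
		ch = make(chan timed, 1)
		go func() { ch <- run(challengerID, input, infer) }()
	}
	champ := run(championID, input, infer)
	if champ.err != nil {
		return nil, champ.err // assumption: champion failures are surfaced
	}
	res := &result{ChampionOutput: champ.out, Sampled: sampled}
	if sampled {
		chal := <-ch
		if chal.err != nil {
			// Challenger failures are logged and otherwise ignored.
			log.Printf("shadow: challenger %s failed: %v", challengerID, chal.err)
		} else {
			res.ChallengerOutput = chal.out
			res.LatencyDelta = chal.dur - champ.dur
		}
	}
	return res, nil
}

// demoInfer is a toy inference function: the model "bad" always fails.
func demoInfer(modelID string, input []float32) ([]float32, error) {
	if modelID == "bad" {
		return nil, errors.New("inference failed")
	}
	return []float32{float32(len(input))}, nil
}

func main() {
	res, err := runShadow("champion", "bad", 1.0, []float32{1, 2, 3}, demoInfer)
	fmt.Println(res.ChampionOutput, res.Sampled, err)
}
```

Waiting for the challenger keeps the example simple but adds its latency to sampled requests; an implementation that must not slow the champion path would hand the comparison off to a background worker instead.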
type VersionMetrics ¶
type VersionMetrics struct {
ModelID string
RequestCount int64
ErrorCount int64
ErrorRate float64
Latency LatencyStats
LastUpdated time.Time
}
VersionMetrics holds performance metrics for a model version.