router

package
v0.4.3
Published: Jun 25, 2026 License: AGPL-3.0 Imports: 10 Imported by: 0

Documentation

Overview

Package router defines HTTP route registration and middleware chaining, as well as model selection based on request scenarios.

Functions

func GetFallbackChain

func GetFallbackChain(primary config.ModelConfig, fallbacks map[string][]config.ModelConfig) []config.ModelConfig

GetFallbackChain returns the fallback chain for a given primary model.

func IsRetryableError

func IsRetryableError(err error) bool

IsRetryableError determines if an error is worth retrying with a fallback.

Types

type CapacityDecision added in v0.3.5

type CapacityDecision struct {
	Models             []config.ModelConfig
	Skipped            []SkippedModel
	InputTokens        int
	RequestedMaxTokens int
	SelectedMaxTokens  int
	ContextWindow      int
	ContextMargin      int
	NeedsVision        bool
	NeedsTools         bool
}

func FilterByCapacity added in v0.3.5

func FilterByCapacity(chain []config.ModelConfig, inputTokens int, requestedMaxTokens int, needsVision bool, needsTools bool) (CapacityDecision, error)

type CircuitBreaker

type CircuitBreaker struct {
	// contains filtered or unexported fields
}

CircuitBreaker tracks failure rates and prevents calls to failing models.

func NewCircuitBreaker

func NewCircuitBreaker(threshold int, recoveryTimeout time.Duration) *CircuitBreaker

NewCircuitBreaker creates a circuit breaker with the given failure threshold and recovery timeout.

func (*CircuitBreaker) AllowRequest

func (cb *CircuitBreaker) AllowRequest() bool

AllowRequest returns true if the circuit allows a request.

func (*CircuitBreaker) RecordFailure

func (cb *CircuitBreaker) RecordFailure()

RecordFailure records a failed call.

func (*CircuitBreaker) RecordSuccess

func (cb *CircuitBreaker) RecordSuccess()

RecordSuccess records a successful call.

func (*CircuitBreaker) State

func (cb *CircuitBreaker) State() CircuitState

State returns the current circuit state.

type CircuitState

type CircuitState int

CircuitState represents the state of a circuit breaker.

const (
	CircuitClosed   CircuitState = iota // Normal operation
	CircuitHalfOpen                     // Testing if service recovered
	CircuitOpen                         // Failing fast, not attempting calls
)

type EvaluationContext

type EvaluationContext struct {
	Request         *core.NormalizedRequest
	TokenCount      int
	AvailableModels []config.ModelConfig
	History         []RouteDecision
}

EvaluationContext carries all information needed to evaluate routing policies.

type FallbackHandler

type FallbackHandler struct {
	// contains filtered or unexported fields
}

FallbackHandler manages model fallback with circuit breaker protection.

func NewFallbackHandler

func NewFallbackHandler(logger *slog.Logger, cbThreshold int, cbTimeout time.Duration) *FallbackHandler

NewFallbackHandler creates a new fallback handler with circuit breakers.

func (*FallbackHandler) ExecuteWithFallback

func (h *FallbackHandler) ExecuteWithFallback(
	ctx context.Context,
	models []config.ModelConfig,
	executor func(context.Context, config.ModelConfig) ([]byte, error),
) (*FallbackResult, []byte, error)

ExecuteWithFallback tries models in sequence until one succeeds. It respects circuit breaker state, skipping models that are failing repeatedly.

func (*FallbackHandler) GetCircuitStates

func (h *FallbackHandler) GetCircuitStates() map[string]string

GetCircuitStates returns the state of all circuit breakers.

type FallbackResult

type FallbackResult struct {
	ModelID     string
	Success     bool
	Error       error
	Attempted   int
	TotalModels int
}

FallbackResult contains the result of a fallback attempt.

type MessageContent

type MessageContent struct {
	Role        string
	Content     string
	HasImage    bool
	ImageHashes []string
}

MessageContent represents a single message in a conversation.

type ModelOverridePolicy

type ModelOverridePolicy struct {
	// contains filtered or unexported fields
}

ModelOverridePolicy checks whether the requested model has an entry in model_overrides. If so, it uses that override as the primary and appends the default fallback chain.

func NewModelOverridePolicy

func NewModelOverridePolicy(router *ModelRouter) *ModelOverridePolicy

NewModelOverridePolicy creates a model override policy.

func (*ModelOverridePolicy) Evaluate

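func (p *ModelOverridePolicy) Evaluate(ctx *EvaluationContext) ([]config.ModelConfig, RouteDecision, error)

Evaluate checks model_overrides for the requested model.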

func (*ModelOverridePolicy) Name

func (p *ModelOverridePolicy) Name() string

Name returns the policy identifier.

type ModelRouter

type ModelRouter struct {
	// contains filtered or unexported fields
}

ModelRouter handles model selection based on scenarios.

func NewModelRouter

func NewModelRouter(atomic *config.AtomicConfig) *ModelRouter

NewModelRouter creates a new model router.

func (*ModelRouter) IsStreamingScenarioRoutingEnabled

func (r *ModelRouter) IsStreamingScenarioRoutingEnabled() bool

IsStreamingScenarioRoutingEnabled returns whether streaming requests should use scenario-based routing instead of always routing to the fast model.

func (*ModelRouter) Route

func (r *ModelRouter) Route(messages []MessageContent, tokenCount int, requestedModel string) (RouteResult, error)

Route determines which model to use for a request. If respect_requested_model is enabled and requestedModel is provided, it overrides scenario-based routing.

func (*ModelRouter) RouteForStreaming

func (r *ModelRouter) RouteForStreaming(messages []MessageContent, tokenCount int, requestedModel string) (RouteResult, error)

RouteForStreaming determines which model to use for streaming requests. It prioritizes fast TTFT (time to first token) over capability. If respect_requested_model is enabled and requestedModel is provided, it overrides scenario-based routing.

func (*ModelRouter) RouteWithOverride

func (r *ModelRouter) RouteWithOverride(requestedModel string) (RouteResult, bool)

RouteWithOverride checks if the requested model matches a model_overrides entry.

When matched, the returned RouteResult uses the override ModelConfig as the primary. The fallback chain is fallbacks[<requestedModel>], falling back to fallbacks["default"] when the override key has no entry (matching the behavior of Route and RouteForStreaming). The caller (MessagesHandler) is expected to merge a scenario-derived safety-net chain on top.

Returns the override RouteResult and true if matched, or a zero value and false if the requested model has no entry in model_overrides.
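
The fallback lookup rule described above (use fallbacks[<requestedModel>], otherwise fallbacks["default"]) can be illustrated with a small sketch. The `model` type, the helper name `fallbackChainFor`, and the model IDs are hypothetical; only the lookup order comes from the text.

```go
package main

import "fmt"

// model is a hypothetical stand-in for config.ModelConfig.
type model struct{ ID string }

// fallbackChainFor returns fallbacks[key], or fallbacks["default"] when key
// has no entry in the map.
func fallbackChainFor(key string, fallbacks map[string][]model) []model {
	if chain, ok := fallbacks[key]; ok {
		return chain
	}
	return fallbacks["default"]
}

func main() {
	fallbacks := map[string][]model{
		"default":      {{ID: "general-a"}, {ID: "general-b"}},
		"pinned-model": {{ID: "pinned-backup"}},
	}
	fmt.Println(fallbackChainFor("pinned-model", fallbacks)) // [{pinned-backup}]
	fmt.Println(fallbackChainFor("other-model", fallbacks))  // [{general-a} {general-b}]
}
```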

type Policy

type Policy interface {
	// Name returns the policy identifier.
	Name() string

	// Evaluate examines the context and returns the model chain to try, the
	// decision explanation, or an error if no model matches this policy.
	Evaluate(ctx *EvaluationContext) ([]config.ModelConfig, RouteDecision, error)
}

Policy evaluates a routing strategy and selects a model chain.

type PolicyEngine

type PolicyEngine struct {
	// contains filtered or unexported fields
}

PolicyEngine composes multiple policies with ordered evaluation. Policies are evaluated in registration order; the first policy that returns a non-empty chain wins.
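
The "first non-empty chain wins" rule can be sketched as below. All types here (`model`, `evalContext`, `decision`, `policy`, `engine`, `funcPolicy`) are hypothetical local stand-ins that mirror the shape of the documented Policy interface; treating a policy error as "no match, try the next policy" is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

type model struct{ ID string }

type evalContext struct{ RequestedModel string }

type decision struct{ PolicyName, ModelID, Reason string }

// policy mirrors the shape of the documented Policy interface.
type policy interface {
	Name() string
	Evaluate(ctx *evalContext) ([]model, decision, error)
}

type engine struct{ policies []policy }

func (e *engine) addPolicy(p policy) { e.policies = append(e.policies, p) }

// evaluate walks policies in registration order and returns the first
// non-empty chain.
func (e *engine) evaluate(ctx *evalContext) ([]model, decision, error) {
	for _, p := range e.policies {
		chain, d, err := p.Evaluate(ctx)
		if err != nil || len(chain) == 0 {
			continue // this policy did not match; try the next one
		}
		return chain, d, nil
	}
	return nil, decision{}, errors.New("no policy matched")
}

// funcPolicy adapts a plain function to the policy interface.
type funcPolicy struct {
	name string
	fn   func(*evalContext) []model
}

func (f funcPolicy) Name() string { return f.name }

func (f funcPolicy) Evaluate(ctx *evalContext) ([]model, decision, error) {
	chain := f.fn(ctx)
	if len(chain) == 0 {
		return nil, decision{}, fmt.Errorf("%s: no match", f.name)
	}
	return chain, decision{PolicyName: f.name, ModelID: chain[0].ID, Reason: "matched"}, nil
}

func main() {
	eng := &engine{}
	eng.addPolicy(funcPolicy{"override", func(c *evalContext) []model {
		if c.RequestedModel == "pinned" {
			return []model{{ID: "pinned"}}
		}
		return nil
	}})
	eng.addPolicy(funcPolicy{"scenario", func(*evalContext) []model {
		return []model{{ID: "default-model"}}
	}})

	_, d, _ := eng.evaluate(&evalContext{RequestedModel: "pinned"})
	fmt.Println(d.PolicyName, d.ModelID) // override pinned
	_, d, _ = eng.evaluate(&evalContext{})
	fmt.Println(d.PolicyName, d.ModelID) // scenario default-model
}
```

Because order decides the winner, a catch-all policy such as the scenario one belongs at the end of the chain.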

func NewPolicyEngine

func NewPolicyEngine() *PolicyEngine

NewPolicyEngine creates a policy engine with the default set of policies:

  1. ModelOverridePolicy — check model_overrides config entries
  2. RespectRequestModelPolicy — check respect_requested_model config
  3. ScenarioPolicy — scenario-based routing (existing DetectScenario logic)

func (*PolicyEngine) AddPolicy

func (eng *PolicyEngine) AddPolicy(p Policy)

AddPolicy appends a policy to the evaluation chain.

func (*PolicyEngine) Evaluate

Evaluate runs each policy in order and returns the first successful result.

func (*PolicyEngine) EvaluateDryRun

func (eng *PolicyEngine) EvaluateDryRun(ctx *EvaluationContext) []RouteDecision

EvaluateDryRun returns all policy decisions without executing. Useful for debugging and the dry-run endpoint.

type RequestFacts added in v0.3.5

type RequestFacts struct {
	LatestUserText          string
	LatestUserHasImage      bool
	LatestTextComplexIntent bool
	NeedsVision             bool
}

func AnalyzeRequestFacts added in v0.3.5

func AnalyzeRequestFacts(messages []MessageContent) RequestFacts

type RouteDecision

type RouteDecision struct {
	PolicyName string
	ModelID    string
	Provider   string
	Reason     string
	Weight     int
}

RouteDecision records a routing decision for observability.

type RouteResult

type RouteResult struct {
	Primary   config.ModelConfig
	Fallbacks []config.ModelConfig
	Scenario  Scenario
}

RouteResult contains the selected model and fallback chain.

func (*RouteResult) GetModelChain

func (rr *RouteResult) GetModelChain() []config.ModelConfig

GetModelChain returns the full chain of models to try (primary + fallbacks).
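
As a small illustration of "primary + fallbacks", the sketch below builds the chain with the primary first. The types `model` and `routeResult` and the model IDs are hypothetical stand-ins, not the package's own definitions.

```go
package main

import "fmt"

// model is a hypothetical stand-in for config.ModelConfig.
type model struct{ ID string }

// routeResult mirrors the Primary and Fallbacks fields of RouteResult.
type routeResult struct {
	Primary   model
	Fallbacks []model
}

// modelChain returns the primary followed by the fallbacks, in order.
// It allocates a new slice so the caller cannot mutate Fallbacks through it.
func (rr *routeResult) modelChain() []model {
	chain := make([]model, 0, 1+len(rr.Fallbacks))
	chain = append(chain, rr.Primary)
	return append(chain, rr.Fallbacks...)
}

func main() {
	rr := routeResult{
		Primary:   model{ID: "primary"},
		Fallbacks: []model{{ID: "backup-1"}, {ID: "backup-2"}},
	}
	fmt.Println(rr.modelChain()) // [{primary} {backup-1} {backup-2}]
}
```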

type Scenario

type Scenario string

Scenario represents the routing scenario for model selection.

const (
	ScenarioDefault           Scenario = "default"
	ScenarioBackground        Scenario = "background"
	ScenarioThink             Scenario = "think"
	ScenarioComplex           Scenario = "complex"
	ScenarioLongContext       Scenario = "long_context"
	ScenarioFast              Scenario = "fast"
	ScenarioOverride          Scenario = "override"
	ScenarioVision            Scenario = "vision"
	ScenarioVisionComplex     Scenario = "vision_complex"
	ScenarioVisionLongContext Scenario = "vision_long_context"
)

type ScenarioPolicy

type ScenarioPolicy struct {
	// contains filtered or unexported fields
}

ScenarioPolicy runs scenario-based routing using the existing DetectScenario logic. It handles both streaming and non-streaming paths.

func NewScenarioPolicy

func NewScenarioPolicy(router *ModelRouter) *ScenarioPolicy

NewScenarioPolicy creates a scenario policy.

func (*ScenarioPolicy) Evaluate

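func (p *ScenarioPolicy) Evaluate(ctx *EvaluationContext) ([]config.ModelConfig, RouteDecision, error)

Evaluate runs scenario detection and returns the model chain.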

func (*ScenarioPolicy) Name

func (p *ScenarioPolicy) Name() string

Name returns the policy identifier.

type ScenarioResult

type ScenarioResult struct {
	Scenario   Scenario
	TokenCount int
	Reason     string
}

ScenarioResult contains the detected scenario and token count.

func DetectScenario

func DetectScenario(messages []MessageContent, tokenCount int, cfg *config.Config) ScenarioResult

DetectScenario analyzes a request to determine which model to use. Routing priority:

  1. Long Context (> threshold)
  2. Complex (architectural patterns or tool-heavy operations)
  3. Think (reasoning patterns)
  4. Background (simple operations with NO tools)
  5. Default

For streaming requests, consider using RouteForStreaming() to prefer faster models.
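
The priority order listed above can be expressed as a single ordered switch. This sketch does not reproduce the package's detection logic: the `requestSignals` struct and its boolean fields are assumptions that stand in for whatever pattern matching the real DetectScenario performs. Only the ordering and the scenario string values come from the documentation.

```go
package main

import "fmt"

type scenario string

const (
	scenarioDefault     scenario = "default"
	scenarioBackground  scenario = "background"
	scenarioThink       scenario = "think"
	scenarioComplex     scenario = "complex"
	scenarioLongContext scenario = "long_context"
)

// requestSignals holds hypothetical, precomputed signals about a request.
type requestSignals struct {
	TokenCount           int
	LongContextThreshold int
	Complex              bool // architectural patterns or tool-heavy operations
	Think                bool // reasoning patterns
	Background           bool // simple operation with no tools
}

// detect applies the documented priority: the first matching case wins.
func detect(s requestSignals) scenario {
	switch {
	case s.TokenCount > s.LongContextThreshold:
		return scenarioLongContext
	case s.Complex:
		return scenarioComplex
	case s.Think:
		return scenarioThink
	case s.Background:
		return scenarioBackground
	default:
		return scenarioDefault
	}
}

func main() {
	// Long context outranks every other signal.
	fmt.Println(detect(requestSignals{TokenCount: 90000, LongContextThreshold: 60000, Think: true})) // long_context
	// Think outranks background when both are set.
	fmt.Println(detect(requestSignals{TokenCount: 500, LongContextThreshold: 60000, Think: true, Background: true})) // think
}
```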

func RouteForStreaming

func RouteForStreaming(messages []MessageContent, tokenCount int, cfg *config.Config) ScenarioResult

RouteForStreaming selects a model optimized for streaming latency. For streaming, we prioritize fast TTFT (time-to-first-token) over capability. This may return a less capable model but one that streams faster.

type SkippedModel added in v0.3.5

type SkippedModel struct {
	ModelID string `json:"model_id"`
	Reason  string `json:"reason"`
}
