models

package
v1.2.0 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 31, 2026 License: MIT Imports: 22 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

var ErrNoProvider = errors.New("no provider configured")

ErrNoProvider is returned when no provider is configured for a model.

Functions

func BuildProviders

func BuildProviders(logger *zap.SugaredLogger, cfg *config.Root, copilotAuth TokenProvider) map[string]Provider

BuildProviders creates the full provider map from configuration.

GitHub Copilot is always registered using the supplied TokenProvider for auth. Additional providers are built from cfg.Providers:

  • "anthropic" → native Anthropic messages API client
  • all others → OpenAI-compatible client (go-openai with custom base URL + API key)

Unknown provider names with neither a configured baseURL nor a known default fall back to the OpenAI base URL; an operator warning is logged.
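The dispatch rule above can be sketched with stand-in types (the real function takes a zap logger, a config.Root, and returns go-openai-backed clients; everything below except the rule itself is a hypothetical simplification):

```go
package main

import "fmt"

// Provider is a stand-in for the package's Provider interface.
type Provider interface{ Name() string }

type anthropicClient struct{}

func (anthropicClient) Name() string { return "anthropic-native" }

type openAICompatClient struct{ baseURL string }

func (c openAICompatClient) Name() string { return "openai-compat:" + c.baseURL }

// buildProvider mirrors the documented dispatch: "anthropic" gets a native
// client, everything else an OpenAI-compatible client.
func buildProvider(name, baseURL string) Provider {
	if name == "anthropic" {
		return anthropicClient{}
	}
	if baseURL == "" {
		// Unknown provider with no configured base URL: fall back to the
		// OpenAI base URL (the real code logs an operator warning here).
		baseURL = "https://api.openai.com/v1"
	}
	return openAICompatClient{baseURL: baseURL}
}

func main() {
	fmt.Println(buildProvider("anthropic", "").Name())
	fmt.Println(buildProvider("groq", "https://api.groq.com/openai/v1").Name())
	fmt.Println(buildProvider("mystery", "").Name())
}
```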

func DefaultBaseURL

func DefaultBaseURL(name string) string

DefaultBaseURL returns the known base URL for a provider name, or empty string if unknown.
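A sketch of such a lookup, assuming a simple switch over provider names. The entries shown are the providers' publicly documented endpoints, not taken from this package; its actual table may differ:

```go
package main

import "fmt"

// defaultBaseURL maps a known provider name to its base URL, returning the
// empty string for unknown names, as documented.
func defaultBaseURL(name string) string {
	switch name {
	case "openai":
		return "https://api.openai.com/v1"
	case "anthropic":
		return "https://api.anthropic.com"
	default:
		return "" // unknown provider: caller decides the fallback
	}
}

func main() {
	fmt.Println(defaultBaseURL("openai"))
	fmt.Println(defaultBaseURL("unknown") == "") // true
}
```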

func ModelID

func ModelID(full string) string

ModelID strips the provider prefix from a model string. e.g. "github-copilot/claude-sonnet-4.6" → "claude-sonnet-4.6"
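A minimal sketch of that prefix-stripping behaviour (the package's actual implementation is not shown):

```go
package main

import (
	"fmt"
	"strings"
)

// modelID drops everything up to and including the first "/"; a string with
// no separator is returned unchanged.
func modelID(full string) string {
	if _, id, ok := strings.Cut(full, "/"); ok {
		return id
	}
	return full
}

func main() {
	fmt.Println(modelID("github-copilot/claude-sonnet-4.6")) // claude-sonnet-4.6
	fmt.Println(modelID("gpt-4o"))                           // no prefix: unchanged
}
```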

func NewCopilotClient

func NewCopilotClient(provider TokenProvider) *openai.Client

NewCopilotClient creates an OpenAI-compatible client pointed at GitHub Copilot. The Copilot API does NOT use a /v1 prefix — paths go directly on the domain. The transport rewrites the host dynamically from the token's endpoints.api field. HTTP/2 is disabled because the Copilot API may time out on HTTP/2 handshakes.

func ProviderNames

func ProviderNames(providers map[string]Provider) []string

ProviderNames returns the sorted list of registered provider names for logging.

func StreamAll

func StreamAll(stream Stream) (string, error)

StreamAll collects a full streaming response into a string.
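The collection loop can be sketched against a mock stream. Here the chunk type is a plain string standing in for openai.ChatCompletionStreamResponse, and io.EOF is assumed to signal end-of-stream as in go-openai:

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// mockStream is a stand-in for the package's Stream interface.
type mockStream struct{ chunks []string }

func (s *mockStream) Recv() (string, error) {
	if len(s.chunks) == 0 {
		return "", io.EOF // end of stream
	}
	c := s.chunks[0]
	s.chunks = s.chunks[1:]
	return c, nil
}

func (s *mockStream) Close() error { return nil }

// streamAll reads chunks until EOF, concatenating the deltas, and always
// closes the stream.
func streamAll(stream *mockStream) (string, error) {
	defer stream.Close()
	var b strings.Builder
	for {
		chunk, err := stream.Recv()
		if err == io.EOF {
			return b.String(), nil
		}
		if err != nil {
			return "", err
		}
		b.WriteString(chunk)
	}
}

func main() {
	out, _ := streamAll(&mockStream{chunks: []string{"Hel", "lo"}})
	fmt.Println(out) // Hello
}
```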

Types

type APIError

type APIError struct {
	StatusCode int
	RetryAfter time.Duration // parsed from Retry-After header; 0 = not specified
	Message    string
}

APIError is a structured error returned by providers that includes the HTTP status code and optional Retry-After duration from the server.

func (*APIError) Error

func (e *APIError) Error() string

type Cooldown

type Cooldown struct {
	// contains filtered or unexported fields
}

Cooldown tracks per-model failure cooldowns using exponential backoff. When a model fails, it enters a cooldown period that doubles on each consecutive failure, up to a configurable maximum.

Thread-safe.

func NewCooldown

func NewCooldown(cfg CooldownConfig) *Cooldown

NewCooldown creates a Cooldown tracker with the given config.

func (*Cooldown) AvailableFrom

func (c *Cooldown) AvailableFrom(models []string) []string

AvailableFrom returns the subset of models that are currently not in cooldown.

func (*Cooldown) CooldownRemaining

func (c *Cooldown) CooldownRemaining(model string) time.Duration

CooldownRemaining returns the remaining cooldown duration for a model. Returns 0 if the model is available.

func (*Cooldown) GC

func (c *Cooldown) GC()

GC removes expired cooldown entries to prevent unbounded map growth.

func (*Cooldown) IsAvailable

func (c *Cooldown) IsAvailable(model string) bool

IsAvailable returns true if the model is not in a cooldown period.

func (*Cooldown) RecordFailure

func (c *Cooldown) RecordFailure(model string)

RecordFailure marks a model as having failed, starting or extending its cooldown with exponential backoff.

func (*Cooldown) RecordSuccess

func (c *Cooldown) RecordSuccess(model string)

RecordSuccess clears the cooldown state for a model.

func (*Cooldown) Reset

func (c *Cooldown) Reset(model string)

Reset clears the cooldown state for a specific model.

func (*Cooldown) ResetAll

func (c *Cooldown) ResetAll()

ResetAll clears all cooldown state.

type CooldownConfig

type CooldownConfig struct {
	BaseDelay  time.Duration // initial cooldown after first failure (default 1m)
	MaxDelay   time.Duration // cap on cooldown duration (default 1h)
	Multiplier float64       // backoff multiplier (default 5)
}

CooldownConfig configures cooldown behaviour.

type ModelHealthStatus

type ModelHealthStatus struct {
	Model     string `json:"model"`
	Provider  string `json:"provider"`
	Available bool   `json:"available"`
	Reason    string `json:"reason,omitempty"` // why unavailable
}

ModelHealthStatus describes the health of a single model as reported by Router.ModelHealth: whether its provider is registered and whether it is in cooldown, checked without making API calls.

type Provider

type Provider interface {
	Chat(ctx context.Context, req openai.ChatCompletionRequest) (openai.ChatCompletionResponse, error)
	ChatStream(ctx context.Context, req openai.ChatCompletionRequest) (Stream, error)
}

Provider can make chat completion requests to a model API backend.

func NewCopilotProvider

func NewCopilotProvider(provider TokenProvider) Provider

NewCopilotProvider creates a Provider backed by the GitHub Copilot API.

func ProviderFor

func ProviderFor(providers map[string]Provider, fullModel string) (Provider, string, error)

ProviderFor returns the provider and stripped model ID for a full model string. Returns an error if the provider is not registered.

type Router

type Router struct {
	Timeout          time.Duration                     // per-model timeout; 0 = use caller's ctx deadline
	StreamConnectTTL time.Duration                     // stream connect timeout for primary when fallbacks exist; 0 = 30s
	RateLimiters     map[string]*taskqueue.RateLimiter // provider name → rate limiter (optional)
	Cooldowns        *Cooldown                         // per-model cooldowns (optional); nil = no cooldowns
	MaxRetries       int                               // max retries per model for transient errors; 0 = use default (2)
	// contains filtered or unexported fields
}

Router dispatches chat completion requests to the correct provider based on the provider prefix in model strings (e.g. "anthropic/claude-opus-4-6"). It tries the primary model first, then each fallback in order. Within each model attempt, transient errors (429, 5xx, connection resets) are retried with exponential backoff before moving to the next model.

func NewRouter

func NewRouter(logger *zap.SugaredLogger, providers map[string]Provider, primary string, fallbacks []string) *Router

NewRouter creates a Router with the given provider registry and model config. Model strings must use the "provider/model-id" format. Models without a "/" separator are routed to the "github-copilot" provider.
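That routing rule can be sketched as a split on the first "/", with the documented default provider for bare model IDs:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModel separates "provider/model-id"; a string with no separator is
// routed to the "github-copilot" provider, as documented.
func splitModel(full string) (provider, id string) {
	if p, m, ok := strings.Cut(full, "/"); ok {
		return p, m
	}
	return "github-copilot", full
}

func main() {
	p, id := splitModel("anthropic/claude-opus-4-6")
	fmt.Println(p, id) // anthropic claude-opus-4-6
	p, id = splitModel("gpt-4o")
	fmt.Println(p, id) // github-copilot gpt-4o
}
```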

func (*Router) Chat

func (r *Router) Chat(ctx context.Context, req openai.ChatCompletionRequest) (openai.ChatCompletionResponse, error)

Chat sends a chat completion request, trying the primary model and then each fallback in order. The Model field in req should be the full "provider/model-id" string; the router strips the provider prefix before calling the provider. Transient errors (429, 5xx, connection failures) are retried with exponential backoff before falling through to the next model.
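The transient/permanent split can be sketched against the APIError shape defined above. This is a simplified classifier; the real one also treats connection resets as transient:

```go
package main

import "fmt"

// apiError mirrors the package's APIError shape (StatusCode plus message).
type apiError struct {
	StatusCode int
	Message    string
}

func (e *apiError) Error() string { return e.Message }

// isTransient applies the documented retry rule: 429 and 5xx responses are
// retried; other errors move straight to the next model.
func isTransient(err error) bool {
	if e, ok := err.(*apiError); ok {
		return e.StatusCode == 429 || e.StatusCode >= 500
	}
	return false
}

func main() {
	fmt.Println(isTransient(&apiError{StatusCode: 429})) // true
	fmt.Println(isTransient(&apiError{StatusCode: 503})) // true
	fmt.Println(isTransient(&apiError{StatusCode: 400})) // false
}
```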

func (*Router) ChatStream

func (r *Router) ChatStream(ctx context.Context, req openai.ChatCompletionRequest) (Stream, error)

ChatStream sends a streaming chat completion request. Returns a Stream for the first model that successfully starts; tries fallbacks on error. Each model gets a connect-timeout sub-context to prevent a hanging primary from consuming the entire caller deadline and starving fallbacks. Transient errors are retried with exponential backoff before moving to the next model. On success, the sub-context is NOT canceled — the stream's HTTP body needs it alive. It will expire naturally when the parent context is done.

func (*Router) ModelHealth

func (r *Router) ModelHealth() []ModelHealthStatus

func (*Router) SetModels

func (r *Router) SetModels(primary string, fallbacks []string)

SetModels updates the primary model and fallbacks at runtime (e.g. hot-reload).

type Stream

type Stream interface {
	Recv() (openai.ChatCompletionStreamResponse, error)
	Close() error
}

Stream is a streaming chat completion iterator. *openai.ChatCompletionStream satisfies this interface.

type StreamUsage

type StreamUsage interface {
	Usage() (inputTokens, outputTokens int)
}

StreamUsage is an optional interface that streaming providers can implement to report token usage captured from the stream (e.g. Anthropic message_start and message_delta events).

type ThinkingConfig

type ThinkingConfig struct {
	Enabled      bool
	BudgetTokens int
	Level        string // "off", "enabled", "adaptive"; empty = use Enabled field
}

ThinkingConfig controls extended thinking for providers that support it (Anthropic).

type ThinkingStream

type ThinkingStream interface {
	SetThinkingCallback(func(text string))
}

ThinkingStream is an optional interface that streaming providers can implement to surface extended thinking deltas (e.g. Anthropic thinking_delta events).

type TokenProvider

type TokenProvider interface {
	GetToken(ctx context.Context) (string, error)
	APIURL() string
}

TokenProvider can fetch a fresh Copilot API token and report the API base URL.
