Documentation ¶
Overview ¶
Package runtimesignals collects volatile per-provider runtime signals (status, quota remaining, recent p50 latency) and writes them to the runtime tier of M1's per-source on-disk cache (ADR-012). M2's snapshot assembler reads the cache files to populate KnownModel.Status, .QuotaRemaining, and .RecentP50Latency.
Index ¶
- Constants
- Variables
- func CacheSource(providerName string) discoverycache.Source
- func Observe(ctx context.Context, providerName, providerType, baseURL string, ...) error
- func RecordResponse(provider string, h http.Header, latency time.Duration, providerType string)
- func Write(cache *discoverycache.Cache, sig Signal) error
- type CollectInput
- type LatencyWindow
- type ModelStatus
- type Signal
- type Store
Constants ¶
const (
	// RuntimeTTL is the default cache TTL for runtime signals (ADR-012 §2).
	RuntimeTTL = 5 * time.Minute

	// RuntimeRefreshDeadline bounds how long a runtime signal refresh may take.
	RuntimeRefreshDeadline = 5 * time.Second
)
Variables ¶
var DefaultStore = NewStore()
DefaultStore is the process-singleton Store used by the package-level RecordResponse and Collect functions.
Functions ¶
func CacheSource ¶
func CacheSource(providerName string) discoverycache.Source
CacheSource returns the discoverycache.Source descriptor for a provider's runtime signal. The file path resolves to runtime/<provider>.json.
func Observe ¶
func Observe(ctx context.Context, providerName, providerType, baseURL string, headers http.Header, latency time.Duration, callErr error) error
Observe refreshes the persistent runtime signal from the in-memory store and writes the refreshed snapshot to the default on-disk runtime cache.
func RecordResponse ¶
func RecordResponse(provider string, h http.Header, latency time.Duration, providerType string)
RecordResponse records an HTTP response observation for a provider using the DefaultStore. providerType controls which header parser is applied ("openrouter", "anthropic", or any other value for the OpenAI parser).
Types ¶
type CollectInput ¶
CollectInput describes the provider identity needed to assemble a runtime signal. Type controls the collection path; BaseURL is used by local providers for the live /v1/models check.
type LatencyWindow ¶
type LatencyWindow struct {
// contains filtered or unexported fields
}
LatencyWindow tracks recent request latencies in a fixed-size circular buffer. When the buffer is full, the oldest sample is overwritten. P50 is computed over all samples currently in the window. The window is safe for concurrent use.
func NewLatencyWindow ¶
func NewLatencyWindow(n int) *LatencyWindow
NewLatencyWindow creates a LatencyWindow that retains the last n samples.
func (*LatencyWindow) P50 ¶
func (w *LatencyWindow) P50() time.Duration
P50 returns the 50th-percentile latency across the current window. Returns 0 when no samples have been recorded. For an even number of samples, the upper-middle value is returned (index count/2 after sorting).
func (*LatencyWindow) Record ¶
func (w *LatencyWindow) Record(d time.Duration)
Record adds one latency observation. When the buffer is full, the oldest sample is silently discarded.
type ModelStatus ¶
type ModelStatus string
ModelStatus represents the operational availability of a provider.
const (
	StatusAvailable ModelStatus = "available"
	StatusDegraded  ModelStatus = "degraded"
	StatusExhausted ModelStatus = "exhausted"
	StatusUnknown   ModelStatus = "unknown"
)
type Signal ¶
type Signal struct {
Provider string `json:"provider"`
Status ModelStatus `json:"status"`
QuotaRemaining *int `json:"quota_remaining,omitempty"`
QuotaResetAt *time.Time `json:"quota_reset_at,omitempty"`
RecentP50Latency time.Duration `json:"recent_p50_latency_ns"`
LastSuccessAt *time.Time `json:"last_success_at,omitempty"`
LastErrorAt *time.Time `json:"last_error_at,omitempty"`
LastErrorMsg string `json:"last_error_msg,omitempty"`
RecordedAt time.Time `json:"recorded_at"`
}
Signal is the runtime state snapshot for one provider. It is serialized as JSON to runtime/<provider>.json in M1's cache directory.
func ReadCached ¶
func ReadCached(cache *discoverycache.Cache, providerName string) (*Signal, bool)
ReadCached reads a Signal from M1's runtime cache for providerName. Returns (nil, false) when the entry is absent or cannot be decoded.
func Warmup ¶ added in v0.12.3
func Warmup(ctx context.Context, cache *discoverycache.Cache, providerName string, cfg CollectInput) (Signal, error)
Warmup synchronously refreshes and writes the runtime signal for providerName through the shared discovery cache coordinator. Callers can invoke it from a heartbeat without bypassing the same lock/singleflight path used by writes.
type Store ¶
type Store struct {
// contains filtered or unexported fields
}
Store is the per-process in-memory state for runtime signal collection. It holds per-provider latency windows and the most recently observed rate-limit header signals. All methods are safe for concurrent use.
func (*Store) Collect ¶
Collect assembles a runtime Signal for providerName. The cfg.Type field controls which collection path is used:
- "claude", "codex", "gemini" → QuotaHarness-backed cached quota state
- "openrouter" → most recently recorded rate-limit headers (OpenRouter parser)
- "openai", "anthropic", and unknown HTTP types → rate-limit headers
- local types → HTTP GET /v1/models alive check (no quota concept)