fleet

package
v0.9.2
Warning

This package is not in the latest version of its module.

Published: May 1, 2026 License: Apache-2.0 Imports: 13 Imported by: 0

Documentation

Overview

Package fleet provides the normalized model catalog, live registry, and runtime client wiring. The catalog is built from config at startup. The registry merges catalog entries with live inventory from providers (Ollama, Anthropic, LM Studio) and exposes the result to the router for scoring.

Constants

This section is empty.

Variables

This section is empty.

Functions

func CanExpandLoadedContext

func CanExpandLoadedContext(dep Deployment, contextSize int) bool

CanExpandLoadedContext reports whether a deployment can plausibly be reloaded by its runner to satisfy the requested context size.
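A minimal sketch of what such a check might look like, assuming the decision compares the requested size against the LoadedContextWindow and MaxContextWindow fields of Deployment. The local deployment type and the exact rule are assumptions for illustration, not the package's implementation.

```go
package main

import "fmt"

// deployment mirrors only the two Deployment fields this sketch needs.
type deployment struct {
	LoadedContextWindow int // context size the runner currently has loaded
	MaxContextWindow    int // largest context size the runner advertises
}

// canExpand is a hypothetical rule: a reload is plausible when the request
// exceeds what is loaded but still fits within the advertised maximum.
func canExpand(dep deployment, contextSize int) bool {
	if contextSize <= 0 || dep.MaxContextWindow <= 0 {
		return false
	}
	return contextSize > dep.LoadedContextWindow && contextSize <= dep.MaxContextWindow
}

func main() {
	dep := deployment{LoadedContextWindow: 8192, MaxContextWindow: 32768}
	fmt.Println(canExpand(dep, 16384)) // request fits under the maximum
	fmt.Println(canExpand(dep, 65536)) // request exceeds the maximum
}
```

The real function may weigh further fields (for example RunnerState or Provider); the sketch only captures the "loaded is too small, maximum is large enough" idea.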

func IsUnknownDeployment

func IsUnknownDeployment(err error) bool

IsUnknownDeployment reports whether err identifies a missing deployment ID.

func IsUnknownModel

func IsUnknownModel(err error) bool

IsUnknownModel reports whether err identifies a missing model reference or deployment ID.

func IsUnknownResource

func IsUnknownResource(err error) bool

IsUnknownResource reports whether err identifies a missing resource ID.
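Predicates like IsUnknownDeployment, IsUnknownModel, and IsUnknownResource are commonly written with errors.As so that wrapped errors still match. The sketch below shows that pattern with a local stand-in for UnknownModelError; the error message text and the resolve helper are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// unknownModelError is a local stand-in for UnknownModelError.
type unknownModelError struct {
	Model string
}

// Error uses an illustrative message format, not the package's.
func (e *unknownModelError) Error() string {
	return "unknown model: " + e.Model
}

// isUnknownModel walks the wrap chain looking for the typed error.
func isUnknownModel(err error) bool {
	var target *unknownModelError
	return errors.As(err, &target)
}

// resolve is a hypothetical caller that wraps the typed error with context.
func resolve(ref string) error {
	return fmt.Errorf("resolve %q: %w", ref, &unknownModelError{Model: ref})
}

func main() {
	fmt.Println(isUnknownModel(resolve("missing-model")))
	fmt.Println(isUnknownModel(errors.New("timeout")))
}
```

Callers can then branch on the predicate (for example to return a "not found" response) without depending on the error's message text.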

Types

type AnthropicRateLimitBucket

type AnthropicRateLimitBucket struct {
	Limit     int        `json:"limit"`
	Remaining int        `json:"remaining"`
	Reset     *time.Time `json:"reset,omitempty"`
}

AnthropicRateLimitBucket is one Anthropic rate-limit quota family: requests, total tokens, input tokens, or output tokens.

type AnthropicRateLimitSnapshot

type AnthropicRateLimitSnapshot struct {
	CapturedAt        time.Time                 `json:"captured_at"`
	UpstreamRequestID string                    `json:"upstream_request_id,omitempty"`
	Requests          *AnthropicRateLimitBucket `json:"requests,omitempty"`
	Tokens            *AnthropicRateLimitBucket `json:"tokens,omitempty"`
	InputTokens       *AnthropicRateLimitBucket `json:"input_tokens,omitempty"`
	OutputTokens      *AnthropicRateLimitBucket `json:"output_tokens,omitempty"`
	RetryAfterSeconds int64                     `json:"retry_after_seconds,omitempty"`
}

AnthropicRateLimitSnapshot is a JSON-friendly view of the latest Anthropic rate-limit response headers captured by the shared provider client.
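To illustrate how the omitempty tags shape the encoded form, the sketch below marshals a trimmed local copy of the two structs. The field subset and the sample values are illustrative assumptions; only the JSON tags are taken from the definitions above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// bucket mirrors AnthropicRateLimitBucket.
type bucket struct {
	Limit     int        `json:"limit"`
	Remaining int        `json:"remaining"`
	Reset     *time.Time `json:"reset,omitempty"`
}

// snapshot mirrors a subset of AnthropicRateLimitSnapshot.
type snapshot struct {
	CapturedAt        time.Time `json:"captured_at"`
	Requests          *bucket   `json:"requests,omitempty"`
	Tokens            *bucket   `json:"tokens,omitempty"`
	RetryAfterSeconds int64     `json:"retry_after_seconds,omitempty"`
}

// sample returns a snapshot where only the requests bucket was observed.
func sample() snapshot {
	return snapshot{
		CapturedAt: time.Date(2026, 5, 1, 12, 0, 0, 0, time.UTC),
		Requests:   &bucket{Limit: 50, Remaining: 49},
	}
}

// encode marshals a snapshot; nil buckets and a zero retry value are omitted.
func encode(s snapshot) (string, error) {
	out, err := json.Marshal(s)
	return string(out), err
}

func main() {
	out, err := encode(sample())
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the buckets are pointers, a consumer can distinguish "header not present" (key absent) from "limit exhausted" (remaining is 0).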

type Catalog

type Catalog struct {
	DefaultModel  string
	RecoveryModel string
	LocalFirst    bool
	Resources     []Resource
	Deployments   []Deployment
	// contains filtered or unexported fields
}

Catalog is the normalized, provider-aware model view used by both runtime client construction and the router.

func BuildCatalog

func BuildCatalog(cfg *config.Config) (*Catalog, error)

BuildCatalog converts config-driven model/provider configuration into a normalized provider resource and deployment catalog.

func MergeInventory

func MergeInventory(base *Catalog, inv *Inventory) (*Catalog, error)

MergeInventory overlays discovered provider inventory on top of the immutable config-defined catalog. Config deployments keep their IDs and routing authority; discovered deployments are explicit-use only until a later policy layer promotes them into routing.
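The overlay rule can be sketched as follows, under the assumption that a merge is keyed by deployment ID. The local deployment type, the merge helper, and the sample IDs are hypothetical; the real function presumably also folds observed metadata into config deployments, which this sketch leaves out.

```go
package main

import "fmt"

// deployment keeps only the fields relevant to the overlay rule.
type deployment struct {
	ID       string
	Source   string // "config" or "discovered"
	Routable bool
}

// merge overlays discovered deployments on a config-defined base. Config
// entries win on ID collisions; new discovered entries are added non-routable.
func merge(base, discovered []deployment) []deployment {
	out := make([]deployment, 0, len(base)+len(discovered))
	seen := make(map[string]bool, len(base))
	for _, d := range base {
		out = append(out, d)
		seen[d.ID] = true
	}
	for _, d := range discovered {
		if seen[d.ID] {
			continue // config keeps its ID and routing authority
		}
		d.Source = "discovered"
		d.Routable = false // explicit-use only until policy promotes it
		out = append(out, d)
		seen[d.ID] = true
	}
	return out
}

func main() {
	base := []deployment{{ID: "cfg/a", Source: "config", Routable: true}}
	found := []deployment{{ID: "cfg/a"}, {ID: "disc/b"}}
	for _, d := range merge(base, found) {
		fmt.Printf("%s source=%s routable=%t\n", d.ID, d.Source, d.Routable)
	}
}
```

Note that merge builds a new slice and never mutates base, matching the description of the config-defined catalog as immutable.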

func (*Catalog) ContextWindowForModel

func (c *Catalog) ContextWindowForModel(ref string, defaultSize int) int

ContextWindowForModel returns the configured context window for a model reference or resolved deployment ID. When only an upstream model name is available from a provider response and multiple deployments share that name, the largest configured window is used as a safe upper bound.
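A rough sketch of the lookup order described above, assuming an exact deployment-ID match takes precedence and an upstream model name falls back to the largest window. The deployment type, the helper name, and the sample IDs are illustrative only.

```go
package main

import "fmt"

// deployment keeps only the fields this lookup needs.
type deployment struct {
	ID            string
	ModelName     string
	ContextWindow int
}

// contextWindowFor returns the window for an exact deployment ID, otherwise
// the largest window among deployments sharing the upstream model name, and
// defaultSize when nothing matches.
func contextWindowFor(deps []deployment, ref string, defaultSize int) int {
	largest := 0
	for _, d := range deps {
		if d.ID == ref && d.ContextWindow > 0 {
			return d.ContextWindow
		}
		if d.ModelName == ref && d.ContextWindow > largest {
			largest = d.ContextWindow
		}
	}
	if largest > 0 {
		return largest
	}
	return defaultSize
}

func main() {
	deps := []deployment{
		{ID: "host-a/m", ModelName: "m", ContextWindow: 8192},
		{ID: "host-b/m", ModelName: "m", ContextWindow: 32768},
	}
	fmt.Println(contextWindowFor(deps, "host-a/m", 4096))
	fmt.Println(contextWindowFor(deps, "m", 4096))
	fmt.Println(contextWindowFor(deps, "other", 4096))
}
```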

func (*Catalog) DeploymentByRef

func (c *Catalog) DeploymentByRef(ref string) (Deployment, bool)

DeploymentByRef resolves a model reference or deployment ID and returns the normalized deployment metadata when known.

func (*Catalog) PrimaryOllamaURL

func (c *Catalog) PrimaryOllamaURL() string

PrimaryOllamaURL returns the preferred Ollama base URL for callers that still need one local LLM endpoint outside the routed deployment path (for example media transcription helpers). Preference order is: a resource named "default", then the first Ollama resource in sorted order.
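The preference order can be sketched as below. The provider string "ollama", sorting by resource ID, the empty-string result when no Ollama resource exists, and the sample URLs are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// resource keeps only the Resource fields this selection needs.
type resource struct {
	ID       string
	Provider string
	URL      string
}

// primaryOllamaURL prefers a resource named "default", then the first
// Ollama resource in ID-sorted order.
func primaryOllamaURL(resources []resource) string {
	var ollama []resource
	for _, r := range resources {
		if r.Provider != "ollama" {
			continue
		}
		if r.ID == "default" {
			return r.URL
		}
		ollama = append(ollama, r)
	}
	if len(ollama) == 0 {
		return ""
	}
	sort.Slice(ollama, func(i, j int) bool { return ollama[i].ID < ollama[j].ID })
	return ollama[0].URL
}

func main() {
	rs := []resource{
		{ID: "gpu-b", Provider: "ollama", URL: "http://gpu-b:11434"},
		{ID: "cloud", Provider: "anthropic"},
		{ID: "gpu-a", Provider: "ollama", URL: "http://gpu-a:11434"},
	}
	fmt.Println(primaryOllamaURL(rs))
}
```

Sorting makes the fallback deterministic regardless of the order in which resources appear in config.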

func (*Catalog) ResolveDeploymentRef

func (c *Catalog) ResolveDeploymentRef(ref string) (Deployment, error)

ResolveDeploymentRef resolves a caller-provided model reference or deployment ID into the normalized deployment metadata.

func (*Catalog) ResolveModelRef

func (c *Catalog) ResolveModelRef(ref string) (string, error)

ResolveModelRef resolves a raw model reference or qualified deployment ID into a concrete deployment ID.

func (*Catalog) ResourceByID

func (c *Catalog) ResourceByID(id string) (Resource, bool)

ResourceByID returns the normalized resource metadata for the given configured resource ID.

func (*Catalog) RouterConfig

func (c *Catalog) RouterConfig(maxAuditLog int) router.Config

RouterConfig converts the normalized catalog into router config.

type ClientBundle

type ClientBundle struct {
	Client          llm.Client
	ResourceClients map[string]llm.Client
	HealthClients   map[string]ResourceHealthClient
	OllamaClients   map[string]*modelproviders.OllamaClient
	LMStudioClients map[string]*modelproviders.LMStudioClient
	// AnthropicClient is the singleton Anthropic provider shared across
	// all anthropic-backed resources, retained here so late-bind
	// machinery (e.g., Runtime.SetLogger) can find it without scanning
	// ResourceClients for the *AnthropicClient type.
	AnthropicClient *modelproviders.AnthropicClient
}

ClientBundle contains the routed LLM client plus provider-specific resource clients keyed by resource ID for connection watching and inventory discovery.

func BuildClients

func BuildClients(cat *Catalog, cfg *config.Config, logger *slog.Logger) (*ClientBundle, error)

BuildClients constructs provider clients and a routed llm.Client from the normalized catalog.

func (*ClientBundle) BuildRoutedClient

func (b *ClientBundle) BuildRoutedClient(cat *Catalog) (llm.Client, error)

BuildRoutedClient constructs a routed llm.Client for the provided effective catalog using the bundle's stable per-resource clients.

type Deployment

type Deployment struct {
	ID                        string
	ModelName                 string
	ModelType                 string
	Publisher                 string
	Provider                  string
	ResourceID                string
	Server                    string
	CompatibilityType         string
	RunnerState               string
	SupportsTools             bool
	ObservedSupportsTools     bool
	TrainedForToolUse         bool
	ProviderSupportsTools     bool
	SupportsStreaming         bool
	ObservedSupportsStreaming bool
	SupportsImages            bool
	ContextWindow             int
	ObservedContextWindow     int
	MaxContextWindow          int
	LoadedContextWindow       int
	LoadedInstanceID          string
	Speed                     int
	Quality                   int
	CostTier                  int
	MinComplexity             string
	Source                    DeploymentSource
	Routable                  bool

	// Provider-exported metadata kept alongside the normalized Thane
	// deployment so later routing/policy layers can reason with it.
	Family        string
	Families      []string
	ParameterSize string
	Quantization  string

	PolicyState             DeploymentPolicyState
	PolicySource            DeploymentPolicySource
	PolicyReason            string
	PolicyUpdatedAt         time.Time
	RoutableSource          DeploymentPolicySource
	ResourcePolicyState     DeploymentPolicyState
	ResourcePolicySource    DeploymentPolicySource
	ResourcePolicyReason    string
	ResourcePolicyUpdatedAt time.Time

	SupportsToolsOverride     *bool
	SupportsStreamingOverride *bool
	ContextWindowOverride     int
}

Deployment is the normalized routing unit derived from config. The same upstream model on different resources becomes distinct deployments with distinct IDs.

type DeploymentPolicy

type DeploymentPolicy struct {
	State     DeploymentPolicyState
	Routable  *bool
	Reason    string
	UpdatedAt time.Time
}

DeploymentPolicy is the mutable runtime policy overlay for one deployment.

type DeploymentPolicySource

type DeploymentPolicySource string

DeploymentPolicySource describes whether a deployment policy comes from the default baseline or from an explicit runtime overlay.

const (
	DeploymentPolicySourceDefault DeploymentPolicySource = "default"
	DeploymentPolicySourceOverlay DeploymentPolicySource = "overlay"
)

type DeploymentPolicyState

type DeploymentPolicyState string

DeploymentPolicyState describes the runtime policy state of a deployment in the mutable overlay.

const (
	DeploymentPolicyStateActive   DeploymentPolicyState = "active"
	DeploymentPolicyStateInactive DeploymentPolicyState = "inactive"
	DeploymentPolicyStateFlagged  DeploymentPolicyState = "flagged"
)

func ParseDeploymentPolicyState

func ParseDeploymentPolicyState(raw string) (DeploymentPolicyState, error)

ParseDeploymentPolicyState validates a caller-provided policy state.
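A validation function of this kind might look like the sketch below. The three accepted values come from the constants above; trimming and lower-casing the input, and the error text, are assumptions that the real parser may not share.

```go
package main

import (
	"fmt"
	"strings"
)

// policyState is a local stand-in for DeploymentPolicyState.
type policyState string

const (
	stateActive   policyState = "active"
	stateInactive policyState = "inactive"
	stateFlagged  policyState = "flagged"
)

// parsePolicyState accepts only the three known states. Normalizing case
// and whitespace is an assumption of this sketch.
func parsePolicyState(raw string) (policyState, error) {
	switch s := policyState(strings.ToLower(strings.TrimSpace(raw))); s {
	case stateActive, stateInactive, stateFlagged:
		return s, nil
	default:
		return "", fmt.Errorf("invalid deployment policy state %q", raw)
	}
}

func main() {
	for _, raw := range []string{"active", " Flagged ", "paused"} {
		state, err := parsePolicyState(raw)
		fmt.Println(state, err)
	}
}
```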

type DeploymentSource

type DeploymentSource string

DeploymentSource describes where a deployment definition came from.

const (
	DeploymentSourceConfig     DeploymentSource = "config"
	DeploymentSourceDiscovered DeploymentSource = "discovered"
)

type DiscoveredModel

type DiscoveredModel struct {
	SupportsChat        bool
	Name                string
	ModelType           string
	Publisher           string
	CompatibilityType   string
	State               string
	Family              string
	Families            []string
	ParameterSize       string
	Quantization        string
	SupportsTools       bool
	TrainedForToolUse   bool
	SupportsStreaming   bool
	SupportsImages      bool
	ContextWindow       int
	MaxContextWindow    int
	LoadedContextWindow int
	LoadedInstanceID    string
}

DiscoveredModel is provider-exported model metadata normalized just enough for Thane's overlay layer.

type ExplicitModelPrepareResult

type ExplicitModelPrepareResult struct {
	Changed  bool
	Resolved string
	Instance string
	Snapshot *RegistrySnapshot
}

ExplicitModelPrepareResult describes the provider-side outcome of readying an explicit deployment for immediate use.

type Inventory

type Inventory struct {
	Resources []ResourceInventory
}

Inventory is the mutable provider-exported overlay that sits on top of the immutable config-defined model catalog.

func DiscoverInventory

func DiscoverInventory(ctx context.Context, cat *Catalog, bundle *ClientBundle) *Inventory

DiscoverInventory probes configured resources for live model inventory. Discovery is best-effort; individual resource failures are captured in the returned overlay instead of aborting startup.
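Best-effort discovery of this shape can be sketched as a loop that records failures per resource instead of returning early. The lister function type, the discover helper, and the demo resource IDs are hypothetical; only the ResourceID, Attempted, Models, and Error field names follow ResourceInventory.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// resourceInventory mirrors a subset of ResourceInventory.
type resourceInventory struct {
	ResourceID string
	Attempted  bool
	Models     []string
	Error      string
}

// lister is a hypothetical per-resource probe.
type lister func(ctx context.Context) ([]string, error)

// discover probes each resource and captures errors in the result so that
// one failing resource does not abort the whole pass.
func discover(ctx context.Context, ids []string, probes map[string]lister) []resourceInventory {
	out := make([]resourceInventory, 0, len(ids))
	for _, id := range ids {
		entry := resourceInventory{ResourceID: id}
		if probe, ok := probes[id]; ok {
			entry.Attempted = true
			models, err := probe(ctx)
			if err != nil {
				entry.Error = err.Error()
			} else {
				entry.Models = models
			}
		}
		out = append(out, entry)
	}
	return out
}

// demo wires one healthy probe, one failing probe, and one unprobed resource.
func demo() []resourceInventory {
	probes := map[string]lister{
		"local":  func(context.Context) ([]string, error) { return []string{"model-a"}, nil },
		"remote": func(context.Context) ([]string, error) { return nil, errors.New("connection refused") },
	}
	return discover(context.Background(), []string{"local", "remote", "cloud"}, probes)
}

func main() {
	for _, inv := range demo() {
		fmt.Printf("%+v\n", inv)
	}
}
```

Keeping Attempted separate from Error lets a consumer tell "not probed" apart from "probed and failed".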

type RefreshResult

type RefreshResult struct {
	Changed  bool
	Snapshot *RegistrySnapshot
}

RefreshResult describes the outcome of a runtime inventory refresh.

type Registry

type Registry struct {
	// contains filtered or unexported fields
}

Registry is the long-lived model registry for one Thane process. It holds an immutable config-defined base catalog, a mutable discovered inventory overlay, and the effective merged catalog derived from the two.

func NewRegistry

func NewRegistry(base *Catalog) (*Registry, error)

NewRegistry constructs a registry from the immutable config-defined base catalog.

func (*Registry) ApplyDeploymentPolicy

func (r *Registry) ApplyDeploymentPolicy(id string, policy DeploymentPolicy, updatedAt time.Time) error

ApplyDeploymentPolicy upserts a runtime policy override for one deployment ID in the current registry.

func (*Registry) ApplyInventory

func (r *Registry) ApplyInventory(inv *Inventory, refreshedAt time.Time) error

ApplyInventory replaces the mutable overlay and recomputes the effective merged catalog.

func (*Registry) ApplyResourcePolicy

func (r *Registry) ApplyResourcePolicy(id string, policy ResourcePolicy, updatedAt time.Time) error

ApplyResourcePolicy upserts a runtime policy override for one configured resource ID in the current registry.

func (*Registry) BaseCatalog

func (r *Registry) BaseCatalog() *Catalog

BaseCatalog returns the immutable config-defined base catalog.

func (*Registry) Catalog

func (r *Registry) Catalog() *Catalog

Catalog returns the current effective merged catalog. The returned pointer must be treated as immutable by callers.
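One common way to hand out a pointer that callers must treat as immutable is copy-on-write: build a fresh value for every update and swap the pointer atomically. The sketch below shows that pattern with sync/atomic; whether Registry uses an atomic pointer, a mutex, or something else is not stated in this documentation, so treat the mechanism as an assumption.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// catalog is a stand-in for the effective merged catalog.
type catalog struct {
	Deployments []string
}

// registry publishes catalogs through an atomically swapped pointer.
type registry struct {
	current atomic.Pointer[catalog]
}

// Catalog returns the current catalog; callers must not modify it.
func (r *registry) Catalog() *catalog {
	return r.current.Load()
}

// publish builds a fresh catalog and swaps the pointer. The previous value
// is never modified, so readers still holding it stay consistent.
func (r *registry) publish(deployments []string) {
	next := &catalog{Deployments: append([]string(nil), deployments...)}
	r.current.Store(next)
}

func main() {
	r := &registry{}
	r.publish([]string{"a"})
	old := r.Catalog()
	r.publish([]string{"a", "b"})
	fmt.Println(len(old.Deployments), len(r.Catalog().Deployments))
}
```

A caller that mutated the returned value would break this guarantee for every other reader, which is why the pointer must be treated as read-only.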

func (*Registry) ClearDeploymentPolicy

func (r *Registry) ClearDeploymentPolicy(id string, updatedAt time.Time) error

ClearDeploymentPolicy removes an explicit runtime policy override for one deployment ID.

func (*Registry) ClearResourcePolicy

func (r *Registry) ClearResourcePolicy(id string, updatedAt time.Time) error

ClearResourcePolicy removes an explicit runtime policy override for one configured resource ID.

func (*Registry) Refresh

func (r *Registry) Refresh(ctx context.Context, bundle *ClientBundle) error

Refresh probes live inventory from the configured resources and applies the resulting overlay to the registry.

func (*Registry) ReplaceDeploymentPolicies

func (r *Registry) ReplaceDeploymentPolicies(policies map[string]DeploymentPolicy, updatedAt time.Time) error

ReplaceDeploymentPolicies swaps the explicit runtime policy overlay with the provided policy map. Policies for currently absent deployments are retained so they can reapply automatically when a discovered deployment returns in a later inventory refresh.
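The retention behavior can be pictured as keeping the policy map independent of the set of currently present deployments. In the sketch below the registry type, its two maps, and the "active" default are assumptions for illustration.

```go
package main

import "fmt"

// policy is a stand-in for DeploymentPolicy.
type policy struct {
	State string
}

// registry keeps the explicit overlay separate from the present set.
type registry struct {
	policies map[string]policy // explicit overlay keyed by deployment ID
	present  map[string]bool   // deployments in the current effective catalog
}

// replacePolicies swaps the overlay wholesale. Entries for absent
// deployments are kept, since nothing here filters on presence.
func (r *registry) replacePolicies(next map[string]policy) {
	r.policies = make(map[string]policy, len(next))
	for id, p := range next {
		r.policies[id] = p
	}
}

// effectiveState reports the state of a present deployment, falling back
// to "active" when no explicit policy exists.
func (r *registry) effectiveState(id string) (string, bool) {
	if !r.present[id] {
		return "", false
	}
	if p, ok := r.policies[id]; ok {
		return p.State, true
	}
	return "active", true
}

func main() {
	r := &registry{present: map[string]bool{"a": true}}
	r.replacePolicies(map[string]policy{"b": {State: "inactive"}})
	fmt.Println(r.effectiveState("b")) // "b" is absent from the catalog
	r.present["b"] = true              // a later refresh rediscovers "b"
	fmt.Println(r.effectiveState("b")) // the retained policy applies again
}
```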

func (*Registry) ReplaceResourcePolicies

func (r *Registry) ReplaceResourcePolicies(policies map[string]ResourcePolicy, updatedAt time.Time) error

ReplaceResourcePolicies swaps the explicit runtime resource-policy overlay with the provided policy map.

func (*Registry) Snapshot

func (r *Registry) Snapshot() *RegistrySnapshot

Snapshot returns a JSON-friendly view of the registry state.

type RegistryDeploymentSnapshot

type RegistryDeploymentSnapshot struct {
	ID                        string                 `json:"id"`
	Model                     string                 `json:"model"`
	ModelType                 string                 `json:"model_type,omitempty"`
	Publisher                 string                 `json:"publisher,omitempty"`
	Provider                  string                 `json:"provider"`
	Resource                  string                 `json:"resource"`
	Source                    DeploymentSource       `json:"source"`
	Routable                  bool                   `json:"routable"`
	RoutableSource            DeploymentPolicySource `json:"routable_source"`
	CompatibilityType         string                 `json:"compatibility_type,omitempty"`
	RunnerState               string                 `json:"runner_state,omitempty"`
	SupportsTools             bool                   `json:"supports_tools,omitempty"`
	ObservedSupportsTools     bool                   `json:"observed_supports_tools,omitempty"`
	TrainedForToolUse         bool                   `json:"trained_for_tool_use,omitempty"`
	ProviderSupportsTools     bool                   `json:"provider_supports_tools,omitempty"`
	SupportsStreaming         bool                   `json:"supports_streaming,omitempty"`
	ObservedSupportsStreaming bool                   `json:"observed_supports_streaming,omitempty"`
	SupportsImages            bool                   `json:"supports_images,omitempty"`
	ContextWindow             int                    `json:"context_window,omitempty"`
	ObservedContextWindow     int                    `json:"observed_context_window,omitempty"`
	MaxContextWindow          int                    `json:"max_context_window,omitempty"`
	LoadedContextWindow       int                    `json:"loaded_context_window,omitempty"`
	LoadedInstanceID          string                 `json:"loaded_instance_id,omitempty"`
	Speed                     int                    `json:"speed,omitempty"`
	Quality                   int                    `json:"quality,omitempty"`
	CostTier                  int                    `json:"cost_tier,omitempty"`
	MinComplexity             string                 `json:"min_complexity,omitempty"`
	Family                    string                 `json:"family,omitempty"`
	Families                  []string               `json:"families,omitempty"`
	ParameterSize             string                 `json:"parameter_size,omitempty"`
	Quantization              string                 `json:"quantization,omitempty"`
	PolicyState               DeploymentPolicyState  `json:"policy_state"`
	PolicySource              DeploymentPolicySource `json:"policy_source"`
	PolicyReason              string                 `json:"policy_reason,omitempty"`
	PolicyUpdated             string                 `json:"policy_updated_at,omitempty"`
	ResourcePolicyState       DeploymentPolicyState  `json:"resource_policy_state"`
	ResourcePolicySource      DeploymentPolicySource `json:"resource_policy_source"`
	ResourcePolicyReason      string                 `json:"resource_policy_reason,omitempty"`
	ResourcePolicyUpdated     string                 `json:"resource_policy_updated_at,omitempty"`
}

RegistryDeploymentSnapshot is the API-facing state for one effective deployment in the merged catalog.

type RegistryResourceSnapshot

type RegistryResourceSnapshot struct {
	ID                string                 `json:"id"`
	Provider          string                 `json:"provider"`
	URL               string                 `json:"url,omitempty"`
	SupportsChat      bool                   `json:"supports_chat,omitempty"`
	SupportsStreaming bool                   `json:"supports_streaming,omitempty"`
	SupportsTools     bool                   `json:"supports_tools,omitempty"`
	SupportsImages    bool                   `json:"supports_images,omitempty"`
	SupportsInventory bool                   `json:"supports_inventory,omitempty"`
	LastRefresh       string                 `json:"last_refresh,omitempty"`
	LastError         string                 `json:"last_error,omitempty"`
	DiscoveredModels  int                    `json:"discovered_models,omitempty"`
	PolicyState       DeploymentPolicyState  `json:"policy_state"`
	PolicySource      DeploymentPolicySource `json:"policy_source"`
	PolicyReason      string                 `json:"policy_reason,omitempty"`
	PolicyUpdated     string                 `json:"policy_updated_at,omitempty"`
}

RegistryResourceSnapshot is the API-facing runtime state for one provider resource.

type RegistrySnapshot

type RegistrySnapshot struct {
	Generation    int64                        `json:"generation"`
	UpdatedAt     string                       `json:"updated_at,omitempty"`
	DefaultModel  string                       `json:"default_model,omitempty"`
	RecoveryModel string                       `json:"recovery_model,omitempty"`
	LocalFirst    bool                         `json:"local_first"`
	Resources     []RegistryResourceSnapshot   `json:"resources,omitempty"`
	Deployments   []RegistryDeploymentSnapshot `json:"deployments,omitempty"`
}

RegistrySnapshot is the model-registry state exported for observability and API inspection.

type Resource

type Resource struct {
	ID              string
	Provider        string
	URL             string
	IdleTTLSeconds  int
	Capabilities    modelproviders.Capabilities
	PolicyState     DeploymentPolicyState
	PolicySource    DeploymentPolicySource
	PolicyReason    string
	PolicyUpdatedAt time.Time
}

Resource is a runtime model provider resource that can serve one or more model deployments. Examples include an Ollama server on a specific host or a global cloud provider endpoint.

type ResourceHealthClient

type ResourceHealthClient struct {
	Ping          func(ctx context.Context) error
	AttachWatcher func(w llm.ReadyWatcher)
}

ResourceHealthClient is the minimal health/watch surface that app wiring needs from one model-provider resource.

type ResourceInventory

type ResourceInventory struct {
	ResourceID   string
	Provider     string
	Capabilities modelproviders.Capabilities
	Attempted    bool
	Models       []DiscoveredModel
	Error        string
}

ResourceInventory captures the models advertised by one provider resource at a point in time. Errors are recorded per-resource so the overlay can be partial without blocking startup.

type ResourcePolicy

type ResourcePolicy struct {
	State     DeploymentPolicyState
	Reason    string
	UpdatedAt time.Time
}

ResourcePolicy is the mutable runtime policy overlay for one configured provider resource.

type ResourceRuntime

type ResourceRuntime struct {
	ID               string
	Provider         string
	URL              string
	Capabilities     modelproviders.Capabilities
	LastRefresh      time.Time
	LastError        string
	DiscoveredModels int
}

ResourceRuntime captures runtime discovery state for one provider resource.

type Runtime

type Runtime struct {
	// contains filtered or unexported fields
}

Runtime owns the long-lived model registry plus the swappable routed llm.Client built from it.

func NewRuntime

func NewRuntime(ctx context.Context, base *Catalog, cfg *config.Config, logger *slog.Logger) (*Runtime, error)

NewRuntime builds the initial registry, performs a first inventory refresh, and constructs the swappable routed client.

func (*Runtime) AnthropicRateLimitSnapshot

func (r *Runtime) AnthropicRateLimitSnapshot() *AnthropicRateLimitSnapshot

AnthropicRateLimitSnapshot returns the latest captured Anthropic rate-limit snapshot, or nil when the Anthropic provider is not configured or no Anthropic response has been observed yet.

func (*Runtime) Client

func (r *Runtime) Client() llm.Client

Client returns the swappable llm.Client.

func (*Runtime) HealthClients

func (r *Runtime) HealthClients() map[string]ResourceHealthClient

HealthClients returns the stable per-resource health clients used by connwatch and inventory refresh triggers.

func (*Runtime) InventoryClientCount

func (r *Runtime) InventoryClientCount() int

InventoryClientCount reports how many resource clients participate in runtime inventory discovery.

func (*Runtime) LMStudioClient

func (r *Runtime) LMStudioClient(resourceID string) *modelproviders.LMStudioClient

LMStudioClient returns the stable per-resource LM Studio client used by runtime discovery and explicit recovery flows.

func (*Runtime) OllamaClients

func (r *Runtime) OllamaClients() map[string]*modelproviders.OllamaClient

OllamaClients returns the stable per-resource Ollama clients used by watchers and discovery refresh.

func (*Runtime) PrepareExplicitModel

func (r *Runtime) PrepareExplicitModel(ctx context.Context, ref string, contextSize int) (*ExplicitModelPrepareResult, error)

PrepareExplicitModel asks the backing provider to ready an explicit deployment for the requested context size, then refreshes the live registry snapshot when the provider state changes. Today this is used only for LM Studio deployments whose loaded context window is smaller than the runner-advertised maximum.

func (*Runtime) Refresh

func (r *Runtime) Refresh(ctx context.Context) (*RefreshResult, error)

Refresh probes provider inventory, updates the registry overlay, and swaps in a new routed client for future requests.

func (*Runtime) Registry

func (r *Runtime) Registry() *Registry

Registry returns the long-lived model registry.

func (*Runtime) SetLogger

func (r *Runtime) SetLogger(logger *slog.Logger)

SetLogger rebinds the logger on every provider client owned by this runtime. The intended caller is app.New, after initLogging has finished, to swap clients off the bootstrap Info-level logger and onto the dataset-routed handler. Each provider re-applies its own "provider" attribute and the per-resource clients receive a logger pre-decorated with the resource ID, so attributes match what BuildClients originally produced.

SetLogger is not safe to call concurrently with in-flight requests. It is intended to be invoked once during init, before the runtime services traffic.
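The attribute pre-decoration described for SetLogger can be illustrated with log/slog's With method. The "provider" key is named in the text above; the "resource" key, the helper names, and the sample values are assumptions. The timestamp is dropped here only to keep the sketch's output stable.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// resourceLogger derives a per-resource logger from a shared base logger.
func resourceLogger(base *slog.Logger, provider, resourceID string) *slog.Logger {
	return base.With("provider", provider, "resource", resourceID)
}

// render logs one line through a decorated logger and returns the text.
func render(provider, resourceID string) string {
	var buf bytes.Buffer
	handler := slog.NewTextHandler(&buf, &slog.HandlerOptions{
		// Discard the top-level time attribute for deterministic output.
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{}
			}
			return a
		},
	})
	resourceLogger(slog.New(handler), provider, resourceID).Info("ready")
	return buf.String()
}

func main() {
	fmt.Print(render("ollama", "default"))
}
```

Rebinding then amounts to calling the same decoration again with the new base logger, so the attributes on each client stay identical across the swap.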

type UnknownDeploymentError

type UnknownDeploymentError struct {
	Deployment string
}

UnknownDeploymentError reports that a requested deployment ID does not exist in the current effective registry snapshot.

func (*UnknownDeploymentError) Error

func (e *UnknownDeploymentError) Error() string

type UnknownModelError

type UnknownModelError struct {
	Model string
}

UnknownModelError reports that a caller referenced a model or deployment ID that is not present in the current catalog.

func (*UnknownModelError) Error

func (e *UnknownModelError) Error() string

type UnknownResourceError

type UnknownResourceError struct {
	Resource string
}

UnknownResourceError reports that a requested resource ID does not exist in the current effective registry snapshot.

func (*UnknownResourceError) Error

func (e *UnknownResourceError) Error() string

Directories

Path Synopsis
Package providers implements concrete model runner integrations.
