Documentation
Overview
Package aigateway provides a high-performance, zero-dependency AI gateway for routing requests to large language model (LLM) providers.
The Gateway type is the main entry point: create one with New, register providers with RegisterProvider, load plugins from config with LoadPlugins, and route requests with Route or RouteStream.
Plugins and routing strategies (single, fallback, loadbalance, conditional, least-latency, cost-optimized, content-based, ab-test) are configured via Config, which can be loaded from a YAML or JSON file using LoadConfig.
Index
- Constants
- func ValidateConfig(cfg Config) error
- type ABVariantConfig
- type CircuitBreakerConfig
- type Condition
- type Config
- type ContentCondition
- type EventHookFunc
- type ExporterConfig
- type Gateway
- func (g *Gateway) AddHook(fn EventHookFunc)
- func (g *Gateway) AllModels() []providers.ModelInfo
- func (g *Gateway) Catalog() models.Catalog
- func (g *Gateway) Close() error
- func (g *Gateway) Embed(ctx context.Context, req providers.EmbeddingRequest) (*providers.EmbeddingResponse, error)
- func (g *Gateway) FindByModel(model string) (providers.Provider, bool)
- func (g *Gateway) FindStreamingByModel(model string) (providers.StreamProvider, bool)
- func (g *Gateway) GenerateImage(ctx context.Context, req providers.ImageRequest) (*providers.ImageResponse, error)
- func (g *Gateway) Get(name string) (providers.Provider, bool)
- func (g *Gateway) GetConfig() Config
- func (g *Gateway) GetProvider(name string) (providers.Provider, bool)
- func (g *Gateway) List() []string
- func (g *Gateway) ListProviders() []string
- func (g *Gateway) LoadPlugins() error
- func (g *Gateway) MCPInitDone() <-chan struct{}
- func (g *Gateway) Observability() observability.Provider
- func (g *Gateway) RegisterPlugin(stage plugin.Stage, p plugin.Plugin) error
- func (g *Gateway) RegisterProvider(p providers.Provider)
- func (g *Gateway) ReloadConfig(cfg Config) error
- func (g *Gateway) Route(ctx context.Context, req providers.Request) (*providers.Response, error)
- func (g *Gateway) RouteStream(ctx context.Context, req providers.Request) (<-chan providers.StreamChunk, error)
- func (g *Gateway) SetObservability(p observability.Provider)
- func (g *Gateway) StartDiscovery(ctx context.Context, interval time.Duration) error
- type ObservabilityConfig
- type PluginConfig
- type RetryConfig
- type StrategyConfig
- type StrategyMode
- type Target
- type TracingConfig
Constants
const (
	SubjectRequestCompleted = "gateway.request.completed"
	SubjectRequestFailed    = "gateway.request.failed"
)
Event subject constants used when invoking gateway hooks.
Variables
This section is empty.
Functions
func ValidateConfig
func ValidateConfig(cfg Config) error
ValidateConfig validates a Config for correctness.
Types
type ABVariantConfig (added in v0.8.5)
type ABVariantConfig struct {
// TargetKey is the virtual_key of the provider for this variant.
TargetKey string `json:"target_key" yaml:"target_key"`
// Weight is the relative traffic share for this variant.
// All weights are summed; each variant's fraction is Weight/Total.
// Zero is treated as 1 (equal distribution).
Weight float64 `json:"weight" yaml:"weight"`
// Label is a short human-readable identifier (e.g. "control", "challenger").
// It is logged with every routed request for observability.
Label string `json:"label" yaml:"label"`
}
ABVariantConfig defines a single traffic variant for the "ab-test" strategy.
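The weighting rule in the field comments can be sketched as follows. This is a minimal illustration, not the gateway's implementation: the `Variant` type and the `effectiveWeight` and `pick` helpers are hypothetical names, the target keys are made up, and the caller supplies the random value `r` in [0, 1) so that the result is reproducible.

```go
package main

import "fmt"

// Variant mirrors the documented ABVariantConfig fields (hypothetical type).
type Variant struct {
	TargetKey string
	Weight    float64
	Label     string
}

// effectiveWeight applies the documented rule that a zero weight counts as 1.
func effectiveWeight(v Variant) float64 {
	if v.Weight == 0 {
		return 1
	}
	return v.Weight
}

// pick maps r in [0, 1) onto the variants in proportion to Weight/Total.
// The second result is false when no variants are configured.
func pick(variants []Variant, r float64) (Variant, bool) {
	if len(variants) == 0 {
		return Variant{}, false
	}
	total := 0.0
	for _, v := range variants {
		total += effectiveWeight(v)
	}
	threshold := r * total
	cumulative := 0.0
	for _, v := range variants {
		cumulative += effectiveWeight(v)
		if threshold < cumulative {
			return v, true
		}
	}
	// Only reachable through floating-point rounding; fall back to the last variant.
	return variants[len(variants)-1], true
}

func main() {
	variants := []Variant{
		{TargetKey: "provider-a", Weight: 3, Label: "control"},
		{TargetKey: "provider-b", Weight: 1, Label: "challenger"},
	}
	for _, r := range []float64{0.10, 0.50, 0.75, 0.99} {
		v, _ := pick(variants, r)
		fmt.Printf("r=%.2f -> %s (%s)\n", r, v.Label, v.TargetKey)
	}
}
```

With weights 3 and 1, the first variant covers three quarters of the `r` range. In real use `r` would come from a random source such as `math/rand`.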
type CircuitBreakerConfig (added in v0.2.0)
type CircuitBreakerConfig struct {
// FailureThreshold is the number of consecutive failures before the circuit
// opens. Defaults to 5.
FailureThreshold int `json:"failure_threshold" yaml:"failure_threshold"`
// SuccessThreshold is the number of consecutive successes in half-open state
// required to close the circuit. Defaults to 1.
SuccessThreshold int `json:"success_threshold" yaml:"success_threshold"`
// Timeout is the duration the circuit stays open before transitioning to
// half-open (e.g. "30s"). Defaults to "30s".
Timeout string `json:"timeout" yaml:"timeout"`
}
CircuitBreakerConfig configures the per-provider circuit breaker.
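The three fields describe a closed, open, half-open state machine. The sketch below shows one way those transitions could work. It is not the code in internal/circuitbreaker: the `breaker` type and its methods are hypothetical, it is not safe for concurrent use, and two behaviours are assumptions (non-positive thresholds fall back to the defaults, and any failure while half-open reopens the circuit).

```go
package main

import (
	"fmt"
	"time"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

// breaker is a hypothetical, single-goroutine circuit breaker.
type breaker struct {
	failureThreshold int
	successThreshold int
	timeout          time.Duration

	st        state
	failures  int // consecutive failures while closed
	successes int // consecutive successes while half-open
	openedAt  time.Time
}

// newBreaker applies the documented defaults (5, 1, "30s") to unset fields.
func newBreaker(failureThreshold, successThreshold int, timeout string) (*breaker, error) {
	if failureThreshold <= 0 {
		failureThreshold = 5
	}
	if successThreshold <= 0 {
		successThreshold = 1
	}
	if timeout == "" {
		timeout = "30s"
	}
	d, err := time.ParseDuration(timeout)
	if err != nil {
		return nil, fmt.Errorf("invalid timeout %q: %w", timeout, err)
	}
	return &breaker{failureThreshold: failureThreshold, successThreshold: successThreshold, timeout: d}, nil
}

// at converts seconds since the Unix epoch into a time.Time for the demo.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// allow reports whether a call may proceed at time now. An open circuit
// becomes half-open once the timeout has elapsed.
func (b *breaker) allow(now time.Time) bool {
	if b.st == open {
		if now.Sub(b.openedAt) < b.timeout {
			return false
		}
		b.st, b.successes = halfOpen, 0
	}
	return true
}

// record updates the breaker with the outcome of a call made at time now.
func (b *breaker) record(ok bool, now time.Time) {
	switch {
	case ok && b.st == halfOpen:
		b.successes++
		if b.successes >= b.successThreshold {
			b.st, b.failures = closed, 0
		}
	case ok:
		b.failures = 0 // a success resets the consecutive-failure count
	case b.st == halfOpen:
		b.trip(now) // assumption: a half-open failure reopens the circuit
	default:
		b.failures++
		if b.failures >= b.failureThreshold {
			b.trip(now)
		}
	}
}

func (b *breaker) trip(now time.Time) {
	b.st, b.openedAt, b.failures = open, now, 0
}

func main() {
	b, err := newBreaker(0, 0, "") // all defaults
	if err != nil {
		panic(err)
	}
	for i := 0; i < 5; i++ {
		b.record(false, at(0))
	}
	fmt.Println("allowed after 10s:", b.allow(at(10)))
	fmt.Println("allowed after 30s:", b.allow(at(30)))
	b.record(true, at(31))
	fmt.Println("closed again:", b.st == closed)
}
```

Passing the current time in explicitly keeps the example deterministic; a production breaker would normally read the clock itself and guard its state with a mutex.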
type Condition
type Condition struct {
Key string `json:"key" yaml:"key"`
Value string `json:"value" yaml:"value"`
TargetKey string `json:"target_key" yaml:"target_key"`
}
Condition represents a condition for conditional routing.
type Config
type Config struct {
// Strategy defines how requests are routed (e.g., single, fallback, loadbalance).
Strategy StrategyConfig `json:"strategy" yaml:"strategy"`
// Targets is a list of provider targets to route requests to.
Targets []Target `json:"targets" yaml:"targets"`
// Plugins configuration (optional).
Plugins []PluginConfig `json:"plugins,omitempty" yaml:"plugins,omitempty"`
// Aliases maps friendly model names (e.g. "fast", "smart") to concrete model IDs.
// Aliases are resolved before routing; they must not reference other aliases.
Aliases map[string]string `json:"aliases,omitempty" yaml:"aliases,omitempty"`
// MCPServers configures external MCP tool servers for agentic tool calling.
// When set, the gateway injects discovered tools into every chat completion
// request and executes an agentic loop when the LLM returns tool_calls.
// FerroCloud populates this field from the tenant's mcp_servers table at
// gateway.New() time; no separate MCPRegistry() public method is exposed.
MCPServers []mcp.ServerConfig `json:"mcp_servers,omitempty" yaml:"mcp_servers,omitempty"`
// MCPToolCallAuditFn, if non-nil, is called after every MCP tool invocation.
// This field cannot be set via JSON or YAML; set it programmatically before
// calling New. FerroCloud uses it to write async audit entries to the
// mcp_tool_call_logs table.
MCPToolCallAuditFn mcp.ToolCallAuditFn `json:"-" yaml:"-"`
// Observability configures OpenTelemetry tracing. When omitted the
// gateway runs with a NoOp provider (zero allocations on the hot
// path). See internal/otel.
Observability ObservabilityConfig `json:"observability,omitempty" yaml:"observability,omitempty"`
}
Config holds the configuration for the AI Gateway.
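The Aliases comment implies a single lookup with no chaining. A minimal sketch of that rule follows. The `validateAliases` and `resolveModel` helpers and the model IDs are hypothetical, and whether the gateway rejects chained aliases during validation is an assumption.

```go
package main

import "fmt"

// validateAliases rejects any alias whose target is itself an alias name.
func validateAliases(aliases map[string]string) error {
	for name, target := range aliases {
		if _, isAlias := aliases[target]; isAlias {
			return fmt.Errorf("alias %q must not reference alias %q", name, target)
		}
	}
	return nil
}

// resolveModel performs one lookup; names that are not aliases pass through.
func resolveModel(aliases map[string]string, model string) string {
	if target, ok := aliases[model]; ok {
		return target
	}
	return model
}

func main() {
	aliases := map[string]string{"fast": "example-small", "smart": "example-large"}
	if err := validateAliases(aliases); err != nil {
		panic(err)
	}
	fmt.Println(resolveModel(aliases, "fast"))          // alias is replaced
	fmt.Println(resolveModel(aliases, "example-large")) // concrete ID is unchanged
}
```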
func LoadConfig
LoadConfig reads and parses a config file from the given path. Supported formats: JSON (.json), YAML (.yaml, .yml).
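To show the shape of a config file, the sketch below decodes a small JSON document into local structs that copy the `json` tags documented on this page. It does not call LoadConfig, whose signature is not shown here. The struct names, virtual keys, and model ID are hypothetical, and only a subset of the fields is mirrored.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// retry, target, and config mirror a subset of the documented json tags.
type retry struct {
	Attempts         int   `json:"attempts"`
	OnStatusCodes    []int `json:"on_status_codes,omitempty"`
	InitialBackoffMs int   `json:"initial_backoff_ms,omitempty"`
}

type target struct {
	VirtualKey string  `json:"virtual_key"`
	Weight     float64 `json:"weight,omitempty"`
	Retry      *retry  `json:"retry,omitempty"`
}

type config struct {
	Strategy struct {
		Mode string `json:"mode"`
	} `json:"strategy"`
	Targets []target          `json:"targets"`
	Aliases map[string]string `json:"aliases,omitempty"`
}

// sample is an illustrative fallback configuration with two targets.
const sample = `{
  "strategy": {"mode": "fallback"},
  "targets": [
    {
      "virtual_key": "primary",
      "retry": {"attempts": 3, "on_status_codes": [429, 502, 503], "initial_backoff_ms": 200}
    },
    {"virtual_key": "secondary"}
  ],
  "aliases": {"fast": "example-small-model"}
}`

// parse decodes a JSON config document into the local mirror structs.
func parse(data string) (config, error) {
	var c config
	err := json.Unmarshal([]byte(data), &c)
	return c, err
}

func main() {
	c, err := parse(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("mode:", c.Strategy.Mode)
	for _, t := range c.Targets {
		fmt.Println("target:", t.VirtualKey, "has retry:", t.Retry != nil)
	}
}
```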
type ContentCondition (added in v0.8.5)
type ContentCondition struct {
// Type is the matching rule type.
Type string `json:"type" yaml:"type"`
// Value is the substring or regex pattern to match against.
Value string `json:"value" yaml:"value"`
// TargetKey is the virtual_key of the provider to route to when this rule matches.
TargetKey string `json:"target_key" yaml:"target_key"`
}
ContentCondition maps a prompt-content matching rule to a routing target. Used with the "content-based" strategy mode.
Supported types:
- "prompt_contains": case-insensitive substring match on user messages
- "prompt_not_contains": true when NO user message contains the value
- "prompt_regex": Go regular expression match on user messages
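The three rule types, together with the first-match-wins ordering documented on StrategyConfig, could be evaluated roughly as below. This is a sketch under stated assumptions rather than the gateway's matcher: `matches` and `route` are hypothetical helpers, "prompt_not_contains" is assumed to be case-insensitive like "prompt_contains", and rules that fail to evaluate are simply skipped.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// ContentCondition mirrors the documented fields (local copy for the sketch).
type ContentCondition struct {
	Type, Value, TargetKey string
}

// matches evaluates one rule against the user messages of a request.
func matches(c ContentCondition, userMessages []string) (bool, error) {
	switch c.Type {
	case "prompt_contains", "prompt_not_contains":
		needle := strings.ToLower(c.Value)
		found := false
		for _, m := range userMessages {
			if strings.Contains(strings.ToLower(m), needle) {
				found = true
				break
			}
		}
		// prompt_contains wants found == true, prompt_not_contains wants false.
		return found == (c.Type == "prompt_contains"), nil
	case "prompt_regex":
		re, err := regexp.Compile(c.Value)
		if err != nil {
			return false, err
		}
		for _, m := range userMessages {
			if re.MatchString(m) {
				return true, nil
			}
		}
		return false, nil
	default:
		return false, fmt.Errorf("unknown condition type %q", c.Type)
	}
}

// route returns the target of the first matching rule. Rules that cannot be
// evaluated (bad regex, unknown type) are skipped in this sketch.
func route(conds []ContentCondition, userMessages []string) (string, bool) {
	for _, c := range conds {
		if ok, err := matches(c, userMessages); err == nil && ok {
			return c.TargetKey, true
		}
	}
	return "", false
}

func main() {
	conds := []ContentCondition{
		{Type: "prompt_contains", Value: "SQL", TargetKey: "code-provider"},
		{Type: "prompt_regex", Value: `^\s*translate\b`, TargetKey: "translation-provider"},
		{Type: "prompt_not_contains", Value: "urgent", TargetKey: "batch-provider"},
	}
	for _, msg := range []string{"write some sql", "translate this", "hello", "URGENT: hello"} {
		key, ok := route(conds, []string{msg})
		fmt.Printf("%q -> %q (matched: %v)\n", msg, key, ok)
	}
}
```

A real implementation would likely compile each regular expression once when the config is loaded instead of on every request.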
type EventHookFunc (added in v0.2.0)
EventHookFunc is called asynchronously after a gateway event (request completed or failed). It replaces the old EventPublisher interface with a simpler function-based hook pattern.
type ExporterConfig (added in v1.1.0)
type ExporterConfig struct {
// Name is the canonical exporter name, e.g. "langsmith".
// Must match the name passed to observability.RegisterExporter.
Name string `json:"name" yaml:"name"`
// Enabled gates the exporter. Set to false to temporarily disable
// without removing the config block.
Enabled bool `json:"enabled" yaml:"enabled"`
// Config is the exporter-specific configuration map. Passed
// verbatim to Exporter.Init at gateway startup.
Config map[string]any `json:"config,omitempty" yaml:"config,omitempty"`
}
ExporterConfig configures a single observability plugin exporter. Plugin authors register their factory via observability.RegisterExporter in their package init(); gateway operators then reference the name here.
Example (YAML):
exporters:
  - name: langsmith
    enabled: true
    config:
      api_key: "${LANGSMITH_API_KEY}"
type Gateway
type Gateway struct {
// contains filtered or unexported fields
}
Gateway is the main entry point for routing LLM requests.
func (*Gateway) AddHook (added in v0.2.0)
func (g *Gateway) AddHook(fn EventHookFunc)
AddHook registers an EventHookFunc that is called asynchronously on each completed or failed request. Multiple hooks may be registered; all are invoked for every event on the shared bounded hook worker pool, so hook implementations should return promptly and avoid indefinite blocking.
func (*Gateway) AllModels (added in v0.2.0)
func (g *Gateway) AllModels() []providers.ModelInfo
AllModels returns ModelInfo from all registered providers. If auto-discovery has run for a provider, discovered models take precedence over the provider's static model list.
func (*Gateway) Catalog (added in v0.4.5)
func (g *Gateway) Catalog() models.Catalog
Catalog returns a shallow copy of the loaded model catalog. A copy is returned so callers cannot mutate the gateway's internal catalog.
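What a shallow copy protects against can be seen in a small sketch. The `Entry` and `Catalog` types below are hypothetical stand-ins (the real models.Catalog definition is not shown on this page), and the assumption is that the catalog behaves like a map of value-type entries.

```go
package main

import "fmt"

// Entry and Catalog are hypothetical stand-ins for the real catalog types.
type Entry struct {
	InputPricePerMTok float64
	Deprecated        bool
}

type Catalog map[string]Entry

// shallowCopy returns a new map holding the same entries.
func shallowCopy(c Catalog) Catalog {
	out := make(Catalog, len(c))
	for id, e := range c {
		out[id] = e
	}
	return out
}

func main() {
	internal := Catalog{"example-model": {InputPricePerMTok: 1.5}}

	view := shallowCopy(internal)
	view["example-model"] = Entry{Deprecated: true} // overwrite in the copy
	view["another-model"] = Entry{}                 // add to the copy

	fmt.Println("internal entries:", len(internal))
	fmt.Println("internal deprecated:", internal["example-model"].Deprecated)
}
```

Adding, deleting, or overwriting keys in the copy leaves the original map untouched. Any slice, map, or pointer fields inside an entry would still be shared between the two maps, which is the usual limit of a shallow copy.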
func (*Gateway) Embed (added in v0.3.0)
func (g *Gateway) Embed(ctx context.Context, req providers.EmbeddingRequest) (*providers.EmbeddingResponse, error)
Embed routes an embedding request to the first registered EmbeddingProvider that supports the requested model.
func (*Gateway) FindByModel (added in v0.2.0)
func (g *Gateway) FindByModel(model string) (providers.Provider, bool)
FindByModel returns the first registered provider that supports the given model.
func (*Gateway) FindStreamingByModel (added in v1.0.0)
func (g *Gateway) FindStreamingByModel(model string) (providers.StreamProvider, bool)
FindStreamingByModel returns the first registered streaming-capable provider that supports the given model.
func (*Gateway) GenerateImage (added in v0.3.0)
func (g *Gateway) GenerateImage(ctx context.Context, req providers.ImageRequest) (*providers.ImageResponse, error)
GenerateImage routes an image generation request to the first registered ImageProvider that supports the requested model.
func (*Gateway) Get (added in v0.2.0)
func (g *Gateway) Get(name string) (providers.Provider, bool)
Get satisfies providers.ProviderSource (alias for GetProvider).
func (*Gateway) GetProvider (added in v0.2.0)
func (g *Gateway) GetProvider(name string) (providers.Provider, bool)
GetProvider returns a registered provider by name.
func (*Gateway) List (added in v0.2.0)
func (g *Gateway) List() []string
List satisfies providers.ProviderSource (alias for ListProviders).
func (*Gateway) ListProviders (added in v0.2.0)
func (g *Gateway) ListProviders() []string
ListProviders returns the names of all registered providers.
func (*Gateway) LoadPlugins
func (g *Gateway) LoadPlugins() error
LoadPlugins initializes and registers plugins from the gateway configuration.
func (*Gateway) MCPInitDone (added in v0.8.0)
func (g *Gateway) MCPInitDone() <-chan struct{}
MCPInitDone returns a channel that is closed once all MCP servers have completed their initialization handshake. The channel is pre-closed when no MCP servers are configured.
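The "closed when ready, pre-closed when there is nothing to wait for" pattern described here is a common Go idiom. A self-contained sketch follows. The `initDone` and `isClosed` helpers are hypothetical and only imitate the documented behaviour; they are not part of the gateway API.

```go
package main

import (
	"fmt"
	"time"
)

// initDone returns a channel that is closed once every ready channel has
// been closed. With no servers, the returned channel is already closed.
func initDone(ready []<-chan struct{}) <-chan struct{} {
	done := make(chan struct{})
	if len(ready) == 0 {
		close(done)
		return done
	}
	go func() {
		for _, r := range ready {
			<-r // wait for each server's handshake in turn
		}
		close(done)
	}()
	return done
}

// isClosed checks a done channel without blocking.
func isClosed(done <-chan struct{}) bool {
	select {
	case <-done:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println("no servers:", isClosed(initDone(nil)))

	ready := make(chan struct{})
	done := initDone([]<-chan struct{}{ready})
	fmt.Println("before handshake:", isClosed(done))
	close(ready) // simulate the handshake completing

	// Callers typically bound the wait with a timeout.
	select {
	case <-done:
		fmt.Println("ready")
	case <-time.After(time.Second):
		fmt.Println("timed out waiting for MCP servers")
	}
}
```

Against the real gateway, the same `select` would read from the channel returned by MCPInitDone before sending requests that depend on MCP tools.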
func (*Gateway) Observability (added in v1.1.0)
func (g *Gateway) Observability() observability.Provider
Observability returns the current observability.Provider. Always non-nil; defaults to NoOp.
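func (*Gateway) RegisterPlugin
func (g *Gateway) RegisterPlugin(stage plugin.Stage, p plugin.Plugin) error
RegisterPlugin registers a plugin at the given lifecycle stage.
func (*Gateway) RegisterProvider
func (g *Gateway) RegisterProvider(p providers.Provider)
RegisterProvider registers a provider with the gateway.
func (*Gateway) ReloadConfig
func (g *Gateway) ReloadConfig(cfg Config) error
ReloadConfig validates and applies a new configuration, forcing a strategy rebuild on the next request.
func (*Gateway) Route
func (g *Gateway) Route(ctx context.Context, req providers.Request) (*providers.Response, error)
Route routes a request to the appropriate provider based on the configuration.
func (*Gateway) RouteStream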
func (g *Gateway) RouteStream(ctx context.Context, req providers.Request) (<-chan providers.StreamChunk, error)
RouteStream runs before-request plugins then returns a metered streaming response channel. Provider resolution follows the configured strategy mode, then falls back to any registered provider that supports the requested model and streaming. Prometheus metrics and event hooks are emitted when the returned channel drains (matching the behaviour of Route for non-streaming).
When MCP servers are configured, the request is routed through Route instead so that the full agentic tool-call loop can run. The final response is wrapped into a single-chunk stream and returned to the caller (Phase 1 behaviour; true final-response streaming is Phase 1.5).
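The single-chunk wrapping mentioned above can be pictured with a short sketch. The `StreamChunk` struct here is a hypothetical stand-in with one assumed field (the real providers.StreamChunk fields are not listed on this page), and `singleChunk` and `collect` are illustrative helpers.

```go
package main

import "fmt"

// StreamChunk is a hypothetical stand-in for providers.StreamChunk.
type StreamChunk struct {
	Content string
}

// singleChunk wraps a complete response in a stream that yields exactly one
// chunk and is then closed.
func singleChunk(content string) <-chan StreamChunk {
	ch := make(chan StreamChunk, 1) // buffered so the send does not block
	ch <- StreamChunk{Content: content}
	close(ch)
	return ch
}

// collect drains a stream and returns the joined content and chunk count.
func collect(ch <-chan StreamChunk) (string, int) {
	text, n := "", 0
	for chunk := range ch {
		text += chunk.Content
		n++
	}
	return text, n
}

func main() {
	text, n := collect(singleChunk("final answer after tool calls"))
	fmt.Printf("%d chunk(s): %s\n", n, text)
}
```

Because the consumer just ranges over the channel until it is closed, the same loop handles both a true multi-chunk stream and the wrapped single-chunk case.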
func (*Gateway) SetObservability (added in v1.1.0)
func (g *Gateway) SetObservability(p observability.Provider)
SetObservability installs an observability.Provider on the gateway. Pass observability.NoOp() to disable. The provider's StartRequestSpan is called at the top of Route and RouteStream; span attributes are populated incrementally as the request progresses through routing, provider execution, plugins, and final cost/usage calculation.
Safe to call only at startup, before serving traffic. The cmd/ferrogw wire-up constructs the provider via internal/otel.Init.
func (*Gateway) StartDiscovery (added in v0.3.0)
func (g *Gateway) StartDiscovery(ctx context.Context, interval time.Duration) error
StartDiscovery periodically refreshes model lists from providers that implement DiscoveryProvider. It runs in a background goroutine until ctx is cancelled. interval must be greater than zero; an error is returned otherwise.
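The documented contract (validate the interval, run in the background, stop when the context is cancelled) matches a standard ticker loop. The sketch below illustrates that shape only. The `startDiscovery` and `runFor` functions and the `refresh` callback are hypothetical, and the real method refreshes provider model lists rather than calling a callback.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// startDiscovery calls refresh every interval until ctx is cancelled.
// It rejects non-positive intervals, as the documentation describes.
func startDiscovery(ctx context.Context, interval time.Duration, refresh func()) error {
	if interval <= 0 {
		return errors.New("discovery interval must be greater than zero")
	}
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				refresh()
			}
		}
	}()
	return nil
}

// runFor runs discovery for totalMs milliseconds and reports how many
// refreshes were observed.
func runFor(intervalMs, totalMs int) (int64, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(totalMs)*time.Millisecond)
	defer cancel()

	var count int64
	err := startDiscovery(ctx, time.Duration(intervalMs)*time.Millisecond, func() {
		atomic.AddInt64(&count, 1)
	})
	if err != nil {
		return 0, err
	}
	<-ctx.Done() // wait for the timeout, which also stops the loop
	return atomic.LoadInt64(&count), nil
}

func main() {
	if _, err := runFor(0, 10); err != nil {
		fmt.Println("rejected:", err)
	}
	n, _ := runFor(10, 105)
	fmt.Println("at least one refresh:", n > 0)
}
```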
type ObservabilityConfig (added in v1.1.0)
type ObservabilityConfig struct {
// Tracing holds the OTLP tracing configuration. v1.1.0 ships
// tracing only; metrics and logs exporters arrive in later
// releases (see docs/OSS-ECOSYSTEM-ROADMAP.md).
Tracing TracingConfig `json:"tracing,omitempty" yaml:"tracing,omitempty"`
// Exporters lists the plugin observability exporters that should
// receive gateway events (request completed / request failed).
// Each entry names an exporter registered via
// observability.RegisterExporter and carries its own Config block.
// Exporters that are not registered at startup emit a warning and
// are skipped; they do not prevent the gateway from starting.
Exporters []ExporterConfig `json:"exporters,omitempty" yaml:"exporters,omitempty"`
}
ObservabilityConfig is the user-facing observability section of gateway config. It mirrors internal/otel.Config but lives here so the public Config schema does not pull in internal packages.
Standard OTEL_* environment variables (notably OTEL_EXPORTER_OTLP_ENDPOINT) always take precedence; this matches the OTel SDK convention required for predictable container deployments.
type PluginConfig
type PluginConfig struct {
Name string `json:"name" yaml:"name"`
Type string `json:"type" yaml:"type"`
Stage string `json:"stage" yaml:"stage"`
Enabled bool `json:"enabled" yaml:"enabled"`
Config map[string]interface{} `json:"config" yaml:"config"`
}
PluginConfig holds plugin configuration.
type RetryConfig
type RetryConfig struct {
// Attempts is the maximum number of attempts per target (1 = no retries).
Attempts int `json:"attempts" yaml:"attempts"`
// OnStatusCodes, when non-empty, limits retries to the listed HTTP status
// codes. A retry is skipped when the provider returns a code not in the
// list, and the strategy moves on to the next target immediately.
// Leave empty to retry on any error (default behaviour).
// Example: [429, 502, 503]
OnStatusCodes []int `json:"on_status_codes,omitempty" yaml:"on_status_codes,omitempty"`
// InitialBackoffMs is the base backoff in milliseconds for the exponential
// back-off formula: delay = InitialBackoffMs * 2^(attempt-1).
// Defaults to 100 ms when unset or zero.
InitialBackoffMs int `json:"initial_backoff_ms,omitempty" yaml:"initial_backoff_ms,omitempty"`
}
RetryConfig defines retry behavior for the fallback strategy.
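The backoff formula and the OnStatusCodes filter from the field comments can be worked through in a few lines. The `backoff` and `shouldRetry` helpers are hypothetical. Treating a negative InitialBackoffMs like zero and clamping attempt numbers below 1 are assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"time"
)

// backoff computes delay = InitialBackoffMs * 2^(attempt-1), using the
// documented 100 ms default when the base is unset or zero.
func backoff(initialBackoffMs, attempt int) time.Duration {
	if initialBackoffMs <= 0 {
		initialBackoffMs = 100
	}
	if attempt < 1 {
		attempt = 1
	}
	return time.Duration(initialBackoffMs) * time.Millisecond * time.Duration(1<<(attempt-1))
}

// shouldRetry reports whether a failed call with the given HTTP status is
// retried. An empty list means every error is retried.
func shouldRetry(status int, onStatusCodes []int) bool {
	if len(onStatusCodes) == 0 {
		return true
	}
	for _, code := range onStatusCodes {
		if code == status {
			return true
		}
	}
	return false
}

func main() {
	for attempt := 1; attempt <= 4; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt, backoff(0, attempt))
	}
	codes := []int{429, 502, 503}
	fmt.Println("retry 429:", shouldRetry(429, codes))
	fmt.Println("retry 400:", shouldRetry(400, codes))
	fmt.Println("retry 400 with empty list:", shouldRetry(400, nil))
}
```

With the default base, the delays for attempts 1 through 4 are 100 ms, 200 ms, 400 ms, and 800 ms.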
type StrategyConfig
type StrategyConfig struct {
Mode StrategyMode `json:"mode" yaml:"mode"`
Conditions []Condition `json:"conditions,omitempty" yaml:"conditions,omitempty"` // For conditional routing
// ContentConditions defines rules for the content-based routing strategy.
// Rules are evaluated in order; the first match wins.
ContentConditions []ContentCondition `json:"content_conditions,omitempty" yaml:"content_conditions,omitempty"`
// ABVariants defines the weighted variants for the ab-test strategy.
ABVariants []ABVariantConfig `json:"ab_variants,omitempty" yaml:"ab_variants,omitempty"`
}
StrategyConfig defines the routing strategy.
type StrategyMode
type StrategyMode string
StrategyMode represents the routing strategy mode.
const (
	ModeSingle        StrategyMode = "single"
	ModeFallback      StrategyMode = "fallback"
	ModeLoadBalance   StrategyMode = "loadbalance"
	ModeConditional   StrategyMode = "conditional"
	ModeLatency       StrategyMode = "least-latency"
	ModeCostOptimized StrategyMode = "cost-optimized"
	ModeContentBased  StrategyMode = "content-based"
	ModeABTest        StrategyMode = "ab-test"
)
StrategyMode constants define the supported routing strategies.
type Target
type Target struct {
// VirtualKey is the unique identifier for the provider (or a virtual key in the vault).
VirtualKey string `json:"virtual_key" yaml:"virtual_key"`
// Weight is used for load balancing.
Weight float64 `json:"weight,omitempty" yaml:"weight,omitempty"`
// Retry configuration for this target.
Retry *RetryConfig `json:"retry,omitempty" yaml:"retry,omitempty"`
// CircuitBreaker configuration for this target (optional).
CircuitBreaker *CircuitBreakerConfig `json:"circuit_breaker,omitempty" yaml:"circuit_breaker,omitempty"`
}
Target represents a specific provider target.
type TracingConfig (added in v1.1.0)
type TracingConfig struct {
// Enabled is the master switch. Defaults to true; the pipeline
// still short-circuits to NoOp when no OTLP endpoint is configured.
Enabled bool `json:"enabled" yaml:"enabled"`
// Endpoint overrides OTEL_EXPORTER_OTLP_ENDPOINT (host:port form).
Endpoint string `json:"endpoint,omitempty" yaml:"endpoint,omitempty"`
// Protocol selects the OTLP transport: "grpc" (default) or "http/protobuf".
Protocol string `json:"protocol,omitempty" yaml:"protocol,omitempty"`
// ServiceName populates the OTel service.name resource attribute.
ServiceName string `json:"service_name,omitempty" yaml:"service_name,omitempty"`
// SampleRatio is the head sampler ratio (0.0 to 1.0). Pointer so an
// explicit 0.0 (sample nothing) is distinguishable from an omitted
// field; nil falls back to the default of 1.0 (sample everything).
SampleRatio *float64 `json:"sample_ratio,omitempty" yaml:"sample_ratio,omitempty"`
// PrivacyLevel controls whether prompt/response content is exported.
// One of: "none", "metadata" (default), "full".
PrivacyLevel string `json:"privacy_level,omitempty" yaml:"privacy_level,omitempty"`
// ShutdownGrace is the maximum time the gateway waits for in-flight
// OTel exports to drain during graceful shutdown. Accepts any Go
// duration string, e.g. "10s", "500ms". Defaults to 10s when empty
// or unparseable.
ShutdownGrace string `json:"shutdown_grace,omitempty" yaml:"shutdown_grace,omitempty"`
// Headers are additional HTTP/gRPC metadata headers sent with every OTLP
// export request. Use this to authenticate with managed backends such as
// Datadog, New Relic, Honeycomb, or Grafana Cloud.
//
// SECURITY: prefer ${ENV_VAR} references for secret values; only the
// template (e.g. "${DATADOG_API_KEY}") is persisted in config and returned
// by the admin config API; the secret is resolved from the environment at
// export time and never stored. A literal value IS persisted verbatim and
// exposed via /admin/config, so do not hard-code raw secrets here. The
// standard OTEL_EXPORTER_OTLP_HEADERS environment variable also applies per
// OTel convention.
Headers map[string]string `json:"headers,omitempty" yaml:"headers,omitempty"`
}
TracingConfig configures the OTLP tracing pipeline. All fields are optional; sensible defaults apply when omitted (see internal/otel.DefaultConfig).
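Two of the defaults above are easy to get wrong: the pointer-typed SampleRatio and the string-typed ShutdownGrace. The sketch below shows how they could be resolved. The `tracing` struct copies only those two fields and their `json` tags, and the helper names are hypothetical; the gateway's own logic lives in internal/otel and is not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// tracing mirrors two TracingConfig fields for the purpose of this sketch.
type tracing struct {
	SampleRatio   *float64 `json:"sample_ratio,omitempty"`
	ShutdownGrace string   `json:"shutdown_grace,omitempty"`
}

// effectiveSampleRatio returns 1.0 when the field was omitted (nil) and the
// explicit value otherwise, so an explicit 0.0 is preserved.
func effectiveSampleRatio(t tracing) float64 {
	if t.SampleRatio == nil {
		return 1.0
	}
	return *t.SampleRatio
}

// effectiveShutdownGrace parses the duration string and falls back to 10s
// when it is empty or unparseable.
func effectiveShutdownGrace(t tracing) time.Duration {
	d, err := time.ParseDuration(t.ShutdownGrace)
	if err != nil {
		return 10 * time.Second
	}
	return d
}

// parseTracing decodes a JSON tracing section.
func parseTracing(doc string) (tracing, error) {
	var t tracing
	err := json.Unmarshal([]byte(doc), &t)
	return t, err
}

func main() {
	docs := []string{
		`{}`,
		`{"sample_ratio": 0}`,
		`{"sample_ratio": 0.25, "shutdown_grace": "500ms"}`,
		`{"shutdown_grace": "soon"}`,
	}
	for _, doc := range docs {
		t, err := parseTracing(doc)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-50s ratio=%v grace=%v\n", doc, effectiveSampleRatio(t), effectiveShutdownGrace(t))
	}
}
```

With a plain `float64` field, the first two documents would decode to the same zero value. The pointer is what lets an omitted field mean "sample everything" while an explicit 0 means "sample nothing".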
Directories
| Path | Synopsis |
|---|---|
| cmd/ferrogw (command) | Package main is the entry point for the ferrogw gateway server and CLI. |
| internal/admin | Package admin provides HTTP handlers for the gateway administration API. |
| internal/apierror | Package apierror provides OpenAI-compatible JSON error response helpers. |
| internal/bootstrap | Package bootstrap provides env-driven factory functions for persistence backends. |
| internal/cache | Package cache provides the CacheEntry and Cache interface used by the response-cache plugin. |
| internal/circuitbreaker | Package circuitbreaker implements the circuit-breaker pattern for provider calls. |
| internal/cli | Package cli provides shared types and helpers for the ferrogw CLI commands. |
| internal/dashboard | Package dashboard provides template rendering and asset helpers for the gateway web dashboard. |
| internal/discovery | Package discovery provides shared helpers for providers that support live model enumeration via OpenAI-compatible GET /v1/models (or similar) endpoints. |
| internal/events | Package events defines compact internal hook event payloads for the gateway hot path and converts them to the public map form only at dispatch time. |
| internal/handler | Package handler provides HTTP handler functions for the OpenAI-compatible API. |
| internal/httpclient | Package httpclient provides the shared process-wide HTTP client used by providers so connection pooling is reused consistently under load. |
| internal/httpserver | Package httpserver provides HTTP server construction helpers for the gateway. |
| internal/latency | Package latency provides a thread-safe rolling-window latency tracker used by the least-latency routing strategy to pick the fastest provider. |
| internal/logging | Package logging provides structured JSON logging with trace ID propagation. |
| internal/mcp | Package mcp implements the Model Context Protocol (MCP) 2025-11-25 Streamable HTTP transport for the Ferro Labs AI Gateway. |
| internal/metrics | Package metrics registers the Prometheus metrics used by the gateway. |
| internal/middleware | Package middleware provides HTTP middleware for the gateway server. |
| internal/otel | Package otel wires the gateway core to OpenTelemetry. |
| internal/plugins/budget | Package budget provides a gateway plugin that enforces per-API-key USD spend limits using in-memory accumulation. |
| internal/plugins/cache | Package cache provides a response-cache plugin that stores LLM responses in memory and serves them on exact-match cache hits, reducing provider cost and latency for repeated requests. |
| internal/plugins/logger | Package logger provides a request-logger plugin that records each LLM request and response to standard output. |
| internal/plugins/maxtoken | Package maxtoken provides a max-token guardrail plugin that caps the max_tokens and message count on outgoing requests. |
| internal/plugins/ratelimit | Package ratelimit provides a gateway plugin that enforces per-request rate limits using an in-memory token bucket. |
| internal/plugins/wordfilter | Package wordfilter provides a word-filter guardrail plugin that rejects requests containing blocked words. |
| internal/proxy | Package proxy provides a transparent pass-through HTTP reverse proxy that forwards unhandled /v1/* requests to the matching upstream provider. |
| internal/ratelimit | Package ratelimit provides a simple in-memory token-bucket rate limiter. |
| internal/redact | Package redact strips sensitive substrings from text before it is emitted to logs or observability backends. |
| internal/requestlog | Package requestlog provides persistent storage primitives for request/response logs. |
| internal/sse | Package sse provides Server-Sent Events streaming for OpenAI-compatible responses. |
| internal/strategies | Package strategies implements the routing strategies used by the gateway. |
| internal/streamwrap | Package streamwrap provides a metering wrapper for streaming LLM responses. |
| internal/testutil | Package testutil provides shared test helpers. |
| internal/transport | Package transport owns all HTTP transports used for upstream provider calls. |
| internal/version | Package version holds build-time version information for Ferro Labs AI Gateway binaries. |
| mcp | Package mcp exposes the public configuration types for Ferro Labs AI Gateway's MCP (Model Context Protocol) integration. |
| models | Package models provides the model catalog: a structured map of every supported model's pricing, capabilities, and lifecycle metadata. |
| observability | Package observability is the public, semver-stable surface for the Ferro Labs AI Gateway observability subsystem. |
| plugin | Package plugin defines the Plugin interface and the lifecycle stages used to hook into the gateway request pipeline. |
| providers | Package providers re-exports all contracts and types from providers/core as type aliases so that existing code importing this package continues to compile without any changes. |
| providers/ai21 | Package ai21 provides a client for the AI21 Labs API (Jamba and Jurassic models). |
| providers/anthropic | Package anthropic provides a client for the Anthropic API (Claude models). |
| providers/azure_foundry | Package azurefoundry provides a client for the Azure AI Foundry API. |
| providers/azure_openai | Package azureopenai provides a client for the Azure OpenAI API. |
| providers/bedrock | Package bedrock provides a client for AWS Bedrock. |
| providers/cerebras | Package cerebras provides a client for the Cerebras inference API. |
| providers/cloudflare | Package cloudflare provides a client for Cloudflare Workers AI. |
| providers/cohere | Package cohere provides a client for the Cohere API. |
| providers/core | Package core defines the stable public contracts for the providers layer: interfaces, shared data types, and supporting helpers. |
| providers/databricks | Package databricks provides a client for the Databricks model serving API. |
| providers/deepinfra | Package deepinfra provides a client for the DeepInfra OpenAI-compatible API. |
| providers/deepseek | Package deepseek provides a client for the DeepSeek API. |
| providers/fireworks | Package fireworks provides a client for the Fireworks AI API. |
| providers/gemini | Package gemini provides a client for the Google Gemini API. |
| providers/groq | Package groq provides a client for the Groq API. |
| providers/hugging_face | Package huggingface provides a client for the Hugging Face Inference API. |
| providers/mistral | Package mistral provides a client for the Mistral AI API. |
| providers/moonshot | Package moonshot provides a client for the Moonshot AI OpenAI-compatible API. |
| providers/novita | Package novita provides a client for the Novita OpenAI-compatible API. |
| providers/nvidia_nim | Package nvidianim provides a client for the NVIDIA NIM OpenAI-compatible API. |
| providers/ollama | Package ollama provides a client for the Ollama local LLM server. |
| providers/ollama_cloud | Package ollamacloud provides a client for the Ollama Cloud API. |
| providers/openai | Package openai provides a client for the OpenAI API using the official Go SDK. |
| providers/openrouter | Package openrouter provides a client for the OpenRouter API. |
| providers/perplexity | Package perplexity provides a client for the Perplexity AI API. |
| providers/qwen | Package qwen provides a client for the Alibaba Cloud DashScope OpenAI-compatible API. |
| providers/replicate | Package replicate provides a client for the Replicate API. |
| providers/sambanova | Package sambanova provides a client for the SambaNova OpenAI-compatible API. |
| providers/together | Package together provides a client for the Together AI API. |
| providers/vertex_ai | Package vertexai provides a client for Google Vertex AI. |
| providers/xai | Package xai provides a client for the xAI (Grok) API. |
| scripts/catalog-check (command) | catalog-check reads every "source" URL from models/catalog.json and performs a HEAD request against each one. |
| web | Package web contains embedded web UI template assets. |
