aigateway

package module
v0.5.0
Published: Mar 3, 2026 License: Apache-2.0 Imports: 22 Imported by: 0

README

Ferro Labs AI Gateway

The high-performance, open-source control plane for your AI applications.
Route, observe, and secure requests across 100+ LLM providers via a single OpenAI-compatible API.




Ferro Gateway is a fast, lightweight routing tier built in Go. It sits between your applications and upstream foundation models, turning fragmented API integrations into a unified, secure, and observable infrastructure layer.

Zero SDK changes required. Drop it into your existing OpenAI-based code by changing one line.

✨ Core Capabilities

  • Unified API: Connect to 100+ top-tier models (OpenAI, Anthropic, Gemini, Mistral, Ollama, DeepSeek, and more) using the exact same standard OpenAI request/response format.
  • Smart Routing Engine: Mitigate downtime and optimize costs using 4 robust routing strategies: Single, Fallback (w/ exponential backoff), Weighted Load Balancing, and Conditional (model-based).
  • Transparent Pass-Through Proxy: Automatically forwards requests for non-chat endpoints (like /v1/audio, /v1/images, /v1/files, etc.) directly to the provider. The gateway securely injects your auth credentials while proxying raw bytes!
  • Observability Built-In: Structured JSON logs with per-request trace IDs, Prometheus /metrics endpoint (request count, latency histograms, token usage), and a deep /health endpoint with per-provider status.
  • Resilience by Default: Per-provider circuit breakers (Closed/Open/HalfOpen) auto-disable failing backends. Token-bucket rate limiting is available as both an HTTP middleware (per-IP) and a plugin (per-provider).
  • Extensible Middleware: Intercept requests via pluggable plugins for Guardrails (PII/word filtering), Token Limiting, exact-match Caching, Rate Limiting, and Request Logging.
  • Secure Access Manager: Centrally issue scoped, auto-expiring API keys with native RBAC. Zero external database required for stand-up.

✅ Integrated Model Providers

The following providers are integrated in the gateway codebase:

Provider
OpenAI
Anthropic
Gemini (Google)
Mistral
Groq
Cohere
DeepSeek
Together AI
Perplexity
Fireworks
AI21
Azure OpenAI
Ollama (local/self-hosted)
Replicate
AWS Bedrock

Providers are enabled when their corresponding environment variables/credentials are configured.


⚡ Quick Start

Run via Docker

The fastest way to get started is pulling the official image from GitHub Container Registry.

docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY=sk-your-key \
  ghcr.io/ferro-labs/ai-gateway:latest

Build from Source

Ensure you have Go 1.24+ installed.

git clone https://github.com/ferro-labs/ai-gateway.git
cd ai-gateway

export OPENAI_API_KEY=sk-your-key
make run
# Server listens locally on :8080

🧾 Persistent Request Logging

The built-in request-logger plugin can persist request lifecycle events (before_request, after_request, on_error) into SQLite or PostgreSQL.

Plugin config (YAML)

plugins:
  - name: request-logger
    type: logging
    stage: before_request
    enabled: true
    config:
      level: info
      persist: true
      backend: sqlite   # sqlite | postgres
      dsn: ferrogw-requests.db

For PostgreSQL:

plugins:
  - name: request-logger
    type: logging
    stage: before_request
    enabled: true
    config:
      level: info
      persist: true
      backend: postgres
      dsn: postgresql://user:pass@localhost:5432/ferrogw?sslmode=disable

🧪 Postgres Integration Tests

PostgreSQL store integration tests are opt-in. Set FERROGW_TEST_POSTGRES_DSN and run the admin package tests.

export FERROGW_TEST_POSTGRES_DSN='postgresql://user:pass@localhost:5432/ferrogw_test?sslmode=disable'
go test ./internal/admin

Without that environment variable, Postgres integration tests are skipped automatically.


⚙️ Storage Backend Env Quick Reference

Use these environment variables to enable persistent backends:

Area                 | Backend Env               | DSN Env               | Supported Values
Runtime config store | CONFIG_STORE_BACKEND      | CONFIG_STORE_DSN      | memory (default), sqlite, postgres
API key store        | API_KEY_STORE_BACKEND     | API_KEY_STORE_DSN     | memory (default), sqlite, postgres
Request log store    | REQUEST_LOG_STORE_BACKEND | REQUEST_LOG_STORE_DSN | sqlite, postgres (unset = disabled)
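
As a rough illustration of the defaults in this table, the following Go sketch resolves a backend and DSN for each storage area. The helper name resolveStore and the storeSettings type are hypothetical and are not part of the gateway; only the environment variable names and default values come from the table.

```go
package main

import (
	"fmt"
	"os"
)

// storeSettings holds the backend and DSN resolved for one storage area.
// Hypothetical type used only for this sketch.
type storeSettings struct {
	Backend string
	DSN     string
}

// resolveStore reads <prefix>_BACKEND and <prefix>_DSN through the supplied
// lookup function. An unset or empty backend falls back to defaultBackend.
func resolveStore(lookup func(string) string, prefix, defaultBackend string) storeSettings {
	backend := lookup(prefix + "_BACKEND")
	if backend == "" {
		backend = defaultBackend
	}
	return storeSettings{Backend: backend, DSN: lookup(prefix + "_DSN")}
}

func main() {
	// Defaults follow the table: memory for the config and API key stores,
	// empty (disabled) for the request log store.
	fmt.Printf("config:       %+v\n", resolveStore(os.Getenv, "CONFIG_STORE", "memory"))
	fmt.Printf("api keys:     %+v\n", resolveStore(os.Getenv, "API_KEY_STORE", "memory"))
	fmt.Printf("request logs: %+v\n", resolveStore(os.Getenv, "REQUEST_LOG_STORE", ""))
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the helper easy to exercise without touching the process environment.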

Example (fully persistent local setup with SQLite):

export CONFIG_STORE_BACKEND=sqlite
export CONFIG_STORE_DSN=./ferrogw-config.db

export API_KEY_STORE_BACKEND=sqlite
export API_KEY_STORE_DSN=./ferrogw-keys.db

export REQUEST_LOG_STORE_BACKEND=sqlite
export REQUEST_LOG_STORE_DSN=./ferrogw-requests.db

Example (production-style PostgreSQL setup):

export CONFIG_STORE_BACKEND=postgres
export CONFIG_STORE_DSN='postgresql://user:pass@db:5432/ferrogw?sslmode=require'

export API_KEY_STORE_BACKEND=postgres
export API_KEY_STORE_DSN='postgresql://user:pass@db:5432/ferrogw?sslmode=require'

export REQUEST_LOG_STORE_BACKEND=postgres
export REQUEST_LOG_STORE_DSN='postgresql://user:pass@db:5432/ferrogw?sslmode=require'

Production note:

  • You can use a single shared DSN for all three stores (simpler operations).
  • For stronger isolation, use separate databases or schemas per area (config, API keys, request logs) with least-privilege credentials.
  • Prefer TLS-enabled Postgres connections in production (sslmode=require at minimum; sslmode=verify-full when certificate validation is configured).
  • Use sslmode=disable only when transport encryption is enforced outside Postgres (for example, mTLS service mesh or a trusted local Unix socket).

First-run bootstrap keys (optional)

For first-time setup, you can provide bootstrap bearer keys for /admin/* routes:

export ADMIN_BOOTSTRAP_KEY='change-me-to-a-long-random-value'
export ADMIN_BOOTSTRAP_READ_ONLY_KEY='change-me-to-a-long-random-value'
export ADMIN_BOOTSTRAP_ENABLED=true

  • ADMIN_BOOTSTRAP_KEY is treated as admin scope.
  • ADMIN_BOOTSTRAP_READ_ONLY_KEY is treated as read_only scope.
  • Bootstrap keys are only honored while the API key store is empty.
  • Set ADMIN_BOOTSTRAP_ENABLED=false to force-disable bootstrap auth.
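
The rules above can be read as a small decision function. The Go sketch below is one possible interpretation, not the gateway's actual implementation: the function name bootstrapScope and its parameters are hypothetical, and the constant-time comparison is a design choice made for this example.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// bootstrapScope returns the scope granted to a presented bearer token under
// the documented bootstrap rules. storedKeys is the number of keys currently
// in the API key store (hypothetical parameter for this sketch).
func bootstrapScope(presented, adminKey, readOnlyKey string, enabled bool, storedKeys int) (string, bool) {
	// Bootstrap auth is off when disabled or once any persistent key exists.
	if !enabled || storedKeys > 0 || presented == "" {
		return "", false
	}
	if adminKey != "" && subtle.ConstantTimeCompare([]byte(presented), []byte(adminKey)) == 1 {
		return "admin", true
	}
	if readOnlyKey != "" && subtle.ConstantTimeCompare([]byte(presented), []byte(readOnlyKey)) == 1 {
		return "read_only", true
	}
	return "", false
}

func main() {
	scope, ok := bootstrapScope("admin-secret", "admin-secret", "ro-secret", true, 0)
	fmt.Println(scope, ok)

	// Once a persistent key exists, the same token is no longer honored.
	scope, ok = bootstrapScope("admin-secret", "admin-secret", "ro-secret", true, 1)
	fmt.Println(scope, ok)
}
```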

Use it as:

curl http://localhost:8080/admin/dashboard \
  -H "Authorization: Bearer $ADMIN_BOOTSTRAP_KEY"

After creating persistent API keys via POST /admin/keys, remove or unset bootstrap values and restart.


🔎 API Key Usage Analytics

The admin API provides a usage analytics endpoint:

GET /admin/keys/usage
Authorization: Bearer <admin-or-readonly-key>

Supported query params:

  • limit (default 20, max 100)
  • offset (default 0)
  • sort (usage or last_used; default usage)
  • active (true or false)
  • since (RFC3339 timestamp; filters by last_used_at >= since)

Example:

curl "http://localhost:8080/admin/keys/usage?limit=10&offset=0&sort=usage&active=true&since=2026-02-01T00:00:00Z" \
  -H "Authorization: Bearer gw-..."

Response contains data (sorted by usage_count desc) and summary totals.
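
For Go clients, the query string can be assembled with the standard library. This is a minimal sketch: the helper name usageURL is hypothetical, and the client-side clamping simply mirrors the documented defaults and maximums (how the server treats out-of-range values is not stated here).

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// usageURL builds a GET /admin/keys/usage URL. limit is clamped to the
// documented range (default 20, max 100); an empty sortBy falls back to
// "usage"; since is optional but must be RFC3339 when present.
func usageURL(base string, limit, offset int, sortBy, since string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	if limit <= 0 {
		limit = 20
	} else if limit > 100 {
		limit = 100
	}
	if offset < 0 {
		offset = 0
	}
	if sortBy == "" {
		sortBy = "usage"
	}
	if sortBy != "usage" && sortBy != "last_used" {
		return "", fmt.Errorf("unsupported sort %q", sortBy)
	}
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	q.Set("offset", strconv.Itoa(offset))
	q.Set("sort", sortBy)
	if since != "" {
		if _, err := time.Parse(time.RFC3339, since); err != nil {
			return "", fmt.Errorf("since must be RFC3339: %w", err)
		}
		q.Set("since", since)
	}
	u.Path = "/admin/keys/usage"
	u.RawQuery = q.Encode() // keys are emitted in sorted order
	return u.String(), nil
}

func main() {
	s, err := usageURL("http://localhost:8080", 10, 0, "usage", "2026-02-01T00:00:00Z")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```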


⏱️ API Key Expiration Management

Update key expiration without rotating or recreating the key:

PUT /admin/keys/{id}
Authorization: Bearer <admin-key>
Content-Type: application/json

Supported expiration fields in request body:

  • expires_at (RFC3339 timestamp) to set/update expiration
  • clear_expiration (true) to remove expiration

Examples:

curl -X PUT "http://localhost:8080/admin/keys/<id>" \
  -H "Authorization: Bearer gw-..." \
  -H "Content-Type: application/json" \
  -d '{"expires_at":"2026-03-15T00:00:00Z"}'

curl -X PUT "http://localhost:8080/admin/keys/<id>" \
  -H "Authorization: Bearer gw-..." \
  -H "Content-Type: application/json" \
  -d '{"clear_expiration":true}'
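
The same two request bodies can be produced from Go. The sketch below uses only the standard library; the type expirationUpdate and the helpers expirationBody and newKeyUpdateRequest are hypothetical names for this example, and only the two JSON field names come from the list above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// expirationUpdate mirrors the two documented body fields.
type expirationUpdate struct {
	ExpiresAt       string `json:"expires_at,omitempty"`
	ClearExpiration bool   `json:"clear_expiration,omitempty"`
}

// expirationBody returns the JSON body for PUT /admin/keys/{id}.
// An empty expiresAt clears the expiration; otherwise it must be RFC3339.
func expirationBody(expiresAt string) ([]byte, error) {
	if expiresAt == "" {
		return json.Marshal(expirationUpdate{ClearExpiration: true})
	}
	if _, err := time.Parse(time.RFC3339, expiresAt); err != nil {
		return nil, fmt.Errorf("expires_at must be RFC3339: %w", err)
	}
	return json.Marshal(expirationUpdate{ExpiresAt: expiresAt})
}

// newKeyUpdateRequest builds (but does not send) the PUT request.
func newKeyUpdateRequest(base, id, token string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPut, base+"/admin/keys/"+url.PathEscape(id), bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	body, err := expirationBody("2026-03-15T00:00:00Z")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	req, err := newKeyUpdateRequest("http://localhost:8080", "example-id", "gw-example", body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL, string(body))
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```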

🔑 API Key Detail API

Fetch a single API key by ID (masked key value):

GET /admin/keys/{id}
Authorization: Bearer <admin-or-readonly-key>

Returns 404 if the key does not exist.


📜 Admin Request Logs API

When request log storage is enabled, the admin API exposes persisted request logs:

GET /admin/logs
Authorization: Bearer <admin-or-readonly-key>

Supported query params:

  • limit (default 50, max 200)
  • offset (default 0)
  • stage (e.g. before_request, after_request, on_error)
  • model
  • provider
  • since (RFC3339 timestamp)

Example:

curl "http://localhost:8080/admin/logs?limit=20&offset=0&stage=on_error&since=2026-02-01T00:00:00Z" \
  -H "Authorization: Bearer gw-..."

If request log storage is disabled, the endpoint returns 501 Not Implemented.

Cleanup old request logs

DELETE /admin/logs?before=<RFC3339>
Authorization: Bearer <admin-key>

Optional filters:

  • stage
  • model
  • provider

Example:

curl -X DELETE "http://localhost:8080/admin/logs?before=2026-02-01T00:00:00Z&stage=on_error" \
  -H "Authorization: Bearer gw-..."

Response includes the number of deleted entries in deleted.

Request log stats

GET /admin/logs/stats
Authorization: Bearer <admin-or-readonly-key>

Optional filters:

  • limit (positive integer; caps by_provider and by_model cardinality)
  • stage
  • model
  • provider
  • since (RFC3339 timestamp)

Response contains aggregated summary, by_stage, by_provider, and by_model counts.


📊 Admin Dashboard Summary API

For a minimal admin dashboard UI, fetch aggregate status from one endpoint:

GET /admin/dashboard
Authorization: Bearer <admin-or-readonly-key>

Response sections:

  • providers (total, available)
  • keys (total, active, expired, total_usage)
  • request_logs (enabled, total)
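
A Go client could decode this response into a struct along the following lines. The section and field names come from the list above; the Go field types (integers and a boolean) and the sample payload in main are assumptions made for illustration, since the exact response schema is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dashboardSummary mirrors the documented response sections.
// Field types are assumed, not confirmed by the API reference.
type dashboardSummary struct {
	Providers struct {
		Total     int `json:"total"`
		Available int `json:"available"`
	} `json:"providers"`
	Keys struct {
		Total      int   `json:"total"`
		Active     int   `json:"active"`
		Expired    int   `json:"expired"`
		TotalUsage int64 `json:"total_usage"`
	} `json:"keys"`
	RequestLogs struct {
		Enabled bool `json:"enabled"`
		Total   int  `json:"total"`
	} `json:"request_logs"`
}

// parseDashboard decodes a GET /admin/dashboard response body.
func parseDashboard(data []byte) (dashboardSummary, error) {
	var s dashboardSummary
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	// Hypothetical payload shaped after the documented sections.
	sample := []byte(`{"providers":{"total":3,"available":2},
		"keys":{"total":5,"active":4,"expired":1,"total_usage":120},
		"request_logs":{"enabled":true,"total":42}}`)
	s, err := parseDashboard(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("providers %d/%d available, %d active keys, logs enabled: %v\n",
		s.Providers.Available, s.Providers.Total, s.Keys.Active, s.RequestLogs.Enabled)
}
```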

🕘 Config History API

Create/update/delete runtime config over admin APIs:

POST /admin/config
PUT /admin/config
DELETE /admin/config
Authorization: Bearer <admin-key>

Notes:

  • POST /admin/config creates a new runtime config version (same payload schema as PUT)
  • PUT /admin/config updates the current runtime config
  • DELETE /admin/config resets to startup config and clears persisted override

Persist config across restarts via env vars:

  • CONFIG_STORE_BACKEND: memory (default), sqlite, postgres
  • CONFIG_STORE_DSN: backend DSN or SQLite file path

Runtime config updates are tracked in-memory and exposed via:

GET /admin/config/history
Authorization: Bearer <admin-or-readonly-key>

Response includes:

  • data[] with version, updated_at, and config
  • summary.total_versions

Rollback to a previous version:

POST /admin/config/rollback/{version}
Authorization: Bearer <admin-key>

Example:

curl -X POST "http://localhost:8080/admin/config/rollback/2" \
  -H "Authorization: Bearer gw-..."

🖥️ Built-in Admin Dashboard UI

A minimal dashboard page is available at:

GET /dashboard

It prompts for an admin/read-only key and then calls GET /admin/dashboard to render provider, key, and request-log summary cards.

The page also loads GET /admin/config/history and includes per-version rollback actions using POST /admin/config/rollback/{version}.


🔌 1-Line Migration

Ferro Labs AI Gateway natively speaks the OpenAI spec. Point your existing client SDKs to the Gateway by changing only the base URL: no SDK changes, no prompt edits, no refactoring.

Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-ferro-...", # Managed via ferro
    base_url="http://localhost:8080/v1",  # ← Only change this line
)

TypeScript / Node.js

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-ferro-...",
  baseURL: "http://localhost:8080/v1",  // ← Only change this line 
});

cURL

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-ferro-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-3-opus-20240229","messages":[{"role":"user","content":"Hello!"}]}'
  # The gateway automatically detects the model and routes to Anthropic!
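
Go

For Go callers without an OpenAI SDK, the cURL request above can be sketched with net/http alone. The types chatMessage and chatRequest and the helper newChatRequest are hypothetical names for this example; they cover only the two JSON fields used in the cURL call, not the full OpenAI request schema.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest cover only the fields used in the cURL example.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest builds (but does not send) a POST to <baseURL>/chat/completions.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:8080/v1", "sk-ferro-...", "claude-3-opus-20240229", "Hello!")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	// To send it against a running gateway: resp, err := http.DefaultClient.Do(req)
}
```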

🛣️ Project Roadmap

Ferro Gateway is under active development. The roadmap moves from foundational releases toward a production-ready v1.0.0:

  • v0.1.0 — Foundation Release: Core routing, multi-provider execution, basic guardrails, and streaming capabilities.
  • v0.2.0 — Observability & Resilience: Structured JSON logging with trace IDs, Prometheus metrics, per-provider circuit breakers, token-bucket rate limiting, deep health checks, and consistent error schema.
  • v0.3.0 — Modality Expansions: Embeddings, Image generation mapping, Cost tracking via pricing tables, and Model aliasing.
  • v0.4.0 — Persistent State: Dedicated Admin API, SQLite/PostgreSQL persistence, persistent request logs, dashboard, and runtime config CRUD.
  • v0.5.0 — Advanced Intelligence: Least-latency and Cost-optimized algorithmic routing, A/B Testing modules, and Semantic Caching.
  • v1.0.0 — Production Ready: Helm charts, OpenTelemetry export, edge caching, and official SDK embeddings.

Review our detailed ROADMAP.md for deeper implementation plans.


🤝 Contributing

We welcome community contributions! The priority areas for ecosystem growth are:

  1. Adding support for new niche LLM providers.
  2. Building new middleware plugins (Guardrails, Modifiers, Analyzers).
  3. Enhancing test coverage and documentation.

Please see our CONTRIBUTING.md for style guidelines and PR processes. By participating, you agree to follow our Code of Conduct.

📄 License

Ferro Labs AI Gateway is proudly open-source and released under the Apache 2.0 License.

Documentation

Overview

Package aigateway provides a high-performance, zero-dependency AI gateway for routing requests to large language model (LLM) providers.

The Gateway type is the main entry point: create one with New, register providers with RegisterProvider, load plugins from config with LoadPlugins, and route requests with Route or RouteStream.

Plugins and routing strategies (single, fallback, load-balance, conditional) are configured via Config which can be loaded from a YAML or JSON file using LoadConfig.

Index

Constants

const (
	SubjectRequestCompleted = "gateway.request.completed"
	SubjectRequestFailed    = "gateway.request.failed"
)

Event subject constants used when invoking gateway hooks.

Variables

This section is empty.

Functions

func ValidateConfig

func ValidateConfig(cfg Config) error

ValidateConfig validates a Config for correctness.

Types

type CircuitBreakerConfig added in v0.2.0

type CircuitBreakerConfig struct {
	// FailureThreshold is the number of consecutive failures before the circuit
	// opens. Defaults to 5.
	FailureThreshold int `json:"failure_threshold" yaml:"failure_threshold"`
	// SuccessThreshold is the number of consecutive successes in half-open state
	// required to close the circuit. Defaults to 1.
	SuccessThreshold int `json:"success_threshold" yaml:"success_threshold"`
	// Timeout is the duration the circuit stays open before transitioning to
	// half-open (e.g. "30s"). Defaults to "30s".
	Timeout string `json:"timeout" yaml:"timeout"`
}

CircuitBreakerConfig configures the per-provider circuit breaker.
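
To make the three fields concrete, here is a minimal, single-goroutine sketch of a Closed/Open/HalfOpen state machine driven by the same thresholds and defaults. It is not the gateway's internal circuitbreaker package: every name below is hypothetical, the clock is passed in as milliseconds to keep the example deterministic, and treating any half-open failure as an immediate re-open is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

func (s state) String() string { return [...]string{"Closed", "Open", "HalfOpen"}[s] }

// breaker is a non-thread-safe illustration of the documented fields.
type breaker struct {
	failureThreshold, successThreshold int
	timeoutMs, openedAtMs              int64
	failures, successes                int
	state                              state
}

// newBreaker applies the documented defaults (5, 1, "30s") to zero values.
func newBreaker(failureThreshold, successThreshold int, timeout string) (*breaker, error) {
	if failureThreshold <= 0 {
		failureThreshold = 5
	}
	if successThreshold <= 0 {
		successThreshold = 1
	}
	if timeout == "" {
		timeout = "30s"
	}
	d, err := time.ParseDuration(timeout)
	if err != nil {
		return nil, fmt.Errorf("invalid timeout %q: %w", timeout, err)
	}
	return &breaker{failureThreshold: failureThreshold, successThreshold: successThreshold, timeoutMs: d.Milliseconds()}, nil
}

// allow reports whether a call may proceed at nowMs. An open circuit moves
// to half-open once the timeout has elapsed.
func (b *breaker) allow(nowMs int64) bool {
	if b.state == open {
		if nowMs-b.openedAtMs < b.timeoutMs {
			return false
		}
		b.state, b.successes = halfOpen, 0
	}
	return true
}

// record feeds the outcome of a call back into the state machine.
func (b *breaker) record(success bool, nowMs int64) {
	switch {
	case success && b.state == halfOpen:
		b.successes++
		if b.successes >= b.successThreshold {
			b.state, b.failures = closed, 0
		}
	case success:
		b.failures = 0 // only consecutive failures count
	case b.state == halfOpen:
		b.state, b.openedAtMs = open, nowMs
	default:
		b.failures++
		if b.failures >= b.failureThreshold {
			b.state, b.openedAtMs, b.failures = open, nowMs, 0
		}
	}
}

func main() {
	b, err := newBreaker(0, 0, "") // all defaults
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i := 0; i < 5; i++ {
		b.record(false, int64(i))
	}
	fmt.Println("after 5 failures:", b.state)
	allowed := b.allow(40000)
	fmt.Println("after timeout:", allowed, b.state)
	b.record(true, 40001)
	fmt.Println("after one success:", b.state)
}
```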

type Condition

type Condition struct {
	Key       string `json:"key" yaml:"key"`
	Value     string `json:"value" yaml:"value"`
	TargetKey string `json:"target_key" yaml:"target_key"`
}

Condition represents a condition for conditional routing.

type Config

type Config struct {
	// Strategy defines how requests are routed (e.g., single, fallback, loadbalance).
	Strategy StrategyConfig `json:"strategy" yaml:"strategy"`
	// Targets is a list of provider targets to route requests to.
	Targets []Target `json:"targets" yaml:"targets"`
	// Plugins configuration (optional).
	Plugins []PluginConfig `json:"plugins,omitempty" yaml:"plugins,omitempty"`
	// Aliases maps friendly model names (e.g. "fast", "smart") to concrete model IDs.
	// Aliases are resolved before routing — they must not reference other aliases.
	Aliases map[string]string `json:"aliases,omitempty" yaml:"aliases,omitempty"`
}

Config holds the configuration for the AI Gateway.
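
The alias rule in the struct comment (aliases are resolved before routing and must not reference other aliases) can be sketched as follows. The type aliasConfig is a hypothetical local mirror of just the aliases field, and the helpers are illustrative; whether ValidateConfig performs this exact check is not stated in this reference.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// aliasConfig mirrors only the "aliases" field of Config for this sketch.
type aliasConfig struct {
	Aliases map[string]string `json:"aliases,omitempty"`
}

// validateAliases rejects any alias whose target is itself an alias name.
func validateAliases(aliases map[string]string) error {
	for name, target := range aliases {
		if _, isAlias := aliases[target]; isAlias {
			return fmt.Errorf("alias %q points at another alias %q", name, target)
		}
	}
	return nil
}

// resolveModel maps a friendly name to its concrete model ID in one step.
// Unknown names are returned unchanged.
func resolveModel(aliases map[string]string, model string) string {
	if target, ok := aliases[model]; ok {
		return target
	}
	return model
}

// parseAliases decodes the aliases from a JSON config and validates them.
func parseAliases(data []byte) (map[string]string, error) {
	var cfg aliasConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	if err := validateAliases(cfg.Aliases); err != nil {
		return nil, err
	}
	return cfg.Aliases, nil
}

func main() {
	aliases, err := parseAliases([]byte(`{"aliases":{"smart":"claude-3-opus-20240229"}}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(resolveModel(aliases, "smart"))
	fmt.Println(resolveModel(aliases, "some-other-model"))
}
```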

func LoadConfig

func LoadConfig(path string) (*Config, error)

LoadConfig reads and parses a config file from the given path. Supported formats: JSON (.json), YAML (.yaml, .yml).

type EventHookFunc added in v0.2.0

type EventHookFunc func(ctx context.Context, subject string, data map[string]interface{})

EventHookFunc is called asynchronously after a gateway event (request completed or failed). It replaces the old EventPublisher interface with a simpler function-based hook pattern.
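
The function-based hook pattern can be illustrated without the gateway itself. In the sketch below, eventHook is a local type with the same signature as EventHookFunc, and dispatcher is a hypothetical stand-in for the part of Gateway that invokes hooks; the data key "provider" is likewise invented for the example.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Subject values copied from the package constants.
const (
	subjectRequestCompleted = "gateway.request.completed"
	subjectRequestFailed    = "gateway.request.failed"
)

// eventHook has the same signature as EventHookFunc.
type eventHook func(ctx context.Context, subject string, data map[string]interface{})

// dispatcher is a hypothetical stand-in for the gateway's hook invocation.
type dispatcher struct {
	hooks []eventHook
	wg    sync.WaitGroup
}

func (d *dispatcher) add(fn eventHook) { d.hooks = append(d.hooks, fn) }

// emit calls every registered hook in its own goroutine so the caller
// is never blocked by a slow hook.
func (d *dispatcher) emit(ctx context.Context, subject string, data map[string]interface{}) {
	for _, fn := range d.hooks {
		d.wg.Add(1)
		go func(fn eventHook) {
			defer d.wg.Done()
			fn(ctx, subject, data)
		}(fn)
	}
}

// countBySubject registers one counting hook, emits the given subjects,
// waits for all hooks to finish, and returns the per-subject counts.
func countBySubject(subjects []string) map[string]int {
	var mu sync.Mutex
	counts := map[string]int{}
	d := &dispatcher{}
	d.add(func(_ context.Context, subject string, _ map[string]interface{}) {
		mu.Lock()
		defer mu.Unlock()
		counts[subject]++ // hooks run concurrently, so guard shared state
	})
	for _, s := range subjects {
		d.emit(context.Background(), s, map[string]interface{}{"provider": "example"})
	}
	d.wg.Wait()
	return counts
}

func main() {
	counts := countBySubject([]string{subjectRequestCompleted, subjectRequestCompleted, subjectRequestFailed})
	fmt.Println("completed:", counts[subjectRequestCompleted], "failed:", counts[subjectRequestFailed])
}
```

Because hooks run asynchronously, any state they share needs its own synchronization, which is why the counting hook takes a mutex.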

type Gateway

type Gateway struct {
	// contains filtered or unexported fields
}

Gateway is the main entry point for routing LLM requests.

func New

func New(cfg Config) (*Gateway, error)

New creates a new Gateway instance with the given configuration.

func (*Gateway) AddHook added in v0.2.0

func (g *Gateway) AddHook(fn EventHookFunc)

AddHook registers an EventHookFunc that is called asynchronously on each completed or failed request. Multiple hooks may be registered; all are invoked for every event.

func (*Gateway) AllModels added in v0.2.0

func (g *Gateway) AllModels() []providers.ModelInfo

AllModels returns ModelInfo from all registered providers. If auto-discovery has run for a provider, discovered models take precedence over the provider's static model list.

func (*Gateway) Catalog added in v0.4.5

func (g *Gateway) Catalog() models.Catalog

Catalog returns a shallow copy of the loaded model catalog. A copy is returned so callers cannot mutate the gateway's internal catalog.

func (*Gateway) Close

func (g *Gateway) Close() error

Close cleans up resources.

func (*Gateway) Embed added in v0.3.0

Embed routes an embedding request to the first registered EmbeddingProvider that supports the requested model.

func (*Gateway) FindByModel added in v0.2.0

func (g *Gateway) FindByModel(model string) (providers.Provider, bool)

FindByModel returns the first registered provider that supports the given model.

func (*Gateway) GenerateImage added in v0.3.0

func (g *Gateway) GenerateImage(ctx context.Context, req providers.ImageRequest) (*providers.ImageResponse, error)

GenerateImage routes an image generation request to the first registered ImageProvider that supports the requested model.

func (*Gateway) Get added in v0.2.0

func (g *Gateway) Get(name string) (providers.Provider, bool)

Get satisfies providers.ProviderSource (alias for GetProvider).

func (*Gateway) GetConfig

func (g *Gateway) GetConfig() Config

GetConfig returns a copy of the current configuration.

func (*Gateway) GetProvider added in v0.2.0

func (g *Gateway) GetProvider(name string) (providers.Provider, bool)

GetProvider returns a registered provider by name.

func (*Gateway) List added in v0.2.0

func (g *Gateway) List() []string

List satisfies providers.ProviderSource (alias for ListProviders).

func (*Gateway) ListProviders added in v0.2.0

func (g *Gateway) ListProviders() []string

ListProviders returns the names of all registered providers.

func (*Gateway) LoadPlugins

func (g *Gateway) LoadPlugins() error

LoadPlugins initializes and registers plugins from the gateway configuration.

func (*Gateway) RegisterPlugin

func (g *Gateway) RegisterPlugin(stage plugin.Stage, p plugin.Plugin) error

RegisterPlugin registers a plugin at the given lifecycle stage.

func (*Gateway) RegisterProvider

func (g *Gateway) RegisterProvider(p providers.Provider)

RegisterProvider registers a provider with the gateway.

func (*Gateway) ReloadConfig

func (g *Gateway) ReloadConfig(cfg Config) error

ReloadConfig validates and applies a new configuration, forcing strategy rebuild on next request.

func (*Gateway) Route

Route routes a request to the appropriate provider based on the configuration.

func (*Gateway) RouteStream

func (g *Gateway) RouteStream(ctx context.Context, req providers.Request) (<-chan providers.StreamChunk, error)

RouteStream runs before-request plugins then returns a metered streaming response channel. Provider resolution follows the configured strategy mode, then falls back to any registered provider that supports the requested model and streaming. Prometheus metrics and event hooks are emitted when the returned channel drains (matching the behaviour of Route for non-streaming).

func (*Gateway) StartDiscovery added in v0.3.0

func (g *Gateway) StartDiscovery(ctx context.Context, interval time.Duration) error

StartDiscovery periodically refreshes model lists from providers that implement DiscoveryProvider. It runs in a background goroutine until ctx is cancelled. interval must be greater than zero; an error is returned otherwise.

type PluginConfig

type PluginConfig struct {
	Name    string                 `json:"name" yaml:"name"`
	Type    string                 `json:"type" yaml:"type"`
	Stage   string                 `json:"stage" yaml:"stage"`
	Enabled bool                   `json:"enabled" yaml:"enabled"`
	Config  map[string]interface{} `json:"config" yaml:"config"`
}

PluginConfig holds plugin configuration.

type RetryConfig

type RetryConfig struct {
	// Attempts is the maximum number of attempts per target (1 = no retries).
	Attempts int `json:"attempts" yaml:"attempts"`
	// OnStatusCodes, when non-empty, limits retries to the listed HTTP status
	// codes. A retry is skipped when the provider returns a code not in the
	// list, and the strategy moves on to the next target immediately.
	// Leave empty to retry on any error (default behaviour).
	// Example: [429, 502, 503]
	OnStatusCodes []int `json:"on_status_codes,omitempty" yaml:"on_status_codes,omitempty"`
	// InitialBackoffMs is the base backoff in milliseconds for the exponential
	// back-off formula: delay = InitialBackoffMs * 2^(attempt-1).
	// Defaults to 100 ms when unset or zero.
	InitialBackoffMs int `json:"initial_backoff_ms,omitempty" yaml:"initial_backoff_ms,omitempty"`
}

RetryConfig defines retry behavior for the fallback strategy.
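
The back-off formula and the OnStatusCodes filter from the field comments can be worked through in a few lines. The helper names backoffMs and shouldRetry are hypothetical, and the sketch does not claim to show when the gateway sleeps relative to each attempt; it only evaluates delay = InitialBackoffMs * 2^(attempt-1) with the documented 100 ms default.

```go
package main

import "fmt"

// backoffMs evaluates InitialBackoffMs * 2^(attempt-1), where attempt starts
// at 1. A zero or negative base falls back to the documented 100 ms default.
func backoffMs(initialBackoffMs, attempt int) int {
	if initialBackoffMs <= 0 {
		initialBackoffMs = 100
	}
	if attempt < 1 {
		attempt = 1
	}
	return initialBackoffMs << (attempt - 1)
}

// shouldRetry applies the OnStatusCodes rule: an empty list retries on any
// error, otherwise only the listed status codes are retried.
func shouldRetry(status int, onStatusCodes []int) bool {
	if len(onStatusCodes) == 0 {
		return true
	}
	for _, code := range onStatusCodes {
		if code == status {
			return true
		}
	}
	return false
}

func main() {
	for attempt := 1; attempt <= 3; attempt++ {
		fmt.Printf("attempt %d: %d ms\n", attempt, backoffMs(0, attempt))
	}
	codes := []int{429, 502, 503}
	fmt.Println("retry on 429:", shouldRetry(429, codes))
	fmt.Println("retry on 400:", shouldRetry(400, codes))
}
```

With the default base, the first three attempts give 100 ms, 200 ms, and 400 ms.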

type StrategyConfig

type StrategyConfig struct {
	Mode       StrategyMode `json:"mode" yaml:"mode"`
	Conditions []Condition  `json:"conditions,omitempty" yaml:"conditions,omitempty"` // For conditional routing
}

StrategyConfig defines the routing strategy.

type StrategyMode

type StrategyMode string

StrategyMode represents the routing strategy mode.

const (
	ModeSingle        StrategyMode = "single"
	ModeFallback      StrategyMode = "fallback"
	ModeLoadBalance   StrategyMode = "loadbalance"
	ModeConditional   StrategyMode = "conditional"
	ModeLatency       StrategyMode = "least-latency"
	ModeCostOptimized StrategyMode = "cost-optimized"
)

StrategyMode constants define the supported routing strategies.

type Target

type Target struct {
	// VirtualKey is the unique identifier for the provider (or a virtual key in the vault).
	VirtualKey string `json:"virtual_key" yaml:"virtual_key"`
	// Weight is used for load balancing.
	Weight float64 `json:"weight,omitempty" yaml:"weight,omitempty"`
	// Retry configuration for this target.
	Retry *RetryConfig `json:"retry,omitempty" yaml:"retry,omitempty"`
	// CircuitBreaker configuration for this target (optional).
	CircuitBreaker *CircuitBreakerConfig `json:"circuit_breaker,omitempty" yaml:"circuit_breaker,omitempty"`
}

Target represents a specific provider target.
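
As an illustration of how the Weight field could drive load balancing, the sketch below maps a random value onto cumulative weights. This is a generic weighted-selection technique, not necessarily the algorithm used by the gateway's loadbalance strategy; the target type, the pickWeighted helper, and the virtual key names are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// target mirrors the two Target fields relevant to weighted selection.
type target struct {
	VirtualKey string
	Weight     float64
}

// pickWeighted maps r in [0, 1) onto the cumulative weights and returns the
// chosen virtual key. Targets with non-positive weight are never chosen.
func pickWeighted(targets []target, r float64) (string, bool) {
	total := 0.0
	for _, t := range targets {
		if t.Weight > 0 {
			total += t.Weight
		}
	}
	if total == 0 {
		return "", false
	}
	point := r * total
	last := ""
	for _, t := range targets {
		if t.Weight <= 0 {
			continue
		}
		if point < t.Weight {
			return t.VirtualKey, true
		}
		point -= t.Weight
		last = t.VirtualKey
	}
	// Floating-point rounding can leave point at the very end of the range.
	return last, true
}

func main() {
	targets := []target{{"openai", 3}, {"anthropic", 1}}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		key, _ := pickWeighted(targets, rand.Float64())
		counts[key]++
	}
	// Roughly a 3:1 split is expected over many draws.
	fmt.Println(counts)
}
```

Taking the random value as a parameter keeps the selection logic deterministic and separate from the random source.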

Directories

Path Synopsis
cmd
ferrogw command
Package main provides the HTTP handlers for legacy OpenAI completions endpoint.
ferrogw-cli command
Package main provides the ferrogw-cli command-line tool for managing the Ferro Labs AI Gateway.
examples
basic command
Package main demonstrates sending a request directly to any configured LLM provider.
custom-plugin command
Package main demonstrates how to write and register a custom plugin with Ferro Labs AI Gateway.
embedded command
Package main demonstrates embedding Ferro Labs AI Gateway inside an existing Go HTTP server.
fallback command
Package main demonstrates the fallback routing strategy.
loadbalance command
Package main demonstrates weighted load balancing across multiple providers.
with-circuit-breaker command
Package main demonstrates per-provider circuit breaker configuration.
with-guardrails command
Package main demonstrates using built-in guardrail plugins.
with-hooks command
Package main demonstrates gateway event hooks.
internal
admin
Package admin provides HTTP handlers for the gateway administration API.
cache
Package cache provides the CacheEntry and Cache interface used by the response-cache plugin.
circuitbreaker
Package circuitbreaker implements the circuit-breaker pattern for provider calls.
latency
Package latency provides a thread-safe rolling-window latency tracker used by the least-latency routing strategy to pick the fastest provider.
logging
Package logging provides structured JSON logging with trace ID propagation.
metrics
Package metrics registers the Prometheus metrics used by the gateway.
plugins/cache
Package cache provides a response-cache plugin that stores LLM responses in memory and serves them on exact-match cache hits, reducing provider cost and latency for repeated requests.
plugins/logger
Package logger provides a request-logger plugin that records each LLM request and response to standard output.
plugins/maxtoken
Package maxtoken provides a max-token guardrail plugin that caps the max_tokens and message count on outgoing requests.
plugins/ratelimit
Package ratelimit provides a gateway plugin that enforces per-request rate limits using an in-memory token bucket.
plugins/wordfilter
Package wordfilter provides a word-filter guardrail plugin that rejects requests containing blocked words.
ratelimit
Package ratelimit provides a simple in-memory token-bucket rate limiter.
requestlog
Package requestlog provides persistent storage primitives for request/response logs.
strategies
Package strategies implements the routing strategies used by the gateway.
streamwrap
Package streamwrap provides a metering wrapper for streaming LLM responses.
version
Package version holds build-time version information for Ferro Labs AI Gateway binaries.
Package models provides the model catalog — a structured map of every supported model's pricing, capabilities, and lifecycle metadata.
Package plugin defines the Plugin interface and the lifecycle stages used to hook into the gateway request pipeline.
Package providers defines the Provider interface and shared data types used across all LLM provider implementations.
scripts
catalog-check command
catalog-check reads every "source" URL from models/catalog.json and performs a HEAD request against each one.
Package web contains embedded web UI template assets.
