aigateway

Version: v0.2.0 (latest) | Published: Feb 27, 2026 | License: Apache-2.0 | Imports: 17 | Imported by: 0

Repository

github.com/ferro-labs/ai-gateway

README ¶

Ferro AI Gateway

The high-performance, open-source control plane for your AI applications.
Route, observe, and secure requests across 100+ LLM providers via a single OpenAI-compatible API.

Ferro Gateway is a remarkably fast, lightweight routing tier built in Go. It acts as an intelligent intermediary between your applications and upstream foundation models, effectively transforming fragmented API integration into a unified, secure, and observable infrastructure layer.

Zero SDK changes required. Drop it into your existing OpenAI-based code by changing one line.

✨ Core Capabilities

Unified API: Connect to 100+ top-tier models (OpenAI, Anthropic, Gemini, Mistral, Ollama, DeepSeek, and more) using the exact same standard OpenAI request/response format.
Smart Routing Engine: Mitigate downtime and optimize costs using 4 routing strategies: Single, Fallback (with exponential backoff), Weighted Load Balancing, and Conditional (model-based).
Transparent Pass-Through Proxy: Automatically forwards requests for non-chat endpoints (like /v1/audio, /v1/images, /v1/files, etc.) directly to the provider. The gateway securely injects your auth credentials while proxying raw bytes!
Observability Built-In: Structured JSON logs with per-request trace IDs, Prometheus /metrics endpoint (request count, latency histograms, token usage), and a deep /health endpoint with per-provider status.
Resilience by Default: Per-provider circuit breakers (Closed/Open/HalfOpen) auto-disable failing backends. Token-bucket rate limiting is available as both an HTTP middleware (per-IP) and a plugin (per-provider).
Extensible Middleware: Intercept requests via pluggable plugins for Guardrails (PII/word filtering), Token Limiting, exact-match Caching, Rate Limiting, and Request Logging.
Secure Access Manager: Centrally issue scoped, auto-expiring API keys with native RBAC. Zero external database required for stand-up.

⚡ Quick Start

Run via Docker

The fastest way to get started is pulling the official image from GitHub Container Registry.

docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY=sk-your-key \
  ghcr.io/ferro-labs/ai-gateway:latest

# To run the absolute latest unreleased code from main:
# ghcr.io/ferro-labs/ai-gateway:edge

Build from Source

Ensure you have Go 1.24+ installed.

git clone https://github.com/ferro-labs/ai-gateway.git
cd ai-gateway

export OPENAI_API_KEY=sk-your-key
make run
# Server listens locally on :8080

🔌 1-Line Migration

FerroGateway natively speaks the OpenAI spec. Point your existing client SDKs at the gateway by changing only the base URL: no SDK changes, no prompt edits, no refactoring.

Python

from openai import OpenAI

client = OpenAI(
    api_key="sk-ferro-...", # Managed via ferro
    base_url="http://localhost:8080/v1",  # ← Only change this line
)

TypeScript / Node.js

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-ferro-...",
  baseURL: "http://localhost:8080/v1",  // ← Only change this line 
});

cURL

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer sk-ferro-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-3-opus-20240229","messages":[{"role":"user","content":"Hello!"}]}'
  # The gateway automatically detects the model and routes to Anthropic!
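
Go

For Go callers that do not use an SDK, the same request as the cURL example can be built with the standard library. This is a minimal sketch: the chatMessage and chatRequest types and the newChatRequest helper are hypothetical names introduced here, and the request body carries only the fields shown in the cURL example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest mirror only the fields used in the cURL example.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest (hypothetical helper) builds a POST to <baseURL>/chat/completions.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:8080/v1", "sk-ferro-...", "claude-3-opus-20240229", "Hello!")
	if err != nil {
		panic(err)
	}
	// Only the request is built here; send it with http.DefaultClient.Do(req)
	// once a gateway is listening on :8080.
	fmt.Println(req.Method, req.URL.String())
}
```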

🛣️ Project Roadmap

Ferro Gateway is actively developed to support an end-to-end AI operating environment. The milestones below run from the foundation releases through to a production-ready v1.0.0:

v0.1.0 (Foundation Release): Core routing, multi-provider execution, basic guardrails, and streaming capabilities.
v0.2.0 (Observability & Resilience): Structured JSON logging with trace IDs, Prometheus metrics, per-provider circuit breakers, token-bucket rate limiting, deep health checks, and a consistent error schema.
v0.3.0 (Modality Expansions): Embeddings, image generation mapping, cost tracking via pricing tables, and model aliasing.
v0.4.0 (Persistent State): Dedicated Admin API, SQLite/PostgreSQL persistence, robust CRUD configuration portals.
v0.5.0 (Advanced Intelligence): Least-latency and cost-optimized algorithmic routing, A/B testing modules, and semantic caching.
v1.0.0 (Production Ready): Helm charts, OpenTelemetry export, edge caching, and official SDK embeddings.

Review our detailed ROADMAP.md for deeper implementation plans.

🤝 Contributing

We welcome community contributions! The priority areas for ecosystem growth are:

Adding support for new niche LLM providers.
Building new middleware plugins (Guardrails, Modifiers, Analyzers).
Enhancing test coverage and documentation.

Please see our CONTRIBUTING.md for style guidelines and PR processes.

📄 License

FerroGateway is proudly open-source and released under the Apache 2.0 License.

Documentation ¶

Overview ¶

Package aigateway provides a high-performance, zero-dependency AI gateway for routing requests to large language model (LLM) providers.

The Gateway type is the main entry point: create one with New, register providers with RegisterProvider, load plugins from config with LoadPlugins, and route requests with Route or RouteStream.

Plugins and routing strategies (single, fallback, load-balance, conditional) are configured via Config which can be loaded from a YAML or JSON file using LoadConfig.

Index ¶

Constants
func ValidateConfig(cfg Config) error
type CircuitBreakerConfig
type Condition
type Config
- func LoadConfig(path string) (*Config, error)
type EventHookFunc
type Gateway
- func New(cfg Config) (*Gateway, error)
type PluginConfig
type RetryConfig
type StrategyConfig
type StrategyMode
type Target

Constants ¶

const (
	SubjectRequestCompleted = "gateway.request.completed"
	SubjectRequestFailed    = "gateway.request.failed"
)

Event subject constants used when invoking gateway hooks.

Functions ¶

func ValidateConfig ¶

func ValidateConfig(cfg Config) error

ValidateConfig validates a Config for correctness.

Types ¶

type CircuitBreakerConfig ¶ added in v0.2.0

type CircuitBreakerConfig struct {
	// FailureThreshold is the number of consecutive failures before the circuit
	// opens. Defaults to 5.
	FailureThreshold int `json:"failure_threshold" yaml:"failure_threshold"`
	// SuccessThreshold is the number of consecutive successes in half-open state
	// required to close the circuit. Defaults to 1.
	SuccessThreshold int `json:"success_threshold" yaml:"success_threshold"`
	// Timeout is the duration the circuit stays open before transitioning to
	// half-open (e.g. "30s"). Defaults to "30s".
	Timeout string `json:"timeout" yaml:"timeout"`
}

CircuitBreakerConfig configures the per-provider circuit breaker.

type Condition ¶

type Condition struct {
	Key       string `json:"key" yaml:"key"`
	Value     string `json:"value" yaml:"value"`
	TargetKey string `json:"target_key" yaml:"target_key"`
}

Condition represents a condition for conditional routing.
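
The documentation does not spell out how the three fields are evaluated. The sketch below assumes one plausible reading: Key names a request attribute (for example "model"), Value is the value it must equal, and TargetKey is the VirtualKey of the target to use on a match. The resolve function, the attribute map, and the fallback argument are hypothetical; the local Condition type only mirrors the fields above.

```go
package main

import "fmt"

// Condition mirrors the documented fields; the matching semantics below are an assumption.
type Condition struct {
	Key       string
	Value     string
	TargetKey string
}

// resolve (hypothetical) returns the TargetKey of the first condition whose
// Key/Value pair matches the request attributes, or fallback if none match.
func resolve(conds []Condition, attrs map[string]string, fallback string) string {
	for _, c := range conds {
		if v, ok := attrs[c.Key]; ok && v == c.Value {
			return c.TargetKey
		}
	}
	return fallback
}

func main() {
	conds := []Condition{
		{Key: "model", Value: "claude-3-opus-20240229", TargetKey: "anthropic"},
		{Key: "model", Value: "gpt-4o", TargetKey: "openai"},
	}
	fmt.Println(resolve(conds, map[string]string{"model": "gpt-4o"}, "default"))  // openai
	fmt.Println(resolve(conds, map[string]string{"model": "unknown"}, "default")) // default
}
```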

type Config ¶

type Config struct {
	// Strategy defines how requests are routed (e.g., single, fallback, loadbalance).
	Strategy StrategyConfig `json:"strategy" yaml:"strategy"`
	// Targets is a list of provider targets to route requests to.
	Targets []Target `json:"targets" yaml:"targets"`
	// Plugins configuration (optional).
	Plugins []PluginConfig `json:"plugins,omitempty" yaml:"plugins,omitempty"`
}

Config holds the configuration for the AI Gateway.

func LoadConfig ¶

func LoadConfig(path string) (*Config, error)

LoadConfig reads and parses a config file from the given path. Supported formats: JSON (.json), YAML (.yaml, .yml).
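
As an illustration of the JSON shape implied by the struct tags in this package, the sketch below decodes a sample document into local mirror types using only encoding/json. It does not call LoadConfig itself. The parseConfig helper, the sample constant, and the virtual_key values "openai" and "anthropic" are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of the documented types, reduced to their JSON tags.
type RetryConfig struct {
	Attempts int `json:"attempts"`
}

type CircuitBreakerConfig struct {
	FailureThreshold int    `json:"failure_threshold"`
	SuccessThreshold int    `json:"success_threshold"`
	Timeout          string `json:"timeout"`
}

type Target struct {
	VirtualKey     string                `json:"virtual_key"`
	Weight         float64               `json:"weight,omitempty"`
	Retry          *RetryConfig          `json:"retry,omitempty"`
	CircuitBreaker *CircuitBreakerConfig `json:"circuit_breaker,omitempty"`
}

type StrategyConfig struct {
	Mode string `json:"mode"`
}

type Config struct {
	Strategy StrategyConfig `json:"strategy"`
	Targets  []Target       `json:"targets"`
}

// sample is a hypothetical fallback configuration; the provider keys are assumptions.
const sample = `{
  "strategy": {"mode": "fallback"},
  "targets": [
    {"virtual_key": "openai", "retry": {"attempts": 2}},
    {"virtual_key": "anthropic",
     "circuit_breaker": {"failure_threshold": 5, "success_threshold": 1, "timeout": "30s"}}
  ]
}`

// parseConfig (hypothetical helper) decodes a JSON document into Config.
func parseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	return &cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Strategy.Mode, len(cfg.Targets), cfg.Targets[0].Retry.Attempts) // fallback 2 2
}
```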

type EventHookFunc ¶ added in v0.2.0

type EventHookFunc func(ctx context.Context, subject string, data map[string]interface{})

EventHookFunc is called asynchronously after a gateway event (request completed or failed). It replaces the old EventPublisher interface with a simpler function-based hook pattern.

type Gateway ¶

type Gateway struct {
	// contains filtered or unexported fields
}

Gateway is the main entry point for routing LLM requests.

func New ¶

func New(cfg Config) (*Gateway, error)

New creates a new Gateway instance with the given configuration.

func (*Gateway) AddHook ¶ added in v0.2.0

func (g *Gateway) AddHook(fn EventHookFunc)

AddHook registers an EventHookFunc that is called asynchronously on each completed or failed request. Multiple hooks may be registered; all are invoked for every event.
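
The hook pattern can be sketched without the gateway itself. In the example below, EventHookFunc and the two subject constants are copied from this page, while hookSet, emit, and countBySubject are hypothetical stand-ins for the gateway's internal registry. It shows multiple events fanning out to a registered hook on separate goroutines; because hooks run asynchronously, the hook guards its shared state with a mutex.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

const (
	SubjectRequestCompleted = "gateway.request.completed"
	SubjectRequestFailed    = "gateway.request.failed"
)

type EventHookFunc func(ctx context.Context, subject string, data map[string]interface{})

// hookSet is a hypothetical stand-in for the gateway's hook registry.
type hookSet struct {
	mu    sync.Mutex
	hooks []EventHookFunc
	wg    sync.WaitGroup
}

func (h *hookSet) AddHook(fn EventHookFunc) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.hooks = append(h.hooks, fn)
}

// emit invokes every registered hook on its own goroutine.
func (h *hookSet) emit(ctx context.Context, subject string, data map[string]interface{}) {
	h.mu.Lock()
	hooks := append([]EventHookFunc(nil), h.hooks...)
	h.mu.Unlock()
	for _, fn := range hooks {
		h.wg.Add(1)
		go func(fn EventHookFunc) {
			defer h.wg.Done()
			fn(ctx, subject, data)
		}(fn)
	}
}

// countBySubject registers a counting hook, emits the given subjects, and
// waits for all hook goroutines to finish before returning the tallies.
func countBySubject(subjects []string) map[string]int {
	var mu sync.Mutex
	counts := map[string]int{}
	h := &hookSet{}
	h.AddHook(func(_ context.Context, subject string, _ map[string]interface{}) {
		mu.Lock()
		defer mu.Unlock()
		counts[subject]++
	})
	for _, s := range subjects {
		h.emit(context.Background(), s, nil)
	}
	h.wg.Wait()
	return counts
}

func main() {
	counts := countBySubject([]string{SubjectRequestCompleted, SubjectRequestFailed, SubjectRequestCompleted})
	fmt.Println(counts[SubjectRequestCompleted], counts[SubjectRequestFailed]) // 2 1
}
```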

func (*Gateway) AllModels ¶ added in v0.2.0

func (g *Gateway) AllModels() []providers.ModelInfo

AllModels returns ModelInfo from all registered providers.

func (*Gateway) Close ¶

func (g *Gateway) Close() error

Close cleans up resources.

func (*Gateway) FindByModel ¶ added in v0.2.0

func (g *Gateway) FindByModel(model string) (providers.Provider, bool)

FindByModel returns the first registered provider that supports the given model.

func (*Gateway) Get ¶ added in v0.2.0

func (g *Gateway) Get(name string) (providers.Provider, bool)

Get satisfies providers.ProviderSource (alias for GetProvider).

func (*Gateway) GetConfig ¶

func (g *Gateway) GetConfig() Config

GetConfig returns a copy of the current configuration.

func (*Gateway) GetProvider ¶ added in v0.2.0

func (g *Gateway) GetProvider(name string) (providers.Provider, bool)

GetProvider returns a registered provider by name.

func (*Gateway) List ¶ added in v0.2.0

func (g *Gateway) List() []string

List satisfies providers.ProviderSource (alias for ListProviders).

func (*Gateway) ListProviders ¶ added in v0.2.0

func (g *Gateway) ListProviders() []string

ListProviders returns the names of all registered providers.

func (*Gateway) LoadPlugins ¶

func (g *Gateway) LoadPlugins() error

LoadPlugins initializes and registers plugins from the gateway configuration.

func (*Gateway) RegisterPlugin ¶

func (g *Gateway) RegisterPlugin(stage plugin.Stage, p plugin.Plugin) error

RegisterPlugin registers a plugin at the given lifecycle stage.

func (*Gateway) RegisterProvider ¶

func (g *Gateway) RegisterProvider(p providers.Provider)

RegisterProvider registers a provider with the gateway.

func (*Gateway) ReloadConfig ¶

func (g *Gateway) ReloadConfig(cfg Config) error

ReloadConfig validates and applies a new configuration, forcing strategy rebuild on next request.

func (*Gateway) Route ¶

func (g *Gateway) Route(ctx context.Context, req providers.Request) (*providers.Response, error)

Route routes a request to the appropriate provider based on the configuration.

func (*Gateway) RouteStream ¶

func (g *Gateway) RouteStream(ctx context.Context, req providers.Request) (<-chan providers.StreamChunk, error)

RouteStream runs before-request plugins then returns a streaming response channel. Provider resolution follows the configured strategy mode, then falls back to any registered provider that supports the requested model and streaming.

type PluginConfig ¶

type PluginConfig struct {
	Name    string                 `json:"name" yaml:"name"`
	Type    string                 `json:"type" yaml:"type"`
	Stage   string                 `json:"stage" yaml:"stage"`
	Enabled bool                   `json:"enabled" yaml:"enabled"`
	Config  map[string]interface{} `json:"config" yaml:"config"`
}

PluginConfig holds plugin configuration.

type RetryConfig ¶

type RetryConfig struct {
	Attempts int `json:"attempts" yaml:"attempts"`
}

RetryConfig defines retry behavior.

type StrategyConfig ¶

type StrategyConfig struct {
	Mode       StrategyMode `json:"mode" yaml:"mode"`
	Conditions []Condition  `json:"conditions,omitempty" yaml:"conditions,omitempty"` // For conditional routing
}

StrategyConfig defines the routing strategy.

type StrategyMode ¶

type StrategyMode string

StrategyMode represents the routing strategy mode.

const (
	ModeSingle      StrategyMode = "single"
	ModeFallback    StrategyMode = "fallback"
	ModeLoadBalance StrategyMode = "loadbalance"
	ModeConditional StrategyMode = "conditional"
)

StrategyMode constants define the supported routing strategies.

type Target ¶

type Target struct {
	// VirtualKey is the unique identifier for the provider (or a virtual key in the vault).
	VirtualKey string `json:"virtual_key" yaml:"virtual_key"`
	// Weight is used for load balancing.
	Weight float64 `json:"weight,omitempty" yaml:"weight,omitempty"`
	// Retry configuration for this target.
	Retry *RetryConfig `json:"retry,omitempty" yaml:"retry,omitempty"`
	// CircuitBreaker configuration for this target (optional).
	CircuitBreaker *CircuitBreakerConfig `json:"circuit_breaker,omitempty" yaml:"circuit_breaker,omitempty"`
}

Target represents a specific provider target.
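
How Weight might drive selection under the loadbalance strategy can be sketched as a cumulative-weight pick. This is an assumption about the general technique, not the code in internal/strategies: the pick function is hypothetical, the local Target type keeps only the two relevant fields, and the provider keys are placeholders.

```go
package main

import "fmt"

// Target mirrors only the fields relevant to weighted selection.
type Target struct {
	VirtualKey string
	Weight     float64
}

// pick (hypothetical) maps r in [0, 1) onto the cumulative weights of the
// targets and returns the chosen VirtualKey. Targets with non-positive weight
// are skipped; ok is false when no target has a positive weight.
func pick(targets []Target, r float64) (key string, ok bool) {
	total := 0.0
	for _, t := range targets {
		if t.Weight > 0 {
			total += t.Weight
		}
	}
	if total == 0 {
		return "", false
	}
	point := r * total
	for _, t := range targets {
		if t.Weight <= 0 {
			continue
		}
		key = t.VirtualKey // remember the last eligible target for rounding edge cases
		if point < t.Weight {
			return key, true
		}
		point -= t.Weight
	}
	return key, true
}

func main() {
	targets := []Target{
		{VirtualKey: "openai", Weight: 3},
		{VirtualKey: "anthropic", Weight: 1},
	}
	// In practice r would come from a random source such as rand.Float64().
	fmt.Println(pick(targets, 0.5)) // openai true
	fmt.Println(pick(targets, 0.8)) // anthropic true
}
```

With weights 3 and 1, roughly three quarters of uniformly drawn r values fall in the first target's range.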

Directories ¶

Path	Synopsis
cmd
ferrogw (command)	Package main provides the HTTP handlers for legacy OpenAI completions endpoint.
ferrogw-cli (command)	Package main provides the ferrogw-cli command-line tool for managing the FerroGateway.
examples
basic (command)	Package main demonstrates sending a request directly to any configured LLM provider.
custom-plugin (command)	Package main demonstrates how to write and register a custom plugin with FerroGateway.
embedded (command)	Package main demonstrates embedding FerroGateway inside an existing Go HTTP server.
fallback (command)	Package main demonstrates the fallback routing strategy.
loadbalance (command)	Package main demonstrates weighted load balancing across multiple providers.
with-circuit-breaker (command)	Package main demonstrates per-provider circuit breaker configuration.
with-guardrails (command)	Package main demonstrates using built-in guardrail plugins.
with-hooks (command)	Package main demonstrates gateway event hooks.
internal
admin	Package admin provides HTTP handlers for the gateway administration API.
cache	Package cache provides the CacheEntry and Cache interface used by the response-cache plugin.
circuitbreaker	Package circuitbreaker implements the circuit-breaker pattern for provider calls.
logging	Package logging provides structured JSON logging with trace ID propagation.
metrics	Package metrics registers the Prometheus metrics used by the gateway.
plugins/cache	Package cache provides a response-cache plugin that stores LLM responses in memory and serves them on exact-match cache hits, reducing provider cost and latency for repeated requests.
plugins/logger	Package logger provides a request-logger plugin that records each LLM request and response to standard output.
plugins/maxtoken	Package maxtoken provides a max-token guardrail plugin that caps the max_tokens and message count on outgoing requests.
plugins/ratelimit	Package ratelimit provides a gateway plugin that enforces per-request rate limits using an in-memory token bucket.
plugins/wordfilter	Package wordfilter provides a word-filter guardrail plugin that rejects requests containing blocked words.
ratelimit	Package ratelimit provides a simple in-memory token-bucket rate limiter.
strategies	Package strategies implements the routing strategies used by the gateway.
version	Package version holds build-time version information for FerroGateway binaries.
plugin	Package plugin defines the Plugin interface and the lifecycle stages used to hook into the gateway request pipeline.
providers	Package providers defines the Provider interface and shared data types used across all LLM provider implementations.