gateway

package
v0.0.0-...-7b314fb
Published: Feb 26, 2026 License: GPL-3.0 Imports: 8 Imported by: 0

Documentation

Overview

Package gateway defines domain types and interfaces for the Gandalf LLM gateway. This package imports no other package from the project: it is the dependency root.

Constants

const APIKeyPrefix = "gnd_"

APIKeyPrefix is the prefix for all Gandalf API keys.

Variables

var (
	ErrUnauthorized    = errors.New("unauthorized")
	ErrForbidden       = errors.New("forbidden")
	ErrNotFound        = errors.New("not found")
	ErrConflict        = errors.New("conflict")
	ErrRateLimited     = errors.New("rate limited")
	ErrQuotaExceeded   = errors.New("quota exceeded")
	ErrModelNotAllowed = errors.New("model not allowed")
	ErrProviderError   = errors.New("provider error")
	ErrBadRequest      = errors.New("bad request")
	ErrKeyExpired      = errors.New("api key expired")
	ErrKeyBlocked      = errors.New("api key blocked")
)

Sentinel errors for the gateway domain.

RolePermissions maps role names to their permission bitmasks.

Functions

func ContextWithIdentity

func ContextWithIdentity(ctx context.Context, id *Identity) context.Context

ContextWithIdentity stores the identity in the existing requestMeta if present, avoiding a new context.WithValue allocation. Falls back to creating new metadata if none exists (e.g., in tests).

func ContextWithRequestID

func ContextWithRequestID(ctx context.Context, id string) context.Context

ContextWithRequestID returns a context carrying the given request ID.

func HashKey

func HashKey(raw string) string

HashKey returns the hex-encoded SHA-256 hash of a raw API key.
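
A function matching this description can be written with the standard library alone. The sketch below is a local re-implementation for illustration (hashKey and apiKeyPrefix are lower-cased stand-ins, and the sample key is made up); it is not the package's source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// apiKeyPrefix mirrors the documented APIKeyPrefix constant.
const apiKeyPrefix = "gnd_"

// hashKey returns the hex-encoded SHA-256 digest of raw.
func hashKey(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

func main() {
	raw := "gnd_example-not-a-real-key" // made-up key for illustration
	fmt.Println(strings.HasPrefix(raw, apiKeyPrefix), len(hashKey(raw)))
}
```

Storing only the digest means a leaked key table does not directly reveal usable keys, which fits the KeyHash field being excluded from JSON output.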

func RequestIDFromContext

func RequestIDFromContext(ctx context.Context) string

RequestIDFromContext extracts the request ID from context.

func ValidRole

func ValidRole(role string) bool

ValidRole reports whether role is a known role name.

Types

type APIKey

type APIKey struct {
	ID            string     `json:"id"`
	KeyHash       string     `json:"-"`          // SHA-256 hex, never exposed
	KeyPrefix     string     `json:"key_prefix"` // first 8 chars for display
	UserID        string     `json:"user_id,omitempty"`
	TeamID        string     `json:"team_id,omitempty"`
	OrgID         string     `json:"org_id"`
	Role          string     `json:"role"`                     // "admin", "member", "viewer", "service_account"
	AllowedModels []string   `json:"allowed_models,omitempty"` // nil = inherit from team
	RPMLimit      *int64     `json:"rpm_limit,omitempty"`
	TPMLimit      *int64     `json:"tpm_limit,omitempty"`
	MaxBudget     *float64   `json:"max_budget,omitempty"`
	ExpiresAt     *time.Time `json:"expires_at,omitempty"`
	Blocked       bool       `json:"blocked"`
	LastUsedAt    *time.Time `json:"last_used_at,omitempty"`
	CreatedAt     time.Time  `json:"created_at"`
}

APIKey represents an API key for authentication.

type Authenticator

type Authenticator interface {
	Authenticate(ctx context.Context, r *http.Request) (*Identity, error)
}

Authenticator validates request credentials and returns the caller identity.

type ChatRequest

type ChatRequest struct {
	Model            string          `json:"model"`
	Messages         []Message       `json:"messages"`
	Temperature      *float64        `json:"temperature,omitempty"`
	TopP             *float64        `json:"top_p,omitempty"`
	N                int             `json:"n,omitempty"`
	Stream           bool            `json:"stream,omitempty"`
	StreamOptions    *StreamOptions  `json:"stream_options,omitempty"`
	Stop             json.RawMessage `json:"stop,omitempty"`
	MaxTokens        *int            `json:"max_tokens,omitempty"`
	PresencePenalty  *float64        `json:"presence_penalty,omitempty"`
	FrequencyPenalty *float64        `json:"frequency_penalty,omitempty"`
	Seed             *int            `json:"seed,omitempty"`
	User             string          `json:"user,omitempty"`
	Tools            json.RawMessage `json:"tools,omitempty"`
	ToolChoice       json.RawMessage `json:"tool_choice,omitempty"`
	ResponseFormat   json.RawMessage `json:"response_format,omitempty"`
}

ChatRequest represents an OpenAI-compatible chat completion request.

type ChatResponse

type ChatResponse struct {
	ID                string   `json:"id"`
	Object            string   `json:"object"`
	Created           int64    `json:"created"`
	Model             string   `json:"model"`
	Choices           []Choice `json:"choices"`
	Usage             *Usage   `json:"usage,omitempty"`
	SystemFingerprint string   `json:"system_fingerprint,omitempty"`
}

ChatResponse represents an OpenAI-compatible chat completion response.

type Choice

type Choice struct {
	Index        int     `json:"index"`
	Message      Message `json:"message"`
	FinishReason string  `json:"finish_reason"`
}

Choice represents a single completion choice.

type EmbeddingRequest

type EmbeddingRequest struct {
	Model          string          `json:"model"`
	Input          json.RawMessage `json:"input"`
	EncodingFormat string          `json:"encoding_format,omitempty"`
	User           string          `json:"user,omitempty"`
}

EmbeddingRequest represents an OpenAI-compatible embedding request.

type EmbeddingResponse

type EmbeddingResponse struct {
	Object string          `json:"object"`
	Data   json.RawMessage `json:"data"`
	Model  string          `json:"model"`
	Usage  *Usage          `json:"usage,omitempty"`
}

EmbeddingResponse represents an OpenAI-compatible embedding response.

type Identity

type Identity struct {
	Subject       string     `json:"subject"` // JWT sub or key prefix
	KeyID         string     `json:"key_id"`  // API key ID for per-key bucketing
	UserID        string     `json:"user_id"`
	TeamID        string     `json:"team_id"`
	OrgID         string     `json:"org_id"`
	Role          string     `json:"role"`        // "admin", "member", "viewer", "service_account"
	Perms         Permission `json:"-"`           // resolved bitmask
	AuthMethod    string     `json:"auth_method"` // "jwt" or "apikey"
	RPMLimit      int64      `json:"-"`           // effective RPM limit (0 = unlimited)
	TPMLimit      int64      `json:"-"`           // effective TPM limit (0 = unlimited)
	MaxBudget     float64    `json:"-"`           // max spend USD (0 = unlimited)
	AllowedModels []string   `json:"-"`           // nil = all models allowed
}

Identity is the authenticated caller context attached to the request context. It is populated by either JWT or API key authentication.

func IdentityFromContext

func IdentityFromContext(ctx context.Context) *Identity

IdentityFromContext extracts the authenticated identity from context.

func (*Identity) Can

func (id *Identity) Can(p Permission) bool

Can reports whether the identity has the given permission.

func (*Identity) IsModelAllowed

func (id *Identity) IsModelAllowed(model string) bool

IsModelAllowed checks whether model is permitted for this identity. It returns true if AllowedModels is nil or empty (no restriction). The check is a linear scan with no allocation, since the list typically holds 0-5 entries.
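
The documented behavior can be sketched in a few lines. This is a local illustration with Identity trimmed to the one relevant field; exact string matching (no wildcards or aliases) is an assumption, since the comparison rule is not stated.

```go
package main

import "fmt"

// Identity is trimmed to the field this sketch needs.
type Identity struct {
	AllowedModels []string // nil or empty = all models allowed
}

// IsModelAllowed scans the allow-list linearly without allocating.
// Exact string equality is assumed here.
func (id *Identity) IsModelAllowed(model string) bool {
	if len(id.AllowedModels) == 0 {
		return true
	}
	for _, m := range id.AllowedModels {
		if m == model {
			return true
		}
	}
	return false
}

func main() {
	open := &Identity{}
	limited := &Identity{AllowedModels: []string{"model-a", "model-b"}}
	fmt.Println(open.IsModelAllowed("anything"),
		limited.IsModelAllowed("model-b"),
		limited.IsModelAllowed("model-c"))
}
```

For a list this short, a slice scan is usually cheaper than building a map on every request.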

type Message

type Message struct {
	Role       string          `json:"role"`
	Content    json.RawMessage `json:"content"`
	Name       string          `json:"name,omitempty"`
	ToolCalls  json.RawMessage `json:"tool_calls,omitempty"`
	ToolCallID string          `json:"tool_call_id,omitempty"`
}

Message represents a chat message.

type NativeProxy

type NativeProxy interface {
	// ProxyRequest forwards a raw HTTP request to the provider's API.
	// path is the provider-relative path (e.g. "/messages").
	// The implementation handles auth headers, URL construction, and
	// response streaming (flush-on-read for SSE/NDJSON).
	ProxyRequest(ctx context.Context, w http.ResponseWriter, r *http.Request, path string) error
}

NativeProxy is an optional interface that providers can implement to support raw HTTP passthrough. The gateway authenticates and routes the request, then delegates the raw HTTP exchange to the provider. Checked via type assertion.

type Organization

type Organization struct {
	ID            string    `json:"id"`
	Name          string    `json:"name"`
	AllowedModels []string  `json:"allowed_models,omitempty"` // nil = all models
	RPMLimit      *int64    `json:"rpm_limit,omitempty"`
	TPMLimit      *int64    `json:"tpm_limit,omitempty"`
	MaxBudget     *float64  `json:"max_budget,omitempty"` // USD
	CreatedAt     time.Time `json:"created_at"`
}

Organization represents a top-level tenant.

type Permission

type Permission uint32

Permission is a bitmask representing authorization capabilities.

const (
	PermUseModels       Permission = 1 << iota // call /v1/chat/completions, /v1/embeddings
	PermManageOwnKeys                          // create/delete own API keys
	PermViewOwnUsage                           // view own usage stats
	PermViewAllUsage                           // view org-wide usage
	PermManageAllKeys                          // manage any key in the org
	PermManageProviders                        // configure upstream providers
	PermManageRoutes                           // configure model routing
	PermManageOrgs                             // manage orgs and teams
)
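
Since each constant occupies one bit, permissions can be combined with bitwise OR and tested with bitwise AND. The sketch below redeclares the constants locally. The memberPerms mask is a made-up example and does not reflect the actual RolePermissions table, and requiring all requested bits to be present is an assumption about how Can behaves.

```go
package main

import "fmt"

// Permission and its constants mirror the documented declaration.
type Permission uint32

const (
	PermUseModels Permission = 1 << iota
	PermManageOwnKeys
	PermViewOwnUsage
	PermViewAllUsage
	PermManageAllKeys
	PermManageProviders
	PermManageRoutes
	PermManageOrgs
)

// can reports whether held contains every bit set in p (an assumption;
// the real Can may treat combined masks differently).
func can(held, p Permission) bool {
	return held&p == p
}

func main() {
	// memberPerms is a hypothetical role mask for illustration only.
	memberPerms := PermUseModels | PermManageOwnKeys | PermViewOwnUsage
	fmt.Println(can(memberPerms, PermUseModels), can(memberPerms, PermManageOrgs))
}
```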

type Provider

type Provider interface {
	// Name returns the instance identifier (e.g., "openai-us", "openai-eu").
	Name() string
	// Type returns the wire format identifier (e.g., "openai", "anthropic").
	Type() string
	// ChatCompletion sends a non-streaming chat completion request.
	ChatCompletion(ctx context.Context, req *ChatRequest) (*ChatResponse, error)
	// ChatCompletionStream sends a streaming chat completion request.
	ChatCompletionStream(ctx context.Context, req *ChatRequest) (<-chan StreamChunk, error)
	// Embeddings generates embeddings for input text.
	Embeddings(ctx context.Context, req *EmbeddingRequest) (*EmbeddingResponse, error)
	// ListModels returns the list of available model IDs.
	ListModels(ctx context.Context) ([]string, error)
	// HealthCheck verifies connectivity to the provider.
	HealthCheck(ctx context.Context) error
}

Provider is the interface that all LLM provider adapters must implement.

type ProviderConfig

type ProviderConfig struct {
	ID        string   `json:"id"`
	Name      string   `json:"name"`
	Type      string   `json:"type"`
	BaseURL   string   `json:"base_url"`
	APIKeyEnc string   `json:"-"` // deprecated: no longer persisted, kept for schema compat
	Models    []string `json:"models"`
	Priority  int      `json:"priority"`
	Weight    int      `json:"weight"`
	Enabled   bool     `json:"enabled"`
	MaxRPS    int      `json:"max_rps"`
	TimeoutMs int      `json:"timeout_ms"`
}

ProviderConfig represents a configured upstream LLM provider.

type RollupFilter

type RollupFilter struct {
	OrgID  string
	KeyID  string
	Model  string
	Period string
	Since  string
	Until  string
}

RollupFilter selects rollups for querying.

type Route

type Route struct {
	ID         string          `json:"id"`
	ModelAlias string          `json:"model_alias"`
	Targets    json.RawMessage `json:"targets"` // []RouteTarget as JSON
	Strategy   string          `json:"strategy"`
	CacheTTLs  int             `json:"cache_ttl_s"`
}

Route maps a model alias to provider targets.

type RouteTarget

type RouteTarget struct {
	ProviderID string `json:"provider_id"`
	Model      string `json:"model"`
	Priority   int    `json:"priority"`
	Weight     int    `json:"weight"`
}

RouteTarget is a single target within a route.

type StreamChunk

type StreamChunk struct {
	Data  []byte // raw SSE data line, forwarded as-is when possible
	Usage *Usage // non-nil on final chunk
	Done  bool
	Err   error
}

StreamChunk represents a single chunk in a streaming response.

type StreamOptions

type StreamOptions struct {
	IncludeUsage bool `json:"include_usage,omitempty"`
}

StreamOptions controls streaming behavior.

type Team

type Team struct {
	ID            string   `json:"id"`
	OrgID         string   `json:"org_id"`
	Name          string   `json:"name"`
	AllowedModels []string `json:"allowed_models,omitempty"` // nil = inherit from org
	RPMLimit      *int64   `json:"rpm_limit,omitempty"`
	TPMLimit      *int64   `json:"tpm_limit,omitempty"`
	MaxBudget     *float64 `json:"max_budget,omitempty"`
}

Team is a subdivision within an organization.
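
The field comments on APIKey, Team, and Organization describe an inheritance chain for AllowedModels: a nil list on the key defers to the team, a nil list on the team defers to the org, and a nil list on the org means all models. The sketch below illustrates that chain with two hypothetical helpers. Applying the same fallback to the pointer-typed limits, and mapping "unset everywhere" to 0 (unlimited), is an assumption based on the Identity field comments.

```go
package main

import "fmt"

// resolveModels returns the first non-nil allow-list in key, team, org
// order. A nil result means all models are allowed.
func resolveModels(key, team, org []string) []string {
	for _, list := range [][]string{key, team, org} {
		if list != nil {
			return list
		}
	}
	return nil
}

// resolveLimit returns the first non-nil limit in the order given, or 0
// (unlimited) when none is set. The fallback order is an assumption.
func resolveLimit(limits ...*int64) int64 {
	for _, l := range limits {
		if l != nil {
			return *l
		}
	}
	return 0
}

func main() {
	teamRPM := int64(60)
	models := resolveModels(nil, []string{"model-a"}, []string{"model-a", "model-b"})
	fmt.Println(models, resolveLimit(nil, &teamRPM, nil))
}
```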

type Usage

type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

Usage represents token usage statistics.

type UsageFilter

type UsageFilter struct {
	OrgID  string
	KeyID  string
	Model  string
	Since  string // RFC3339
	Until  string // RFC3339
	Offset int
	Limit  int
}

UsageFilter selects usage records for querying.

type UsageRecord

type UsageRecord struct {
	ID               string    `json:"id"`
	KeyID            string    `json:"key_id"`
	UserID           string    `json:"user_id,omitempty"`
	TeamID           string    `json:"team_id,omitempty"`
	OrgID            string    `json:"org_id"`
	CallerJWTSub     string    `json:"caller_jwt_sub,omitempty"`
	CallerService    string    `json:"caller_service,omitempty"`
	Model            string    `json:"model"`
	ProviderID       string    `json:"provider_id"`
	PromptTokens     int       `json:"prompt_tokens"`
	CompletionTokens int       `json:"completion_tokens"`
	TotalTokens      int       `json:"total_tokens"`
	CostUSD          float64   `json:"cost_usd,omitempty"`
	Cached           bool      `json:"cached"`
	LatencyMs        int       `json:"latency_ms"`
	StatusCode       int       `json:"status_code"`
	RequestID        string    `json:"request_id"`
	CreatedAt        time.Time `json:"created_at"`
}

UsageRecord represents a single API usage event.

type UsageRollup

type UsageRollup struct {
	OrgID            string  `json:"org_id"`
	KeyID            string  `json:"key_id"`
	Model            string  `json:"model"`
	Period           string  `json:"period"` // "hourly", "daily"
	Bucket           string  `json:"bucket"` // ISO 8601 timestamp of bucket start
	RequestCount     int     `json:"request_count"`
	PromptTokens     int     `json:"prompt_tokens"`
	CompletionTokens int     `json:"completion_tokens"`
	TotalTokens      int     `json:"total_tokens"`
	CostUSD          float64 `json:"cost_usd"`
	CachedCount      int     `json:"cached_count"`
}

UsageRollup represents a pre-aggregated usage summary for a time bucket.

Directories

Path Synopsis
Package app implements application-level services for the Gandalf LLM gateway.
Package auth implements API key authentication for the Gandalf gateway.
Package cache provides response caching for the gateway.
Package circuitbreaker implements a per-provider circuit breaker with a sliding-window error rate detector.
Package cloudauth provides http.RoundTripper decorators that inject authentication headers for cloud-hosted LLM providers (direct API keys, GCP OAuth, Azure Entra).
Package config provides configuration loading and database bootstrapping.
Package provider contains shared utilities for LLM provider adapters.
anthropic
Package anthropic implements the gateway.Provider adapter for the Anthropic API.
gemini
Package gemini implements the gateway.Provider adapter for the Google Gemini API.
ollama
Package ollama implements the gateway.Provider and gateway.NativeProxy adapters for local Ollama instances.
openai
Package openai implements the gateway.Provider adapter for the OpenAI API.
sseutil
Package sseutil provides shared SSE line reading utilities for provider adapters.
Package ratelimit implements per-key RPM and TPM rate limiting with lazy-refill token buckets.
Package server implements the HTTP transport layer for the Gandalf gateway.
Package storage defines persistence interfaces for the gateway.
sqlite
Package sqlite implements the storage interfaces using SQLite via modernc.org/sqlite.
Package telemetry provides observability primitives for the Gandalf gateway.
Package testutil provides configurable test fakes for gateway interfaces.
Package tokencount provides token estimation for TPM rate limiting and usage recording.
Package worker provides background task infrastructure for the gateway.
