Documentation
¶
Overview ¶
Package gateway defines domain types and interfaces for the Gandalf LLM gateway. This package has no project imports -- it is the dependency root.
Index ¶
- Constants
- Variables
- func ContextWithIdentity(ctx context.Context, id *Identity) context.Context
- func ContextWithRequestID(ctx context.Context, id string) context.Context
- func HashKey(raw string) string
- func RequestIDFromContext(ctx context.Context) string
- func ValidRole(role string) bool
- type APIKey
- type Authenticator
- type ChatRequest
- type ChatResponse
- type Choice
- type EmbeddingRequest
- type EmbeddingResponse
- type Identity
- type Message
- type NativeProxy
- type Organization
- type Permission
- type Provider
- type ProviderConfig
- type RollupFilter
- type Route
- type RouteTarget
- type StreamChunk
- type StreamOptions
- type Team
- type Usage
- type UsageFilter
- type UsageRecord
- type UsageRollup
Constants ¶
const APIKeyPrefix = "gnd_"
APIKeyPrefix is the prefix for all Gandalf API keys.
Variables ¶
var ( ErrForbidden = errors.New("forbidden") ErrNotFound = errors.New("not found") ErrConflict = errors.New("conflict") ErrRateLimited = errors.New("rate limited") ErrQuotaExceeded = errors.New("quota exceeded") ErrModelNotAllowed = errors.New("model not allowed") ErrProviderError = errors.New("provider error") ErrBadRequest = errors.New("bad request") ErrKeyExpired = errors.New("api key expired") ErrKeyBlocked = errors.New("api key blocked") )
Sentinel errors for the gateway domain.
var RolePermissions = map[string]Permission{ "admin": PermUseModels | PermManageOwnKeys | PermViewOwnUsage | PermViewAllUsage | PermManageAllKeys | PermManageProviders | PermManageRoutes | PermManageOrgs, "member": PermUseModels | PermManageOwnKeys | PermViewOwnUsage, "viewer": PermViewOwnUsage | PermViewAllUsage, "service_account": PermUseModels, }
RolePermissions maps role names to their permission bitmasks.
Functions ¶
func ContextWithIdentity ¶
ContextWithIdentity stores the identity in the existing requestMeta if present, avoiding a new context.WithValue allocation. Falls back to creating new metadata if none exists (e.g., in tests).
func ContextWithRequestID ¶
ContextWithRequestID returns a context carrying the given request ID.
func RequestIDFromContext ¶
RequestIDFromContext extracts the request ID from context.
Types ¶
type APIKey ¶
type APIKey struct {
ID string `json:"id"`
KeyHash string `json:"-"` // SHA-256 hex, never exposed
KeyPrefix string `json:"key_prefix"` // first 8 chars for display
UserID string `json:"user_id,omitempty"`
TeamID string `json:"team_id,omitempty"`
OrgID string `json:"org_id"`
Role string `json:"role"` // "admin", "member", "viewer", "service_account"
AllowedModels []string `json:"allowed_models,omitempty"` // nil = inherit from team
RPMLimit *int64 `json:"rpm_limit,omitempty"`
TPMLimit *int64 `json:"tpm_limit,omitempty"`
MaxBudget *float64 `json:"max_budget,omitempty"`
ExpiresAt *time.Time `json:"expires_at,omitempty"`
Blocked bool `json:"blocked"`
LastUsedAt *time.Time `json:"last_used_at,omitempty"`
CreatedAt time.Time `json:"created_at"`
}
APIKey represents an API key for authentication.
type Authenticator ¶
type Authenticator interface {
Authenticate(ctx context.Context, r *http.Request) (*Identity, error)
}
Authenticator validates request credentials and returns the caller identity.
type ChatRequest ¶
type ChatRequest struct {
Model string `json:"model"`
Messages []Message `json:"messages"`
Temperature *float64 `json:"temperature,omitempty"`
TopP *float64 `json:"top_p,omitempty"`
N int `json:"n,omitempty"`
Stream bool `json:"stream,omitempty"`
StreamOptions *StreamOptions `json:"stream_options,omitempty"`
Stop json.RawMessage `json:"stop,omitempty"`
MaxTokens *int `json:"max_tokens,omitempty"`
PresencePenalty *float64 `json:"presence_penalty,omitempty"`
FrequencyPenalty *float64 `json:"frequency_penalty,omitempty"`
Seed *int `json:"seed,omitempty"`
User string `json:"user,omitempty"`
Tools json.RawMessage `json:"tools,omitempty"`
ToolChoice json.RawMessage `json:"tool_choice,omitempty"`
ResponseFormat json.RawMessage `json:"response_format,omitempty"`
}
ChatRequest represents an OpenAI-compatible chat completion request.
type ChatResponse ¶
type ChatResponse struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []Choice `json:"choices"`
Usage *Usage `json:"usage,omitempty"`
SystemFingerprint string `json:"system_fingerprint,omitempty"`
}
ChatResponse represents an OpenAI-compatible chat completion response.
type Choice ¶
type Choice struct {
Index int `json:"index"`
Message Message `json:"message"`
FinishReason string `json:"finish_reason"`
}
Choice represents a single completion choice.
type EmbeddingRequest ¶
type EmbeddingRequest struct {
Model string `json:"model"`
Input json.RawMessage `json:"input"`
EncodingFormat string `json:"encoding_format,omitempty"`
User string `json:"user,omitempty"`
}
EmbeddingRequest represents an OpenAI-compatible embedding request.
type EmbeddingResponse ¶
type EmbeddingResponse struct {
Object string `json:"object"`
Data json.RawMessage `json:"data"`
Model string `json:"model"`
Usage *Usage `json:"usage,omitempty"`
}
EmbeddingResponse represents an OpenAI-compatible embedding response.
type Identity ¶
type Identity struct {
Subject string `json:"subject"` // JWT sub or key prefix
KeyID string `json:"key_id"` // API key ID for per-key bucketing
UserID string `json:"user_id"`
TeamID string `json:"team_id"`
OrgID string `json:"org_id"`
Role string `json:"role"` // "admin", "member", "viewer", "service_account"
Perms Permission `json:"-"` // resolved bitmask
AuthMethod string `json:"auth_method"` // "jwt" or "apikey"
RPMLimit int64 `json:"-"` // effective RPM limit (0 = unlimited)
TPMLimit int64 `json:"-"` // effective TPM limit (0 = unlimited)
MaxBudget float64 `json:"-"` // max spend USD (0 = unlimited)
AllowedModels []string `json:"-"` // nil = all models allowed
}
Identity is the authenticated caller context attached to request context. Populated by either JWT or API key auth.
func IdentityFromContext ¶
IdentityFromContext extracts the authenticated identity from context.
func (*Identity) Can ¶
func (id *Identity) Can(p Permission) bool
Can reports whether the identity has the given permission.
func (*Identity) IsModelAllowed ¶
IsModelAllowed checks whether model is permitted for this identity. Returns true if AllowedModels is nil/empty (no restriction). Uses linear scan -- typically 0-5 entries, no allocation.
type Message ¶
type Message struct {
Role string `json:"role"`
Content json.RawMessage `json:"content"`
Name string `json:"name,omitempty"`
ToolCalls json.RawMessage `json:"tool_calls,omitempty"`
ToolCallID string `json:"tool_call_id,omitempty"`
}
Message represents a chat message.
type NativeProxy ¶
type NativeProxy interface {
// ProxyRequest forwards a raw HTTP request to the provider's API.
// path is the provider-relative path (e.g. "/messages").
// The implementation handles auth headers, URL construction, and
// response streaming (flush-on-read for SSE/NDJSON).
ProxyRequest(ctx context.Context, w http.ResponseWriter, r *http.Request, path string) error
}
NativeProxy is an optional interface that providers can implement to support raw HTTP passthrough. The gateway authenticates and routes the request, then delegates the raw HTTP exchange to the provider. Checked via type assertion.
type Organization ¶
type Organization struct {
ID string `json:"id"`
Name string `json:"name"`
AllowedModels []string `json:"allowed_models,omitempty"` // nil = all models
RPMLimit *int64 `json:"rpm_limit,omitempty"`
TPMLimit *int64 `json:"tpm_limit,omitempty"`
MaxBudget *float64 `json:"max_budget,omitempty"` // USD
CreatedAt time.Time `json:"created_at"`
}
Organization represents a top-level tenant.
type Permission ¶
type Permission uint32
Permission is a bitmask representing authorization capabilities.
const ( PermUseModels Permission = 1 << iota // call /v1/chat/completions, /v1/embeddings PermManageOwnKeys // create/delete own API keys PermViewOwnUsage // view own usage stats PermViewAllUsage // view org-wide usage PermManageAllKeys // manage any key in the org PermManageProviders // configure upstream providers PermManageRoutes // configure model routing PermManageOrgs // manage orgs and teams )
type Provider ¶
type Provider interface {
// Name returns the instance identifier (e.g., "openai-us", "openai-eu").
Name() string
// Type returns the wire format identifier (e.g., "openai", "anthropic").
Type() string
// ChatCompletion sends a non-streaming chat completion request.
ChatCompletion(ctx context.Context, req *ChatRequest) (*ChatResponse, error)
// ChatCompletionStream sends a streaming chat completion request.
ChatCompletionStream(ctx context.Context, req *ChatRequest) (<-chan StreamChunk, error)
// Embeddings generates embeddings for input text.
Embeddings(ctx context.Context, req *EmbeddingRequest) (*EmbeddingResponse, error)
// ListModels returns the list of available model IDs.
ListModels(ctx context.Context) ([]string, error)
// HealthCheck verifies connectivity to the provider.
HealthCheck(ctx context.Context) error
}
Provider is the interface that all LLM provider adapters must implement.
type ProviderConfig ¶
type ProviderConfig struct {
ID string `json:"id"`
Name string `json:"name"`
Type string `json:"type"`
BaseURL string `json:"base_url"`
APIKeyEnc string `json:"-"` // deprecated: no longer persisted, kept for schema compat
Models []string `json:"models"`
Priority int `json:"priority"`
Weight int `json:"weight"`
Enabled bool `json:"enabled"`
MaxRPS int `json:"max_rps"`
TimeoutMs int `json:"timeout_ms"`
}
ProviderConfig represents a configured upstream LLM provider.
type RollupFilter ¶
type RollupFilter struct {
OrgID string
KeyID string
Model string
Period string
Since string
Until string
}
RollupFilter selects rollups for querying.
type Route ¶
type Route struct {
ID string `json:"id"`
ModelAlias string `json:"model_alias"`
Targets json.RawMessage `json:"targets"` // []RouteTarget as JSON
Strategy string `json:"strategy"`
CacheTTLs int `json:"cache_ttl_s"`
}
Route maps a model alias to provider targets.
type RouteTarget ¶
type RouteTarget struct {
ProviderID string `json:"provider_id"`
Model string `json:"model"`
Priority int `json:"priority"`
Weight int `json:"weight"`
}
RouteTarget is a single target within a route.
type StreamChunk ¶
type StreamChunk struct {
Data []byte // raw SSE data line, forwarded as-is when possible
Usage *Usage // non-nil on final chunk
Done bool
Err error
}
StreamChunk represents a single chunk in a streaming response.
type StreamOptions ¶
type StreamOptions struct {
IncludeUsage bool `json:"include_usage,omitempty"`
}
StreamOptions controls streaming behavior.
type Team ¶
type Team struct {
ID string `json:"id"`
OrgID string `json:"org_id"`
Name string `json:"name"`
AllowedModels []string `json:"allowed_models,omitempty"` // nil = inherit from org
RPMLimit *int64 `json:"rpm_limit,omitempty"`
TPMLimit *int64 `json:"tpm_limit,omitempty"`
MaxBudget *float64 `json:"max_budget,omitempty"`
}
Team is a subdivision within an organization.
type Usage ¶
type Usage struct {
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
TotalTokens int `json:"total_tokens"`
}
Usage represents token usage statistics.
type UsageFilter ¶
type UsageFilter struct {
OrgID string
KeyID string
Model string
Since string // RFC3339
Until string // RFC3339
Offset int
Limit int
}
UsageFilter selects usage records for querying.
type UsageRecord ¶
type UsageRecord struct {
ID string `json:"id"`
KeyID string `json:"key_id"`
UserID string `json:"user_id,omitempty"`
TeamID string `json:"team_id,omitempty"`
OrgID string `json:"org_id"`
CallerJWTSub string `json:"caller_jwt_sub,omitempty"`
CallerService string `json:"caller_service,omitempty"`
Model string `json:"model"`
ProviderID string `json:"provider_id"`
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
TotalTokens int `json:"total_tokens"`
CostUSD float64 `json:"cost_usd,omitempty"`
Cached bool `json:"cached"`
LatencyMs int `json:"latency_ms"`
StatusCode int `json:"status_code"`
RequestID string `json:"request_id"`
CreatedAt time.Time `json:"created_at"`
}
UsageRecord represents a single API usage event.
type UsageRollup ¶
type UsageRollup struct {
OrgID string `json:"org_id"`
KeyID string `json:"key_id"`
Model string `json:"model"`
Period string `json:"period"` // "hourly", "daily"
Bucket string `json:"bucket"` // ISO 8601 timestamp of bucket start
RequestCount int `json:"request_count"`
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
TotalTokens int `json:"total_tokens"`
CostUSD float64 `json:"cost_usd"`
CachedCount int `json:"cached_count"`
}
UsageRollup represents a pre-aggregated usage summary for a time bucket.
Directories
¶
| Path | Synopsis |
|---|---|
|
Package app implements application-level services for the Gandalf LLM gateway.
|
Package app implements application-level services for the Gandalf LLM gateway. |
|
Package auth implements API key authentication for the Gandalf gateway.
|
Package auth implements API key authentication for the Gandalf gateway. |
|
Package cache provides response caching for the gateway.
|
Package cache provides response caching for the gateway. |
|
Package circuitbreaker implements a per-provider circuit breaker with a sliding-window error rate detector.
|
Package circuitbreaker implements a per-provider circuit breaker with a sliding-window error rate detector. |
|
Package cloudauth provides http.RoundTripper decorators that inject authentication headers for cloud-hosted LLM providers (direct API keys, GCP OAuth, Azure Entra).
|
Package cloudauth provides http.RoundTripper decorators that inject authentication headers for cloud-hosted LLM providers (direct API keys, GCP OAuth, Azure Entra). |
|
Package config provides configuration loading and database bootstrapping.
|
Package config provides configuration loading and database bootstrapping. |
|
Package provider contains shared utilities for LLM provider adapters.
|
Package provider contains shared utilities for LLM provider adapters. |
|
anthropic
Package anthropic implements the gateway.Provider adapter for the Anthropic API.
|
Package anthropic implements the gateway.Provider adapter for the Anthropic API. |
|
gemini
Package gemini implements the gateway.Provider adapter for the Google Gemini API.
|
Package gemini implements the gateway.Provider adapter for the Google Gemini API. |
|
ollama
Package ollama implements the gateway.Provider and gateway.NativeProxy adapters for local Ollama instances.
|
Package ollama implements the gateway.Provider and gateway.NativeProxy adapters for local Ollama instances. |
|
openai
Package openai implements the gateway.Provider adapter for the OpenAI API.
|
Package openai implements the gateway.Provider adapter for the OpenAI API. |
|
sseutil
Package sseutil provides shared SSE line reading utilities for provider adapters.
|
Package sseutil provides shared SSE line reading utilities for provider adapters. |
|
Package ratelimit implements per-key RPM and TPM rate limiting with lazy-refill token buckets.
|
Package ratelimit implements per-key RPM and TPM rate limiting with lazy-refill token buckets. |
|
Package server implements the HTTP transport layer for the Gandalf gateway.
|
Package server implements the HTTP transport layer for the Gandalf gateway. |
|
Package storage defines persistence interfaces for the gateway.
|
Package storage defines persistence interfaces for the gateway. |
|
sqlite
Package sqlite implements the storage interfaces using SQLite via modernc.org/sqlite.
|
Package sqlite implements the storage interfaces using SQLite via modernc.org/sqlite. |
|
Package telemetry provides observability primitives for the Gandalf gateway.
|
Package telemetry provides observability primitives for the Gandalf gateway. |
|
Package testutil provides configurable test fakes for gateway interfaces.
|
Package testutil provides configurable test fakes for gateway interfaces. |
|
Package tokencount provides token estimation for TPM rate limiting and usage recording.
|
Package tokencount provides token estimation for TPM rate limiting and usage recording. |
|
Package worker provides background task infrastructure for the gateway.
|
Package worker provides background task infrastructure for the gateway. |