Documentation ¶
Overview ¶
Package llmbridge provides a unified interface to multiple LLM providers.
Every provider implements the Provider interface, so you can swap between OpenAI, Anthropic, Ollama, LM Studio, or any OpenAI-compatible endpoint without changing your application code.
Quick start:
	p := llmbridge.NewOpenAI("gpt-4o-mini", os.Getenv("OPENAI_API_KEY"))
	resp, err := p.Complete(ctx, llmbridge.Request{
		System:   "You are a helpful assistant.",
		Messages: []llmbridge.Message{{Role: "user", Content: "Hello!"}},
	})
Index ¶
- Constants
- Variables
- func AComplete(ctx context.Context, p Provider, req Request) <-chan AsyncResult
- func CompletionCost(resp *types.Response) (float64, error)
- func Embed(ctx context.Context, p EmbedProvider, texts []string) ([][]float64, error)
- func EmbeddingCost(provider, model string, tokens int) (float64, error)
- func ResolveModel(req types.Request, providerName string) string
- func SanitizeRequest(req types.Request) types.Request
- func ValidateModel(modelName string) bool
- type AsyncResult
- type BatchResult
- type CallType
- type Delta
- type EmbedProvider
- type ErrAuth
- type ErrProvider
- type ErrRateLimit
- type ErrTimeout
- type GeneratedImage
- type Handler
- type HealthStatus
- type ImageGenerator
- type ImageRequest
- type ImageResponse
- type Message
- type Middleware
- type ModelInfo
- type ModerationRequest
- type ModerationResponse
- type ModerationResult
- type Moderator
- type Property
- type Provider
- type Request
- type RerankRequest
- type RerankResponse
- type RerankResult
- type Reranker
- type Response
- type RetryPolicy
- type Router
- type RouterOption
- func WithAutoVisionRouting() RouterOption
- func WithCircuitBreaker(threshold int, cooldown time.Duration) RouterOption
- func WithContentPolicyFallback(enabled bool) RouterOption
- func WithContextWindowFallback(enabled bool) RouterOption
- func WithHealthChecks(interval time.Duration) RouterOption
- func WithMaxCostPerRequest(dollars float64) RouterOption
- func WithRequiredTags(tags []string) RouterOption
- func WithRetryPolicy(p RetryPolicy) RouterOption
- func WithRoutingGroups(groups []RoutingGroup) RouterOption
- func WithStrategy(s Strategy) RouterOption
- func WithTrafficSplit(groups []TrafficSplitGroup) RouterOption
- func WithWeightedStrategy() RouterOption
- type RoutingGroup
- type Schema
- type Session
- type SpeechProvider
- type SpeechRequest
- type SpeechResponse
- type Strategy
- type Streamer
- type TaggedProvider
- type TextCompleter
- type TextRequest
- type TextResponse
- type Tool
- type ToolCall
- type TrafficSplitGroup
- type Transcriber
- type TranscriptionRequest
- type TranscriptionResponse
- type UsageData
Constants ¶
const DefaultHTTPTimeout = 60 // seconds
DefaultHTTPTimeout is the default HTTP timeout, in seconds, for provider requests.
const Version = "0.3.0"
Version is the current module version.
Variables ¶
var DefaultModels = map[string]string{
	"openai":     "gpt-4o-mini",
	"anthropic":  "claude-sonnet-4-6",
	"gemini":     "gemini-2.0-flash",
	"azure":      "gpt-4o",
	"cohere":     "command-r-plus-08-2024",
	"bedrock":    "anthropic.claude-3-5-sonnet-20241022-v2:0",
	"ollama":     "llama3.2",
	"groq":       "llama-3.3-70b-versatile",
	"together":   "meta-llama/Llama-3-8b-chat-hf",
	"lmstudio":   "local-model",
	"deepseek":   "deepseek-chat",
	"perplexity": "llama-3.1-sonar-large-128k-online",
	"fireworks":  "accounts/fireworks/models/llama-v3p1-70b-instruct",
	"cerebras":   "llama3.1-70b",
	"sambanova":  "Meta-Llama-3.1-70B-Instruct",
	"mistral":    "mistral-large-latest",
	"hyperbolic": "meta-llama/Meta-Llama-3.1-70B-Instruct",
	"novita":     "meta-llama/llama-3.1-70b-instruct",
	"xai":        "grok-2-latest",
}
DefaultModels maps each provider name to the model used when a Request does not specify one.
var DefaultRetryPolicy = RetryPolicy{
	MaxAttempts:  2,
	InitialDelay: time.Second,
	Multiplier:   2.0,
	MaxDelay:     8 * time.Second,
}
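DefaultRetryPolicy is a sensible starting point: two attempts per provider (that is, one retry) with a 1-second initial backoff. The delay doubles on each further retry and is capped at 8 seconds, which only takes effect if MaxAttempts is raised.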
var ModelInfoDB = map[string]ModelInfo{
	"gpt-4o": {
		MaxTokens:               128000,
		MaxInputTokens:          128000,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.0000025,
		OutputCostPerToken:      0.000010,
	},
	"gpt-4o-mini": {
		MaxTokens:               128000,
		MaxInputTokens:          128000,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00000015,
		OutputCostPerToken:      0.0000006,
	},
	"gpt-4-turbo": {
		MaxTokens:               128000,
		MaxInputTokens:          128000,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00001,
		OutputCostPerToken:      0.00003,
	},
	"gpt-3.5-turbo": {
		MaxTokens:               16385,
		MaxInputTokens:          16385,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.0000005,
		OutputCostPerToken:      0.0000015,
	},
	"o1": {
		MaxTokens:               200000,
		MaxInputTokens:          200000,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.000015,
		OutputCostPerToken:      0.00006,
	},
	"claude-opus-4-7": {
		MaxTokens:               200000,
		MaxInputTokens:          200000,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.000015,
		OutputCostPerToken:      0.000075,
	},
	"claude-sonnet-4-6": {
		MaxTokens:               200000,
		MaxInputTokens:          200000,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.000003,
		OutputCostPerToken:      0.000015,
	},
	"claude-haiku-4-5-20251001": {
		MaxTokens:               200000,
		MaxInputTokens:          200000,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.0000008,
		OutputCostPerToken:      0.000004,
	},
	"claude-3-5-sonnet-20241022": {
		MaxTokens:               200000,
		MaxInputTokens:          200000,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.000003,
		OutputCostPerToken:      0.000015,
	},
	"claude-3-haiku-20240307": {
		MaxTokens:               200000,
		MaxInputTokens:          200000,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00000025,
		OutputCostPerToken:      0.00000125,
	},
	"gemini-2.0-flash": {
		MaxTokens:               1048576,
		MaxInputTokens:          1048576,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.0000001,
		OutputCostPerToken:      0.0000004,
	},
	"gemini-1.5-pro": {
		MaxTokens:               2097152,
		MaxInputTokens:          2097152,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00000125,
		OutputCostPerToken:      0.000005,
	},
	"gemini-1.5-flash": {
		MaxTokens:               1048576,
		MaxInputTokens:          1048576,
		SupportsFunctionCalling: true,
		SupportsVision:          true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.000000075,
		OutputCostPerToken:      0.0000003,
	},
	"command-r-plus-08-2024": {
		MaxTokens:               128000,
		MaxInputTokens:          128000,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.0000025,
		OutputCostPerToken:      0.00001,
	},
	"command-r-08-2024": {
		MaxTokens:               128000,
		MaxInputTokens:          128000,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00000015,
		OutputCostPerToken:      0.0000006,
	},
	"deepseek-chat": {
		MaxTokens:               65536,
		MaxInputTokens:          65536,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00000027,
		OutputCostPerToken:      0.0000011,
	},
	"deepseek-coder": {
		MaxTokens:               65536,
		MaxInputTokens:          65536,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00000014,
		OutputCostPerToken:      0.00000028,
	},
	"mistral-large-latest": {
		MaxTokens:               131072,
		MaxInputTokens:          131072,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.000003,
		OutputCostPerToken:      0.000009,
	},
	"mistral-small-latest": {
		MaxTokens:               131072,
		MaxInputTokens:          131072,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.000001,
		OutputCostPerToken:      0.000003,
	},
	"llama-3.3-70b-versatile": {
		MaxTokens:               128000,
		MaxInputTokens:          128000,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00000059,
		OutputCostPerToken:      0.00000079,
	},
	"llama-3.1-8b-instant": {
		MaxTokens:               128000,
		MaxInputTokens:          128000,
		SupportsFunctionCalling: true,
		SupportsStreaming:       true,
		InputCostPerToken:       0.00000005,
		OutputCostPerToken:      0.00000008,
	},
}
ModelInfoDB is a static registry of known models and their capabilities. Sourced from public provider documentation; update as new models are released.
var SupportedProviders = []string{
	"openai",
	"anthropic",
	"gemini",
	"azure",
	"cohere",
	"bedrock",
	"ollama",
	"lmstudio",
	"groq",
	"together",
	"deepseek",
	"perplexity",
	"fireworks",
	"cerebras",
	"sambanova",
	"mistral",
	"hyperbolic",
	"novita",
	"xai",
}
SupportedProviders lists the built-in provider names.
Functions ¶
func AComplete ¶
func AComplete(ctx context.Context, p Provider, req Request) <-chan AsyncResult
AComplete sends a completion request asynchronously and returns a channel that will receive exactly one AsyncResult.
func CompletionCost ¶
func CompletionCost(resp *types.Response) (float64, error)
CompletionCost calculates the estimated cost in USD for a completed response. It dispatches to the provider-specific pricing table based on resp.Provider. Returns 0 and an error if the provider or model is not in the pricing tables.
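The per-token price fields in ModelInfoDB suggest the usual arithmetic: input tokens times the input price plus output tokens times the output price. The sketch below assumes that formula and keys the table by model name only, whereas the real function dispatches on the provider first; `pricing`, `prices`, and `estimateCost` are hypothetical names, and the prices are copied from the gpt-4o-mini entry above.

```go
package main

import "fmt"

// pricing mirrors the two cost fields of ModelInfo; it is a local stand-in.
type pricing struct {
	InputCostPerToken  float64
	OutputCostPerToken float64
}

// prices copies the gpt-4o-mini entry from ModelInfoDB.
var prices = map[string]pricing{
	"gpt-4o-mini": {InputCostPerToken: 0.00000015, OutputCostPerToken: 0.0000006},
}

// estimateCost returns the USD cost of one call, or 0 and an error when the
// model has no pricing entry.
func estimateCost(model string, inputTokens, outputTokens int) (float64, error) {
	p, ok := prices[model]
	if !ok {
		return 0, fmt.Errorf("no pricing entry for model %q", model)
	}
	in := float64(inputTokens) * p.InputCostPerToken
	out := float64(outputTokens) * p.OutputCostPerToken
	return in + out, nil
}

func main() {
	cost, err := estimateCost("gpt-4o-mini", 1000, 500)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("$%.6f\n", cost) // prints $0.000450
}
```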
func EmbeddingCost ¶
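func EmbeddingCost(provider, model string, tokens int) (float64, error)
EmbeddingCost calculates the cost in USD for an embedding request. provider is the provider name, model is the embedding model name, and tokens is the input token count.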
func ResolveModel ¶
func ResolveModel(req types.Request, providerName string) string
ResolveModel returns req.Model if it is non-empty; otherwise it returns the default model for providerName.
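The fallback rule amounts to a guarded map lookup. The sketch below is an assumption about the shape of that logic, not the package's code: `defaults` copies two entries from DefaultModels, and `resolveModel` is a hypothetical helper that takes the model string directly instead of a Request.

```go
package main

import "fmt"

// defaults copies two entries from DefaultModels.
var defaults = map[string]string{
	"openai": "gpt-4o-mini",
	"ollama": "llama3.2",
}

// resolveModel prefers the explicitly requested model and falls back to the
// provider default. It returns "" for an unknown provider; how llmbridge
// handles that case is not documented here.
func resolveModel(requested, provider string) string {
	if requested != "" {
		return requested
	}
	return defaults[provider]
}

func main() {
	fmt.Println(resolveModel("", "openai"))       // gpt-4o-mini
	fmt.Println(resolveModel("gpt-4o", "openai")) // gpt-4o
}
```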
func SanitizeRequest ¶
func SanitizeRequest(req types.Request) types.Request
SanitizeRequest applies provider-safe defaults to req and trims whitespace. It does not mutate the original; it returns a copy.
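Returning a copy without mutating the original takes some care in Go, because assigning a struct still shares the backing array of any slice field. The sketch below shows one way to honor that contract for the whitespace trimming only; `Message`, `Request`, and `sanitize` are simplified stand-ins, and the provider-safe defaults are omitted because they are not described here.

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-in types for illustration only.
type Message struct{ Role, Content string }

type Request struct {
	System   string
	Messages []Message
}

// sanitize returns a trimmed copy of req. The Messages slice is reallocated
// because copying the struct alone would still share its backing array with
// the caller's request.
func sanitize(req Request) Request {
	out := req
	out.System = strings.TrimSpace(req.System)
	out.Messages = make([]Message, len(req.Messages))
	for i, m := range req.Messages {
		m.Content = strings.TrimSpace(m.Content)
		out.Messages[i] = m
	}
	return out
}

func main() {
	orig := Request{
		System:   " be brief ",
		Messages: []Message{{Role: "user", Content: " Hello! \n"}},
	}
	clean := sanitize(orig)
	// The copy is trimmed; the original is untouched.
	fmt.Printf("%q %q\n", clean.Messages[0].Content, orig.Messages[0].Content)
}
```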
func ValidateModel ¶
func ValidateModel(modelName string) bool
ValidateModel reports whether modelName is in the built-in registry.
Types ¶
type AsyncResult ¶
type AsyncResult = types.AsyncResult
AsyncResult wraps a Response and error for async operations.
type BatchResult ¶
type BatchResult = types.BatchResult
BatchResult holds the outcome of one request in a BatchComplete call.
func BatchComplete ¶
func BatchComplete(ctx context.Context, p Provider, reqs []Request) []BatchResult
BatchComplete sends all requests concurrently and returns one BatchResult per request. Results are ordered by their original index regardless of completion order.
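Keeping results in input order despite concurrent completion is typically done by preallocating the result slice and letting each goroutine write only to its own index. The following is a sketch of that pattern under stated assumptions: `result` and `batch` are hypothetical stand-ins, and a plain function replaces the provider call.

```go
package main

import (
	"fmt"
	"sync"
)

// result is a stand-in for BatchResult.
type result struct {
	Index  int
	Output string
}

// batch runs work for every input concurrently. Each goroutine writes only to
// its own slot, so results keep input order without locking or sorting.
func batch(inputs []string, work func(string) string) []result {
	results := make([]result, len(inputs))
	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			results[i] = result{Index: i, Output: work(in)}
		}(i, in)
	}
	wg.Wait()
	return results
}

func main() {
	out := batch([]string{"a", "b", "c"}, func(s string) string { return s + "!" })
	for _, r := range out {
		fmt.Println(r.Index, r.Output)
	}
}
```

A production version would usually also bound concurrency, for example with a semaphore channel, to stay under provider rate limits.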
type EmbedProvider ¶
type EmbedProvider = base.EmbedProvider
EmbedProvider is the optional interface for embedding generation.
type ErrAuth ¶
type ErrAuth = exceptions.AuthenticationError
ErrAuth indicates an authentication or authorization failure. Deprecated: use exceptions.AuthenticationError directly.
type ErrProvider ¶
type ErrProvider = exceptions.ProviderError
ErrProvider wraps a provider-level failure. Deprecated: use exceptions.ProviderError directly.
type ErrRateLimit ¶
type ErrRateLimit = exceptions.RateLimitError
ErrRateLimit indicates the provider throttled the request. Deprecated: use exceptions.RateLimitError directly.
type ErrTimeout ¶
type ErrTimeout = exceptions.TimeoutError
ErrTimeout indicates the request exceeded the HTTP deadline. Deprecated: use exceptions.TimeoutError directly.
type GeneratedImage ¶
type GeneratedImage = types.GeneratedImage
GeneratedImage is a single image returned by an image generation call.
type HealthStatus ¶
type HealthStatus struct {
	Healthy       bool
	LastCheck     time.Time
	LastError     error
	Failures      int       // consecutive failure count (reset on success)
	CooldownUntil time.Time // skip provider until this time (circuit breaker)
}
HealthStatus records the last known health of a provider.
type ImageGenerator ¶
type ImageGenerator = base.ImageGenerator
ImageGenerator is the optional interface for image generation.
type ImageRequest ¶
type ImageRequest = types.ImageRequest
ImageRequest is the input to an image generation call.
type ImageResponse ¶
type ImageResponse = types.ImageResponse
ImageResponse is the output from an image generation call.
func ImageGenerate ¶
func ImageGenerate(ctx context.Context, p ImageGenerator, req ImageRequest) (*ImageResponse, error)
ImageGenerate generates images from a text prompt using the given ImageGenerator.
type Middleware ¶
Middleware wraps a Handler to add cross-cutting behavior such as logging, metrics, caching, request transformation, or response post-processing.
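func Logger(log *slog.Logger) llmbridge.Middleware {
	return func(ctx context.Context, req llmbridge.Request, next llmbridge.Handler) (*llmbridge.Response, error) {
		log.Info("llm request", "provider", ctx.Value("provider"))
		resp, err := next(ctx, req)
		if err != nil {
			// resp may be nil on failure, so do not dereference it.
			log.Error("llm response", "err", err)
			return resp, err
		}
		// len(resp.Content) is the content length, not a token count.
		log.Info("llm response", "content_len", len(resp.Content))
		return resp, nil
	}
}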
type ModelInfo ¶
ModelInfo describes the capabilities and pricing of a specific model.
func GetModelInfo ¶
GetModelInfo looks up metadata for a known model. Returns (ModelInfo{}, false) for unrecognized model names.
type ModerationRequest ¶
type ModerationRequest = types.ModerationRequest
ModerationRequest is the input to a content moderation call.
type ModerationResponse ¶
type ModerationResponse = types.ModerationResponse
ModerationResponse is the output from a content moderation call.
func Moderate ¶
func Moderate(ctx context.Context, p Moderator, req ModerationRequest) (*ModerationResponse, error)
Moderate classifies content for policy violations using the given Moderator.
type ModerationResult ¶
type ModerationResult = types.ModerationResult
ModerationResult is the moderation verdict for a single input.
type Provider ¶
Provider is the unified interface every LLM backend must satisfy.
func Chain ¶
func Chain(provider Provider, mw ...Middleware) Provider
Chain wraps provider with the given middleware in order: the first middleware in the slice is the outermost (first to run on a request, last on a response). The returned Provider satisfies the Provider interface; it is NOT a Streamer even when the inner provider implements streaming.
type RerankRequest ¶
type RerankRequest = types.RerankRequest
RerankRequest is the input to a document reranking call.
type RerankResponse ¶
type RerankResponse = types.RerankResponse
RerankResponse is the output from a document reranking call.
func Rerank ¶
func Rerank(ctx context.Context, p Reranker, req RerankRequest) (*RerankResponse, error)
Rerank reorders documents by relevance to a query using the given Reranker.
type RerankResult ¶
type RerankResult = types.RerankResult
RerankResult is a single ranked document in a RerankResponse.
type RetryPolicy ¶
type RetryPolicy struct {
	// MaxAttempts is the number of tries per provider. 1 = no retry.
	MaxAttempts int
	// InitialDelay before the first retry.
	InitialDelay time.Duration
	// Multiplier applied to the delay on each subsequent retry.
	Multiplier float64
	// MaxDelay caps the backoff growth.
	MaxDelay time.Duration
}
RetryPolicy controls per-provider retry behavior inside the Router.
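To see how the four fields interact, it helps to compute the delay schedule they imply. The sketch below assumes plain exponential backoff with a cap and no jitter; whether the Router adds jitter is not stated here. `RetryPolicy` is redeclared locally so the example compiles on its own, and `delays` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// RetryPolicy repeats the documented fields so the sketch is self-contained.
type RetryPolicy struct {
	MaxAttempts  int
	InitialDelay time.Duration
	Multiplier   float64
	MaxDelay     time.Duration
}

// delays returns the wait before each retry. There are MaxAttempts-1 entries
// because the first attempt is not delayed.
func delays(p RetryPolicy) []time.Duration {
	var out []time.Duration
	d := p.InitialDelay
	for attempt := 1; attempt < p.MaxAttempts; attempt++ {
		if d > p.MaxDelay {
			d = p.MaxDelay
		}
		out = append(out, d)
		d = time.Duration(float64(d) * p.Multiplier)
	}
	return out
}

func main() {
	p := RetryPolicy{MaxAttempts: 6, InitialDelay: time.Second, Multiplier: 2.0, MaxDelay: 8 * time.Second}
	fmt.Println(delays(p)) // [1s 2s 4s 8s 8s]

	p.MaxAttempts = 2 // the DefaultRetryPolicy value
	fmt.Println(delays(p)) // [1s]
}
```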
type Router ¶
type Router struct {
	// contains filtered or unexported fields
}
Router dispatches requests across multiple Provider instances with automatic failover and load balancing. It implements Provider itself.
func NewRouter ¶
func NewRouter(providers []Provider, opts ...RouterOption) *Router
NewRouter returns a Router that dispatches across the given providers.
func NewTagRouter ¶
func NewTagRouter(providers []TaggedProvider, opts ...RouterOption) *Router
NewTagRouter returns a Router where each provider carries routing tags and optional weights. Use WithRequiredTags to filter providers by tag at request time. Use WithStrategy(Weighted) to route proportionally by Weight.
func (*Router) Stop ¶
func (r *Router) Stop()
Stop cancels the health check goroutine if one was started.
func (*Router) ValidateEnvironment ¶
ValidateEnvironment implements Provider.
type RouterOption ¶
type RouterOption func(*Router)
RouterOption configures a Router.
func WithAutoVisionRouting ¶
func WithAutoVisionRouting() RouterOption
WithAutoVisionRouting makes the router prefer providers tagged "vision" when the incoming request contains image_url content parts. Falls back to all eligible providers if none are tagged "vision".
func WithCircuitBreaker ¶
func WithCircuitBreaker(threshold int, cooldown time.Duration) RouterOption
WithCircuitBreaker enables the circuit breaker. After threshold consecutive failures on a provider, it is placed in cooldown for the given duration. Set threshold to 0 to disable (default).
func WithContentPolicyFallback ¶
func WithContentPolicyFallback(enabled bool) RouterOption
WithContentPolicyFallback enables failover when a provider returns a ContentPolicyViolationError, trying the next provider in the order. Useful when providers have different content policies and a stricter provider is listed first.
func WithContextWindowFallback ¶
func WithContextWindowFallback(enabled bool) RouterOption
WithContextWindowFallback enables failover when a provider returns ContextWindowExceededError, trying the next provider in the order.
func WithHealthChecks ¶
func WithHealthChecks(interval time.Duration) RouterOption
WithHealthChecks starts a background goroutine that calls ValidateEnvironment() on each provider every interval. Providers that error are marked unhealthy and skipped in routing until they recover.
func WithMaxCostPerRequest ¶
func WithMaxCostPerRequest(dollars float64) RouterOption
WithMaxCostPerRequest limits each request to the given USD budget. The router estimates input cost from message length and the request model's pricing entry. Requests that are estimated to exceed the budget are rejected with an error before any provider is contacted.
func WithRequiredTags ¶
func WithRequiredTags(tags []string) RouterOption
WithRequiredTags restricts routing to providers whose tag set is a superset of all the given tags. Only meaningful when using NewTagRouter.
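The superset rule means a provider qualifies only when it carries every required tag; extra tags are fine. A minimal sketch of that check, using a hypothetical helper `hasAllTags`:

```go
package main

import "fmt"

// hasAllTags reports whether have contains every tag in required.
func hasAllTags(have, required []string) bool {
	set := make(map[string]struct{}, len(have))
	for _, t := range have {
		set[t] = struct{}{}
	}
	for _, t := range required {
		if _, ok := set[t]; !ok {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(hasAllTags([]string{"fast", "cheap", "vision"}, []string{"fast", "vision"})) // true
	fmt.Println(hasAllTags([]string{"fast"}, []string{"fast", "vision"}))                    // false
}
```

An empty required list matches every provider, which is consistent with the option being a no-op when no tags are given.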
func WithRetryPolicy ¶
func WithRetryPolicy(p RetryPolicy) RouterOption
WithRetryPolicy sets the per-provider retry policy.
func WithRoutingGroups ¶
func WithRoutingGroups(groups []RoutingGroup) RouterOption
WithRoutingGroups registers named routing groups for per-model strategies.
func WithStrategy ¶
func WithStrategy(s Strategy) RouterOption
WithStrategy sets the selection strategy.
func WithTrafficSplit ¶
func WithTrafficSplit(groups []TrafficSplitGroup) RouterOption
WithTrafficSplit configures explicit experiment arms and switches the strategy to TrafficSplit. Each group specifies a provider index and relative weight. On every request one arm is chosen by weighted random selection; the remaining eligible providers serve as ordered fallbacks if that arm fails.
func WithWeightedStrategy ¶
func WithWeightedStrategy() RouterOption
WithWeightedStrategy is a convenience option that sets the Weighted strategy.
type RoutingGroup ¶
type RoutingGroup struct {
	Name      string
	Providers []Provider
	Strategy  Strategy
	Policy    RetryPolicy
}
RoutingGroup defines a named group of providers with a dedicated routing strategy. Useful when different models need different failover behavior.
type Session ¶
type Session struct {
	ID        string          `json:"id"`
	CreatedAt time.Time       `json:"created_at"`
	UpdatedAt time.Time       `json:"updated_at"`
	Provider  string          `json:"provider"`
	Model     string          `json:"model"`
	Messages  []types.Message `json:"messages"`
}
Session stores a conversation history that can be saved to disk and resumed in future processes, similar to claude --continue.
func ListSessions ¶
ListSessions returns all saved sessions sorted by creation time (newest first).
func LoadLatestSession ¶
LoadLatestSession loads the most recently saved session. Returns (nil, nil) if no sessions have been saved yet.
func LoadSession ¶
LoadSession loads a session by its ID from disk.
func NewSession ¶
NewSession creates an empty session for the given provider and model.
type SpeechProvider ¶
type SpeechProvider = base.SpeechProvider
SpeechProvider is the optional interface for text-to-speech.
type SpeechRequest ¶
type SpeechRequest = types.SpeechRequest
SpeechRequest is the input to a text-to-speech call.
type SpeechResponse ¶
type SpeechResponse = types.SpeechResponse
SpeechResponse is the output from a text-to-speech call.
func Speech ¶
func Speech(ctx context.Context, p SpeechProvider, req SpeechRequest) (*SpeechResponse, error)
Speech converts text to audio using the given SpeechProvider.
type Strategy ¶
type Strategy int
Strategy controls how the Router picks a provider for each request.
const (
	// PriorityOrder tries providers in declaration order, failing over on retryable errors.
	PriorityOrder Strategy = iota
	// RoundRobin distributes requests evenly across all providers.
	RoundRobin
	// LeastLatency routes to the provider with the lowest EMA latency.
	LeastLatency
	// LeastBusy routes to the provider currently handling the fewest requests.
	LeastBusy
	// UsageBased routes based on observed token/request metrics.
	UsageBased
	// CostBased routes to minimize estimated cost per request.
	CostBased
	// Weighted distributes traffic proportionally to each provider's Weight field.
	Weighted
	// TrafficSplit routes by explicit percentage splits across labeled experiment groups.
	// Configure via WithTrafficSplit.
	TrafficSplit
)
type TaggedProvider ¶
type TaggedProvider struct {
	Provider Provider
	Tags     []string // e.g. ["fast", "cheap", "vision"]
	Weight   int      // relative traffic weight for Weighted strategy; 0 treated as 1
}
TaggedProvider pairs a Provider with routing tags and an optional weight.
type TextCompleter ¶
type TextCompleter = base.TextCompleter
TextCompleter is the optional interface for legacy text completion.
type TextRequest ¶
type TextRequest = types.TextRequest
TextRequest is the input to a legacy text completion call.
type TextResponse ¶
type TextResponse = types.TextResponse
TextResponse is the output from a legacy text completion call.
func TextComplete ¶
func TextComplete(ctx context.Context, p TextCompleter, req TextRequest) (*TextResponse, error)
TextComplete sends a legacy (non-chat) text completion request.
type TrafficSplitGroup ¶
type TrafficSplitGroup struct {
	Label       string // experiment arm label (for observability)
	ProviderIdx int    // index into the Router's provider slice
	Weight      int    // relative traffic weight; 0 treated as 1
}
TrafficSplitGroup defines one labeled experiment arm for TrafficSplit routing. ProviderIdx is the index into the Router's provider slice; Weight controls how often this arm is selected relative to the others.
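Weighted random selection over arms is commonly done by drawing a number below the total weight and walking the cumulative ranges. The sketch below assumes that approach; `effectiveWeight`, `totalWeight`, and `pickArm` are hypothetical helpers, the struct is redeclared locally, and treating negative weights like zero is an assumption beyond the documented "0 treated as 1" rule.

```go
package main

import (
	"fmt"
	"math/rand"
)

// TrafficSplitGroup repeats the documented fields so the sketch is self-contained.
type TrafficSplitGroup struct {
	Label       string
	ProviderIdx int
	Weight      int
}

// effectiveWeight applies the rule that a zero weight counts as 1.
// Negative weights are also treated as 1 here (an assumption).
func effectiveWeight(w int) int {
	if w <= 0 {
		return 1
	}
	return w
}

// totalWeight sums the effective weights of all arms.
func totalWeight(groups []TrafficSplitGroup) int {
	total := 0
	for _, g := range groups {
		total += effectiveWeight(g.Weight)
	}
	return total
}

// pickArm returns the arm whose cumulative weight range contains n, where
// 0 <= n < totalWeight(groups). groups must be non-empty.
func pickArm(groups []TrafficSplitGroup, n int) TrafficSplitGroup {
	for _, g := range groups {
		w := effectiveWeight(g.Weight)
		if n < w {
			return g
		}
		n -= w
	}
	return groups[len(groups)-1]
}

func main() {
	arms := []TrafficSplitGroup{
		{Label: "control", ProviderIdx: 0, Weight: 9},
		{Label: "candidate", ProviderIdx: 1, Weight: 1},
	}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[pickArm(arms, rand.Intn(totalWeight(arms))).Label]++
	}
	fmt.Println(counts) // roughly a 9:1 split between control and candidate
}
```

Separating the random draw from `pickArm` keeps the selection logic deterministic and easy to test.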
type Transcriber ¶
type Transcriber = base.Transcriber
Transcriber is the optional interface for audio transcription.
type TranscriptionRequest ¶
type TranscriptionRequest = types.TranscriptionRequest
TranscriptionRequest is the input to an audio transcription call.
type TranscriptionResponse ¶
type TranscriptionResponse = types.TranscriptionResponse
TranscriptionResponse is the output from an audio transcription call.
func Transcribe ¶
func Transcribe(ctx context.Context, p Transcriber, req TranscriptionRequest) (*TranscriptionResponse, error)
Transcribe converts audio to text using the given Transcriber.
Directories ¶
| Path | Synopsis |
|---|---|
| budget | Package budget provides per-key spend tracking and budget enforcement. |
| caching | Package caching provides request/response caching for llmbridge providers. |
| callbacks | Package callbacks provides an event-driven observability system for llmbridge. |
| cmd/llmbridge (command) | Command llmbridge is a CLI for running and managing an llmbridge proxy server. |
| exceptions | Package exceptions defines the error hierarchy for llmbridge provider failures. |
| guardrails | Package guardrails provides configurable safety rules for LLM requests and responses. |
| llms/anthropic | Package anthropic provides a base.LLM backed by the Anthropic Messages API (Claude Opus, Sonnet, Haiku families). |
| llms/anthropic/chat | Package chat implements Anthropic Messages API request/response transformation. |
| llms/azure | Package azure provides a base.LLM backed by Azure OpenAI Service. |
| llms/base | Package base defines the core interfaces that all LLM provider implementations must satisfy. |
| llms/bedrock | Package bedrock provides a base.LLM backed by AWS Bedrock Converse API. |
| llms/bedrock/chat | Package chat handles AWS Bedrock Converse API wire-format transformations. |
| llms/cohere | Package cohere provides a base.LLM backed by the Cohere API. |
| llms/cohere/chat | Package chat handles Cohere API wire-format transformations. |
| llms/compatible | Package compatible provides llmbridge Providers for endpoints that speak the OpenAI chat completions wire format. |
| llms/gemini | Package gemini provides a base.LLM backed by the Google Gemini API. |
| llms/gemini/chat | Package chat handles Google Gemini API wire-format transformations. |
| llms/openai | Package openai provides a base.LLM backed by the OpenAI chat completions API. |
| llms/openai/chat | Package chat implements OpenAI chat completions request/response transformation. |
| prompttpl | Package prompttpl provides simple {{variable}} interpolation for prompt templates. |
| proxy | Package proxy implements an OpenAI-compatible HTTP proxy server that dispatches requests to any llmbridge Provider backend. |
| proxy/audit | Package audit provides a fixed-size ring buffer of request audit entries for the llmbridge proxy. |
| proxy/auth | Package auth provides API key authentication for the llmbridge proxy server. |
| proxy/config | Package config defines the JSON configuration file format for the llmbridge proxy server. |
| proxy/management | Package management provides admin endpoints for the llmbridge proxy server. |
| proxy/metrics | Package metrics provides a minimal Prometheus-compatible /metrics endpoint for the llmbridge proxy server. |
| proxy/middleware | Package middleware provides HTTP middleware for the llmbridge proxy server. |
| proxy/persistence | Package persistence provides a SQLite-backed store for proxy state. |
| proxy/prompts | Package prompts provides server-side prompt template storage with versioning. |
| proxy/secrets | Package secrets provides pluggable secret loading from AWS Secrets Manager, GCP Secret Manager, and HashiCorp Vault, all implemented with stdlib only. |
| proxy/ui | Package ui embeds the admin SPA static assets into the binary. |
| proxy/webhooks | Package webhooks provides configurable outbound webhook delivery for llmbridge proxy events. |
| tokencount | Package tokencount provides heuristic token-count estimates for LLM requests and responses without requiring any external tokenizer library. |
| toolbuilder | Package toolbuilder provides a fluent API for constructing types.Tool values without manually assembling nested structs. |
| types | Package types defines all core data structures shared across llmbridge packages. |