llmrouter

v0.4.2 (latest), published May 31, 2026. License: Apache-2.0. Imports: 8. Imported by: 0.

Repository

github.com/bluefunda/llmrouter

README ¶

llmrouter

A Go library that provides a unified interface for routing requests across multiple LLM providers. Write against one API, deploy across OpenAI, Anthropic, Google Gemini, and any OpenAI-compatible service.

Prerequisites

Go 1.25+
API key for at least one supported provider

Installation

go get github.com/bluefunda/llmrouter

Quick Start

package main

import (
    "context"
    "fmt"
    "time"

    llmrouter "github.com/bluefunda/llmrouter"
    "github.com/bluefunda/llmrouter/middleware"
    "github.com/bluefunda/llmrouter/providers/anthropic"
    "github.com/bluefunda/llmrouter/providers/openai"
)

func main() {
    router := llmrouter.New(
        llmrouter.WithProvider("openai", openai.NewFromEnv("openai", "OPENAI_API_KEY")),
        llmrouter.WithProvider("anthropic", anthropic.NewFromEnv()),
        llmrouter.WithMiddleware(
            middleware.Retry(3, time.Second),
            middleware.Timeout(60*time.Second),
        ),
    )

    resp, err := router.Complete(context.Background(), &llmrouter.Request{
        Model: "gpt-4o-mini",
        Messages: []llmrouter.Message{
            {Role: llmrouter.RoleUser, Content: "Hello!"},
        },
    })
    if err != nil {
        panic(err)
    }

    fmt.Println(resp.Choices[0].Message.Content)
}

Providers

Each provider is configured via environment variables or explicit options.

Provider	Package	Env Variable	Models
OpenAI	`providers/openai`	`OPENAI_API_KEY`	gpt-4o, gpt-4o-mini, gpt-4.1, o4-mini
Anthropic	`providers/anthropic`	`ANTHROPIC_API_KEY`	claude-opus-4, claude-sonnet-4, claude-haiku-3.5
Gemini	`providers/gemini`	`GEMINI_API_KEY`	gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash-exp
DeepSeek	`providers/openai` (preset: `deepseek`)	`DEEPSEEK_API_KEY`	deepseek-chat, deepseek-coder
Groq	`providers/openai` (preset: `groq`)	`GROQ_API_KEY`	llama-3.3-70b-versatile, mixtral-8x7b
Together	`providers/openai` (preset: `together`)	`TOGETHER_API_KEY`	llama-3.3-70b, mixtral-8x7b
Ollama	`providers/openai` (preset: `ollama`)	—	Any locally hosted model
Sarvam	`providers/openai` (preset: `sarvam`)	`SARVAM_API_KEY`	sarvam-m, sarvam-30b, sarvam-105b

OpenAI-compatible providers

DeepSeek, Groq, Together AI, Ollama, and Sarvam all use the OpenAI provider with a preset name:

openai.NewFromEnv("deepseek", "DEEPSEEK_API_KEY")
openai.NewFromEnv("groq", "GROQ_API_KEY")
openai.NewFromEnv("together", "TOGETHER_API_KEY")
openai.NewFromEnv("ollama", "") // no key needed for a local server
openai.NewFromEnv("sarvam", "SARVAM_API_KEY")

Gemini

Gemini requires explicit error handling at construction time and holds a gRPC connection — call router.Close() (or provider.Close() directly) on shutdown:

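geminiProvider, err := gemini.NewFromEnv()
if err != nil {
    log.Fatal(err)
}

router := llmrouter.New(
    llmrouter.WithProvider("gemini", geminiProvider),
)
defer router.Close() // releases the provider's gRPC connection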

Configuration

Router options

router := llmrouter.New(
    llmrouter.WithProvider("openai", openaiProvider),
    llmrouter.WithProvider("anthropic", anthropicProvider),
    llmrouter.WithModelMapping("gpt-4o", "openai"),
    llmrouter.WithModelMapping("claude-sonnet-4-20250514", "anthropic"),
    llmrouter.WithFallback("openai", "anthropic"),
    llmrouter.WithMiddleware(retryMw, cb.Wrap, timeoutMw),
)

Option	Description
`WithProvider`	Register a named provider
`WithModelMapping`	Route a model name to a specific provider
`WithFallback`	Set fallback provider order on primary failure
`WithMiddleware`	Attach middleware to the processing chain

Model resolution

The router resolves a model to a provider in this order:

Explicit mapping — WithModelMapping("gpt-4o", "openai")
Provider name match — model name equals a registered provider name
Provider model list — iterates providers in registration order and checks Models()
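The three steps above can be sketched in plain Go. This is an illustration of the documented lookup order only, not the router's actual code; the `provider` struct and the `resolve` function are hypothetical stand-ins so the example runs without the library.

```go
package main

import "fmt"

// provider is a hypothetical stand-in for a registered provider:
// a name plus the model IDs it would report from Models().
type provider struct {
	name   string
	models []string
}

// resolve follows the documented order: explicit mapping, then provider
// name match, then a scan of each provider's model list.
func resolve(model string, mapping map[string]string, providers []provider) (string, bool) {
	// 1. An explicit mapping wins.
	if name, ok := mapping[model]; ok {
		return name, true
	}
	// 2. The model name equals a registered provider name.
	for _, p := range providers {
		if p.name == model {
			return p.name, true
		}
	}
	// 3. Scan model lists in registration order.
	for _, p := range providers {
		for _, m := range p.models {
			if m == model {
				return p.name, true
			}
		}
	}
	return "", false
}

func main() {
	providers := []provider{
		{name: "openai", models: []string{"gpt-4o", "gpt-4o-mini"}},
		{name: "anthropic", models: []string{"claude-sonnet-4-20250514"}},
	}
	// The mapping deliberately overrides the model list for gpt-4o.
	mapping := map[string]string{"gpt-4o": "anthropic"}

	for _, m := range []string{"gpt-4o", "openai", "gpt-4o-mini", "unknown"} {
		name, ok := resolve(m, mapping, providers)
		fmt.Printf("%-12s -> %q (found=%v)\n", m, name, ok)
	}
}
```

Because the explicit mapping is checked first, it can override what a provider's own model list would otherwise select.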

Middleware

Middleware is a MiddlewareFunc — a plain func(Provider) Provider. It is applied in declaration order (first declared = outermost wrapper).
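To make "first declared = outermost wrapper" concrete, here is a minimal sketch that runs without the library. The one-method `Provider` interface, the `tag` helper, and the `chain` function are hypothetical; only the `func(Provider) Provider` shape comes from the text above.

```go
package main

import "fmt"

// Provider is a reduced stand-in for llmrouter.Provider.
type Provider interface {
	Complete(prompt string) string
}

// MiddlewareFunc has the documented shape: func(Provider) Provider.
type MiddlewareFunc func(Provider) Provider

// base plays the role of the real provider at the centre of the chain.
type base struct{}

func (base) Complete(prompt string) string { return "base(" + prompt + ")" }

// tagged brackets the wrapped provider's output with a label so the
// nesting order is visible in the result.
type tagged struct {
	label string
	next  Provider
}

func (t tagged) Complete(prompt string) string {
	return t.label + "[" + t.next.Complete(prompt) + "]"
}

// tag builds a middleware that wraps with the given label.
func tag(label string) MiddlewareFunc {
	return func(next Provider) Provider { return tagged{label: label, next: next} }
}

// chain wraps from the last middleware to the first, so the first one
// declared ends up outermost.
func chain(p Provider, mws ...MiddlewareFunc) Provider {
	for i := len(mws) - 1; i >= 0; i-- {
		p = mws[i](p)
	}
	return p
}

func main() {
	p := chain(base{}, tag("retry"), tag("breaker"), tag("timeout"))
	fmt.Println(p.Complete("hi")) // prints retry[breaker[timeout[base(hi)]]]
}
```

With this ordering a retry declared first re-runs everything inside it, so a timeout declared later applies to each attempt rather than to the whole retry sequence.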

Retry

Exponential backoff with configurable max attempts. Non-retryable errors (auth failures, invalid requests, context cancellation) short-circuit immediately.

middleware.Retry(3, time.Second)
middleware.Retry(3, time.Second, middleware.WithMaxDelay(10*time.Second))
middleware.Retry(3, time.Second, middleware.WithRetryFunc(myRetryPolicy))
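The delay schedule can be illustrated with a small helper. The doubling factor and the way the cap is applied are assumptions of this sketch; the text above only states that the backoff is exponential and that a maximum delay can be configured. `backoffDelay` is a hypothetical name, not a library function.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelay returns the wait before retry number attempt (0-based).
// It doubles the base delay per attempt (an assumption) and caps the
// result at maxDelay; maxDelay <= 0 means no cap.
func backoffDelay(base, maxDelay time.Duration, attempt int) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if maxDelay > 0 && d >= maxDelay {
			return maxDelay // stop early so d cannot overflow
		}
	}
	if maxDelay > 0 && d > maxDelay {
		return maxDelay
	}
	return d
}

func main() {
	// Base delay of one second with a ten second cap.
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Println(attempt, backoffDelay(time.Second, 10*time.Second, attempt))
	}
}
```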

Circuit Breaker

Three-state circuit breaker (Closed → Open → HalfOpen) built on the standard library only. It opens after consecutive failures exceed the threshold and allows calls again after the timeout period.

Because the circuit breaker has observable state, it is constructed separately and passed via cb.Wrap:

cb := middleware.NewCircuitBreaker(5, 30*time.Second)
router := llmrouter.New(
    llmrouter.WithMiddleware(cb.Wrap),
)
fmt.Println(cb.State()) // CBStateClosed / CBStateOpen / CBStateHalfOpen
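A rough sketch of such a state machine follows. It is not the package's implementation: the `breaker` type is hypothetical, it is not safe for concurrent use, it opens when the failure count reaches the threshold (the exact comparison is an assumption), and time is passed in as an offset so the example is deterministic.

```go
package main

import (
	"fmt"
	"time"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

func (s state) String() string {
	return [...]string{"closed", "open", "half-open"}[s]
}

// breaker is a single-goroutine sketch; a real one would need a mutex.
type breaker struct {
	threshold int           // consecutive failures that open the circuit
	timeout   time.Duration // how long to stay open before probing
	failures  int
	st        state
	openedAt  time.Duration // offset at which the circuit opened
}

// allow reports whether a call may proceed at offset now.
func (b *breaker) allow(now time.Duration) bool {
	if b.st == open {
		if now-b.openedAt < b.timeout {
			return false
		}
		b.st = halfOpen // timeout elapsed: let a probe through
	}
	return true
}

// record updates the state with the outcome of a call.
func (b *breaker) record(ok bool, now time.Duration) {
	if ok {
		b.failures = 0
		b.st = closed
		return
	}
	b.failures++
	// A failed probe, or too many consecutive failures, opens the circuit.
	if b.st == halfOpen || b.failures >= b.threshold {
		b.st = open
		b.openedAt = now
	}
}

func main() {
	b := &breaker{threshold: 2, timeout: 30 * time.Second}
	b.record(false, 0)
	b.record(false, time.Second)
	fmt.Println("after two failures:", b.st)
	fmt.Println("allowed at 10s:", b.allow(10*time.Second))
	fmt.Println("allowed at 31s:", b.allow(31*time.Second))
	fmt.Println("state while probing:", b.st)
	b.record(true, 32*time.Second)
	fmt.Println("after successful probe:", b.st)
}
```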

Timeout

Enforces a deadline on both Complete and Stream calls. On timeout, Stream surfaces the error through StreamResult.Err().

middleware.Timeout(60 * time.Second)

Custom middleware

Any func(llmrouter.Provider) llmrouter.Provider satisfies MiddlewareFunc directly:

func Logging(next llmrouter.Provider) llmrouter.Provider {
    return &loggingProvider{Provider: next}
}

router := llmrouter.New(llmrouter.WithMiddleware(Logging))

Streaming

Stream returns a *StreamResult iterator. Advance it with Next(), read the current event with Event(), and check errors after the loop with Err(). Always defer stream.Close() to release resources.

stream, err := router.Stream(ctx, &llmrouter.Request{
    Model:    "claude-sonnet-4-20250514",
    Messages: []llmrouter.Message{
        {Role: llmrouter.RoleUser, Content: "Write a haiku about Go."},
    },
})
if err != nil {
    log.Fatal(err)
}
defer stream.Close()

for stream.Next() {
    event := stream.Event()
    switch event.Type {
    case llmrouter.EventContentDelta:
        fmt.Print(event.Content)
    case llmrouter.EventToolCallDelta:
        // handle tool call delta
    case llmrouter.EventDone:
        // event.Response holds the final response with usage stats
    }
}
if err := stream.Err(); err != nil {
    log.Fatal(err)
}

Tool Calling

Define tools once and use them across any provider that supports function calling:

tool := llmrouter.Tool{
    Type: "function",
    Function: llmrouter.Function{
        Name:        "get_weather",
        Description: "Get current weather for a location",
        Parameters:  json.RawMessage(`{
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }`),
    },
}

resp, err := router.Complete(ctx, &llmrouter.Request{
    Model:    "gpt-4o-mini",
    Messages: messages,
    Tools:    []llmrouter.Tool{tool},
})
if err != nil {
    log.Fatal(err)
}

Multimodal

Messages support text, images, and documents via ContentParts:

msg := llmrouter.Message{
    Role: llmrouter.RoleUser,
    ContentParts: []llmrouter.ContentPart{
        {Type: "text", Text: "What's in this image?"},
        {Type: "image_url", ImageURL: &llmrouter.ImageURL{URL: "https://..."}},
    },
}

Prompt Caching

Mark static content for provider-level caching. Anthropic uses explicit CacheControl annotations; OpenAI and Gemini cache automatically. Observe savings via Usage.CachedPromptTokens:

req := &llmrouter.Request{
    Model: "claude-sonnet-4-20250514",
    Messages: []llmrouter.Message{
        {
            Role:         llmrouter.RoleSystem,
            Content:      longSystemPrompt,
            CacheControl: &llmrouter.CacheControl{Type: "ephemeral"},
        },
        {Role: llmrouter.RoleUser, Content: userQuery},
    },
}

Error Handling

The library classifies errors for intelligent retry and routing decisions:

Error	Retryable	Description
`ErrRateLimited`	Yes	Provider rate limit (429)
`ErrAuthFailed`	No	Invalid API key (401/403)
`ErrInvalidRequest`	No	Malformed request (400)
`ErrCircuitOpen`	No	Circuit breaker is open
`ErrMaxRetriesExceeded`	No	All retry attempts exhausted
`ErrUnknownModel`	No	Model not found in any provider
`ErrNoProviders`	No	No providers registered

Use llmrouter.IsRetryable(err) and llmrouter.IsRateLimited(err) for programmatic checks.

Project Structure

router.go                      # Core router — provider registry, model resolution, middleware chain
provider.go                    # Provider interface and MiddlewareFunc type
types.go                       # Unified request/response types, streaming events, tool definitions
options.go                     # Functional options for router configuration
errors.go                      # Error types and retryability classification
middleware/
  retry.go                     # Retry with exponential backoff
  timeout.go                   # Request timeout enforcement
  breaker.go                   # Circuit breaker state machine (stdlib only)
  circuitbreaker.go            # Circuit breaker middleware wrapper
providers/
  openai/                      # OpenAI + compatible providers (DeepSeek, Groq, Together, Ollama, Sarvam)
  anthropic/                   # Anthropic Claude
  gemini/                      # Google Gemini
examples/
  simple/                      # Basic completion
  streaming/                   # Streaming responses
  tools/                       # Function calling
  fallback/                    # Multi-provider with middleware

License

Apache 2.0 — see LICENSE.

Built by BlueFunda — open-sourced under Apache 2.0.

Documentation ¶

Overview ¶

Package llmrouter provides a unified interface for routing LLM requests across multiple AI providers. Write once against a single API and deploy across OpenAI, Anthropic Claude, Google Gemini, or any OpenAI-compatible service — DeepSeek, Groq, Together AI, Ollama, Sarvam, and more.

Installation ¶

go get github.com/bluefunda/llmrouter

Quick start ¶

import (
    llmrouter "github.com/bluefunda/llmrouter"
    "github.com/bluefunda/llmrouter/middleware"
    "github.com/bluefunda/llmrouter/providers/anthropic"
    "github.com/bluefunda/llmrouter/providers/openai"
)

router := llmrouter.New(
    llmrouter.WithProvider("openai", openai.NewFromEnv("openai", "OPENAI_API_KEY")),
    llmrouter.WithProvider("anthropic", anthropic.NewFromEnv()),
    llmrouter.WithMiddleware(
        middleware.Retry(3, time.Second),
        middleware.Timeout(60*time.Second),
    ),
)

resp, err := router.Complete(ctx, &llmrouter.Request{
    Model:    "gpt-4o-mini",
    Messages: []llmrouter.Message{{Role: llmrouter.RoleUser, Content: "Hello!"}},
})

Providers ¶

Three native provider packages are included:

github.com/bluefunda/llmrouter/providers/openai — OpenAI (gpt-4o, gpt-4o-mini, o1, ...)
github.com/bluefunda/llmrouter/providers/anthropic — Anthropic Claude (claude-sonnet-4, claude-haiku-4, ...)
github.com/bluefunda/llmrouter/providers/gemini — Google Gemini (gemini-2.0-flash, gemini-2.5-pro, ...)

The openai package also covers any OpenAI-compatible API via built-in presets:

openai.NewFromEnv("deepseek", "DEEPSEEK_API_KEY")   // DeepSeek
openai.NewFromEnv("groq",     "GROQ_API_KEY")       // Groq
openai.NewFromEnv("together", "TOGETHER_API_KEY")   // Together AI
openai.NewFromEnv("ollama",   "")                   // Ollama (local)
openai.NewFromEnv("sarvam",   "SARVAM_API_KEY")     // Sarvam

Streaming ¶

Use Router.Stream to receive tokens as they arrive:

stream, err := router.Stream(ctx, &llmrouter.Request{
    Model:    "claude-sonnet-4-20250514",
    Messages: []llmrouter.Message{{Role: llmrouter.RoleUser, Content: "Write a haiku."}},
})
if err != nil {
    log.Fatal(err)
}
defer stream.Close()
for stream.Next() {
    event := stream.Event()
    switch event.Type {
    case llmrouter.EventContentDelta:
        fmt.Print(event.Content)
    case llmrouter.EventDone:
        fmt.Println()
    }
}
if err := stream.Err(); err != nil {
    log.Fatal(err)
}

Fallback routing ¶

Register multiple providers and declare a fallback order. On primary failure the router tries each fallback in sequence, returning the first success:

router := llmrouter.New(
    llmrouter.WithProvider("openai",    openai.NewFromEnv("openai", "OPENAI_API_KEY")),
    llmrouter.WithProvider("anthropic", anthropic.NewFromEnv()),
    llmrouter.WithModelMapping("gpt-4o", "openai"),
    llmrouter.WithFallback("anthropic"), // tried if openai fails
)
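The "try each fallback in sequence, return the first success" behaviour can be sketched as follows. This is a simplified illustration: `completeFunc` and `completeWithFallback` are hypothetical, and whether the real router falls back on every error or only on some classes of error is not stated here, so the sketch falls back on any error.

```go
package main

import (
	"errors"
	"fmt"
)

// errDown simulates a failing provider.
var errDown = errors.New("provider down")

// completeFunc is a hypothetical stand-in for a provider's Complete method.
type completeFunc func(prompt string) (string, error)

// completeWithFallback tries the primary first, then each fallback in
// order, and returns the first success. If every provider fails, the
// individual errors are joined into one.
func completeWithFallback(prompt string, providers map[string]completeFunc, primary string, fallbacks ...string) (string, error) {
	var errs []error
	for _, name := range append([]string{primary}, fallbacks...) {
		fn, ok := providers[name]
		if !ok {
			errs = append(errs, fmt.Errorf("%s: not registered", name))
			continue
		}
		out, err := fn(prompt)
		if err == nil {
			return out, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", name, err))
	}
	return "", errors.Join(errs...)
}

func main() {
	providers := map[string]completeFunc{
		"openai":    func(string) (string, error) { return "", errDown },
		"anthropic": func(p string) (string, error) { return "anthropic answered: " + p, nil },
	}
	out, err := completeWithFallback("Hello!", providers, "openai", "anthropic")
	fmt.Println(out, err)
}
```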

Prompt caching ¶

Mark static blocks for provider-level caching. Anthropic uses explicit cache_control annotations; OpenAI and Gemini cache automatically. Observe savings via [Usage.CachedPromptTokens] and [Usage.CacheCreationTokens]:

req := &llmrouter.Request{
    Model: "claude-sonnet-4-20250514",
    Messages: []llmrouter.Message{
        {
            Role:         llmrouter.RoleSystem,
            Content:      longSystemPrompt, // paid once, reused on every call
            CacheControl: &llmrouter.CacheControl{Type: "ephemeral"},
        },
        {Role: llmrouter.RoleUser, Content: userQuery},
    },
}
resp, err := router.Complete(ctx, req)
if err != nil {
    log.Fatal(err)
}
fmt.Printf("cached=%d creation=%d\n",
    resp.Usage.CachedPromptTokens, resp.Usage.CacheCreationTokens)

Tool calling ¶

Pass tool definitions in the request; the model returns tool calls which your code executes and returns as RoleTool messages:

req := &llmrouter.Request{
    Model: "gpt-4o-mini",
    Messages: []llmrouter.Message{
        {Role: llmrouter.RoleUser, Content: "What's the weather in Tokyo?"},
    },
    Tools: []llmrouter.Tool{weatherTool},
}
resp, err := router.Complete(ctx, req)
if err != nil {
    log.Fatal(err)
}
if len(resp.Choices) > 0 && resp.Choices[0].FinishReason == "tool_calls" {
    tc := resp.Choices[0].Message.ToolCalls[0]
    result := callWeatherAPI(tc.Function.Arguments)
    // send result back in a follow-up request
}
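One plausible way to assemble the follow-up request is sketched below. The types are trimmed local copies of the documented Message, ToolCall, and FuncCall structs so the example runs on its own; `withToolResults` and the `run` callback are hypothetical, and the message layout (assistant turn followed by one RoleTool message per call, linked by ToolCallID) follows the common OpenAI-style convention rather than anything this package confirms.

```go
package main

import "fmt"

type Role string

const (
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type FuncCall struct{ Name, Arguments string }

type ToolCall struct {
	ID       string
	Function FuncCall
}

type Message struct {
	Role       Role
	Content    string
	ToolCalls  []ToolCall
	ToolCallID string
}

// withToolResults returns the history plus the assistant turn and one
// RoleTool message per tool call. run executes a call and returns its
// result as a string.
func withToolResults(history []Message, assistant Message, run func(FuncCall) string) []Message {
	out := append([]Message{}, history...) // copy so the caller's slice is untouched
	out = append(out, assistant)
	for _, tc := range assistant.ToolCalls {
		out = append(out, Message{
			Role:       RoleTool,
			Content:    run(tc.Function),
			ToolCallID: tc.ID, // ties the result to the call that requested it
		})
	}
	return out
}

func main() {
	history := []Message{{Role: RoleUser, Content: "What's the weather in Tokyo?"}}
	assistant := Message{
		Role: RoleAssistant,
		ToolCalls: []ToolCall{
			{ID: "call_1", Function: FuncCall{Name: "get_weather", Arguments: `{"location":"Tokyo"}`}},
		},
	}
	// A canned result stands in for a real weather lookup.
	msgs := withToolResults(history, assistant, func(FuncCall) string { return `{"temp_c":21}` })
	for _, m := range msgs {
		fmt.Printf("%-9s id=%q content=%q\n", m.Role, m.ToolCallID, m.Content)
	}
}
```

The resulting slice would then be sent as the Messages of a second Complete call so the model can produce its final answer.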

Middleware ¶

Middleware is applied in declaration order; each wraps the next. The github.com/bluefunda/llmrouter/middleware package provides three built-ins:

github.com/bluefunda/llmrouter/middleware.Retry — exponential backoff on retryable errors (429, 5xx)
github.com/bluefunda/llmrouter/middleware.Timeout — per-request context deadline
github.com/bluefunda/llmrouter/middleware.NewCircuitBreaker — open circuit after N consecutive failures

Custom middleware is a MiddlewareFunc — a function that wraps a Provider:

func Logging(next llmrouter.Provider) llmrouter.Provider {
    return &loggingProvider{Provider: next}
}

router := llmrouter.New(
    llmrouter.WithMiddleware(Logging),
)

Model resolution ¶

The router resolves a model name to a provider in this order:

Explicit mapping via WithModelMapping
Provider name match (model name equals a registered provider name)
Provider model list scan via Provider.Models()

Error handling ¶

Errors are classified for intelligent retry decisions. Use IsRetryable and IsRateLimited for programmatic checks, or match typed sentinels directly:

resp, err := router.Complete(ctx, req)
if errors.Is(err, llmrouter.ErrRateLimited) {
    // back off and retry later
}
if errors.Is(err, llmrouter.ErrCircuitOpen) {
    // provider is temporarily unavailable
}

Other sentinels: ErrUnknownModel, ErrUnknownProvider, ErrNoProviders, ErrAuthFailed, ErrInvalidRequest, ErrProviderError, ErrMaxRetriesExceeded.
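Sentinel matching works through wrapped errors because APIError exposes Unwrap. The sketch below reproduces that pattern with local copies of two sentinels and a trimmed APIError; having Unwrap return the Err field is an assumption, and `describe` is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of two sentinels so the sketch runs without the library.
var (
	ErrRateLimited = errors.New("rate limited")
	ErrAuthFailed  = errors.New("authentication failed")
)

// APIError mirrors some of the documented fields.
type APIError struct {
	Provider   string
	StatusCode int
	Message    string
	Err        error
}

func (e *APIError) Error() string {
	return fmt.Sprintf("%s: %d %s", e.Provider, e.StatusCode, e.Message)
}

// Unwrap returning Err is assumed; it is what lets errors.Is see the sentinel.
func (e *APIError) Unwrap() error { return e.Err }

// describe uses errors.Is for classification and errors.As for details.
func describe(err error) string {
	var apiErr *APIError
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrRateLimited):
		if errors.As(err, &apiErr) {
			return fmt.Sprintf("rate limited by %s (HTTP %d)", apiErr.Provider, apiErr.StatusCode)
		}
		return "rate limited"
	case errors.Is(err, ErrAuthFailed):
		return "check API key"
	default:
		return "other: " + err.Error()
	}
}

func main() {
	err := fmt.Errorf("complete: %w", &APIError{
		Provider: "openai", StatusCode: 429, Message: "too many requests", Err: ErrRateLimited,
	})
	fmt.Println(describe(err))
	fmt.Println(describe(ErrAuthFailed))
	fmt.Println(describe(nil))
}
```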

Packages ¶

github.com/bluefunda/llmrouter/middleware — retry, timeout, and circuit breaker middleware
github.com/bluefunda/llmrouter/providers/openai — OpenAI and OpenAI-compatible providers (DeepSeek, Groq, Together AI, Ollama, Sarvam)
github.com/bluefunda/llmrouter/providers/anthropic — Anthropic Claude
github.com/bluefunda/llmrouter/providers/gemini — Google Gemini

Index ¶

Variables
func CalculateCost(model string, usage *Usage, prices map[string]ModelPrice) float64
func IsRateLimited(err error) bool
func IsRetryable(err error) bool
type APIError
- func (e *APIError) Error() string
- func (e *APIError) Unwrap() error
type CacheControl
type Choice
type ContentPart
type Delta
type Document
type Event
type EventType
type FuncCall
type FuncRef
type Function
type ImageURL
type Message
type MiddlewareFunc
type ModelPrice
type Option
- func WithFallback(providers ...string) Option
- func WithMiddleware(m ...MiddlewareFunc) Option
- func WithModelMapping(model, provider string) Option
- func WithPriceTable(prices map[string]ModelPrice) Option
- func WithProvider(name string, p Provider) Option
type Provider
type ProviderConfig
type Request
type Response
type Role
type Router
- func New(opts ...Option) *Router
- func (r *Router) AddMiddleware(m MiddlewareFunc)
- func (r *Router) Close() error
- func (r *Router) Complete(ctx context.Context, req *Request) (*Response, error)
- func (r *Router) GetProvider(name string) (Provider, bool)
- func (r *Router) MapModel(model, provider string)
- func (r *Router) Providers() []string
- func (r *Router) RegisterProvider(name string, p Provider)
- func (r *Router) SetFallbacks(providers ...string)
- func (r *Router) Stream(ctx context.Context, req *Request) (*StreamResult, error)
type StreamResult
- func NewStreamResult(ch <-chan Event) *StreamResult
- func (s *StreamResult) Close() error
- func (s *StreamResult) Err() error
- func (s *StreamResult) Event() Event
- func (s *StreamResult) Next() bool
- func (s *StreamResult) OnClose(fn func() error)
type Tool
type ToolCall
type ToolChoice
type Usage
- func (u *Usage) CacheHitRate() float64

Constants ¶

This section is empty.

Variables ¶

var (
	ErrUnknownModel       = errors.New("unknown model")
	ErrUnknownProvider    = errors.New("unknown provider")
	ErrNoProviders        = errors.New("no providers registered")
	ErrRateLimited        = errors.New("rate limited")
	ErrInvalidRequest     = errors.New("invalid request")
	ErrAuthFailed         = errors.New("authentication failed")
	ErrProviderError      = errors.New("provider error")
	ErrCircuitOpen        = errors.New("circuit breaker is open")
	ErrMaxRetriesExceeded = errors.New("max retries exceeded")
)

Sentinel errors

var DefaultPrices = map[string]ModelPrice{

	"gpt-4.1":      {InputPerMillion: 2.00, OutputPerMillion: 8.00, CacheReadPerMillion: 0.50},
	"gpt-4.1-mini": {InputPerMillion: 0.40, OutputPerMillion: 1.60, CacheReadPerMillion: 0.10},
	"gpt-4.1-nano": {InputPerMillion: 0.10, OutputPerMillion: 0.40, CacheReadPerMillion: 0.025},
	"gpt-4o":       {InputPerMillion: 2.50, OutputPerMillion: 10.00, CacheReadPerMillion: 1.25},
	"gpt-4o-mini":  {InputPerMillion: 0.15, OutputPerMillion: 0.60, CacheReadPerMillion: 0.075},
	"o4-mini":      {InputPerMillion: 1.10, OutputPerMillion: 4.40, CacheReadPerMillion: 0.275},

	"claude-opus-4-20250514":     {InputPerMillion: 15.00, OutputPerMillion: 75.00, CacheReadPerMillion: 1.50},
	"claude-sonnet-4-20250514":   {InputPerMillion: 3.00, OutputPerMillion: 15.00, CacheReadPerMillion: 0.30},
	"claude-3-5-haiku-20241022":  {InputPerMillion: 0.80, OutputPerMillion: 4.00, CacheReadPerMillion: 0.08},
	"claude-3-5-sonnet-20241022": {InputPerMillion: 3.00, OutputPerMillion: 15.00, CacheReadPerMillion: 0.30},
	"claude-3-opus-20240229":     {InputPerMillion: 15.00, OutputPerMillion: 75.00, CacheReadPerMillion: 1.50},
	"claude-3-sonnet-20240229":   {InputPerMillion: 3.00, OutputPerMillion: 15.00, CacheReadPerMillion: 0.30},
	"claude-3-haiku-20240307":    {InputPerMillion: 0.25, OutputPerMillion: 1.25, CacheReadPerMillion: 0.03},

	"deepseek-chat":  {InputPerMillion: 0.07, OutputPerMillion: 1.10},
	"deepseek-coder": {InputPerMillion: 0.07, OutputPerMillion: 1.10},

	"gemini-2.5-pro":   {InputPerMillion: 1.25, OutputPerMillion: 10.00},
	"gemini-2.5-flash": {InputPerMillion: 0.15, OutputPerMillion: 0.60},
	"gemini-2.0-flash": {InputPerMillion: 0.10, OutputPerMillion: 0.40},
}

DefaultPrices is the built-in price table for known models. Cost is 0 for models not present in the map. Prices reflect standard API rates as of mid-2025; override with WithPriceTable if needed.

Functions ¶

func CalculateCost ¶ added in v0.4.1

func CalculateCost(model string, usage *Usage, prices map[string]ModelPrice) float64

CalculateCost returns the estimated USD cost for a request given token usage and a price table. Cached tokens are billed at CacheReadPerMillion; uncached prompt tokens at InputPerMillion. Returns 0 if the model is not in the price table or usage is nil.
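The billing rule can be worked through by hand. The sketch below is not the library's code: `estimateCost` is a hypothetical name, the structs are trimmed copies of the documented ones, and it assumes CachedPromptTokens is already included in PromptTokens, which the text above implies but does not state.

```go
package main

import "fmt"

type ModelPrice struct {
	InputPerMillion     float64
	OutputPerMillion    float64
	CacheReadPerMillion float64
}

type Usage struct {
	PromptTokens       int
	CompletionTokens   int
	CachedPromptTokens int
}

// estimateCost bills cached prompt tokens at the cache-read rate, the
// remaining prompt tokens at the input rate, and completion tokens at
// the output rate. It returns 0 for unknown models or nil usage.
func estimateCost(model string, u *Usage, prices map[string]ModelPrice) float64 {
	p, ok := prices[model]
	if !ok || u == nil {
		return 0
	}
	uncached := u.PromptTokens - u.CachedPromptTokens
	if uncached < 0 {
		uncached = 0 // defensive: treat inconsistent counts as fully cached
	}
	return (float64(uncached)*p.InputPerMillion +
		float64(u.CachedPromptTokens)*p.CacheReadPerMillion +
		float64(u.CompletionTokens)*p.OutputPerMillion) / 1e6
}

func main() {
	// Rates copied from the gpt-4o-mini entry in DefaultPrices.
	prices := map[string]ModelPrice{
		"gpt-4o-mini": {InputPerMillion: 0.15, OutputPerMillion: 0.60, CacheReadPerMillion: 0.075},
	}
	u := &Usage{PromptTokens: 1_000_000, CachedPromptTokens: 400_000, CompletionTokens: 100_000}

	// 600k uncached * 0.15 + 400k cached * 0.075 + 100k output * 0.60, per million.
	fmt.Printf("%.4f USD\n", estimateCost("gpt-4o-mini", u, prices))
}
```

For these numbers the three terms are 0.09, 0.03, and 0.06 USD, for a total of about 0.18 USD.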

func IsRateLimited ¶

func IsRateLimited(err error) bool

IsRateLimited reports whether the error indicates provider rate limiting (ErrRateLimited, HTTP 429).

func IsRetryable ¶

func IsRetryable(err error) bool

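IsRetryable reports whether the error is classified as retryable, for example ErrRateLimited. Auth failures, invalid requests, and an open circuit are not retryable.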

Types ¶

type APIError ¶

type APIError struct {
	Provider   string
	StatusCode int
	Message    string
	Type       string
	Err        error
}

APIError represents an error from an LLM provider API

func (*APIError) Error ¶

func (e *APIError) Error() string

func (*APIError) Unwrap ¶

func (e *APIError) Unwrap() error

type CacheControl ¶

type CacheControl struct {
	Type string `json:"type"` // "ephemeral"
}

CacheControl marks a content block for provider-level prompt caching. Only "ephemeral" is currently supported. OpenAI and Gemini cache automatically and ignore this field; set it only when targeting Anthropic.

type Choice ¶

type Choice struct {
	Index        int      `json:"index"`
	Message      *Message `json:"message,omitempty"`
	Delta        *Delta   `json:"delta,omitempty"`
	FinishReason string   `json:"finish_reason,omitempty"`
}

Choice represents a completion choice

type ContentPart ¶

type ContentPart struct {
	Type         string        `json:"type"` // "text", "image_url", or "document"
	Text         string        `json:"text,omitempty"`
	ImageURL     *ImageURL     `json:"image_url,omitempty"`
	Document     *Document     `json:"document,omitempty"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

ContentPart represents a part of a multimodal message

type Delta ¶

type Delta struct {
	Role      Role       `json:"role,omitempty"`
	Content   string     `json:"content,omitempty"`
	ToolCalls []ToolCall `json:"tool_calls,omitempty"`
}

Delta represents streaming content delta

type Document ¶

type Document struct {
	Base64    string `json:"base64"`
	MediaType string `json:"media_type"` // e.g. "application/pdf"
}

Document represents a document (PDF, etc.) for providers that support it natively

type Event ¶

type Event struct {
	Type     EventType
	Content  string
	Delta    *Delta
	Response *Response
	Error    error
}

Event represents a streaming event

type EventType ¶

type EventType int

EventType represents the type of streaming event

const (
	EventContentDelta  EventType = iota // Text content chunk
	EventToolCallDelta                  // Tool call chunk
	EventDone                           // Stream completed
	EventError                          // Error occurred
)

type FuncCall ¶

type FuncCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

FuncCall represents a function call

type FuncRef ¶

type FuncRef struct {
	Name string `json:"name"`
}

FuncRef references a specific function

type Function ¶

type Function struct {
	Name        string          `json:"name"`
	Description string          `json:"description,omitempty"`
	Parameters  json.RawMessage `json:"parameters,omitempty"`
}

Function represents a function definition

type ImageURL ¶

type ImageURL struct {
	URL       string `json:"url"`
	Detail    string `json:"detail,omitempty"`
	Base64    string `json:"base64,omitempty"`
	MediaType string `json:"media_type,omitempty"`
}

ImageURL represents an image reference with both URL and base64 forms

type Message ¶

type Message struct {
	Role         Role          `json:"role"`
	Content      string        `json:"content"`
	ContentParts []ContentPart `json:"content_parts,omitempty"`
	Name         string        `json:"name,omitempty"`
	ToolCalls    []ToolCall    `json:"tool_calls,omitempty"`
	ToolCallID   string        `json:"tool_call_id,omitempty"`
	// CacheControl marks this message's content for prompt caching (Anthropic only).
	// For user messages with ContentParts, set CacheControl on individual parts instead.
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

Message represents a chat message

type MiddlewareFunc ¶ added in v0.4.0

type MiddlewareFunc func(Provider) Provider

MiddlewareFunc wraps a Provider with additional functionality. It is a plain function type; any func(Provider) Provider satisfies it directly.

type ModelPrice ¶ added in v0.4.1

type ModelPrice struct {
	InputPerMillion     float64 // USD per million input (prompt) tokens
	OutputPerMillion    float64 // USD per million output (completion) tokens
	CacheReadPerMillion float64 // USD per million cache-read tokens; 0 if not applicable
}

ModelPrice holds the USD pricing for a model, expressed per million tokens.

type Option ¶

type Option func(*Router)

Option configures the Router

func WithFallback ¶

func WithFallback(providers ...string) Option

WithFallback sets fallback providers in priority order

func WithMiddleware ¶

func WithMiddleware(m ...MiddlewareFunc) Option

WithMiddleware adds middleware to the processing chain. Use this with middleware from the middleware package:

import "github.com/bluefunda/llmrouter/middleware"

router := llmrouter.New(
    llmrouter.WithMiddleware(
        middleware.Retry(3, time.Second),
        middleware.Timeout(60*time.Second),
    ),
)

func WithModelMapping ¶

func WithModelMapping(model, provider string) Option

WithModelMapping maps a model to a specific provider

func WithPriceTable ¶ added in v0.4.1

func WithPriceTable(prices map[string]ModelPrice) Option

WithPriceTable replaces the default price table used for cost calculation. Callers can start from DefaultPrices and extend it, or supply a fully custom map.

func WithProvider ¶

func WithProvider(name string, p Provider) Option

WithProvider registers a provider with the router

type Provider ¶

type Provider interface {
	// Name returns the provider identifier (e.g., "openai", "anthropic")
	Name() string

	// Models returns the list of supported model IDs
	Models() []string

	// Complete performs a non-streaming completion
	Complete(ctx context.Context, req *Request) (*Response, error)

	// Stream performs a streaming completion
	Stream(ctx context.Context, req *Request) (*StreamResult, error)
}

Provider is the core interface that all LLM providers must implement.

type ProviderConfig ¶

type ProviderConfig struct {
	Name          string
	APIKey        string
	BaseURL       string
	Model         string
	Models        []string
	Timeout       time.Duration
	CustomHeaders map[string]string // custom HTTP headers (e.g. api-subscription-key)
	// StringContentOnly forces message content to be sent as plain strings
	// instead of structured arrays. Required for some OpenAI-compatible APIs
	// (e.g. Sarvam) that don't support the array content format.
	StringContentOnly bool
}

ProviderConfig holds common configuration for providers

type Request ¶

type Request struct {
	Messages    []Message      `json:"messages"`
	Model       string         `json:"model,omitempty"`
	Tools       []Tool         `json:"tools,omitempty"`
	ToolChoice  *ToolChoice    `json:"tool_choice,omitempty"`
	Temperature *float64       `json:"temperature,omitempty"`
	MaxTokens   *int           `json:"max_tokens,omitempty"`
	TopP        *float64       `json:"top_p,omitempty"`
	Stop        []string       `json:"stop,omitempty"`
	Metadata    map[string]any `json:"metadata,omitempty"`
}

Request represents a unified LLM request

type Response ¶

type Response struct {
	ID       string   `json:"id"`
	Object   string   `json:"object"`
	Created  int64    `json:"created"`
	Model    string   `json:"model"`
	Choices  []Choice `json:"choices"`
	Usage    *Usage   `json:"usage,omitempty"`
	Provider string   `json:"provider"`
}

Response represents a unified LLM response (OpenAI-compatible)

type Role ¶

type Role string

Role represents the message role

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type Router ¶

type Router struct {
	// contains filtered or unexported fields
}

Router manages multiple LLM providers and routes requests

func New ¶

func New(opts ...Option) *Router

New creates a new Router with the given options

func (*Router) AddMiddleware ¶

func (r *Router) AddMiddleware(m MiddlewareFunc)

AddMiddleware adds middleware to the router

func (*Router) Close ¶ added in v0.4.0

func (r *Router) Close() error

Close releases resources held by registered providers that implement io.Closer. Call this when the router is no longer needed (e.g. on application shutdown).

func (*Router) Complete ¶

func (r *Router) Complete(ctx context.Context, req *Request) (*Response, error)

Complete performs a non-streaming completion

func (*Router) GetProvider ¶

func (r *Router) GetProvider(name string) (Provider, bool)

GetProvider returns a provider by name

func (*Router) MapModel ¶

func (r *Router) MapModel(model, provider string)

MapModel maps a model name to a specific provider

func (*Router) Providers ¶

func (r *Router) Providers() []string

Providers returns the list of registered provider names in insertion order.

func (*Router) RegisterProvider ¶

func (r *Router) RegisterProvider(name string, p Provider)

RegisterProvider adds a provider to the router

func (*Router) SetFallbacks ¶

func (r *Router) SetFallbacks(providers ...string)

SetFallbacks sets the fallback provider order

func (*Router) Stream ¶

func (r *Router) Stream(ctx context.Context, req *Request) (*StreamResult, error)

Stream sends a request to the appropriate provider and streams the response.

type StreamResult ¶ added in v0.3.1

type StreamResult struct {
	// contains filtered or unexported fields
}

StreamResult is the iterator type for streaming LLM responses. Usage:

stream, err := router.Stream(ctx, req)
if err != nil { return err }
defer stream.Close()
for stream.Next() {
    event := stream.Event()
    // handle event
}
if err := stream.Err(); err != nil { ... }

func NewStreamResult ¶ added in v0.3.1

func NewStreamResult(ch <-chan Event) *StreamResult

NewStreamResult creates a StreamResult from an event channel. Providers and middleware use this to construct a StreamResult.
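To show how an iterator of this kind can sit on top of an event channel, here is a self-contained sketch. It is not the StreamResult implementation: the `stream` type, the trimmed `Event`, and the `produce` and `collect` helpers are all hypothetical, and treating an event that carries an error as the end of the stream is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errBroken simulates a failure in the middle of a stream.
var errBroken = errors.New("connection lost")

// Event is a trimmed stand-in for the documented Event type.
type Event struct {
	Content string
	Err     error
}

// stream is a channel-backed iterator in the Next/Event/Err style.
type stream struct {
	ch  <-chan Event
	cur Event
	err error
}

func newStream(ch <-chan Event) *stream { return &stream{ch: ch} }

// Next blocks for the next event. It returns false when the channel is
// closed or when an event carries an error, which Err then reports.
func (s *stream) Next() bool {
	if s.err != nil {
		return false
	}
	ev, ok := <-s.ch
	if !ok {
		return false
	}
	if ev.Err != nil {
		s.err = ev.Err
		return false
	}
	s.cur = ev
	return true
}

func (s *stream) Event() Event { return s.cur }
func (s *stream) Err() error   { return s.err }

// produce plays the provider's role: it fills a buffered channel with
// chunks, optionally followed by an error event, then closes it.
func produce(chunks []string, fail error) <-chan Event {
	ch := make(chan Event, len(chunks)+1)
	for _, c := range chunks {
		ch <- Event{Content: c}
	}
	if fail != nil {
		ch <- Event{Err: fail}
	}
	close(ch)
	return ch
}

// collect drains a stream and returns the concatenated text.
func collect(s *stream) (string, error) {
	var text string
	for s.Next() {
		text += s.Event().Content
	}
	return text, s.Err()
}

func main() {
	text, err := collect(newStream(produce([]string{"Hel", "lo"}, nil)))
	fmt.Printf("%q %v\n", text, err)

	text, err = collect(newStream(produce([]string{"Hel"}, errBroken)))
	fmt.Printf("%q %v\n", text, err)
}
```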

func (*StreamResult) Close ¶ added in v0.3.1

func (s *StreamResult) Close() error

Close stops the stream and releases resources. Safe to call multiple times.

func (*StreamResult) Err ¶ added in v0.3.1

func (s *StreamResult) Err() error

Err returns the streaming error, if any. Check after Next returns false.

func (*StreamResult) Event ¶ added in v0.3.1

func (s *StreamResult) Event() Event

Event returns the current event (valid after Next returns true).

func (*StreamResult) Next ¶ added in v0.3.1

func (s *StreamResult) Next() bool

Next advances to the next event. Returns false when the stream ends or an error occurs.

func (*StreamResult) OnClose ¶ added in v0.3.1

func (s *StreamResult) OnClose(fn func() error)

OnClose registers a function called when Close is invoked (e.g. context cancel).

type Tool ¶

type Tool struct {
	Type     string   `json:"type"`
	Function Function `json:"function"`
}

Tool represents a function/tool definition

type ToolCall ¶

type ToolCall struct {
	ID       string   `json:"id"`
	Type     string   `json:"type"`
	Function FuncCall `json:"function"`
	Index    *int     `json:"index,omitempty"`
}

ToolCall represents a tool invocation

type ToolChoice ¶

type ToolChoice struct {
	Type     string   `json:"type,omitempty"`
	Function *FuncRef `json:"function,omitempty"`
}

ToolChoice controls tool selection

type Usage ¶

type Usage struct {
	PromptTokens        int     `json:"prompt_tokens"`
	CompletionTokens    int     `json:"completion_tokens"`
	TotalTokens         int     `json:"total_tokens"`
	CachedPromptTokens  int     `json:"cached_prompt_tokens,omitempty"`  // tokens served from cache (all providers)
	CacheCreationTokens int     `json:"cache_creation_tokens,omitempty"` // tokens written to cache (Anthropic only)
	Cost                float64 `json:"cost_usd,omitempty"`              // estimated USD cost; 0 if model not in price table
}

Usage represents token usage

func (*Usage) CacheHitRate ¶ added in v0.4.1

func (u *Usage) CacheHitRate() float64

CacheHitRate returns the fraction of prompt tokens served from cache (0–1). Returns 0 if no prompt tokens were recorded.
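As a worked example of that fraction, the sketch below computes cached tokens divided by prompt tokens. `hitRate` is a hypothetical stand-in operating on a trimmed Usage struct; the nil check is an addition of this sketch, not documented behaviour.

```go
package main

import "fmt"

// Usage is a trimmed copy of the documented struct.
type Usage struct {
	PromptTokens       int
	CachedPromptTokens int
}

// hitRate returns cached / prompt, or 0 when no prompt tokens were recorded.
func hitRate(u *Usage) float64 {
	if u == nil || u.PromptTokens == 0 {
		return 0
	}
	return float64(u.CachedPromptTokens) / float64(u.PromptTokens)
}

func main() {
	u := &Usage{PromptTokens: 1000, CachedPromptTokens: 250}
	fmt.Println(hitRate(u)) // prints 0.25
}
```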

Directories ¶

Path	Synopsis
examples/fallback	command
examples/simple	command
examples/streaming	command
examples/tools	command
middleware	Package middleware provides composable cross-cutting concerns for LLM provider calls: retry with exponential backoff, per-request timeouts, and a circuit breaker to prevent cascading failures.
providers/anthropic	Package anthropic implements the llmrouter.Provider interface for Anthropic Claude models using the official Anthropic Go SDK.
providers/gemini	Package gemini implements the llmrouter.Provider interface for Google Gemini models using the official Google Generative AI Go SDK.
providers/openai	Package openai implements the llmrouter.Provider interface for OpenAI and any OpenAI-compatible API.
