llmrouter

Version: v0.4.2
Published: May 31, 2026 License: Apache-2.0 Imports: 8 Imported by: 0

README

llmrouter

A Go library that provides a unified interface for routing requests across multiple LLM providers. Write against one API, deploy across OpenAI, Anthropic, Google Gemini, and any OpenAI-compatible service.

Prerequisites

  • Go 1.25+
  • API key for at least one supported provider

Installation

go get github.com/bluefunda/llmrouter

Quick Start

package main

import (
    "context"
    "fmt"
    "time"

    llmrouter "github.com/bluefunda/llmrouter"
    "github.com/bluefunda/llmrouter/middleware"
    "github.com/bluefunda/llmrouter/providers/anthropic"
    "github.com/bluefunda/llmrouter/providers/openai"
)

func main() {
    router := llmrouter.New(
        llmrouter.WithProvider("openai", openai.NewFromEnv("openai", "OPENAI_API_KEY")),
        llmrouter.WithProvider("anthropic", anthropic.NewFromEnv()),
        llmrouter.WithMiddleware(
            middleware.Retry(3, time.Second),
            middleware.Timeout(60*time.Second),
        ),
    )

    resp, err := router.Complete(context.Background(), &llmrouter.Request{
        Model: "gpt-4o-mini",
        Messages: []llmrouter.Message{
            {Role: llmrouter.RoleUser, Content: "Hello!"},
        },
    })
    if err != nil {
        panic(err)
    }

    fmt.Println(resp.Choices[0].Message.Content)
}

Providers

Each provider is configured via environment variables or explicit options.

| Provider | Package | Env Variable | Models |
|---|---|---|---|
| OpenAI | providers/openai | OPENAI_API_KEY | gpt-4o, gpt-4o-mini, gpt-4.1, o4-mini |
| Anthropic | providers/anthropic | ANTHROPIC_API_KEY | claude-opus-4, claude-sonnet-4, claude-haiku-3.5 |
| Gemini | providers/gemini | GEMINI_API_KEY | gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash-exp |
| DeepSeek | providers/openai (preset: deepseek) | DEEPSEEK_API_KEY | deepseek-chat, deepseek-coder |
| Groq | providers/openai (preset: groq) | GROQ_API_KEY | llama-3.3-70b-versatile, mixtral-8x7b |
| Together | providers/openai (preset: together) | TOGETHER_API_KEY | llama-3.3-70b, mixtral-8x7b |
| Ollama | providers/openai (preset: ollama) | (none) | Any locally hosted model |
| Sarvam | providers/openai (preset: sarvam) | SARVAM_API_KEY | sarvam-m, sarvam-30b, sarvam-105b |

OpenAI-compatible providers

DeepSeek, Groq, Together AI, Ollama, and Sarvam all use the OpenAI provider with a preset name:

openai.NewFromEnv("deepseek", "DEEPSEEK_API_KEY")
openai.NewFromEnv("groq", "GROQ_API_KEY")
openai.NewFromEnv("ollama", "")  // no key needed for local
openai.NewFromEnv("sarvam", "SARVAM_API_KEY")
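
Gemini

gemini.NewFromEnv returns an error that must be checked at construction time, and the provider holds a gRPC connection. Call router.Close() (or the provider's Close() directly) on shutdown:

geminiProvider, err := gemini.NewFromEnv()
if err != nil {
    log.Fatal(err)
}
router := llmrouter.New(llmrouter.WithProvider("gemini", geminiProvider))
defer router.Close() // releases the Gemini gRPC connection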

Configuration

Router options

router := llmrouter.New(
    llmrouter.WithProvider("openai", openaiProvider),
    llmrouter.WithProvider("anthropic", anthropicProvider),
    llmrouter.WithModelMapping("gpt-4o", "openai"),
    llmrouter.WithModelMapping("claude-sonnet-4-20250514", "anthropic"),
    llmrouter.WithFallback("openai", "anthropic"),
    llmrouter.WithMiddleware(retryMw, cb.Wrap, timeoutMw),
)

| Option | Description |
|---|---|
| WithProvider | Register a named provider |
| WithModelMapping | Route a model name to a specific provider |
| WithFallback | Set fallback provider order on primary failure |
| WithMiddleware | Attach middleware to the processing chain |

Model resolution

The router resolves a model to a provider in this order:

  1. Explicit mapping: WithModelMapping("gpt-4o", "openai")
  2. Provider name match: the model name equals a registered provider name
  3. Provider model list: the router iterates providers in registration order and checks Models()
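
The three-step lookup can be sketched as follows. This is an illustration only: the provider struct and the resolve function are hypothetical stand-ins, not the library's internals, and returning an empty string for an unknown model is an assumption made to keep the sketch short.

```go
package main

import "fmt"

// provider is a hypothetical stand-in for a registered provider:
// a name plus the model IDs it reports.
type provider struct {
	name   string
	models []string
}

// resolve returns the provider name for a model, or "" if nothing matches.
func resolve(model string, mapping map[string]string, providers []provider) string {
	// 1. An explicit mapping wins.
	if name, ok := mapping[model]; ok {
		return name
	}
	// 2. The model name equals a registered provider name.
	for _, p := range providers {
		if p.name == model {
			return p.name
		}
	}
	// 3. Scan each provider's model list in registration order.
	for _, p := range providers {
		for _, m := range p.models {
			if m == model {
				return p.name
			}
		}
	}
	return ""
}

func main() {
	providers := []provider{
		{name: "openai", models: []string{"gpt-4o", "gpt-4o-mini"}},
		{name: "anthropic", models: []string{"claude-sonnet-4"}},
	}
	mapping := map[string]string{"gpt-4o": "anthropic"}

	fmt.Println(resolve("gpt-4o", mapping, providers))          // explicit mapping
	fmt.Println(resolve("openai", mapping, providers))          // provider name match
	fmt.Println(resolve("claude-sonnet-4", mapping, providers)) // model list scan
}
```

Because the explicit mapping is checked first, it can override the provider that would otherwise be found through the model list.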

Middleware

Middleware is a MiddlewareFunc — a plain func(Provider) Provider. It is applied in declaration order (first declared = outermost wrapper).
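
To make the ordering rule concrete, here is a minimal sketch of how a chain of func(Provider) Provider values could be applied so that the first declared middleware ends up outermost. The Provider interface is reduced to one method, and the chain and tag helpers are hypothetical names used for illustration; the library's own chaining code may differ.

```go
package main

import "fmt"

// Provider is a reduced stand-in for llmrouter.Provider with a single method.
type Provider interface {
	Complete(prompt string) string
}

// MiddlewareFunc mirrors the documented shape: func(Provider) Provider.
type MiddlewareFunc func(Provider) Provider

type base struct{}

func (base) Complete(prompt string) string { return "base(" + prompt + ")" }

// tagged wraps a provider and marks the result with its label.
type tagged struct {
	label string
	next  Provider
}

func (t tagged) Complete(prompt string) string {
	return t.label + "[" + t.next.Complete(prompt) + "]"
}

// tag builds a middleware that wraps the next provider with a label.
func tag(label string) MiddlewareFunc {
	return func(next Provider) Provider { return tagged{label: label, next: next} }
}

// chain applies middleware back to front so the first declared is outermost.
func chain(p Provider, mws ...MiddlewareFunc) Provider {
	for i := len(mws) - 1; i >= 0; i-- {
		p = mws[i](p)
	}
	return p
}

func main() {
	p := chain(base{}, tag("retry"), tag("timeout"))
	fmt.Println(p.Complete("hi")) // retry[timeout[base(hi)]]
}
```

With this ordering, a retry middleware declared before a timeout middleware wraps it, so each retry attempt gets its own timeout.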

Retry

Exponential backoff with configurable max attempts. Non-retryable errors (auth failures, invalid requests, context cancellation) short-circuit immediately.

middleware.Retry(3, time.Second)
middleware.Retry(3, time.Second, middleware.WithMaxDelay(10*time.Second))
middleware.Retry(3, time.Second, middleware.WithRetryFunc(myRetryPolicy))

Circuit Breaker

Stdlib-only three-state circuit breaker (Closed → Open → HalfOpen). Opens after consecutive failures exceed the threshold; recovers after the timeout period. No external dependencies.

Because the circuit breaker has observable state, it is constructed separately and passed via cb.Wrap:

cb := middleware.NewCircuitBreaker(5, 30*time.Second)
router := llmrouter.New(
    llmrouter.WithMiddleware(cb.Wrap),
)
fmt.Println(cb.State()) // CBStateClosed / CBStateOpen / CBStateHalfOpen

Timeout

Enforces a deadline on both Complete and Stream calls. On timeout, Stream surfaces the error through StreamResult.Err().

middleware.Timeout(60 * time.Second)

Custom middleware

Any func(llmrouter.Provider) llmrouter.Provider satisfies MiddlewareFunc directly:

func Logging(next llmrouter.Provider) llmrouter.Provider {
    return &loggingProvider{Provider: next}
}

router := llmrouter.New(llmrouter.WithMiddleware(Logging))
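
The loggingProvider type is not defined in the snippet above. One plausible shape, sketched here against a reduced stand-in Provider interface rather than the real llmrouter.Provider, embeds the wrapped provider so that any method it does not override is delegated automatically. The logf field and the echo provider are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Provider is a reduced stand-in for llmrouter.Provider.
type Provider interface {
	Name() string
	Complete(prompt string) (string, error)
}

// echo is a trivial provider used to exercise the wrapper.
type echo struct{}

func (echo) Name() string                           { return "echo" }
func (echo) Complete(prompt string) (string, error) { return prompt, nil }

// loggingProvider embeds the wrapped Provider, so methods it does not
// override (here Name) are delegated to the wrapped value.
type loggingProvider struct {
	Provider
	logf func(format string, args ...any)
}

// Complete times the wrapped call and logs the outcome.
func (l *loggingProvider) Complete(prompt string) (string, error) {
	start := time.Now()
	out, err := l.Provider.Complete(prompt)
	l.logf("provider=%s took=%s err=%v\n", l.Name(), time.Since(start), err)
	return out, err
}

// Logging has the func(Provider) Provider shape of a middleware.
func Logging(next Provider) Provider {
	return &loggingProvider{Provider: next, logf: func(f string, a ...any) { fmt.Printf(f, a...) }}
}

func main() {
	p := Logging(echo{})
	out, _ := p.Complete("hello")
	fmt.Println(out)
}
```

With the real interface, the wrapper would override Complete and Stream with the signatures from the Provider type and leave Name and Models to the embedded value.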

Streaming

Stream returns a *StreamResult iterator. Advance it with Next(), read the current event with Event(), and check errors after the loop with Err(). Always defer stream.Close() to release resources.

stream, err := router.Stream(ctx, &llmrouter.Request{
    Model:    "claude-sonnet-4-20250514",
    Messages: []llmrouter.Message{
        {Role: llmrouter.RoleUser, Content: "Write a haiku about Go."},
    },
})
if err != nil {
    log.Fatal(err)
}
defer stream.Close()

for stream.Next() {
    event := stream.Event()
    switch event.Type {
    case llmrouter.EventContentDelta:
        fmt.Print(event.Content)
    case llmrouter.EventToolCallDelta:
        // handle tool call delta
    case llmrouter.EventDone:
        // event.Response holds the final response with usage stats
    }
}
if err := stream.Err(); err != nil {
    log.Fatal(err)
}

Tool Calling

Define tools once and use them across any provider that supports function calling:

tool := llmrouter.Tool{
    Type: "function",
    Function: llmrouter.Function{
        Name:        "get_weather",
        Description: "Get current weather for a location",
        Parameters:  json.RawMessage(`{
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }`),
    },
}

resp, err := router.Complete(ctx, &llmrouter.Request{
    Model:    "gpt-4o-mini",
    Messages: messages,
    Tools:    []llmrouter.Tool{tool},
})
if err != nil {
    log.Fatal(err)
}

Multimodal

Messages support text, images, and documents via ContentParts:

msg := llmrouter.Message{
    Role: llmrouter.RoleUser,
    ContentParts: []llmrouter.ContentPart{
        {Type: "text", Text: "What's in this image?"},
        {Type: "image_url", ImageURL: &llmrouter.ImageURL{URL: "https://..."}},
    },
}

Prompt Caching

Mark static content for provider-level caching. Anthropic uses explicit CacheControl annotations; OpenAI and Gemini cache automatically. Observe savings via Usage.CachedPromptTokens:

req := &llmrouter.Request{
    Model: "claude-sonnet-4-20250514",
    Messages: []llmrouter.Message{
        {
            Role:         llmrouter.RoleSystem,
            Content:      longSystemPrompt,
            CacheControl: &llmrouter.CacheControl{Type: "ephemeral"},
        },
        {Role: llmrouter.RoleUser, Content: userQuery},
    },
}

Error Handling

The library classifies errors for intelligent retry and routing decisions:

| Error | Retryable | Description |
|---|---|---|
| ErrRateLimited | Yes | Provider rate limit (429) |
| ErrAuthFailed | No | Invalid API key (401/403) |
| ErrInvalidRequest | No | Malformed request (400) |
| ErrCircuitOpen | No | Circuit breaker is open |
| ErrMaxRetriesExceeded | No | All retry attempts exhausted |
| ErrUnknownModel | No | Model not found in any provider |
| ErrNoProviders | No | No providers registered |

Use llmrouter.IsRetryable(err) and llmrouter.IsRateLimited(err) for programmatic checks.

Project Structure

router.go                      # Core router — provider registry, model resolution, middleware chain
provider.go                    # Provider interface and MiddlewareFunc type
types.go                       # Unified request/response types, streaming events, tool definitions
options.go                     # Functional options for router configuration
errors.go                      # Error types and retryability classification
middleware/
  retry.go                     # Retry with exponential backoff
  timeout.go                   # Request timeout enforcement
  breaker.go                   # Circuit breaker state machine (stdlib only)
  circuitbreaker.go            # Circuit breaker middleware wrapper
providers/
  openai/                      # OpenAI + compatible providers (DeepSeek, Groq, Together, Ollama, Sarvam)
  anthropic/                   # Anthropic Claude
  gemini/                      # Google Gemini
examples/
  simple/                      # Basic completion
  streaming/                   # Streaming responses
  tools/                       # Function calling
  fallback/                    # Multi-provider with middleware

License

Apache 2.0 — see LICENSE.

Built by BlueFunda — open-sourced under Apache 2.0.

Documentation

Overview

Package llmrouter provides a unified interface for routing LLM requests across multiple AI providers. Write once against a single API and deploy across OpenAI, Anthropic Claude, Google Gemini, or any OpenAI-compatible service — DeepSeek, Groq, Together AI, Ollama, Sarvam, and more.

Installation

go get github.com/bluefunda/llmrouter

Quick start

import (
    llmrouter "github.com/bluefunda/llmrouter"
    "github.com/bluefunda/llmrouter/middleware"
    "github.com/bluefunda/llmrouter/providers/anthropic"
    "github.com/bluefunda/llmrouter/providers/openai"
)

router := llmrouter.New(
    llmrouter.WithProvider("openai", openai.NewFromEnv("openai", "OPENAI_API_KEY")),
    llmrouter.WithProvider("anthropic", anthropic.NewFromEnv()),
    llmrouter.WithMiddleware(
        middleware.Retry(3, time.Second),
        middleware.Timeout(60*time.Second),
    ),
)

resp, err := router.Complete(ctx, &llmrouter.Request{
    Model:    "gpt-4o-mini",
    Messages: []llmrouter.Message{{Role: llmrouter.RoleUser, Content: "Hello!"}},
})

Providers

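Three native provider packages are included:

  • providers/openai: OpenAI and OpenAI-compatible APIs
  • providers/anthropic: Anthropic Claude
  • providers/gemini: Google Gemini
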
The openai package also covers any OpenAI-compatible API via built-in presets:

openai.NewFromEnv("deepseek", "DEEPSEEK_API_KEY")   // DeepSeek
openai.NewFromEnv("groq",     "GROQ_API_KEY")       // Groq
openai.NewFromEnv("together", "TOGETHER_API_KEY")   // Together AI
openai.NewFromEnv("ollama",   "")                   // Ollama (local)
openai.NewFromEnv("sarvam",   "SARVAM_API_KEY")     // Sarvam

Streaming

Use Router.Stream to receive tokens as they arrive:

stream, err := router.Stream(ctx, &llmrouter.Request{
    Model:    "claude-sonnet-4-20250514",
    Messages: []llmrouter.Message{{Role: llmrouter.RoleUser, Content: "Write a haiku."}},
})
if err != nil {
    log.Fatal(err)
}
defer stream.Close()
for stream.Next() {
    event := stream.Event()
    switch event.Type {
    case llmrouter.EventContentDelta:
        fmt.Print(event.Content)
    case llmrouter.EventDone:
        fmt.Println()
    }
}
if err := stream.Err(); err != nil {
    log.Fatal(err)
}

Fallback routing

Register multiple providers and declare a fallback order. On primary failure the router tries each fallback in sequence, returning the first success:

router := llmrouter.New(
    llmrouter.WithProvider("openai",    openai.NewFromEnv("openai", "OPENAI_API_KEY")),
    llmrouter.WithProvider("anthropic", anthropic.NewFromEnv()),
    llmrouter.WithModelMapping("gpt-4o", "openai"),
    llmrouter.WithFallback("anthropic"), // tried if openai fails
)

Prompt caching

Mark static blocks for provider-level caching. Anthropic uses explicit cache_control annotations; OpenAI and Gemini cache automatically. Observe savings via Usage.CachedPromptTokens and Usage.CacheCreationTokens:

req := &llmrouter.Request{
    Model: "claude-sonnet-4-20250514",
    Messages: []llmrouter.Message{
        {
            Role:         llmrouter.RoleSystem,
            Content:      longSystemPrompt, // paid once, reused on every call
            CacheControl: &llmrouter.CacheControl{Type: "ephemeral"},
        },
        {Role: llmrouter.RoleUser, Content: userQuery},
    },
}
resp, err := router.Complete(ctx, req)
if err != nil {
    log.Fatal(err)
}
fmt.Printf("cached=%d creation=%d\n",
    resp.Usage.CachedPromptTokens, resp.Usage.CacheCreationTokens)

Tool calling

Pass tool definitions in the request; the model returns tool calls which your code executes and returns as RoleTool messages:

req := &llmrouter.Request{
    Model: "gpt-4o-mini",
    Messages: []llmrouter.Message{
        {Role: llmrouter.RoleUser, Content: "What's the weather in Tokyo?"},
    },
    Tools: []llmrouter.Tool{weatherTool},
}
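resp, err := router.Complete(ctx, req)
if err != nil {
    log.Fatal(err)
}
if resp.Choices[0].FinishReason == "tool_calls" {
    assistant := resp.Choices[0].Message
    tc := assistant.ToolCalls[0]
    result := callWeatherAPI(tc.Function.Arguments) // assumed to return a string
    // Send the result back in a follow-up request as a RoleTool message.
    req.Messages = append(req.Messages, *assistant,
        llmrouter.Message{Role: llmrouter.RoleTool, ToolCallID: tc.ID, Content: result})
}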
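
FuncCall.Arguments is a JSON-encoded string, so a helper such as the callWeatherAPI placeholder has to decode it before use. A minimal sketch, assuming the get_weather schema with a single required location field; weatherArgs and parseWeatherArgs are hypothetical names:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// weatherArgs matches a parameter schema with one required string field.
type weatherArgs struct {
	Location string `json:"location"`
}

// parseWeatherArgs decodes the JSON arguments string of a tool call
// and checks that the required field is present.
func parseWeatherArgs(arguments string) (weatherArgs, error) {
	var args weatherArgs
	if err := json.Unmarshal([]byte(arguments), &args); err != nil {
		return weatherArgs{}, fmt.Errorf("decode tool arguments: %w", err)
	}
	if args.Location == "" {
		return weatherArgs{}, errors.New("missing required field: location")
	}
	return args, nil
}

func main() {
	args, err := parseWeatherArgs(`{"location": "Tokyo"}`)
	fmt.Println(args.Location, err) // Tokyo <nil>
}
```

Validating the decoded arguments is worthwhile because the model generates them, and they are not guaranteed to satisfy the schema.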

Middleware

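Middleware is applied in declaration order; each wraps the next. The github.com/bluefunda/llmrouter/middleware package provides three built-ins:

  • Retry: exponential backoff with configurable max attempts
  • NewCircuitBreaker: three-state circuit breaker, attached via cb.Wrap
  • Timeout: deadline enforcement on Complete and Stream calls
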
Custom middleware is a MiddlewareFunc — a function that wraps a Provider:

func Logging(next llmrouter.Provider) llmrouter.Provider {
    return &loggingProvider{Provider: next}
}

router := llmrouter.New(
    llmrouter.WithMiddleware(Logging),
)

Model resolution

The router resolves a model name to a provider in this order:

  1. Explicit mapping via WithModelMapping
  2. Provider name match (model name equals a registered provider name)
  3. Provider model list scan via Provider.Models()

Error handling

Errors are classified for intelligent retry decisions. Use IsRetryable and IsRateLimited for programmatic checks, or match typed sentinels directly:

resp, err := router.Complete(ctx, req)
if errors.Is(err, llmrouter.ErrRateLimited) {
    // back off and retry later
}
if errors.Is(err, llmrouter.ErrCircuitOpen) {
    // provider is temporarily unavailable
}

Other sentinels: ErrUnknownModel, ErrUnknownProvider, ErrNoProviders, ErrInvalidRequest, ErrAuthFailed, ErrProviderError, ErrMaxRetriesExceeded.

Variables

var (
	ErrUnknownModel       = errors.New("unknown model")
	ErrUnknownProvider    = errors.New("unknown provider")
	ErrNoProviders        = errors.New("no providers registered")
	ErrRateLimited        = errors.New("rate limited")
	ErrInvalidRequest     = errors.New("invalid request")
	ErrAuthFailed         = errors.New("authentication failed")
	ErrProviderError      = errors.New("provider error")
	ErrCircuitOpen        = errors.New("circuit breaker is open")
	ErrMaxRetriesExceeded = errors.New("max retries exceeded")
)

Sentinel errors

var DefaultPrices = map[string]ModelPrice{

	"gpt-4.1":      {InputPerMillion: 2.00, OutputPerMillion: 8.00, CacheReadPerMillion: 0.50},
	"gpt-4.1-mini": {InputPerMillion: 0.40, OutputPerMillion: 1.60, CacheReadPerMillion: 0.10},
	"gpt-4.1-nano": {InputPerMillion: 0.10, OutputPerMillion: 0.40, CacheReadPerMillion: 0.025},
	"gpt-4o":       {InputPerMillion: 2.50, OutputPerMillion: 10.00, CacheReadPerMillion: 1.25},
	"gpt-4o-mini":  {InputPerMillion: 0.15, OutputPerMillion: 0.60, CacheReadPerMillion: 0.075},
	"o4-mini":      {InputPerMillion: 1.10, OutputPerMillion: 4.40, CacheReadPerMillion: 0.275},

	"claude-opus-4-20250514":     {InputPerMillion: 15.00, OutputPerMillion: 75.00, CacheReadPerMillion: 1.50},
	"claude-sonnet-4-20250514":   {InputPerMillion: 3.00, OutputPerMillion: 15.00, CacheReadPerMillion: 0.30},
	"claude-3-5-haiku-20241022":  {InputPerMillion: 0.80, OutputPerMillion: 4.00, CacheReadPerMillion: 0.08},
	"claude-3-5-sonnet-20241022": {InputPerMillion: 3.00, OutputPerMillion: 15.00, CacheReadPerMillion: 0.30},
	"claude-3-opus-20240229":     {InputPerMillion: 15.00, OutputPerMillion: 75.00, CacheReadPerMillion: 1.50},
	"claude-3-sonnet-20240229":   {InputPerMillion: 3.00, OutputPerMillion: 15.00, CacheReadPerMillion: 0.30},
	"claude-3-haiku-20240307":    {InputPerMillion: 0.25, OutputPerMillion: 1.25, CacheReadPerMillion: 0.03},

	"deepseek-chat":  {InputPerMillion: 0.07, OutputPerMillion: 1.10},
	"deepseek-coder": {InputPerMillion: 0.07, OutputPerMillion: 1.10},

	"gemini-2.5-pro":   {InputPerMillion: 1.25, OutputPerMillion: 10.00},
	"gemini-2.5-flash": {InputPerMillion: 0.15, OutputPerMillion: 0.60},
	"gemini-2.0-flash": {InputPerMillion: 0.10, OutputPerMillion: 0.40},
}

DefaultPrices is the built-in price table for known models. Cost is 0 for models not present in the map. Prices reflect standard API rates as of mid-2025; override with WithPriceTable if needed.

Functions

func CalculateCost added in v0.4.1

func CalculateCost(model string, usage *Usage, prices map[string]ModelPrice) float64

CalculateCost returns the estimated USD cost for a request given token usage and a price table. Cached tokens are billed at CacheReadPerMillion; uncached prompt tokens at InputPerMillion. Returns 0 if the model is not in the price table or usage is nil.
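
The billing rule described above can be written out as arithmetic. This sketch uses local modelPrice and usage types and a hypothetical cost function, and it assumes that cached tokens are counted inside PromptTokens, which the description does not state explicitly.

```go
package main

import "fmt"

// modelPrice and usage are local stand-ins for ModelPrice and Usage.
type modelPrice struct {
	InputPerMillion, OutputPerMillion, CacheReadPerMillion float64
}

type usage struct {
	PromptTokens, CompletionTokens, CachedPromptTokens int
}

// cost bills uncached prompt tokens at the input rate, cached tokens at the
// cache-read rate, and completion tokens at the output rate.
// Assumption: CachedPromptTokens is a subset of PromptTokens.
func cost(u usage, p modelPrice) float64 {
	uncached := u.PromptTokens - u.CachedPromptTokens
	return (float64(uncached)*p.InputPerMillion +
		float64(u.CachedPromptTokens)*p.CacheReadPerMillion +
		float64(u.CompletionTokens)*p.OutputPerMillion) / 1e6
}

func main() {
	// gpt-4o-mini rates as listed in DefaultPrices.
	p := modelPrice{InputPerMillion: 0.15, OutputPerMillion: 0.60, CacheReadPerMillion: 0.075}
	u := usage{PromptTokens: 1_000_000, CompletionTokens: 500_000, CachedPromptTokens: 400_000}
	// 600k uncached * 0.15 + 400k cached * 0.075 + 500k output * 0.60, per million.
	fmt.Printf("%.2f\n", cost(u, p)) // 0.42
}
```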

func IsRateLimited

func IsRateLimited(err error) bool

IsRateLimited returns true if the error indicates rate limiting

func IsRetryable

func IsRetryable(err error) bool

IsRetryable returns true if the error is retryable

Types

type APIError

type APIError struct {
	Provider   string
	StatusCode int
	Message    string
	Type       string
	Err        error
}

APIError represents an error from an LLM provider API

func (*APIError) Error

func (e *APIError) Error() string

func (*APIError) Unwrap

func (e *APIError) Unwrap() error

type CacheControl

type CacheControl struct {
	Type string `json:"type"` // "ephemeral"
}

CacheControl marks a content block for provider-level prompt caching. Only "ephemeral" is currently supported. OpenAI and Gemini cache automatically and ignore this field; set it only when targeting Anthropic.

type Choice

type Choice struct {
	Index        int      `json:"index"`
	Message      *Message `json:"message,omitempty"`
	Delta        *Delta   `json:"delta,omitempty"`
	FinishReason string   `json:"finish_reason,omitempty"`
}

Choice represents a completion choice

type ContentPart

type ContentPart struct {
	Type         string        `json:"type"` // "text", "image_url", or "document"
	Text         string        `json:"text,omitempty"`
	ImageURL     *ImageURL     `json:"image_url,omitempty"`
	Document     *Document     `json:"document,omitempty"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

ContentPart represents a part of a multimodal message

type Delta

type Delta struct {
	Role      Role       `json:"role,omitempty"`
	Content   string     `json:"content,omitempty"`
	ToolCalls []ToolCall `json:"tool_calls,omitempty"`
}

Delta represents streaming content delta

type Document

type Document struct {
	Base64    string `json:"base64"`
	MediaType string `json:"media_type"` // e.g. "application/pdf"
}

Document represents a document (PDF, etc.) for providers that support it natively

type Event

type Event struct {
	Type     EventType
	Content  string
	Delta    *Delta
	Response *Response
	Error    error
}

Event represents a streaming event

type EventType

type EventType int

EventType represents the type of streaming event

const (
	EventContentDelta  EventType = iota // Text content chunk
	EventToolCallDelta                  // Tool call chunk
	EventDone                           // Stream completed
	EventError                          // Error occurred
)

type FuncCall

type FuncCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

FuncCall represents a function call

type FuncRef

type FuncRef struct {
	Name string `json:"name"`
}

FuncRef references a specific function

type Function

type Function struct {
	Name        string          `json:"name"`
	Description string          `json:"description,omitempty"`
	Parameters  json.RawMessage `json:"parameters,omitempty"`
}

Function represents a function definition

type ImageURL

type ImageURL struct {
	URL       string `json:"url"`
	Detail    string `json:"detail,omitempty"`
	Base64    string `json:"base64,omitempty"`
	MediaType string `json:"media_type,omitempty"`
}

ImageURL represents an image reference with both URL and base64 forms

type Message

type Message struct {
	Role         Role          `json:"role"`
	Content      string        `json:"content"`
	ContentParts []ContentPart `json:"content_parts,omitempty"`
	Name         string        `json:"name,omitempty"`
	ToolCalls    []ToolCall    `json:"tool_calls,omitempty"`
	ToolCallID   string        `json:"tool_call_id,omitempty"`
	// CacheControl marks this message's content for prompt caching (Anthropic only).
	// For user messages with ContentParts, set CacheControl on individual parts instead.
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

Message represents a chat message

type MiddlewareFunc added in v0.4.0

type MiddlewareFunc func(Provider) Provider

MiddlewareFunc wraps a Provider with additional functionality. It is a plain function type; any func(Provider) Provider satisfies it directly.

type ModelPrice added in v0.4.1

type ModelPrice struct {
	InputPerMillion     float64 // USD per million input (prompt) tokens
	OutputPerMillion    float64 // USD per million output (completion) tokens
	CacheReadPerMillion float64 // USD per million cache-read tokens; 0 if not applicable
}

ModelPrice holds the per-token USD pricing for a model.

type Option

type Option func(*Router)

Option configures the Router

func WithFallback

func WithFallback(providers ...string) Option

WithFallback sets fallback providers in priority order

func WithMiddleware

func WithMiddleware(m ...MiddlewareFunc) Option

WithMiddleware adds middleware to the processing chain. Use this with middleware from the middleware package:

import "github.com/bluefunda/llmrouter/middleware"

router := llmrouter.New(
    llmrouter.WithMiddleware(
        middleware.Retry(3, time.Second),
        middleware.Timeout(60*time.Second),
    ),
)

func WithModelMapping

func WithModelMapping(model, provider string) Option

WithModelMapping maps a model to a specific provider

func WithPriceTable added in v0.4.1

func WithPriceTable(prices map[string]ModelPrice) Option

WithPriceTable replaces the default price table used for cost calculation. Callers can start from DefaultPrices and extend it, or supply a fully custom map.

func WithProvider

func WithProvider(name string, p Provider) Option

WithProvider registers a provider with the router

type Provider

type Provider interface {
	// Name returns the provider identifier (e.g., "openai", "anthropic")
	Name() string

	// Models returns the list of supported model IDs
	Models() []string

	// Complete performs a non-streaming completion
	Complete(ctx context.Context, req *Request) (*Response, error)

	// Stream performs a streaming completion
	Stream(ctx context.Context, req *Request) (*StreamResult, error)
}

Provider is the core interface that all LLM providers must implement.

type ProviderConfig

type ProviderConfig struct {
	Name          string
	APIKey        string
	BaseURL       string
	Model         string
	Models        []string
	Timeout       time.Duration
	CustomHeaders map[string]string // custom HTTP headers (e.g. api-subscription-key)
	// StringContentOnly forces message content to be sent as plain strings
	// instead of structured arrays. Required for some OpenAI-compatible APIs
	// (e.g. Sarvam) that don't support the array content format.
	StringContentOnly bool
}

ProviderConfig holds common configuration for providers

type Request

type Request struct {
	Messages    []Message      `json:"messages"`
	Model       string         `json:"model,omitempty"`
	Tools       []Tool         `json:"tools,omitempty"`
	ToolChoice  *ToolChoice    `json:"tool_choice,omitempty"`
	Temperature *float64       `json:"temperature,omitempty"`
	MaxTokens   *int           `json:"max_tokens,omitempty"`
	TopP        *float64       `json:"top_p,omitempty"`
	Stop        []string       `json:"stop,omitempty"`
	Metadata    map[string]any `json:"metadata,omitempty"`
}

Request represents a unified LLM request

type Response

type Response struct {
	ID       string   `json:"id"`
	Object   string   `json:"object"`
	Created  int64    `json:"created"`
	Model    string   `json:"model"`
	Choices  []Choice `json:"choices"`
	Usage    *Usage   `json:"usage,omitempty"`
	Provider string   `json:"provider"`
}

Response represents a unified LLM response (OpenAI-compatible)

type Role

type Role string

Role represents the message role

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type Router

type Router struct {
	// contains filtered or unexported fields
}

Router manages multiple LLM providers and routes requests

func New

func New(opts ...Option) *Router

New creates a new Router with the given options

func (*Router) AddMiddleware

func (r *Router) AddMiddleware(m MiddlewareFunc)

AddMiddleware adds middleware to the router

func (*Router) Close added in v0.4.0

func (r *Router) Close() error

Close releases resources held by registered providers that implement io.Closer. Call this when the router is no longer needed (e.g. on application shutdown).

func (*Router) Complete

func (r *Router) Complete(ctx context.Context, req *Request) (*Response, error)

Complete performs a non-streaming completion

func (*Router) GetProvider

func (r *Router) GetProvider(name string) (Provider, bool)

GetProvider returns a provider by name

func (*Router) MapModel

func (r *Router) MapModel(model, provider string)

MapModel maps a model name to a specific provider

func (*Router) Providers

func (r *Router) Providers() []string

Providers returns list of registered provider names in insertion order

func (*Router) RegisterProvider

func (r *Router) RegisterProvider(name string, p Provider)

RegisterProvider adds a provider to the router

func (*Router) SetFallbacks

func (r *Router) SetFallbacks(providers ...string)

SetFallbacks sets the fallback provider order

func (*Router) Stream

func (r *Router) Stream(ctx context.Context, req *Request) (*StreamResult, error)

Stream sends a request to the appropriate provider and streams the response.

type StreamResult added in v0.3.1

type StreamResult struct {
	// contains filtered or unexported fields
}

StreamResult is the iterator type for streaming LLM responses. Usage:

stream, err := router.Stream(ctx, req)
if err != nil { return err }
defer stream.Close()
for stream.Next() {
    event := stream.Event()
    // handle event
}
if err := stream.Err(); err != nil { ... }

func NewStreamResult added in v0.3.1

func NewStreamResult(ch <-chan Event) *StreamResult

NewStreamResult creates a StreamResult from an event channel. Providers and middleware use this to construct a StreamResult.

func (*StreamResult) Close added in v0.3.1

func (s *StreamResult) Close() error

Close stops the stream and releases resources. Safe to call multiple times.

func (*StreamResult) Err added in v0.3.1

func (s *StreamResult) Err() error

Err returns the streaming error, if any. Check after Next returns false.

func (*StreamResult) Event added in v0.3.1

func (s *StreamResult) Event() Event

Event returns the current event (valid after Next returns true).

func (*StreamResult) Next added in v0.3.1

func (s *StreamResult) Next() bool

Next advances to the next event. Returns false when the stream ends or an error occurs.

func (*StreamResult) OnClose added in v0.3.1

func (s *StreamResult) OnClose(fn func() error)

OnClose registers a function called when Close is invoked (e.g. context cancel).

type Tool

type Tool struct {
	Type     string   `json:"type"`
	Function Function `json:"function"`
}

Tool represents a function/tool definition

type ToolCall

type ToolCall struct {
	ID       string   `json:"id"`
	Type     string   `json:"type"`
	Function FuncCall `json:"function"`
	Index    *int     `json:"index,omitempty"`
}

ToolCall represents a tool invocation

type ToolChoice

type ToolChoice struct {
	Type     string   `json:"type,omitempty"`
	Function *FuncRef `json:"function,omitempty"`
}

ToolChoice controls tool selection

type Usage

type Usage struct {
	PromptTokens        int     `json:"prompt_tokens"`
	CompletionTokens    int     `json:"completion_tokens"`
	TotalTokens         int     `json:"total_tokens"`
	CachedPromptTokens  int     `json:"cached_prompt_tokens,omitempty"`  // tokens served from cache (all providers)
	CacheCreationTokens int     `json:"cache_creation_tokens,omitempty"` // tokens written to cache (Anthropic only)
	Cost                float64 `json:"cost_usd,omitempty"`              // estimated USD cost; 0 if model not in price table
}

Usage represents token usage

func (*Usage) CacheHitRate added in v0.4.1

func (u *Usage) CacheHitRate() float64

CacheHitRate returns the fraction of prompt tokens served from cache (0–1). Returns 0 if no prompt tokens were recorded.
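
As a worked example of the ratio, the sketch below computes the hit rate from raw token counts. The cacheHitRate function is a hypothetical stand-in that follows the description above, including the zero-prompt guard.

```go
package main

import "fmt"

// cacheHitRate returns cached/prompt, or 0 when no prompt tokens were recorded.
func cacheHitRate(promptTokens, cachedTokens int) float64 {
	if promptTokens == 0 {
		return 0
	}
	return float64(cachedTokens) / float64(promptTokens)
}

func main() {
	fmt.Println(cacheHitRate(2000, 1500)) // 0.75
	fmt.Println(cacheHitRate(0, 0))       // 0
}
```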

Directories

| Path | Synopsis |
|---|---|
| examples/fallback | command |
| examples/simple | command |
| examples/streaming | command |
| examples/tools | command |
| middleware | Package middleware provides composable cross-cutting concerns for LLM provider calls: retry with exponential backoff, per-request timeouts, and a circuit breaker to prevent cascading failures. |
| providers/anthropic | Package anthropic implements the llmrouter.Provider interface for Anthropic Claude models using the official Anthropic Go SDK. |
| providers/gemini | Package gemini implements the llmrouter.Provider interface for Google Gemini models using the official Google Generative AI Go SDK. |
| providers/openai | Package openai implements the llmrouter.Provider interface for OpenAI and any OpenAI-compatible API. |
