llm

package module
v0.25.0
Published: Mar 23, 2026 | License: MIT | Imports: 18 | Imported by: 0

README

LLM Provider Abstraction Library

A unified Go library for interacting with multiple LLM providers through a consistent interface. Supports streaming responses, tool calling, reasoning, prompt caching, and zero-config multi-provider setup.

Features

  • Unified Provider Interface — Single API for multiple LLM providers
  • Streaming Support — Channel-based streaming with structured delta events
  • Tool Calling — Consistent tool/function calling across providers
  • Typed Tool Dispatch — StreamResponse handles tool calls with strongly-typed handlers
  • Reasoning Support — Extended thinking / reasoning tokens (Anthropic, OpenAI o-series, Bedrock)
  • Prompt Caching — Transparent cache control for Anthropic, Bedrock, and OpenAI
  • Context Cancellation — Proper cancellation support for long-running streams
  • Zero-config Setup — provider/auto auto-detects providers from environment variables
  • llmtest Package — Test helpers for stream consumers (net/http/httptest style)

Supported Providers

Provider       Name        Description
Anthropic API  anthropic   Direct Anthropic API with API key
Claude OAuth   claude      OAuth-based Claude access (auto-detects local credentials)
OpenAI         openai      OpenAI GPT models (GPT-4o, GPT-5, o-series, Codex)
AWS Bedrock    bedrock     AWS Bedrock models (Claude, Llama, etc.)
Ollama         ollama      Local Ollama models
OpenRouter     openrouter  200+ tool-enabled models via OpenRouter proxy
Router         router      Combines multiple providers with failover and aliases

Installation

go get github.com/codewandler/llm

Quick Start

package main

import (
    "context"
    "fmt"

    "github.com/codewandler/llm"
    "github.com/codewandler/llm/provider/auto"
)

func main() {
    ctx := context.Background()

    // Auto-detects providers from environment variables
    p, err := auto.New(ctx)
    if err != nil {
        panic(err)
    }

    events, err := p.CreateStream(ctx, llm.StreamRequest{
        Model: "anthropic/claude-sonnet-4-5",
        Messages: llm.Messages{
            &llm.UserMsg{Content: "What is the capital of France?"},
        },
    })
    if err != nil {
        panic(err)
    }

    for event := range events {
        switch event.Type {
        case llm.StreamEventDelta:
            fmt.Print(event.Text())
        case llm.StreamEventDone:
            fmt.Println()
            if event.Usage != nil {
                fmt.Printf("Tokens: %d in, %d out\n",
                    event.Usage.InputTokens, event.Usage.OutputTokens)
            }
        case llm.StreamEventError:
            fmt.Printf("Error: %v\n", event.Error)
        }
    }
}

Provider Setup

provider/auto — Zero-Config Multi-Provider

auto.New(ctx, ...Option) auto-detects providers from environment variables and returns a ready-to-use llm.Provider:

import "github.com/codewandler/llm/provider/auto"

// Auto-detect everything from environment variables
p, err := auto.New(ctx)

// Or explicitly opt in to specific providers
p, err := auto.New(ctx,
    auto.WithAnthropic(),     // ANTHROPIC_API_KEY
    auto.WithOpenAI(),        // OPENAI_KEY or OPENAI_API_KEY
    auto.WithBedrock(),       // AWS credentials
    auto.WithOpenRouter(),    // OPENROUTER_API_KEY
    auto.WithClaudeLocal(),   // ~/.claude/.credentials.json
)

// Add a Claude OAuth account from a token store
p, err := auto.New(ctx, auto.WithClaude(myTokenStore))

// Custom global aliases with failover
p, err := auto.New(ctx,
    auto.WithOpenAI(),
    auto.WithOpenRouter(),
    auto.WithGlobalAlias("o3", "openai/o3", "openrouter/openai/o3"),
)

Direct Provider Usage

Each provider can also be used directly without auto:

import "github.com/codewandler/llm/provider/anthropic"

p := anthropic.New(llm.APIKeyFromEnv("ANTHROPIC_API_KEY"))

events, err := p.CreateStream(ctx, llm.StreamRequest{
    Model:    "claude-sonnet-4-5",
    Messages: llm.Messages{&llm.UserMsg{Content: "Hello!"}},
})

import "github.com/codewandler/llm/provider/openai"

p := openai.New(llm.APIKeyFromEnv("OPENAI_KEY"))

import "github.com/codewandler/llm/provider/bedrock"

p := bedrock.New() // uses default AWS credential chain
// or, with an explicit region:
p := bedrock.New(bedrock.WithRegion("us-east-1"))

import "github.com/codewandler/llm/provider/ollama"

p := ollama.New("http://localhost:11434")

import "github.com/codewandler/llm/provider/openrouter"

p := openrouter.New(llm.APIKeyFromEnv("OPENROUTER_API_KEY"))

Claude OAuth Provider

import "github.com/codewandler/llm/provider/anthropic/claude"

// Auto-detect local Claude credentials (default)
p := claude.New()

// Or with explicit token provider
p := claude.New(
    claude.WithManagedTokenProvider("my-key", tokenStore, nil),
)

Token management interfaces:

  • TokenStore — stores and retrieves tokens (implement for your storage backend)
  • LocalTokenStore — reads from ~/.claude/.credentials.json
  • ManagedTokenProvider — wraps a TokenStore with automatic refresh

Router Provider

For custom multi-provider routing with failover:

import "github.com/codewandler/llm/provider/router"

p, err := router.New(cfg, factories)

Stream Events

type StreamEvent struct {
    // Metadata stamped on every event
    RequestID string
    Seq       uint64
    Timestamp time.Time

    Type     StreamEventType
    Start    *StreamStart  // StreamEventStart
    Delta    *Delta        // StreamEventDelta
    ToolCall *ToolCall     // StreamEventToolCall
    Routed   *Routed       // StreamEventRouted
    Usage    *Usage        // StreamEventDone
    Error    *ProviderError // StreamEventError
}

const (
    StreamEventCreated  // emitted immediately when stream is opened
    StreamEventStart    // first content event; carries request metadata
    StreamEventDelta    // text, reasoning, or partial tool-call chunk
    StreamEventToolCall // completed tool call
    StreamEventRouted   // router selected a backend
    StreamEventDone     // stream complete; carries usage
    StreamEventError    // error occurred
)

Reading Deltas

for event := range stream {
    switch event.Type {
    case llm.StreamEventDelta:
        fmt.Print(event.Text())          // text tokens
        fmt.Print(event.ReasoningText()) // reasoning tokens (thinking models)
    }
}

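event.Delta is a *Delta struct. The listing below is abridged; the full definition, including the tool-call fields, appears under Types:

type Delta struct {
    Type      DeltaType // DeltaTypeText | DeltaTypeReasoning | DeltaTypeTool
    Text      string
    Reasoning string
    Index     *uint32   // block index, provider-dependent
}

StreamStart Metadata

case llm.StreamEventStart: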
    fmt.Printf("Request ID: %s\n", event.Start.RequestID)
    fmt.Printf("Model: %s\n", event.Start.Model)
    fmt.Printf("TTFT: %s\n", event.Start.TimeToFirstToken)

Error Handling

events, err := p.CreateStream(ctx, req)
if err != nil {
    // Initial request failed (auth, bad params, etc.)
}

for event := range events {
    if event.Type == llm.StreamEventError {
        if errors.Is(event.Error, llm.ErrAPIError) {
            fmt.Printf("API error %d: %s\n", event.Error.StatusCode, event.Error.Body)
        }
        if errors.Is(event.Error, llm.ErrContextCancelled) {
            fmt.Println("cancelled")
        }
    }
}

Error sentinels: ErrContextCancelled, ErrRequestFailed, ErrAPIError, ErrStreamRead, ErrStreamDecode, ErrProviderError, ErrMissingAPIKey, ErrBuildRequest, ErrUnknownModel, ErrNoProviders, ErrUnknown.

Usage Information

case llm.StreamEventDone:
    if event.Usage != nil {
        fmt.Printf("in=%d out=%d cost=$%.6f\n",
            event.Usage.InputTokens, event.Usage.OutputTokens, event.Usage.Cost)
        fmt.Printf("cache read=%d write=%d\n",
            event.Usage.CacheReadTokens, event.Usage.CacheWriteTokens)
        fmt.Printf("reasoning=%d\n", event.Usage.ReasoningTokens)
    }

Tool Calling

Typed Tool Dispatch with StreamResponse

This is the recommended approach for agentic tool use: Process accumulates the stream and dispatches tool calls to typed handlers.

type GetWeatherParams struct {
    Location string `json:"location" jsonschema:"description=City name,required"`
    Unit     string `json:"unit"     jsonschema:"description=Unit,enum=celsius,enum=fahrenheit"`
}

type GetWeatherResult struct {
    Temp int    `json:"temp"`
    Desc string `json:"desc"`
}

spec := llm.NewToolSpec[GetWeatherParams]("get_weather", "Get current weather")

result := <-llm.Process(ctx, stream).
    HandleTool(llm.Handle(spec, func(ctx context.Context, p GetWeatherParams) (*GetWeatherResult, error) {
        return &GetWeatherResult{Temp: 22, Desc: "sunny"}, nil
    })).
    Result()

fmt.Println(result.Text)
fmt.Println(result.ToolResults) // []ToolCallResult ready to append to messages

For concurrent tool execution:

result := <-llm.Process(ctx, stream).
    DispatchAsync().
    HandleTool(...).
    Result()

Callback hooks:

result := <-llm.Process(ctx, stream).
    OnText(func(s string) { fmt.Print(s) }).
    OnReasoning(func(s string) { /* handle thinking */ }).
    OnStart(func(s *llm.StreamStart) { log.Println(s.RequestID) }).
    HandleTool(...).
    Result()

StreamResult fields: Text, Reasoning, ToolCalls, ToolResults, Usage, StopReason, Start.

Low-Level Tool Definitions

For manual tool management:

tools := []llm.ToolDefinition{
    llm.ToolDefinitionFor[GetWeatherParams]("get_weather", "Get current weather"),
}

events, err := p.CreateStream(ctx, llm.StreamRequest{
    Model:    "anthropic/claude-sonnet-4-5",
    Messages: messages,
    Tools:    tools,
})

for event := range events {
    if event.Type == llm.StreamEventToolCall {
        tc := event.ToolCall
        // tc.ID, tc.Name, tc.Arguments (map[string]any)
    }
}

ToolSet — Type-Safe Parse + Dispatch

toolset := llm.NewToolSet(
    llm.NewToolSpec[GetWeatherParams]("get_weather", "Get weather"),
)

stream, _ := p.CreateStream(ctx, llm.StreamRequest{
    Tools: toolset.Definitions(),
    ...
})

// collect raw tool calls from stream, then:
calls, err := toolset.Parse(rawToolCalls)
for _, call := range calls {
    switch c := call.(type) {
    case *llm.TypedToolCall[GetWeatherParams]:
        fmt.Println(c.Params.Location) // strongly typed
    }
}

Tool Choice

// Model decides (default)
llm.StreamRequest{ToolChoice: llm.ToolChoiceAuto{}}

// Must call at least one tool
llm.StreamRequest{ToolChoice: llm.ToolChoiceRequired{}}

// Cannot call any tools
llm.StreamRequest{ToolChoice: llm.ToolChoiceNone{}}

// Must call a specific tool
llm.StreamRequest{ToolChoice: llm.ToolChoiceTool{Name: "get_weather"}}

Type                      OpenAI                   Anthropic                   Ollama
ToolChoiceAuto{}          "auto"                   {"type":"auto"}             ignored
ToolChoiceRequired{}      "required"               {"type":"any"}              ignored
ToolChoiceNone{}          "none"                   omitted                     ignored
ToolChoiceTool{Name:"X"}  {"type":"function",...}  {"type":"tool","name":"X"}  ignored

Struct Tag Reference

type Params struct {
    Location string  `json:"location" jsonschema:"description=City name,required"`
    Unit     string  `json:"unit"     jsonschema:"description=Unit,enum=celsius,enum=fahrenheit"`
    Limit    int     `json:"limit"    jsonschema:"minimum=1,maximum=100"`
    Pattern  string  `json:"pattern"  jsonschema:"pattern=^[a-z]+$"`
}

Messages

var msgs llm.Messages
msgs.AddSystemMsg("You are helpful.")
msgs.AddUserMsg("Hello")
msgs.AddAssistantMsg("Hi there")
msgs.AddToolCallResult(callID, output, false /* isError */)
msgs.Append(msg)

Or construct inline:

msgs := llm.Messages{
    &llm.SystemMsg{Content: "You are helpful."},
    &llm.UserMsg{Content: "Hello"},
    &llm.AssistantMsg{ToolCalls: []llm.ToolCall{tc}},
    &llm.ToolCallResult{ToolCallID: tc.ID, Output: result},
}

Reasoning Effort (OpenAI)

stream, _ := p.CreateStream(ctx, llm.StreamRequest{
    Model:           "openai/o3",
    Messages:        messages,
    ReasoningEffort: llm.ReasoningEffortHigh,
})

// Reasoning tokens arrive as StreamEventDelta with DeltaTypeReasoning
for event := range stream {
    if event.Type == llm.StreamEventDelta {
        fmt.Print(event.Text())          // response text
        fmt.Print(event.ReasoningText()) // thinking tokens
    }
}

Constant               Value     Notes
ReasoningEffortNone    "none"    GPT-5.1+ only
ReasoningEffortLow     "low"
ReasoningEffortMedium  "medium"  Default for pre-5.1
ReasoningEffortHigh    "high"
ReasoningEffortXHigh   "xhigh"   Codex-max+ only

Prompt Caching

// Top-level hint: cache the entire conversation prefix
events, err := p.CreateStream(ctx, llm.StreamRequest{
    Model:     "anthropic/claude-sonnet-4-5",
    Messages:  messages,
    CacheHint: &llm.CacheHint{Enabled: true},
})

// Per-message breakpoints (advanced)
msgs := llm.Messages{
    &llm.SystemMsg{
        Content:   largeSystemPrompt,
        CacheHint: &llm.CacheHint{Enabled: true},
    },
    &llm.UserMsg{Content: "Hello"},
}

Provider             Mode                  TTL options
Anthropic            Explicit breakpoints  5 min (default), 1 h (selected models)
Bedrock (Claude)     Explicit breakpoints  5 min (default), 1 h (selected models)
OpenAI               Fully automatic       in-memory (default), 1 h via CacheHint{TTL:"1h"}
Ollama / OpenRouter  Not supported

Cache usage is reported in event.Usage.CacheReadTokens / CacheWriteTokens.

Model Reference Format

anthropic/claude-sonnet-4-5           # Direct Anthropic API
claude/claude-sonnet-4-5              # Claude OAuth provider
openai/gpt-4o                         # OpenAI
bedrock/anthropic.claude-3-5-sonnet   # AWS Bedrock
ollama/llama3.2:1b                    # Local Ollama
openrouter/anthropic/claude-sonnet-4.5  # OpenRouter proxy
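
The references above all share a "provider/model" shape, where the model part may itself contain slashes (as in the OpenRouter entry). The following is a minimal sketch of how such a reference could be split. splitModelRef is a hypothetical helper, not part of the library, and treating a slash-free string as a bare model ID or alias is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelRef separates a "provider/model" reference at the first slash.
// Assumption: a reference without a slash is a bare model ID or alias,
// so the provider part is returned empty.
func splitModelRef(ref string) (provider, model string) {
	if p, m, ok := strings.Cut(ref, "/"); ok {
		return p, m
	}
	return "", ref
}

func main() {
	for _, ref := range []string{
		"anthropic/claude-sonnet-4-5",
		"openrouter/anthropic/claude-sonnet-4.5",
		"fast",
	} {
		p, m := splitModelRef(ref)
		fmt.Printf("provider=%q model=%q\n", p, m)
	}
}
```

Splitting only at the first slash keeps the nested OpenRouter model ID intact.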

Global aliases (configured via auto.WithGlobalAlias):

  • fast — fastest/cheapest model
  • default — balanced performance
  • powerful — most capable model
  • codex — OpenAI Codex model

Testing with llmtest

import "github.com/codewandler/llm/llmtest"

ch := llmtest.SendEvents(
    llmtest.TextEvent("hello"),
    llmtest.ToolEvent("call_1", "get_weather", map[string]any{"location": "Berlin"}),
    llmtest.DoneEvent(nil),
)

result := <-llm.Process(ctx, ch).HandleTool(...).Result()

Functions: SendEvents, TextEvent, ReasoningEvent, ToolEvent, DoneEvent, ErrorEvent.

Context Cancellation

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

events, err := p.CreateStream(ctx, llm.StreamRequest{...})
if err != nil {
    panic(err) // the request could not be started
}
for event := range events {
    if event.Type == llm.StreamEventError {
        if errors.Is(event.Error, llm.ErrContextCancelled) {
            fmt.Println("timed out")
        }
    }
}

Environment Variables

export ANTHROPIC_API_KEY="your-api-key"
export OPENAI_KEY="your-api-key"
export OPENROUTER_API_KEY="your-api-key"
export OLLAMA_BASE_URL="http://localhost:11434"  # optional, default shown
export AWS_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"

Architecture

llm/
├── llm.go              # Provider interface, Streamer interface
├── stream.go           # StreamEvent, StreamRequest, Delta, EventStream, Usage
├── stream_response.go  # StreamResponse, Process(), StreamResult
├── message.go          # Message types: UserMsg, AssistantMsg, ToolCallResult, etc.
├── tool.go             # ToolDefinition, ToolSpec, ToolSet, TypedToolCall
├── errors.go           # ProviderError, error sentinels
├── model.go            # Model type
├── option.go           # Functional options (WithAPIKey, WithHTTPClient, etc.)
├── reasoning.go        # ReasoningEffort constants
├── llmtest/            # Test helpers (SendEvents, TextEvent, etc.)
│
└── provider/
    ├── anthropic/      # Direct Anthropic API
    │   └── claude/     # OAuth-based Claude provider
    ├── bedrock/        # AWS Bedrock
    ├── openai/         # OpenAI API (Chat + Responses API)
    ├── openrouter/     # OpenRouter proxy
    ├── ollama/         # Local Ollama
    ├── auto/           # Zero-config multi-provider setup
    ├── router/         # Multi-provider routing with failover
    └── fake/           # Test provider

CLI Tool

go run ./cmd/llmcli auth status          # Check Claude OAuth credentials
go run ./cmd/llmcli infer "Hello"        # Quick inference test
go run ./cmd/llmcli infer -v -m default "Explain Go channels"  # Verbose

Contributing

go test ./...         # run all tests
go test -race ./...   # race detector
go fmt ./...          # format
go vet ./...          # vet

See AGENTS.md for architecture and coding conventions.

License

MIT — see LICENSE.

Documentation

Constants

const (
	ProviderNameAnthropic  = "anthropic"
	ProviderNameClaude     = "claude"
	ProviderNameBedrock    = "bedrock"
	ProviderNameOllama     = "ollama"
	ProviderNameOpenAI     = "openai"
	ProviderNameOpenRouter = "openrouter"
	ProviderNameRouter     = "router"
)

Provider name constants used in ProviderError.Provider.

Variables

var (
	// ErrContextCancelled is returned when the caller's context is cancelled
	// while a stream is in progress.
	ErrContextCancelled = errors.New("context cancelled")

	// ErrRequestFailed is returned when the HTTP transport fails before a
	// response is received (e.g. network error, DNS failure).
	ErrRequestFailed = errors.New("request failed")

	// ErrAPIError is returned when the provider API responds with a non-2xx
	// HTTP status. The ProviderError carries StatusCode and Body.
	ErrAPIError = errors.New("API error")

	// ErrStreamRead is returned when reading or scanning the response stream
	// fails at the I/O level (e.g. scanner error, connection reset).
	ErrStreamRead = errors.New("stream read error")

	// ErrStreamDecode is returned when a stream chunk cannot be decoded
	// (e.g. malformed JSON in an SSE data line).
	ErrStreamDecode = errors.New("stream decode error")

	// ErrProviderError is returned when the provider sends an explicit
	// error inside the stream (e.g. Anthropic error event, OpenRouter
	// chunk-level error).
	ErrProviderError = errors.New("provider error")

	// ErrMissingAPIKey is returned when a provider requires an API key
	// but none has been configured.
	ErrMissingAPIKey = errors.New("missing API key")

	// ErrBuildRequest is returned when serialising the outgoing request
	// fails before it is sent.
	ErrBuildRequest = errors.New("build request error")

	// ErrUnknownModel is returned when a model ID or alias cannot be resolved.
	ErrUnknownModel = errors.New("unknown model")

	// ErrNoProviders is returned when no providers are configured or all
	// failover targets have been exhausted.
	ErrNoProviders = errors.New("no providers configured")

	// ErrUnknown is used to wrap any error that is not already a ProviderError.
	// Callers can test for it with errors.Is(err, llm.ErrUnknown).
	ErrUnknown = errors.New("unknown error")
)

Sentinel errors for use with errors.Is. Each ProviderError wraps one of these so callers can inspect the error kind without string matching.
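
To illustrate the wrapping described above, here is a self-contained sketch. The ProviderError type in it is a simplified stand-in: the Sentinel field name and the Unwrap method are assumptions about how the wrapping might be implemented, and only two sentinels are reproduced.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-in sentinels mirroring two of the names listed above.
var (
	ErrAPIError         = errors.New("API error")
	ErrContextCancelled = errors.New("context cancelled")
)

// ProviderError is a hypothetical, simplified stand-in for llm.ProviderError.
// The Sentinel field name is an assumption of this sketch.
type ProviderError struct {
	Sentinel   error
	Provider   string
	StatusCode int
	Body       string
}

func (e *ProviderError) Error() string {
	return fmt.Sprintf("%s: %v (status %d)", e.Provider, e.Sentinel, e.StatusCode)
}

// Unwrap exposes the sentinel so errors.Is can match it.
func (e *ProviderError) Unwrap() error { return e.Sentinel }

// classify maps an error to a short label using errors.Is only,
// without inspecting the error string.
func classify(err error) string {
	switch {
	case errors.Is(err, ErrAPIError):
		return "api"
	case errors.Is(err, ErrContextCancelled):
		return "cancelled"
	default:
		return "other"
	}
}

func main() {
	err := &ProviderError{
		Sentinel:   ErrAPIError,
		Provider:   "anthropic",
		StatusCode: 429,
		Body:       "rate limited",
	}
	fmt.Println(classify(err), "-", err)
}
```

Because matching goes through errors.Is, the check keeps working if the error is wrapped again further up the call stack.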

Functions

func CountMessage added in v0.24.0

func CountMessage(model string, msg Message) (int, error)

CountMessage returns the number of tokens for a single Message for the given model. The message is converted to its text representation using the same logic as CountTokens (role content + tool call names/args for AssistantMsg, output for ToolCallResult, etc.).

This is a convenience function for callers that count messages individually rather than as a batch — for example, per-entry token estimates in a conversation history manager.

func CountMessagesAndTools added in v0.24.0

func CountMessagesAndTools(tc *TokenCount, req TokenCountRequest, encoding string, perMsgOverhead int, replyPriming int) error

CountMessagesAndTools is a low-level helper for provider TokenCounter implementations. Library consumers should use the TokenCounter interface directly rather than calling this function.

It fills tc.PerMessage, tc.ToolsTokens, tc.PerTool, and tc.InputTokens using the given BPE encoding, then calls applyRoleBreakdown to populate the role breakdown fields.

Returns an error if req.Model is empty.

perMsgOverhead is added to InputTokens once per message (e.g. 4 for the OpenAI cookbook formula; 0 for approximation-only providers). replyPriming is a fixed addend for reply-priming tokens (e.g. 3 for OpenAI; 0 for others).
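
As a rough illustration of the two parameters, the sketch below applies them to per-message token counts that are assumed to be already known. inputTokens is a hypothetical helper, and it deliberately ignores tool tokens, which the real function also accounts for.

```go
package main

import "fmt"

// inputTokens adds perMsgOverhead once per message and replyPriming once
// per request, on top of the raw per-message token counts.
func inputTokens(perMessage []int, perMsgOverhead, replyPriming int) int {
	total := replyPriming
	for _, n := range perMessage {
		total += n + perMsgOverhead
	}
	return total
}

func main() {
	counts := []int{12, 40, 7} // hypothetical per-message token counts

	// OpenAI-style constants quoted in the text: 4 per message, 3 for priming.
	fmt.Println(inputTokens(counts, 4, 3))

	// Approximation-only provider: no overhead at all.
	fmt.Println(inputTokens(counts, 0, 0))
}
```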

func CountMessagesAndToolsAnthropic added in v0.24.0

func CountMessagesAndToolsAnthropic(tc *TokenCount, req TokenCountRequest) error

CountMessagesAndToolsAnthropic is like CountMessagesAndTools but applies Anthropic-specific tool overhead constants: the hidden tool-use system preamble (~330 tokens, paid once) plus per-tool serialisation framing (~126 tokens first tool, ~85 tokens each additional). In total, a request with N tools adds 330+126+(N-1)×85 tokens on top of the raw JSON counts.

Use this for anthropic, bedrock, and claude providers.
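
The overhead formula above can be checked with a few lines of arithmetic. anthropicToolOverhead is a hypothetical helper written for this illustration; returning 0 when there are no tools is an assumption, since the text only covers N ≥ 1.

```go
package main

import "fmt"

const (
	toolPreamble   = 330 // hidden tool-use system preamble, paid once
	firstToolFrame = 126 // serialisation framing for the first tool
	extraToolFrame = 85  // serialisation framing for each additional tool
)

// anthropicToolOverhead returns the approximate fixed overhead for n tools:
// 330 + 126 + (n-1)*85. Returning 0 for n <= 0 is an assumption.
func anthropicToolOverhead(n int) int {
	if n <= 0 {
		return 0
	}
	return toolPreamble + firstToolFrame + (n-1)*extraToolFrame
}

func main() {
	for _, n := range []int{1, 2, 5} {
		fmt.Printf("%d tools: ~%d tokens\n", n, anthropicToolOverhead(n))
	}
}
```

For one, two, and five tools this gives roughly 456, 541, and 796 tokens on top of the raw JSON counts.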

func CountText added in v0.24.0

func CountText(model, text string) (int, error)

CountText returns the number of tokens in text for the given model. The encoding is selected automatically based on the model ID: o200k_base for GPT-4o/o-series, cl100k_base for everything else.

This is a convenience function for callers that need to count raw text without constructing a full TokenCountRequest — for example, context-budget managers that count individual history entries.

func DefaultHttpClient added in v0.23.0

func DefaultHttpClient() *http.Client

DefaultHttpClient returns the shared default HTTP client. It is safe for concurrent use and is reused across all providers that do not supply their own client.

func DeltaIndex added in v0.23.0

func DeltaIndex(i int) *uint32

DeltaIndex converts an int to a *uint32 for use as Delta.Index.

func NewHttpClient added in v0.23.0

func NewHttpClient(opts HttpClientOpts) *http.Client

NewHttpClient creates a new *http.Client with sensible defaults for LLM provider use. The client has no top-level Timeout — LLM streams can be arbitrarily long and are cancelled via context. Transport-level timeouts guard against stalled connections at the TCP/TLS layer.

When opts.Logger is non-nil, every request and response is logged at Debug level. Set opts.Debug = true to also include headers and bodies. Response bodies are tee-logged as they stream — no buffering, no broken SSE.
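
The pattern of "no overall timeout, but bounded connection setup" can be sketched with the standard library alone. This is not the library's implementation: newStreamingClient is a hypothetical name and every duration below is an illustrative assumption.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newStreamingClient builds a client with no overall Timeout, so long
// streams are bounded only by the request context. Transport-level
// timeouts still guard connection setup. All durations are assumptions.
func newStreamingClient() *http.Client {
	transport := &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   10 * time.Second, // TCP connect
			KeepAlive: 30 * time.Second,
		}).DialContext,
		TLSHandshakeTimeout:   10 * time.Second,
		ResponseHeaderTimeout: 60 * time.Second, // time until the first response header
		IdleConnTimeout:       90 * time.Second,
	}
	// Timeout is left at its zero value: no deadline on the whole exchange.
	return &http.Client{Transport: transport}
}

func main() {
	c := newStreamingClient()
	fmt.Println("overall timeout disabled:", c.Timeout == 0)
}
```

Setting http.Client.Timeout would also cap the time spent reading the response body, which is why it is left unset for streaming use.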

func NewRequestID added in v0.23.0

func NewRequestID() string

NewRequestID generates a unique correlation ID for a stream request. Uses a URL-safe nanoid with a length of 12 characters.
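
For readers unfamiliar with nanoid-style identifiers, the sketch below generates a 12-character URL-safe ID using only crypto/rand. It is a stand-in for illustration: newID and the exact alphabet ordering are assumptions, and the library may rely on a nanoid package instead.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// idAlphabet holds 64 URL-safe symbols (letters, digits, '_' and '-').
const idAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-"

// newID returns n random characters drawn from idAlphabet.
func newID(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		// The alphabet has exactly 64 symbols, so masking with 63 picks
		// each symbol with equal probability.
		buf[i] = idAlphabet[b&63]
	}
	return string(buf), nil
}

func main() {
	id, err := newID(12)
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```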

Types

type AssistantMsg added in v0.5.0

type AssistantMsg struct {
	Content   string
	ToolCalls []ToolCall
	CacheHint *CacheHint
}

AssistantMsg contains an assistant response, optionally with tool calls.

func (*AssistantMsg) MarshalJSON added in v0.5.0

func (m *AssistantMsg) MarshalJSON() ([]byte, error)

func (*AssistantMsg) Role added in v0.5.0

func (m *AssistantMsg) Role() Role

func (*AssistantMsg) Validate added in v0.5.0

func (m *AssistantMsg) Validate() error

type BoundToolSpec added in v0.23.0

type BoundToolSpec[In, Out any] struct {
	// contains filtered or unexported fields
}

BoundToolSpec pairs a ToolSpec[In] with a handler function, satisfying both ToolHandler (for StreamResponse.HandleTool) and toolRegistration (for NewToolSet). Create one with the package-level Handle function.

func Handle added in v0.23.0

func Handle[In, Out any](spec *ToolSpec[In], fn func(ctx context.Context, in In) (*Out, error)) *BoundToolSpec[In, Out]

Handle binds a handler function to a ToolSpec, producing a *BoundToolSpec that satisfies both ToolHandler and toolRegistration.

Because Go methods cannot introduce new type parameters, this is a package-level generic function rather than a method on ToolSpec.

Example:

weatherSpec := llm.NewToolSpec[GetWeatherParams]("get_weather", "Get weather")

// Register with StreamResponse:
llm.Process(ctx, ch).
    HandleTool(llm.Handle(weatherSpec, func(ctx context.Context, in GetWeatherParams) (*GetWeatherResult, error) {
        return &GetWeatherResult{Temp: 22}, nil
    }))

// Or pass directly to NewToolSet — BoundToolSpec satisfies toolRegistration too:
tools := llm.NewToolSet(
    llm.Handle(weatherSpec, weatherFn),
    llm.Handle(searchSpec,  searchFn),
)

func (*BoundToolSpec[In, Out]) Definition added in v0.23.0

func (b *BoundToolSpec[In, Out]) Definition() ToolDefinition

Definition implements toolRegistration — delegates to the embedded spec.

func (*BoundToolSpec[In, Out]) Handle added in v0.23.0

func (b *BoundToolSpec[In, Out]) Handle(ctx context.Context, call ToolCall) (string, error)

Handle implements ToolHandler — validates, unmarshals, calls fn, marshals result.
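
The unmarshal, call, marshal sequence can be sketched as a generic adapter. Everything here is a stand-in: bind, weatherParams, weatherResult, and runWeather are hypothetical names, the arguments are modelled as map[string]any (as in the low-level tool example in the README), and the schema validation step is omitted.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

type weatherParams struct {
	Location string `json:"location"`
}

type weatherResult struct {
	Temp int    `json:"temp"`
	Desc string `json:"desc"`
}

// bind adapts a typed handler to an untyped one that takes raw arguments
// and returns the JSON-encoded result.
func bind[In, Out any](fn func(context.Context, In) (*Out, error)) func(context.Context, map[string]any) (string, error) {
	return func(ctx context.Context, args map[string]any) (string, error) {
		raw, err := json.Marshal(args)
		if err != nil {
			return "", fmt.Errorf("encode arguments: %w", err)
		}
		var in In
		if err := json.Unmarshal(raw, &in); err != nil {
			return "", fmt.Errorf("decode arguments: %w", err)
		}
		out, err := fn(ctx, in)
		if err != nil {
			return "", err
		}
		encoded, err := json.Marshal(out)
		if err != nil {
			return "", fmt.Errorf("encode result: %w", err)
		}
		return string(encoded), nil
	}
}

func getWeather(_ context.Context, p weatherParams) (*weatherResult, error) {
	if p.Location == "" {
		return nil, fmt.Errorf("location is required")
	}
	return &weatherResult{Temp: 22, Desc: "sunny"}, nil
}

// runWeather invokes the bound handler with a background context.
func runWeather(args map[string]any) (string, error) {
	return bind(getWeather)(context.Background(), args)
}

func main() {
	out, err := runWeather(map[string]any{"location": "Berlin"})
	fmt.Println(out, err)
}
```

The type parameters are inferred from the handler's signature, which is the same property that lets Handle be called without explicit type arguments in the example above.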

func (*BoundToolSpec[In, Out]) ToolName added in v0.23.0

func (b *BoundToolSpec[In, Out]) ToolName() string

ToolName implements ToolHandler — returns the spec's tool name.

type CacheHint added in v0.20.0

type CacheHint struct {
	// Enabled marks this content as a cache breakpoint candidate.
	// For Anthropic/Bedrock: emits cache_control / cachePoint at this position.
	// For OpenAI: no-op (caching is automatic).
	Enabled bool

	// TTL requests a specific cache duration.
	// Valid values: "" (provider default, typically 5m), "5m", "1h".
	// The "1h" option requires a supporting model (Claude Haiku/Sonnet/Opus 4.5+).
	TTL string
}

CacheHint requests provider-side prompt caching for a message or request. It is a provider-neutral instruction: Anthropic and Bedrock translate it to explicit cache breakpoints on content blocks; OpenAI caching is always automatic and ignores per-message hints, but honours TTL on Request.CacheHint.

type Delta added in v0.23.0

type Delta struct {
	// Type identifies which payload field is set.
	Type DeltaType `json:"type"`

	// Index is the position of this content block in the model's output array.
	// nil when the provider does not supply block-level indexing.
	//
	// Index is meaningful because a single HTTP response can contain multiple
	// blocks of the same type. With Anthropic's interleaved-thinking beta a
	// single response may produce: thinking(0) → text(1) → tool(2) → thinking(3) → text(4).
	// Without Index a consumer cannot tell which thinking or text block a delta
	// belongs to.
	//
	// Provider semantics:
	//   Anthropic          — content_block index, all block types
	//   Bedrock            — ContentBlockIndex, all block types
	//   OpenAI Responses   — output_index, all output types
	//   OpenAI Completions — tool_calls[].index, tool calls only; text=nil
	//   OpenRouter         — tool_calls[].index, tool calls only; text=nil
	//   Ollama             — nil (complete tool calls only, no streaming fragments)
	Index *uint32 `json:"index,omitempty"`

	// Text is populated for DeltaTypeText.
	Text string `json:"text,omitempty"`

	// Reasoning is populated for DeltaTypeReasoning.
	Reasoning string `json:"reasoning,omitempty"`

	// ToolID, ToolName, and ToolArgs are populated for DeltaTypeTool.
	// ToolArgs is a raw partial JSON fragment — not yet a complete object.
	ToolID   string `json:"tool_id,omitempty"`
	ToolName string `json:"tool_name,omitempty"`
	ToolArgs string `json:"tool_args,omitempty"`
}

Delta carries one incremental content chunk from the model stream. Exactly one payload field is populated, indicated by Type.
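
Since ToolArgs arrives as partial JSON, a consumer that works at the delta level has to buffer fragments per block index before decoding. The sketch below shows one way to do that with a trimmed stand-in type. The delta type, idx, and assembleArgs are hypothetical, and skipping deltas without an index is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// delta is a trimmed stand-in for llm.Delta with only the tool fields.
type delta struct {
	Index    *uint32
	ToolName string
	ToolArgs string // partial JSON fragment
}

func idx(i uint32) *uint32 { return &i }

// assembleArgs concatenates fragments per block index, then decodes each
// completed buffer. Deltas without an index are skipped (an assumption).
func assembleArgs(deltas []delta) (map[uint32]map[string]any, error) {
	buffers := map[uint32]*strings.Builder{}
	for _, d := range deltas {
		if d.Index == nil {
			continue
		}
		b, ok := buffers[*d.Index]
		if !ok {
			b = &strings.Builder{}
			buffers[*d.Index] = b
		}
		b.WriteString(d.ToolArgs)
	}

	out := make(map[uint32]map[string]any, len(buffers))
	for i, b := range buffers {
		var args map[string]any
		if err := json.Unmarshal([]byte(b.String()), &args); err != nil {
			return nil, fmt.Errorf("block %d: %w", i, err)
		}
		out[i] = args
	}
	return out, nil
}

func main() {
	deltas := []delta{
		{Index: idx(2), ToolName: "get_weather", ToolArgs: `{"loca`},
		{Index: idx(2), ToolArgs: `tion":"Ber`},
		{Index: idx(2), ToolArgs: `lin"}`},
	}
	args, err := assembleArgs(deltas)
	fmt.Println(args[2]["location"], err)
}
```

Consumers that only need completed calls can wait for the StreamEventToolCall event instead of assembling fragments themselves.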

func ReasoningDelta added in v0.23.0

func ReasoningDelta(idx *uint32, text string) *Delta

ReasoningDelta creates a Delta for an incremental reasoning/thinking chunk.

func TextDelta added in v0.23.0

func TextDelta(idx *uint32, text string) *Delta

TextDelta creates a Delta for an incremental text chunk.

func ToolDelta added in v0.23.0

func ToolDelta(idx *uint32, id, name, argsFragment string) *Delta

ToolDelta creates a Delta for a partial tool-call arguments fragment.

type DeltaType added in v0.23.0

type DeltaType string

DeltaType identifies the kind of incremental content carried by a Delta.

const (
	DeltaTypeText      DeltaType = "text"
	DeltaTypeReasoning DeltaType = "reasoning"
	DeltaTypeTool      DeltaType = "tool"
)

type EventStream added in v0.23.0

type EventStream struct {
	// contains filtered or unexported fields
}

EventStream wraps a buffered StreamEvent channel and stamps every outgoing event with the same RequestID, an incrementing sequence number, and a timestamp. Providers create one at the top of CreateStream via NewEventStream, send all events through Send, and return C() to callers.

func NewEventStream added in v0.23.0

func NewEventStream() *EventStream

NewEventStream creates an EventStream with a freshly generated RequestID, records the creation time, emits a StreamEventCreated event, and returns a buffered channel of 64 events.

func (*EventStream) C added in v0.23.0

func (s *EventStream) C() <-chan StreamEvent

C returns the read-only channel to hand back to the caller of CreateStream.

func (*EventStream) Close added in v0.23.0

func (s *EventStream) Close()

Close closes the underlying channel. Safe to call multiple times.
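
Closing a Go channel twice panics, so a Close that is safe to call repeatedly needs a guard. One common way to get that behaviour is sync.Once, sketched below. Whether EventStream uses sync.Once internally is an assumption, and the stream type here is a stand-in.

```go
package main

import (
	"fmt"
	"sync"
)

// stream is a minimal stand-in that owns a buffered channel.
type stream struct {
	ch   chan string
	once sync.Once
}

func newStream() *stream { return &stream{ch: make(chan string, 64)} }

// Close closes the channel exactly once; later calls are no-ops.
func (s *stream) Close() {
	s.once.Do(func() { close(s.ch) })
}

func main() {
	s := newStream()
	s.ch <- "hello"
	s.Close()
	s.Close() // a bare close(s.ch) here would panic

	// Buffered values remain readable after Close; the loop ends when
	// the channel is drained.
	for v := range s.ch {
		fmt.Println(v)
	}
}
```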

func (*EventStream) Delta added in v0.23.0

func (s *EventStream) Delta(d *Delta)

Delta sends a StreamEventDelta event with the given delta.

func (*EventStream) Done added in v0.23.0

func (s *EventStream) Done(reason StopReason, usage *Usage)

Done sends a StreamEventDone event with the given stop reason and usage statistics. usage may be nil if the provider did not return token counts.

func (*EventStream) Error added in v0.23.0

func (s *EventStream) Error(err *ProviderError)

Error sends a StreamEventError event. It accepts a *ProviderError so the contract is enforced at compile time: every error in a stream is structured.

func (*EventStream) Routed added in v0.23.0

func (s *EventStream) Routed(r Routed)

Routed sends a StreamEventRouted event with routing metadata.

func (*EventStream) Send added in v0.23.0

func (s *EventStream) Send(ev StreamEvent)

Send stamps ev with the stream's RequestID, a monotonically incrementing sequence number, and the current timestamp, then sends it on the channel. The first event sent has Seq 1.
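
The stamping behaviour can be pictured with a small stand-in type. This is a sketch under assumptions, not the library's code: event, stamper, and newStamper are hypothetical names, and an atomic counter is just one way to produce the sequence numbers.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// event is a trimmed stand-in for llm.StreamEvent.
type event struct {
	RequestID string
	Seq       uint64
	Timestamp time.Time
	Payload   string
}

// stamper is a stand-in for the stamping part of EventStream.
type stamper struct {
	requestID string
	seq       atomic.Uint64
	ch        chan event
}

func newStamper(requestID string) *stamper {
	return &stamper{requestID: requestID, ch: make(chan event, 64)}
}

// Send overwrites the metadata fields and forwards the event.
func (s *stamper) Send(ev event) {
	ev.RequestID = s.requestID
	ev.Seq = s.seq.Add(1) // the first call yields 1
	ev.Timestamp = time.Now()
	s.ch <- ev
}

func main() {
	s := newStamper("req-123")
	s.Send(event{Payload: "hello"})
	s.Send(event{Payload: "world"})
	close(s.ch)

	for ev := range s.ch {
		fmt.Println(ev.RequestID, ev.Seq, ev.Payload)
	}
}
```

Because every event carries the same RequestID and a gap-free Seq, a consumer can detect dropped or reordered events when logs from several streams are interleaved.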

func (*EventStream) Start added in v0.23.0

func (s *EventStream) Start(opts StreamStartOpts)

Start sends a StreamEventStart event with the given provider metadata. TimeToFirstToken is computed automatically from the stream's createdAt time.

func (*EventStream) ToolCall added in v0.23.0

func (s *EventStream) ToolCall(tc ToolCall)

ToolCall sends a StreamEventToolCall event for the given tool call.

type HttpClientOpts added in v0.23.0

type HttpClientOpts struct {
	// Logger enables transport-level request/response logging at Debug level.
	// When nil, no logging is performed.
	Logger *slog.Logger

	// Debug extends logging to include request/response headers and bodies.
	// Has no effect when Logger is nil.
	Debug bool
}

HttpClientOpts configures the HTTP client created by NewHttpClient.

type Message

type Message interface {
	Role() Role
	Validate() error
	json.Marshaler
	// contains filtered or unexported methods
}

Message is the interface all message types implement.

type Messages added in v0.5.0

type Messages []Message

Messages is a slice of Message with JSON unmarshal support.

func (*Messages) AddAssistantMsg added in v0.23.0

func (m *Messages) AddAssistantMsg(content string, toolCalls ...ToolCall)

func (*Messages) AddSystemMsg added in v0.23.0

func (m *Messages) AddSystemMsg(content string)

func (*Messages) AddToolCallResult added in v0.23.0

func (m *Messages) AddToolCallResult(toolCallID, output string, isError bool)

func (*Messages) AddUserMsg added in v0.23.0

func (m *Messages) AddUserMsg(content string)

func (*Messages) Append added in v0.23.0

func (m *Messages) Append(msg ...Message)

func (*Messages) UnmarshalJSON added in v0.5.0

func (m *Messages) UnmarshalJSON(data []byte) error

type Model

type Model struct {
	ID       string   `json:"id"`
	Name     string   `json:"name"`
	Provider string   `json:"provider"`
	Aliases  []string `json:"aliases,omitempty"`
}

Model represents an LLM model.

type ModelFetcher

type ModelFetcher interface {
	FetchModels(ctx context.Context) ([]Model, error)
}

ModelFetcher is an optional interface providers can implement to list models dynamically from their API instead of returning a static list.

type Option added in v0.12.0

type Option func(*Options)

Option configures provider options.

func APIKeyFromEnv added in v0.12.0

func APIKeyFromEnv(candidates ...string) Option

APIKeyFromEnv returns an Option that reads the API key from environment variables. It tries each candidate in order, returning the first non-empty value. Returns an error at call time if none of the candidates are set.
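The lookup order described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: `firstEnv`, `lookupDemo`, and the `EXAMPLE_*` variable names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// firstEnv is a hypothetical helper, not part of the llm package. It returns
// the value of the first environment variable in candidates that is non-empty.
func firstEnv(candidates ...string) (string, error) {
	for _, name := range candidates {
		if v := os.Getenv(name); v != "" {
			return v, nil
		}
	}
	return "", errors.New("no API key found in environment")
}

// lookupDemo leaves the first placeholder variable unset, sets the second,
// and resolves the key in candidate order.
func lookupDemo() (string, error) {
	os.Unsetenv("EXAMPLE_PRIMARY_KEY")
	os.Setenv("EXAMPLE_FALLBACK_KEY", "sk-fallback")
	return firstEnv("EXAMPLE_PRIMARY_KEY", "EXAMPLE_FALLBACK_KEY")
}

func main() {
	key, err := lookupDemo()
	fmt.Println(key, err)
}
```

Listing the most specific variable first lets a narrowly scoped key override a shared fallback.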

func WithAPIKey added in v0.12.0

func WithAPIKey(key string) Option

WithAPIKey sets a static API key.

func WithAPIKeyFunc added in v0.12.0

func WithAPIKeyFunc(f func(ctx context.Context) (string, error)) Option

WithAPIKeyFunc sets a dynamic API key resolver. The function is called on each CreateStream() call, enabling:

  • Lazy key resolution (fetch from secret manager on first use)
  • Key rotation (fetch fresh key each time)
  • Context-aware resolution (respect timeouts/cancellation)
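A minimal sketch of how a per-call resolver enables key rotation. The `Options`, `Option`, `Apply`, `WithAPIKeyFunc`, and `ResolveAPIKey` declarations below are local stand-ins that mirror the documented signatures so the example compiles on its own; `rotatingKeys` and `resolveTwice` are hypothetical helpers, and the real package may behave differently in details.

```go
package main

import (
	"context"
	"fmt"
)

// Options and Option are trimmed stand-ins for the documented types.
type Options struct {
	APIKeyFunc func(ctx context.Context) (string, error)
}

type Option func(*Options)

// WithAPIKeyFunc stores the resolver; it is not called until a key is needed.
func WithAPIKeyFunc(f func(ctx context.Context) (string, error)) Option {
	return func(o *Options) { o.APIKeyFunc = f }
}

// Apply applies all options to a fresh Options value.
func Apply(opts ...Option) *Options {
	o := &Options{}
	for _, opt := range opts {
		opt(o)
	}
	return o
}

// ResolveAPIKey returns "" and no error when no resolver is configured.
func (o *Options) ResolveAPIKey(ctx context.Context) (string, error) {
	if o.APIKeyFunc == nil {
		return "", nil
	}
	return o.APIKeyFunc(ctx)
}

// rotatingKeys simulates a secret store that hands out a new key on every
// call and gives up when the context is already cancelled.
// The counter is not synchronised; a real resolver would need to be.
func rotatingKeys() func(ctx context.Context) (string, error) {
	n := 0
	return func(ctx context.Context) (string, error) {
		if err := ctx.Err(); err != nil {
			return "", err
		}
		n++
		return fmt.Sprintf("key-%d", n), nil
	}
}

// resolveTwice resolves the key two times, as two stream requests would.
func resolveTwice() (string, string) {
	o := Apply(WithAPIKeyFunc(rotatingKeys()))
	a, _ := o.ResolveAPIKey(context.Background())
	b, _ := o.ResolveAPIKey(context.Background())
	return a, b
}

func main() {
	fmt.Println(resolveTwice())
}
```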

func WithBaseURL added in v0.12.0

func WithBaseURL(url string) Option

WithBaseURL sets a custom base URL for the provider.

func WithHTTPClient added in v0.23.0

func WithHTTPClient(c *http.Client) Option

WithHTTPClient sets a custom HTTP client for the provider. When not set, providers use DefaultHttpClient().

func WithLogger added in v0.23.0

func WithLogger(l *slog.Logger) Option

WithLogger sets a logger for providers that emit events outside the HTTP transport layer (e.g. Bedrock's binary eventstream). Events are logged at Debug level using the same format as the HTTP transport, so the same log renderer handles output from all providers.

type Options added in v0.12.0

type Options struct {
	// BaseURL is the base URL for the provider's API.
	BaseURL string

	// APIKeyFunc returns the API key for authentication.
	// It is called on each CreateStream() call, allowing for lazy/dynamic resolution.
	APIKeyFunc func(ctx context.Context) (string, error)

	// HTTPClient is the HTTP client to use for API requests.
	// When nil, providers fall back to DefaultHttpClient().
	HTTPClient *http.Client

	// Logger is used by providers that cannot log via the HTTP transport
	// (e.g. Bedrock's binary eventstream). When set, stream events are logged
	// at Debug level using the same message format as the HTTP transport logger
	// so the same renderer handles both.
	Logger *slog.Logger
}

Options holds configuration shared across providers.

func Apply added in v0.12.0

func Apply(opts ...Option) *Options

Apply applies all options to a new Options struct and returns it.

func (*Options) ResolveAPIKey added in v0.12.0

func (o *Options) ResolveAPIKey(ctx context.Context) (string, error)

ResolveAPIKey calls the APIKeyFunc to get the API key. Returns an empty string (no error) if no APIKeyFunc was configured.

type OutputFormat added in v0.25.0

type OutputFormat string

OutputFormat specifies the desired output format for the model response.

const (
	// OutputFormatText requests plain text output (default for most providers).
	OutputFormatText OutputFormat = "text"
	// OutputFormatJSON requests JSON output. The model will be constrained
	// to output valid JSON. Not all providers support this.
	OutputFormatJSON OutputFormat = "json"
)

type ParsedToolCall

type ParsedToolCall interface {
	ToolName() string
	ToolCallID() string
}

ParsedToolCall is the interface for parsed tool call results. Use a type switch on the concrete *TypedToolCall[T] to access typed params.

Example:

switch c := call.(type) {
case *TypedToolCall[GetWeatherParams]:
    fmt.Println(c.Params.Location)  // strongly typed
case *TypedToolCall[SearchParams]:
    fmt.Println(c.Params.Query)
}

type Provider

type Provider interface {
	Name() string
	Models() []Model
	Streamer
}

Provider is the interface each LLM backend must implement.

type ProviderError added in v0.23.0

type ProviderError struct {
	// Sentinel is one of the Err* vars above. errors.Is matches against it.
	Sentinel error `json:"-"`

	// Provider is the name of the provider that produced this error.
	// Use the ProviderName* constants.
	Provider string `json:"provider"`

	// Message is a human-readable description of the error.
	Message string `json:"message"`

	// Cause is the underlying error that triggered this one, if any.
	Cause error `json:"-"`

	// StatusCode is the HTTP response status code. Only set for ErrAPIError.
	StatusCode int `json:"status_code,omitempty"`

	// Body is the raw HTTP response body. Only set for ErrAPIError.
	Body string `json:"body,omitempty"`
}

ProviderError is a structured error emitted by any provider. It wraps a sentinel so errors.Is works, carries the provider name for identification, and optionally holds an HTTP status code and body for API errors.

func AsProviderError added in v0.23.0

func AsProviderError(provider string, err error) *ProviderError

AsProviderError ensures err is a *ProviderError. If it already is one, it is returned as-is. Otherwise it is wrapped in a new ProviderError with ErrUnknown as the sentinel. This guarantees that every error surfaced by CreateStream and EventStream.Error() is a *ProviderError.

func NewErrAPIError added in v0.23.0

func NewErrAPIError(provider string, statusCode int, body string) *ProviderError

NewErrAPIError wraps a non-2xx HTTP response from a provider API.

func NewErrBuildRequest added in v0.23.0

func NewErrBuildRequest(provider string, cause error) *ProviderError

NewErrBuildRequest wraps a failure that occurred while building the outgoing request (e.g. JSON serialisation error).

func NewErrContextCancelled added in v0.23.0

func NewErrContextCancelled(provider string, cause error) *ProviderError

NewErrContextCancelled wraps a context cancellation for a provider stream.

func NewErrMissingAPIKey added in v0.23.0

func NewErrMissingAPIKey(provider string) *ProviderError

NewErrMissingAPIKey returns an error for a provider that has no API key configured.

func NewErrNoProviders added in v0.23.0

func NewErrNoProviders(provider string) *ProviderError

NewErrNoProviders returns an error when no providers are available or all failover targets have been exhausted.

func NewErrProviderMsg added in v0.23.0

func NewErrProviderMsg(provider string, msg string) *ProviderError

NewErrProviderMsg wraps an explicit error message sent by the provider inside the stream (e.g. an Anthropic error event or OpenRouter chunk error).

func NewErrRequestFailed added in v0.23.0

func NewErrRequestFailed(provider string, cause error) *ProviderError

NewErrRequestFailed wraps an HTTP transport-level failure.

func NewErrStreamDecode added in v0.23.0

func NewErrStreamDecode(provider string, cause error) *ProviderError

NewErrStreamDecode wraps a JSON or protocol decode failure mid-stream.

func NewErrStreamRead added in v0.23.0

func NewErrStreamRead(provider string, cause error) *ProviderError

NewErrStreamRead wraps an I/O or scanner error that occurred while reading the response stream.

func NewErrUnknownModel added in v0.23.0

func NewErrUnknownModel(provider string, modelID string) *ProviderError

NewErrUnknownModel returns an error for a model ID or alias that cannot be resolved by the provider.

func (*ProviderError) Error added in v0.23.0

func (e *ProviderError) Error() string

Error returns a human-readable error string of the form "<provider>: <sentinel>: <message>", with ": <cause>" appended when Cause is set.

func (*ProviderError) Is added in v0.23.0

func (e *ProviderError) Is(target error) bool

Is reports whether this error matches target. It matches if target is the same sentinel, enabling errors.Is(err, ErrAPIError) etc.

func (*ProviderError) MarshalJSON added in v0.23.0

func (e *ProviderError) MarshalJSON() ([]byte, error)

MarshalJSON serialises ProviderError to JSON. Sentinel and Cause are rendered as strings so the full error is machine-readable.

func (*ProviderError) Unwrap added in v0.23.0

func (e *ProviderError) Unwrap() error

Unwrap returns Cause when set, allowing errors.As/Is to traverse the chain. When Cause is nil, Unwrap returns Sentinel so errors.Is(err, ErrAPIError) still works even with no underlying cause.
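The interplay of Is and Unwrap can be illustrated with a simplified stand-in. `providerError`, `errAPI`, `sampleErr`, and `classify` are hypothetical names for this sketch; only the behaviour described above (sentinel matching plus cause traversal) is modelled.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errAPI stands in for a sentinel such as ErrAPIError.
var errAPI = errors.New("api error")

// providerError is a simplified stand-in for ProviderError.
type providerError struct {
	Sentinel error
	Provider string
	Message  string
	Cause    error
}

func (e *providerError) Error() string {
	s := fmt.Sprintf("%s: %v: %s", e.Provider, e.Sentinel, e.Message)
	if e.Cause != nil {
		s += ": " + e.Cause.Error()
	}
	return s
}

// Is matches the sentinel directly, so errors.Is succeeds even when Unwrap
// returns the cause rather than the sentinel.
func (e *providerError) Is(target error) bool { return target == e.Sentinel }

// Unwrap exposes the cause when present and falls back to the sentinel.
func (e *providerError) Unwrap() error {
	if e.Cause != nil {
		return e.Cause
	}
	return e.Sentinel
}

// sampleErr builds an error that carries both a sentinel and a cause.
func sampleErr() error {
	return &providerError{Sentinel: errAPI, Provider: "example", Message: "bad gateway", Cause: io.EOF}
}

// classify reports whether err matches the sentinel and whether its chain
// contains io.EOF.
func classify(err error) (isAPI, wrapsEOF bool) {
	return errors.Is(err, errAPI), errors.Is(err, io.EOF)
}

func main() {
	err := sampleErr()
	fmt.Println(err)
	fmt.Println(classify(err))
}
```

Callers can therefore branch on the error category with errors.Is and still inspect the underlying cause through the same chain.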

type ReasoningEffort added in v0.7.0

type ReasoningEffort string

ReasoningEffort controls the amount of reasoning for reasoning models. Lower values result in faster responses with fewer reasoning tokens.

const (
	// ReasoningEffortNone disables reasoning (GPT-5.1+ only).
	ReasoningEffortNone ReasoningEffort = "none"
	// ReasoningEffortMinimal uses minimal reasoning effort.
	ReasoningEffortMinimal ReasoningEffort = "minimal"
	// ReasoningEffortLow uses low reasoning effort.
	ReasoningEffortLow ReasoningEffort = "low"
	// ReasoningEffortMedium uses medium reasoning effort (default for most models before GPT-5.1).
	ReasoningEffortMedium ReasoningEffort = "medium"
	// ReasoningEffortHigh uses high reasoning effort.
	ReasoningEffortHigh ReasoningEffort = "high"
	// ReasoningEffortXHigh uses extra high reasoning effort (codex-max+ only).
	ReasoningEffortXHigh ReasoningEffort = "xhigh"
)

func (ReasoningEffort) Valid added in v0.8.0

func (r ReasoningEffort) Valid() bool

Valid reports whether r is a known ReasoningEffort value or the empty string.

type Request added in v0.25.0

type Request struct {
	// Model is the model identifier or alias to use, e.g. "fast", "anthropic/claude-sonnet-4-5".
	Model string `json:"model"`

	// Messages is the conversation history to send to the model.
	Messages Messages `json:"messages"`

	// MaxTokens limits the maximum number of tokens in the response.
	// When 0, the provider's default is used.
	MaxTokens int `json:"max_tokens,omitempty"`

	// Temperature controls randomness in sampling. Higher values produce
	// more diverse outputs (0.0-2.0 for most providers). Not supported by
	// Anthropic.
	Temperature float64 `json:"temperature,omitempty"`

	// TopP is the nucleus sampling threshold. The model considers only tokens
	// comprising the top P probability mass. Not supported by Anthropic.
	TopP float64 `json:"top_p,omitempty"`

	// TopK restricts token selection to the K most likely tokens. Higher values
	// increase diversity. Not supported by Anthropic.
	TopK int `json:"top_k,omitempty"`

	// OutputFormat specifies the desired output format.
	// Supported by OpenAI and Anthropic. When set to JSON, the model will
	// be constrained to output valid JSON.
	OutputFormat OutputFormat `json:"output_format,omitempty"`

	// Tools is the set of tools the model may call during the response.
	Tools []ToolDefinition `json:"tools,omitempty"`

	// ToolChoice controls how the model selects tools. Defaults to Auto when Tools are provided.
	ToolChoice ToolChoice `json:"tool_choice,omitempty"`

	// ReasoningEffort controls the depth of reasoning for models that support it (e.g. OpenAI o-series).
	ReasoningEffort ReasoningEffort `json:"reasoning_effort,omitempty"`

	// CacheHint is a top-level prompt caching hint. Behaviour is provider-specific:
	// Anthropic auto mode, Bedrock trailing cachePoint, OpenAI extended retention.
	CacheHint *CacheHint `json:"cache_hint,omitempty"`
}

Request configures a provider CreateStream call.

func (Request) Validate added in v0.25.0

func (o Request) Validate() error

Validate checks that the request is valid.

type Resolver added in v0.16.0

type Resolver interface {
	// Resolve returns the Model for the given model ID or alias.
	// Returns an error if the model is not recognized.
	Resolve(modelID string) (Model, error)
}

Resolver resolves a model alias or ID to its full Model representation.

type Role

type Role string

Role represents the role of a message in a conversation.

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type Routed added in v0.23.0

type Routed struct {
	// Provider is the name of the backend provider selected (e.g. "anthropic", "bedrock").
	Provider string `json:"provider"`
	// ModelRequested is the model alias or name the caller originally asked for.
	ModelRequested string `json:"model_requested,omitempty"`
	// ModelResolved is the fully qualified model identifier dispatched to the provider.
	ModelResolved string `json:"model_resolved,omitempty"`
	// Errors contains errors from any targets that were tried and failed before
	// this provider was selected. Empty when the first target succeeded.
	Errors []error `json:"-"`
}

Routed carries routing metadata emitted by meta-providers (e.g. router) when a request has been dispatched to a specific backend provider.

func (Routed) MarshalJSON added in v0.23.0

func (r Routed) MarshalJSON() ([]byte, error)

MarshalJSON serialises Routed, rendering Errors as strings since []error is not directly JSON-marshallable.
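A rough sketch of rendering a []error field as strings. The `routed` type is a trimmed stand-in, and the "errors" key name is an assumption; the actual JSON shape produced by the library may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// routed is a trimmed stand-in for Routed.
type routed struct {
	Provider string  `json:"provider"`
	Errors   []error `json:"-"`
}

// MarshalJSON copies the fields into an anonymous struct in which the errors
// are plain strings. The "errors" key is an assumption made for this sketch.
func (r routed) MarshalJSON() ([]byte, error) {
	out := struct {
		Provider string   `json:"provider"`
		Errors   []string `json:"errors,omitempty"`
	}{Provider: r.Provider}
	for _, err := range r.Errors {
		out.Errors = append(out.Errors, err.Error())
	}
	return json.Marshal(out)
}

// routedJSON marshals a sample value with one failed target.
func routedJSON() string {
	r := routed{Provider: "bedrock", Errors: []error{errors.New("anthropic: timeout")}}
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(routedJSON())
}
```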

type SortedMap added in v0.22.0

type SortedMap struct {
	// contains filtered or unexported fields
}

SortedMap is a map[string]any that serialises its keys in alphabetical order. This guarantees deterministic JSON output for tool schema definitions, which is required for stable prompt-cache fingerprints on providers that hash tool definitions (Anthropic, Bedrock).

Construct with NewSortedMap. The zero value is valid and marshals as {}.

func NewSortedMap added in v0.22.0

func NewSortedMap(m map[string]any) *SortedMap

NewSortedMap converts a map[string]any into a SortedMap whose keys are sorted alphabetically at every level of nesting. Nested map[string]any values and []any arrays are recursed so that all object nodes in the tree are also sorted. A nil or empty map produces a SortedMap that marshals as {}.

func (*SortedMap) MarshalJSON added in v0.22.0

func (sm *SortedMap) MarshalJSON() ([]byte, error)

MarshalJSON implements json.Marshaler. Keys are emitted in the order they were inserted (alphabetical, because NewSortedMap sorts them on construction).
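One possible shape for such a type is sketched below. `sortedMap`, `newSortedMap`, `sortValue`, and `schemaJSON` are hypothetical stand-ins, not the library's code. Note that encoding/json already sorts keys when marshalling a plain Go map, so the sketch mainly illustrates the recursive construction and explicit key order.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
)

// sortedMap is a stand-in for SortedMap: keys live in a sorted slice.
type sortedMap struct {
	keys []string
	vals map[string]any
}

// newSortedMap sorts the keys and recurses into nested maps and slices.
func newSortedMap(m map[string]any) *sortedMap {
	sm := &sortedMap{vals: make(map[string]any, len(m))}
	for k, v := range m {
		sm.keys = append(sm.keys, k)
		sm.vals[k] = sortValue(v)
	}
	sort.Strings(sm.keys)
	return sm
}

// sortValue converts nested objects so every level is sorted.
func sortValue(v any) any {
	switch t := v.(type) {
	case map[string]any:
		return newSortedMap(t)
	case []any:
		out := make([]any, len(t))
		for i, e := range t {
			out[i] = sortValue(e)
		}
		return out
	}
	return v
}

// MarshalJSON writes the keys in their stored (sorted) order.
func (sm *sortedMap) MarshalJSON() ([]byte, error) {
	var buf bytes.Buffer
	buf.WriteByte('{')
	for i, k := range sm.keys {
		if i > 0 {
			buf.WriteByte(',')
		}
		kb, err := json.Marshal(k)
		if err != nil {
			return nil, err
		}
		vb, err := json.Marshal(sm.vals[k])
		if err != nil {
			return nil, err
		}
		buf.Write(kb)
		buf.WriteByte(':')
		buf.Write(vb)
	}
	buf.WriteByte('}')
	return buf.Bytes(), nil
}

// schemaJSON marshals a small tool schema.
func schemaJSON() string {
	b, err := json.Marshal(newSortedMap(map[string]any{
		"type": "object",
		"properties": map[string]any{
			"unit":     map[string]any{"type": "string"},
			"location": map[string]any{"type": "string"},
		},
	}))
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(schemaJSON())
}
```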

type StopReason added in v0.23.0

type StopReason string

StopReason describes why the model stopped generating.

const (
	// StopReasonEndTurn is natural completion — the model finished its response.
	StopReasonEndTurn StopReason = "end_turn"
	// StopReasonToolUse means the model emitted one or more tool calls.
	StopReasonToolUse StopReason = "tool_use"
	// StopReasonMaxTokens means the output length limit was reached.
	StopReasonMaxTokens StopReason = "max_tokens"
	// StopReasonContentFilter means output was blocked by the provider.
	StopReasonContentFilter StopReason = "content_filter"
	// StopReasonCancelled means the context was cancelled before the stream ended.
	StopReasonCancelled StopReason = "cancelled"
	// StopReasonError means the stream ended with a StreamEventError.
	StopReasonError StopReason = "error"
)

type StreamEvent

type StreamEvent struct {
	// Type identifies which kind of event this is.
	Type StreamEventType `json:"type"`

	// RequestID is the library-assigned correlation ID for this stream.
	// Generated once per CreateStream call; identical across all events in a stream.
	RequestID string `json:"request_id,omitempty"`

	// Seq is a monotonically incrementing sequence number within a stream.
	// The first event has Seq 1. Useful for detecting dropped or reordered events.
	Seq uint64 `json:"seq,omitempty"`

	// Timestamp is the wall-clock time at which this event was sent.
	Timestamp time.Time `json:"timestamp,omitempty"`

	// Delta carries incremental model output. Populated for StreamEventDelta.
	Delta *Delta `json:"delta,omitempty"`

	// ToolCall is the tool invocation requested by the model. Populated for StreamEventToolCall.
	ToolCall *ToolCall `json:"tool_call,omitempty"`

	// Error holds the error that terminated the stream. Populated for StreamEventError.
	Error *ProviderError `json:"error,omitempty"`

	// Usage holds token counts and cost for the completed request. Populated for StreamEventDone.
	Usage *Usage `json:"usage,omitempty"`

	// Start holds stream metadata. Populated for StreamEventStart.
	Start *StreamStart `json:"start,omitempty"`

	// Routed holds routing metadata. Populated for StreamEventRouted.
	Routed *Routed `json:"routed,omitempty"`

	// StopReason describes why the model stopped generating.
	// Populated for StreamEventDone with the provider-reported reason.
	StopReason StopReason `json:"stop_reason,omitempty"`
}

StreamEvent is a single event emitted by a provider during streaming.

func (StreamEvent) ReasoningText added in v0.23.0

func (e StreamEvent) ReasoningText() string

ReasoningText returns the reasoning content of the delta if this is a StreamEventDelta of type DeltaTypeReasoning, otherwise returns "".

func (StreamEvent) Text added in v0.23.0

func (e StreamEvent) Text() string

Text returns the text content of the delta if this is a StreamEventDelta of type DeltaTypeText, otherwise returns "".

type StreamEventType

type StreamEventType string

StreamEventType identifies the kind of streaming event from a provider.

const (
	StreamEventCreated  StreamEventType = "created"
	StreamEventRouted   StreamEventType = "routed"
	StreamEventStart    StreamEventType = "start"
	StreamEventDelta    StreamEventType = "delta"
	StreamEventToolCall StreamEventType = "tool_call"
	StreamEventDone     StreamEventType = "done"
	StreamEventError    StreamEventType = "error"
)
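For callers that read the event channel directly, the loop below sketches one way to switch on the event type and use Seq to detect gaps. The `event` type is a flattened stand-in (the real StreamEvent carries text inside Delta), and treating a channel closed without a done event as an error is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type eventType string

const (
	evDelta eventType = "delta"
	evDone  eventType = "done"
	evError eventType = "error"
)

// event is a flattened, hypothetical stand-in for StreamEvent.
type event struct {
	Type eventType
	Seq  uint64
	Text string
	Err  error
}

// collect concatenates delta text and stops at done or error. It rejects any
// event whose Seq is not exactly one more than the previous event's Seq.
func collect(ch <-chan event) (string, error) {
	var sb strings.Builder
	var last uint64
	for ev := range ch {
		if ev.Seq != last+1 {
			return sb.String(), fmt.Errorf("sequence gap: got %d after %d", ev.Seq, last)
		}
		last = ev.Seq
		switch ev.Type {
		case evDelta:
			sb.WriteString(ev.Text)
		case evError:
			return sb.String(), ev.Err
		case evDone:
			return sb.String(), nil
		}
	}
	return sb.String(), errors.New("stream closed without a done event")
}

// feed returns a closed, pre-filled channel for demonstration purposes.
func feed(events ...event) <-chan event {
	ch := make(chan event, len(events))
	for _, ev := range events {
		ch <- ev
	}
	close(ch)
	return ch
}

func main() {
	text, err := collect(feed(
		event{Type: evDelta, Seq: 1, Text: "Hel"},
		event{Type: evDelta, Seq: 2, Text: "lo"},
		event{Type: evDone, Seq: 3},
	))
	fmt.Println(text, err)
}
```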

type StreamRequest added in v0.23.0

type StreamRequest = Request

type StreamResponse added in v0.23.0

type StreamResponse struct {
	// contains filtered or unexported fields
}

StreamResponse is a client-side, stateful stream processor. Create one with Process, register callbacks and tool handlers fluently, then call Result() to start consuming the stream.

Example:

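weatherSpec := llm.NewToolSpec[GetWeatherParams]("get_weather", "Get weather")

ch, err := provider.CreateStream(ctx, opts)
if err != nil {
    return err // the stream could not be created
}

// Register callbacks and handlers, then block until the single result arrives.
result := <-llm.Process(ctx, ch).
    OnText(func(s string) { fmt.Print(s) }).
    HandleTool(llm.Handle(weatherSpec, func(ctx context.Context, in GetWeatherParams) (*GetWeatherResult, error) {
        return &GetWeatherResult{Temp: 22}, nil
    })).
    Result()

if err := result.Error(); err != nil {
    return err // stream-level failure, e.g. provider error or cancellation
}
result.Apply(&msgs) // append the assistant turn and any tool results to the history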

func Process added in v0.23.0

func Process(ctx context.Context, ch <-chan StreamEvent) *StreamResponse

Process creates a new StreamResponse that will consume ch. Call fluent methods to register callbacks and tool handlers, then call Result() to begin processing.

func (*StreamResponse) DispatchAsync added in v0.23.0

func (r *StreamResponse) DispatchAsync() *StreamResponse

DispatchAsync switches tool handler dispatch to concurrent mode: all tool calls emitted in a single response are executed in parallel, one goroutine per call. Results are collected in emission order before the stream is considered complete.
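The "parallel execution, ordered results" behaviour can be sketched with a WaitGroup and an index-addressed slice. `call`, `handler`, and `dispatchAsync` are hypothetical stand-ins; the library's internals are not shown in this documentation.

```go
package main

import (
	"fmt"
	"sync"
)

// call and handler are hypothetical stand-ins for ToolCall and a tool handler.
type call struct{ ID, Name string }

type handler func(call) string

// dispatchAsync runs one goroutine per call. Each goroutine writes only to
// the slot that matches its call's index, so the results keep emission order
// without further synchronisation.
func dispatchAsync(calls []call, h handler) []string {
	results := make([]string, len(calls))
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c call) {
			defer wg.Done()
			results[i] = h(c)
		}(i, c)
	}
	wg.Wait() // the response is complete only once every handler has returned
	return results
}

func main() {
	out := dispatchAsync(
		[]call{{ID: "1", Name: "get_weather"}, {ID: "2", Name: "search"}},
		func(c call) string { return c.Name + ":" + c.ID },
	)
	fmt.Println(out)
}
```

Handlers that share state need their own locking in this mode, since they may run at the same time.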

func (*StreamResponse) HandleTool added in v0.23.0

func (r *StreamResponse) HandleTool(handlers ...ToolHandler) *StreamResponse

HandleTool registers one or more ToolHandlers. Each handler is invoked when the model emits a completed tool call whose name matches the handler's ToolName(). The handler's output is stored in StreamResult.ToolResults and included in the messages returned by Next/Apply.

Pass a *BoundToolSpec (from llm.Handle) for typed, spec-aware handlers:

proc.HandleTool(llm.Handle(weatherSpec, func(ctx context.Context, in GetWeatherParams) (*GetWeatherResult, error) {
    return &GetWeatherResult{Temp: 22}, nil
}))

func (*StreamResponse) OnReasoning added in v0.23.0

func (r *StreamResponse) OnReasoning(fn func(chunk string)) *StreamResponse

OnReasoning registers a callback that is called for each incremental reasoning/thinking token.

func (*StreamResponse) OnStart added in v0.23.0

func (r *StreamResponse) OnStart(fn func(*StreamStart)) *StreamResponse

OnStart registers a callback that is called when the StreamEventStart event arrives, carrying provider metadata (request ID, model, time-to-first-token).

func (*StreamResponse) OnText added in v0.23.0

func (r *StreamResponse) OnText(fn func(chunk string)) *StreamResponse

OnText registers a callback that is called for each incremental text token. Panics in the callback are recovered and recorded on the StreamResult error.
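Recovering a callback panic and turning it into an error might look like the following. `safeCall` is a hypothetical helper written for this sketch; how the library records the recovered value on the StreamResult is not shown here.

```go
package main

import "fmt"

// safeCall invokes fn and converts a panic into an error instead of letting
// it unwind the stream-processing goroutine.
func safeCall(fn func(chunk string), chunk string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("callback panic: %v", r)
		}
	}()
	fn(chunk)
	return nil
}

func main() {
	err := safeCall(func(string) { panic("boom") }, "hello")
	fmt.Println(err)
}
```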

func (*StreamResponse) OnToolDelta added in v0.23.0

func (r *StreamResponse) OnToolDelta(fn func(d *Delta)) *StreamResponse

OnToolDelta registers a callback that is called for each partial tool-call argument fragment (DeltaTypeTool deltas).

func (*StreamResponse) Result added in v0.23.0

func (r *StreamResponse) Result() <-chan *StreamResult

Result starts consuming the stream (at most once) and returns a channel that yields exactly one *StreamResult when the stream is fully processed. The channel is closed after the result is sent.

Calling Result() multiple times is safe — the stream is only consumed once and the same channel is returned on subsequent calls.
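The at-most-once behaviour is commonly built on sync.Once. The sketch below assumes that approach; `processor`, `newProcessor`, and `runTwice` are hypothetical stand-ins that accumulate plain strings instead of stream events.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// processor is a stand-in for StreamResponse.
type processor struct {
	once sync.Once
	in   <-chan string
	out  chan string
}

func newProcessor(in <-chan string) *processor {
	return &processor{in: in, out: make(chan string, 1)}
}

// Result starts the consumer goroutine on the first call only and always
// returns the same channel. The channel yields one value and is then closed.
func (p *processor) Result() <-chan string {
	p.once.Do(func() {
		go func() {
			defer close(p.out)
			var sb strings.Builder
			for chunk := range p.in {
				sb.WriteString(chunk)
			}
			p.out <- sb.String()
		}()
	})
	return p.out
}

// runTwice calls Result twice and reports the result and channel identity.
func runTwice() (result string, sameChannel bool) {
	in := make(chan string, 2)
	in <- "Hel"
	in <- "lo"
	close(in)
	p := newProcessor(in)
	first, second := p.Result(), p.Result()
	return <-first, first == second
}

func main() {
	fmt.Println(runTwice())
}
```

Because only one value is ever sent, a second receive on the same channel yields the zero value once the channel is closed.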

func (*StreamResponse) WithToolDispatcher added in v0.23.0

func (r *StreamResponse) WithToolDispatcher(d ToolDispatcher) *StreamResponse

WithToolDispatcher sets the tool dispatcher explicitly.

type StreamResult added in v0.23.0

type StreamResult struct {
	// Text is the concatenation of all DeltaTypeText deltas.
	Text string

	// Reasoning is the concatenation of all DeltaTypeReasoning deltas.
	Reasoning string

	// ToolCalls contains every tool call emitted by the model, in order.
	ToolCalls []ToolCall

	// ToolResults holds the output of every executed tool handler, in the same
	// order as ToolCalls. Entries are present only when a ToolHandler was
	// registered for the tool name; unhandled tools have no entry here.
	ToolResults []ToolCallResult

	// Usage holds token counts and cost. Nil if the provider did not report usage.
	Usage *Usage

	// Start holds the stream metadata emitted by StreamEventStart.
	// Nil if the provider did not emit a start event.
	Start *StreamStart

	// Routed holds routing metadata emitted by meta-providers (e.g. router).
	// Populated when the stream passed through a router that selected a backend.
	// Nil when the request was sent directly to a provider.
	Routed *Routed

	// StopReason describes why the stream ended.
	StopReason StopReason
	// contains filtered or unexported fields
}

StreamResult is the final accumulated result of a processed stream. It is delivered exactly once on the channel returned by StreamResponse.Result().

func (*StreamResult) Apply added in v0.23.0

func (r *StreamResult) Apply(msgs *Messages)

Apply appends the assistant message and any tool results to msgs. Equivalent to msgs.Append(result.Next()...).

func (*StreamResult) Error added in v0.23.0

func (r *StreamResult) Error() error

Error returns any stream-level error (e.g. provider error, context cancellation).

func (*StreamResult) Message added in v0.23.0

func (r *StreamResult) Message() *AssistantMsg

Message builds an AssistantMsg from the accumulated result. Use this to append the assistant turn to a conversation history.

func (*StreamResult) Next added in v0.23.0

func (r *StreamResult) Next() []Message

Next returns the messages that should be appended to the conversation history after this turn: the AssistantMsg followed by one ToolCallResult message for each executed tool handler. If no tool handlers ran, only the AssistantMsg is returned.

This is the primary convenience for agentic loops:

msgs.Append(result.Next()...)

type StreamStart added in v0.16.0

type StreamStart struct {
	// RequestID is the unique identifier returned by the upstream API.
	// Useful for debugging and support tickets. May be empty if the API doesn't provide one.
	RequestID string `json:"request_id,omitempty"`

	// Model is the model identifier returned by the upstream API in its response.
	// e.g., "claude-haiku-4-5-20251001". May be empty if the API doesn't echo the model back.
	Model string `json:"model,omitempty"`

	TimeToFirstToken time.Duration `json:"time_to_first_token,omitempty"`
}

StreamStart contains metadata about the stream, emitted with StreamEventStart.

func (StreamStart) MarshalJSON added in v0.23.0

func (s StreamStart) MarshalJSON() ([]byte, error)

MarshalJSON renders TimeToFirstToken as a human-readable string (e.g. "412ms") instead of raw nanoseconds. All other fields use their struct tags directly via the type alias trick to avoid infinite recursion.
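The alias technique mentioned above is a common Go idiom and can be sketched as follows. `streamStart` and `startJSON` are stand-ins; the real method may differ in details such as how a zero duration is rendered.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// streamStart mirrors the documented fields of StreamStart.
type streamStart struct {
	RequestID        string        `json:"request_id,omitempty"`
	Model            string        `json:"model,omitempty"`
	TimeToFirstToken time.Duration `json:"time_to_first_token,omitempty"`
}

// MarshalJSON embeds a method-less alias of the type, so json.Marshal does
// not call this method again. The outer string field has the same JSON name
// as the embedded duration and wins because it is declared at a shallower depth.
func (s streamStart) MarshalJSON() ([]byte, error) {
	type alias streamStart
	return json.Marshal(struct {
		alias
		TimeToFirstToken string `json:"time_to_first_token,omitempty"`
	}{alias: alias(s), TimeToFirstToken: s.TimeToFirstToken.String()})
}

// startJSON marshals a sample value.
func startJSON() string {
	b, err := json.Marshal(streamStart{Model: "m", TimeToFirstToken: 412 * time.Millisecond})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(startJSON())
}
```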

type StreamStartOpts added in v0.23.0

type StreamStartOpts struct {
	// RequestID is the unique identifier returned by the upstream API.
	// Useful for debugging and support tickets. May be empty if the API doesn't provide one.
	RequestID string

	// Model is the model identifier returned by the upstream API in its response.
	// e.g., "claude-haiku-4-5-20251001". May be empty if the API doesn't echo the model back.
	Model string
}

StreamStartOpts is the input to EventStream.Start — what the provider knows from the upstream API response. Routing fields (requested model, resolved model) are not included here; they belong to a separate routing event in meta-providers.

type Streamer added in v0.19.0

type Streamer interface {
	CreateStream(ctx context.Context, opts Request) (<-chan StreamEvent, error)
}

type SystemMsg added in v0.5.0

type SystemMsg struct {
	Content   string
	CacheHint *CacheHint
}

SystemMsg contains a system prompt.

func (*SystemMsg) MarshalJSON added in v0.5.0

func (m *SystemMsg) MarshalJSON() ([]byte, error)

func (*SystemMsg) Role added in v0.5.0

func (m *SystemMsg) Role() Role

func (*SystemMsg) Validate added in v0.5.0

func (m *SystemMsg) Validate() error

type TokenCount added in v0.24.0

type TokenCount struct {
	// InputTokens is the total estimated input token count:
	// all messages + all tool definitions + any provider-specific overhead.
	InputTokens int

	// PerMessage contains the token count for each entry in TokenCountRequest.Messages,
	// in the same index order. Does not include tool definitions or overhead.
	// len(PerMessage) == len(TokenCountRequest.Messages) is guaranteed.
	PerMessage []int

	// Role breakdowns — derived from PerMessage, provided for convenience.
	// SystemTokens + UserTokens + AssistantTokens + ToolResultTokens == sum(PerMessage).
	SystemTokens     int // sum of PerMessage for all RoleSystem messages
	UserTokens       int // sum of PerMessage for all RoleUser messages
	AssistantTokens  int // sum of PerMessage for all RoleAssistant messages
	ToolResultTokens int // sum of PerMessage for all RoleTool (ToolCallResult) messages

	// ToolsTokens is the total raw token count for all tool definitions combined,
	// derived purely from the JSON-serialised tool schemas.
	// sum(values(PerTool)) == ToolsTokens.
	ToolsTokens int

	// PerTool maps each tool definition's Name to its individual raw token count.
	// sum(values(PerTool)) == ToolsTokens.
	PerTool map[string]int

	// OverheadTokens is the number of tokens the provider adds on top of the
	// caller-supplied content — tokens the caller did not write and cannot
	// control. Examples:
	//   - Anthropic: hidden tool-use system preamble + per-tool framing (~330+126+85×n)
	//   - Claude OAuth: injected billing/identity system blocks (~45 tokens)
	//
	// Zero for providers that add no hidden content (OpenAI, OpenRouter, Ollama).
	//
	// The invariant: InputTokens == sum(PerMessage) + ToolsTokens + OverheadTokens
	// (plus any per-message overhead, e.g. +4/msg for OpenAI).
	OverheadTokens int
}

TokenCount holds the result of a CountTokens call.

Invariants:

  • len(PerMessage) == len(TokenCountRequest.Messages)
  • SystemTokens + UserTokens + AssistantTokens + ToolResultTokens == sum(PerMessage)
  • sum(values(PerTool)) == ToolsTokens (raw tool JSON counts only, no overhead)
  • InputTokens == sum(PerMessage) + ToolsTokens + OverheadTokens + provider-specific per-message overhead
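The invariants above can be checked mechanically, for example in a test. The sketch below uses a local `tokenCount` stand-in and a hypothetical `checkInvariants` helper; modelling the provider-specific per-message overhead as a flat number of tokens per message is an assumption.

```go
package main

import "fmt"

// tokenCount mirrors the documented fields of TokenCount.
type tokenCount struct {
	InputTokens      int
	PerMessage       []int
	SystemTokens     int
	UserTokens       int
	AssistantTokens  int
	ToolResultTokens int
	ToolsTokens      int
	PerTool          map[string]int
	OverheadTokens   int
}

// checkInvariants verifies the four documented invariants. perMsgOverhead is
// the assumed flat per-message overhead (0 for providers that add none).
func checkInvariants(tc tokenCount, numMessages, perMsgOverhead int) error {
	if len(tc.PerMessage) != numMessages {
		return fmt.Errorf("PerMessage has %d entries, want %d", len(tc.PerMessage), numMessages)
	}
	sumMsgs := 0
	for _, n := range tc.PerMessage {
		sumMsgs += n
	}
	roles := tc.SystemTokens + tc.UserTokens + tc.AssistantTokens + tc.ToolResultTokens
	if roles != sumMsgs {
		return fmt.Errorf("role totals %d != sum(PerMessage) %d", roles, sumMsgs)
	}
	sumTools := 0
	for _, n := range tc.PerTool {
		sumTools += n
	}
	if sumTools != tc.ToolsTokens {
		return fmt.Errorf("sum(PerTool) %d != ToolsTokens %d", sumTools, tc.ToolsTokens)
	}
	want := sumMsgs + tc.ToolsTokens + tc.OverheadTokens + perMsgOverhead*numMessages
	if tc.InputTokens != want {
		return fmt.Errorf("InputTokens %d != %d", tc.InputTokens, want)
	}
	return nil
}

// sample returns a consistent count for three messages and two tools,
// assuming 4 overhead tokens per message: 50 + 40 + 0 + 12 = 102.
func sample() tokenCount {
	return tokenCount{
		InputTokens:     102,
		PerMessage:      []int{12, 30, 8},
		SystemTokens:    12,
		UserTokens:      30,
		AssistantTokens: 8,
		ToolsTokens:     40,
		PerTool:         map[string]int{"get_weather": 25, "search": 15},
	}
}

func main() {
	fmt.Println(checkInvariants(sample(), 3, 4))
}
```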

type TokenCountRequest added in v0.24.0

type TokenCountRequest struct {
	// Model is the model ID to count tokens for (e.g. "gpt-4o", "claude-sonnet-4-5").
	// Required — returns an error if empty.
	Model    string
	Messages Messages
	Tools    []ToolDefinition
}

TokenCountRequest is the input to TokenCounter.CountTokens. Model is required — providers use it to select the correct BPE encoding.

type TokenCounter added in v0.24.0

type TokenCounter interface {
	CountTokens(ctx context.Context, req TokenCountRequest) (*TokenCount, error)
}

TokenCounter is an optional interface providers may implement to estimate token usage before sending a request.

All implementations in this codebase are local/offline — no network call is made. Counts should be treated as estimates; accuracy varies by provider:

  • OpenAI: exact (tiktoken matches the API tokenizer)
  • OpenRouter: approximate (tiktoken, best-effort model prefix matching)
  • Anthropic: approximate (cl100k_base, ±5-10% for English; tokenizer not public)
  • Bedrock: approximate (same as Anthropic)
  • Ollama: approximate (cl100k_base; no public tokenize endpoint)

Usage:

if tc, ok := provider.(llm.TokenCounter); ok {
    count, err := tc.CountTokens(ctx, llm.TokenCountRequest{
        Model:    "gpt-4o",
        Messages: messages,
        Tools:    tools,
    })
    if err == nil && count.InputTokens > maxTokens {
        return fmt.Errorf("request too large: %d tokens (limit %d)", count.InputTokens, maxTokens)
    }
}

type ToolCall

type ToolCall struct {
	ID        string
	Name      string
	Arguments map[string]any
}

ToolCall represents a request from the LLM to invoke a tool.

func (ToolCall) MarshalJSON added in v0.5.0

func (tc ToolCall) MarshalJSON() ([]byte, error)

func (*ToolCall) UnmarshalJSON added in v0.5.0

func (tc *ToolCall) UnmarshalJSON(data []byte) error

func (ToolCall) Validate added in v0.5.0

func (tc ToolCall) Validate() error

type ToolCallResult

type ToolCallResult struct {
	ToolCallID string
	Output     string
	IsError    bool
	CacheHint  *CacheHint
}

ToolCallResult contains the result of executing a tool call.

func (*ToolCallResult) MarshalJSON added in v0.5.0

func (m *ToolCallResult) MarshalJSON() ([]byte, error)

func (*ToolCallResult) Role added in v0.5.0

func (m *ToolCallResult) Role() Role

func (*ToolCallResult) Validate added in v0.5.0

func (m *ToolCallResult) Validate() error

type ToolChoice added in v0.6.0

type ToolChoice interface {
	// contains filtered or unexported methods
}

ToolChoice controls whether and which tools the model should call.

type ToolChoiceAuto added in v0.6.0

type ToolChoiceAuto struct{}

ToolChoiceAuto lets the model decide whether to call tools. This is the default behavior when ToolChoice is nil.

type ToolChoiceNone added in v0.6.0

type ToolChoiceNone struct{}

ToolChoiceNone prevents the model from calling any tools.

type ToolChoiceRequired added in v0.6.0

type ToolChoiceRequired struct{}

ToolChoiceRequired forces the model to call at least one tool.

type ToolChoiceTool added in v0.6.0

type ToolChoiceTool struct {
	Name string
}

ToolChoiceTool forces the model to call a specific tool by name.

type ToolDefinition

type ToolDefinition struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Parameters  map[string]any `json:"parameters"`
}

ToolDefinition describes a tool that the model can invoke. This is used when sending tools to a provider's API.

func ToolDefinitionFor

func ToolDefinitionFor[T any](name, description string) ToolDefinition

ToolDefinitionFor creates a ToolDefinition from a Go struct type using reflection. The struct's fields are converted to a JSON Schema that describes the tool's parameters.

Field tags:

  • `json:"fieldName"` - Sets the parameter name (required)
  • `jsonschema:"description=..."` - Describes the parameter
  • `jsonschema:"required"` - Marks the parameter as required
  • `jsonschema:"enum=val1,enum=val2"` - Restricts to specific values

Example:

type GetWeatherParams struct {
    Location string `json:"location" jsonschema:"description=City name,required"`
    Unit     string `json:"unit" jsonschema:"description=Temperature unit,enum=celsius,enum=fahrenheit"`
}

tool := ToolDefinitionFor[GetWeatherParams]("get_weather", "Get current weather")
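To make the tag handling concrete, here is a heavily simplified reflection-based schema builder. `schemaFor` and `weatherSchemaJSON` are hypothetical and not the library's implementation: the sketch handles only string fields of a struct type, supports just the tag forms listed above, and would mis-split a description that contains a comma.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

type GetWeatherParams struct {
	Location string `json:"location" jsonschema:"description=City name,required"`
	Unit     string `json:"unit" jsonschema:"description=Temperature unit,enum=celsius,enum=fahrenheit"`
}

// schemaFor builds a JSON Schema object for the struct type T.
// Every field is treated as a string; other kinds are out of scope here.
func schemaFor[T any]() map[string]any {
	var zero T
	t := reflect.TypeOf(zero)
	props := map[string]any{}
	required := []string{}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name := strings.Split(f.Tag.Get("json"), ",")[0]
		if name == "" || name == "-" {
			continue // the json tag supplies the parameter name
		}
		prop := map[string]any{"type": "string"}
		var enum []string
		for _, part := range strings.Split(f.Tag.Get("jsonschema"), ",") {
			switch {
			case part == "required":
				required = append(required, name)
			case strings.HasPrefix(part, "description="):
				prop["description"] = strings.TrimPrefix(part, "description=")
			case strings.HasPrefix(part, "enum="):
				enum = append(enum, strings.TrimPrefix(part, "enum="))
			}
		}
		if len(enum) > 0 {
			prop["enum"] = enum
		}
		props[name] = prop
	}
	return map[string]any{"type": "object", "properties": props, "required": required}
}

// weatherSchemaJSON renders the schema for GetWeatherParams.
func weatherSchemaJSON() string {
	b, err := json.Marshal(schemaFor[GetWeatherParams]())
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(weatherSchemaJSON())
}
```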

func (ToolDefinition) Validate added in v0.8.0

func (t ToolDefinition) Validate() error

Validate checks that the tool definition is valid.

type ToolDispatcher added in v0.23.0

type ToolDispatcher int

ToolDispatcher controls how tool calls are executed when multiple tools are emitted in a single response.

const (
	// ToolDispatchSync executes tool handlers one at a time in emission order.
	// This is the default.
	ToolDispatchSync ToolDispatcher = iota

	// ToolDispatchAsync executes all tool handlers concurrently, one goroutine
	// per tool call. Results are collected in emission order.
	ToolDispatchAsync
)
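
The difference between the two modes can be sketched as follows. This is an illustration of the general pattern, not the library's dispatcher: handler, runSync, and runAsync are hypothetical names. The async variant writes each result into the slot matching its emission index, so the output order does not depend on which goroutine finishes first.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// handler is a hypothetical stand-in for a tool handler.
type handler func(arg string) string

// runSync calls the handler once per argument, in emission order.
func runSync(h handler, args []string) []string {
	results := make([]string, len(args))
	for i, a := range args {
		results[i] = h(a)
	}
	return results
}

// runAsync starts one goroutine per call. Each goroutine writes only to its
// own slice element, so no further locking is needed for the results.
func runAsync(h handler, args []string) []string {
	results := make([]string, len(args))
	var wg sync.WaitGroup
	for i, a := range args {
		wg.Add(1)
		go func(i int, a string) {
			defer wg.Done()
			results[i] = h(a)
		}(i, a)
	}
	wg.Wait()
	return results
}

func main() {
	args := []string{"a", "b", "c"}
	fmt.Println(runSync(strings.ToUpper, args))
	fmt.Println(runAsync(strings.ToUpper, args))
}
```

Concurrent execution only pays off when handlers are independent of each other; handlers that share state would need their own synchronization.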

type ToolHandler added in v0.23.0

type ToolHandler interface {
	// ToolName returns the name this handler is registered for.
	ToolName() string
	// Handle executes the tool call and returns its output as a string.
	// The string is stored verbatim as the ToolCallResult content.
	Handle(ctx context.Context, call ToolCall) (string, error)
}

ToolHandler is a self-describing executor for a single tool. It knows its own name (used for registration) and can execute a raw ToolCall.

func NewToolHandler added in v0.23.0

func NewToolHandler[In, Out any](name string, fn func(ctx context.Context, in In) (*Out, error)) ToolHandler

NewToolHandler creates a named ToolHandler from a strongly-typed function without requiring a ToolSpec. Use this when you don't need schema validation or when the spec is defined elsewhere.

Example:

proc.HandleTool(llm.NewToolHandler("get_weather", func(ctx context.Context, in GetWeatherParams) (*GetWeatherResult, error) {
    return &GetWeatherResult{Temp: 22}, nil
}))
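
Conceptually, such a handler bridges a raw call and a typed function: decode the arguments, call the function, encode the result as a string. The sketch below shows that idea under stated assumptions. rawCall is a hypothetical stand-in for ToolCall (the real field set may differ), adapt and demo are illustrative names, and JSON is assumed as the encoding on both sides.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

// rawCall is a hypothetical stand-in for llm.ToolCall.
type rawCall struct {
	Name string
	Args json.RawMessage
}

type GetWeatherParams struct {
	Location string `json:"location"`
}

type GetWeatherResult struct {
	Temp int `json:"temp"`
}

// adapt wraps a typed function so it accepts a raw call and returns a string.
func adapt[In, Out any](fn func(ctx context.Context, in In) (*Out, error)) func(context.Context, rawCall) (string, error) {
	return func(ctx context.Context, call rawCall) (string, error) {
		var in In
		if err := json.Unmarshal(call.Args, &in); err != nil {
			return "", fmt.Errorf("decode %s arguments: %w", call.Name, err)
		}
		out, err := fn(ctx, in)
		if err != nil {
			return "", err
		}
		b, err := json.Marshal(out)
		if err != nil {
			return "", fmt.Errorf("encode %s result: %w", call.Name, err)
		}
		return string(b), nil
	}
}

// demo runs a fixed weather handler against the given JSON arguments.
func demo(args string) (string, error) {
	h := adapt(func(ctx context.Context, in GetWeatherParams) (*GetWeatherResult, error) {
		return &GetWeatherResult{Temp: 22}, nil
	})
	return h(context.Background(), rawCall{Name: "get_weather", Args: json.RawMessage(args)})
}

func main() {
	out, err := demo(`{"location":"Berlin"}`)
	fmt.Println(out, err)
}
```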

type ToolSet

type ToolSet struct {
	// contains filtered or unexported fields
}

ToolSet manages a collection of tool specifications. It provides tool definitions for sending to providers and parses raw tool calls into strongly-typed results with validation.

func NewToolSet

func NewToolSet(tools ...toolRegistration) *ToolSet

NewToolSet creates a ToolSet from one or more tool specs.

Example:

tools := NewToolSet(
    NewToolSpec[GetWeatherParams]("get_weather", "Get weather"),
    NewToolSpec[SearchParams]("search", "Search the web"),
)

func (*ToolSet) Definitions

func (ts *ToolSet) Definitions() []ToolDefinition

Definitions returns all tool definitions for sending to providers.

func (*ToolSet) Parse

func (ts *ToolSet) Parse(calls []ToolCall) ([]ParsedToolCall, error)

Parse converts raw ToolCalls (from stream events) into typed ParsedToolCalls. Each tool call's arguments are validated against its JSON Schema before parsing.

Successfully parsed calls are always returned. Errors from unknown tool names and from validation or parse failures are collected and returned as a single joined error. That error is non-fatal: the returned slice still contains every call that parsed successfully.

Example:

calls, err := tools.Parse(rawToolCalls)
if err != nil {
    log.Printf("parse warnings: %v", err)
}
for _, call := range calls {
    switch c := call.(type) {
    case *TypedToolCall[GetWeatherParams]:
        fmt.Println(c.Params.Location)
    }
}
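
The "keep the successes, join the failures" contract can be sketched with the standard library. This is only an illustration of the pattern: parseAll is a hypothetical helper that converts strings to integers, and the use of errors.Join (Go 1.20+) is an assumption, since the documentation does not say how the library joins its errors.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseAll converts each input to an int. Successful conversions are always
// returned; failures are collected and joined into one error.
func parseAll(inputs []string) ([]int, error) {
	var parsed []int
	var errs []error
	for _, in := range inputs {
		n, err := strconv.Atoi(in)
		if err != nil {
			errs = append(errs, fmt.Errorf("input %q: %w", in, err))
			continue
		}
		parsed = append(parsed, n)
	}
	// errors.Join returns nil when there are no errors to join.
	return parsed, errors.Join(errs...)
}

func main() {
	vals, err := parseAll([]string{"1", "x", "3"})
	fmt.Println(vals)
	if err != nil {
		fmt.Println("warnings:", err)
	}
}
```

Callers that treat any failure as fatal can still check the error first; callers that tolerate partial results can log it and continue with the returned slice.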

type ToolSpec

type ToolSpec[T any] struct {
	// contains filtered or unexported fields
}

ToolSpec is a type-safe tool specification that pairs a tool name/description with a Go struct that defines the parameter schema. It includes a compiled JSON Schema for runtime validation.

func NewToolSpec

func NewToolSpec[T any](name, description string) *ToolSpec[T]

NewToolSpec creates a typed tool specification from a parameter struct. The struct's fields define the JSON Schema for the tool's parameters. The supported field tags are the same as for ToolDefinitionFor: json and jsonschema.

Example:

type GetWeatherParams struct {
    Location string `json:"location" jsonschema:"description=City name,required"`
}
spec := NewToolSpec[GetWeatherParams]("get_weather", "Get current weather")

func (*ToolSpec[T]) Definition

func (s *ToolSpec[T]) Definition() ToolDefinition

Definition returns the ToolDefinition for sending to providers.

type TypedToolCall

type TypedToolCall[T any] struct {
	ID     string // Original tool call ID (for sending results back)
	Name   string // Tool name
	Params T      // Parsed, validated parameters
}

TypedToolCall holds a parsed tool call with strongly-typed parameters.

func (*TypedToolCall[T]) ToolCallID

func (c *TypedToolCall[T]) ToolCallID() string

ToolCallID returns the tool call ID.

func (*TypedToolCall[T]) ToolName

func (c *TypedToolCall[T]) ToolName() string

ToolName returns the tool name.

type Usage

type Usage struct {
	// InputTokens is the total number of input tokens processed, including
	// tokens served from cache (CacheReadTokens) and tokens written to cache
	// (CacheWriteTokens). Callers can use this as the single "how many input
	// tokens did this request consume" figure.
	InputTokens int `json:"input_tokens"`

	// OutputTokens is the number of tokens generated in the response.
	OutputTokens int `json:"output_tokens"`

	// TotalTokens is InputTokens + OutputTokens.
	TotalTokens int `json:"total_tokens"`

	// Cost is the total request cost in USD.
	// For Anthropic, Bedrock, and OpenAI this is locally calculated from
	// provider pricing tables and equals the sum of the breakdown fields below.
	// For OpenRouter this is API-reported by the proxy (already includes cache pricing).
	Cost float64 `json:"cost"`

	// Detailed token breakdown (provider-specific, may be zero).
	CacheReadTokens  int `json:"cache_read_tokens,omitempty"`  // Input tokens served from an existing cache entry (all providers).
	CacheWriteTokens int `json:"cache_write_tokens,omitempty"` // Input tokens written to a new cache entry (Anthropic, Bedrock).
	ReasoningTokens  int `json:"reasoning_tokens,omitempty"`   // Output tokens consumed by model reasoning (e.g. extended thinking).

	// Granular cost breakdown in USD (zero if provider/model pricing is unknown).
	// Sum of InputCost + CacheReadCost + CacheWriteCost + OutputCost == Cost.
	// Not populated for OpenRouter (API-reported cost is used instead).
	//
	// InputCost covers only the non-cached, non-write portion:
	// InputTokens - CacheReadTokens - CacheWriteTokens tokens at the regular input rate.
	InputCost      float64 `json:"input_cost,omitempty"`       // Cost of non-cached, non-write input tokens.
	CacheReadCost  float64 `json:"cache_read_cost,omitempty"`  // Cost of cache-read tokens.
	CacheWriteCost float64 `json:"cache_write_cost,omitempty"` // Cost of cache-write tokens.
	OutputCost     float64 `json:"output_cost,omitempty"`      // Cost of output tokens.
}

Usage holds token counts and cost from a provider response.
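
Because InputTokens already includes cached tokens, the regular-rate portion has to be derived by subtraction. The sketch below works through that arithmetic. The usage struct mirrors only the relevant token fields, and the rates type, its per-million-token prices, and the breakdown helper are hypothetical; the library's own pricing tables are not shown here.

```go
package main

import "fmt"

// usage mirrors the token fields of Usage that matter for the breakdown.
type usage struct {
	InputTokens      int
	CacheReadTokens  int
	CacheWriteTokens int
	OutputTokens     int
}

// rates holds hypothetical USD prices per million tokens.
type rates struct{ Input, CacheRead, CacheWrite, Output float64 }

// costs mirrors the granular cost fields plus their total.
type costs struct{ Input, CacheRead, CacheWrite, Output, Total float64 }

// regularInputTokens returns the input tokens billed at the regular rate.
func regularInputTokens(u usage) int {
	return u.InputTokens - u.CacheReadTokens - u.CacheWriteTokens
}

// breakdown prices each token class separately and sums the parts.
func breakdown(u usage, r rates) costs {
	const perMillion = 1e6
	c := costs{
		Input:      float64(regularInputTokens(u)) * r.Input / perMillion,
		CacheRead:  float64(u.CacheReadTokens) * r.CacheRead / perMillion,
		CacheWrite: float64(u.CacheWriteTokens) * r.CacheWrite / perMillion,
		Output:     float64(u.OutputTokens) * r.Output / perMillion,
	}
	c.Total = c.Input + c.CacheRead + c.CacheWrite + c.Output
	return c
}

func main() {
	u := usage{InputTokens: 1000, CacheReadTokens: 600, CacheWriteTokens: 100, OutputTokens: 200}
	r := rates{Input: 3, CacheRead: 0.3, CacheWrite: 3.75, Output: 15} // made-up prices
	fmt.Println("regular input tokens:", regularInputTokens(u))
	fmt.Printf("%+v\n", breakdown(u, r))
}
```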

type UserMsg added in v0.5.0

type UserMsg struct {
	Content   string
	CacheHint *CacheHint
}

UserMsg contains user input.

func (*UserMsg) MarshalJSON added in v0.5.0

func (m *UserMsg) MarshalJSON() ([]byte, error)

func (*UserMsg) Role added in v0.5.0

func (m *UserMsg) Role() Role

func (*UserMsg) Validate added in v0.5.0

func (m *UserMsg) Validate() error

Directories

Path Synopsis
cmd
llmcli command
llmcli is a command-line tool for testing LLM providers.
llmcli/cmds
Package cmds provides CLI commands for llmcli.
llmcli/store
Package store provides token storage implementations.
Package llmtest provides helpers for testing code that consumes llm.StreamEvent channels, following the convention of packages like net/http/httptest.
Package modeldb provides access to the models.dev model database.
provider
anthropic/claude
Package claude provides an Anthropic provider using Claude OAuth tokens.
auto
Package auto provides zero-config multi-provider setup for LLM providers.
minimax
Package minimax provides a MiniMax LLM provider using the Anthropic-compatible API.
Package tokencount provides a shared offline tiktoken wrapper for LLM token estimation.
