llm

package module
v0.40.0 Latest
Published: Apr 19, 2026 License: MIT Imports: 25 Imported by: 0

README

LLM Provider Abstraction Library

A unified Go library for interacting with multiple LLM providers through a consistent interface. Supports streaming responses, tool calling, reasoning, prompt caching, and zero-config multi-provider setup.

Features

  • Unified backend interface — real backends implement llm.Provider
  • Service runtime — llm.Service resolves model strings, applies intent aliases, and performs fallback
  • Streaming support — channel-based streaming with structured event envelopes
  • Tool calling — consistent tool/function calling across providers
  • Reasoning support — Anthropic, OpenAI reasoning models, Bedrock-compatible providers
  • Prompt caching — transparent cache control where supported
  • Zero-config setup — auto builds a ready-to-use *llm.Service
  • Model catalog integration — catalog-backed model resolution, aliases, and preference-aware routing

Supported Providers

Provider             Name        Description
Anthropic API        anthropic   Direct Anthropic API with API key
Claude OAuth         claude      OAuth-based Claude access
OpenAI               openai      OpenAI GPT models
AWS Bedrock          bedrock     AWS Bedrock models
MiniMax              minimax     MiniMax models via Anthropic-compatible API
Ollama               ollama      Local Ollama models
OpenRouter           openrouter  OpenRouter proxy
Docker Model Runner  dockermr    Local Docker model runtime

Installation

go get github.com/codewandler/llm

Quick Start

package main

import (
    "context"
    "fmt"

    "github.com/codewandler/llm"
    "github.com/codewandler/llm/auto"
)

func main() {
    ctx := context.Background()

    svc, err := auto.New(ctx)
    if err != nil {
        panic(err)
    }

    stream, err := svc.CreateStream(ctx, llm.Request{
        Model: "default",
        Messages: llm.Messages{
            llm.User("What is the capital of France?"),
        },
    })
    if err != nil {
        panic(err)
    }

    for ev := range stream {
        switch ev.Type {
        case llm.StreamEventDelta:
            if d, ok := ev.Data.(*llm.DeltaEvent); ok {
                fmt.Print(d.Text)
            }
        case llm.StreamEventCompleted:
            fmt.Println()
        case llm.StreamEventError:
            fmt.Printf("error: %v\n", ev.Data)
        }
    }
}

Service-first API

llm.New(opts...) builds the main orchestration runtime.

svc, err := llm.New(
    llm.WithAutoDetect(),
)

Or register providers explicitly:

svc, err := llm.New(
    llm.WithProvider(openai.New(llm.APIKeyFromEnv("OPENAI_API_KEY"))),
    llm.WithProviderNamed("work", anthropic.New(llm.APIKeyFromEnv("ANTHROPIC_API_KEY"))),
    llm.WithIntentAlias("fast", llm.IntentSelector{Model: "openai/gpt-4o-mini"}),
)

Model reference styles

Recommended reference ladder:

  • fast, default, powerful — intent aliases
  • provider/model — preferred explicit form, e.g. openai/gpt-4o
  • instance/provider/model — exact instance targeting, e.g. work/anthropic/claude-sonnet-4-6
  • bare IDs only when convenient and unambiguous

auto

auto is now a convenience layer over llm.New(...).

import "github.com/codewandler/llm/auto"

svc, err := auto.New(ctx)

svc, err := auto.New(ctx,
    auto.WithAnthropic(),
    auto.WithOpenAI(),
    auto.WithBedrock(),
    auto.WithClaudeLocal(),
)

svc, err := auto.New(ctx,
    auto.WithOpenAI(),
    auto.WithGlobalAlias("review", "openai/gpt-4o"),
)

Direct provider usage

Real backends still implement llm.Provider directly:

p := openai.New(llm.APIKeyFromEnv("OPENAI_API_KEY"))
stream, err := p.CreateStream(ctx, llm.Request{
    Model: "gpt-4o",
    Messages: llm.Messages{llm.User("Hello")},
})

Streams and events

Streams are llm.Stream (<-chan llm.Envelope). Common event types include:

  • StreamEventStarted
  • StreamEventTokenEstimate
  • StreamEventDelta
  • StreamEventToolCall
  • StreamEventUsageUpdated
  • StreamEventCompleted
  • StreamEventError
  • StreamEventRequest

Use llm.NewEventProcessor(ctx, stream) for high-level consumption.

Tool calling

Type-safe tools are built with github.com/codewandler/llm/tool.

spec := tool.NewSpec[GetWeatherParams]("get_weather", "Get current weather")

result := llm.NewEventProcessor(ctx, stream).
    HandleTool(tool.Handle(spec, func(ctx context.Context, p GetWeatherParams) (*GetWeatherResult, error) {
        return doWeather(p.Location, p.Unit)
    })).
    Result()

Architecture

llm/
├── service.go              # llm.Service, llm.New(...), service options
├── request.go              # Request, validation, effort, thinking, api type
├── request_builder.go      # Buildable + RequestBuilder
├── event.go                # Envelope, event types, stream model
├── event_processor.go      # Stream consumption helper
├── msg/                    # Canonical message model
├── tool/                   # Tool definitions and typed dispatch
├── usage/                  # Pricing and usage tracking
├── tokencount/             # Token estimation
├── internal/modelcatalog/  # Built-in catalog loading + canonicalization
├── internal/modelview/     # Catalog projections and visible-model views
├── internal/providerregistry/ # Provider detect/build registry
├── auto/                   # Convenience service builder
└── provider/
    ├── anthropic/
    ├── bedrock/
    ├── codex/
    ├── dockermr/
    ├── fake/
    ├── minimax/
    ├── ollama/
    ├── openai/
    └── openrouter/

CLI

go run ./cmd/llmcli infer "Hello"
go run ./cmd/llmcli infer -v -m default "Explain Go channels"

Contributing

go test ./...
go fmt ./...
go vet ./...

Documentation

Index

Constants

const (
	ProviderNameAnthropic = "anthropic"
	ProviderNameClaude    = "claude"
	ProviderNameBedrock   = "bedrock"
	ProviderNameCodex     = "codex"
	ProviderNameOllama    = "ollama"
	// ProviderNameDockerMR is the identifier for the Docker Model Runner provider.
	// Docker Model Runner is built into Docker Desktop 4.40+ and available as a
	// plugin for Docker Engine on Linux. No API key is required.
	ProviderNameDockerMR   = "dockermr"
	ProviderNameOpenAI     = "openai"
	ProviderNameOpenRouter = "openrouter"
)

Provider name constants used in ProviderError.Provider.

const (
	RoleSystem    = msg.RoleSystem
	RoleUser      = msg.RoleUser
	RoleAssistant = msg.RoleAssistant
	RoleTool      = msg.RoleTool
	RoleDeveloper = msg.RoleDeveloper

	AssistantPhaseCommentary  = msg.AssistantPhaseCommentary
	AssistantPhaseFinalAnswer = msg.AssistantPhaseFinalAnswer

	// Cache TTL convenience aliases.
	CacheTTL5m = msg.CacheTTL5m
	CacheTTL1h = msg.CacheTTL1h
)

const (
	ModelDefault  = "default"
	ModelFast     = "fast"
	ModelPowerful = "powerful"
)

Variables

var (
	// ErrContextCancelled is returned when the caller's context is cancelled
	// while a stream is in progress.
	ErrContextCancelled = errors.New("context cancelled")

	// ErrRequestFailed is returned when the HTTP transport fails before a
	// response is received (e.g. network error, DNS failure).
	ErrRequestFailed = errors.New("request failed")

	// ErrAPIError is returned when the provider API responds with a non-2xx
	// HTTP status. The ProviderError carries StatusCode and ResponseBody.
	ErrAPIError = errors.New("API error")

	// ErrStreamRead is returned when reading or scanning the response stream
	// fails at the I/O level (e.g. scanner error, connection reset).
	ErrStreamRead = errors.New("stream read error")

	// ErrStreamDecode is returned when a stream chunk cannot be decoded
	// (e.g. malformed JSON in an SSE data line).
	ErrStreamDecode = errors.New("stream decode error")

	// ErrProviderError is returned when the provider sends an explicit
	// error inside the stream (e.g. Anthropic error event, OpenRouter
	// chunk-level error).
	ErrProviderError = errors.New("provider error")

	// ErrMissingAPIKey is returned when a provider requires an API key
	// but none has been configured.
	ErrMissingAPIKey = errors.New("missing API key")

	// ErrBuildRequest is returned when serialising the outgoing request
	// fails before it is sent.
	ErrBuildRequest = errors.New("build request error")

	// ErrUnknownModel is returned when a model ID or alias cannot be resolved.
	ErrUnknownModel = errors.New("unknown model")

	// ErrNoProviders is returned when no providers are configured or all
	// failover targets have been exhausted.
	ErrNoProviders = errors.New("no providers configured")

	// ErrUnknown is used to wrap any error that is not already a ProviderError.
	// Callers can test for it with errors.Is(err, llm.ErrUnknown).
	ErrUnknown = errors.New("unknown error")
)

Sentinel errors for use with errors.Is. Each ProviderError wraps one of these so callers can inspect the error kind without string matching.

Functions

func DefaultHttpClient added in v0.23.0

func DefaultHttpClient() *http.Client

DefaultHttpClient returns the shared default HTTP client. It is safe for concurrent use and is reused across all providers that do not supply their own client.

func IsRetriableHTTPStatus added in v0.36.0

func IsRetriableHTTPStatus(code int) bool

IsRetriableHTTPStatus reports whether an HTTP status code indicates a transient condition where retrying with a different provider may succeed. Providers use this to decide whether to surface an HTTP error through the event stream (non-retriable) or return it as an error from CreateStream (retriable, so the router can fail over to the next target).

func NewHttpClient added in v0.23.0

func NewHttpClient(opts HttpClientOpts) *http.Client

NewHttpClient creates a new *http.Client with sensible defaults for LLM provider use. The client has no top-level Timeout — LLM streams can be arbitrarily long and are cancelled via context. Transport-level timeouts guard against stalled connections at the TCP/TLS layer.

When opts.Logger is non-nil, every request and response is logged at Debug level. Set opts.Debug = true to also include headers and bodies. Response bodies are tee-logged as they stream, so nothing is buffered and SSE framing is not broken.

Types

type ApiType added in v0.40.0

type ApiType string

ApiType identifies a wire protocol for LLM API requests. Used as a hint on Request.ApiTypeHint and as the resolved value on RequestEvent.ResolvedApiType.

const (
	// ApiTypeAuto is the zero value. The provider selects the best API.
	ApiTypeAuto ApiType = ""
	// ApiTypeOpenAIChatCompletion selects the OpenAI Chat Completions API (/v1/chat/completions).
	ApiTypeOpenAIChatCompletion ApiType = "openai-chat"
	// ApiTypeOpenAIResponses selects the OpenAI Responses API (/v1/responses).
	// Required for models that use the phase field (gpt-5.3-codex, gpt-5.4-*).
	ApiTypeOpenAIResponses ApiType = "openai-responses"
	// ApiTypeAnthropicMessages selects the Anthropic Messages API (/v1/messages).
	// Provides native cache_control, thinking blocks, and anthropic-beta headers.
	ApiTypeAnthropicMessages ApiType = "anthropic-messages"
)

func (ApiType) MarshalText added in v0.40.0

func (t ApiType) MarshalText() ([]byte, error)

MarshalText maps the zero value (ApiTypeAuto = "") to the user-visible string "auto", matching the ThinkingMode convention.

func (*ApiType) UnmarshalText added in v0.40.0

func (t *ApiType) UnmarshalText(b []byte) error

UnmarshalText maps "auto" → ApiTypeAuto (the zero value ""). An empty input is also accepted as ApiTypeAuto. Shortforms are normalized to their full names:

"chat" → "openai-chat"
"responses" → "openai-responses"
"messages" → "anthropic-messages"

func (ApiType) Valid added in v0.40.0

func (t ApiType) Valid() bool

Valid returns true if t is a known constant or the zero value (auto).
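
The text mapping described above can be sketched with a local stand-in type. This is a minimal illustration under stated assumptions, not the package's implementation: the shortforms map and the roundTrip helper are hypothetical names, and rejecting unknown strings with an error is an assumption the documentation does not confirm.

```go
package main

import "fmt"

// ApiType is a local stand-in mirroring the documented type.
type ApiType string

const (
	ApiTypeAuto                 ApiType = ""
	ApiTypeOpenAIChatCompletion ApiType = "openai-chat"
	ApiTypeOpenAIResponses      ApiType = "openai-responses"
	ApiTypeAnthropicMessages    ApiType = "anthropic-messages"
)

// shortforms (hypothetical name) lists the documented shorthand spellings.
var shortforms = map[string]ApiType{
	"chat":      ApiTypeOpenAIChatCompletion,
	"responses": ApiTypeOpenAIResponses,
	"messages":  ApiTypeAnthropicMessages,
}

// Valid reports whether t is a known constant or the zero value.
func (t ApiType) Valid() bool {
	switch t {
	case ApiTypeAuto, ApiTypeOpenAIChatCompletion, ApiTypeOpenAIResponses, ApiTypeAnthropicMessages:
		return true
	}
	return false
}

// MarshalText renders the zero value as "auto" and every other value as-is.
func (t ApiType) MarshalText() ([]byte, error) {
	if t == ApiTypeAuto {
		return []byte("auto"), nil
	}
	return []byte(t), nil
}

// UnmarshalText accepts "", "auto", the shortforms, and the full names.
// Assumption: anything else is rejected with an error.
func (t *ApiType) UnmarshalText(b []byte) error {
	s := string(b)
	if s == "" || s == "auto" {
		*t = ApiTypeAuto
		return nil
	}
	if full, ok := shortforms[s]; ok {
		*t = full
		return nil
	}
	if v := ApiType(s); v.Valid() {
		*t = v
		return nil
	}
	return fmt.Errorf("unknown api type %q", s)
}

// roundTrip (hypothetical helper) parses s and renders it back.
func roundTrip(s string) (string, bool) {
	var t ApiType
	if err := t.UnmarshalText([]byte(s)); err != nil {
		return "", false
	}
	out, _ := t.MarshalText()
	return string(out), true
}

func main() {
	for _, in := range []string{"", "auto", "chat", "anthropic-messages", "bogus"} {
		out, ok := roundTrip(in)
		fmt.Printf("%q -> %q (ok=%v)\n", in, out, ok)
	}
}
```

Because both methods follow the encoding.TextMarshaler and encoding.TextUnmarshaler shapes, a type like this also works as a JSON string field or as a flag value without extra glue.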

type AssistantPhase added in v0.40.0

type AssistantPhase = msg.AssistantPhase

type Buildable added in v0.38.0

type Buildable interface {
	BuildRequest(ctx context.Context) (Request, error)
}

Buildable is implemented by any value that can produce a Request for streaming. Callers may pass either a fully-constructed Request or a *RequestBuilder to CreateStream — both satisfy this interface.

type CacheHint added in v0.20.0

type CacheHint = msg.CacheHint

type CacheOpt added in v0.38.0

type CacheOpt = msg.CacheOpt

CacheOpt and CacheTTL are re-exported from the msg package so callers using RequestBuilder do not need to import msg directly.

type CacheTTL added in v0.38.0

type CacheTTL = msg.CacheTTL

type CompletedEvent added in v0.26.0

type CompletedEvent struct {
	StopReason StopReason `json:"stop_reason"`
}

func (CompletedEvent) Type added in v0.26.0

func (e CompletedEvent) Type() EventType

type ContentPartEvent added in v0.29.0

type ContentPartEvent struct {
	Part  msg.Part `json:"part"`
	Index int      `json:"index"`
}

ContentPartEvent is emitted once per content block when the provider signals block completion (content_block_stop). Index is the position of this block in the model's original output array — required to preserve the exact interleaving order of text and thinking blocks when re-serializing the assistant message.

func (ContentPartEvent) Type added in v0.29.0

func (e ContentPartEvent) Type() EventType

type DebugEvent added in v0.26.0

type DebugEvent struct {
	Message string `json:"message,omitempty"`
	Data    any    `json:"data,omitempty"`
}

func (DebugEvent) Type added in v0.26.0

func (e DebugEvent) Type() EventType

type DeltaEvent added in v0.26.0

type DeltaEvent struct {
	// Kind identifies which payload field is set.
	Kind DeltaKind `json:"kind"`

	// Index is the position of this content block in the model's output array.
	// nil when the provider does not supply block-level indexing.
	//
	// Index is meaningful because a single HTTP response can contain multiple
	// blocks of the same type. With Anthropic's interleaved-thinking beta, a
	// single response may produce: thinking(0) → text(1) → tool(2) → thinking(3) → text(4).
	// Without Index a consumer cannot tell which thinking or text block a delta
	// belongs to.
	//
	// Provider semantics:
	//   Anthropic          — content_block index, all block types
	//   Bedrock            — ContentBlockIndex, all block types
	//   OpenAI Responses   — output_index, all output types
	//   OpenAI Completions — tool_calls[].index, tool calls only; text=nil
	//   OpenRouter         — tool_calls[].index, tool calls only; text=nil
	//   Ollama             — nil (complete tool calls only, no streaming fragments)
	Index *uint32 `json:"index,omitempty"`

	// Text is populated for DeltaKindText.
	Text string `json:"text,omitempty"`

	// Thinking is populated for DeltaKindThinking.
	Thinking string `json:"thinking,omitempty"`

	ToolDeltaPart
}

DeltaEvent carries one incremental content chunk from the model stream. Exactly one payload field is populated, indicated by Kind.
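
To illustrate why Index matters, here is a minimal sketch that reassembles deltas into per-block strings. The DeltaEvent below is a trimmed local stand-in for the documented struct, and block, assemble, and idx are hypothetical names. Skipping deltas that have no Index is a choice made for this sketch only.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type DeltaKind string

const (
	DeltaKindText     DeltaKind = "text"
	DeltaKindThinking DeltaKind = "thinking"
)

// DeltaEvent is a trimmed local stand-in for the documented struct.
type DeltaEvent struct {
	Kind     DeltaKind
	Index    *uint32
	Text     string
	Thinking string
}

// block (hypothetical type) is one reassembled content block.
type block struct {
	Index uint32
	Kind  DeltaKind
	Body  string
}

// idx returns a pointer to i, convenient for building test deltas.
func idx(i uint32) *uint32 { return &i }

// assemble groups deltas by Index and returns the blocks in index order.
// Deltas without an Index are skipped; a real consumer needs its own policy.
func assemble(deltas []DeltaEvent) []block {
	bodies := map[uint32]*strings.Builder{}
	kinds := map[uint32]DeltaKind{}
	for _, d := range deltas {
		if d.Index == nil {
			continue
		}
		i := *d.Index
		if bodies[i] == nil {
			bodies[i] = &strings.Builder{}
			kinds[i] = d.Kind
		}
		switch d.Kind {
		case DeltaKindText:
			bodies[i].WriteString(d.Text)
		case DeltaKindThinking:
			bodies[i].WriteString(d.Thinking)
		}
	}
	out := make([]block, 0, len(bodies))
	for i, b := range bodies {
		out = append(out, block{Index: i, Kind: kinds[i], Body: b.String()})
	}
	sort.Slice(out, func(a, b int) bool { return out[a].Index < out[b].Index })
	return out
}

// sample builds a thinking(0), text(1), thinking(2) sequence.
func sample() []block {
	return assemble([]DeltaEvent{
		{Kind: DeltaKindThinking, Index: idx(0), Thinking: "plan"},
		{Kind: DeltaKindText, Index: idx(1), Text: "Hel"},
		{Kind: DeltaKindText, Index: idx(1), Text: "lo"},
		{Kind: DeltaKindThinking, Index: idx(2), Thinking: "check"},
	})
}

func main() {
	for _, b := range sample() {
		fmt.Printf("%d %s %q\n", b.Index, b.Kind, b.Body)
	}
}
```

Without the index, the two thinking blocks in the sample would be concatenated into one string and their position relative to the text block would be lost.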

func TextDelta added in v0.23.0

func TextDelta(text string) *DeltaEvent

func ThinkingDelta added in v0.29.0

func ThinkingDelta(text string) *DeltaEvent

func ToolDelta added in v0.23.0

func ToolDelta(id, name, argsFragment string) *DeltaEvent

func (*DeltaEvent) Type added in v0.26.0

func (e *DeltaEvent) Type() EventType

func (*DeltaEvent) WithIndex added in v0.26.0

func (e *DeltaEvent) WithIndex(idx uint32) *DeltaEvent

type DeltaKind added in v0.26.0

type DeltaKind string

DeltaKind identifies the kind of incremental content carried by a DeltaEvent.

const (
	DeltaKindText     DeltaKind = "text"
	DeltaKindThinking DeltaKind = "thinking"
	DeltaKindTool     DeltaKind = "tool"
)

type DetectEnv added in v0.40.0

type DetectEnv struct {
	HTTPClient *http.Client
	LLMOptions []Option
}

type DetectedProvider added in v0.40.0

type DetectedProvider struct {
	Name   string
	Type   string
	Params map[string]any
	Order  int
}

type Effort added in v0.35.0

type Effort string

Effort controls how thoroughly the model works on the response. Affects thinking depth, response length, and tool call count. Universal across all providers.

const (
	// EffortUnspecified is the zero value — provider picks its default.
	EffortUnspecified Effort = ""
	// EffortLow produces fast, cheap, less thorough responses.
	EffortLow Effort = "low"
	// EffortMedium produces balanced responses.
	EffortMedium Effort = "medium"
	// EffortHigh produces thorough, slower responses.
	EffortHigh Effort = "high"
	// EffortMax produces maximum-capability responses.
	// Silently downgrades to EffortHigh on models that don't support it.
	EffortMax Effort = "max"
)

func (Effort) IsEmpty added in v0.35.0

func (e Effort) IsEmpty() bool

IsEmpty returns true when no effort has been specified.

func (Effort) MarshalText added in v0.36.0

func (e Effort) MarshalText() ([]byte, error)

func (Effort) ToBudget added in v0.35.0

func (e Effort) ToBudget(low, high int) (int, bool)

ToBudget maps this effort to a token budget in [low, high]. Used by providers that need budget_tokens (Anthropic < 4.6, Bedrock). EffortMax maps to the same budget as EffortHigh. Returns (0, false) for EffortUnspecified.
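
A minimal sketch of such a mapping, using a local stand-in type. The documentation fixes only three points: the result lies in [low, high], EffortMax shares the EffortHigh budget, and EffortUnspecified yields (0, false). Placing EffortLow at low and EffortMedium at the midpoint is an assumption made here for illustration.

```go
package main

import "fmt"

// Effort is a local stand-in mirroring the documented type.
type Effort string

const (
	EffortUnspecified Effort = ""
	EffortLow         Effort = "low"
	EffortMedium      Effort = "medium"
	EffortHigh        Effort = "high"
	EffortMax         Effort = "max"
)

// ToBudget maps an effort level to a token budget in [low, high].
// Assumption: low maps to the lower bound and medium to the midpoint.
func (e Effort) ToBudget(low, high int) (int, bool) {
	switch e {
	case EffortLow:
		return low, true
	case EffortMedium:
		return low + (high-low)/2, true
	case EffortHigh, EffortMax:
		// Documented: max uses the same budget as high.
		return high, true
	}
	// Documented for EffortUnspecified; applied to unknown values too in this sketch.
	return 0, false
}

func main() {
	for _, e := range []Effort{EffortUnspecified, EffortLow, EffortMedium, EffortHigh, EffortMax} {
		budget, ok := e.ToBudget(1024, 32000)
		fmt.Printf("%q -> %d (ok=%v)\n", string(e), budget, ok)
	}
}
```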

func (*Effort) UnmarshalText added in v0.36.0

func (e *Effort) UnmarshalText(b []byte) error

func (Effort) Valid added in v0.35.0

func (e Effort) Valid() bool

Valid returns true if the Effort is a known valid value or empty.

type Envelope added in v0.26.0

type Envelope struct {
	Type EventType `json:"type"`
	Meta EventMeta `json:"meta"`
	Data any       `json:"data,omitempty"`
}

type ErrorEvent added in v0.26.0

type ErrorEvent struct {
	Error error `json:"error"`
}

func (ErrorEvent) Type added in v0.26.0

func (e ErrorEvent) Type() EventType

type Event added in v0.26.0

type Event interface {
	Type() EventType
}

type EventHandler added in v0.26.0

type EventHandler interface {
	Handle(e Event)
}

type EventHandlerFunc added in v0.26.0

type EventHandlerFunc func(e Event)

func (EventHandlerFunc) Handle added in v0.26.0

func (h EventHandlerFunc) Handle(e Event)

type EventMeta added in v0.26.0

type EventMeta struct {
	RequestID string            `json:"request_id,omitempty"`
	Seq       uint64            `json:"seq,omitempty"`
	CreatedAt time.Time         `json:"created_at,omitempty"`
	After     time.Duration     `json:"after,omitempty"`
	TraceID   string            `json:"trace_id,omitempty"`
	Model     string            `json:"model,omitempty"`
	Attrs     map[string]string `json:"attrs,omitempty"`
}

type EventType added in v0.26.0

type EventType string

EventType identifies the kind of streaming event from a provider.

const (
	StreamEventCreated          EventType = "created"
	StreamEventClosed           EventType = "closed"
	StreamEventModelResolved    EventType = "model_resolved"
	StreamEventProviderFailover EventType = "provider_failover"
	StreamEventStarted          EventType = "started"
	StreamEventUsageUpdated     EventType = "usage"
	StreamEventTokenEstimate    EventType = "token_estimate"
	StreamEventDelta            EventType = "delta"
	StreamEventToolCall         EventType = "tool_call"
	StreamEventContentPart      EventType = "content_part"
	StreamEventCompleted        EventType = "completed"
	StreamEventError            EventType = "error"
	StreamEventDebug            EventType = "debug"
	StreamEventRequest          EventType = "request"
)

type Executor added in v0.40.0

type Executor interface {
	CreateStream(ctx context.Context, src Buildable) (Stream, error)
}

type HttpClientOpts added in v0.23.0

type HttpClientOpts struct {
	// Logger enables transport-level request/response logging at Debug level.
	// When nil, no logging is performed.
	Logger *slog.Logger

	// Debug extends logging to include request/response headers and bodies.
	// Has no effect when Logger is nil.
	Debug bool

	// TLSHandshakeTimeout is the maximum time allowed for a TLS handshake.
	// Defaults to 30 seconds if not set.
	TLSHandshakeTimeout time.Duration

	// ResponseHeaderTimeout is the maximum time to wait for response headers
	// after the request is sent. LLM APIs can be slow to respond (model loading,
	// queueing, cold starts), so this defaults to 120 seconds rather than the
	// typical HTTP client default.
	ResponseHeaderTimeout time.Duration
}

HttpClientOpts configures the HTTP client created by NewHttpClient.
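
The timeout strategy described for NewHttpClient can be approximated with the standard library alone. This is a sketch under stated assumptions, not the package's code: clientOpts, newStreamingClient, and timeouts are hypothetical names, and cloning http.DefaultTransport is a choice made here. Only the 30 second and 120 second defaults come from the field comments above.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// clientOpts is a local stand-in for the timeout fields of HttpClientOpts.
type clientOpts struct {
	TLSHandshakeTimeout   time.Duration
	ResponseHeaderTimeout time.Duration
}

// newStreamingClient builds a client with no overall Timeout, so long-lived
// streams are bounded only by the request context, while transport-level
// timeouts still catch stalled handshakes and silent servers.
func newStreamingClient(o clientOpts) *http.Client {
	if o.TLSHandshakeTimeout == 0 {
		o.TLSHandshakeTimeout = 30 * time.Second
	}
	if o.ResponseHeaderTimeout == 0 {
		o.ResponseHeaderTimeout = 120 * time.Second
	}
	tr := http.DefaultTransport.(*http.Transport).Clone()
	tr.TLSHandshakeTimeout = o.TLSHandshakeTimeout
	tr.ResponseHeaderTimeout = o.ResponseHeaderTimeout
	return &http.Client{Transport: tr} // Timeout stays zero on purpose
}

// timeouts reports the effective settings in seconds for inspection.
func timeouts(c *http.Client) (total, tls, header float64) {
	tr := c.Transport.(*http.Transport)
	return c.Timeout.Seconds(), tr.TLSHandshakeTimeout.Seconds(), tr.ResponseHeaderTimeout.Seconds()
}

func main() {
	total, tls, header := timeouts(newStreamingClient(clientOpts{}))
	fmt.Println(total, tls, header)
}
```

ResponseHeaderTimeout stops counting once headers arrive, so it does not cut off a response body that streams for many minutes.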

type IntentSelector added in v0.40.0

type IntentSelector struct {
	Model          string
	PreferredKinds []string
	PreferredNames []string
	Tags           map[string]string
}

type Message

type Message = msg.Message

func Assistant added in v0.26.0

func Assistant(text string) Message

func System added in v0.26.0

func System(text string) Message

func User added in v0.26.0

func User(text string) Message

type Messages added in v0.5.0

type Messages = msg.Messages

type Model

type Model struct {
	ID       string         `json:"id"`
	Name     string         `json:"name"`
	Provider string         `json:"provider"`
	Aliases  []string       `json:"aliases,omitempty"`
	Pricing  *usage.Pricing `json:"pricing,omitempty"` // nil unless fetched dynamically
}

Model represents an LLM model.

type ModelFetcher

type ModelFetcher interface {
	FetchModels(ctx context.Context) ([]Model, error)
}

ModelFetcher is an optional interface providers can implement to list models dynamically from their API instead of returning a static list.

type ModelResolvedEvent added in v0.35.0

type ModelResolvedEvent struct {
	Resolver string `json:"resolver"`
	Name     string `json:"name,omitempty"`
	Resolved string `json:"resolved,omitempty"`
}

ModelResolvedEvent is emitted whenever a requested model name is translated to a different resolved name: by router alias lookup, by OpenRouter's default-model normalization, or by a provider revealing the actual model chosen for the request.

func (ModelResolvedEvent) Type added in v0.35.0

func (e ModelResolvedEvent) Type() EventType

type Models added in v0.29.0

type Models []Model

func (Models) ByAlias added in v0.29.0

func (m Models) ByAlias(alias string) (Model, bool)

func (Models) ByID added in v0.29.0

func (m Models) ByID(id string) (Model, bool)

func (Models) Models added in v0.29.0

func (m Models) Models() Models

func (Models) Resolve added in v0.29.0

func (m Models) Resolve(modelID string) (Model, error)

type ModelsProvider added in v0.29.0

type ModelsProvider interface {
	Models() Models
}

type Named added in v0.29.0

type Named interface {
	Name() string
}

type OfferingCandidate added in v0.40.0

type OfferingCandidate struct {
	ServiceID string
	WireModel string
	Source    string
}

type Option added in v0.12.0

type Option func(*Options)

Option configures provider options.

func APIKeyFromEnv added in v0.12.0

func APIKeyFromEnv(candidates ...string) Option

APIKeyFromEnv returns an Option that reads the API key from environment variables. It tries each candidate in order, returning the first non-empty value. Returns an error at call time if none of the candidates are set.

func WithAPIKey added in v0.12.0

func WithAPIKey(key string) Option

WithAPIKey sets a static API key.

func WithAPIKeyFunc added in v0.12.0

func WithAPIKeyFunc(f func(ctx context.Context) (string, error)) Option

WithAPIKeyFunc sets a dynamic API key resolver. The function is called on each CreateStream() call, enabling:

  • Lazy key resolution (fetch from secret manager on first use)
  • Key rotation (fetch fresh key each time)
  • Context-aware resolution (respect timeouts/cancellation)

func WithBaseURL added in v0.12.0

func WithBaseURL(url string) Option

WithBaseURL sets a custom base URL for the provider.

func WithHTTPClient added in v0.23.0

func WithHTTPClient(c *http.Client) Option

WithHTTPClient sets a custom HTTP client for the provider. When not set, providers use DefaultHttpClient().

func WithLogger added in v0.23.0

func WithLogger(l *slog.Logger) Option

WithLogger sets a logger for providers that emit events outside the HTTP transport layer (e.g. Bedrock's binary eventstream). Events are logged at Debug level using the same format as the HTTP transport, so the same log renderer handles output from all providers.

type Options added in v0.12.0

type Options struct {
	// BaseURL is the base URL for the provider's API.
	BaseURL string

	// APIKeyFunc returns the API key for authentication.
	// It is called on each CreateStream() call, allowing for lazy/dynamic resolution.
	APIKeyFunc func(ctx context.Context) (string, error)

	// HTTPClient is the HTTP client to use for API requests.
	// When nil, providers fall back to DefaultHttpClient().
	HTTPClient *http.Client

	// Logger is used by providers that cannot log via the HTTP transport
	// (e.g. Bedrock's binary eventstream). When set, stream events are logged
	// at Debug level using the same message format as the HTTP transport logger
	// so the same renderer handles both.
	Logger *slog.Logger
}

Options holds configuration shared across providers.

func Apply added in v0.12.0

func Apply(opts ...Option) *Options

Apply applies all options to a new Options struct and returns it.

func (*Options) ResolveAPIKey added in v0.12.0

func (o *Options) ResolveAPIKey(ctx context.Context) (string, error)

ResolveAPIKey calls the APIKeyFunc to get the API key. Returns an empty string (no error) if no APIKeyFunc was configured.
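
The option functions, Apply, and ResolveAPIKey follow the functional options pattern. The sketch below rebuilds a trimmed version with local stand-in types to show per-call key resolution. Implementing WithAPIKey on top of WithAPIKeyFunc is an assumption, and rotatingKeys, resolveTwice, and resolveUnset are hypothetical helpers.

```go
package main

import (
	"context"
	"fmt"
)

// Options is a trimmed local stand-in for the documented struct.
type Options struct {
	BaseURL    string
	APIKeyFunc func(ctx context.Context) (string, error)
}

// Option mutates Options, as in the documented type.
type Option func(*Options)

func WithBaseURL(url string) Option { return func(o *Options) { o.BaseURL = url } }

func WithAPIKeyFunc(f func(ctx context.Context) (string, error)) Option {
	return func(o *Options) { o.APIKeyFunc = f }
}

// WithAPIKey wraps a static key in a resolver (assumed implementation).
func WithAPIKey(key string) Option {
	return WithAPIKeyFunc(func(context.Context) (string, error) { return key, nil })
}

// Apply applies all options to a fresh Options value.
func Apply(opts ...Option) *Options {
	o := &Options{}
	for _, opt := range opts {
		opt(o)
	}
	return o
}

// ResolveAPIKey returns "" and no error when no resolver is configured.
func (o *Options) ResolveAPIKey(ctx context.Context) (string, error) {
	if o.APIKeyFunc == nil {
		return "", nil
	}
	return o.APIKeyFunc(ctx)
}

// rotatingKeys (hypothetical) hands out a new key on every call and
// honours context cancellation.
func rotatingKeys() func(context.Context) (string, error) {
	n := 0
	return func(ctx context.Context) (string, error) {
		if err := ctx.Err(); err != nil {
			return "", err
		}
		n++
		return fmt.Sprintf("key-%d", n), nil
	}
}

// resolveTwice shows that the resolver runs on each call.
func resolveTwice() (string, string) {
	o := Apply(WithAPIKeyFunc(rotatingKeys()))
	a, _ := o.ResolveAPIKey(context.Background())
	b, _ := o.ResolveAPIKey(context.Background())
	return a, b
}

// resolveUnset shows the behaviour with no resolver configured.
func resolveUnset() (string, bool) {
	k, err := Apply().ResolveAPIKey(context.Background())
	return k, err == nil
}

func main() {
	a, b := resolveTwice()
	fmt.Println(a, b)
	fmt.Println(resolveUnset())
	fmt.Println(Apply(WithBaseURL("http://localhost:11434"), WithAPIKey("static")).BaseURL)
}
```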

type OutputFormat added in v0.25.0

type OutputFormat string

OutputFormat specifies the desired output format for the model response.

const (
	// OutputFormatText requests plain text output (default for most providers).
	OutputFormatText OutputFormat = "text"
	// OutputFormatJSON requests JSON output. The model will be constrained
	// to output valid JSON. Not all providers support this.
	OutputFormatJSON OutputFormat = "json"
)

func (OutputFormat) MarshalText added in v0.36.0

func (f OutputFormat) MarshalText() ([]byte, error)

func (*OutputFormat) UnmarshalText added in v0.36.0

func (f *OutputFormat) UnmarshalText(b []byte) error

type OverageLimit added in v0.29.0

type OverageLimit struct {
	// Status is "allowed" or "rejected".
	Status RateLimitStatus `json:"status"`

	// DisabledReason explains why overage is disabled (e.g., "out_of_credits").
	DisabledReason string `json:"disabled_reason,omitempty"`
}

OverageLimit describes the overage (pay-as-you-go) behavior.

type PreferenceRule added in v0.40.0

type PreferenceRule struct {
	Intent        string
	ServiceIDs    []string
	ProviderNames []string
}

type Provider

type Provider interface {
	Named
	ModelsProvider
	Streamer
}

Provider is the interface each LLM backend must implement.

type ProviderError added in v0.23.0

type ProviderError struct {
	// Sentinel is one of the Err* vars above. errors.Is matches against it.
	Sentinel error `json:"-"`

	// Provider is the name of the provider that produced this error.
	// Use the ProviderName* constants.
	Provider string `json:"provider"`

	// Message is a human-readable description of the error.
	Message string `json:"message"`

	// Cause is the underlying error that triggered this one, if any.
	Cause error `json:"-"`

	// RequestBody is the raw HTTP request body. Only set for ErrBuildRequest.
	RequestBody string `json:"request_body,omitempty"`

	// StatusCode is the HTTP response status code. Only set for ErrAPIError.
	StatusCode int `json:"status_code,omitempty"`

	// ResponseBody is the raw HTTP response body. Only set for ErrAPIError.
	ResponseBody string `json:"response_body,omitempty"`
}

ProviderError is a structured error emitted by any provider. It wraps a sentinel so errors.Is works, carries the provider name for identification, and optionally holds an HTTP status code and body for API errors.

func AsProviderError added in v0.23.0

func AsProviderError(provider string, err error) *ProviderError

AsProviderError ensures err is a *ProviderError. If it already is one, it is returned as-is. Otherwise it is wrapped in a new ProviderError with ErrUnknown as the sentinel. This guarantees that every error surfaced from CreateStream and EventStream.Error() is a *ProviderError.

func NewErrAPIError added in v0.23.0

func NewErrAPIError(provider string, statusCode int, responseBody string) *ProviderError

NewErrAPIError wraps a non-2xx HTTP response from a provider API.

func NewErrAPIErrorWithRequest added in v0.29.0

func NewErrAPIErrorWithRequest(provider string, requestBody string, statusCode int, responseBody string) *ProviderError

NewErrAPIErrorWithRequest wraps a non-2xx HTTP response from a provider API and also records the request body that produced it.

func NewErrAllProvidersFailed added in v0.35.0

func NewErrAllProvidersFailed(provider string, errs []error) *ProviderError

NewErrAllProvidersFailed returns an error when every failover target has been tried and all returned retriable errors. The original per-provider errors are preserved as the Cause via errors.Join so callers can inspect them with errors.Is / errors.As without losing the HTTP status, body, or message.
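
The errors.Join behaviour this relies on can be shown with the standard library alone (Go 1.20 or later). In the sketch below, apiError and allFailed are hypothetical stand-ins, and pairing the joined causes with ErrNoProviders is an assumption about which sentinel the real constructor uses.

```go
package main

import (
	"errors"
	"fmt"
)

// Local sentinels mirroring two of the documented variables.
var (
	ErrAPIError    = errors.New("API error")
	ErrNoProviders = errors.New("no providers configured")
)

// apiError (hypothetical) is a per-provider error carrying an HTTP status.
type apiError struct {
	Provider   string
	StatusCode int
}

func (e *apiError) Error() string {
	return fmt.Sprintf("%s: API error (%d)", e.Provider, e.StatusCode)
}

// Unwrap exposes the sentinel so errors.Is can match it.
func (e *apiError) Unwrap() error { return ErrAPIError }

// allFailed (hypothetical) wraps a sentinel plus the joined per-provider errors.
func allFailed(errs []error) error {
	return fmt.Errorf("%w: %w", ErrNoProviders, errors.Join(errs...))
}

// inspect builds a two-provider failure and queries it.
func inspect() (isNoProviders, isAPI bool, status int) {
	err := allFailed([]error{
		&apiError{Provider: "openai", StatusCode: 429},
		&apiError{Provider: "anthropic", StatusCode: 503},
	})
	var ae *apiError
	if errors.As(err, &ae) {
		status = ae.StatusCode // first match in traversal order
	}
	return errors.Is(err, ErrNoProviders), errors.Is(err, ErrAPIError), status
}

func main() {
	fmt.Println(inspect())
}
```

errors.As returns the first matching error in the tree, so a caller that needs every per-provider status has to walk the joined errors itself.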

func NewErrBuildRequest added in v0.23.0

func NewErrBuildRequest(provider string, cause error) *ProviderError

NewErrBuildRequest wraps a failure that occurred while building the outgoing request (e.g. JSON serialisation error).

func NewErrContextCancelled added in v0.23.0

func NewErrContextCancelled(provider string, cause error) *ProviderError

NewErrContextCancelled wraps a context cancellation for a provider stream.

func NewErrMissingAPIKey added in v0.23.0

func NewErrMissingAPIKey(provider string) *ProviderError

NewErrMissingAPIKey returns an error for a provider that has no API key configured.

func NewErrNoProviders added in v0.23.0

func NewErrNoProviders(provider string) *ProviderError

NewErrNoProviders returns an error when no providers are available or all failover targets have been exhausted.

func NewErrProviderMsg added in v0.23.0

func NewErrProviderMsg(provider string, msg string) *ProviderError

NewErrProviderMsg wraps an explicit error message sent by the provider inside the stream (e.g. an Anthropic error event or OpenRouter chunk error).

func NewErrRequestFailed added in v0.23.0

func NewErrRequestFailed(provider string, cause error) *ProviderError

NewErrRequestFailed wraps an HTTP transport-level failure.

func NewErrStreamDecode added in v0.23.0

func NewErrStreamDecode(provider string, cause error) *ProviderError

NewErrStreamDecode wraps a JSON or protocol decode failure mid-stream.

func NewErrStreamRead added in v0.23.0

func NewErrStreamRead(provider string, cause error) *ProviderError

NewErrStreamRead wraps an I/O or scanner error that occurred while reading the response stream.

func NewErrUnknownModel added in v0.23.0

func NewErrUnknownModel(provider string, modelID string) *ProviderError

NewErrUnknownModel returns an error for a model ID or alias that cannot be resolved by the provider.

func (*ProviderError) Error added in v0.23.0

func (e *ProviderError) Error() string

Error returns a human-readable error string in the form: "<provider>: <sentinel>" or "<provider>: <sentinel>: <message>" (with optional ": <cause>" suffix).

func (*ProviderError) Is added in v0.23.0

func (e *ProviderError) Is(target error) bool

Is reports whether this error matches target. It matches if target is the same sentinel, enabling errors.Is(err, ErrAPIError) etc.

func (*ProviderError) MarshalJSON added in v0.23.0

func (e *ProviderError) MarshalJSON() ([]byte, error)

MarshalJSON serialises ProviderError to JSON. Sentinel and Cause are rendered as strings so the full error is machine-readable.

func (*ProviderError) Unwrap added in v0.23.0

func (e *ProviderError) Unwrap() error

Unwrap returns Cause when set, allowing errors.As/Is to traverse the chain. When Cause is nil, Unwrap returns Sentinel so errors.Is(err, ErrAPIError) still works even with no underlying cause.
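
How Is and Unwrap combine can be seen in a trimmed local re-creation of the type. This sketch follows the documented behaviour but is not the package source. classify, sampleTransportError, and timedOut are hypothetical helpers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Local sentinels mirroring two of the documented variables.
var (
	ErrAPIError      = errors.New("API error")
	ErrRequestFailed = errors.New("request failed")
)

// ProviderError is a trimmed local stand-in for the documented struct.
type ProviderError struct {
	Sentinel   error
	Provider   string
	Message    string
	Cause      error
	StatusCode int
}

// Error follows the documented "<provider>: <sentinel>[: <message>][: <cause>]" form.
func (e *ProviderError) Error() string {
	s := e.Provider + ": " + e.Sentinel.Error()
	if e.Message != "" {
		s += ": " + e.Message
	}
	if e.Cause != nil {
		s += ": " + e.Cause.Error()
	}
	return s
}

// Is matches the sentinel directly, even when Unwrap returns Cause.
func (e *ProviderError) Is(target error) bool { return target == e.Sentinel }

// Unwrap prefers Cause and falls back to Sentinel.
func (e *ProviderError) Unwrap() error {
	if e.Cause != nil {
		return e.Cause
	}
	return e.Sentinel
}

// classify (hypothetical) buckets an error by sentinel without string matching.
func classify(err error) string {
	switch {
	case errors.Is(err, ErrAPIError):
		return "api"
	case errors.Is(err, ErrRequestFailed):
		return "transport"
	default:
		return "other"
	}
}

// sampleTransportError builds a transport failure caused by a deadline.
func sampleTransportError() error {
	return &ProviderError{Sentinel: ErrRequestFailed, Provider: "openai", Cause: context.DeadlineExceeded}
}

// timedOut reports whether the cause chain contains a deadline error.
func timedOut(err error) bool { return errors.Is(err, context.DeadlineExceeded) }

func main() {
	err := sampleTransportError()
	fmt.Println(err)
	fmt.Println(classify(err), timedOut(err))

	var pe *ProviderError
	if errors.As(err, &pe) {
		fmt.Println(pe.Provider)
	}
}
```

Both checks succeed on the same value: the sentinel matches through Is, and the deadline error matches through the Unwrap chain.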

func (*ProviderError) WithRequestBody added in v0.29.0

func (e *ProviderError) WithRequestBody(body string) *ProviderError

type ProviderFailoverEvent added in v0.35.0

type ProviderFailoverEvent struct {
	Provider         string `json:"provider"`          // failed provider
	FailoverProvider string `json:"failover_provider"` // next provider being tried
	Error            error  `json:"-"`
}

ProviderFailoverEvent is emitted by the router each time a provider attempt fails with a retriable error and the next provider is tried. It is NOT emitted when the last provider in the list fails (that is terminal, surfaced as an error return, not an event).

func (ProviderFailoverEvent) Type added in v0.35.0

type ProviderRegistry added in v0.40.0

type ProviderRegistry interface {
	Detect(ctx context.Context, env DetectEnv, disabled map[string]bool) ([]DetectedProvider, error)
	Build(ctx context.Context, req DetectedProvider, client *http.Client, opts []Option) (Provider, error)
}

type ProviderRequest added in v0.35.0

type ProviderRequest struct {
	URL     string            `json:"url"`
	Method  string            `json:"method"`
	Headers map[string]string `json:"headers"`
	Body    json.RawMessage   `json:"body"`
}

func ProviderRequestFromHTTP added in v0.35.0

func ProviderRequestFromHTTP(req *http.Request, body []byte) ProviderRequest

ProviderRequestFromHTTP builds a ProviderRequest from an outgoing *http.Request and the raw body bytes. Call it after the http.Request is fully constructed (all headers set) but before client.Do — the captured data then exactly matches what is sent on the wire.

Header keys are in canonical MIME form (e.g. "Content-Type", "X-Api-Key") as stored by net/http. Multi-value headers are joined with ", ". Sensitive headers (Authorization, X-Api-Key) are replaced with "[REDACTED]".

body is captured from the []byte slice, NOT from req.Body, so the reader inside req is untouched and client.Do works correctly afterwards.

type ProviderWrapper added in v0.40.0

type ProviderWrapper func(RegisteredProvider, Executor) Executor

type Publisher added in v0.26.0

type Publisher interface {
	Publish(payload Event)

	Started(started StreamStartedEvent)
	ModelResolved(resolver, name, resolved string)
	Failover(from, to string, err error)
	Delta(d *DeltaEvent)
	ToolCall(tc tool.Call)
	ContentBlock(evt ContentPartEvent)

	UsageRecord(r usage.Record)
	TokenEstimate(r usage.Record)
	Completed(completed CompletedEvent)

	Error(err error)
	Debug(msg string, data any)

	Close()
}

func NewEventPublisher added in v0.26.0

func NewEventPublisher() (Publisher, <-chan Envelope)

type RateLimitStatus added in v0.29.0

type RateLimitStatus string

RateLimitStatus represents the status of a rate limit.

const (
	RateLimitStatusAllowed    RateLimitStatus = "allowed"
	RateLimitStatusOverBudget RateLimitStatus = "over_budget"
	RateLimitStatusBlocked    RateLimitStatus = "blocked"
)

type RateLimits added in v0.29.0

type RateLimits struct {
	// Unified limits (applies to the unified API endpoint).
	Unified *UnifiedRateLimit `json:"unified,omitempty"`

	// OrganizationID is the org this request was made under.
	OrganizationID string `json:"organization_id,omitempty"`

	// RequestID is the upstream request identifier.
	RequestID string `json:"request_id,omitempty"`
}

RateLimits holds parsed rate-limit headers from Anthropic API responses. These are emitted in the StreamStartedEvent so consumers can inspect them.

func ParseRateLimits added in v0.29.0

func ParseRateLimits(headers map[string]string) *RateLimits

ParseRateLimits parses rate-limit headers from an Anthropic HTTP response. Pass the response headers map (lowercased keys → values).

type RegisteredProvider added in v0.40.0

type RegisteredProvider struct {
	Name      string
	ServiceID string
	Provider  Provider
}

type Request added in v0.25.0

type Request struct {
	// Model is the model identifier or alias to use, e.g. "fast", "anthropic/claude-sonnet-4-5".
	Model string `json:"model"`

	// Messages is the conversation history to send to the model.
	Messages Messages `json:"messages"`

	// MaxTokens limits the maximum number of tokens in the response.
	// When 0, the provider's default is used.
	MaxTokens int `json:"max_tokens,omitempty"`

	// Temperature controls randomness in sampling. Higher values produce
	// more diverse outputs (0.0-2.0 for most providers). Not supported by
	// Anthropic.
	Temperature float64 `json:"temperature,omitempty"`

	// TopP is the nucleus sampling threshold. The model considers only tokens
	// comprising the top P probability mass. Not supported by Anthropic.
	TopP float64 `json:"top_p,omitempty"`

	// TopK restricts token selection to the K most likely tokens. Higher values
	// increase diversity. Not supported by Anthropic.
	TopK int `json:"top_k,omitempty"`

	// OutputFormat specifies the desired output format.
	// Supported by OpenAI and Anthropic. When set to JSON, the model will
	// be constrained to output valid JSON.
	OutputFormat OutputFormat `json:"output_format,omitempty"`

	// Tools is the set of tools the model may call during the response.
	Tools []llmtool.Definition `json:"tools,omitempty"`

	// ToolChoice controls how the model selects tools. Defaults to Auto when Tools are provided.
	ToolChoice ToolChoice `json:"tool_choice,omitempty"`

	// Effort controls how thoroughly the model works on the response.
	Effort Effort `json:"effort,omitempty"`

	// Thinking controls whether extended/chain-of-thought reasoning is used.
	// This is a mode selector (on/off/auto), not a depth control.
	Thinking ThinkingMode `json:"thinking,omitempty"`

	// RequestMeta carries OpenAI-compatible request attribution metadata.
	RequestMeta *RequestMeta `json:"request_meta,omitempty"`

	// CacheHint is a top-level prompt caching hint. Behaviour is provider-specific:
	// Anthropic auto mode, Bedrock trailing cachePoint, OpenAI extended retention.
	CacheHint *CacheHint `json:"cache_hint,omitempty"`

	// ApiTypeHint expresses a preferred wire protocol. Providers honour it when
	// they support the requested API; otherwise they fall back to their default.
	// The actual API used is always reported in RequestEvent.ResolvedApiType.
	ApiTypeHint ApiType `json:"api_type_hint,omitempty"`
}

Request configures a provider CreateStream call.

func BuildRequest added in v0.35.0

func BuildRequest(opts ...RequestOption) (Request, error)

BuildRequest is a convenience wrapper; opts are passed through to Build.

func (Request) BuildRequest added in v0.38.0

func (r Request) BuildRequest(_ context.Context) (Request, error)

BuildRequest implements Buildable. Returns the request as-is without re-validating — providers call Validate() themselves after receiving opts, so passing a Request skips one validation round-trip compared to passing a *RequestBuilder (whose Build() also calls Validate()).

func (Request) Validate added in v0.25.0

func (o Request) Validate() error

Validate checks that the request is valid.

type RequestBuilder added in v0.35.0

type RequestBuilder struct {
	// contains filtered or unexported fields
}

func NewRequestBuilder added in v0.35.0

func NewRequestBuilder() *RequestBuilder

NewRequestBuilder returns a zero-value builder. All fields default to their provider-level defaults (zero values). Call Build() only after setting Model.

func (*RequestBuilder) ApiTypeHint added in v0.40.0

func (b *RequestBuilder) ApiTypeHint(t ApiType) *RequestBuilder

ApiTypeHint sets the preferred wire protocol. The provider honours it when supported; falls back to its default otherwise.

func (*RequestBuilder) Append added in v0.38.0

func (b *RequestBuilder) Append(msgs ...Message) *RequestBuilder

Append appends pre-built messages (assistant turns, tool results, etc.).

func (*RequestBuilder) Apply added in v0.38.0

func (b *RequestBuilder) Apply(opts ...RequestOption) *RequestBuilder

Apply applies functional options to the builder and returns b for chaining. Build(opts...) internally delegates to Apply, so both are interchangeable for the terminal options; Apply is preferred when options are pre-assembled.
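
The functional-option style used by RequestOption can be sketched in isolation. The types below are local stand-ins with assumed names (request, requestOption, buildRequest), and the rule that an empty model is rejected is an assumption drawn from the NewRequestBuilder note about setting Model before Build.

```go
package main

import (
	"errors"
	"fmt"
)

// request is a local stand-in for llm.Request with only two fields.
type request struct {
	Model     string
	MaxTokens int
}

// requestOption mirrors the documented shape: func(r *Request).
type requestOption func(r *request)

func withModel(model string) requestOption { return func(r *request) { r.Model = model } }
func withMaxTokens(n int) requestOption    { return func(r *request) { r.MaxTokens = n } }

// buildRequest applies options in order, then validates.
// Rejecting an empty model is an assumption made for this sketch.
func buildRequest(opts ...requestOption) (request, error) {
	var r request
	for _, opt := range opts {
		opt(&r)
	}
	if r.Model == "" {
		return request{}, errors.New("model is required")
	}
	return r, nil
}

func main() {
	// Options can be assembled ahead of time and applied together.
	common := []requestOption{withModel("fast"), withMaxTokens(256)}
	req, err := buildRequest(common...)
	fmt.Println(req.Model, req.MaxTokens, err)

	_, err = buildRequest(withMaxTokens(10))
	fmt.Println(err)
}
```

Later options overwrite earlier ones for scalar fields in this sketch, which is the usual consequence of applying options in order.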

func (*RequestBuilder) Build added in v0.35.0

func (b *RequestBuilder) Build(opts ...RequestOption) (Request, error)

func (*RequestBuilder) BuildRequest added in v0.38.0

func (b *RequestBuilder) BuildRequest(_ context.Context) (Request, error)

BuildRequest implements Buildable. Calls Build() and returns the result.

func (*RequestBuilder) Coding added in v0.35.0

func (b *RequestBuilder) Coding() *RequestBuilder

func (*RequestBuilder) Effort added in v0.35.0

func (b *RequestBuilder) Effort(level Effort) *RequestBuilder

func (*RequestBuilder) EndUser added in v0.40.0

func (b *RequestBuilder) EndUser(user string) *RequestBuilder

func (*RequestBuilder) MaxTokens added in v0.35.0

func (b *RequestBuilder) MaxTokens(maxTokens int) *RequestBuilder

func (*RequestBuilder) Metadata added in v0.40.0

func (b *RequestBuilder) Metadata(metadata map[string]any) *RequestBuilder

func (*RequestBuilder) Model added in v0.35.0

func (b *RequestBuilder) Model(modelID string) *RequestBuilder

func (*RequestBuilder) OutputFormat added in v0.35.0

func (b *RequestBuilder) OutputFormat(format OutputFormat) *RequestBuilder

OutputFormat sets the output format of the response.

func (*RequestBuilder) RequestMeta added in v0.40.0

func (b *RequestBuilder) RequestMeta(meta *RequestMeta) *RequestBuilder

func (*RequestBuilder) System added in v0.38.0

func (b *RequestBuilder) System(text string, cache ...CacheOpt) *RequestBuilder

System appends a system message. Pass CacheTTL1h or CacheTTL5m to enable prompt caching for this message. Omitting cache leaves CacheHint nil.

func (*RequestBuilder) Temperature added in v0.35.0

func (b *RequestBuilder) Temperature(temperature float64) *RequestBuilder

func (*RequestBuilder) Thinking added in v0.35.0

func (b *RequestBuilder) Thinking(mode ThinkingMode) *RequestBuilder

func (*RequestBuilder) ToolChoice added in v0.38.0

func (b *RequestBuilder) ToolChoice(tc ToolChoice) *RequestBuilder

ToolChoice sets the tool selection strategy.

func (*RequestBuilder) Tools added in v0.38.0

func (b *RequestBuilder) Tools(defs ...tool.Definition) *RequestBuilder

Tools appends tool definitions to the request. Multiple calls accumulate; all tools registered this way are sent to the model together.

func (*RequestBuilder) TopK added in v0.35.0

func (b *RequestBuilder) TopK(k int) *RequestBuilder

TopK sets the top-k parameter for sampling.

func (*RequestBuilder) TopP added in v0.35.0

func (*RequestBuilder) User added in v0.38.0

func (b *RequestBuilder) User(text string, cache ...CacheOpt) *RequestBuilder

User appends a user message. Pass CacheTTL1h or CacheTTL5m to enable prompt caching for this message. Omitting cache leaves CacheHint nil.

type RequestEvent added in v0.35.0

type RequestEvent struct {
	OriginalRequest Request         `json:"original_request"`
	ProviderRequest ProviderRequest `json:"provider_request"`

	// ResolvedApiType is the wire protocol actually used for this request.
	// Always a concrete value (never ApiTypeAuto). Set by the provider before
	// the HTTP call is made.
	ResolvedApiType ApiType `json:"resolved_api_type,omitempty"`
}

RequestEvent is emitted by a provider once per stream, carrying the final resolved request parameters (after alias resolution, default application, thinking-budget mapping, etc.). Consumers can use this for observability / debugging without inspecting the raw HTTP body.

func (RequestEvent) Type added in v0.35.0

func (e RequestEvent) Type() EventType

type RequestMeta added in v0.40.0

type RequestMeta struct {
	User     string         `json:"user,omitempty"`
	Metadata map[string]any `json:"metadata,omitempty"`
}

RequestMeta carries provider-specific request attribution metadata used by OpenAI-compatible APIs such as OpenAI and OpenRouter.

func (*RequestMeta) Clone added in v0.40.0

func (m *RequestMeta) Clone() *RequestMeta

type RequestOption added in v0.35.0

type RequestOption func(r *Request)

func WithApiTypeHint added in v0.40.0

func WithApiTypeHint(t ApiType) RequestOption

WithApiTypeHint sets Request.ApiTypeHint.

func WithEffort added in v0.38.0

func WithEffort(level Effort) RequestOption

func WithEndUser added in v0.40.0

func WithEndUser(user string) RequestOption

func WithMaxTokens added in v0.38.0

func WithMaxTokens(n int) RequestOption

func WithMessages added in v0.38.0

func WithMessages(msgs ...Message) RequestOption

WithMessages appends pre-built messages (assistant turns, tool results, etc.).

func WithMetadata added in v0.40.0

func WithMetadata(metadata map[string]any) RequestOption

func WithModel added in v0.38.0

func WithModel(model string) RequestOption

func WithOutputFormat added in v0.38.0

func WithOutputFormat(f OutputFormat) RequestOption

func WithRequestMeta added in v0.40.0

func WithRequestMeta(meta *RequestMeta) RequestOption

func WithSystem added in v0.38.0

func WithSystem(text string, cache ...CacheOpt) RequestOption

WithSystem appends a system message. Same cache nil-guard semantics as the fluent System method: omitting cache leaves CacheHint nil.

func WithTemperature added in v0.38.0

func WithTemperature(t float64) RequestOption

func WithThinking added in v0.38.0

func WithThinking(mode ThinkingMode) RequestOption

func WithToolChoice added in v0.38.0

func WithToolChoice(tc ToolChoice) RequestOption

WithToolChoice sets the tool selection strategy.

func WithTools added in v0.38.0

func WithTools(defs ...tool.Definition) RequestOption

WithTools appends tool definitions to the request. Multiple calls accumulate; all tools registered this way are sent to the model together.

func WithTopK added in v0.38.0

func WithTopK(k int) RequestOption

func WithTopP added in v0.38.0

func WithTopP(p float64) RequestOption

func WithUser added in v0.38.0

func WithUser(text string, cache ...CacheOpt) RequestOption

WithUser appends a user message. Same cache nil-guard semantics as the fluent User method: omitting cache leaves CacheHint nil.

type ResolvedModelSpec added in v0.40.0

type ResolvedModelSpec struct {
	RawModel       string
	ExactName      string
	ExactServiceID string
	RequestedModel string
	Offerings      []OfferingCandidate
	FromIntent     string
	Ambiguous      bool
}

type Response added in v0.26.0

type Response interface {
	Message() msg.Message
	Text() string
	Thought() string
	StopReason() StopReason
	Error() error
	ToolCalls() []tool.Call
	ToolResults() []tool.Result
}

type Result added in v0.26.0

type Result interface {
	Response
	Next() msg.Messages
	UsageRecords() []usage.Record   // provider-reported, in arrival order
	TokenEstimates() []usage.Record // pre-request estimates, in order
	Drift() *usage.Drift            // nil if no estimate received
}

func ProcessEvents added in v0.26.0

func ProcessEvents(ctx context.Context, ch <-chan Envelope) Result

type RetryPolicy added in v0.40.0

type RetryPolicy struct {
	EnableFallback bool
}

func DefaultRetryPolicy added in v0.40.0

func DefaultRetryPolicy() RetryPolicy

type Role

type Role = msg.Role

type Service added in v0.40.0

type Service struct {
	// contains filtered or unexported fields
}

func New added in v0.40.0

func New(opts ...ServiceOption) (*Service, error)

func (*Service) CreateStream added in v0.40.0

func (s *Service) CreateStream(ctx context.Context, src Buildable) (Stream, error)

func (*Service) ExplainModel added in v0.40.0

func (s *Service) ExplainModel(model string) (ResolvedModelSpec, []RegisteredProvider, error)

type ServiceConfig added in v0.40.0

type ServiceConfig struct {
	Providers        []RegisteredProvider
	IntentAliases    map[string]IntentSelector
	Preferences      []PreferenceRule
	Wrappers         []ProviderWrapper
	RetryPolicy      RetryPolicy
	AutoDetect       bool
	DisabledTypes    map[string]bool
	HTTPClient       *http.Client
	LLMOptions       []Option
	Registry         ProviderRegistry
	DetectedRequests []DetectedProvider
}

type ServiceOption added in v0.40.0

type ServiceOption func(*ServiceConfig)

func WithAutoDetect added in v0.40.0

func WithAutoDetect() ServiceOption

func WithDetectedProvider added in v0.40.0

func WithDetectedProvider(req DetectedProvider) ServiceOption

func WithIntentAlias added in v0.40.0

func WithIntentAlias(name string, sel IntentSelector) ServiceOption

func WithPreference added in v0.40.0

func WithPreference(pref PreferenceRule) ServiceOption

func WithProvider added in v0.40.0

func WithProvider(p Provider) ServiceOption

func WithProviderNamed added in v0.40.0

func WithProviderNamed(name string, p Provider) ServiceOption

func WithRegisteredProvider added in v0.40.0

func WithRegisteredProvider(p RegisteredProvider) ServiceOption

func WithRetryPolicy added in v0.40.0

func WithRetryPolicy(p RetryPolicy) ServiceOption

func WithServiceHTTPClient added in v0.40.0

func WithServiceHTTPClient(client *http.Client) ServiceOption

func WithServiceLLMOptions added in v0.40.0

func WithServiceLLMOptions(opts ...Option) ServiceOption

func WithWrapper added in v0.40.0

func WithWrapper(w ProviderWrapper) ServiceOption

func WithoutAutoDetect added in v0.40.0

func WithoutAutoDetect() ServiceOption

func WithoutProviderType added in v0.40.0

func WithoutProviderType(typeName string) ServiceOption

type StopReason added in v0.23.0

type StopReason string

StopReason describes why the model stopped generating.

const (
	// StopReasonEndTurn is natural completion — the model finished its response.
	StopReasonEndTurn StopReason = "end_turn"
	// StopReasonToolUse means the model emitted one or more tool calls.
	StopReasonToolUse StopReason = "tool_use"
	// StopReasonMaxTokens means the output length limit was reached.
	StopReasonMaxTokens StopReason = "max_tokens"
	// StopReasonContentFilter means output was blocked by the provider.
	StopReasonContentFilter StopReason = "content_filter"
	// StopReasonCancelled means the context was cancelled before the stream ended.
	StopReasonCancelled StopReason = "cancelled"
	// StopReasonError means the stream ended with a StreamEventError.
	StopReasonError StopReason = "error"

	StopReasonUnknown StopReason = ""
)

type Stream added in v0.26.0

type Stream <-chan Envelope

type StreamClosedEvent added in v0.26.0

type StreamClosedEvent struct{}

func (StreamClosedEvent) Type added in v0.26.0

func (e StreamClosedEvent) Type() EventType

type StreamCreatedEvent added in v0.26.0

type StreamCreatedEvent struct{}

func (StreamCreatedEvent) Type added in v0.26.0

func (e StreamCreatedEvent) Type() EventType

type StreamFunc added in v0.29.0

type StreamFunc func(ctx context.Context, src Buildable) (Stream, error)

func (StreamFunc) CreateStream added in v0.29.0

func (f StreamFunc) CreateStream(ctx context.Context, src Buildable) (Stream, error)

type StreamProcessor added in v0.26.0

type StreamProcessor struct {
	// contains filtered or unexported fields
}

func NewEventProcessor added in v0.26.0

func NewEventProcessor(ctx context.Context, ch <-chan Envelope) *StreamProcessor

func (*StreamProcessor) HandleTool added in v0.26.0

func (r *StreamProcessor) HandleTool(handlers ...tool.NamedHandler) *StreamProcessor

HandleTool registers a Handler that is invoked when the model emits a completed tool call matching h.ToolName(). The handler's output is stored in StreamResult.ToolResults and included in the messages returned by Next/Apply.

Pass a *BoundToolSpec (from llm.Handle) for typed, spec-aware handlers:

proc.HandleTool(llm.Handle(weatherSpec, func(ctx context.Context, in GetWeatherParams) (*GetWeatherResult, error) {
    return &GetWeatherResult{Temp: 22}, nil
}))

func (*StreamProcessor) OnDelta added in v0.26.0

func (*StreamProcessor) OnEvent added in v0.26.0

func (*StreamProcessor) OnReasoningDelta added in v0.26.0

func (r *StreamProcessor) OnReasoningDelta(fn func(delta string)) *StreamProcessor

OnReasoningDelta registers a callback that is called for each incremental reasoning/thinking token.

func (*StreamProcessor) OnStart added in v0.26.0

OnStart registers a callback that is called when the StreamEventStarted event arrives, carrying provider metadata (request ID, model, time-to-first-token).

func (*StreamProcessor) OnTextDelta added in v0.26.0

func (r *StreamProcessor) OnTextDelta(fn func(delta string)) *StreamProcessor

OnTextDelta registers a callback that is called for each incremental text token. Panics in the callback are recovered and recorded on the StreamResult error.
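
Recovering a panic inside a user callback and turning it into an error is a standard defer/recover pattern. The sketch below shows one way it could look; safeCall is a hypothetical helper, and how the real processor records the error on its result is not shown here.

```go
package main

import "fmt"

// safeCall is a hypothetical helper, not part of the llm package.
// It invokes fn and converts a panic into a returned error.
func safeCall(fn func(delta string), delta string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("callback panicked: %v", r)
		}
	}()
	fn(delta)
	return nil
}

func main() {
	var firstErr error
	for _, delta := range []string{"Hel", "lo", "!"} {
		err := safeCall(func(d string) {
			if d == "!" {
				panic("unexpected token")
			}
			fmt.Print(d)
		}, delta)
		// Keep only the first failure, as a processor might do.
		if err != nil && firstErr == nil {
			firstErr = err
		}
	}
	fmt.Println()
	fmt.Println(firstErr)
}
```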

func (*StreamProcessor) OnToolDelta added in v0.26.0

func (r *StreamProcessor) OnToolDelta(fn func(d ToolDeltaPart)) *StreamProcessor

OnToolDelta registers a callback that is called for each partial tool-call argument fragment (DeltaKindTool deltas).

func (*StreamProcessor) Result added in v0.26.0

func (r *StreamProcessor) Result() Result

Result consumes the stream (at most once) and returns the Result once the stream has been fully processed.

Calling Result() multiple times is safe: the stream is only consumed once and subsequent calls return the same result.

func (*StreamProcessor) WithAsyncToolDispatch added in v0.26.0

func (r *StreamProcessor) WithAsyncToolDispatch() *StreamProcessor

WithAsyncToolDispatch switches tool handler dispatch to concurrent mode: all tool calls emitted in a single response are executed in parallel, one goroutine per call. Results are collected in emission order before the stream is considered complete.
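
Running one goroutine per tool call while keeping results in emission order can be done by writing each result into a pre-sized slice at the call's index. The following sketch illustrates that technique only; call, result, and dispatchAll are hypothetical names, not the library's dispatcher.

```go
package main

import (
	"fmt"
	"sync"
)

// call and result are simplified stand-ins for tool.Call and tool.Result.
type call struct{ ID, Name string }
type result struct{ CallID, Output string }

// dispatchAll runs handle for every call concurrently. Each goroutine writes
// to its own slice index, so no mutex is needed and order is preserved.
func dispatchAll(calls []call, handle func(call) string) []result {
	results := make([]result, len(calls))
	var wg sync.WaitGroup
	for i, c := range calls {
		wg.Add(1)
		go func(i int, c call) {
			defer wg.Done()
			results[i] = result{CallID: c.ID, Output: handle(c)}
		}(i, c)
	}
	wg.Wait() // all handlers finish before results are returned
	return results
}

func main() {
	calls := []call{{ID: "1", Name: "get_weather"}, {ID: "2", Name: "get_time"}}
	results := dispatchAll(calls, func(c call) string { return "ran " + c.Name })
	for _, r := range results {
		fmt.Println(r.CallID, r.Output)
	}
}
```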

func (*StreamProcessor) WithToolDispatcher added in v0.26.0

func (r *StreamProcessor) WithToolDispatcher(d tool.DispatcherType) *StreamProcessor

WithToolDispatcher sets the tool dispatcher explicitly.

type StreamRequest added in v0.23.0

type StreamRequest = Request

type StreamStartedEvent added in v0.26.0

type StreamStartedEvent struct {
	RequestID string `json:"request_id,omitempty"`

	// Model is the model identifier returned by the upstream API in its response.
	// e.g., "claude-haiku-4-5-20251001". May be empty if the API doesn't echo the model back.
	Model string `json:"model,omitempty"`

	// Provider is the upstream provider that served the request.
	// For direct providers this equals the provider name.
	// For routing providers such as OpenRouter it is the actual backend
	// extracted from the response (e.g. "anthropic", "openai", "meta-llama").
	Provider string `json:"provider,omitempty"`

	// Extra holds provider-specific data such as rate-limit headers.
	Extra map[string]any `json:"extra,omitempty"`
}

func (StreamStartedEvent) Type added in v0.26.0

func (e StreamStartedEvent) Type() EventType

type Streamer added in v0.19.0

type Streamer interface {
	CreateStream(ctx context.Context, src Buildable) (Stream, error)
}

type ThinkingMode added in v0.35.0

type ThinkingMode string

ThinkingMode controls whether the model uses extended/chain-of-thought reasoning. This is a mode selector (on/off/auto), not a depth control — depth is controlled by Effort.

const (
	// ThinkingAuto lets the provider/model decide whether to think.
	ThinkingAuto ThinkingMode = ""
	// ThinkingOn forces extended thinking on.
	ThinkingOn ThinkingMode = "on"
	// ThinkingOff forces extended thinking off.
	ThinkingOff ThinkingMode = "off"
)

func (ThinkingMode) IsOff added in v0.35.0

func (m ThinkingMode) IsOff() bool

IsOff returns true when thinking is explicitly disabled.

func (ThinkingMode) IsOn added in v0.35.0

func (m ThinkingMode) IsOn() bool

IsOn returns true when thinking is explicitly enabled.

func (ThinkingMode) MarshalText added in v0.36.0

func (m ThinkingMode) MarshalText() ([]byte, error)

MarshalText maps the zero value to the user-visible string "auto".

func (*ThinkingMode) UnmarshalText added in v0.36.0

func (m *ThinkingMode) UnmarshalText(b []byte) error

UnmarshalText maps "auto" → ThinkingAuto (the zero value ""). An empty string is also accepted: ThinkingAuto == "" passes Valid().

func (ThinkingMode) Valid added in v0.35.0

func (m ThinkingMode) Valid() bool

Valid returns true if the ThinkingMode is a known valid value.

type TokenEstimateEvent added in v0.40.0

type TokenEstimateEvent struct {
	// Estimate is one pre-request estimate record.
	// The event is emitted once per record; multiple events may be emitted per request
	// when a labeled breakdown is provided (each with distinct Dims.Labels).
	Estimate usage.Record `json:"estimate"` // IsEstimate == true
}

TokenEstimateEvent is dispatched before the first response delta. It carries the pre-request token estimate so consumers can display estimates and drift without calling CountTokens themselves.

func (TokenEstimateEvent) Type added in v0.40.0

func (e TokenEstimateEvent) Type() EventType

type ToolCallEvent added in v0.26.0

type ToolCallEvent struct {
	ToolCall tool.Call `json:"tool_call"`
}

func (ToolCallEvent) Type added in v0.26.0

func (e ToolCallEvent) Type() EventType

type ToolChoice added in v0.6.0

type ToolChoice interface {
	String() string
	// contains filtered or unexported methods
}

func ParseToolChoice added in v0.36.0

func ParseToolChoice(s string) (ToolChoice, error)

ParseToolChoice parses a CLI string into a ToolChoice. Accepted values: "auto" or "", "none", "required", "tool:<name>". An empty string returns ToolChoiceAuto (not nil); use ToolChoiceFlag for flag parsing where empty means "not specified" (nil).
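
The accepted grammar ("auto" or "", "none", "required", "tool:<name>") is small enough to sketch directly. The code below is an illustrative reimplementation with local stand-in types (choiceAuto and friends are assumed names); the error messages and the String() outputs are assumptions rather than the library's actual values.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// toolChoice is a simplified stand-in for llm.ToolChoice.
type toolChoice interface{ String() string }

type choiceAuto struct{}
type choiceNone struct{}
type choiceRequired struct{}
type choiceTool struct{ Name string }

func (choiceAuto) String() string     { return "auto" }
func (choiceNone) String() string     { return "none" }
func (choiceRequired) String() string { return "required" }
func (c choiceTool) String() string   { return "tool:" + c.Name }

// parseToolChoice maps a CLI string onto a choice value.
// The empty string yields choiceAuto, never nil.
func parseToolChoice(s string) (toolChoice, error) {
	switch {
	case s == "" || s == "auto":
		return choiceAuto{}, nil
	case s == "none":
		return choiceNone{}, nil
	case s == "required":
		return choiceRequired{}, nil
	case strings.HasPrefix(s, "tool:"):
		name := strings.TrimPrefix(s, "tool:")
		if name == "" {
			return nil, errors.New("tool choice: empty tool name")
		}
		return choiceTool{Name: name}, nil
	}
	return nil, fmt.Errorf("unknown tool choice %q", s)
}

func main() {
	for _, in := range []string{"", "none", "tool:get_weather", "sometimes"} {
		tc, err := parseToolChoice(in)
		fmt.Printf("%q -> %v, %v\n", in, tc, err)
	}
}
```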

type ToolChoiceAuto added in v0.6.0

type ToolChoiceAuto struct{}

func (ToolChoiceAuto) String added in v0.29.0

func (t ToolChoiceAuto) String() string

type ToolChoiceFlag added in v0.36.0

type ToolChoiceFlag struct{ Value ToolChoice }

ToolChoiceFlag is a pflag-compatible holder for a ToolChoice value that implements encoding.TextMarshaler and encoding.TextUnmarshaler. A zero-value ToolChoiceFlag (Value == nil) means "not specified by the caller"; contrast with ParseToolChoice("") which returns ToolChoiceAuto.

func (ToolChoiceFlag) MarshalText added in v0.36.0

func (f ToolChoiceFlag) MarshalText() ([]byte, error)

func (*ToolChoiceFlag) UnmarshalText added in v0.36.0

func (f *ToolChoiceFlag) UnmarshalText(b []byte) error

type ToolChoiceNone added in v0.6.0

type ToolChoiceNone struct{}

func (ToolChoiceNone) String added in v0.29.0

func (t ToolChoiceNone) String() string

type ToolChoiceRequired added in v0.6.0

type ToolChoiceRequired struct{}

func (ToolChoiceRequired) String added in v0.29.0

func (t ToolChoiceRequired) String() string

type ToolChoiceTool added in v0.6.0

type ToolChoiceTool struct {
	Name string
}

func (ToolChoiceTool) String added in v0.29.0

func (t ToolChoiceTool) String() string

type ToolDeltaPart added in v0.26.0

type ToolDeltaPart struct {
	// ToolID, ToolName, and ToolArgs are populated for DeltaKindTool.
	// ToolArgs is a raw partial JSON fragment — not yet a complete object.
	ToolID   string `json:"tool_id,omitempty"`
	ToolName string `json:"tool_name,omitempty"`
	ToolArgs string `json:"tool_args,omitempty"`
}

type TypedEventHandler added in v0.26.0

type TypedEventHandler[T any] func(e T)

func (TypedEventHandler[T]) Handle added in v0.26.0

func (h TypedEventHandler[T]) Handle(e Event)
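
The page does not document what Handle does with events of another type. One plausible reading, shown here purely as an assumption, is that the handler type-asserts the event to T and silently ignores anything else. The names event, textEvent, doneEvent, and typedHandler are local stand-ins.

```go
package main

import "fmt"

// event is a simplified stand-in for llm.Event.
type event interface{ Type() string }

type textEvent struct{ Text string }

func (textEvent) Type() string { return "text" }

type doneEvent struct{}

func (doneEvent) Type() string { return "done" }

// typedHandler mirrors the documented shape: func(e T).
type typedHandler[T any] func(e T)

// Handle calls h only when e holds a value of type T.
// Ignoring other event types is an assumption of this sketch.
func (h typedHandler[T]) Handle(e event) {
	if v, ok := any(e).(T); ok {
		h(v)
	}
}

func main() {
	h := typedHandler[textEvent](func(e textEvent) {
		fmt.Println("text:", e.Text)
	})
	h.Handle(textEvent{Text: "hello"}) // dispatched
	h.Handle(doneEvent{})              // ignored: wrong type
}
```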

type UnifiedRateLimit added in v0.29.0

type UnifiedRateLimit struct {
	// The current status: "allowed", "over_budget", or "blocked".
	Status RateLimitStatus `json:"status"`

	// ResetAt is the Unix timestamp when the primary window resets.
	ResetAt time.Time `json:"reset_at"`

	// FiveHour contains limits for the 5-hour rolling window.
	FiveHour *WindowLimit `json:"five_hour,omitempty"`

	// SevenDay contains limits for the 7-day rolling window.
	SevenDay *WindowLimit `json:"seven_day,omitempty"`

	// Overage describes whether overage usage is enabled and why it might be disabled.
	Overage *OverageLimit `json:"overage,omitempty"`

	// FallbackPercentage is between 0 and 1, indicating how much of the fallback
	// (pay-as-you-go) pool is being used when the primary budget is exhausted.
	FallbackPercentage float64 `json:"fallback_percentage,omitempty"`

	// RepresentativeClaim identifies which tier/bucket this request counts against.
	RepresentativeClaim string `json:"representative_claim,omitempty"`
}

UnifiedRateLimit contains the unified rate-limit data from the Anthropic-Ratelimit-Unified-* headers.

type UsageUpdatedEvent added in v0.26.0

type UsageUpdatedEvent struct {
	Record usage.Record `json:"record"`
}

func (UsageUpdatedEvent) Type added in v0.26.0

func (e UsageUpdatedEvent) Type() EventType

type WindowLimit added in v0.29.0

type WindowLimit struct {
	// Status is "allowed" or "blocked".
	Status RateLimitStatus `json:"status"`

	// ResetAt is the Unix timestamp when this window resets.
	ResetAt time.Time `json:"reset_at"`

	// Utilization is between 0 and 1, representing how much of this window is used.
	Utilization float64 `json:"utilization"`
}

WindowLimit represents a rate-limit window (e.g., 5-hour or 7-day).

Directories

Path Synopsis
Package auto provides zero-config service construction on top of llm.Service.
cmd
llmcli command
llmcli is a command-line tool for testing LLM providers.
llmcli/cmds
Package cmds provides CLI commands for llmcli.
llmcli/store
Package store provides token storage implementations.
internal
sse
Package llmtest provides helpers for testing code that consumes llm.Stream channels, following the convention of packages like net/http/httptest.
Package ops provides parameterised, use-case-oriented LLM operations.
provider
anthropic/claude
Package claude provides an Anthropic provider using Claude OAuth tokens.
Package tokencount provides a shared offline tiktoken wrapper for LLM token estimation.
