llm

package
v0.2.4 Latest
Published: May 7, 2026 License: MIT Imports: 6 Imported by: 0

Documentation

Overview

Package llm defines the provider-agnostic types and the Client interface every LLM provider satisfies.

The shapes here are deliberately the lowest common denominator across providers. Provider-specific knobs live on each provider's Options struct.

See docs/v1/API.md for the rationale behind every type.

Variables

var (
	ErrAuth           = errors.New("aikido/llm: authentication failed")
	ErrRateLimited    = errors.New("aikido/llm: rate limited")
	ErrServerError    = errors.New("aikido/llm: provider server error")
	ErrInvalidRequest = errors.New("aikido/llm: invalid request")

	// ErrContentFiltered is wrapped when the provider's safety policy aborts a
	// generation. Detected from finish_reason="content_filter" on a streamed
	// chunk OR a structured error envelope whose code/type signals content
	// filtering. Callers should NOT retry — the same prompt will trip the
	// classifier deterministically.
	//
	// Note: providers sometimes reset the stream mid-response (RST) without
	// sending a structured signal. Those failures still surface as
	// ErrServerError; callers cannot distinguish silent content aborts from
	// genuine transient failures at the wire level.
	ErrContentFiltered = errors.New("aikido/llm: content filtered by provider safety policy")
)

Sentinel errors that providers wrap with %w when mapping an HTTP status to a typed cause. Match them with errors.Is.
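A minimal sketch of how such wrapping and matching can look. The sentinel is redeclared locally so the example compiles on its own, and mapStatus and isRateLimited are hypothetical helpers, not part of this package.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRateLimited is a local stand-in for the package sentinel of the same
// name, redeclared only so this sketch compiles on its own.
var ErrRateLimited = errors.New("aikido/llm: rate limited")

// mapStatus is a hypothetical provider helper. It wraps the sentinel with %w
// so the HTTP detail is kept while errors.Is can still match the cause.
func mapStatus(status int) error {
	if status == 429 {
		return fmt.Errorf("provider returned HTTP %d: %w", status, ErrRateLimited)
	}
	return nil
}

// isRateLimited reports whether err wraps ErrRateLimited anywhere in its chain.
func isRateLimited(err error) bool {
	return errors.Is(err, ErrRateLimited)
}

func main() {
	err := mapStatus(429)
	fmt.Println(isRateLimited(err)) // true
	fmt.Println(err)
}
```

Comparing with == would fail here because the returned error is a wrapper; errors.Is walks the chain.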

Functions

func Collect

func Collect(ctx context.Context, c Client, req Request) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)

Collect drains a stream into a final result. Useful for non-streaming callers.

Returns text accumulated from EventTextDelta, all complete tool calls, all images surfaced by the provider, final Usage if the provider emitted one, and the first error encountered. Thinking text is not included in the returned text.

Collect respects ctx cancellation: if ctx is cancelled before the stream closes, Collect returns ctx.Err() without waiting for the producer.
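The draining behaviour described above can be sketched roughly as follows. This is not the package's implementation: the Event type is a trimmed local copy, drainText is a hypothetical name, and returning the partial text alongside an error is an assumption of this sketch.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

// Trimmed local copies of the event types, for illustration only.
type EventKind string

const (
	EventTextDelta EventKind = "text_delta"
	EventThinking  EventKind = "thinking"
	EventError     EventKind = "error"
	EventEnd       EventKind = "end"
)

type Event struct {
	Kind EventKind
	Text string
	Err  error
}

// drainText accumulates text deltas until the stream ends, an error event
// arrives, or ctx is cancelled. Thinking events are ignored.
func drainText(ctx context.Context, events <-chan Event) (string, error) {
	var sb strings.Builder
	for {
		select {
		case <-ctx.Done():
			return sb.String(), ctx.Err() // do not wait for the producer
		case ev, ok := <-events:
			if !ok {
				return sb.String(), nil
			}
			switch ev.Kind {
			case EventTextDelta:
				sb.WriteString(ev.Text)
			case EventError:
				return sb.String(), ev.Err
			case EventEnd:
				return sb.String(), nil
			}
		}
	}
}

// demo drains a small pre-filled stream.
func demo() (string, error) {
	ch := make(chan Event, 4)
	ch <- Event{Kind: EventTextDelta, Text: "Hello, "}
	ch <- Event{Kind: EventThinking, Text: "hmm"}
	ch <- Event{Kind: EventTextDelta, Text: "world"}
	ch <- Event{Kind: EventEnd}
	close(ch)
	return drainText(context.Background(), ch)
}

// demoCancelled drains a stream that never produces, under a cancelled ctx.
func demoCancelled() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := drainText(ctx, make(chan Event))
	return err
}

func main() {
	text, err := demo()
	fmt.Printf("%q %v\n", text, err)
	fmt.Println(errors.Is(demoCancelled(), context.Canceled))
}
```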

func CollectWithRetry added in v0.2.2

func CollectWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)

CollectWithRetry wraps Collect with retry.Do using the supplied policy. On retry, the entire stream is restarted from scratch (no resume) — Collect's accumulated state is discarded between attempts. The final return values are from the last attempt.

Provider-side cost is paid per attempt: a streaming model that aborts after emitting partial tokens still bills for those tokens. Tune MaxAttempts with cost in mind.

If policy.ShouldRetry is nil, IsTransientServerError is used. If policy.MaxAttempts is < 1, it's clamped to 1 (no retry).
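The restart-from-scratch semantics and the two defaults can be illustrated with a stripped-down loop. The policy struct below is a hypothetical subset of retry.Policy, errTransient stands in for a transient provider error, and backoff delays are left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// policy is a hypothetical, reduced stand-in for retry.Policy.
type policy struct {
	MaxAttempts int
	ShouldRetry func(error) bool
}

var errTransient = errors.New("transient provider failure")

// runWithRetry re-runs attempt from scratch until it succeeds, the error is
// not retryable, or MaxAttempts is reached. Only the last attempt's result
// is returned; output from earlier attempts is discarded.
func runWithRetry(p policy, attempt func() (string, error)) (text string, attempts int, err error) {
	if p.MaxAttempts < 1 {
		p.MaxAttempts = 1 // clamp: a single attempt, no retry
	}
	if p.ShouldRetry == nil {
		p.ShouldRetry = func(e error) bool { return errors.Is(e, errTransient) }
	}
	for attempts = 1; ; attempts++ {
		text, err = attempt()
		if err == nil || attempts >= p.MaxAttempts || !p.ShouldRetry(err) {
			return text, attempts, err
		}
	}
}

// flaky returns an attempt function that fails the first n calls after
// producing partial output, then succeeds.
func flaky(n int) func() (string, error) {
	calls := 0
	return func() (string, error) {
		calls++
		if calls <= n {
			return "partial", errTransient
		}
		return "ok", nil
	}
}

func main() {
	text, n, err := runWithRetry(policy{MaxAttempts: 5}, flaky(2))
	fmt.Println(text, n, err) // ok 3 <nil>
}
```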

func DefaultStreamingRetryPolicy added in v0.2.2

func DefaultStreamingRetryPolicy() retry.Policy

DefaultStreamingRetryPolicy returns a sensible retry policy for image generation and other streaming LLM operations: 5 attempts, 2s base, 30s cap, 2x multiplier, 20% jitter, retrying only transient provider errors.

Tuned for image-gen preview models (e.g. Gemini flash-image-preview), which drop streams under upstream load at a rate of ~20% in observed traces. With 5 attempts at this rate, and assuming attempts fail independently, the chance that every attempt fails is 0.2^5 ≈ 0.03%.
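The numbers above can be checked with a short calculation. The sketch assumes a plain exponential schedule (base delay multiplied after each failure, capped at the maximum, jitter ignored) and independent failures; how retry.Do actually spaces attempts is not specified here.

```go
package main

import (
	"fmt"
	"math"
)

// backoffSeconds returns the waits between attempts, in seconds, for an
// exponential schedule without jitter. With n attempts there are n-1 waits.
func backoffSeconds(attempts int, base, maxDelay, mult float64) []float64 {
	var waits []float64
	d := base
	for i := 1; i < attempts; i++ {
		waits = append(waits, math.Min(d, maxDelay))
		d *= mult
	}
	return waits
}

// allFail is the probability that every attempt fails, assuming each attempt
// fails independently with probability p.
func allFail(p float64, attempts int) float64 {
	return math.Pow(p, float64(attempts))
}

func main() {
	fmt.Println(backoffSeconds(5, 2, 30, 2))    // [2 4 8 16]
	fmt.Printf("%.3f%%\n", allFail(0.2, 5)*100) // 0.032%
}
```

Under these assumptions the worst case adds about 30 seconds of waiting before the fifth attempt.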

func Float32

func Float32(v float32) *float32

Float32 returns a pointer to v.

Convenience for SessionOptions.Temperature so callers can write inline values. Float32(0) returns a non-nil pointer to zero — the deterministic-zero case.
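A small sketch of why the pointer matters. Float32 is redeclared locally, and the options struct and describe helper are hypothetical; reading a nil pointer as "use the provider default" is an assumption of this example.

```go
package main

import "fmt"

// Float32 mirrors the helper documented above.
func Float32(v float32) *float32 { return &v }

// options is a hypothetical struct with an optional temperature.
type options struct {
	Temperature *float32
}

// describe distinguishes "unset" (nil) from an explicit zero.
func describe(o options) string {
	if o.Temperature == nil {
		return "provider default"
	}
	return fmt.Sprintf("temperature=%g", *o.Temperature)
}

func main() {
	fmt.Println(describe(options{}))                        // provider default
	fmt.Println(describe(options{Temperature: Float32(0)})) // temperature=0
}
```

With a plain float32 field, the zero value could not be told apart from an unset one.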

func IsContentFilter added in v0.2.3

func IsContentFilter(err error) bool

IsContentFilter reports whether err is a content-policy abort surfaced by the provider (finish_reason="content_filter", an explicit error envelope with a matching code/message, or any error wrapping ErrContentFiltered). Use this to short-circuit retry — the same prompt will trip the classifier again on the next attempt — and to render a "try different wording" message rather than a generic "try again later".

Note: providers sometimes reset the stream mid-response (RST) without sending a structured content-filter signal. Those failures surface as ErrServerError and are indistinguishable from genuine transient failures at the wire level. Callers that want stronger detection can heuristically treat "all retries failed identically" as a content-filter signal at the call site.
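At a call site, the distinction might drive the user-facing message like this. The sketch only covers the errors.Is path; the real IsContentFilter is described as also inspecting finish_reason and error envelopes. The sentinels are local stand-ins and userMessage and wrapped are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the package sentinels.
var (
	ErrServerError     = errors.New("aikido/llm: provider server error")
	ErrContentFiltered = errors.New("aikido/llm: content filtered by provider safety policy")
)

// userMessage picks a message based on the error's typed cause.
func userMessage(err error) string {
	switch {
	case err == nil:
		return ""
	case errors.Is(err, ErrContentFiltered):
		return "try different wording" // retrying the same prompt will not help
	case errors.Is(err, ErrServerError):
		return "try again later"
	default:
		return "request failed"
	}
}

// wrapped simulates a provider adding context around a sentinel.
func wrapped(cause error) error {
	return fmt.Errorf("openrouter: generation aborted: %w", cause)
}

func main() {
	fmt.Println(userMessage(wrapped(ErrContentFiltered)))
	fmt.Println(userMessage(wrapped(ErrServerError)))
}
```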

func IsTransientServerError added in v0.2.2

func IsTransientServerError(err error) bool

IsTransientServerError reports whether err is a transient upstream-provider failure (typically a 5xx, mid-stream RST, or other server-side flake) that is worth retrying. Authentication errors, invalid-request errors, and rate limits are NOT classified as transient — auth and bad-request errors won't fix themselves with a retry, and rate limits should be honored explicitly (typically via a Retry-After-aware policy at the caller).

Use as the ShouldRetry predicate when retrying llm.Collect:

llm.CollectWithRetry(ctx, c, req, retry.Policy{
    MaxAttempts: 5,
    BaseDelay:   2 * time.Second,
    MaxDelay:    30 * time.Second,
    Multiplier:  2.0,
    Jitter:      0.2,
    ShouldRetry: llm.IsTransientServerError,
})

Types

type CacheBreakpoint

type CacheBreakpoint struct {
	TTL CacheTTL
}

CacheBreakpoint marks a message as a cache breakpoint on providers that support it. nil = no breakpoint. &CacheBreakpoint{} = default 5m TTL.

type CacheTTL

type CacheTTL string
const (
	CacheTTL5Min  CacheTTL = "5m"
	CacheTTL1Hour CacheTTL = "1h"
)
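The nil-versus-empty convention for CacheBreakpoint can be sketched as below. The types are local copies and effectiveTTL is a hypothetical helper; whether the default is resolved client-side or left to the provider is not stated in this documentation.

```go
package main

import "fmt"

// Local copies of the documented types.
type CacheTTL string

const (
	CacheTTL5Min  CacheTTL = "5m"
	CacheTTL1Hour CacheTTL = "1h"
)

type CacheBreakpoint struct {
	TTL CacheTTL
}

// effectiveTTL reports whether a breakpoint is set and which TTL applies.
// A nil pointer means no breakpoint; an empty TTL falls back to 5 minutes.
func effectiveTTL(b *CacheBreakpoint) (ttl CacheTTL, ok bool) {
	if b == nil {
		return "", false
	}
	if b.TTL == "" {
		return CacheTTL5Min, true
	}
	return b.TTL, true
}

func main() {
	fmt.Println(effectiveTTL(nil))
	fmt.Println(effectiveTTL(&CacheBreakpoint{}))
	fmt.Println(effectiveTTL(&CacheBreakpoint{TTL: CacheTTL1Hour}))
}
```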

type Client

type Client interface {
	Stream(ctx context.Context, req Request) (<-chan Event, error)

	// Complete sends a non-streaming completion request and returns the full
	// response in one shot. Implementations should map provider errors onto the
	// same Err* sentinels Stream uses (ErrAuth, ErrRateLimited, ErrServerError,
	// ErrInvalidRequest, ErrContentFiltered).
	//
	// Complete is the right choice for image generation: image-capable models
	// emit the entire base64 payload in a single chunk that frequently exceeds
	// the SSE per-line scanner cap, deterministically failing Stream-based
	// callers. Complete reads the response as a single JSON body and is
	// immune to that failure mode.
	Complete(ctx context.Context, req Request) (Response, error)
}

type Event

type Event struct {
	Kind  EventKind
	Text  string
	Tool  *ToolCall
	Image *ImagePart
	Usage *Usage
	Err   error

	// FinishReason carries the provider's reported finish_reason on the chunk
	// that closes a generation. Set on EventEnd (and on the EventError that
	// stands in for finish_reason="content_filter"). Common values: "stop",
	// "tool_calls", "length", "content_filter", "error". Empty when the
	// provider didn't surface one.
	FinishReason string
}

Event is one streaming event emitted by a Client. The channel closes after EventEnd.

type EventKind

type EventKind string

EventKind identifies the kind of a streaming event.

const (
	EventTextDelta EventKind = "text_delta"
	EventToolCall  EventKind = "tool_call"
	EventThinking  EventKind = "thinking"
	EventImage     EventKind = "image"
	EventUsage     EventKind = "usage"
	EventError     EventKind = "error"
	EventEnd       EventKind = "end"
)

type ImagePart

type ImagePart struct {
	URL         string
	ContentType string
	Data        []byte
}

ImagePart is one image attached to a message or returned by an image-capable model. URL is set when the provider returned a remote URL, or when the caller supplied a data URI or remote URL on input. Data is set when the provider returned inline bytes (decoded from a data: URI), or when the caller inlines image bytes in an outgoing request. ContentType is the MIME type when known ("image/png", "image/jpeg", ...); it is empty when the wire format does not provide one.
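As an illustration of how inline bytes and a content type can be recovered from a data: URI, here is a minimal decoder. decodeDataURI is a hypothetical helper that handles only the base64 form; it is not the package's parser.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// decodeDataURI splits "data:<mime>;base64,<payload>" into its MIME type and
// decoded bytes. Non-base64 data URIs are rejected in this sketch.
func decodeDataURI(uri string) (contentType string, data []byte, err error) {
	const prefix = "data:"
	if !strings.HasPrefix(uri, prefix) {
		return "", nil, errors.New("not a data URI")
	}
	meta, payload, ok := strings.Cut(uri[len(prefix):], ",")
	if !ok {
		return "", nil, errors.New("data URI has no payload separator")
	}
	if !strings.HasSuffix(meta, ";base64") {
		return "", nil, errors.New("only base64 data URIs are handled")
	}
	contentType = strings.TrimSuffix(meta, ";base64")
	data, err = base64.StdEncoding.DecodeString(payload)
	return contentType, data, err
}

func main() {
	ct, data, err := decodeDataURI("data:image/png;base64,aGVsbG8=")
	fmt.Println(ct, len(data), err)
}
```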

type Message

type Message struct {
	Role       Role
	Content    string
	Images     []ImagePart
	ToolCalls  []ToolCall
	ToolCallID string
	Cache      *CacheBreakpoint
}

type Request

type Request struct {
	Model         string
	Messages      []Message
	Tools         []ToolDef
	MaxTokens     int
	Temperature   *float32
	Thinking      *ThinkingConfig
	StopSequences []string
}

type Response added in v0.2.4

type Response struct {
	Text         string
	ToolCalls    []ToolCall
	Images       []ImagePart
	Usage        *Usage
	FinishReason string
}

Response is the fully-assembled result of a non-streaming completion.

Returned by Client.Complete in one shot — no event channel, no SSE framing. Use this for image generation and any other workload where progressive token delivery has no UX value: the SSE path imposes a per-line buffer cap that can't accommodate the multi-MB single-chunk responses image-capable models emit.

FinishReason mirrors the provider's reported value ("stop", "tool_calls", "length", "content_filter", ...) — empty when the provider didn't surface one.
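The per-line cap can be reproduced with the standard library. This sketch assumes the SSE reader is built on bufio.Scanner with its default buffer, which rejects any line longer than bufio.MaxScanTokenSize (64 KiB); the provider's actual reader and cap may differ. All helper names are hypothetical.

```go
package main

import (
	"bufio"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// bigBody builds a JSON document whose single field holds n bytes of payload.
func bigBody(n int) string {
	return `{"image":"` + strings.Repeat("A", n) + `"}`
}

// lineTooLong reports whether scanning body line by line with a default
// bufio.Scanner fails with bufio.ErrTooLong.
func lineTooLong(body string) bool {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
	}
	return errors.Is(sc.Err(), bufio.ErrTooLong)
}

// decodeWhole parses body as a single JSON document and returns the payload
// length; no per-line limit applies.
func decodeWhole(body string) (int, error) {
	var doc struct {
		Image string `json:"image"`
	}
	if err := json.Unmarshal([]byte(body), &doc); err != nil {
		return 0, err
	}
	return len(doc.Image), nil
}

func main() {
	body := bigBody(200 * 1024)
	fmt.Println(lineTooLong("data: " + body + "\n")) // true
	fmt.Println(decodeWhole(body))
}
```

Raising the scanner buffer with Scanner.Buffer is another way around the cap, but reading one JSON body avoids the question entirely.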

func CompleteWithRetry added in v0.2.4

func CompleteWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (Response, error)

CompleteWithRetry wraps Client.Complete with retry.Do using the supplied policy. Each attempt sends a fresh non-streaming request; the final Response is from the last attempt.

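Use this for image generation: a single oversized base64 payload exceeds the SSE scanner cap and trips ErrServerError on every Stream attempt. Complete reads the body as one JSON document and avoids the cap entirely; the retry wrapper covers the ordinary 5xx-at-request-start flake. Note that the default predicate, IsTransientServerError, does not classify rate limits as transient, so 429 responses are retried only when policy.ShouldRetry opts in.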

If policy.ShouldRetry is nil, IsTransientServerError is used.

type Role

type Role string
const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type ThinkingConfig

type ThinkingConfig struct {
	// contains filtered or unexported fields
}

ThinkingConfig configures provider-side thinking. Use ThinkingByEffort or ThinkingByBudget — the unexported fields prevent setting both.

func ThinkingByBudget

func ThinkingByBudget(n int) *ThinkingConfig

func ThinkingByEffort

func ThinkingByEffort(e ThinkingEffort) *ThinkingConfig

func (*ThinkingConfig) Budget

func (t *ThinkingConfig) Budget() int

Budget returns the configured token budget, 0 if effort-based.

func (*ThinkingConfig) Effort

func (t *ThinkingConfig) Effort() ThinkingEffort

Effort returns the configured coarse effort, empty string if budget-based.

type ThinkingEffort

type ThinkingEffort string
const (
	ThinkingEffortLow    ThinkingEffort = "low"
	ThinkingEffortMedium ThinkingEffort = "medium"
	ThinkingEffortHigh   ThinkingEffort = "high"
)
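The "unexported fields prevent setting both" design can be sketched as follows. The field names effort and budget are assumptions, since the real fields are not shown. The protection only applies across package boundaries; inside a single package, as in this standalone example, the fields remain reachable.

```go
package main

import "fmt"

type ThinkingEffort string

const ThinkingEffortHigh ThinkingEffort = "high"

// ThinkingConfig holds exactly one of effort or budget. Because the fields
// are unexported, code in other packages must use the constructors below.
type ThinkingConfig struct {
	effort ThinkingEffort // assumed field name
	budget int            // assumed field name
}

func ThinkingByEffort(e ThinkingEffort) *ThinkingConfig { return &ThinkingConfig{effort: e} }

func ThinkingByBudget(n int) *ThinkingConfig { return &ThinkingConfig{budget: n} }

// Budget returns the token budget, or 0 if the config is effort-based.
func (t *ThinkingConfig) Budget() int { return t.budget }

// Effort returns the coarse effort, or "" if the config is budget-based.
func (t *ThinkingConfig) Effort() ThinkingEffort { return t.effort }

func main() {
	byBudget := ThinkingByBudget(2048)
	byEffort := ThinkingByEffort(ThinkingEffortHigh)
	fmt.Printf("%d %q\n", byBudget.Budget(), byBudget.Effort())
	fmt.Printf("%d %q\n", byEffort.Budget(), byEffort.Effort())
}
```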

type ToolCall

type ToolCall struct {
	ID        string
	Name      string
	Arguments string
}

type ToolDef

type ToolDef struct {
	Name        string
	Description string
	Parameters  json.RawMessage
}

ToolDef is one tool the model may call. Parameters is a JSON Schema.
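A tool definition with an inline JSON Schema might look like this. ToolDef is redeclared locally; the get_weather tool and its schema are invented for illustration, and validSchema only checks that Parameters is well-formed JSON, not that it is a valid schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolDef is a local copy of the documented struct.
type ToolDef struct {
	Name        string
	Description string
	Parameters  json.RawMessage
}

// weatherTool returns a hypothetical tool whose parameters are described by
// a small JSON Schema object.
func weatherTool() ToolDef {
	return ToolDef{
		Name:        "get_weather",
		Description: "Look up the current weather for a city.",
		Parameters: json.RawMessage(`{
			"type": "object",
			"properties": {"city": {"type": "string"}},
			"required": ["city"]
		}`),
	}
}

// validSchema reports whether Parameters is syntactically valid JSON.
func validSchema(t ToolDef) bool {
	return json.Valid(t.Parameters)
}

func main() {
	fmt.Println(validSchema(weatherTool())) // true
}
```

Keeping Parameters as json.RawMessage lets the schema pass through to the provider without being re-encoded.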

type Usage

type Usage struct {
	PromptTokens     int
	CompletionTokens int
	CacheReadTokens  int
	CacheWriteTokens int
	CostUSD          float64
}

Directories

Path Synopsis
Package llmtest provides public test helpers for the llm package.
Package openrouter implements llm.Client against OpenRouter (https://openrouter.ai).