llm

package

Version: v0.2.4 (latest) Published: May 7, 2026 License: MIT Imports: 6 Imported by: 0

Repository

github.com/mxcd/aikido

Documentation ¶

Overview ¶

Package llm defines the provider-agnostic types and the Client interface every LLM provider satisfies.

The shapes here are deliberately the lowest common denominator across providers. Provider-specific knobs live on each provider's Options struct.

See docs/v1/API.md for the rationale behind every type.

Index ¶

Variables
func Collect(ctx context.Context, c Client, req Request) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)
func CollectWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)
func DefaultStreamingRetryPolicy() retry.Policy
func Float32(v float32) *float32
func IsContentFilter(err error) bool
func IsTransientServerError(err error) bool
type CacheBreakpoint
type CacheTTL
type Client
type Event
type EventKind
type ImagePart
type Message
type Request
type Response
- func CompleteWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (Response, error)
type Role
type ThinkingConfig
- func ThinkingByBudget(n int) *ThinkingConfig
- func ThinkingByEffort(e ThinkingEffort) *ThinkingConfig
- func (t *ThinkingConfig) Budget() int
- func (t *ThinkingConfig) Effort() ThinkingEffort
type ThinkingEffort
type ToolCall
type ToolDef
type Usage

Constants ¶

This section is empty.

Variables ¶

var (
	ErrAuth           = errors.New("aikido/llm: authentication failed")
	ErrRateLimited    = errors.New("aikido/llm: rate limited")
	ErrServerError    = errors.New("aikido/llm: provider server error")
	ErrInvalidRequest = errors.New("aikido/llm: invalid request")

	// ErrContentFiltered is wrapped when the provider's safety policy aborts a
	// generation. Detected from finish_reason="content_filter" on a streamed
	// chunk OR a structured error envelope whose code/type signals content
	// filtering. Callers should NOT retry — the same prompt will trip the
	// classifier deterministically.
	//
	// Note: providers sometimes reset the connection (RST) mid-stream without
	// sending a structured signal. Those failures still surface as
	// ErrServerError; callers cannot distinguish silent content aborts from
	// genuine transient flakiness at the wire level.
	ErrContentFiltered = errors.New("aikido/llm: content filtered by provider safety policy")
)

Sentinel errors that providers wrap with %w when mapping an HTTP status to a typed cause. Match them with errors.Is rather than by direct comparison.
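The wrapping convention above can be sketched as follows. This is a minimal, self-contained illustration: the sentinels are local stand-ins for the ones in this package, and wrapStatus and advise are hypothetical helpers. The exact status-to-sentinel mapping is an assumption, since each provider decides it.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins so the sketch compiles on its own; real code would
// reference llm.ErrAuth, llm.ErrRateLimited, and so on.
var (
	ErrAuth            = errors.New("aikido/llm: authentication failed")
	ErrRateLimited     = errors.New("aikido/llm: rate limited")
	ErrServerError     = errors.New("aikido/llm: provider server error")
	ErrInvalidRequest  = errors.New("aikido/llm: invalid request")
	ErrContentFiltered = errors.New("aikido/llm: content filtered by provider safety policy")
)

// wrapStatus is a hypothetical provider-side helper. It maps an HTTP status
// to a sentinel and wraps it with %w so errors.Is still matches the cause.
// The mapping below is an assumption, not the library's documented table.
func wrapStatus(status int) error {
	switch {
	case status == 401 || status == 403:
		return fmt.Errorf("http %d: %w", status, ErrAuth)
	case status == 429:
		return fmt.Errorf("http %d: %w", status, ErrRateLimited)
	case status >= 500:
		return fmt.Errorf("http %d: %w", status, ErrServerError)
	case status >= 400:
		return fmt.Errorf("http %d: %w", status, ErrInvalidRequest)
	}
	return nil
}

// advise is a hypothetical caller-side helper that turns a wrapped error
// into a coarse next action.
func advise(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrContentFiltered):
		return "rephrase"
	case errors.Is(err, ErrServerError):
		return "retry"
	case errors.Is(err, ErrRateLimited):
		return "back off"
	default:
		return "fail"
	}
}
```

Because the cause is wrapped rather than returned bare, a direct comparison such as err == ErrServerError would be false here, while errors.Is still finds the sentinel.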

Functions ¶

func Collect ¶

func Collect(ctx context.Context, c Client, req Request) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)

Collect drains a stream into a final result. Useful for non-streaming callers.

Returns text accumulated from EventTextDelta, all complete tool calls, all images surfaced by the provider, final Usage if the provider emitted one, and the first error encountered. Thinking text is not included in the returned text.

Collect respects ctx cancellation: if ctx is cancelled before the stream closes, Collect returns ctx.Err() without waiting for the producer.
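To illustrate the draining behaviour described above, here is a reduced sketch that handles text only. Event and EventKind are local copies of a subset of this package's types, and drainText and example are hypothetical names. What the real Collect returns alongside ctx.Err() on cancellation is not specified here, so returning the partial text is an assumption.

```go
package main

import (
	"context"
	"strings"
)

// Local, reduced copies of the package's streaming types.
type EventKind string

const (
	EventTextDelta EventKind = "text_delta"
	EventThinking  EventKind = "thinking"
	EventError     EventKind = "error"
	EventEnd       EventKind = "end"
)

type Event struct {
	Kind EventKind
	Text string
	Err  error
}

// drainText accumulates EventTextDelta text, skips thinking events, and
// stops at the first error, at EventEnd, or when ctx is cancelled.
func drainText(ctx context.Context, events <-chan Event) (string, error) {
	var sb strings.Builder
	for {
		select {
		case <-ctx.Done():
			// Do not wait for the producer once the caller has given up.
			return sb.String(), ctx.Err()
		case ev, ok := <-events:
			if !ok {
				return sb.String(), nil
			}
			switch ev.Kind {
			case EventTextDelta:
				sb.WriteString(ev.Text)
			case EventError:
				return sb.String(), ev.Err
			case EventEnd:
				return sb.String(), nil
			}
		}
	}
}

// example feeds drainText a canned stream, or a cancelled context with a
// stream that never delivers anything.
func example(cancelFirst bool) (string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	events := make(chan Event, 4)
	if cancelFirst {
		cancel()
	} else {
		events <- Event{Kind: EventThinking, Text: "planning"}
		events <- Event{Kind: EventTextDelta, Text: "Hello, "}
		events <- Event{Kind: EventTextDelta, Text: "world"}
		events <- Event{Kind: EventEnd}
		close(events)
	}
	return drainText(ctx, events)
}
```

The select on ctx.Done() is what lets the consumer return promptly even when the producer is stalled.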

func CollectWithRetry ¶ added in v0.2.2

func CollectWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)

CollectWithRetry wraps Collect with retry.Do using the supplied policy. On retry, the entire stream is restarted from scratch (no resume) — Collect's accumulated state is discarded between attempts. The final return values are from the last attempt.

Provider-side cost is paid per attempt: a streaming model that aborts after emitting partial tokens still bills for those tokens. Tune MaxAttempts with cost in mind.

If policy.ShouldRetry is nil, IsTransientServerError is used. If policy.MaxAttempts is < 1, it's clamped to 1 (no retry).
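The defaulting rules above can be sketched as a small retry loop. Policy here is a local, reduced stand-in for retry.Policy (delays are left out to keep the sketch short), and doWithRetry, isTransient, and failing are hypothetical names. The predicate is an assumed approximation of IsTransientServerError, not its actual implementation.

```go
package main

import "errors"

// Local stand-ins for the package sentinels.
var (
	ErrServerError = errors.New("aikido/llm: provider server error")
	ErrAuth        = errors.New("aikido/llm: authentication failed")
)

// Policy is a reduced stand-in for retry.Policy; backoff fields are omitted.
type Policy struct {
	MaxAttempts int
	ShouldRetry func(error) bool
}

// isTransient approximates the default predicate (an assumption).
func isTransient(err error) bool { return errors.Is(err, ErrServerError) }

// doWithRetry runs attempt until it succeeds, the error is not retryable,
// or MaxAttempts is reached. It returns the last attempt's result and the
// number of attempts made.
func doWithRetry(p Policy, attempt func() (string, error)) (string, int, error) {
	if p.MaxAttempts < 1 {
		p.MaxAttempts = 1 // clamp: a single attempt, no retry
	}
	if p.ShouldRetry == nil {
		p.ShouldRetry = isTransient
	}
	for n := 1; ; n++ {
		// Each attempt starts from scratch; nothing carries over.
		out, err := attempt()
		if err == nil || n >= p.MaxAttempts || !p.ShouldRetry(err) {
			return out, n, err
		}
	}
}

// failing returns an attempt function that fails with err the first
// `failures` times it is called and then succeeds.
func failing(err error, failures int) func() (string, error) {
	calls := 0
	return func() (string, error) {
		calls++
		if calls <= failures {
			return "", err
		}
		return "ok", nil
	}
}
```

With a zero-value Policy the loop makes exactly one attempt, and a non-transient error such as an authentication failure ends the loop immediately regardless of MaxAttempts.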

func DefaultStreamingRetryPolicy ¶ added in v0.2.2

func DefaultStreamingRetryPolicy() retry.Policy

DefaultStreamingRetryPolicy returns a sensible retry policy for image generation and other streaming LLM operations: 5 attempts, 2s base, 30s cap, 2x multiplier, 20% jitter, retrying only transient provider errors.

Tuned for image-gen preview models (e.g. Gemini flash-image-preview), which drop streams under upstream load about 20% of the time in observed traces. With 5 attempts at this rate, and assuming attempts fail independently, the effective failure rate is ~0.03% (0.2^5 ≈ 0.00032).
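The numbers in this policy can be checked with a little arithmetic. The sketch below computes the un-jittered delays between attempts and the residual failure rate. It assumes a plain exponential schedule (base, then multiplied each time, capped at the maximum) and independent failures; how retry.Do actually computes delays and applies jitter is not shown on this page. All function names are hypothetical.

```go
package main

import (
	"math"
	"time"
)

// backoffSchedule returns the un-jittered waits between attempts for a
// capped exponential policy. With n attempts there are n-1 waits.
func backoffSchedule(attempts int, base, maxDelay time.Duration, multiplier float64) []time.Duration {
	var delays []time.Duration
	d := float64(base)
	for i := 1; i < attempts; i++ {
		if d > float64(maxDelay) {
			d = float64(maxDelay)
		}
		delays = append(delays, time.Duration(d))
		d *= multiplier
	}
	return delays
}

// residualFailureRate is the chance that every attempt fails, assuming
// each attempt fails independently with probability perAttempt.
func residualFailureRate(perAttempt float64, attempts int) float64 {
	return math.Pow(perAttempt, float64(attempts))
}

// defaultSchedule plugs in the documented defaults: 5 attempts, 2s base,
// 30s cap, 2x multiplier.
func defaultSchedule() []time.Duration {
	return backoffSchedule(5, 2*time.Second, 30*time.Second, 2.0)
}
```

Under these assumptions the waits are 2s, 4s, 8s, and 16s, so the 30s cap only comes into play from a sixth attempt onward, and the worst-case added latency before jitter is 30s in total.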

func Float32 ¶

func Float32(v float32) *float32

Float32 returns a pointer to v.

Convenience for SessionOptions.Temperature so callers can write inline values. Float32(0) returns a non-nil pointer to zero — the deterministic-zero case.
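A short sketch of why the pointer matters: with a plain float32, an unset temperature and an explicit zero would be indistinguishable. Float32 below mirrors the documented helper; describeTemperature is a hypothetical function, and reading nil as "use the provider default" is an assumption about how callers treat the field.

```go
package main

// Float32 mirrors the helper documented above.
func Float32(v float32) *float32 { return &v }

// describeTemperature distinguishes an unset temperature from an
// explicitly requested zero.
func describeTemperature(t *float32) string {
	switch {
	case t == nil:
		return "unset"
	case *t == 0:
		return "explicit zero"
	default:
		return "explicit value"
	}
}
```

A caller can then write Temperature: Float32(0) inline instead of declaring a temporary variable just to take its address.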

func IsContentFilter ¶ added in v0.2.3

func IsContentFilter(err error) bool

IsContentFilter reports whether err is a content-policy abort surfaced by the provider (finish_reason="content_filter", an explicit error envelope with a matching code/message, or any error wrapping ErrContentFiltered). Use this to short-circuit retry — the same prompt will trip the classifier again on the next attempt — and to render a "try different wording" message rather than a generic "try again later".

Note: providers sometimes reset the connection (RST) mid-stream without sending a structured content-filter signal. Those failures surface as ErrServerError and are indistinguishable from genuine transient flakiness at the wire level. Callers that want stronger detection can heuristically classify "all retries failed identically" as a content-filter signal at the call site.
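One way to approximate the "all retries failed identically" heuristic is sketched below. It assumes the caller runs its own attempt loop and keeps each attempt's error (CollectWithRetry only returns values from the last attempt). ErrServerError is a local stand-in, and looksLikeSilentFilter and serverErr are hypothetical helpers. Comparing error messages is a crude notion of "identical", chosen only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-in for the package sentinel.
var ErrServerError = errors.New("aikido/llm: provider server error")

// serverErr builds a wrapped server error with some detail, as a provider
// might.
func serverErr(detail string) error {
	return fmt.Errorf("%s: %w", detail, ErrServerError)
}

// looksLikeSilentFilter reports whether at least two attempts were made,
// every one failed with a server error, and all carried the same message.
// A true result is only a hint, never proof of content filtering.
func looksLikeSilentFilter(attemptErrs []error) bool {
	if len(attemptErrs) < 2 {
		return false
	}
	first := attemptErrs[0]
	if first == nil || !errors.Is(first, ErrServerError) {
		return false
	}
	for _, err := range attemptErrs[1:] {
		if err == nil || err.Error() != first.Error() {
			return false
		}
	}
	return true
}
```

A caller might use a true result to show a "try different wording" message, while accepting that a genuinely unlucky run of identical transient failures will be misclassified.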

func IsTransientServerError ¶ added in v0.2.2

func IsTransientServerError(err error) bool

IsTransientServerError reports whether err is a transient upstream-provider failure (typically a 5xx, mid-stream RST, or other server-side flake) that is worth retrying. Authentication errors, invalid-request errors, and rate limits are NOT classified as transient — auth and bad-request errors won't fix themselves with a retry, and rate limits should be honored explicitly (typically via a Retry-After-aware policy at the caller).

Use as the ShouldRetry predicate when retrying llm.Collect:

llm.CollectWithRetry(ctx, c, req, retry.Policy{
    MaxAttempts: 5,
    BaseDelay:   2 * time.Second,
    MaxDelay:    30 * time.Second,
    Multiplier:  2.0,
    Jitter:      0.2,
    ShouldRetry: llm.IsTransientServerError,
})

Types ¶

type CacheBreakpoint ¶

type CacheBreakpoint struct {
	TTL CacheTTL
}

CacheBreakpoint marks a message as a cache breakpoint on providers that support it. A nil pointer means no breakpoint; &CacheBreakpoint{} (empty TTL) selects the default 5m TTL.
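The three cases (no breakpoint, default TTL, explicit TTL) can be resolved as in the sketch below. The types are local copies of the ones in this package; effectiveTTL is a hypothetical helper showing how a provider adapter might interpret the field, not the library's own code.

```go
package main

// Local copies of the package types.
type CacheTTL string

const (
	CacheTTL5Min  CacheTTL = "5m"
	CacheTTL1Hour CacheTTL = "1h"
)

type CacheBreakpoint struct {
	TTL CacheTTL
}

// effectiveTTL reports whether a breakpoint is set and, if so, which TTL
// applies. An empty TTL falls back to the documented 5m default.
func effectiveTTL(b *CacheBreakpoint) (ttl CacheTTL, set bool) {
	if b == nil {
		return "", false
	}
	if b.TTL == "" {
		return CacheTTL5Min, true
	}
	return b.TTL, true
}
```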

type CacheTTL ¶

type CacheTTL string

const (
	CacheTTL5Min  CacheTTL = "5m"
	CacheTTL1Hour CacheTTL = "1h"
)

type Client ¶

type Client interface {
	Stream(ctx context.Context, req Request) (<-chan Event, error)

	// Complete sends a non-streaming completion request and returns the full
	// response in one shot. Implementations should map provider errors onto the
	// same Err* sentinels Stream uses (ErrAuth, ErrRateLimited, ErrServerError,
	// ErrInvalidRequest, ErrContentFiltered).
	//
	// Complete is the right choice for image generation: image-capable models
	// emit the entire base64 payload in a single chunk that frequently exceeds
	// the SSE per-line scanner cap, deterministically failing Stream-based
	// callers. Complete reads the response as a single JSON body and is
	// immune to that failure mode.
	Complete(ctx context.Context, req Request) (Response, error)
}

type Event ¶

type Event struct {
	Kind  EventKind
	Text  string
	Tool  *ToolCall
	Image *ImagePart
	Usage *Usage
	Err   error

	// FinishReason carries the provider's reported finish_reason on the chunk
	// that closes a generation. Set on EventEnd (and on the EventError that
	// stands in for finish_reason="content_filter"). Common values: "stop",
	// "tool_calls", "length", "content_filter", "error". Empty when the
	// provider didn't surface one.
	FinishReason string
}

Event is one streaming event emitted by a Client. The channel closes after EventEnd.

type EventKind ¶

type EventKind string

EventKind identifies the kind of a streaming event.

const (
	EventTextDelta EventKind = "text_delta"
	EventToolCall  EventKind = "tool_call"
	EventThinking  EventKind = "thinking"
	EventImage     EventKind = "image"
	EventUsage     EventKind = "usage"
	EventError     EventKind = "error"
	EventEnd       EventKind = "end"
)

type ImagePart ¶

type ImagePart struct {
	URL         string
	ContentType string
	Data        []byte
}

ImagePart is one image attached to a message or returned by an image-capable model. URL is set when the provider returned a remote URL or when the caller supplied a data URI / remote URL on input. Data is set when the provider returned inline bytes (decoded from a data: URI) or when the caller wants to inline an image on output. ContentType is the MIME type when known ("image/png", "image/jpeg", ...) — empty when not provided by the wire.
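The URL/Data split described above can be illustrated with a small decoder for base64 data: URIs. ImagePart is a local copy of the type; fromURI is a hypothetical helper, and this is only a sketch of how a provider adapter might populate the fields. It handles just the common data:<mime>;base64,<payload> form.

```go
package main

import (
	"encoding/base64"
	"errors"
	"strings"
)

// Local copy of the package type.
type ImagePart struct {
	URL         string
	ContentType string
	Data        []byte
}

// fromURI keeps remote URLs as-is in URL and decodes base64 data: URIs
// into Data and ContentType.
func fromURI(uri string) (ImagePart, error) {
	if !strings.HasPrefix(uri, "data:") {
		return ImagePart{URL: uri}, nil
	}
	meta, payload, found := strings.Cut(strings.TrimPrefix(uri, "data:"), ",")
	if !found {
		return ImagePart{}, errors.New("malformed data URI: missing comma")
	}
	if !strings.HasSuffix(meta, ";base64") {
		return ImagePart{}, errors.New("unsupported data URI: expected a base64 payload")
	}
	data, err := base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return ImagePart{}, err
	}
	return ImagePart{
		ContentType: strings.TrimSuffix(meta, ";base64"),
		Data:        data,
	}, nil
}
```

For a remote URL the ContentType stays empty, which matches the "empty when not provided by the wire" case.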

type Message ¶

type Message struct {
	Role       Role
	Content    string
	Images     []ImagePart
	ToolCalls  []ToolCall
	ToolCallID string
	Cache      *CacheBreakpoint
}

type Request ¶

type Request struct {
	Model         string
	Messages      []Message
	Tools         []ToolDef
	MaxTokens     int
	Temperature   *float32
	Thinking      *ThinkingConfig
	StopSequences []string
}

type Response ¶ added in v0.2.4

type Response struct {
	Text         string
	ToolCalls    []ToolCall
	Images       []ImagePart
	Usage        *Usage
	FinishReason string
}

Response is the fully-assembled result of a non-streaming completion.

Returned by Client.Complete in one shot — no event channel, no SSE framing. Use this for image generation and any other workload where progressive token delivery has no UX value: the SSE path imposes a per-line buffer cap that can't accommodate the multi-MB single-chunk responses image-capable models emit.

FinishReason mirrors the provider's reported value ("stop", "tool_calls", "length", "content_filter", ...) — empty when the provider didn't surface one.

func CompleteWithRetry ¶ added in v0.2.4

func CompleteWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (Response, error)

CompleteWithRetry wraps Client.Complete with retry.Do using the supplied policy. Each attempt sends a fresh non-streaming request; the final Response is from the last attempt.

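Use this for image generation: a single oversized base64 payload exceeds the SSE scanner cap and trips ErrServerError on every Stream attempt. Complete reads the body as one JSON document and avoids the cap entirely; the retry wrapper is here for the standard 5xx-at-request-start flake. Note that the default predicate, IsTransientServerError, does not classify rate limits as transient, so retrying 429s requires a policy whose ShouldRetry opts in.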

If policy.ShouldRetry is nil, IsTransientServerError is used.

type Role ¶

type Role string

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type ThinkingConfig ¶

type ThinkingConfig struct {
	// contains filtered or unexported fields
}

ThinkingConfig configures provider-side thinking. Use ThinkingByEffort or ThinkingByBudget — the unexported fields prevent setting both.

func ThinkingByBudget ¶

func ThinkingByBudget(n int) *ThinkingConfig

func ThinkingByEffort ¶

func ThinkingByEffort(e ThinkingEffort) *ThinkingConfig

func (*ThinkingConfig) Budget ¶

func (t *ThinkingConfig) Budget() int

Budget returns the configured token budget, or 0 if the config is effort-based.

func (*ThinkingConfig) Effort ¶

func (t *ThinkingConfig) Effort() ThinkingEffort

Effort returns the configured coarse effort, or the empty string if the config is budget-based.
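The "unexported fields prevent setting both" design can be sketched as below. This is a re-creation for illustration only: the field names, the nil-receiver handling, and the constructor bodies are assumptions, since the actual fields are unexported and not shown on this page.

```go
package main

type ThinkingEffort string

const (
	ThinkingEffortLow    ThinkingEffort = "low"
	ThinkingEffortMedium ThinkingEffort = "medium"
	ThinkingEffortHigh   ThinkingEffort = "high"
)

// ThinkingConfig holds either an effort or a budget. Because both fields
// are unexported, code outside the package can only build a value through
// one of the two constructors, so the two modes never mix.
type ThinkingConfig struct {
	effort ThinkingEffort // hypothetical field name
	budget int            // hypothetical field name
}

func ThinkingByBudget(n int) *ThinkingConfig { return &ThinkingConfig{budget: n} }

func ThinkingByEffort(e ThinkingEffort) *ThinkingConfig { return &ThinkingConfig{effort: e} }

// Budget returns the token budget, or 0 for an effort-based (or nil) config.
func (t *ThinkingConfig) Budget() int {
	if t == nil {
		return 0
	}
	return t.budget
}

// Effort returns the effort, or "" for a budget-based (or nil) config.
func (t *ThinkingConfig) Effort() ThinkingEffort {
	if t == nil {
		return ""
	}
	return t.effort
}
```

A provider adapter can then branch on which accessor returns a non-zero value without worrying about conflicting settings.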

type ThinkingEffort ¶

type ThinkingEffort string

const (
	ThinkingEffortLow    ThinkingEffort = "low"
	ThinkingEffortMedium ThinkingEffort = "medium"
	ThinkingEffortHigh   ThinkingEffort = "high"
)

type ToolCall ¶

type ToolCall struct {
	ID        string
	Name      string
	Arguments string
}

type ToolDef ¶

type ToolDef struct {
	Name        string
	Description string
	Parameters  json.RawMessage
}

ToolDef is one tool the model may call. Parameters is a JSON Schema.
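As an illustration of a ToolDef whose Parameters field carries a JSON Schema, consider the sketch below. ToolDef is a local copy of the type; the get_weather tool, its schema, and the weatherTool and schemaIsValidJSON helpers are made up for the example.

```go
package main

import "encoding/json"

// Local copy of the package type.
type ToolDef struct {
	Name        string
	Description string
	Parameters  json.RawMessage
}

// weatherTool returns a hypothetical tool definition. The schema is kept
// as raw JSON so it reaches the provider byte-for-byte.
func weatherTool() ToolDef {
	return ToolDef{
		Name:        "get_weather",
		Description: "Look up the current weather for a city.",
		Parameters: json.RawMessage(`{
			"type": "object",
			"properties": {
				"city": {"type": "string", "description": "City name"}
			},
			"required": ["city"]
		}`),
	}
}

// schemaIsValidJSON is a cheap sanity check before sending a request. It
// verifies JSON syntax only, not conformance to JSON Schema.
func schemaIsValidJSON(t ToolDef) bool {
	return json.Valid(t.Parameters)
}
```

Since json.RawMessage is stored verbatim, a syntax check like this catches typos in hand-written schemas early, before the provider rejects the request.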

type Usage ¶

type Usage struct {
	PromptTokens     int
	CompletionTokens int
	CacheReadTokens  int
	CacheWriteTokens int
	CostUSD          float64
}

Directories ¶

Path	Synopsis
llmtest	Package llmtest provides public test helpers for the llm package.
openrouter	Package openrouter implements llm.Client against OpenRouter (https://openrouter.ai).