Documentation ¶
Overview ¶
Package llm defines the provider-agnostic types and the Client interface every LLM provider satisfies.
The shapes here are deliberately the lowest common denominator across providers. Provider-specific knobs live on each provider's Options struct.
See docs/v1/API.md for the rationale behind every type.
Index ¶
- Variables
- func Collect(ctx context.Context, c Client, req Request) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)
- func CollectWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)
- func DefaultStreamingRetryPolicy() retry.Policy
- func Float32(v float32) *float32
- func IsContentFilter(err error) bool
- func IsTransientServerError(err error) bool
- type CacheBreakpoint
- type CacheTTL
- type Client
- type Event
- type EventKind
- type ImagePart
- type Message
- type Request
- type Response
- type Role
- type ThinkingConfig
- type ThinkingEffort
- type ToolCall
- type ToolDef
- type Usage
Variables ¶
var (
	ErrAuth           = errors.New("aikido/llm: authentication failed")
	ErrRateLimited    = errors.New("aikido/llm: rate limited")
	ErrServerError    = errors.New("aikido/llm: provider server error")
	ErrInvalidRequest = errors.New("aikido/llm: invalid request")

	// ErrContentFiltered is wrapped when the provider's safety policy aborts a
	// generation. Detected from finish_reason="content_filter" on a streamed
	// chunk OR a structured error envelope whose code/type signals content
	// filtering. Callers should NOT retry: the same prompt will trip the
	// classifier deterministically.
	//
	// Note: providers sometimes mid-stream RST without sending a structured
	// signal. Those failures still surface as ErrServerError; callers cannot
	// distinguish silent content aborts from genuine transient flake at the
	// wire level.
	ErrContentFiltered = errors.New("aikido/llm: content filtered by provider safety policy")
)
Sentinel errors that providers wrap with %w when mapping an HTTP status to a typed cause.
Functions ¶
func Collect ¶
func Collect(ctx context.Context, c Client, req Request) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)
Collect drains a stream into a final result. Useful for non-streaming callers.
Returns text accumulated from EventTextDelta, all complete tool calls, all images surfaced by the provider, final Usage if the provider emitted one, and the first error encountered. Thinking text is not included in the returned text.
Collect respects ctx cancellation: if ctx is cancelled before the stream closes, Collect returns ctx.Err() without waiting for the producer.
func CollectWithRetry ¶ added in v0.2.2
func CollectWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (text string, calls []ToolCall, images []ImagePart, usage *Usage, err error)
CollectWithRetry wraps Collect with retry.Do using the supplied policy. On retry, the entire stream is restarted from scratch (no resume) — Collect's accumulated state is discarded between attempts. The final return values are from the last attempt.
Provider-side cost is paid per attempt: a streaming model that aborts after emitting partial tokens still bills for those tokens. Tune MaxAttempts with cost in mind.
If policy.ShouldRetry is nil, IsTransientServerError is used. If policy.MaxAttempts is < 1, it's clamped to 1 (no retry).
func DefaultStreamingRetryPolicy ¶ added in v0.2.2
func DefaultStreamingRetryPolicy() retry.Policy
DefaultStreamingRetryPolicy returns a sensible retry policy for image generation and other streaming LLM operations: 5 attempts, 2s base, 30s cap, 2x multiplier, 20% jitter, retrying only transient provider errors.
Tuned for image-gen preview models (e.g. Gemini flash-image-preview), which drop streams under upstream load at ~20% in observed traces. With 5 attempts at this rate, and assuming attempts fail independently, the effective failure rate is 0.2^5 ≈ 0.03%.
func Float32 ¶
func Float32(v float32) *float32
Float32 returns a pointer to v.
Convenience for SessionOptions.Temperature so callers can write inline values. Float32(0) returns a non-nil pointer to zero — the deterministic-zero case.
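A pointer field lets a caller distinguish "not set" from "explicitly zero". The sketch below illustrates that distinction with a hypothetical options struct standing in for SessionOptions; treating nil as "use the provider default" is an assumption made for the example.

```go
package main

import "fmt"

// Float32 mirrors the helper documented above.
func Float32(v float32) *float32 { return &v }

// options is a hypothetical stand-in for a struct with an optional temperature.
type options struct {
	Temperature *float32 // nil is assumed to mean "use the provider default"
}

// describe reports how the temperature would be interpreted.
func describe(o options) string {
	if o.Temperature == nil {
		return "provider default"
	}
	return fmt.Sprintf("temperature=%g", *o.Temperature)
}

func main() {
	fmt.Println(describe(options{}))                        // provider default
	fmt.Println(describe(options{Temperature: Float32(0)})) // temperature=0
}
```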
func IsContentFilter ¶ added in v0.2.3
func IsContentFilter(err error) bool
IsContentFilter reports whether err is a content-policy abort surfaced by the provider (finish_reason="content_filter", an explicit error envelope with a matching code/message, or any error wrapping ErrContentFiltered). Use this to short-circuit retry — the same prompt will trip the classifier again on the next attempt — and to render a "try different wording" message rather than a generic "try again later".
Note: providers sometimes mid-stream RST without sending a structured content-filter signal. Those failures surface as ErrServerError and are indistinguishable from genuine transient flake at the wire level. Callers that want stronger detection can heuristically classify "all retries failed identically" as a content-filter signal at the call-site.
func IsTransientServerError ¶ added in v0.2.2
func IsTransientServerError(err error) bool
IsTransientServerError reports whether err is a transient upstream-provider failure (typically a 5xx, mid-stream RST, or other server-side flake) that is worth retrying. Authentication errors, invalid-request errors, and rate limits are NOT classified as transient — auth and bad-request errors won't fix themselves with a retry, and rate limits should be honored explicitly (typically via a Retry-After-aware policy at the caller).
Use as the ShouldRetry predicate when retrying llm.Collect:
llm.CollectWithRetry(ctx, c, req, retry.Policy{
	MaxAttempts: 5,
	BaseDelay:   2 * time.Second,
	MaxDelay:    30 * time.Second,
	Multiplier:  2.0,
	Jitter:      0.2,
	ShouldRetry: llm.IsTransientServerError,
})
Types ¶
type CacheBreakpoint ¶
type CacheBreakpoint struct {
	TTL CacheTTL
}
CacheBreakpoint marks a message as a cache breakpoint on providers that support it. A nil pointer means no breakpoint; &CacheBreakpoint{} requests the default 5m TTL.
type Client ¶
type Client interface {
	Stream(ctx context.Context, req Request) (<-chan Event, error)
	// Complete sends a non-streaming completion request and returns the full
	// response in one shot. Implementations should map provider errors onto the
	// same Err* sentinels Stream uses (ErrAuth, ErrRateLimited, ErrServerError,
	// ErrInvalidRequest, ErrContentFiltered).
	//
	// Complete is the right choice for image generation: image-capable models
	// emit the entire base64 payload in a single chunk that frequently exceeds
	// the SSE per-line scanner cap, deterministically failing Stream-based
	// callers. Complete reads the response as a single JSON body and is
	// immune to that failure mode.
	Complete(ctx context.Context, req Request) (Response, error)
}
type Event ¶
type Event struct {
	Kind  EventKind
	Text  string
	Tool  *ToolCall
	Image *ImagePart
	Usage *Usage
	Err   error
	// FinishReason carries the provider's reported finish_reason on the chunk
	// that closes a generation. Set on EventEnd (and on the EventError that
	// stands in for finish_reason="content_filter"). Common values: "stop",
	// "tool_calls", "length", "content_filter", "error". Empty when the
	// provider didn't surface one.
	FinishReason string
}
Event is one streaming event emitted by a Client. Channel closes after EventEnd.
type ImagePart ¶
ImagePart is one image attached to a message or returned by an image-capable model. URL is set when the provider returned a remote URL or when the caller supplied a data URI / remote URL on input. Data is set when the provider returned inline bytes (decoded from a data: URI) or when the caller wants to inline an image on output. ContentType is the MIME type when known ("image/png", "image/jpeg", ...) — empty when not provided by the wire.
type Response ¶ added in v0.2.4
type Response struct {
	Text         string
	ToolCalls    []ToolCall
	Images       []ImagePart
	Usage        *Usage
	FinishReason string
}
Response is the fully-assembled result of a non-streaming completion.
Returned by Client.Complete in one shot — no event channel, no SSE framing. Use this for image generation and any other workload where progressive token delivery has no UX value: the SSE path imposes a per-line buffer cap that can't accommodate the multi-MB single-chunk responses image-capable models emit.
FinishReason mirrors the provider's reported value ("stop", "tool_calls", "length", "content_filter", ...) — empty when the provider didn't surface one.
func CompleteWithRetry ¶ added in v0.2.4
func CompleteWithRetry(ctx context.Context, c Client, req Request, policy retry.Policy) (Response, error)
CompleteWithRetry wraps Client.Complete with retry.Do using the supplied policy. Each attempt sends a fresh non-streaming request; the final Response is from the last attempt.
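Use this for image generation: a single oversized base64 payload exceeds the SSE scanner cap and trips ErrServerError on every Stream attempt. Complete reads the body as one JSON document and avoids the cap entirely; the retry wrapper is here for the standard 429/5xx-at-request-start flake. With the default predicate only the 5xx case is retried, since IsTransientServerError does not classify rate limits as transient; supply a ShouldRetry that accepts ErrRateLimited to retry 429s as well.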
If policy.ShouldRetry is nil, IsTransientServerError is used.
type ThinkingConfig ¶
type ThinkingConfig struct {
	// contains filtered or unexported fields
}
ThinkingConfig configures provider-side thinking. Use ThinkingByEffort or ThinkingByBudget — the unexported fields prevent setting both.
func ThinkingByBudget ¶
func ThinkingByBudget(n int) *ThinkingConfig
func ThinkingByEffort ¶
func ThinkingByEffort(e ThinkingEffort) *ThinkingConfig
func (*ThinkingConfig) Budget ¶
func (t *ThinkingConfig) Budget() int
Budget returns the configured token budget, 0 if effort-based.
func (*ThinkingConfig) Effort ¶
func (t *ThinkingConfig) Effort() ThinkingEffort
Effort returns the configured coarse effort, empty string if budget-based.
type ThinkingEffort ¶
type ThinkingEffort string
const (
	ThinkingEffortLow    ThinkingEffort = "low"
	ThinkingEffortMedium ThinkingEffort = "medium"
	ThinkingEffortHigh   ThinkingEffort = "high"
)
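The constructor-only design can be sketched as follows. The field names effort and budget are assumptions, since the real fields are unexported and not shown; the point is that code outside the package can only build a ThinkingConfig through one of the two constructors, so it can never set both.

```go
package main

import "fmt"

type ThinkingEffort string

const (
	ThinkingEffortLow    ThinkingEffort = "low"
	ThinkingEffortMedium ThinkingEffort = "medium"
	ThinkingEffortHigh   ThinkingEffort = "high"
)

// ThinkingConfig is a stand-in; the real field layout is not documented.
type ThinkingConfig struct {
	effort ThinkingEffort // set only by ThinkingByEffort
	budget int            // set only by ThinkingByBudget
}

// ThinkingByEffort configures thinking by coarse effort level.
func ThinkingByEffort(e ThinkingEffort) *ThinkingConfig { return &ThinkingConfig{effort: e} }

// ThinkingByBudget configures thinking by token budget.
func ThinkingByBudget(n int) *ThinkingConfig { return &ThinkingConfig{budget: n} }

// Effort returns the configured effort, empty if budget-based.
func (t *ThinkingConfig) Effort() ThinkingEffort { return t.effort }

// Budget returns the configured token budget, 0 if effort-based.
func (t *ThinkingConfig) Budget() int { return t.budget }

func main() {
	byEffort := ThinkingByEffort(ThinkingEffortHigh)
	byBudget := ThinkingByBudget(2048)
	fmt.Printf("%q %d\n", byEffort.Effort(), byEffort.Budget()) // "high" 0
	fmt.Printf("%q %d\n", byBudget.Effort(), byBudget.Budget()) // "" 2048
}
```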
Directories ¶
| Path | Synopsis |
|---|---|
| llmtest | Package llmtest provides public test helpers for the llm package. |
| openrouter | Package openrouter implements llm.Client against OpenRouter (https://openrouter.ai). |