Documentation ¶
Overview ¶
Package llmagent invokes coding-agent CLIs (claude, gemini, codex, opencode, pi, cursor) as subprocesses, streams their stream-json output, and recovers from quota and rate-limit failures.
The unit of work is an Agent: one named provider with everything needed to build its command line. Call Agent.Run with a prompt and a working directory to get the agent's stdout. The first call probes the binary; later calls reuse the probe result.
To fail over across providers, wrap a slice of agents in a Runner. The runner skips any provider that its CooldownTracker reports as cooling down and, when an agent fails, moves on to the next one.
Index ¶
- Constants
- Variables
- func Base(provider string) string
- func DefaultCommand(ctx context.Context, agent *Agent) (*exec.Cmd, error)
- func DetectQuota(output string) (detail string, ok bool)
- func FormatStatuses(statuses []CooldownStatus) string
- func IsLocal(provider string) bool
- func IsStderrNoise(line string) bool
- func Model(provider string) string
- func ParseResetDuration(detail string) time.Duration
- type Agent
- type CooldownStatus
- type CooldownTracker
- func (c *CooldownTracker) Clear(provider string)
- func (c *CooldownTracker) IsCoolingDown(provider string) bool
- func (c *CooldownTracker) Remaining(provider string) time.Duration
- func (c *CooldownTracker) Select(providers []string) []string
- func (c *CooldownTracker) Set(provider string, d time.Duration, reason string)
- func (c *CooldownTracker) Statuses(providers []string) []CooldownStatus
- type QuotaError
- type Result
- type RunOptions
- type RunResult
- type Runner
Constants ¶
const BlacklistCooldown = 20 * time.Minute
BlacklistCooldown is applied to providers that fail with a non-quota error (connection refused, auth, timeout). It is shorter than the typical quota reset because the underlying problem is usually transient.
const DefaultCooldown = 60 * time.Minute
DefaultCooldown is the cooldown applied when a quota error has no parseable reset time.
Variables ¶
var ErrIdleTimeout = errors.New("idle timeout")
ErrIdleTimeout is returned by Run when the subprocess produces no stdout for longer than Agent.IdleTimeout.
Functions ¶
func Base ¶
func Base(provider string) string
Base returns the provider name without any model suffix. "gemini:gemini-2.5-pro" → "gemini"; "claude" → "claude".
func DefaultCommand ¶
func DefaultCommand(ctx context.Context, agent *Agent) (*exec.Cmd, error)
DefaultCommand builds the *exec.Cmd that invokes agent's provider, applying its IncludeDirs, Model, TmpDir, Env, and ExtraArgs. It supports the providers cyclotron uses today: claude, gemini, codex, opencode, pi, cursor.
The returned command:
- reads its prompt from stdin (cursor reads PROMPT.md, written by Run);
- has TmpDir injected as TMPDIR/TMP/TEMP via Env;
- has any Agent.ExtraArgs appended after the built-in flags (and before a trailing positional, e.g. codex's `-` or cursor's prompt argument);
- has stdin/stdout/stderr left for the caller to wire pipes onto.
To support a new provider, write a custom NewCmd on Agent.
func DetectQuota ¶
func DetectQuota(output string) (detail string, ok bool)
DetectQuota reports whether output contains a quota or rate-limit signal, and if so returns a human-readable detail string (e.g. "resets in 8h24m6s").
func FormatStatuses ¶
func FormatStatuses(statuses []CooldownStatus) string
FormatStatuses joins a status slice into a comma-separated summary.
func IsLocal ¶
func IsLocal(provider string) bool
IsLocal reports whether the provider runs a local LLM and therefore needs a longer probe budget and a shorter blacklist cooldown than cloud agents.
func IsStderrNoise ¶
func IsStderrNoise(line string) bool
IsStderrNoise reports whether line matches a known noise pattern.
func Model ¶
func Model(provider string) string
Model returns the model suffix, or "" if none. "gemini:gemini-2.5-pro" → "gemini-2.5-pro"; "claude" → "".
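The Base and Model examples suggest a split on a colon. A minimal sketch of that behaviour, assuming the split happens at the first colon; `splitProvider` is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// splitProvider separates "name:model" into its two halves. The model is
// empty when the string contains no colon.
func splitProvider(provider string) (base, model string) {
	base, model, _ = strings.Cut(provider, ":")
	return base, model
}

func main() {
	for _, p := range []string{"gemini:gemini-2.5-pro", "claude"} {
		base, model := splitProvider(p)
		fmt.Printf("%q -> base=%q model=%q\n", p, base, model)
	}
}
```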
func ParseResetDuration ¶
func ParseResetDuration(detail string) time.Duration
ParseResetDuration parses a Go duration embedded in detail (e.g. "resets in 8h24m6s"). Returns DefaultCooldown when nothing parseable is found.
Types ¶
type Agent ¶
type Agent struct {
	// NewCmd builds the command for one invocation. Defaults to DefaultCommand.
	// Override to inject mocks in tests or to support a new provider.
	NewCmd func(ctx context.Context, agent *Agent) (*exec.Cmd, error)
	// Logger receives structured progress events. Defaults to slog.Default.
	Logger *slog.Logger
	// Provider is the agent name, optionally with a model suffix:
	// "claude", "gemini:gemini-2.5-pro", "opencode:kimi-k2".
	Provider string
	// TmpDir, when non-empty, is exported as TMPDIR/TMP/TEMP to the
	// subprocess so its scratch files land in a known location.
	TmpDir string
	// IncludeDirs are directories the agent is allowed to read or write.
	// Used by providers that take a directory allow-list (gemini, codex).
	IncludeDirs []string
	// Env is appended to the subprocess environment after TMPDIR overrides.
	// Each entry must be "KEY=VALUE".
	Env []string
	// ExtraArgs is appended to the provider command line after the built-in
	// flags but before any prompt placeholder. Use this for provider-specific
	// flags the caller wants to set without overriding NewCmd, e.g.
	// pi's `--thinking off`, `--no-tools`, `--no-skills`.
	ExtraArgs []string
	// Timeout is the wall-clock limit for one Run. Zero means no limit.
	Timeout time.Duration
	// IdleTimeout is the maximum gap between stdout lines. Zero means no
	// limit; the subprocess can stall indefinitely.
	IdleTimeout time.Duration
	// ProbeTimeout overrides the first-byte timeout used by Probe.
	// Zero picks 75s for cloud providers and 5 min for local ones.
	ProbeTimeout time.Duration
	// MaxStdoutBytes caps the captured Result.Output. Zero uses the default
	// (8 MB). A positive value uses exactly that many bytes; a negative value
	// disables the cap (use carefully: a runaway stream can OOM the host).
	//
	// Lines beyond the cap are silently dropped from the captured Output but
	// are still observed by OnEvent and the provider's protocol filter (so
	// pi RPC events still drive feedPi correctly).
	MaxStdoutBytes int
	// contains filtered or unexported fields
}
Agent is one named LLM CLI tool ready to be invoked. The zero value is not useful; set Provider at minimum.
An Agent is safe for concurrent use: it probes once, lazily, and the probe outcome is shared by every caller. Sessions belong to the Run call, not the Agent — see RunOptions.SessionID.
Field order minimises GC pointer-scan footprint; logical groupings are in the field comments.
func (*Agent) Probe ¶
Probe verifies the agent's binary exists and produces output for a trivial prompt. It is called automatically by Run on first use; calling Probe directly lets callers warm up agents in parallel.
Probe is idempotent: after one success or failure, subsequent calls are no-ops returning the same outcome.
type CooldownStatus ¶
CooldownStatus is a snapshot of one provider's cooldown.
func (CooldownStatus) String ¶
func (s CooldownStatus) String() string
type CooldownTracker ¶
type CooldownTracker struct {
	// contains filtered or unexported fields
}
CooldownTracker remembers when each provider's cooldown expires. It is safe for concurrent use.
func NewCooldownTracker ¶
func NewCooldownTracker() *CooldownTracker
NewCooldownTracker returns an empty tracker.
func (*CooldownTracker) Clear ¶
func (c *CooldownTracker) Clear(provider string)
Clear removes any active cooldown for provider.
func (*CooldownTracker) IsCoolingDown ¶
func (c *CooldownTracker) IsCoolingDown(provider string) bool
IsCoolingDown reports whether provider's cooldown has not yet expired.
func (*CooldownTracker) Remaining ¶
func (c *CooldownTracker) Remaining(provider string) time.Duration
Remaining returns time left on the cooldown, or 0 if none.
func (*CooldownTracker) Select ¶
func (c *CooldownTracker) Select(providers []string) []string
Select reorders providers so that those not in cooldown come first. If every provider is cooling down, the soonest-to-expire is returned first so callers can wait on the most-recoverable one.
func (*CooldownTracker) Set ¶
func (c *CooldownTracker) Set(provider string, d time.Duration, reason string)
Set puts provider into cooldown for d, recording reason.
func (*CooldownTracker) Statuses ¶
func (c *CooldownTracker) Statuses(providers []string) []CooldownStatus
Statuses returns a status for every provider in the input list that is currently cooling down, in input order.
type QuotaError ¶
QuotaError indicates a provider failed because of quota or rate-limit exhaustion. It is kept distinct from generic provider failures so callers can decide whether to retry, fail over, or wait for the cooldown to expire.
func (*QuotaError) Error ¶
func (e *QuotaError) Error() string
type Result ¶
type Result struct {
	Output    string        // captured stdout
	SessionID string        // last session ID parsed from the stream
	Duration  time.Duration // wall time of the invocation
}
Result is the outcome of a successful Run.
type RunOptions ¶
type RunOptions struct {
	// OnEvent is called once per non-empty stdout line. Lines are usually
	// stream-json; the caller is free to parse or ignore them.
	OnEvent func(line string)
	// OnStderr is called once per stderr line that is not provider-startup
	// noise (see IsStderrNoise).
	OnStderr func(line string)
	// Logger overrides Agent.Logger for this single Run, so per-call context
	// (e.g. a sample's sha256) attaches to llmagent's lifecycle records
	// (`llmagent invoke`, `llmagent probe ok`, idle/exit warnings). Falls back
	// to Agent.Logger, then slog.Default. Optional.
	Logger *slog.Logger
	// SessionID, if non-empty, asks the provider to resume that session.
	// Providers that don't support resumption (codex, gemini) ignore
	// this field.
	SessionID string
}
RunOptions are per-call settings.
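An OnEvent callback that treats each line as a possible JSON object might look like the sketch below. The `type` field is an assumption, since each CLI defines its own event schema, and `eventCounter` is a hypothetical helper. The sketch is not safe for concurrent calls; it assumes the callback is invoked from one goroutine per run.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// eventCounter tallies stream-json lines by their "type" field. The field
// name is an assumption about the provider's event schema.
type eventCounter struct {
	byType  map[string]int
	skipped int // lines that were not JSON objects
}

// onEvent has the same shape as RunOptions.OnEvent: func(line string).
func (c *eventCounter) onEvent(line string) {
	var ev struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal([]byte(line), &ev); err != nil {
		c.skipped++ // tolerate plain-text lines instead of failing the run
		return
	}
	c.byType[ev.Type]++
}

func main() {
	c := &eventCounter{byType: map[string]int{}}
	for _, line := range []string{
		`{"type":"assistant","text":"hi"}`,
		`plain text line`,
		`{"type":"result"}`,
	} {
		c.onEvent(line)
	}
	fmt.Println(c.byType, c.skipped)
}
```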
type Runner ¶
type Runner struct {
	// Cooldowns is consulted before every Run; pass nil to disable.
	Cooldowns *CooldownTracker
	// Logger receives structured progress events. Defaults to slog.Default.
	Logger *slog.Logger
	// OnCooldown, if non-nil, is called whenever a provider is put into
	// cooldown, which is useful for surfacing the event in a UI.
	OnCooldown func(provider string, d time.Duration, reason string)
	// Agents are the providers to try, in order.
	Agents []*Agent
	// LocalCooldown overrides the blacklist duration for local providers
	// (opencode, pi). Zero means 15 min.
	LocalCooldown time.Duration
	// contains filtered or unexported fields
}
Runner invokes a list of agents in order, failing over from one to the next when an agent errors. Quota-style errors put the agent in cooldown so sibling Runners sharing the same CooldownTracker skip it.
A Runner is safe for concurrent use and tracks its own per-provider session IDs across invocations.