Documentation ¶
Overview ¶
Package starling is an event-sourced agent runtime for Go.
Every agent run is recorded as an append-only log of events, making every execution deterministically replayable, cost-enforceable, and cryptographically auditable.
Status: pre-alpha. Public API is not yet stable.
Index ¶
- Variables
- func MetricsHandler(g prometheus.Gatherer) http.Handler
- func NewRunID(namespace string) (string, error)
- func Replay(ctx context.Context, log eventlog.EventLog, runID string, a *Agent, ...) error
- type Agent
- func (a *Agent) ReplayProviderInfo() (providerID, apiVersion, modelID string)
- func (a *Agent) Resume(ctx context.Context, runID, extraMessage string) (*RunResult, error)
- func (a *Agent) ResumeWith(ctx context.Context, runID, extraMessage string, opts ...ResumeOption) (*RunResult, error)
- func (a *Agent) Run(ctx context.Context, goal string) (*RunResult, error)
- func (a *Agent) RunOnce(ctx context.Context, prompt string) (string, error)
- func (a *Agent) RunReplay(ctx context.Context, recorded []event.Event) error
- func (a *Agent) RunReplayInto(ctx context.Context, recorded []event.Event, sink eventlog.EventLog) error
- func (a *Agent) RunStream(ctx context.Context, goal string) (string, <-chan AgentEvent, error)
- func (a *Agent) RunWithID(ctx context.Context, runID string, goal string) (*RunResult, error)
- func (a *Agent) Stream(ctx context.Context, goal string) (string, <-chan StepEvent, error)
- func (a *Agent) StreamWithID(ctx context.Context, runID string, goal string) (<-chan StepEvent, error)
- type AgentEvent
- type Budget
- type CacheStats
- type Config
- type ContractsCmd
- type DoctorCmd
- type Done
- type ExportCmd
- type InspectCmd
- type MCPCmd
- type Metrics
- type MigrateCmd
- type ProviderError
- type PruneCmd
- type ReplayCmd
- type ReplayOption
- type ResumeOption
- type RunResult
- type SchemaVersionCmd
- type StepEvent
- type TextDelta
- type ToolCallEnded
- type ToolCallRecord
- type ToolCallStarted
- type ToolError
- type ValidateCmd
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrBudgetExceeded is returned when a budget cap trips. The
	// matching BudgetExceeded event precedes the terminal RunFailed
	// in the log.
	ErrBudgetExceeded = errors.New("starling: budget exceeded")

	// ErrMaxTurnsExceeded is returned when the loop reaches Config.MaxTurns
	// without the model producing a final answer. Tool calls dispatched
	// before the cap tripped are still recorded; callers driving forced
	// single-shot tool flows can recover their output from
	// RunResult.ToolCalls regardless of this error.
	ErrMaxTurnsExceeded = errors.New("starling: max turns exceeded")

	// ErrNonDeterminism is returned by Replay when a re-emitted event
	// diverges from the recording. Wraps replay.ErrNonDeterminism.
	ErrNonDeterminism = errors.New("starling: non-determinism detected during replay")

	// ErrProviderModelMismatch is returned by Replay when the agent's
	// Provider.ID, APIVersion, or Config.Model disagree with the
	// values recorded in RunStarted. Override with WithForceProvider.
	// Aliased from replay.ErrProviderModelMismatch so callers that do
	// not import the replay package can still route on it.
	ErrProviderModelMismatch = replay.ErrProviderModelMismatch

	// ErrRunNotFound is returned by Resume when the requested runID has
	// no events in the log.
	ErrRunNotFound = errors.New("starling: run not found in log")

	// ErrRunAlreadyTerminal is returned by Resume when the run's last
	// event is a terminal kind (RunCompleted/RunFailed/RunCancelled).
	// Resuming a terminated run is not supported: the terminal event
	// commits a Merkle root over every event before it, and appending
	// past that point would invalidate the commitment.
	ErrRunAlreadyTerminal = errors.New("starling: run already terminal")

	// ErrSchemaVersionMismatch is returned by Resume when the run's
	// RunStarted event records a schema version this binary does not
	// understand.
	ErrSchemaVersionMismatch = errors.New("starling: event schema version mismatch")

	// ErrPartialToolCall is returned by Resume when the run's tail
	// contains a ToolCallScheduled event without a matching
	// ToolCallCompleted/ToolCallFailed, and WithReissueTools(false) was
	// passed. It signals that the resuming process would otherwise have
	// to re-issue a tool call of unknown idempotency.
	ErrPartialToolCall = errors.New("starling: run has a partial tool call; pass WithReissueTools(true) to reissue")

	// ErrRunInUse is returned by Resume when its first Append onto the
	// existing chain is rejected because another writer has advanced
	// the tail under us. Indicates two processes are racing to resume
	// the same run; the loser bails cleanly rather than risk chain
	// corruption.
	ErrRunInUse = errors.New("starling: run is being appended by another writer")

	// ErrLogCorrupt wraps every eventlog.Validate failure. Aliased
	// from the eventlog package so callers that never import eventlog
	// directly can still route on it.
	ErrLogCorrupt = eventlog.ErrLogCorrupt
)
Sentinel errors surfaced by Agent.Run. Tests and callers route on these with errors.Is; the agent loop converts matching errors into the appropriate terminal event (RunFailed vs RunCancelled) before propagating.
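The routing described above can be sketched with errors.Is. The sentinels below are local stand-ins so the sketch compiles without the library; runStub and classify are hypothetical helpers, not part of the package. In real code the comparisons would target starling.ErrBudgetExceeded and friends.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the starling sentinels (assumption: real callers
// compare against starling.ErrBudgetExceeded and starling.ErrMaxTurnsExceeded).
var (
	errBudgetExceeded   = errors.New("starling: budget exceeded")
	errMaxTurnsExceeded = errors.New("starling: max turns exceeded")
)

// runStub is a hypothetical stand-in for Agent.Run that fails with a
// wrapped sentinel, the way a propagated run error might look.
func runStub() error {
	return fmt.Errorf("turn 3: %w", errBudgetExceeded)
}

// classify maps a run error to a coarse, caller-defined label.
// errors.Is sees through wrapping, so the sentinel still matches.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errBudgetExceeded):
		return "budget"
	case errors.Is(err, errMaxTurnsExceeded):
		return "max-turns"
	default:
		return "failed"
	}
}

func main() {
	fmt.Println(classify(runStub())) // budget
	fmt.Println(classify(nil))       // ok
}
```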
var Version = "v0.1.0-beta.2"
Version is the semantic version of the Starling library and bundled CLI binaries. Bumped per release per the policy documented in CHANGELOG.md and the README.
Build tooling can override this via -ldflags="-X github.com/jerkeyray/starling.Version=vX.Y.Z" so dev builds report the underlying tag (or "dev" when off-tag); the value declared here is the source of truth shipped with the tagged release.
Functions ¶
func MetricsHandler ¶
func MetricsHandler(g prometheus.Gatherer) http.Handler
MetricsHandler is a convenience wrapper so users don't have to import promhttp themselves for the common case. Equivalent to promhttp.HandlerFor(g, promhttp.HandlerOpts{}).
func NewRunID ¶
func NewRunID(namespace string) (string, error)
NewRunID returns a fresh Starling run id. When namespace is non-empty the returned id is namespace + "/" + ULID. Namespace must not contain "/" because slash separates the namespace from the ULID in URLs and event-log filters.
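A minimal sketch of the namespacing rule, assuming the id half is produced elsewhere. joinRunID and splitRunID are hypothetical helpers written for illustration, and the ULID literal is only a placeholder value.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// joinRunID mirrors the documented shape: namespace + "/" + id, with an
// empty namespace leaving the bare id. Hypothetical helper, not library API.
func joinRunID(namespace, id string) (string, error) {
	if strings.Contains(namespace, "/") {
		return "", errors.New(`namespace must not contain "/"`)
	}
	if namespace == "" {
		return id, nil
	}
	return namespace + "/" + id, nil
}

// splitRunID recovers the two halves. Because the namespace cannot contain
// a slash, the first slash is always the separator.
func splitRunID(runID string) (namespace, id string) {
	if i := strings.Index(runID, "/"); i >= 0 {
		return runID[:i], runID[i+1:]
	}
	return "", runID
}

func main() {
	const placeholderULID = "01ARZ3NDEKTSV4RRFFQ69G5FAV"
	runID, err := joinRunID("acme", placeholderULID)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	ns, id := splitRunID(runID)
	fmt.Println(runID, ns, id)
}
```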
func Replay ¶
func Replay(ctx context.Context, log eventlog.EventLog, runID string, a *Agent, opts ...ReplayOption) error
Replay re-executes runID against a and verifies the reproduced event sequence matches the recorded log byte-for-byte. a must be configured identically to the original run (same Tools, same Config); Provider and Log are overridden with replay equivalents and the caller's log stays untouched.
Returns ErrNonDeterminism (wrapped) on divergence; ErrProviderModelMismatch when a's Provider.ID/APIVersion/Config.Model disagree with the recording (override with WithForceProvider); other errors (log-read, tool execution) surface verbatim.
Types ¶
type Agent ¶
type Agent struct {
// Provider is the LLM adapter. Required.
Provider provider.Provider
// Tools the agent may plan. Optional.
Tools []tool.Tool
// Log is the event log backend. Required.
Log eventlog.EventLog
// Budget enforces token/USD/wall-clock caps. Optional; zero
// values disable individual axes. Input tokens are checked
// pre-call, output tokens and USD mid-stream, wall-clock via
// context deadline.
Budget *Budget
// Config carries model / system prompt / params / MaxTurns.
Config Config
// Namespace prefixes this agent's RunIDs so multiple tenants can
// share one event log. When set, RunID = Namespace + "/" + ULID;
// must not contain "/". Empty leaves RunID as a bare ULID.
Namespace string
// Metrics is an optional Prometheus sink. Nil disables the
// pipeline at no runtime cost.
Metrics *Metrics
// contains filtered or unexported fields
}
Agent is the user-facing entry point. Fields are validated on Run, not at construction.
func (*Agent) ReplayProviderInfo ¶
func (a *Agent) ReplayProviderInfo() (providerID, apiVersion, modelID string)
ReplayProviderInfo reports the agent's provider/model identity so the replay package can compare it against the recording's RunStarted event before re-executing. Implements replay.ProviderInspector.
func (*Agent) Resume ¶
func (a *Agent) Resume(ctx context.Context, runID, extraMessage string) (*RunResult, error)
Resume continues a previously-started run from its last recorded event. If extraMessage is non-empty it is appended as a user turn before the loop resumes. A terminal event is always emitted before return.
Returns:
- ErrRunNotFound: runID not in a.Log.
- ErrRunAlreadyTerminal: last event is terminal.
- ErrSchemaVersionMismatch: RunStarted schema unknown.
- ErrPartialToolCall: unpaired ToolCallScheduled with WithReissueTools(false).
- ErrRunInUse: chain advanced between tail read and first append.
Budget: MaxWallClock and step-level token/USD caps reset at the process boundary; MaxTurns counts across the whole run.
func (*Agent) ResumeWith ¶
func (a *Agent) ResumeWith(ctx context.Context, runID, extraMessage string, opts ...ResumeOption) (*RunResult, error)
ResumeWith is Resume with options. Resume(ctx, id, msg) is equivalent to ResumeWith(ctx, id, msg).
func (*Agent) Run ¶
func (a *Agent) Run(ctx context.Context, goal string) (*RunResult, error)
Run starts a new agent run against the configured provider + tools. The returned RunResult summarizes the run; full detail is in Log.
Terminal events are always emitted before Run returns (successful completion → RunCompleted; ctx cancellation → RunCancelled; any other error → RunFailed), so the log is self-describing regardless of how the run ends.
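The mapping from outcome to terminal event can be sketched as below. This is an illustration of the documented rule, not the agent loop itself: terminalKind, runUntilCancelled, and errTool are hypothetical, and treating only context.Canceled as cancellation is an assumption of the sketch.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errTool is a hypothetical non-cancellation failure.
var errTool = errors.New("tool exploded")

// terminalKind returns the name of the terminal event a run would end with.
// Assumption: cancellation is detected via errors.Is(err, context.Canceled).
func terminalKind(err error) string {
	switch {
	case err == nil:
		return "RunCompleted"
	case errors.Is(err, context.Canceled):
		return "RunCancelled"
	default:
		return "RunFailed"
	}
}

// runUntilCancelled simulates a run whose context is cancelled and whose
// error reaches the caller wrapped.
func runUntilCancelled() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return fmt.Errorf("run aborted: %w", ctx.Err())
}

func main() {
	fmt.Println(terminalKind(nil))                 // RunCompleted
	fmt.Println(terminalKind(runUntilCancelled())) // RunCancelled
	fmt.Println(terminalKind(errTool))             // RunFailed
}
```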
func (*Agent) RunOnce ¶
func (a *Agent) RunOnce(ctx context.Context, prompt string) (string, error)
RunOnce is a no-tools, single-turn convenience: it ignores Agent.Tools, forces MaxTurns=1, runs the agent against prompt, and returns the assistant's final text. Equivalent to setting Config.MaxTurns=1, clearing Tools, calling Run, and reading RunResult.FinalText. For tool-using or multi-turn flows use Run.
func (*Agent) RunReplay ¶
func (a *Agent) RunReplay(ctx context.Context, recorded []event.Event) error
RunReplay re-executes the agent in replay mode against recorded. Intended for callers of the replay package; not part of the normal user flow. Goal, RunID, and provider streams are all reconstructed from recorded; the original Provider and Log are overridden (the Provider by a replay provider, the Log by a fresh in-memory log) so the live side is fully isolated from the recording.
Returns nil on a clean byte-matching replay. On divergence, returns an error that wraps step.ErrReplayMismatch — the replay package wraps that further into ErrNonDeterminism.
func (*Agent) RunReplayInto ¶
func (a *Agent) RunReplayInto(ctx context.Context, recorded []event.Event, sink eventlog.EventLog) error
RunReplayInto is RunReplay with a caller-supplied sink log instead of an internal in-memory one. Intended for callers (notably replay.Stream) that need to observe the byte-matching events as they are appended — subscribe to sink.Stream(...) before calling.
The sink's lifecycle is the caller's responsibility; this method does NOT close it.
func (*Agent) RunStream ¶
func (a *Agent) RunStream(ctx context.Context, goal string) (string, <-chan AgentEvent, error)
RunStream starts a new run and returns a channel of typed AgentEvents. The channel always closes after a single Done; setup errors surface synchronously.
RunStream is a thin projection of Stream — every emitted AgentEvent is derived from the same underlying StepEvent that Stream would have produced, filtered to the typed surface. Use Stream when you need the full envelope (raw event, sequence numbers, every Kind); use RunStream when you want a stable, narrow API for chat-style frontends.
func (*Agent) RunWithID ¶
func (a *Agent) RunWithID(ctx context.Context, runID string, goal string) (*RunResult, error)
RunWithID starts a new agent run using a caller-supplied runID. This is primarily for service wrappers that need to return a run id before worker execution begins. Callers must pass a fresh id; the event log rejects attempts to append over an existing run.
func (*Agent) Stream ¶
func (a *Agent) Stream(ctx context.Context, goal string) (string, <-chan StepEvent, error)
Stream starts a new run and returns a channel of StepEvents. Terminal events are always last. Setup errors are returned synchronously; run-time errors surface as a terminal StepEvent (Err populated). On ctx cancel the run is cancelled and the channel closes after draining.
func (*Agent) StreamWithID ¶
func (a *Agent) StreamWithID(ctx context.Context, runID string, goal string) (<-chan StepEvent, error)
StreamWithID starts a new run with a caller-supplied runID and returns a channel of StepEvents. It is the streaming counterpart to RunWithID, intended for services that allocate run ids before worker execution begins.
type AgentEvent ¶
type AgentEvent interface {
// contains filtered or unexported methods
}
AgentEvent is the typed event surface for RunStream. Values are one of TextDelta, ToolCallStarted, ToolCallEnded, or Done. Unknown concrete types must be tolerated by callers using a type switch with a default branch — additions are not breaking changes within a beta cycle.
AgentEvent is layered on top of the lower-level StepEvent stream returned by Stream. RunStream is the user-friendly path; Stream is the escape hatch for callers who want every event with the full envelope.
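A sketch of a tolerant consumer for a RunStream-style channel. All types here are local stand-ins: the field names on textDelta and toolCallStarted are hypothetical, and futureEvent exists only to exercise the default branch.

```go
package main

import "fmt"

// agentEvent and the concrete types below are stand-ins for the real
// AgentEvent surface; field names are assumptions made for illustration.
type agentEvent interface{ isAgentEvent() }

type textDelta struct{ Text string }
type toolCallStarted struct{ Tool string }
type toolCallEnded struct {
	Result string
	Err    error
}
type done struct {
	TerminalKind string
	Err          error
}

// futureEvent models a concrete type added in a later release.
type futureEvent struct{}

func (textDelta) isAgentEvent()       {}
func (toolCallStarted) isAgentEvent() {}
func (toolCallEnded) isAgentEvent()   {}
func (done) isAgentEvent()            {}
func (futureEvent) isAgentEvent()     {}

// render drains the channel until it closes and returns one line per
// recognised event. Unknown concrete types are skipped, not treated as errors.
func render(events <-chan agentEvent) []string {
	var out []string
	for ev := range events {
		switch e := ev.(type) {
		case textDelta:
			out = append(out, "text: "+e.Text)
		case toolCallStarted:
			out = append(out, "tool start: "+e.Tool)
		case toolCallEnded:
			if e.Err != nil {
				out = append(out, "tool error: "+e.Err.Error())
			} else {
				out = append(out, "tool ok: "+e.Result)
			}
		case done:
			out = append(out, "done: "+e.TerminalKind)
		default:
			// Tolerate types this build does not know about.
		}
	}
	return out
}

func main() {
	ch := make(chan agentEvent, 3)
	ch <- textDelta{Text: "hi"}
	ch <- futureEvent{}
	ch <- done{TerminalKind: "RunCompleted"}
	close(ch)
	for _, line := range render(ch) {
		fmt.Println(line)
	}
}
```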
type Budget ¶
Budget is re-exported from the budget package for callers that want a single import path. All four axes are enforced end-to-end: MaxInputTokens pre-call (step.LLMCall), MaxOutputTokens and MaxUSD mid-stream on every usage chunk (step.LLMCall), MaxWallClock via context.WithDeadline at the agent level. Zero on any field disables that axis.
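The wall-clock axis and the "zero disables" rule can be sketched with the standard context package. withWallClock and deadlineSet are hypothetical helpers; how the library wires the deadline internally is not shown in this documentation.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// withWallClock applies a wall-clock cap as a context deadline.
// A zero (or negative) cap disables the axis: the context gets no deadline.
func withWallClock(ctx context.Context, max time.Duration) (context.Context, context.CancelFunc) {
	if max <= 0 {
		return context.WithCancel(ctx)
	}
	return context.WithDeadline(ctx, time.Now().Add(max))
}

// deadlineSet reports whether a cap of maxSeconds yields a deadline.
func deadlineSet(maxSeconds int) bool {
	ctx, cancel := withWallClock(context.Background(), time.Duration(maxSeconds)*time.Second)
	defer cancel()
	_, ok := ctx.Deadline()
	return ok
}

func main() {
	fmt.Println(deadlineSet(0))  // false
	fmt.Println(deadlineSet(60)) // true
}
```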
type CacheStats ¶
CacheStats summarizes prompt-cache activity over the run, aggregated from per-turn AssistantMessageCompleted events. Only Anthropic and providers that surface cache token counts populate non-zero values; for others CacheStats is the zero value.
Semantics:
- ReadTokens / CreateTokens: sums of CacheReadTokens and CacheCreateTokens across every turn.
- Hits: number of turns whose CacheReadTokens was greater than 0.
- Misses: number of turns that consumed input but did not read any cached prefix (CacheReadTokens == 0 && InputTokens > 0).
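The semantics above can be sketched as a fold over per-turn usage. turnUsage and cacheStats are local stand-ins; the field types are assumptions, since the CacheStats declaration is not shown here.

```go
package main

import "fmt"

// turnUsage holds the per-turn counters the semantics refer to.
type turnUsage struct {
	InputTokens       int64
	CacheReadTokens   int64
	CacheCreateTokens int64
}

// cacheStats is a stand-in for CacheStats (field types assumed).
type cacheStats struct {
	ReadTokens   int64
	CreateTokens int64
	Hits         int
	Misses       int
}

// aggregate sums cache tokens across turns and counts hits and misses:
// a hit read cached tokens; a miss consumed input but read none.
func aggregate(turns []turnUsage) cacheStats {
	var s cacheStats
	for _, t := range turns {
		s.ReadTokens += t.CacheReadTokens
		s.CreateTokens += t.CacheCreateTokens
		switch {
		case t.CacheReadTokens > 0:
			s.Hits++
		case t.InputTokens > 0:
			s.Misses++
		}
	}
	return s
}

func main() {
	stats := aggregate([]turnUsage{
		{InputTokens: 100, CacheCreateTokens: 80}, // miss: writes the prefix
		{InputTokens: 120, CacheReadTokens: 80},   // hit: reads it back
		{},                                        // no input: neither
	})
	fmt.Printf("%+v\n", stats)
}
```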
type Config ¶
type Config struct {
// Model is the provider-specific model identifier passed through
// to every LLM call. Required in practice; the adapter will error
// if empty.
Model string
// SystemPrompt is prepended to every conversation and captured
// verbatim into RunStarted.
SystemPrompt string
// Params is the raw provider-specific parameter blob (temperature,
// top_p, max_tokens, …). Canonical CBOR so the hash in RunStarted
// is stable across runs with equivalent params.
Params cborenc.RawMessage
// MaxTurns caps how many model API calls the loop will make. The
// wrap-up call after a tool result counts as its own turn, so a
// forced single-tool flow needs MaxTurns >= 2 (turn 1 emits the
// tool_use, turn 2 lets the model respond to the tool result).
// 0 (or negative) means unlimited — not recommended.
MaxTurns int
// RequireRawResponseHash fails any turn whose ChunkEnd lacks a
// 32-byte hash.
RequireRawResponseHash bool
// AppVersion identifies the caller's application build and is
// stamped into RunStarted alongside the Starling library version.
// Optional; left blank when unset.
AppVersion string
// SessionID groups multiple runs into one application-level
// session. Stamped into RunStarted metadata as "session_id"
// when Metadata does not already supply that key. If both are
// set and disagree, Metadata wins and a warning is logged.
SessionID string
// Metadata is caller-supplied run context — e.g. action_id, pr_number,
// tenant_id — stamped verbatim into RunStarted, surfaced on RunResult,
// and rendered by the inspector. Keys and values are caller-defined;
// Starling makes no semantic interpretation. Nil or empty disables.
Metadata map[string]string
// EmitTimeout bounds each event-log Append the agent issues under
// context.WithoutCancel (terminal events, tool failures during
// cancellation). Zero disables the bound; set this when a hung
// backend must not block shutdown.
EmitTimeout time.Duration
// SkipSchemaCheck disables the pre-flight schema-version check that
// Run, Resume, and Replay run against the event log. Reserved for
// tests and tooling that intentionally point at a database older
// than the binary.
SkipSchemaCheck bool
// Logger receives structured slog records covering the run lifecycle:
// RunStarted, per-turn start, budget trips, tool retries, and the
// terminal event. Every record carries a "run_id" attribute; per-turn
// and per-tool records add "turn_id" / "call_id".
//
// The event log remains the source of truth for auditing — Logger is
// a side-channel trace for operators watching live runs. Nil is
// silent: library output is discarded. Pass slog.New(...) (or
// slog.Default()) to enable logs; level filtering is delegated to
// the supplied handler's slog.HandlerOptions.Level.
//
// Exceptions: replay divergences and dropped event-log subscribers
// are safety-critical signals and are always logged via
// slog.Default(), regardless of this field.
Logger *slog.Logger
}
Config captures the per-run knobs the user supplies on Agent. Model is required in practice, because the adapter errors when it is empty; every other field is optional with a documented default.
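The SessionID/Metadata precedence described in the field comments can be sketched as follows. stampSessionID is a hypothetical helper; the boolean it returns marks the case where the documentation says a warning is logged.

```go
package main

import "fmt"

// stampSessionID copies meta and adds "session_id" from sessionID only when
// meta does not already carry that key. When both are set and disagree,
// the Metadata value wins and conflict is true.
func stampSessionID(meta map[string]string, sessionID string) (map[string]string, bool) {
	out := make(map[string]string, len(meta)+1)
	for k, v := range meta {
		out[k] = v
	}
	conflict := false
	if sessionID != "" {
		if existing, ok := out["session_id"]; ok {
			conflict = existing != sessionID
		} else {
			out["session_id"] = sessionID
		}
	}
	return out, conflict
}

func main() {
	m, conflict := stampSessionID(map[string]string{"tenant_id": "t1"}, "s1")
	fmt.Println(m["session_id"], conflict)

	m, conflict = stampSessionID(map[string]string{"session_id": "from-meta"}, "from-config")
	fmt.Println(m["session_id"], conflict)
}
```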
type ContractsCmd ¶
ContractsCmd validates explicit run IDs against a YAML contract file.
func ContractsCommand ¶
func ContractsCommand() *ContractsCmd
ContractsCommand returns a CLI-style entrypoint for `starling contracts`.
func (*ContractsCmd) Run ¶
func (c *ContractsCmd) Run(args []string) error
Run parses args and validates each run.
args shape: <db> <contract.yml> <runID...>
type DoctorCmd ¶
DoctorCmd is the handle returned by DoctorCommand.
func DoctorCommand ¶
func DoctorCommand() *DoctorCmd
DoctorCommand returns a CLI-style entrypoint for `starling doctor`.
Doctor is a quick health check rolled into a single command: it reports the binary's Starling version, the schema version of the supplied event log (if any), validates the hash chain, and surveys well-known provider env vars. It exits 0 on success, 1 if any subcheck fails. Useful as the first thing to run when a downstream build "isn't working" — it surfaces version skew, schema drift, missing API keys, and chain corruption in one place.
Usage:
starling doctor          # env-only checks
starling doctor <db>     # env + schema/validate against db
type Done ¶
Done is always the last AgentEvent on the channel. TerminalKind is the run's terminal event kind; Err is set on RunFailed and on RunCancelled (with context.Canceled).
type ExportCmd ¶
type ExportCmd struct {
// Name is used in flag error messages and the usage string.
Name string
// Output is where NDJSON is written. Defaults to os.Stdout.
Output io.Writer
}
ExportCmd is the handle returned by ExportCommand.
func ExportCommand ¶
func ExportCommand() *ExportCmd
ExportCommand returns a CLI-style entrypoint for `starling export`. Emits one NDJSON line per event (envelope + typed payload) so the output pipes cleanly into jq. Intended to be invoked from cmd/starling; the returned *ExportCmd is safe to configure further before Run.
type InspectCmd ¶
type InspectCmd struct {
// Factory is the replay.Factory wired into inspect.WithReplayer.
// Nil disables replay (the UI is read-only).
Factory replay.Factory
// Name is the program name used in flag error messages and the
// usage string. Defaults to "inspect".
Name string
// Output is where logs and flag errors are written. Defaults to
// os.Stderr.
Output io.Writer
// Token, when non-empty, enables bearer-token auth on every
// inspector route. Clients must send
// `Authorization: Bearer <token>`.
//
// If empty, the --token flag is consulted, then the
// STARLING_INSPECT_TOKEN environment variable. If all three are
// empty, the inspector runs unauthenticated (default localhost
// posture). Callers wanting a different auth scheme (JWT, mTLS,
// IP allowlist, …) should build the inspect.Server themselves
// with inspect.WithAuth instead.
Token string
}
InspectCmd is the handle returned by InspectCommand. Fields may be customised between construction and Run; zero values are fine.
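The token lookup order described on the Token field (struct field, then --token flag, then STARLING_INSPECT_TOKEN) can be sketched as a first-non-empty search. resolveToken is a hypothetical helper, and the flag value is passed in as a plain string rather than parsed.

```go
package main

import (
	"fmt"
	"os"
)

// resolveToken returns the first non-empty candidate in precedence order.
// An empty result corresponds to the unauthenticated localhost posture.
func resolveToken(field, flagValue, envValue string) string {
	for _, v := range []string{field, flagValue, envValue} {
		if v != "" {
			return v
		}
	}
	return ""
}

func main() {
	token := resolveToken("", "", os.Getenv("STARLING_INSPECT_TOKEN"))
	if token == "" {
		fmt.Println("inspector would run unauthenticated")
		return
	}
	fmt.Println("bearer auth enabled")
}
```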
func InspectCommand ¶
func InspectCommand(factory replay.Factory) *InspectCmd
InspectCommand returns a CLI-style entrypoint for the Starling inspector. Intended for dual-mode binaries: a user's agent binary that runs the agent in one mode and serves the inspector (with replay wired up) in another, so the same Go code that produced a run can replay it.
Shape:
func main() {
if len(os.Args) > 1 && os.Args[1] == "inspect" {
cmd := starling.InspectCommand(myAgentFactory)
if err := cmd.Run(os.Args[2:]); err != nil {
log.Fatal(err)
}
return
}
// ... normal agent run ...
}
factory may be nil: the inspector runs read-only (no Replay button). When non-nil, it is invoked once per replay session to construct a fresh agent configured equivalently to the original run.
The returned *InspectCmd is safe to configure further via its exported fields before calling Run.
func (*InspectCmd) Run ¶
func (c *InspectCmd) Run(args []string) error
Run parses args, opens the log read-only, starts the inspector server, and blocks until the process receives SIGINT/SIGTERM or the server crashes. Blocking matches the expectation of a CLI subcommand; callers that need more control should use inspect.New directly.
args is the subcommand-level argument slice (e.g., os.Args[2:] after an "inspect" dispatch). It supports a minimal flag set; the defaults match cmd/starling-inspect, so the user experience is identical whether they run the standalone binary or their own dual-mode tool.
type MCPCmd ¶
type MCPCmd struct {
// Name is used in flag error messages and the usage string.
Name string
// Output is where startup banners and flag errors are written.
// stdout is reserved for the JSON-RPC protocol stream; never
// write to it from here.
Output io.Writer
}
MCPCmd is the handle returned by MCPCommand.
func MCPCommand ¶
func MCPCommand() *MCPCmd
MCPCommand returns a CLI-style entrypoint for `starling mcp`. It serves a read-only MCP server over stdio against the SQLite event log at args[0]; clients (Claude Desktop, Cursor, Claude Code, ...) spawn the binary as a subprocess and exchange JSON-RPC over the pipe.
Usage shape:
func main() {
if len(os.Args) > 1 && os.Args[1] == "mcp" {
if err := starling.MCPCommand().Run(os.Args[2:]); err != nil {
log.Fatal(err)
}
return
}
// ... normal agent run ...
}
The binary is identical to the standalone cmd/starling-mcp; both delegate here. Both are read-only by construction.
type Metrics ¶
type Metrics struct {
// contains filtered or unexported fields
}
Metrics holds every collector Starling exposes. Construct via NewMetrics against a Registerer the operator owns; assign the result to Agent.Metrics. The zero value is not usable — nil is the "disabled" sentinel.
func NewMetrics ¶
func NewMetrics(reg prometheus.Registerer) *Metrics
NewMetrics registers every collector against reg and returns a Metrics ready to attach to an Agent. Panics on duplicate registration — same posture as BearerAuth("") — so misuse surfaces at startup rather than silently discarding samples.
Pass a fresh prometheus.NewRegistry() per process to avoid accidentally colliding with globally-registered collectors. MetricsHandler(reg), or equivalently promhttp.HandlerFor(reg, promhttp.HandlerOpts{}), then exposes the scrape endpoint.
type MigrateCmd ¶
MigrateCmd is the handle returned by MigrateCommand.
func MigrateCommand ¶
func MigrateCommand() *MigrateCmd
MigrateCommand returns a CLI-style entrypoint for `starling migrate`.
func (*MigrateCmd) Run ¶
func (c *MigrateCmd) Run(args []string) error
Run applies pending migrations to the SQLite event log at args[0]. Flags:
-dry-run report pending versions without applying any DDL.
type ProviderError ¶
ProviderError wraps an error from the LLM provider (stream open failure, mid-stream error). Provider is the provider ID (e.g. "openai"); Code carries an HTTP status if the adapter surfaced one, 0 otherwise.
Aliased to provider.Error so the step package (which the root cannot import without cycles) can construct it directly. Callers that route on *starling.ProviderError continue to work via errors.As.
type PruneCmd ¶
type PruneCmd struct {
// Name is used in flag error messages and the usage string.
Name string
// Output is where the retention report is written. Defaults to
// os.Stdout.
Output io.Writer
}
PruneCmd is the handle returned by PruneCommand.
func PruneCommand ¶
func PruneCommand() *PruneCmd
PruneCommand returns a CLI-style entrypoint for `starling prune`. It deletes whole runs that are older than a retention cutoff. The command is dry-run unless --confirm is passed.
type ReplayCmd ¶
type ReplayCmd struct {
// Factory builds the agent that re-executes the recorded run. Nil
// is valid and makes Run return a dual-mode-guidance error — that
// path is what the stock `cmd/starling` binary uses.
Factory replay.Factory
// Name is used in flag error messages and the usage string.
Name string
// Output is where the text divergence report is written. Defaults
// to os.Stdout.
Output io.Writer
}
ReplayCmd is the handle returned by ReplayCommand.
func ReplayCommand ¶
ReplayCommand returns a CLI-style entrypoint for `starling replay`. Intended for dual-mode binaries that link their agent factory into the same binary that serves the runtime CLI.
Shape:
func main() {
if len(os.Args) > 1 && os.Args[1] == "replay" {
cmd := starling.ReplayCommand(myAgentFactory)
if err := cmd.Run(os.Args[2:]); err != nil {
log.Fatal(err)
}
return
}
// ... normal agent run ...
}
factory may be nil: Run then returns an error explaining the dual-mode requirement, so the stock `cmd/starling` binary fails cleanly rather than pretending replay is possible without user code.
When factory is non-nil it is invoked once per Run to construct a replay.Agent configured equivalently to the original run.
type ReplayOption ¶
ReplayOption tunes Replay behavior. See WithForceProvider.
func WithForceProvider ¶
func WithForceProvider() ReplayOption
WithForceProvider disables Replay's provider/model identity check. By default Replay refuses to run when the agent's Provider.ID/APIVersion/Config.Model differ from the values recorded in the log's RunStarted event; this catches the common "wrong agent factory" mistake before any turn executes. Pass this option only when the divergence is intentional.
type ResumeOption ¶
type ResumeOption func(*resumeConfig)
ResumeOption tunes Resume / ResumeWith behavior. See WithReissueTools.
func WithReissueTools ¶
func WithReissueTools(b bool) ResumeOption
WithReissueTools controls whether Resume re-runs tool calls that were scheduled but never completed. Defaults to true. Set false for tools that mutate external state and should fail loudly (ErrPartialToolCall) instead of silently retrying. Re-issued calls get fresh CallIDs.
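A sketch of the functional-option shape behind ResumeOption, using the declared type from this page. The reissueTools field name and the applyResumeOptions helper are assumptions; only the default of true comes from the documentation.

```go
package main

import "fmt"

// resumeConfig is a stand-in; the real struct is unexported and its
// fields are not documented. reissueTools is a hypothetical name.
type resumeConfig struct {
	reissueTools bool
}

// ResumeOption matches the declared type: a function that mutates the config.
type ResumeOption func(*resumeConfig)

// WithReissueTools returns an option that overrides the default.
func WithReissueTools(b bool) ResumeOption {
	return func(c *resumeConfig) { c.reissueTools = b }
}

// applyResumeOptions starts from the documented default (true) and applies
// options in order, so later options win.
func applyResumeOptions(opts ...ResumeOption) resumeConfig {
	cfg := resumeConfig{reissueTools: true}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	fmt.Println(applyResumeOptions().reissueTools)                        // true
	fmt.Println(applyResumeOptions(WithReissueTools(false)).reissueTools) // false
}
```

Starting from a populated default keeps a call with no options equivalent to plain Resume.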
type RunResult ¶
type RunResult struct {
RunID string
FinalText string
TurnCount int
ToolCallCount int
TotalCostUSD float64
InputTokens int64
OutputTokens int64
CacheStats CacheStats
Duration time.Duration
TerminalKind event.Kind // RunCompleted | RunFailed | RunCancelled
MerkleRoot []byte
// ToolCalls records every tool invocation the run dispatched, in
// event-log order. One record per (CallID, Attempt); a retried call
// appears multiple times.
ToolCalls []ToolCallRecord
// Metadata is the caller-supplied map stamped into RunStarted via
// Config.Metadata. Nil when the run was started without metadata.
Metadata map[string]string
// SessionID is the session group stamped into RunStarted metadata,
// when configured.
SessionID string
}
RunResult is the user-facing summary of a completed agent run. Populated from the events the Run emitted into the log — the same values are recoverable by replaying the log, so RunResult is a convenience, not a source of truth.
type SchemaVersionCmd ¶
SchemaVersionCmd is the handle returned by SchemaVersionCommand.
func SchemaVersionCommand ¶
func SchemaVersionCommand() *SchemaVersionCmd
SchemaVersionCommand returns a CLI-style entrypoint for `starling schema-version`.
func (*SchemaVersionCmd) Run ¶
func (c *SchemaVersionCmd) Run(args []string) error
Run prints the current schema version of the SQLite event log at args[0].
type StepEvent ¶
type StepEvent struct {
Kind event.Kind
TurnID string
CallID string
Text string // assistant text, reasoning content, or tool result
Tool string // for tool call events
Err error // set on Failed kinds
Raw event.Event // full envelope for consumers that want everything
}
StepEvent is the user-facing projection of one event, used by the streaming API (Agent.Stream and Agent.StreamWithID). Narrower than event.Event so consumers don't have to decode payloads themselves for common cases.
type TextDelta ¶
TextDelta carries an assistant turn's accumulated text. Emitted once per AssistantMessageCompleted, not per intra-turn chunk — chunk-level streaming is a future addition behind the same AgentEvent surface.
type ToolCallEnded ¶
ToolCallEnded reports the tool's outcome. Result is the raw JSON the tool returned (empty on failure). Err is non-nil on a failed call (KindToolCallFailed) and nil on success (KindToolCallCompleted).
type ToolCallRecord ¶
type ToolCallRecord struct {
CallID string
TurnID string
Name string
Args json.RawMessage
Result json.RawMessage
Err string
Attempt uint32
Final bool
Duration time.Duration
}
ToolCallRecord is one tool invocation as it appears in the event log. On failure Result is nil and Err is set; Final reports whether retries were exhausted.
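Because a retried call appears once per attempt, callers that want one outcome per CallID need to collapse the slice. The sketch below keeps the highest Attempt per CallID; toolCallRecord is a trimmed local stand-in, and the assumption that a higher Attempt is the later one is the sketch's own.

```go
package main

import "fmt"

// toolCallRecord carries only the fields the sketch needs; Result is a
// string here for brevity (the real field is json.RawMessage).
type toolCallRecord struct {
	CallID  string
	Attempt uint32
	Result  string
	Err     string
}

// latestPerCall keeps one record per CallID, the one with the highest
// Attempt, preserving the order in which each CallID first appeared.
func latestPerCall(records []toolCallRecord) []toolCallRecord {
	index := make(map[string]int)
	var out []toolCallRecord
	for _, r := range records {
		i, seen := index[r.CallID]
		if !seen {
			index[r.CallID] = len(out)
			out = append(out, r)
			continue
		}
		if r.Attempt >= out[i].Attempt {
			out[i] = r
		}
	}
	return out
}

func main() {
	records := []toolCallRecord{
		{CallID: "c1", Attempt: 1, Err: "timeout"},
		{CallID: "c2", Attempt: 1, Result: `"ok"`},
		{CallID: "c1", Attempt: 2, Result: "42"},
	}
	for _, r := range latestPerCall(records) {
		fmt.Printf("%s attempt=%d result=%s err=%s\n", r.CallID, r.Attempt, r.Result, r.Err)
	}
}
```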
type ToolCallStarted ¶
ToolCallStarted reports that the runtime has scheduled a tool invocation. Emitted on KindToolCallScheduled.
type ToolError ¶
ToolError wraps an error returned by a tool invocation with the tool's name and the CallID of the offending call. Used when the agent loop bails because of an unrecoverable tool failure.
type ValidateCmd ¶
type ValidateCmd struct {
// Name is used in flag error messages and the usage string.
Name string
// Output is where per-run status lines are written. Defaults to
// os.Stdout.
Output io.Writer
}
ValidateCmd is the handle returned by ValidateCommand.
func ValidateCommand ¶
func ValidateCommand() *ValidateCmd
ValidateCommand returns a CLI-style entrypoint for `starling validate`. Runs eventlog.Validate over one run (or every run in the log) and prints per-run status. Intended to be invoked from cmd/starling; the returned *ValidateCmd is safe to configure further before Run.
func (*ValidateCmd) Run ¶
func (c *ValidateCmd) Run(args []string) error
Run parses args and validates the requested run(s). Prints one line per run: "<runID>\tOK" or "<runID>\tCORRUPT: <reason>". Returns a non-nil error on I/O failure or on any validation failure, so the caller (cmd/starling) can exit non-zero.
args shape:
<db>            validate every run in the log
<db> <runID>    validate one run
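The per-run status lines are easy to consume from CI. The sketch below parses the two documented shapes, "<runID>\tOK" and "<runID>\tCORRUPT: <reason>"; parseStatus is a hypothetical helper and the sample lines are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// parseStatus splits one status line at the first tab and interprets the
// status half. Anything other than the two documented shapes is an error.
func parseStatus(line string) (runID string, ok bool, reason string, err error) {
	runID, status, found := strings.Cut(line, "\t")
	if !found {
		return "", false, "", fmt.Errorf("no tab separator in %q", line)
	}
	switch {
	case status == "OK":
		return runID, true, "", nil
	case strings.HasPrefix(status, "CORRUPT: "):
		return runID, false, strings.TrimPrefix(status, "CORRUPT: "), nil
	default:
		return "", false, "", fmt.Errorf("unrecognized status %q", status)
	}
}

func main() {
	for _, line := range []string{
		"run-1\tOK",
		"run-2\tCORRUPT: example reason",
	} {
		id, ok, reason, err := parseStatus(line)
		fmt.Println(id, ok, reason, err)
	}
}
```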
Directories ¶
| Path | Synopsis |
|---|---|
| budget | Package budget defines cost and token budgets and the enforcement logic that cancels in-flight LLM streams when a budget trips. |
| cmd | |
| cmd/starling (command) | Command starling is the runtime CLI for Starling event logs. |
| cmd/starling-inspect (command) | Command starling-inspect is a local web inspector for Starling event logs. |
| cmd/starling-mcp (command) | Command starling-mcp serves a read-only MCP server over stdio against a Starling SQLite event log. |
| contracts | Package contracts validates recorded Starling runs against simple CI-friendly assertions. |
| event | Package event defines the event types, canonical encoding, and hash-chain helpers that form the core of the Starling event log. |
| eventlog | Package eventlog defines the EventLog interface and ships three default backends: in-memory (NewInMemory), SQLite (NewSQLite), and Postgres (NewPostgres). |
| examples | |
| examples/branching (command) | Command branching demonstrates the WAL-safe SQLite fork helper. |
| examples/hello (command) | Command hello is the minimal Starling agent: build the provider from OPENAI_API_KEY, run a single-turn no-tools prompt with RunOnce, print the response. |
| examples/incident_triage (command) | |
| examples/m1_hello (command) | Command m1_hello is Starling's end-to-end demo. |
| examples/m4_inspector_demo (command) | Command m4_inspector_demo seeds a SQLite event log with a handful of synthetic runs so a developer can boot starling-inspect and look at the UI without an LLM provider key, without internet, and without running a real agent. |
| examples/manual_writes (command) | Command manual_writes shows how to write events into a Starling event log without using Agent.Run, which is useful when integrating non-LLM workflows that nonetheless want the same audit log, inspector, and validation surface. |
| examples/mcp_tools (command) | |
| examples/multi_turn (command) | Command multi_turn shows the recommended pattern for chat-style multi-message workflows: one Run per user message. |
| inspect | Package inspect implements Starling's local web inspector as a reusable library. |
| internal | |
| internal/cborenc | Package cborenc is Starling's canonical CBOR codec (RFC 8949 §4.2). |
| internal/obs | Package obs holds internal observability helpers shared across the root starling package and the step helpers. |
| mcpsrv | Package mcpsrv exposes a recorded Starling event log to AI assistants over the Model Context Protocol. |
| merkle | Package merkle provides the binary BLAKE3 Merkle-tree helpers used to commit to a run's event log. |
| provider | Package provider defines the Provider interface and the normalized stream chunk types every LLM adapter produces. |
| provider/anthropic | Package anthropic adapts the Anthropic Messages API to Starling's Provider interface. |
| provider/bedrock | Package bedrock adapts Amazon Bedrock Runtime's ConverseStream API to Starling's Provider interface. |
| provider/gemini | Package gemini adapts the Google Gemini API to Starling's Provider interface. |
| provider/openai | Package openai implements the Provider interface against the OpenAI Chat Completions API. |
| provider/openrouter | Package openrouter is a thin wrapper over provider/openai that sets OpenRouter's base URL and optional attribution headers (HTTP-Referer, X-Title). |
| replay | Package replay re-runs a recorded event log through the agent loop and verifies the reproduced events match the recording byte-for-byte. |
| starlingd | Package starlingd exposes Starling agents as a small HTTP daemon. |
| starlingtest | Package starlingtest exposes test helpers for downstream consumers of Starling: a deterministic scripted Provider, in-memory event-log seeders, and replay assertions. |
| step | Package step is the determinism boundary. |
| tool | Package tool defines the Tool interface and the Typed[In, Out] generic helper for building typed agent tools. |
| tool/builtin | Package builtin provides a small set of ready-made tools (HTTP fetch, local file read) used by the examples and suitable for simple agents. |
| tool/mcp | Package mcp adapts Model Context Protocol tools to Starling tools. |

