starling

v0.1.0-beta.2 · Published: May 7, 2026 · License: Apache-2.0

Repository

github.com/jerkeyray/starling

README

Event-sourced agent runtime for Go.

Replayable runs · Tamper-evident logs · Provider-neutral tools · Production debugging

Quickstart · Docs

Every run is an event log.

That's the whole pitch. Starling treats the agent loop as a stream of typed, append-only events - every prompt, model chunk, tool call, budget decision, and terminal state - committed to a BLAKE3 hash chain with a Merkle root over the whole run. The log is the source of truth; RunResult is just a convenience derived from it.

That single decision is what gives you everything else:

Replay is reading the log back through the same agent wiring and byte-comparing each re-emitted event. Divergence is a structured error pointing at the first event that didn't reproduce.
Resume is appending to a chain that didn't reach a terminal event. The hash chain enforces "nothing was lost in the gap."
Audit is the Merkle root on the terminal event committing to every leaf - tampering with any earlier event invalidates the commitment.
Cost control, observability, the inspector, replay tests - all of them are projections of the same event stream.
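
To make the tamper-evidence claim concrete, here is a minimal sketch of a hash chain. It is not Starling's implementation: SHA-256 from the standard library stands in for BLAKE3, and the entry type and helper names (hashEntry, appendEntry, firstBroken) are hypothetical. The only point is that each entry commits to its predecessor, so editing any earlier entry is detectable.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// entry is a hypothetical stand-in for one logged event.
type entry struct {
	Payload  []byte
	PrevHash [32]byte
	Hash     [32]byte
}

// hashEntry commits to the previous hash and the payload.
// SHA-256 stands in for BLAKE3 in this sketch.
func hashEntry(prev [32]byte, payload []byte) [32]byte {
	h := sha256.New()
	h.Write(prev[:])
	h.Write(payload)
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// appendEntry links a new payload onto the tail of the chain.
func appendEntry(chain []entry, payload []byte) []entry {
	var prev [32]byte // zero hash for the first entry
	if len(chain) > 0 {
		prev = chain[len(chain)-1].Hash
	}
	return append(chain, entry{Payload: payload, PrevHash: prev, Hash: hashEntry(prev, payload)})
}

// firstBroken returns the index of the first entry whose links do not
// verify, or -1 if the whole chain is intact.
func firstBroken(chain []entry) int {
	var prev [32]byte
	for i, e := range chain {
		if e.PrevHash != prev || e.Hash != hashEntry(prev, e.Payload) {
			return i
		}
		prev = e.Hash
	}
	return -1
}

// buildDemoChain returns a three-entry chain with made-up payloads.
func buildDemoChain() []entry {
	var chain []entry
	for _, p := range []string{"run_started", "tool_call", "run_completed"} {
		chain = appendEntry(chain, []byte(p))
	}
	return chain
}

func main() {
	chain := buildDemoChain()
	fmt.Println(firstBroken(chain)) // -1: intact

	chain[1].Payload = []byte("tool_call (edited)")
	fmt.Println(firstBroken(chain)) // 1: the edited entry no longer matches its hash
}
```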

If you've worked with event sourcing before, this should sound familiar. If you've shipped LLM agents before, you know what it costs to not have this.

What's included

Event-sourced execution: every meaningful runtime action is an event.
Deterministic replay: recorded runs can be replayed without calling the model or re-running recorded side effects.
Durable event logs: in-memory, SQLite, and Postgres backends with schema migration and validation helpers.
Provider adapters: OpenAI-compatible APIs, Anthropic, Gemini, Amazon Bedrock, and OpenRouter.
MCP tools: stdio subprocess and streamable HTTP clients backed by the official Go MCP SDK.
Tool safety: retries, transient error classification, typed tool errors, max MCP output caps, and replay-safe side effects.
Hermetic tests: starlingtest ships a scripted provider and replay assertions so agent tests run without an LLM.
Inspector: dependency-free browser UI for exploring runs and replay divergence.
HTTP daemon helper: starlingd lets your own agent binary accept runs over HTTP, stream SSE updates, expose metrics, and mount the inspector.
Observability: metrics wrappers, OpenTelemetry-friendly examples, and opt-in structured slog output (silent by default; set Config.Logger = slog.New(...) to enable).

Install

go get github.com/jerkeyray/starling@v0.1.0-beta.2

Starling is a single Go module. The provider sub-packages (provider/anthropic, provider/openai, provider/gemini, provider/bedrock, provider/openrouter) come along with this go get; no separate install is needed.

Pin a tag rather than tracking main - Starling is in beta and breaking changes are permitted between beta cuts. See Release policy and CHANGELOG.md.

Documentation

docs/getting-started.md - install, your first agent, tools, durable storage, replay.
docs/mental-model.md - what a Run is, when it terminates, when to use one Run versus many, what replay actually checks.
docs/faq.md - quick answers to recurring questions.
Cookbook: branching, manual writes, multi-turn.
Reference: events, step primitives, cost model, tools, replay, contracts, metrics, starlingd, save file, MCP server.

docs/README.md is the full index.

Quickstart

Single-turn, no tools — the most common shape:

package main

import (
	"context"
	"fmt"
	"os"

	starling "github.com/jerkeyray/starling"
	"github.com/jerkeyray/starling/eventlog"
	"github.com/jerkeyray/starling/provider/openai"
)

func main() {
	prov, err := openai.New(openai.WithAPIKey(os.Getenv("OPENAI_API_KEY")))
	if err != nil {
		panic(err)
	}

	log := eventlog.NewInMemory()
	a := &starling.Agent{
		Provider: prov,
		Log:      log,
		Config:   starling.Config{Model: "gpt-4o-mini"},
	}

	text, err := a.RunOnce(context.Background(), "Give me a three bullet incident summary.")
	if err != nil {
		panic(err)
	}

	fmt.Println(text)
}

RunOnce ignores any tools on the agent and caps the loop at one turn — ideal for prompt-in/text-out use cases. For tool-using or multi-turn flows, call Agent.Run (see examples/incident_triage). Note that MaxTurns counts every model call, so a forced single-tool flow needs MaxTurns >= 2 (turn 1 emits the tool_use, turn 2 lets the model respond to the tool result).
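
The MaxTurns arithmetic is easy to get wrong, so here is a toy loop that counts model calls the way the paragraph above describes. It is a sketch under assumptions: the turn type, runLoop, and errMaxTurns are hypothetical and do not reflect the real agent loop. It only shows why a single forced tool call needs a cap of at least 2.

```go
package main

import (
	"errors"
	"fmt"
)

var errMaxTurns = errors.New("max turns exceeded") // hypothetical stand-in

// turn is a hypothetical scripted model response: either a tool request
// or a final answer.
type turn struct {
	wantsTool bool
	text      string
}

// runLoop makes one "model call" per iteration and stops at the first
// final answer. maxTurns <= 0 means unlimited.
func runLoop(script []turn, maxTurns int) (string, int, error) {
	calls := 0
	for _, t := range script {
		if maxTurns > 0 && calls >= maxTurns {
			return "", calls, errMaxTurns
		}
		calls++ // every model call counts, including the post-tool wrap-up
		if !t.wantsTool {
			return t.text, calls, nil
		}
		// A real loop would execute the tool here and feed the result back.
	}
	return "", calls, errors.New("script ended without a final answer")
}

func main() {
	script := []turn{{wantsTool: true}, {text: "ticket summarized"}}

	_, calls, err := runLoop(script, 1)
	fmt.Println(calls, err) // 1 max turns exceeded

	text, calls, err := runLoop(script, 2)
	fmt.Println(text, calls, err) // ticket summarized 2 <nil>
}
```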

Core Model

Agent.Run
  -> provider.Stream
  -> tool execution
  -> budget checks
  -> append-only event log
  -> replay / inspect / resume

Starling treats the event log as the source of truth. The runtime records model requests, streaming chunks, tool calls, usage, budget decisions, terminal states, and replay metadata as structured events. Backends validate event ordering, schema versions, and hash continuity.

Durable Logs

Use SQLite or Postgres when runs must survive process restarts or be inspected later.

log, err := eventlog.NewSQLite("starling.db")
if err != nil {
	panic(err)
}
defer log.Close()

Durable backends support schema preflight checks, migrations, validation, and read-only inspection workflows.

Replay And Resume

Replay a recorded run against the same agent wiring:

if err := starling.Replay(ctx, log, runID, a); err != nil {
	if errors.Is(err, starling.ErrNonDeterminism) {
		// Inspect the log for the first diverging event.
	}
	panic(err)
}
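
Conceptually, the check behind ErrNonDeterminism is a positional comparison of encoded events. The sketch below illustrates that idea only; firstDivergence is a hypothetical helper, and the real replay package works on typed events and returns a structured error rather than an index.

```go
package main

import (
	"bytes"
	"fmt"
)

// firstDivergence compares recorded and re-emitted events (as encoded
// bytes) position by position. It returns the index of the first
// mismatch, or -1 when both sequences are identical.
func firstDivergence(recorded, replayed [][]byte) int {
	n := len(recorded)
	if len(replayed) < n {
		n = len(replayed)
	}
	for i := 0; i < n; i++ {
		if !bytes.Equal(recorded[i], replayed[i]) {
			return i
		}
	}
	if len(recorded) != len(replayed) {
		return n // one side ended early
	}
	return -1
}

func main() {
	recorded := [][]byte{[]byte("run_started"), []byte("tool:fetch"), []byte("run_completed")}
	same := [][]byte{[]byte("run_started"), []byte("tool:fetch"), []byte("run_completed")}
	drift := [][]byte{[]byte("run_started"), []byte("tool:search"), []byte("run_completed")}

	fmt.Println(firstDivergence(recorded, same))  // -1
	fmt.Println(firstDivergence(recorded, drift)) // 1
}
```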

Resume continues from a persisted run while preserving call correlation and budget accounting.

next, err := a.Resume(ctx, runID, "Continue with remediation steps.")

The starlingtest package wires the same machinery into Go tests without touching a real model:

p := &starlingtest.ScriptedProvider{Scripts: scripts}
a := &starling.Agent{Provider: p, Log: eventlog.NewInMemory(), Config: cfg}
res, err := a.Run(ctx, "...")
if err != nil {
	t.Fatal(err)
}
p.Reset()
starlingtest.AssertReplayMatches(t, a.Log, res.RunID, a)

Providers

Provider	Package	Notes
OpenAI-compatible	`provider/openai`	OpenAI, Groq, Together, Ollama, vLLM, LM Studio, Azure OpenAI, and compatible APIs via custom `BaseURL`.
Anthropic	`provider/anthropic`	Messages API support, tool use, thinking/signatures, and prompt caching metadata.
Gemini	`provider/gemini`	Native Gemini adapter for Google models.
Amazon Bedrock	`provider/bedrock`	Native Bedrock ConverseStream adapter with AWS SDK auth, tool use, reasoning, and cache-aware usage.
OpenRouter	`provider/openrouter`	OpenRouter-specific convenience wrapper over the OpenAI-compatible path.

Provider behavior is covered by a conformance suite so adapters share the same streaming, usage, tool-call, and error contracts.

MCP Tools

Starling can expose remote MCP tools as regular tool.Tool values.

client, err := toolmcp.NewCommand(ctx,
	exec.Command("uvx", "mcp-server-filesystem", "/tmp"),
	toolmcp.WithIncludeTools("read_file", "list_directory"),
	toolmcp.WithMaxOutputBytes(64<<10),
)
if err != nil {
	panic(err)
}
defer client.Close()

tools, err := client.Tools(ctx)
if err != nil {
	panic(err)
}

a := &starling.Agent{
	Provider: prov,
	Log:      log,
	Tools:    tools,
	Config:   starling.Config{Model: "gpt-4o-mini", MaxTurns: 8},
}

Supported transports:

toolmcp.NewCommand(ctx, cmd, opts...) for stdio subprocess servers.
toolmcp.NewHTTP(ctx, endpoint, httpClient, opts...) for streamable HTTP servers.
toolmcp.New(ctx, transport, opts...) for custom transports.

MCP tool calls are wrapped in step.SideEffect, so replay uses the recorded result instead of contacting the remote MCP server again. Starling currently supports MCP tools; resources, prompts, and sampling are intentionally deferred.
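
The record-then-replay behaviour can be pictured with a small stand-in. This is a sketch, not step.SideEffect itself: the sideEffects type, its do method, and the call ID scheme are hypothetical. It shows the property that matters here, which is that the wrapped function runs once during the live run and never during replay.

```go
package main

import (
	"errors"
	"fmt"
)

// sideEffects is a hypothetical recorder: in live mode it runs the
// function and stores the result; in replay mode it returns the stored
// result and never runs the function.
type sideEffects struct {
	replaying bool
	recorded  map[string]string
}

func (s *sideEffects) do(callID string, fn func() (string, error)) (string, error) {
	if s.replaying {
		out, ok := s.recorded[callID]
		if !ok {
			return "", errors.New("no recorded result for " + callID)
		}
		return out, nil
	}
	out, err := fn()
	if err != nil {
		return "", err
	}
	s.recorded[callID] = out
	return out, nil
}

func main() {
	remoteCalls := 0
	readFile := func() (string, error) {
		remoteCalls++ // stands in for contacting the remote MCP server
		return "file contents", nil
	}

	s := &sideEffects{recorded: map[string]string{}}
	live, _ := s.do("call-1", readFile)

	s.replaying = true
	replayed, _ := s.do("call-1", readFile)

	fmt.Println(live == replayed, remoteCalls) // true 1
}
```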

Budgets And Retries

Budgets can cap input tokens, output tokens, USD cost, and wall-clock runtime.

a := &starling.Agent{
	Provider: prov,
	Log:      log,
	Budget: &starling.Budget{
		MaxInputTokens:  20_000,
		MaxOutputTokens: 4_000,
		MaxUSD:          0.50,
		MaxWallClock:    30 * time.Second,
	},
	Config: starling.Config{Model: "gpt-4o-mini", MaxTurns: 8},
}

Tool retries are explicit and replay-aware:

out, err := step.CallTool(ctx, step.ToolCall{
	CallID:      "fetch-ticket",
	TurnID:      turnID,
	Name:        "fetch_ticket",
	Args:        args,
	Idempotent:  true,
	MaxAttempts: 3,
})
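
For intuition about what Idempotent and MaxAttempts imply, here is a bounded-retry sketch. Everything in it is an assumption made for illustration: callWithRetry, errTransient, and the rule "retry only idempotent calls on transient errors" are hypothetical, and a production loop would normally add backoff between attempts. It does not model the replay-aware part.

```go
package main

import (
	"errors"
	"fmt"
)

var errTransient = errors.New("transient failure") // hypothetical marker error

// callWithRetry retries fn only when the call is idempotent and the
// error is classified as transient, up to maxAttempts total attempts.
// It returns the output, the number of attempts made, and the last error.
func callWithRetry(idempotent bool, maxAttempts int, fn func() (string, error)) (string, int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		out, err := fn()
		if err == nil {
			return out, attempt, nil
		}
		lastErr = err
		if !idempotent || !errors.Is(err, errTransient) {
			return "", attempt, err // not safe, or not worth, retrying
		}
	}
	return "", maxAttempts, lastErr
}

// flaky returns a function that fails transiently the given number of
// times and then succeeds.
func flaky(failures int) func() (string, error) {
	return func() (string, error) {
		if failures > 0 {
			failures--
			return "", fmt.Errorf("fetch_ticket: %w", errTransient)
		}
		return "ticket body", nil
	}
}

func main() {
	out, attempts, err := callWithRetry(true, 3, flaky(2))
	fmt.Println(out, attempts, err) // ticket body 3 <nil>

	_, attempts, err = callWithRetry(false, 3, flaky(2))
	fmt.Println(attempts, err) // 1 fetch_ticket: transient failure
}
```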

MCP server

starling-mcp exposes the recorded event log to AI assistants (Claude Desktop, Cursor, Claude Code) over stdio. Read-only by construction. Once wired into your MCP client, you ask normal questions about your agent's runs and the model calls the appropriate tool — list_runs, summarize_run, get_event, diff_runs, search_runs, etc.

go install github.com/jerkeyray/starling/cmd/starling-mcp@latest

Add to your client config (Claude Desktop shape; Cursor / Claude Code follow the same pattern):

{
  "mcpServers": {
    "starling": {
      "command": "starling-mcp",
      "args": ["/path/to/runs.db"]
    }
  }
}

Full reference: docs/reference/mcp-server.md.

HTTP Daemon

Use package starlingd when you want your own agent wiring exposed as a private HTTP service. It provides a bounded in-process queue, async POST /api/v1/runs, SSE progress streams, run/event read APIs, /metrics, bearer auth, and an optional inspector mount.

if err := starlingd.Command(buildAgent).Run(os.Args[1:]); err != nil {
	panic(err)
}

Full reference: docs/reference/starlingd.md.

Inspector

go run ./cmd/starling-inspect starling.db

Loopback web UI: runs list with per-row totals, per-event timeline with a syntax-highlighted JSON detail pane, a /sessions page that groups runs by Config.SessionID, and a /diff page aligning any two runs side-by-side by sequence number. Dark by default, theme toggle in the topbar, hashes and run ids are click-to-copy, no CDN or JS build step. Runs read-only - Append is impossible on the inspector's DB handle.

[Screenshot: inspector run detail with timeline and JSON pane]

[Screenshot: inspector diff page]

CLI

Run go install github.com/jerkeyray/starling/cmd/starling@latest for the stock binary, or build a dual-mode binary around starling.InspectCommand / starling.ReplayCommand to wire in your own agent factory.

Subcommand	What it does
`validate <db> [<runID>]`	Hash-chain + Merkle check, one run or every run.
`export <db> <runID>`	Dump events as NDJSON (pipe into `jq`).
`prune [flags] <db>`	Delete old whole runs after an explicit dry-run.
`inspect [flags] <db>`	Read-only web inspector.
`replay <db> <runID>`	Headless replay. Dual-mode binaries only.
`contracts <db> <file> <runID...>`	Validate explicit runs against a YAML contract file.
`migrate <db>`	Apply pending schema migrations.
`schema-version <db>`	Print the on-disk schema version.
`doctor [<db>]`	Health check: env vars, schema, chain validation.
`--version`	Print the linked starling module version.
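
As a rough picture of what the Merkle half of validate protects, the sketch below computes a root over a list of event payloads and shows that editing one leaf changes it. It is not the merkle package: SHA-256 stands in for BLAKE3, and the tree shape, the 0x00/0x01 domain prefixes, and the handling of an unpaired node are choices made for this illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// merkleRoot hashes each leaf, then repeatedly hashes adjacent pairs
// until one node remains. An unpaired node is carried up unchanged;
// that rule is a choice made for this sketch.
func merkleRoot(leaves [][]byte) [32]byte {
	if len(leaves) == 0 {
		return [32]byte{}
	}
	level := make([][32]byte, len(leaves))
	for i, l := range leaves {
		level[i] = sha256.Sum256(append([]byte{0x00}, l...)) // 0x00 marks a leaf
	}
	for len(level) > 1 {
		var next [][32]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i])
				continue
			}
			buf := append([]byte{0x01}, level[i][:]...) // 0x01 marks an interior node
			buf = append(buf, level[i+1][:]...)
			next = append(next, sha256.Sum256(buf))
		}
		level = next
	}
	return level[0]
}

func main() {
	events := [][]byte{[]byte("run_started"), []byte("tool_call"), []byte("run_completed")}
	root := merkleRoot(events)

	events[0] = []byte("run_started (edited)")
	fmt.Println(root == merkleRoot(events)) // false: the committed root no longer matches
}
```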

Production Checklist

Run make check before release: format, vet, build, race tests, lint, and vulnerability scan.
Pick a durable log backend for production runs: SQLite for single-node use, Postgres for shared infrastructure.
Run eventlog preflight and migrations during deploys.
Protect inspector access behind your normal internal auth boundary.
Put starlingd behind TLS, rate limiting, and your normal service auth; its built-in bearer token is a private-service guard, not a full auth system.
Set explicit budgets for tokens, cost, and wall-clock runtime.
Use idempotent retries and per-call timeouts for tools that touch external systems.
Use replay regression tests for critical agent workflows.
Store raw provider responses only when your privacy and retention policy allows it.
Export runs you must keep, then run starling prune --older-than <duration> --confirm <db> as a scheduled retention job. Without --confirm, prune is a dry-run report.

Examples

Example	What it shows
examples/hello	The smallest end-to-end agent (~50 lines). Start here.
examples/m1_hello	Dual-mode pattern: run / inspect / replay / reset / show.
examples/multi_turn	Chat-style workflow: one Run per user message.
examples/branching	`eventlog.ForkSQLite` to split a recorded run into a counterfactual branch.
examples/manual_writes	Writing events without `Agent.Run`, including the Merkle root.
examples/incident_triage	End-to-end production-style workflow with budgets, replay, resume, metrics, OTel, and durable logs.
examples/mcp_tools	MCP server tools adapted into Starling tools.
examples/m4_inspector_demo	Local run data for the inspector.

Code layout

package starling lives at the module root - that's a Go convention, not a layout choice. The interesting parts are under sub-packages.

.
├── agent.go, config.go, errors.go, result.go,    Core API: Agent, Config,
│   stream.go, runstream.go, resume.go,           RunResult, Resume, replay
│   replay_api.go, metrics.go, version.go,        wrappers, sentinel errors,
│   *_command.go, *_test.go                       CLI command helpers, tests.
│
├── bench/             benchmarks
├── budget/            pricing tables, USD/token caps
├── cmd/
│   ├── starling/         stock CLI (validate / export / prune / inspect / replay / contracts / migrate / schema-version / doctor)
│   ├── starling-inspect/ standalone inspector binary
│   └── starling-mcp/     read-only MCP server over the event log
├── docs/              prose docs (getting-started, mental-model, cookbook, reference)
├── event/             Event / Kind types, per-kind payload schemas
├── eventlog/          append-only log: in-memory / SQLite / Postgres
├── examples/          runnable agents (start with examples/hello)
├── inspect/           inspector server + UI templates + static assets
├── internal/          unexported helpers (cborenc, obs)
├── merkle/            public BLAKE3 Merkle helpers
├── provider/          OpenAI / Anthropic / Gemini / Bedrock / OpenRouter adapters
├── replay/            replay re-execution + Stream
├── starlingtest/      test helpers (ScriptedProvider, AssertReplayMatches)
├── starlingd/         HTTP daemon package + command helper
├── step/              step.Now / step.Random / step.SideEffect, CallTool
└── tool/              Tool interface, Typed[In,Out], Wrap middleware, MCP client

Development

make check

Useful targets:

make test      # race-enabled Go test suite
make lint      # golangci-lint
make vuln      # govulncheck
make inspect   # run the inspector locally
make smoke     # quick end-to-end smoke run

Release policy

Starling is in beta. Versions are tagged v0.x.y-beta.N and distributed through the Go module proxy.

Pin a tag. Don't track main; the working branch may carry breaking changes between beta cuts.
Breaking changes are permitted between beta tags. Each tag's delta is recorded in CHANGELOG.md. Until GA there is no API or wire-format compatibility promise.
Within a tag, breakage is a bug. A pinned beta is reproducible.
Schema versioning for the event log is documented under Event log schema below - this is the one surface that has its own forward/back-compat contract.
GA (v1.0.0) will land when the public API surface, event schema, and replay contract are stable enough to commit to. No date promised.

Event log schema

event.SchemaVersion is the format version of the events written into the log. Resume and replay both read this field and refuse runs written by an unknown schema (ErrSchemaVersionMismatch).

When the constant bumps. Whenever the wire-format of an event payload, the set of event.Kind values, or the canonical-encoding rules change in a way that affects the BLAKE3 hash chain.
What consumers must do. Re-pin to the matching beta tag, then run starling migrate <db> (also exposed in-process as starling.MigrateCommand) to bring on-disk logs forward. The starling schema-version <db> command prints the current version.
Compatibility within a major-schema family. Minor bumps must remain resume-compatible: an older agent binary should be able to resume a run written by a newer one whenever the new schema is a superset. Breaking format changes bump the major part and require the explicit migrate step.
Migrations live in migrate_command.go. Each new on-disk format ships its forward migration alongside the schema bump in the same beta.

License

Apache 2.0. See LICENSE.

Documentation

Overview

Package starling is an event-sourced agent runtime for Go.

Every agent run is recorded as an append-only log of events, making every execution deterministically replayable, cost-enforceable, and cryptographically auditable.

Status: beta. Public API is not yet stable.

Index

Variables
func MetricsHandler(g prometheus.Gatherer) http.Handler
func NewRunID(namespace string) (string, error)
func Replay(ctx context.Context, log eventlog.EventLog, runID string, a *Agent, ...) error
type Agent
- func (a *Agent) ReplayProviderInfo() (providerID, apiVersion, modelID string)
- func (a *Agent) Resume(ctx context.Context, runID, extraMessage string) (*RunResult, error)
- func (a *Agent) ResumeWith(ctx context.Context, runID, extraMessage string, opts ...ResumeOption) (*RunResult, error)
- func (a *Agent) Run(ctx context.Context, goal string) (*RunResult, error)
- func (a *Agent) RunOnce(ctx context.Context, prompt string) (string, error)
- func (a *Agent) RunReplay(ctx context.Context, recorded []event.Event) error
- func (a *Agent) RunReplayInto(ctx context.Context, recorded []event.Event, sink eventlog.EventLog) error
- func (a *Agent) RunStream(ctx context.Context, goal string) (string, <-chan AgentEvent, error)
- func (a *Agent) RunWithID(ctx context.Context, runID string, goal string) (*RunResult, error)
- func (a *Agent) Stream(ctx context.Context, goal string) (string, <-chan StepEvent, error)
- func (a *Agent) StreamWithID(ctx context.Context, runID string, goal string) (<-chan StepEvent, error)
type AgentEvent
type Budget
type CacheStats
type Config
type ContractsCmd
- func ContractsCommand() *ContractsCmd
- func (c *ContractsCmd) Run(args []string) error
type DoctorCmd
- func DoctorCommand() *DoctorCmd
- func (c *DoctorCmd) Run(args []string) error
type Done
type ExportCmd
- func ExportCommand() *ExportCmd
- func (c *ExportCmd) Run(args []string) error
type InspectCmd
- func InspectCommand(factory replay.Factory) *InspectCmd
- func (c *InspectCmd) Run(args []string) error
type MCPCmd
- func MCPCommand() *MCPCmd
- func (c *MCPCmd) Run(args []string) error
type Metrics
- func NewMetrics(reg prometheus.Registerer) *Metrics
type MigrateCmd
- func MigrateCommand() *MigrateCmd
- func (c *MigrateCmd) Run(args []string) error
type ProviderError
type PruneCmd
- func PruneCommand() *PruneCmd
- func (c *PruneCmd) Run(args []string) error
type ReplayCmd
- func ReplayCommand(factory replay.Factory) *ReplayCmd
- func (c *ReplayCmd) Run(args []string) error
type ReplayOption
- func WithForceProvider() ReplayOption
type ResumeOption
- func WithReissueTools(b bool) ResumeOption
type RunResult
type SchemaVersionCmd
- func SchemaVersionCommand() *SchemaVersionCmd
- func (c *SchemaVersionCmd) Run(args []string) error
type StepEvent
type TextDelta
type ToolCallEnded
type ToolCallRecord
type ToolCallStarted
type ToolError
- func (e *ToolError) Error() string
- func (e *ToolError) Unwrap() error
type ValidateCmd
- func ValidateCommand() *ValidateCmd
- func (c *ValidateCmd) Run(args []string) error

Constants

This section is empty.

Variables

var (
	// ErrBudgetExceeded is returned when a budget cap trips. The
	// matching BudgetExceeded event precedes the terminal RunFailed
	// in the log.
	ErrBudgetExceeded = errors.New("starling: budget exceeded")

	// ErrMaxTurnsExceeded is returned when the loop reaches Config.MaxTurns
	// without the model producing a final answer. Tool calls dispatched
	// before the cap tripped are still recorded; callers driving forced
	// single-shot tool flows can recover their output from
	// RunResult.ToolCalls regardless of this error.
	ErrMaxTurnsExceeded = errors.New("starling: max turns exceeded")

	// ErrNonDeterminism is returned by Replay when a re-emitted event
	// diverges from the recording. Wraps replay.ErrNonDeterminism.
	ErrNonDeterminism = errors.New("starling: non-determinism detected during replay")

	// ErrProviderModelMismatch is returned by Replay when the agent's
	// Provider.ID, APIVersion, or Config.Model disagree with the
	// values recorded in RunStarted. Override with WithForceProvider.
	// Aliased from replay.ErrProviderModelMismatch so callers that do
	// not import the replay package can still route on it.
	ErrProviderModelMismatch = replay.ErrProviderModelMismatch

	// ErrRunNotFound is returned by Resume when the requested runID has
	// no events in the log.
	ErrRunNotFound = errors.New("starling: run not found in log")

	// ErrRunAlreadyTerminal is returned by Resume when the run's last
	// event is a terminal kind (RunCompleted/RunFailed/RunCancelled).
	// Resuming a terminated run is not supported — the terminal event
	// commits a Merkle root over every event before it, and appending
	// past that point would invalidate the commitment.
	ErrRunAlreadyTerminal = errors.New("starling: run already terminal")

	// ErrSchemaVersionMismatch is returned by Resume when the run's
	// RunStarted event records a schema version this binary does not
	// understand.
	ErrSchemaVersionMismatch = errors.New("starling: event schema version mismatch")

	// ErrPartialToolCall is returned by Resume when the run's tail
	// contains a ToolCallScheduled event without a matching
	// ToolCallCompleted/ToolCallFailed, and WithReissueTools(false) was
	// passed. It signals that the resuming process would otherwise have
	// to re-issue a tool call of unknown idempotency.
	ErrPartialToolCall = errors.New("starling: run has a partial tool call; pass WithReissueTools(true) to reissue")

	// ErrRunInUse is returned by Resume when its first Append onto the
	// existing chain is rejected because another writer has advanced
	// the tail under us. Indicates two processes are racing to resume
	// the same run — the loser bails cleanly rather than risk chain
	// corruption.
	ErrRunInUse = errors.New("starling: run is being appended by another writer")

	// ErrLogCorrupt wraps every eventlog.Validate failure. Aliased
	// from the eventlog package so callers that never import eventlog
	// directly can still route on it.
	ErrLogCorrupt = eventlog.ErrLogCorrupt
)

Sentinel errors surfaced by Agent.Run, Resume, and Replay. Tests and callers route on these with errors.Is; the agent loop converts matching errors into the appropriate terminal event (RunFailed or RunCancelled) before propagating.
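
Routing with errors.Is matters because these errors usually arrive wrapped with extra context. The sketch below uses local stand-in sentinels (errBudgetExceeded, errRunNotFound) and hypothetical helpers (classify, wrapTurn) so that it runs without the module; in real code the comparison would be against the exported variables listed above.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for two of the sentinels listed above.
var (
	errBudgetExceeded = errors.New("starling: budget exceeded")
	errRunNotFound    = errors.New("starling: run not found in log")
)

// classify routes on sentinel identity rather than message text, so it
// keeps working when the error has been wrapped with extra context.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errBudgetExceeded):
		return "budget"
	case errors.Is(err, errRunNotFound):
		return "not-found"
	default:
		return "other"
	}
}

// wrapTurn adds context the way a caller higher up the stack might.
func wrapTurn(n int, err error) error {
	return fmt.Errorf("turn %d: %w", n, err)
}

func main() {
	fmt.Println(classify(wrapTurn(3, errBudgetExceeded))) // budget
	fmt.Println(classify(errors.New("disk full")))        // other
}
```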

var Version = "v0.1.0-beta.2"

Version is the semantic version of the Starling library and bundled CLI binaries. Bumped per release per the policy documented in CHANGELOG.md and the README.

Build tooling can override this via -ldflags="-X github.com/jerkeyray/starling.Version=vX.Y.Z" so dev builds report the underlying tag (or "dev" when off-tag); the value declared here is the source of truth shipped with the tagged release.

Functions

func MetricsHandler

func MetricsHandler(g prometheus.Gatherer) http.Handler

MetricsHandler is a convenience wrapper so users don't have to import promhttp themselves for the common case. Equivalent to promhttp.HandlerFor(g, promhttp.HandlerOpts{}).

func NewRunID

func NewRunID(namespace string) (string, error)

NewRunID returns a fresh Starling run id. When namespace is non-empty the returned id is namespace + "/" + ULID. Namespace must not contain "/" because slash separates the namespace from the ULID in URLs and event-log filters.
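
The slash rule is what keeps run ids unambiguous to split. The following sketch shows the format only; composeRunID and splitRunID are hypothetical helpers, and the id argument is a placeholder string rather than a real ULID.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// composeRunID joins an optional namespace and an already-generated id.
// The id argument stands in for the ULID; generating one is out of scope.
func composeRunID(namespace, id string) (string, error) {
	if strings.Contains(namespace, "/") {
		return "", errors.New("namespace must not contain \"/\"")
	}
	if namespace == "" {
		return id, nil
	}
	return namespace + "/" + id, nil
}

// splitRunID recovers the namespace and id. Because the namespace has
// no slash, the first "/" is always the separator.
func splitRunID(runID string) (namespace, id string) {
	if ns, rest, ok := strings.Cut(runID, "/"); ok {
		return ns, rest
	}
	return "", runID
}

func main() {
	id, err := composeRunID("tenant-a", "PLACEHOLDERID")
	fmt.Println(id, err) // tenant-a/PLACEHOLDERID <nil>

	ns, rest := splitRunID(id)
	fmt.Println(ns, rest) // tenant-a PLACEHOLDERID

	_, err = composeRunID("a/b", "PLACEHOLDERID")
	fmt.Println(err != nil) // true
}
```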

func Replay

func Replay(ctx context.Context, log eventlog.EventLog, runID string, a *Agent, opts ...ReplayOption) error

Replay re-executes runID against a and verifies the reproduced event sequence matches the recorded log byte-for-byte. a must be configured identically to the original run (same Tools, same Config); Provider and Log are overridden with replay equivalents and the caller's log stays untouched.

Returns ErrNonDeterminism (wrapped) on divergence; ErrProviderModelMismatch when a's Provider.ID/APIVersion/Config.Model disagree with the recording (override with WithForceProvider); other errors (log-read, tool execution) surface verbatim.

Types

type Agent

type Agent struct {
	// Provider is the LLM adapter. Required.
	Provider provider.Provider

	// Tools the agent may plan. Optional.
	Tools []tool.Tool

	// Log is the event log backend. Required.
	Log eventlog.EventLog

	// Budget enforces token/USD/wall-clock caps. Optional; zero
	// values disable individual axes. Input tokens are checked
	// pre-call, output tokens and USD mid-stream, wall-clock via
	// context deadline.
	Budget *Budget

	// Config carries model / system prompt / params / MaxTurns.
	Config Config

	// Namespace prefixes this agent's RunIDs so multiple tenants can
	// share one event log. When set, RunID = Namespace + "/" + ULID;
	// must not contain "/". Empty leaves RunID as a bare ULID.
	Namespace string

	// Metrics is an optional Prometheus sink. Nil disables the
	// pipeline at no runtime cost.
	Metrics *Metrics
	// contains filtered or unexported fields
}

Agent is the user-facing entry point. Fields are validated on Run, not at construction.

func (*Agent) ReplayProviderInfo

func (a *Agent) ReplayProviderInfo() (providerID, apiVersion, modelID string)

ReplayProviderInfo reports the agent's provider/model identity so the replay package can compare it against the recording's RunStarted event before re-executing. Implements replay.ProviderInspector.

func (*Agent) Resume

func (a *Agent) Resume(ctx context.Context, runID, extraMessage string) (*RunResult, error)

Resume continues a previously-started run from its last recorded event. If extraMessage is non-empty it is appended as a user turn before the loop resumes. A terminal event is always emitted before return.

Returns:

ErrRunNotFound: runID not in a.Log.
ErrRunAlreadyTerminal: last event is terminal.
ErrSchemaVersionMismatch: RunStarted schema unknown.
ErrPartialToolCall: unpaired ToolCallScheduled with WithReissueTools(false).
ErrRunInUse: chain advanced between tail read and first append.

Budget: MaxWallClock and step-level token/USD caps reset at the process boundary; MaxTurns counts across the whole run.

func (*Agent) ResumeWith

func (a *Agent) ResumeWith(ctx context.Context, runID, extraMessage string, opts ...ResumeOption) (*RunResult, error)

ResumeWith is Resume with options. Resume(ctx, id, msg) is equivalent to ResumeWith(ctx, id, msg).

func (*Agent) Run

func (a *Agent) Run(ctx context.Context, goal string) (*RunResult, error)

Run starts a new agent run against the configured provider + tools. The returned RunResult summarizes the run; full detail is in Log.

Terminal events are always emitted before Run returns (successful completion → RunCompleted; ctx cancellation → RunCancelled; any other error → RunFailed), so the log is self-describing regardless of how the run ends.
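
The mapping in the previous paragraph can be written down as a small function. This is a sketch, not the agent loop: terminalKind and cancelledErr are hypothetical, and treating only context.Canceled as cancellation is an assumption (how a deadline expiry is classified is not stated here).

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// terminalKind picks the terminal event name for a finished run.
func terminalKind(err error) string {
	switch {
	case err == nil:
		return "RunCompleted"
	case errors.Is(err, context.Canceled):
		return "RunCancelled"
	default:
		return "RunFailed"
	}
}

// cancelledErr returns the error a cancelled context reports, wrapped
// the way a run loop might add context.
func cancelledErr() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return fmt.Errorf("turn 2: %w", ctx.Err())
}

// providerErr is an arbitrary non-cancellation failure.
func providerErr() error {
	return errors.New("provider returned HTTP 500")
}

func main() {
	fmt.Println(terminalKind(nil))            // RunCompleted
	fmt.Println(terminalKind(cancelledErr())) // RunCancelled
	fmt.Println(terminalKind(providerErr()))  // RunFailed
}
```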

func (*Agent) RunOnce

func (a *Agent) RunOnce(ctx context.Context, prompt string) (string, error)

RunOnce is a no-tools, single-turn convenience: it ignores Agent.Tools, forces MaxTurns=1, runs the agent against prompt, and returns the assistant's final text. Equivalent to setting Config.MaxTurns=1, clearing Tools, calling Run, and reading RunResult.FinalText. For tool-using or multi-turn flows use Run.

func (*Agent) RunReplay

func (a *Agent) RunReplay(ctx context.Context, recorded []event.Event) error

RunReplay re-executes the agent in replay mode against recorded. Intended for callers of the replay package; not part of the normal user flow. Goal, RunID, and provider streams are all reconstructed from recorded; the original Provider and Log are overridden (the Provider by a replay provider, the Log by a fresh in-memory log) so the live side is fully isolated from the recording.

Returns nil on a clean byte-matching replay. On divergence, returns an error that wraps step.ErrReplayMismatch — the replay package wraps that further into ErrNonDeterminism.

func (*Agent) RunReplayInto

func (a *Agent) RunReplayInto(ctx context.Context, recorded []event.Event, sink eventlog.EventLog) error

RunReplayInto is RunReplay with a caller-supplied sink log instead of an internal in-memory one. Intended for callers (notably replay.Stream) that need to observe the byte-matching events as they are appended — subscribe to sink.Stream(...) before calling.

The sink's lifecycle is the caller's responsibility; this method does NOT close it.

func (*Agent) RunStream

func (a *Agent) RunStream(ctx context.Context, goal string) (string, <-chan AgentEvent, error)

RunStream starts a new run and returns a channel of typed AgentEvents. The channel always closes after a single Done; setup errors surface synchronously.

RunStream is a thin projection of Stream — every emitted AgentEvent is derived from the same underlying StepEvent that Stream would have produced, filtered to the typed surface. Use Stream when you need the full envelope (raw event, sequence numbers, every Kind); use RunStream when you want a stable, narrow API for chat-style frontends.

func (*Agent) RunWithID

func (a *Agent) RunWithID(ctx context.Context, runID string, goal string) (*RunResult, error)

RunWithID starts a new agent run using a caller-supplied runID. This is primarily for service wrappers that need to return a run id before worker execution begins. Callers must pass a fresh id; the event log rejects attempts to append over an existing run.

func (*Agent) Stream

func (a *Agent) Stream(ctx context.Context, goal string) (string, <-chan StepEvent, error)

Stream starts a new run and returns a channel of StepEvents. Terminal events are always last. Setup errors are returned synchronously; run-time errors surface as a terminal StepEvent (Err populated). On ctx cancel the run is cancelled and the channel closes after draining.

func (*Agent) StreamWithID

func (a *Agent) StreamWithID(ctx context.Context, runID string, goal string) (<-chan StepEvent, error)

StreamWithID starts a new run with a caller-supplied runID and returns a channel of StepEvents. It is the streaming counterpart to RunWithID, intended for services that allocate run ids before worker execution begins.

type AgentEvent

type AgentEvent interface {
	// contains filtered or unexported methods
}

AgentEvent is the typed event surface for RunStream. Values are one of TextDelta, ToolCallStarted, ToolCallEnded, or Done. Unknown concrete types must be tolerated by callers using a type switch with a default branch — additions are not breaking changes within a beta cycle.

AgentEvent is layered on top of the lower-level StepEvent stream returned by Stream. RunStream is the user-friendly path; Stream is the escape hatch for callers who want every event with the full envelope.
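
A consumer that follows the "type switch with a default branch" advice might look like the sketch below. The event types here are local stand-ins with made-up fields (Text, Name, Err), since the real field sets are not shown in this section; futureEvent plays the role of a type added in a later release.

```go
package main

import "fmt"

// Hypothetical stand-ins for the typed events; the real types and
// their fields may differ.
type agentEvent interface{ isAgentEvent() }

type textDelta struct{ Text string }
type toolCallStarted struct{ Name string }
type done struct{ Err error }
type futureEvent struct{} // an event type added in a later release

func (textDelta) isAgentEvent()       {}
func (toolCallStarted) isAgentEvent() {}
func (done) isAgentEvent()            {}
func (futureEvent) isAgentEvent()     {}

// render drains the channel until it closes and returns the assembled
// text plus the number of events it did not recognize.
func render(events <-chan agentEvent) (text string, unknown int) {
	for ev := range events {
		switch e := ev.(type) {
		case textDelta:
			text += e.Text
		case toolCallStarted:
			text += "[" + e.Name + "]"
		case done:
			// Terminal event; the channel closes after this.
		default:
			unknown++ // tolerate types this build does not know about
		}
	}
	return text, unknown
}

// demoStream emits a fixed sequence ending in done, then closes.
func demoStream() <-chan agentEvent {
	ch := make(chan agentEvent, 5)
	ch <- textDelta{Text: "Checking "}
	ch <- toolCallStarted{Name: "fetch_ticket"}
	ch <- futureEvent{}
	ch <- textDelta{Text: " done."}
	ch <- done{}
	close(ch)
	return ch
}

func main() {
	text, unknown := render(demoStream())
	fmt.Println(text, unknown) // Checking [fetch_ticket] done. 1
}
```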

type Budget

type Budget = budget.Budget

Budget is re-exported from the budget package for callers that want a single import path. All four axes are enforced end-to-end: MaxInputTokens pre-call (step.LLMCall), MaxOutputTokens and MaxUSD mid-stream on every usage chunk (step.LLMCall), MaxWallClock via context.WithDeadline at the agent level. Zero on any field disables that axis.
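
To show where each axis would be checked, here is a sketch with a local caps type that mirrors the four fields. The method names (checkPreCall, checkMidStream, withWallClock), the stand-in errBudget, and the strict greater-than comparison are assumptions for illustration; the real enforcement lives in step.LLMCall and the agent.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errBudget = errors.New("budget exceeded") // hypothetical stand-in

// caps mirrors the four axes described above; zero disables an axis.
type caps struct {
	MaxInputTokens  int64
	MaxOutputTokens int64
	MaxUSD          float64
	MaxWallClock    time.Duration
}

// checkPreCall runs before the model is called.
func (c caps) checkPreCall(inputTokens int64) error {
	if c.MaxInputTokens > 0 && inputTokens > c.MaxInputTokens {
		return fmt.Errorf("input tokens %d > %d: %w", inputTokens, c.MaxInputTokens, errBudget)
	}
	return nil
}

// checkMidStream runs on every usage update while the response streams.
func (c caps) checkMidStream(outputTokens int64, usd float64) error {
	if c.MaxOutputTokens > 0 && outputTokens > c.MaxOutputTokens {
		return fmt.Errorf("output tokens %d > %d: %w", outputTokens, c.MaxOutputTokens, errBudget)
	}
	if c.MaxUSD > 0 && usd > c.MaxUSD {
		return fmt.Errorf("cost $%.2f > $%.2f: %w", usd, c.MaxUSD, errBudget)
	}
	return nil
}

// withWallClock applies the wall-clock cap as a context deadline.
func (c caps) withWallClock(ctx context.Context) (context.Context, context.CancelFunc) {
	if c.MaxWallClock <= 0 {
		return context.WithCancel(ctx)
	}
	return context.WithDeadline(ctx, time.Now().Add(c.MaxWallClock))
}

// hasDeadline reports whether the cap produced a context deadline.
func (c caps) hasDeadline() bool {
	ctx, cancel := c.withWallClock(context.Background())
	defer cancel()
	_, ok := ctx.Deadline()
	return ok
}

func main() {
	c := caps{MaxInputTokens: 20_000, MaxUSD: 0.50, MaxWallClock: 30 * time.Second}

	fmt.Println(c.checkPreCall(25_000) != nil)     // true: over the input cap
	fmt.Println(c.checkMidStream(1_000_000, 0.10)) // <nil>: the output axis is disabled (zero)
	fmt.Println(c.checkMidStream(10, 0.75) != nil) // true: over the USD cap
	fmt.Println(c.hasDeadline())                   // true
}
```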

type CacheStats

type CacheStats struct {
	Hits         int
	Misses       int
	ReadTokens   int64
	CreateTokens int64
}

CacheStats summarizes prompt-cache activity over the run, aggregated from per-turn AssistantMessageCompleted events. Only Anthropic and providers that surface cache token counts populate non-zero values; for others CacheStats is the zero value.

Semantics:

ReadTokens / CreateTokens: sums of CacheReadTokens and CacheCreateTokens across every turn.
Hits: number of turns whose CacheReadTokens was greater than 0.
Misses: number of turns that consumed input but did not read any cached prefix (CacheReadTokens == 0 && InputTokens > 0).
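
The semantics above amount to a single pass over per-turn usage. The sketch below applies them to made-up numbers; turnUsage, cacheStats, and aggregate are local, hypothetical names, and the real aggregation reads these counters from AssistantMessageCompleted events.

```go
package main

import "fmt"

// turnUsage is a hypothetical per-turn usage record carrying the three
// counters the semantics above refer to.
type turnUsage struct {
	InputTokens       int64
	CacheReadTokens   int64
	CacheCreateTokens int64
}

// cacheStats mirrors the fields of CacheStats.
type cacheStats struct {
	Hits         int
	Misses       int
	ReadTokens   int64
	CreateTokens int64
}

// aggregate folds per-turn usage into run-level cache statistics.
func aggregate(turns []turnUsage) cacheStats {
	var s cacheStats
	for _, t := range turns {
		s.ReadTokens += t.CacheReadTokens
		s.CreateTokens += t.CacheCreateTokens
		switch {
		case t.CacheReadTokens > 0:
			s.Hits++
		case t.InputTokens > 0:
			s.Misses++ // consumed input without reading a cached prefix
		}
	}
	return s
}

func main() {
	turns := []turnUsage{
		{InputTokens: 1200, CacheCreateTokens: 1000}, // miss; writes the cache
		{InputTokens: 300, CacheReadTokens: 1000},    // hit
		{InputTokens: 0},                             // neither hit nor miss
	}
	fmt.Printf("%+v\n", aggregate(turns)) // {Hits:1 Misses:1 ReadTokens:1000 CreateTokens:1000}
}
```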

type Config

type Config struct {
	// Model is the provider-specific model identifier passed through
	// to every LLM call. Required in practice; the adapter will error
	// if empty.
	Model string

	// SystemPrompt is prepended to every conversation and captured
	// verbatim into RunStarted.
	SystemPrompt string

	// Params is the raw provider-specific parameter blob (temperature,
	// top_p, max_tokens, …). Canonical CBOR so the hash in RunStarted
	// is stable across runs with equivalent params.
	Params cborenc.RawMessage

	// MaxTurns caps how many model API calls the loop will make. The
	// wrap-up call after a tool result counts as its own turn, so a
	// forced single-tool flow needs MaxTurns >= 2 (turn 1 emits the
	// tool_use, turn 2 lets the model respond to the tool result).
	// 0 (or negative) means unlimited — not recommended.
	MaxTurns int

	// RequireRawResponseHash fails any turn whose ChunkEnd lacks a
	// 32-byte hash.
	RequireRawResponseHash bool

	// AppVersion identifies the caller's application build and is
	// stamped into RunStarted alongside the Starling library version.
	// Optional; left blank when unset.
	AppVersion string

	// SessionID groups multiple runs into one application-level
	// session. Stamped into RunStarted metadata as "session_id"
	// when Metadata does not already supply that key. If both are
	// set and disagree, Metadata wins and a warning is logged.
	SessionID string

	// Metadata is caller-supplied run context — e.g. action_id, pr_number,
	// tenant_id — stamped verbatim into RunStarted, surfaced on RunResult,
	// and rendered by the inspector. Keys and values are caller-defined;
	// Starling makes no semantic interpretation. Nil or empty disables.
	Metadata map[string]string

	// EmitTimeout bounds each event-log Append the agent issues under
	// context.WithoutCancel (terminal events, tool failures during
	// cancellation). Zero disables the bound; set this when a hung
	// backend must not block shutdown.
	EmitTimeout time.Duration

	// SkipSchemaCheck disables the pre-flight schema-version check that
	// Run, Resume, and Replay run against the event log. Reserved for
	// tests and tooling that intentionally point at a database older
	// than the binary.
	SkipSchemaCheck bool

	// Logger receives structured slog records covering the run lifecycle:
	// RunStarted, per-turn start, budget trips, tool retries, and the
	// terminal event. Every record carries a "run_id" attribute; per-turn
	// and per-tool records add "turn_id" / "call_id".
	//
	// The event log remains the source of truth for auditing — Logger is
	// a side-channel trace for operators watching live runs. Nil is
	// silent: library output is discarded. Pass slog.New(...) (or
	// slog.Default()) to enable logs; level filtering is delegated to
	// the supplied handler's slog.HandlerOptions.Level.
	//
	// Exceptions: replay divergences and dropped event-log subscribers
	// are safety-critical signals and are always logged via
	// slog.Default(), regardless of this field.
	Logger *slog.Logger
}

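Config captures the per-run knobs the user supplies on Agent. Every field is optional at the type level and has a documented default, although Model is required in practice: the adapter returns an error if it is empty.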
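The SessionID and Metadata precedence rule documented on the struct can be sketched as a small function. The function name and the returned conflict flag are hypothetical; the docs only state that Metadata wins and that a warning is logged when the two disagree.

```go
package main

import "fmt"

// effectiveSessionID returns the session id that would be stamped into
// RunStarted metadata under the documented rule: a "session_id" key in
// metadata takes precedence over the SessionID field. conflict reports
// the case where both are set and disagree (where a warning is logged).
func effectiveSessionID(sessionID string, metadata map[string]string) (id string, conflict bool) {
	if v, ok := metadata["session_id"]; ok {
		return v, sessionID != "" && v != sessionID
	}
	return sessionID, false
}

func main() {
	id, conflict := effectiveSessionID("sess-A", map[string]string{"session_id": "sess-B"})
	fmt.Println(id, conflict)

	id, conflict = effectiveSessionID("sess-A", nil)
	fmt.Println(id, conflict)
}
```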

type ContractsCmd ¶

type ContractsCmd struct {
	Name   string
	Output io.Writer
}

ContractsCmd validates explicit run IDs against a YAML contract file.

func ContractsCommand ¶

func ContractsCommand() *ContractsCmd

ContractsCommand returns a CLI-style entrypoint for `starling contracts`.

func (*ContractsCmd) Run ¶

func (c *ContractsCmd) Run(args []string) error

Run parses args and validates each run.

args shape: <db> <contract.yml> <runID...>

type DoctorCmd ¶

type DoctorCmd struct {
	Name   string
	Output io.Writer
}

DoctorCmd is the handle returned by DoctorCommand.

func DoctorCommand ¶

func DoctorCommand() *DoctorCmd

DoctorCommand returns a CLI-style entrypoint for `starling doctor`.

Doctor is a quick health check rolled into a single command: it reports the binary's Starling version, the schema version of the supplied event log (if any), validates the hash chain, and surveys well-known provider env vars. It exits 0 on success, 1 if any subcheck fails. Useful as the first thing to run when a downstream build "isn't working" — it surfaces version skew, schema drift, missing API keys, and chain corruption in one place.

Usage:

starling doctor                    # env-only checks
starling doctor <db>               # env + schema/validate against db

func (*DoctorCmd) Run ¶

func (c *DoctorCmd) Run(args []string) error

Run executes every subcheck and returns nil iff all pass.

type Done ¶

type Done struct {
	TerminalKind event.Kind
	FinalText    string
	Err          error
}

Done is always the last AgentEvent on the channel. TerminalKind is the run's terminal event kind; Err is set on RunFailed and on RunCancelled (with context.Canceled).

type ExportCmd ¶

type ExportCmd struct {
	// Name is used in flag error messages and the usage string.
	Name string
	// Output is where NDJSON is written. Defaults to os.Stdout.
	Output io.Writer
}

ExportCmd is the handle returned by ExportCommand.

func ExportCommand ¶

func ExportCommand() *ExportCmd

ExportCommand returns a CLI-style entrypoint for `starling export`. Emits one NDJSON line per event (envelope + typed payload) so the output pipes cleanly into jq. Intended to be invoked from cmd/starling; the returned *ExportCmd is safe to configure further before Run.

func (*ExportCmd) Run ¶

func (c *ExportCmd) Run(args []string) error

Run parses args and writes one NDJSON line per event in the run.

args shape: <db> <runID>

type InspectCmd ¶

type InspectCmd struct {
	// Factory is the replay.Factory wired into inspect.WithReplayer.
	// Nil disables replay (the UI is read-only).
	Factory replay.Factory

	// Name is the program name used in flag error messages and the
	// usage string. Defaults to "inspect".
	Name string

	// Output is where logs and flag errors are written. Defaults to
	// os.Stderr.
	Output io.Writer

	// Token, when non-empty, enables bearer-token auth on every
	// inspector route. Clients must send
	// `Authorization: Bearer <token>`.
	//
	// If empty, the --token flag is consulted, then the
	// STARLING_INSPECT_TOKEN environment variable. If all three are
	// empty, the inspector runs unauthenticated (default localhost
	// posture). Callers wanting a different auth scheme (JWT, mTLS,
	// IP allowlist, …) should build the inspect.Server themselves
	// with inspect.WithAuth instead.
	Token string
}

InspectCmd is the handle returned by InspectCommand. Fields may be customised between construction and Run; zero values are fine.
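The token lookup order described on the Token field (struct field, then the --token flag, then STARLING_INSPECT_TOKEN) can be sketched as follows. Both function names are hypothetical, and the constant-time comparison is a choice made for this sketch; the docs do not say how the inspector compares tokens.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"os"
	"strings"
)

// resolveToken applies the documented precedence: explicit field, then the
// flag value, then the environment. An empty result means unauthenticated.
func resolveToken(field, flagValue string, getenv func(string) string) string {
	if field != "" {
		return field
	}
	if flagValue != "" {
		return flagValue
	}
	return getenv("STARLING_INSPECT_TOKEN")
}

// authorized checks an Authorization header value against the token.
// With no token configured every request is allowed, matching the
// unauthenticated localhost posture described above.
func authorized(header, token string) bool {
	if token == "" {
		return true
	}
	const prefix = "Bearer "
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	got := strings.TrimPrefix(header, prefix)
	return subtle.ConstantTimeCompare([]byte(got), []byte(token)) == 1
}

func main() {
	token := resolveToken("", "from-flag", os.Getenv)
	fmt.Println(authorized("Bearer from-flag", token))
	fmt.Println(authorized("Bearer wrong", token))
}
```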

func InspectCommand ¶

func InspectCommand(factory replay.Factory) *InspectCmd

InspectCommand returns a CLI-style entrypoint for the Starling inspector. Intended for dual-mode binaries: a user's agent binary that runs the agent in one mode and serves the inspector (with replay wired up) in another, so the same Go code that produced a run can replay it.

Shape:

func main() {
    if len(os.Args) > 1 && os.Args[1] == "inspect" {
        cmd := starling.InspectCommand(myAgentFactory)
        if err := cmd.Run(os.Args[2:]); err != nil {
            log.Fatal(err)
        }
        return
    }
    // ... normal agent run ...
}

factory may be nil: the inspector runs read-only (no Replay button). When non-nil, it is invoked once per replay session to construct a fresh agent configured equivalently to the original run.

The returned *InspectCmd is safe to configure further via its exported fields before calling Run.

func (*InspectCmd) Run ¶

func (c *InspectCmd) Run(args []string) error

Run parses args, opens the log read-only, starts the inspector server, and blocks until the process receives SIGINT/SIGTERM or the server crashes. Blocking matches the expectation of a CLI subcommand; callers that need more control should use inspect.New directly.

args is the subcommand-level argument slice (e.g., os.Args[2:] after an "inspect" dispatch). It supports a minimal flag set; the defaults match cmd/starling-inspect, so the user experience is identical whether users run the standalone binary or their own dual-mode tool.

type MCPCmd ¶

type MCPCmd struct {
	// Name is used in flag error messages and the usage string.
	Name string
	// Output is where startup banners and flag errors are written.
	// stdout is reserved for the JSON-RPC protocol stream; never
	// write to it from here.
	Output io.Writer
}

MCPCmd is the handle returned by MCPCommand.

func MCPCommand ¶

func MCPCommand() *MCPCmd

MCPCommand returns a CLI-style entrypoint for `starling mcp`. It serves a read-only MCP server over stdio against the SQLite event log at args[0]; clients (Claude Desktop, Cursor, Claude Code, ...) spawn the binary as a subprocess and exchange JSON-RPC over the pipe.

Usage shape:

func main() {
    if len(os.Args) > 1 && os.Args[1] == "mcp" {
        if err := starling.MCPCommand().Run(os.Args[2:]); err != nil {
            log.Fatal(err)
        }
        return
    }
    // ... normal agent run ...
}

This entrypoint behaves identically to the standalone cmd/starling-mcp binary; both delegate here, and both are read-only by construction.

func (*MCPCmd) Run ¶

func (c *MCPCmd) Run(args []string) error

Run parses args, opens the log read-only, and serves MCP over stdio until SIGINT/SIGTERM or stdin closes. Returns nil on clean shutdown.

type Metrics ¶

type Metrics struct {
	// contains filtered or unexported fields
}

Metrics holds every collector Starling exposes. Construct via NewMetrics against a Registerer the operator owns; assign the result to Agent.Metrics. The zero value is not usable — nil is the "disabled" sentinel.

func NewMetrics ¶

func NewMetrics(reg prometheus.Registerer) *Metrics

NewMetrics registers every collector against reg and returns a Metrics ready to attach to an Agent. Panics on duplicate registration — same posture as BearerAuth("") — so misuse surfaces at startup rather than silently discarding samples.

Pass a fresh prometheus.NewRegistry() per process to avoid accidentally colliding with globally-registered collectors. promhttp.Handler(reg) then exposes the scrape endpoint.

type MigrateCmd ¶

type MigrateCmd struct {
	Name   string
	Output io.Writer
}

MigrateCmd is the handle returned by MigrateCommand.

func MigrateCommand ¶

func MigrateCommand() *MigrateCmd

MigrateCommand returns a CLI-style entrypoint for `starling migrate`.

func (*MigrateCmd) Run ¶

func (c *MigrateCmd) Run(args []string) error

Run applies pending migrations to the SQLite event log at args[0]. Flags:

-dry-run   report pending versions without applying any DDL.

type ProviderError ¶

type ProviderError = provider.Error

ProviderError wraps an error from the LLM provider (stream open failure, mid-stream error). Provider is the provider ID (e.g. "openai"); Code carries an HTTP status if the adapter surfaced one, 0 otherwise.

Aliased to provider.Error so the step package (which the root cannot import without cycles) can construct it directly. Callers that route on *starling.ProviderError continue to work via errors.As.

type PruneCmd ¶

type PruneCmd struct {
	// Name is used in flag error messages and the usage string.
	Name string
	// Output is where the retention report is written. Defaults to
	// os.Stdout.
	Output io.Writer
}

PruneCmd is the handle returned by PruneCommand.

func PruneCommand ¶

func PruneCommand() *PruneCmd

PruneCommand returns a CLI-style entrypoint for `starling prune`. It deletes whole runs that are older than a retention cutoff. The command is dry-run unless --confirm is passed.

func (*PruneCmd) Run ¶

func (c *PruneCmd) Run(args []string) error

type ReplayCmd ¶

type ReplayCmd struct {
	// Factory builds the agent that re-executes the recorded run. Nil
	// is valid and makes Run return a dual-mode-guidance error — that
	// path is what the stock `cmd/starling` binary uses.
	Factory replay.Factory
	// Name is used in flag error messages and the usage string.
	Name string
	// Output is where the text divergence report is written. Defaults
	// to os.Stdout.
	Output io.Writer
}

ReplayCmd is the handle returned by ReplayCommand.

func ReplayCommand ¶

func ReplayCommand(factory replay.Factory) *ReplayCmd

ReplayCommand returns a CLI-style entrypoint for `starling replay`. Intended for dual-mode binaries that link their agent factory into the same binary that serves the runtime CLI.

Shape:

func main() {
    if len(os.Args) > 1 && os.Args[1] == "replay" {
        cmd := starling.ReplayCommand(myAgentFactory)
        if err := cmd.Run(os.Args[2:]); err != nil {
            log.Fatal(err)
        }
        return
    }
    // ... normal agent run ...
}

factory may be nil: Run then returns an error explaining the dual-mode requirement, so the stock `cmd/starling` binary fails cleanly rather than pretending replay is possible without user code.

When factory is non-nil it is invoked once per Run to construct a replay.Agent configured equivalently to the original run.

func (*ReplayCmd) Run ¶

func (c *ReplayCmd) Run(args []string) error

Run parses args and replays one run. Prints `OK: replay matches recorded log` on clean replay; `DIVERGED: <reason>` on non-determinism, returning a non-nil error so callers exit non-zero.

args shape: <db> <runID>

type ReplayOption ¶

type ReplayOption = replay.Option

ReplayOption tunes Replay behavior. See WithForceProvider.

func WithForceProvider ¶

func WithForceProvider() ReplayOption

WithForceProvider disables Replay's provider/model identity check. By default Replay refuses to run when the agent's Provider.ID/APIVersion/Config.Model differ from the values recorded in the log's RunStarted event; this catches the common "wrong agent factory" mistake before any turn executes. Pass this option only when the divergence is intentional.

type ResumeOption ¶

type ResumeOption func(*resumeConfig)

ResumeOption tunes Resume / ResumeWith behavior. See WithReissueTools.

func WithReissueTools ¶

func WithReissueTools(b bool) ResumeOption

WithReissueTools controls whether Resume re-runs tool calls that were scheduled but never completed. Defaults to true. Set it to false for tools that mutate external state and should fail loudly (ErrPartialToolCall) instead of silently retrying. Re-issued calls get fresh CallIDs.

type RunResult ¶

type RunResult struct {
	RunID         string
	FinalText     string
	TurnCount     int
	ToolCallCount int
	TotalCostUSD  float64
	InputTokens   int64
	OutputTokens  int64
	CacheStats    CacheStats
	Duration      time.Duration
	TerminalKind  event.Kind // RunCompleted | RunFailed | RunCancelled
	MerkleRoot    []byte

	// ToolCalls records every tool invocation the run dispatched, in
	// event-log order. One record per (CallID, Attempt); a retried call
	// appears multiple times.
	ToolCalls []ToolCallRecord

	// Metadata is the caller-supplied map stamped into RunStarted via
	// Config.Metadata. Nil when the run was started without metadata.
	Metadata map[string]string

	// SessionID is the session group stamped into RunStarted metadata,
	// when configured.
	SessionID string
}

RunResult is the user-facing summary of a completed agent run. It is populated from the events that Run emitted into the log; the same values are recoverable by replaying the log, so RunResult is a convenience, not a source of truth.

type SchemaVersionCmd ¶

type SchemaVersionCmd struct {
	Name   string
	Output io.Writer
}

SchemaVersionCmd is the handle returned by SchemaVersionCommand.

func SchemaVersionCommand ¶

func SchemaVersionCommand() *SchemaVersionCmd

SchemaVersionCommand returns a CLI-style entrypoint for `starling schema-version`.

func (*SchemaVersionCmd) Run ¶

func (c *SchemaVersionCmd) Run(args []string) error

Run prints the current schema version of the SQLite event log at args[0].

type StepEvent ¶

type StepEvent struct {
	Kind   event.Kind
	TurnID string
	CallID string
	Text   string      // assistant text, reasoning content, or tool result
	Tool   string      // for tool call events
	Err    error       // set on Failed kinds
	Raw    event.Event // full envelope for consumers that want everything
}

StepEvent is the user-facing projection of one event, used by the streaming API (Agent.Stream). It is narrower than event.Event so consumers don't have to decode payloads themselves for common cases.

type TextDelta ¶

type TextDelta struct {
	TurnID string
	Text   string
}

TextDelta carries an assistant turn's accumulated text. Emitted once per AssistantMessageCompleted, not per intra-turn chunk — chunk-level streaming is a future addition behind the same AgentEvent surface.

type ToolCallEnded ¶

type ToolCallEnded struct {
	CallID string
	Tool   string
	Result []byte
	Err    error
}

ToolCallEnded reports the tool's outcome. Result is the raw JSON the tool returned (empty on failure). Err is non-nil on a failed call (KindToolCallFailed) and nil on success (KindToolCallCompleted).

type ToolCallRecord ¶

type ToolCallRecord struct {
	CallID   string
	TurnID   string
	Name     string
	Args     json.RawMessage
	Result   json.RawMessage
	Err      string
	Attempt  uint32
	Final    bool
	Duration time.Duration
}

ToolCallRecord is one tool invocation as it appears in the event log. On failure Result is nil and Err is set; Final reports whether retries were exhausted.
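Since a retried call appears once per (CallID, Attempt), summarizing RunResult.ToolCalls usually means grouping by CallID. A minimal sketch with a trimmed local copy of the struct follows; the attempt numbers in the sample data are illustrative and do not imply how Starling numbers attempts.

```go
package main

import "fmt"

// ToolCallRecord is a trimmed local copy of the documented struct.
type ToolCallRecord struct {
	CallID  string
	Name    string
	Err     string
	Attempt uint32
	Final   bool
}

// summarize groups records by CallID. calls is the number of distinct
// calls, retried is how many of them have more than one record, and
// exhausted counts failed records marked Final.
func summarize(recs []ToolCallRecord) (calls, retried, exhausted int) {
	attempts := make(map[string]int)
	for _, r := range recs {
		attempts[r.CallID]++
		if r.Err != "" && r.Final {
			exhausted++
		}
	}
	for _, n := range attempts {
		if n > 1 {
			retried++
		}
	}
	return len(attempts), retried, exhausted
}

func main() {
	recs := []ToolCallRecord{
		{CallID: "c1", Name: "fetch", Attempt: 1, Err: "timeout"},
		{CallID: "c1", Name: "fetch", Attempt: 2},
		{CallID: "c2", Name: "read_file", Attempt: 1, Err: "not found", Final: true},
	}
	calls, retried, exhausted := summarize(recs)
	fmt.Println(calls, retried, exhausted)
}
```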

type ToolCallStarted ¶

type ToolCallStarted struct {
	TurnID string
	CallID string
	Tool   string
}

ToolCallStarted reports that the runtime has scheduled a tool invocation. Emitted on KindToolCallScheduled.

type ToolError ¶

type ToolError struct {
	Name   string
	CallID string
	Err    error
}

ToolError wraps an error returned by a tool invocation with the tool's name and the CallID of the offending call. Used when the agent loop bails because of an unrecoverable tool failure.

func (*ToolError) Error ¶

func (e *ToolError) Error() string

Error implements the error interface, formatting the tool name, CallID, and underlying error.

func (*ToolError) Unwrap ¶

func (e *ToolError) Unwrap() error

Unwrap returns the underlying tool error so callers can route on it with errors.Is / errors.As.

type ValidateCmd ¶

type ValidateCmd struct {
	// Name is used in flag error messages and the usage string.
	Name string
	// Output is where per-run status lines are written. Defaults to
	// os.Stdout.
	Output io.Writer
}

ValidateCmd is the handle returned by ValidateCommand.

func ValidateCommand ¶

func ValidateCommand() *ValidateCmd

ValidateCommand returns a CLI-style entrypoint for `starling validate`. Runs eventlog.Validate over one run (or every run in the log) and prints per-run status. Intended to be invoked from cmd/starling; the returned *ValidateCmd is safe to configure further before Run.

func (*ValidateCmd) Run ¶

func (c *ValidateCmd) Run(args []string) error

Run parses args and validates the requested run(s). Prints one line per run: "<runID>\tOK" or "<runID>\tCORRUPT: <reason>". Returns a non-nil error on I/O failure or on any validation failure, so the caller (cmd/starling) can exit non-zero.

args shape:

<db>            validate every run in the log
<db> <runID>    validate one run
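The tab-separated status lines are straightforward to post-process, for example in a CI step that collects corrupt runs. The sketch below parses the two documented line shapes; the function name and sample run IDs are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// corruptRuns parses validate output and returns runID -> reason for every
// line of the form "<runID>\tCORRUPT: <reason>". Lines without a tab and
// "<runID>\tOK" lines are skipped.
func corruptRuns(output string) map[string]string {
	bad := make(map[string]string)
	for _, line := range strings.Split(output, "\n") {
		runID, status, ok := strings.Cut(line, "\t")
		if !ok {
			continue
		}
		if reason, found := strings.CutPrefix(status, "CORRUPT: "); found {
			bad[runID] = reason
		}
	}
	return bad
}

func main() {
	// Hypothetical output; the reason text is illustrative.
	out := "run-1\tOK\nrun-2\tCORRUPT: hash mismatch\n"
	for id, reason := range corruptRuns(out) {
		fmt.Printf("%s: %s\n", id, reason)
	}
}
```

The command already returns a non-nil error on any validation failure, so parsing is only needed when the caller wants the per-run reasons.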

Directories ¶

Path	Synopsis
budget	Package budget defines cost and token budgets and the enforcement logic that cancels in-flight LLM streams when a budget trips.
cmd
cmd/starling command	Command starling is the runtime CLI for Starling event logs.
cmd/starling-inspect command	Command starling-inspect is a local web inspector for Starling event logs.
cmd/starling-mcp command	Command starling-mcp serves a read-only MCP server over stdio against a Starling SQLite event log.
contracts	Package contracts validates recorded Starling runs against simple CI-friendly assertions.
event	Package event defines the event types, canonical encoding, and hash-chain helpers that form the core of the Starling event log.
eventlog	Package eventlog defines the EventLog interface and ships three default backends: in-memory (NewInMemory), SQLite (NewSQLite), and Postgres (NewPostgres).
examples
examples/branching command	Command branching demonstrates the WAL-safe SQLite fork helper.
examples/hello command	Command hello is the minimal Starling agent: build the provider from OPENAI_API_KEY, run a single-turn no-tools prompt with RunOnce, print the response.
examples/incident_triage command
examples/m1_hello command	Command m1_hello is Starling's end-to-end demo.
examples/m4_inspector_demo command	Command m4_inspector_demo seeds a SQLite event log with a handful of synthetic runs so a developer can boot starling-inspect and look at the UI without an LLM provider key, without internet, and without running a real agent.
examples/manual_writes command	Command manual_writes shows how to write events into a Starling event log without using Agent.Run; useful when integrating non-LLM workflows that nonetheless want the same audit log, inspector, and validation surface.
examples/mcp_tools command
examples/multi_turn command	Command multi_turn shows the recommended pattern for chat-style multi-message workflows: one Run per user message.
inspect	Package inspect implements Starling's local web inspector as a reusable library.
internal
internal/cborenc	Package cborenc is Starling's canonical CBOR codec (RFC 8949 §4.2).
internal/obs	Package obs holds internal observability helpers shared across the root starling package and the step helpers.
mcpsrv	Package mcpsrv exposes a recorded Starling event log to AI assistants over the Model Context Protocol.
merkle	Package merkle provides the binary BLAKE3 Merkle-tree helpers used to commit to a run's event log.
provider	Package provider defines the Provider interface and the normalized stream chunk types every LLM adapter produces.
provider/anthropic	Package anthropic adapts the Anthropic Messages API to Starling's Provider interface.
provider/bedrock	Package bedrock adapts Amazon Bedrock Runtime's ConverseStream API to Starling's Provider interface.
provider/conformance
provider/gemini	Package gemini adapts the Google Gemini API to Starling's Provider interface.
provider/openai	Package openai implements the Provider interface against the OpenAI Chat Completions API.
provider/openrouter	Package openrouter is a thin wrapper over provider/openai that sets OpenRouter's base URL and optional attribution headers (HTTP-Referer, X-Title).
replay	Package replay re-runs a recorded event log through the agent loop and verifies the reproduced events match the recording byte-for-byte.
starlingd	Package starlingd exposes Starling agents as a small HTTP daemon.
starlingtest	Package starlingtest exposes test helpers for downstream consumers of Starling: a deterministic scripted Provider, in-memory event-log seeders, and replay assertions.
step	Package step is the determinism boundary.
tool	Package tool defines the Tool interface and the Typed[In, Out] generic helper for building typed agent tools.
tool/builtin	Package builtin provides a small set of ready-made tools (HTTP fetch, local file read) used by the examples and suitable for simple agents.
tool/mcp	Package mcp adapts Model Context Protocol tools to Starling tools.