commander

package
v0.2.1
Published: Apr 28, 2026 License: MIT Imports: 19 Imported by: 0

Documentation

Overview

Package commander implements the built-in commander — an LLM loop that runs inside the Eyrie process. The user chats directly with Eyrie, and the commander dispatches tools that call into Eyrie's stores directly.

Unlike captains and talons, which run as external framework processes for workspace-sandboxed work, the commander has no sandbox and no subprocess. Its tools are ordinary Go functions.

Constants

const (
	// EventDelta carries an incremental text chunk from the LLM.
	// Field: {"type": "delta", "text": "..."}
	EventDelta = "delta"

	// EventToolCall announces the LLM wants to invoke a tool.
	// Field: {"type": "tool_call", "id": "...", "name": "...", "args": {...}}
	EventToolCall = "tool_call"

	// EventToolResult carries the result of a tool invocation.
	// Field: {"type": "tool_result", "id": "...", "name": "...", "output": "...", "error": false}
	EventToolResult = "tool_result"

	// EventMessage carries a complete, persisted message (assistant-final).
	// Field: {"type": "message", "role": "assistant", "content": "..."}
	EventMessage = "message"

	// EventDone signals the turn is complete. May include token usage.
	// Field: {"type": "done", "input_tokens": N, "output_tokens": M}
	EventDone = "done"

	// EventError signals a terminal error in the turn.
	// Field: {"type": "error", "error": "..."}
	EventError = "error"

	// EventConfirmRequired signals the LLM wants to call a write tool
	// that requires user approval. The turn pauses after this event;
	// the client must POST /api/commander/confirm/{id} with an approval
	// decision before the tool executes. See pending.go.
	// Field: {"type": "confirm_required", "id": "pa_xxx", "tool": "...",
	//         "args": {...}, "summary": "human-readable description"}
	EventConfirmRequired = "confirm_required"
)

SSE event types emitted by the commander's chat endpoint. These are the wire contract between backend and any client (curl, future UI, test scripts). Each event is JSON with a `type` field the client discriminates on.

WHY types as constants: constants avoid typos, make the event names easy to grep for, and give a single source of truth when the UI is built on another machine.
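A client consuming this stream typically decodes each event's JSON payload and switches on `type`. The following is a minimal client-side sketch, not code from this package: the `envelope` struct and `describe` helper are hypothetical, and only the event names and field names are taken from the comments above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event type names, mirroring a subset of the constants documented above.
const (
	EventDelta           = "delta"
	EventDone            = "done"
	EventError           = "error"
	EventConfirmRequired = "confirm_required"
)

// envelope is a hypothetical client-side struct. It decodes only the
// fields this sketch reads; unknown fields are ignored by encoding/json.
type envelope struct {
	Type  string `json:"type"`
	Text  string `json:"text"`
	ID    string `json:"id"`
	Error string `json:"error"`
}

// describe decodes one event payload and returns a short description
// chosen by the event's type.
func describe(payload []byte) (string, error) {
	var ev envelope
	if err := json.Unmarshal(payload, &ev); err != nil {
		return "", err
	}
	switch ev.Type {
	case EventDelta:
		return "text: " + ev.Text, nil
	case EventConfirmRequired:
		return "needs approval: " + ev.ID, nil
	case EventError:
		return "failed: " + ev.Error, nil
	case EventDone:
		return "turn complete", nil
	default:
		// Unhandled or future event types are skipped rather than rejected.
		return "ignored: " + ev.Type, nil
	}
}

func main() {
	out, err := describe([]byte(`{"type":"delta","text":"hello"}`))
	fmt.Println(out, err)
}
```

Skipping unknown types in the default branch is a choice made for this sketch; it lets an older client keep working if new event types are added later.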

Variables

var ErrMemoryNotFound = errors.New("memory key not found")

ErrMemoryNotFound is returned when Recall or Forget targets a missing key.

Functions

func NormalizeKey

func NormalizeKey(key string) string

NormalizeKey returns the canonical form of a memory key: lowercase and trimmed. Exposed so tool validation can display the normalized form back to the LLM (e.g. confirm_required summary).
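As an illustration of "lowercase and trimmed", the sketch below shows one way the documented behavior could be implemented. The lowercase name `normalizeKey` marks it as a stand-in; the package's actual implementation is not shown on this page and may differ in details such as the order of operations.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeKey is a guess at the documented behavior: trim surrounding
// whitespace, then lowercase.
func normalizeKey(key string) string {
	return strings.ToLower(strings.TrimSpace(key))
}

func main() {
	fmt.Printf("%q\n", normalizeKey("  Preferred-Editor "))
}
```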

Types

type AuditEntry

type AuditEntry struct {
	Timestamp time.Time      `json:"timestamp"`
	PendingID string         `json:"pending_id,omitempty"`
	Tool      string         `json:"tool"`
	Args      map[string]any `json:"args"`
	Risk      string         `json:"risk"`
	Decision  string         `json:"decision"`
	Outcome   string         `json:"outcome"`
	Error     string         `json:"error,omitempty"`
	Reason    string         `json:"reason,omitempty"`
}

type AuditLog

type AuditLog struct {
	// contains filtered or unexported fields
}

AuditLog writes audit entries to disk. Thread-safe via a mutex — the write rate is low enough (one line per write-tool attempt) that mutex contention is not a concern.

func NewAuditLog

func NewAuditLog() (*AuditLog, error)

NewAuditLog creates an audit log backed by ~/.eyrie/commander/audit.jsonl.

func (*AuditLog) Append

func (a *AuditLog) Append(entry AuditEntry) error

Append writes one audit entry. Failures are non-fatal: Append returns the error, but the caller should log it rather than abort the user operation. The audit log is an observability aid, not a gating mechanism.
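The sketch below shows the general shape of a mutex-guarded JSONL appender and the "log, don't abort" calling convention. It is an assumption-laden illustration: `entry`, `jsonlLog`, `demo`, and the tool name `example_tool` are hypothetical, and the file lives in a temporary directory rather than ~/.eyrie/commander/audit.jsonl.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// entry is a trimmed-down, hypothetical stand-in for AuditEntry.
type entry struct {
	Tool     string `json:"tool"`
	Decision string `json:"decision"`
}

// jsonlLog appends one JSON object per line, serialized by a mutex.
type jsonlLog struct {
	mu   sync.Mutex
	path string
}

// Append marshals e and appends it to the file as a single line.
func (l *jsonlLog) Append(e entry) error {
	line, err := json.Marshal(e)
	if err != nil {
		return err
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	f, err := os.OpenFile(l.path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	if _, err := f.Write(append(line, '\n')); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// demo appends two entries to a temp file and returns the file contents.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "audit")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	log := &jsonlLog{path: filepath.Join(dir, "audit.jsonl")}
	for _, decision := range []string{"approved", "denied"} {
		if err := log.Append(entry{Tool: "example_tool", Decision: decision}); err != nil {
			// Non-fatal by design: report the failure and carry on.
			fmt.Println("audit append failed:", err)
		}
	}
	data, err := os.ReadFile(log.path)
	return string(data), err
}

func main() {
	out, err := demo()
	fmt.Print(out)
	if err != nil {
		fmt.Println("read failed:", err)
	}
}
```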

type Commander

type Commander struct {
	// contains filtered or unexported fields
}

Commander is the orchestrator: it holds the LLM provider, the conversation store, and the tool registry. One Commander instance serves the whole process — there is a single persistent conversation with the user.

func New

func New(cfg Config) *Commander

New constructs a Commander. Callers supply all dependencies so this package has no knowledge of how the provider is obtained (env var, config file, vault) — that's the server's responsibility.

func NewDefault

func NewDefault(deps DefaultConfig) (*Commander, error)

NewDefault builds a Commander with the skeleton defaults: OpenRouter as the LLM provider, a Claude model with tool-calling support, the standard conversation store, and the full built-in tool registry.

WHY OpenRouter (not the Claude Max proxy): the claude-max-api proxy runs Claude Code internally — it ignores custom tool definitions and only lets the model use Claude Code's built-in tools. For the commander's tool-calling loop to work, we need a real LLM endpoint that forwards tool definitions to the model.

WHY baked-in defaults in the skeleton: a proper config file and provider switching are Phase 5a follow-up work. OpenRouter key retrieval uses the existing vault (env var > keys.json fallback).

func (*Commander) Memory

func (c *Commander) Memory() *MemoryStore

Memory returns the memory store, or nil if memory is disabled. Exposed for the HTTP read endpoint.

func (*Commander) Pending

func (c *Commander) Pending() *PendingStore

Pending returns the pending-action store. Exposed so the server's confirm endpoint can look up and resolve pending actions.

func (*Commander) ResumeAfterConfirm

func (c *Commander) ResumeAfterConfirm(ctx context.Context, pa *PendingAction, approved bool, reason string, emit Emitter) error

ResumeAfterConfirm processes the result of a user approval or denial and either (a) continues processing remaining unresolved tool_calls from the same assistant message, or (b) runs the LLM continuation once all tool_calls in that batch are resolved. Called by the confirm endpoint in the server.

The flow:

  • If approved: execute the tool, append the result (or error) to history as the tool_result for the unresolved tool_call.
  • If denied: append a synthetic tool_result describing the denial.
  • Check the parent assistant message for other tool_calls that still lack tool_results (this happens when the LLM batched multiple tool_calls in one reply). If any remain, process them via processToolCalls: auto tools run immediately, and the next Confirm tool emits another confirm_required event and pauses the turn again.
  • Only when all tool_calls in the batch are resolved do we call runContinuation so the LLM can react to the full batch of results (e.g. "I've created all three projects").

Both outcomes, approval and denial, are written to the audit log.
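The "are there unresolved tool_calls left in this batch?" check in the flow above can be sketched as follows. The `message` type and `unresolved` helper are hypothetical simplifications; the real history uses embedded.Message, whose fields are not shown on this page.

```go
package main

import "fmt"

// message is a simplified, hypothetical stand-in for a history entry.
type message struct {
	Role        string   // "assistant" or "tool"
	ToolCallIDs []string // set on assistant messages that request tools
	ToolCallID  string   // set on tool messages: the call this result answers
}

// unresolved returns the ids requested by parent that have no tool
// result in history yet, in the order the LLM requested them.
func unresolved(parent message, history []message) []string {
	answered := make(map[string]bool)
	for _, m := range history {
		if m.Role == "tool" {
			answered[m.ToolCallID] = true
		}
	}
	var open []string
	for _, id := range parent.ToolCallIDs {
		if !answered[id] {
			open = append(open, id)
		}
	}
	return open
}

func main() {
	parent := message{Role: "assistant", ToolCallIDs: []string{"call_1", "call_2", "call_3"}}
	history := []message{parent, {Role: "tool", ToolCallID: "call_1"}}

	if open := unresolved(parent, history); len(open) == 0 {
		fmt.Println("batch resolved: run the LLM continuation")
	} else {
		fmt.Println("still open:", open)
	}
}
```

In this sketch an empty result corresponds to the point where the continuation would run; a non-empty result corresponds to handing the remaining calls back to the tool-processing step.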

func (*Commander) RunTurn

func (c *Commander) RunTurn(ctx context.Context, userMessage string, emit Emitter) error

RunTurn handles one user message: loads history, appends the user message, calls the LLM + tools in a bounded loop, and streams events to the emitter. Returns nil on success; on error, an error event has already been emitted via the emitter.

func (*Commander) Store

func (c *Commander) Store() *Store

Store returns the underlying conversation store. Exposed for history and clear endpoints.

type Config

type Config struct {
	Provider      embedded.LLMProvider
	Model         string
	ContextWindow int // model's max context in tokens; 0 omits from done events
	Store         *Store
	Tools         *Registry
	Pending       *PendingStore // optional; defaults to a fresh in-memory store
	Audit         *AuditLog     // optional; nil disables audit logging
	Memory        *MemoryStore  // optional; nil omits memory injection + tools
}

Config configures a new Commander.

type DefaultConfig

type DefaultConfig struct {
	Projects      *project.Store
	Chat          *project.ChatStore
	Discovery     func(ctx context.Context) discovery.Result
	SendToProject func(ctx context.Context, projectID, message string) error
	RestartAgent  func(ctx context.Context, name string) error
	// Vault is the key vault to read API keys from. When nil,
	// selectProvider falls back to config.GetKeyVault().
	Vault *config.KeyVault
}

DefaultConfig bundles everything NewDefault needs from the caller. The server populates this with its cached stores and method values for server-side callbacks (discovery, project message injection, agent restart).

type Emitter

type Emitter interface {
	WriteEvent(v any) error
}

Emitter is the minimal interface the turn loop needs to push events to the client. The server's SSEWriter satisfies this.

WHY an interface (not *server.SSEWriter): prevents the commander package from importing the server package, which would create an import cycle since the server imports the commander.
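Because the interface has a single method, a test double is only a few lines. The sketch below redeclares Emitter locally so it compiles on its own; `recorder` and `emitGreeting` are hypothetical names used for illustration.

```go
package main

import "fmt"

// Emitter matches the interface documented above.
type Emitter interface {
	WriteEvent(v any) error
}

// recorder is a hypothetical in-memory Emitter that keeps every event,
// which is convenient for asserting on a turn's output in tests.
type recorder struct {
	events []any
}

// WriteEvent stores the event and never fails.
func (r *recorder) WriteEvent(v any) error {
	r.events = append(r.events, v)
	return nil
}

// emitGreeting stands in for code that depends only on the interface,
// not on any concrete SSE writer.
func emitGreeting(e Emitter) error {
	if err := e.WriteEvent(map[string]any{"type": "delta", "text": "hi"}); err != nil {
		return err
	}
	return e.WriteEvent(map[string]any{"type": "done"})
}

func main() {
	rec := &recorder{}
	if err := emitGreeting(rec); err != nil {
		panic(err)
	}
	fmt.Println(len(rec.events), "events recorded")
}
```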

type MemoryEntry

type MemoryEntry struct {
	Key       string    `json:"key"`
	Value     string    `json:"value"`
	CreatedAt time.Time `json:"created_at"`
	UpdatedAt time.Time `json:"updated_at"`
}

MemoryEntry is one note the commander has stored about the user, a project, or a preference. Entries are keyed by a normalized string (lowercase + trimmed) so minor capitalization differences do not create duplicate notes for the same concept.

type MemoryStore

type MemoryStore struct {
	// contains filtered or unexported fields
}

MemoryStore is a flat, mutex-guarded key-value store backed by a single JSON file at ~/.eyrie/commander/memory.json.

WHY flat JSON (not JSONL): chat is append-only, memory is edit-in-place. JSONL would require rewriting the whole file on every Forget anyway, so one JSON object keeps the shape matching the semantics. Atomic rewrites via fileutil.AtomicWrite prevent partial writes.

WHY in-memory cache: memory contents are injected into the system prompt every turn. Re-reading the file each turn would add disk I/O to every LLM call for no benefit. The in-memory copy is authoritative, and the disk file is rewritten only on mutation.
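The atomic-rewrite idea can be sketched with the common write-to-temp-then-rename pattern. This is not fileutil.AtomicWrite, whose signature is not shown here; `atomicWriteJSON` and `demo` are hypothetical, and the sketch assumes the temp file and target share a filesystem so the rename replaces the target in one step.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// atomicWriteJSON writes v to a temp file in the target's directory and
// then renames it over path, so readers never see a half-written file.
func atomicWriteJSON(path string, v any) error {
	data, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*")
	if err != nil {
		return err
	}
	// Cleans up on failure; after a successful rename the temp name no
	// longer exists and this Remove fails harmlessly.
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo rewrites the same file twice and returns the value read back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "memory")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "memory.json")
	for _, editor := range []string{"vim", "helix"} {
		if err := atomicWriteJSON(path, map[string]string{"editor": editor}); err != nil {
			return "", err
		}
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	var got map[string]string
	if err := json.Unmarshal(data, &got); err != nil {
		return "", err
	}
	return got["editor"], nil
}

func main() {
	fmt.Println(demo())
}
```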

func NewMemoryStore

func NewMemoryStore() (*MemoryStore, error)

NewMemoryStore constructs a MemoryStore backed by the default path. Loads existing entries if the file exists; an absent file is treated as an empty store. A malformed file logs an error at load time but still returns a usable empty store — memory is advisory, not critical.

func (*MemoryStore) Forget

func (m *MemoryStore) Forget(key string) error

Forget removes an entry. Returns ErrMemoryNotFound if the key is not stored — callers can treat this as non-fatal if idempotency matters.

func (*MemoryStore) List

func (m *MemoryStore) List() []MemoryEntry

List returns all entries in deterministic order (sorted by key). Safe to call when there are no entries (returns empty slice).

func (*MemoryStore) Recall

func (m *MemoryStore) Recall(key string) (MemoryEntry, error)

Recall returns the entry for a key. Returns ErrMemoryNotFound if the key is not stored.

func (*MemoryStore) Remember

func (m *MemoryStore) Remember(key, value string) (MemoryEntry, error)

Remember inserts or updates an entry. Returns the stored entry after normalization. Empty key or value is rejected.
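The Remember/Recall/Forget semantics described above can be modeled in memory as follows. This is a simplified sketch, not the MemoryStore implementation: `notes`, `normalize`, and `errNotFound` are hypothetical, values are plain strings rather than MemoryEntry, there is no disk persistence, and rejecting a key that is empty after normalization is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// errNotFound plays the role of ErrMemoryNotFound in this sketch.
var errNotFound = errors.New("memory key not found")

// notes is a hypothetical mutex-guarded key-value store.
type notes struct {
	mu      sync.Mutex
	entries map[string]string
}

func normalize(key string) string {
	return strings.ToLower(strings.TrimSpace(key))
}

// Remember inserts or updates a value and returns the normalized key.
func (n *notes) Remember(key, value string) (string, error) {
	k := normalize(key)
	if k == "" || value == "" {
		return "", errors.New("key and value must be non-empty")
	}
	n.mu.Lock()
	defer n.mu.Unlock()
	n.entries[k] = value
	return k, nil
}

// Recall looks a value up by its normalized key.
func (n *notes) Recall(key string) (string, error) {
	n.mu.Lock()
	defer n.mu.Unlock()
	v, ok := n.entries[normalize(key)]
	if !ok {
		return "", errNotFound
	}
	return v, nil
}

// Forget deletes an entry, reporting errNotFound for a missing key.
func (n *notes) Forget(key string) error {
	k := normalize(key)
	n.mu.Lock()
	defer n.mu.Unlock()
	if _, ok := n.entries[k]; !ok {
		return errNotFound
	}
	delete(n.entries, k)
	return nil
}

func main() {
	n := &notes{entries: map[string]string{}}
	k, _ := n.Remember("  Editor ", "vim")
	v, _ := n.Recall("EDITOR")
	fmt.Println(k, v)
	// A caller that wants idempotent deletes can ignore errNotFound.
	if err := n.Forget("editor"); err != nil && !errors.Is(err, errNotFound) {
		fmt.Println("forget failed:", err)
	}
}
```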

type PendingAction

type PendingAction struct {
	ID        string         `json:"id"`
	Tool      string         `json:"tool"`
	Args      map[string]any `json:"args"`
	Summary   string         `json:"summary"` // human-readable description
	CreatedAt time.Time      `json:"created_at"`
	ExpiresAt time.Time      `json:"expires_at"`
	Status    PendingStatus  `json:"status"`

	// ToolCallID is the LLM-assigned id for the unresolved tool_call in
	// the assistant message. We use this when resuming to emit a tool
	// result message with the matching tool_call_id.
	ToolCallID string `json:"tool_call_id,omitempty"`

	// Denial reason (set only when Status == PendingDenied)
	DenialReason string `json:"denial_reason,omitempty"`
}

PendingAction is a record of a tool call that requires user approval before executing. Stored in memory with a TTL.

WHY in-memory only (no persistence): pending actions are short-lived and tied to the current commander turn. If the server restarts, the user's chat context resets anyway — losing pending actions on restart is the expected behavior (equivalent to cancelling all in-flight approvals).

type PendingStatus

type PendingStatus string

PendingStatus tracks the lifecycle of a pending action.

const (
	PendingOpen     PendingStatus = "open"     // awaiting user decision
	PendingApproved PendingStatus = "approved" // user approved (tool has run or is running)
	PendingDenied   PendingStatus = "denied"   // user denied
	PendingExpired  PendingStatus = "expired"  // TTL exceeded before decision
)

type PendingStore

type PendingStore struct {
	// contains filtered or unexported fields
}

PendingStore is an in-memory, goroutine-safe map of pending actions. Expired entries are swept lazily on Get/Add (no background goroutine, to keep shutdown simple).
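Lazy sweeping means expiry is enforced whenever the store is touched, with no background timer. The sketch below illustrates the idea under simplifying assumptions: `store`, `action`, and `demo` are hypothetical, expired entries are deleted outright (the real store may instead mark them with PendingExpired), and the clock is injectable so the expiry path can be exercised without sleeping.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// action is a minimal stand-in for PendingAction.
type action struct {
	ID        string
	ExpiresAt time.Time
}

// store is a hypothetical, simplified pending store.
type store struct {
	mu      sync.Mutex
	actions map[string]*action
	now     func() time.Time // injectable clock
}

// sweepLocked drops expired entries. Callers must hold s.mu.
func (s *store) sweepLocked() {
	now := s.now()
	for id, a := range s.actions {
		if now.After(a.ExpiresAt) {
			delete(s.actions, id)
		}
	}
}

// Add sweeps, then stores a new action that expires after ttl.
func (s *store) Add(id string, ttl time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sweepLocked()
	s.actions[id] = &action{ID: id, ExpiresAt: s.now().Add(ttl)}
}

// Get sweeps, then looks the action up.
func (s *store) Get(id string) (*action, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sweepLocked()
	a, ok := s.actions[id]
	if !ok {
		return nil, errors.New("pending action not found or expired")
	}
	return a, nil
}

// demo reports whether an action is found before and after its TTL.
func demo() (foundBefore, foundAfter bool) {
	clock := time.Now()
	s := &store{actions: map[string]*action{}, now: func() time.Time { return clock }}
	s.Add("pa_1", time.Minute)
	_, err := s.Get("pa_1")
	foundBefore = err == nil
	clock = clock.Add(2 * time.Minute) // advance past the TTL
	_, err = s.Get("pa_1")
	foundAfter = err == nil
	return foundBefore, foundAfter
}

func main() {
	fmt.Println(demo())
}
```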

func NewPendingStore

func NewPendingStore() *PendingStore

NewPendingStore creates an empty pending store with the default TTL.

func (*PendingStore) Add

func (s *PendingStore) Add(tool string, args map[string]any, summary, toolCallID string) *PendingAction

Add stores a new pending action and returns it. The caller supplies the tool, args, summary, and toolCallID; Add fills in the ID, the timestamps, and the Status.

func (*PendingStore) Approve

func (s *PendingStore) Approve(id string) (*PendingAction, error)

Approve marks a pending action as approved and returns a copy for the caller to execute. Returns an error if the action is not in open state.

func (*PendingStore) Deny

func (s *PendingStore) Deny(id, reason string) (*PendingAction, error)

Deny marks a pending action as denied with an optional reason.

func (*PendingStore) Get

func (s *PendingStore) Get(id string) (*PendingAction, error)

Get returns the pending action with the given id, or an error if missing. Expired actions are swept first.

type Registry

type Registry struct {
	// contains filtered or unexported fields
}

Registry holds the tools available to the commander. The registry is built once at Commander construction time and is read-only thereafter.

func NewRegistry

func NewRegistry(deps RegistryDeps) *Registry

NewRegistry builds a registry populated with the built-in tool set. Additional tools are registered here as the commander grows.

func (*Registry) Definitions

func (r *Registry) Definitions() []embedded.ToolDef

Definitions returns all registered tools as LLM-ready ToolDefs. Returns the cached slice built at construction time.

func (*Registry) Get

func (r *Registry) Get(name string) *Tool

Get returns the tool with the given name, or nil if not registered.

type RegistryDeps

type RegistryDeps struct {
	// Projects is the project store for read + write tools.
	Projects *project.Store
	// Chat is the project chat store, used by read_project_chat.
	Chat *project.ChatStore
	// Discovery runs agent discovery on demand; used by list_agents.
	Discovery func(ctx context.Context) discovery.Result
	// SendToProject injects a commander message into a project's chat
	// and kicks off the project orchestrator in the background (captain
	// responds asynchronously). Returns an error only if the injection
	// itself fails (project not found, etc.).
	SendToProject func(ctx context.Context, projectID, message string) error
	// RestartAgent stops then starts an agent by name. Best-effort;
	// returns an error if either step fails.
	RestartAgent func(ctx context.Context, name string) error
	// Memory is the commander's persistent key-value note store, exposed
	// through the remember/recall/forget tools.
	Memory *MemoryStore
}

RegistryDeps bundles the stores and callbacks the built-in tools need. Passed to NewRegistry so tool implementations can close over exactly what they need without the registry package importing server internals.

WHY function fields (not pointers to server types): discovery, message injection, and agent lifecycle live in the server package. Passing functions lets the server supply method values without the commander package importing server — avoids an import cycle.

type Risk

type Risk int

Risk classifies a tool's blast radius. The turn loop gates execution based on this: Auto tools run immediately; Confirm tools pause for out-of-band user approval before executing.

WHY a typed enum, not a bool: leaves room for a future Dangerous tier (e.g. "confirm + require typing target id") without another round of API changes. For MVP only Auto and Confirm are implemented.

const (
	// RiskAuto executes immediately. Use for read-only and trivially
	// reversible tools. Every Auto tool call still goes in the audit log.
	RiskAuto Risk = iota
	// RiskConfirm requires out-of-band user approval before execution.
	// The turn emits a confirm_required event, stores a pending action,
	// and ends the turn with an unresolved tool_call. Execution happens
	// only when /api/commander/confirm/{id} is POSTed with approved=true.
	RiskConfirm
)
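A sketch of how a loop might branch on Risk follows. The `gate` helper and its string results are hypothetical, and treating an unrecognized tier like Confirm (failing closed) is a choice made for this sketch rather than documented behavior.

```go
package main

import "fmt"

// Risk and its values are redeclared here so the sketch is self-contained.
type Risk int

const (
	RiskAuto Risk = iota
	RiskConfirm
)

// gate returns what a turn loop would do with a tool of the given risk:
// "execute" to run it now, or "pause" to wait for user approval.
func gate(r Risk) string {
	switch r {
	case RiskAuto:
		return "execute"
	case RiskConfirm:
		return "pause"
	default:
		// Unknown tiers fail closed: require approval.
		return "pause"
	}
}

func main() {
	fmt.Println(gate(RiskAuto), gate(RiskConfirm))
}
```

Because RiskAuto is the zero value, a tool that leaves Risk unset is treated as Auto, so write tools need to set RiskConfirm explicitly.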

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store persists the commander's conversation as an append-only JSONL file. One file per process — there is exactly one commander conversation.

WHY one file (not per-session): The commander has a single persistent relationship with the user, not ephemeral per-project sessions. Memory and context span across projects; fragmenting into sessions would complicate that relationship. A single long-running file is simpler.

WHY JSONL not SQLite: Same reasoning as project chat — append-only, simple to inspect with jq, no WAL/locking complexity. If the commander ever needs random-access queries (e.g. "find the first time I mentioned X"), we can migrate to SQLite then.

func NewStore

func NewStore() (*Store, error)

NewStore creates a Store backed by ~/.eyrie/commander/chat.jsonl. The directory is created on first append, not here, so the store can be constructed before the config dir exists.

func (*Store) All

func (s *Store) All() ([]embedded.Message, error)

All returns every message in the conversation in insertion order. Returns (nil, nil) if the file does not yet exist.
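The two behaviors described above, creating the directory on first append and returning (nil, nil) for a missing file, can be sketched like this. The `message` type, `appendMessage`, `readAll`, and `demo` are hypothetical stand-ins; the real store works with embedded.Message and a fixed path.

```go
package main

import (
	"bufio"
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// message is a simplified stand-in for embedded.Message.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// appendMessage creates the parent directory if needed, then appends
// m as one JSON line.
func appendMessage(path string, m message) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o700); err != nil {
		return err
	}
	line, err := json.Marshal(m)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	if _, err := f.Write(append(line, '\n')); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// readAll returns every message in file order, or (nil, nil) when the
// file does not exist yet.
func readAll(path string) ([]message, error) {
	f, err := os.Open(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var out []message
	sc := bufio.NewScanner(f)
	// Raise the per-line limit above the 64 KiB default for long messages.
	sc.Buffer(make([]byte, 0, 64*1024), 1<<20)
	for sc.Scan() {
		var m message
		if err := json.Unmarshal(sc.Bytes(), &m); err != nil {
			return nil, err
		}
		out = append(out, m)
	}
	return out, sc.Err()
}

// demo counts messages before and after two appends.
func demo() (int, int, error) {
	dir, err := os.MkdirTemp("", "chat")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "commander", "chat.jsonl")
	msgs, err := readAll(path)
	if err != nil {
		return 0, 0, err
	}
	before := len(msgs)
	for _, m := range []message{{Role: "user", Content: "hi"}, {Role: "assistant", Content: "hello"}} {
		if err := appendMessage(path, m); err != nil {
			return 0, 0, err
		}
	}
	msgs, err = readAll(path)
	return before, len(msgs), err
}

func main() {
	fmt.Println(demo())
}
```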

func (*Store) Append

func (s *Store) Append(msg embedded.Message) error

Append adds one message to the conversation.

func (*Store) Clear

func (s *Store) Clear() error

Clear removes the conversation file entirely. Safe to call on a nonexistent file.

func (*Store) Rewrite

func (s *Store) Rewrite(messages []embedded.Message) error

Rewrite replaces the entire conversation file atomically. Used for operations that need to modify or prune history (not used in the skeleton, but exposed for future memory/summarization work).

type Tool

type Tool struct {
	Name        string
	Description string
	// Risk controls whether the tool requires user confirmation.
	// Zero value (RiskAuto) is safe for read-only tools.
	Risk Risk
	// Parameters is a JSON Schema describing the tool's arguments.
	// Passed to the LLM as the tool's `function.parameters` field.
	Parameters map[string]any
	// Execute runs the tool. `args` is the parsed JSON object the LLM
	// provided. Returns a string the LLM will see as the tool result.
	Execute func(ctx context.Context, args map[string]any) (string, error)
	// Summarize produces a one-line human-readable description of what
	// this tool call would do with these args. Used in confirm_required
	// events so the user sees a clear summary before approving. Optional
	// — defaults to "<tool_name>(<args>)".
	Summarize func(args map[string]any) string
}

Tool is an action the commander can invoke. Each tool is a plain Go function — no HTTP, no subprocess, no sandbox. The tool executes directly against Eyrie's in-process stores.

WHY no subprocess: The commander orchestrates; it does not do workspace work. Captains and talons are the ones that need sandboxing. Direct function calls eliminate a whole class of failures (serialization, streaming parser mismatches, subprocess lifecycle).
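To make the shape of a tool concrete, the sketch below defines a toy tool and the documented Summarize fallback. The `Tool` struct is redeclared with only the fields the sketch uses, the `echo` tool and `summary` helper are hypothetical, and rendering the args as JSON inside the "<tool_name>(<args>)" fallback is an assumption about the format.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

// Tool is a trimmed-down copy of the struct documented above.
type Tool struct {
	Name      string
	Execute   func(ctx context.Context, args map[string]any) (string, error)
	Summarize func(args map[string]any) string
}

// summary returns t.Summarize(args) when set, otherwise a
// "<tool_name>(<args>)" fallback with args rendered as JSON.
func summary(t Tool, args map[string]any) string {
	if t.Summarize != nil {
		return t.Summarize(args)
	}
	encoded, err := json.Marshal(args)
	if err != nil {
		return t.Name + "(?)"
	}
	return fmt.Sprintf("%s(%s)", t.Name, encoded)
}

func main() {
	// echo is a toy tool: a plain Go closure, no HTTP or subprocess.
	echo := Tool{
		Name: "echo",
		Execute: func(ctx context.Context, args map[string]any) (string, error) {
			text, ok := args["text"].(string)
			if !ok {
				return "", fmt.Errorf("echo: text must be a string")
			}
			return text, nil
		},
	}
	args := map[string]any{"text": "hello"}
	fmt.Println(summary(echo, args))
	out, err := echo.Execute(context.Background(), args)
	fmt.Println(out, err)
}
```

Validating argument types inside Execute matters in practice, since the args map comes from JSON the LLM produced and may not match the declared schema.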

func (Tool) Definition

func (t Tool) Definition() embedded.ToolDef

Definition converts the Tool to the OpenAI-format ToolDef expected by the LLM provider.
