daemon

package
v0.7.8 Latest
Published: May 12, 2026 License: MIT Imports: 34 Imported by: 0

README

daemon/ — long-running rensei-daemon runtime

Status: Wave 6 / Phase F.2.8 (REN-1461). Public package; the af daemon … CLI surface is in afcli/daemon.go. Architecture: rensei-architecture/004-sandbox-capability-matrix.md §Local daemon mode + 011-local-daemon-fleet.md.

The daemon is a single-machine, multi-project supervisor that:

  1. Registers itself with the platform (/api/workers/register) and exchanges a one-time rsp_live_* token for a scoped runtime JWT.
  2. Sends a periodic heartbeat (/api/workers/<id>/heartbeat) and polls for queued work (/api/workers/<id>/poll).
  3. Accepts inbound SessionSpec payloads and spawns a worker child process per accepted session.
  4. Exposes a localhost-only HTTP control API on 127.0.0.1:7734 for the af CLI and for the spawned worker children themselves.
  5. Optionally self-updates by drain → fetch → verify → swap → restart.

Spawn flow (F.2.8)

        ┌────────────────────┐
        │ platform.poll()    │  GET /api/workers/<id>/poll
        └──────────┬─────────┘
                   │ work[] item
                   ▼
        ┌────────────────────┐
        │ Daemon.AcceptWork  │
        │   WithDetail()     │
        └──────────┬─────────┘
                   │ stores SessionDetail
                   ▼
        ┌────────────────────┐
        │ WorkerSpawner.spawn│  exec.CommandContext(<af>, "agent", "run")
        │                    │  env: RENSEI_SESSION_ID=<id>,
        │                    │       RENSEI_REPOSITORY=<repo>, …
        └──────────┬─────────┘
                   │
                   ▼
        ┌────────────────────┐
        │ af agent run       │  GET 127.0.0.1:7734/api/daemon/sessions/<id>
        │   (afcli/agent_run)│  → SessionDetail with QueuedWork shape +
        │                    │     AuthToken + PlatformURL + WorkerID
        └──────────┬─────────┘
                   │ runner.Run(ctx, qw)
                   ▼
        ┌────────────────────┐
        │ runner orchestrator│  worktree → spawn provider → events →
        │                    │  tail recovery → result.Post
        └────────────────────┘

The af binary registered by daemon install doubles as both the daemon supervisor (af daemon run) and the per-session worker (af agent run) — the same binary, different subcommands. The WorkerCommand defaults to [<self-exe>, "agent", "run"] resolved via os.Executable(); operators rarely override this.
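
As a rough illustration of the default described above, the following sketch resolves the running executable and appends the agent run subcommand. The helper name defaultWorkerCommand is hypothetical and is not part of this package; only os.Executable and the [<self-exe>, "agent", "run"] shape come from the text.

```go
package main

import (
	"fmt"
	"os"
)

// defaultWorkerCommand is a hypothetical helper that mirrors the documented
// default: the path of the running binary followed by "agent" and "run".
func defaultWorkerCommand() ([]string, error) {
	self, err := os.Executable()
	if err != nil {
		return nil, fmt.Errorf("resolve self executable: %w", err)
	}
	return []string{self, "agent", "run"}, nil
}

func main() {
	cmd, err := defaultWorkerCommand()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(cmd)
}
```

Resolving the path at runtime rather than hard-coding it means the supervisor and the worker stay on the same build after the binary is moved or swapped.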

SessionDetail lifecycle
  • Set: Daemon.AcceptWorkWithDetail records the detail in an in-memory map keyed by session id when the poll loop dispatches a work item.
  • Read: GET /api/daemon/sessions/<id> (handled by daemon/server.go::handleSessionDetail) returns the JSON payload to the spawned af agent run worker.
  • Delete: the spawner emits SessionEventEnded when the worker child process exits; the daemon's listener removes the entry from the store so stale auth tokens do not linger in memory.
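
The set / read / delete lifecycle above can be pictured as a small mutex-guarded map. This is only a sketch under assumptions: the type and method names (sessionDetail, detailStore, Set, Get, Delete) are illustrative and the real store may be shaped differently.

```go
package main

import (
	"fmt"
	"sync"
)

// sessionDetail is a hypothetical stand-in for the package's SessionDetail.
type sessionDetail struct {
	SessionID string
	AuthToken string
}

// detailStore is an in-memory map keyed by session id, guarded by a mutex.
type detailStore struct {
	mu      sync.Mutex
	details map[string]sessionDetail
}

func newDetailStore() *detailStore {
	return &detailStore{details: make(map[string]sessionDetail)}
}

// Set records the detail when a work item is accepted.
func (s *detailStore) Set(d sessionDetail) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.details[d.SessionID] = d
}

// Get serves the read path; ok is false for unknown or already-ended sessions.
func (s *detailStore) Get(id string) (sessionDetail, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	d, ok := s.details[id]
	return d, ok
}

// Delete drops the entry once the worker exits so the token does not linger.
func (s *detailStore) Delete(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.details, id)
}

func main() {
	store := newDetailStore()
	store.Set(sessionDetail{SessionID: "sess-1", AuthToken: "example-token"})
	_, ok := store.Get("sess-1")
	fmt.Println("before end:", ok)
	store.Delete("sess-1")
	_, ok = store.Get("sess-1")
	fmt.Println("after end:", ok)
}
```

A missing entry on the read path corresponds to the 404 case described in the operator runbook: either the work was never accepted or the session has already ended.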

Repository URL resolution (REN-1464 / v0.5.2)

SessionDetail.repository is resolved from the daemon.yaml project allowlist by pollItemToSessionDetail (in poll.go). The runner uses this URL for git clone.

The platform's QueuedWork wire shape historically carries a projectName slug (e.g. "smoke-alpha") with no separate repository URL — slugs are not clonable. When the poll item arrives the daemon runs the same matcher as WorkerSpawner.findProjectLocked (REN-1448): by id, by repository, or by URL-suffix. The matching entry's repository field is substituted into SessionDetail.repository, and the canonical id is mirrored back into SessionDetail.projectName so downstream code that reads RENSEI_PROJECT_ID sees a stable value.

If no allowlist entry matches, the daemon falls back to whatever the platform sent (preserving prior behaviour) and emits a Warn log, "no allowlist match for projectName, falling back to as-given repo string", so the misconfiguration is visible. Downstream, WorkerSpawner.AcceptWork will then reject the spec with "repository ... is not in the project allowlist", but the explicit log makes the resolution-time failure observable separately from the spawn-time rejection.
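
The resolution order can be sketched roughly as below. The names project, matchProject and resolveRepository are hypothetical, and the URL-suffix rule shown here (the repository ends with "/" plus the key) is an assumption; the actual matcher in findProjectLocked may normalise URLs differently.

```go
package main

import (
	"fmt"
	"strings"
)

// project is a hypothetical stand-in for an allowlist entry in daemon.yaml.
type project struct {
	ID         string
	Repository string
}

// matchProject tries the key as an id, then as an exact repository, then as
// a URL suffix of the repository (assumed semantics).
func matchProject(allowlist []project, key string) (project, bool) {
	if key == "" {
		return project{}, false
	}
	for _, p := range allowlist {
		if p.ID == key {
			return p, true
		}
	}
	for _, p := range allowlist {
		if p.Repository == key {
			return p, true
		}
	}
	for _, p := range allowlist {
		if strings.HasSuffix(p.Repository, "/"+key) {
			return p, true
		}
	}
	return project{}, false
}

// resolveRepository returns the clonable repository and the canonical id.
// When nothing matches it falls back to the value the platform sent.
func resolveRepository(allowlist []project, projectName string) (repo, id string, matched bool) {
	if p, ok := matchProject(allowlist, projectName); ok {
		return p.Repository, p.ID, true
	}
	return projectName, projectName, false
}

func main() {
	allow := []project{{ID: "smoke-alpha", Repository: "github.com/acme/smoke-alpha"}}
	fmt.Println(resolveRepository(allow, "smoke-alpha"))
	fmt.Println(resolveRepository(allow, "unknown-slug"))
}
```

The matched flag is what a caller would use to decide whether to emit the fallback warning.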

HTTP control API

Localhost-only (binds 127.0.0.1). Endpoints:

Method + Path                    Purpose
GET  /api/daemon/status          Daemon lifecycle state, version, uptime, sessions
GET  /api/daemon/stats           Capacity envelope, worker stats, allowed projects
POST /api/daemon/pause           Stop accepting new work
POST /api/daemon/resume          Resume accepting work
POST /api/daemon/stop            Graceful stop
POST /api/daemon/drain           Drain in-flight work
POST /api/daemon/update          Trigger manual update check
POST /api/daemon/capacity        Update a config key (e.g. capacity.poolMaxDiskGb)
GET  /api/daemon/pool/stats      Workarea pool snapshot
POST /api/daemon/pool/evict      Evict pool members
GET  /api/daemon/sessions        List active session handles
POST /api/daemon/sessions        Accept a session (test entrypoint)
GET  /api/daemon/sessions/<id>   F.2.8: per-session detail for the spawned worker
GET  /api/daemon/heartbeat       Most-recent heartbeat payload
GET  /api/daemon/doctor          Aggregated health snapshot
GET  /healthz                    Liveness probe
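
A minimal client-side sketch for probing one of these endpoints follows. It assumes the default bind address and port documented in this package (127.0.0.1:7734); the helper names controlURL and probe are illustrative and are not part of afclient.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"strconv"
	"time"
)

const (
	daemonHost = "127.0.0.1" // documented default bind address
	daemonPort = 7734        // documented default port
)

// controlURL builds an absolute URL for a control API path.
func controlURL(path string) string {
	return "http://" + net.JoinHostPort(daemonHost, strconv.Itoa(daemonPort)) + path
}

// probe issues a GET against path and returns the HTTP status code.
func probe(ctx context.Context, path string) (int, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, controlURL(path), nil)
	if err != nil {
		return 0, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	code, err := probe(ctx, "/healthz")
	if err != nil {
		fmt.Println("daemon not reachable:", err)
		return
	}
	fmt.Println("healthz status:", code)
}
```

When no daemon is listening, the program reports a connection error rather than a status code, which is the same signal a worker child would see in its preflight.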

Operator runbook — debugging a stuck session

When a session appears wedged in the dashboard:

  1. Daemon log: af daemon logs --follow (default ~/.rensei/daemon.log). Look for the worker spawner lines showing pid=… and the matching [child stdout sessionID=<id>] (INFO) and [child stderr sessionID=<id>] (WARN) records from the spawned af agent run worker. Spawn output is wired to slog by default as of v0.5.1 (REN-1463); earlier daemons drained child stdio silently.
  2. Session detail: curl http://127.0.0.1:7734/api/daemon/sessions/<id> to confirm the detail is recorded. A 404 here means the daemon never accepted the work (look for poll errors in the daemon log) or the session has already terminated and been cleaned up.
  3. af agent run log: the worker child writes its own slog output to stderr. The daemon's spawner captures both streams under [child stdout|stderr sessionID=<id>]; the same lines appear inline in af daemon logs and in the platform's session-activity stream.
  4. Provider logs: when the runner reaches step 8 (spawn provider), the per-provider subprocess is the next layer (claude JSONL on stdout, codex JSON-RPC over stdio). The provider package's README explains how to capture those streams (PROVIDER_DEBUG=1 for claude, CODEX_LOG_LEVEL=debug for codex).
  5. Platform-side state: curl http://app.rensei.ai/api/sessions/<id> (with bearer auth) to confirm the platform sees the session in the expected state. A divergence between the daemon's view (still active) and the platform's view (already terminal) usually indicates a missed result.Post; re-run af daemon stats to see whether the poller has retried.
  6. Worktree state: ~/.rensei/worktrees/<sessionId>/.agent/ contains the per-session state.json snapshot and the events.jsonl audit log. Look here when the agent emitted no visible output but the session is marked failed.

Failure modes the daemon classifies (high-level)

Symptom                                        Where it surfaces
WorkerCommand falls through to /bin/sh stub    worker spawner warn line in daemon log
Daemon HTTP unreachable from worker child      af agent run preflight error, exit code 2
Session detail expired between fetch attempts  af agent run preflight error, exit code 2
Provider probe failed at runner startup        af agent run Warn log "claude provider unavailable"; falls
                                               through to stub if the session asked for stub, otherwise
                                               the runner's Resolve fails with FailureProviderResolve
Worker child exited with non-zero status       SessionEventEnded with ExitErr non-nil; daemon emits the
                                               failure to its log

See runner/README.md for the runner-level failure-mode table that the daemon receives via result.Post payloads.

Tests

# Unit + smoke
go test -race ./daemon/...

# F.2.8 wire-path integration test (requires git on PATH)
go test -tags=f28_integration ./afcli/...

Documentation

Overview

Package daemon handle_kit.go — HTTP handlers for the /api/daemon/kits* and /api/daemon/kit-sources* surfaces.

Wave-9 A2 — see ADR-2026-05-07-daemon-http-control-api.md § D1 for the canonical route list. Path-prefix dispatch follows the same pattern used by handleSessionDetail in server.go.

Package daemon handle_provider.go — HTTP handlers for the /api/daemon/providers* operator surface. Wave 9 / A1.

The handlers expose the daemon's in-process AgentRuntime registry (claude/codex/ollama/opencode/gemini/amp/stub) as JSON. The remaining seven Provider Families (Sandbox, Workarea, VCS, IssueTracker, Deployment, AgentRegistry, Kit) return empty until per-family registries land in a future wave. The endpoint MUST emit PartialCoverage=true and CoveredFamilies=["agent-runtime"] so consumers render the "other families coming" caveat without sniffing for emptiness — see ADR-2026-05-07-daemon-http-control-api.md §D4.

Package daemon handle_routing.go — HTTP handlers for the /api/daemon/routing/* operator surface. Wave 9 / A4.

The handlers expose the daemon's RoutingTraceStore as JSON. The wire shape is locked in rensei-architecture/ADR-2026-05-07-daemon-http-control-api.md §D4 and matches the surfaces the SaaS dashboard's Routing Intelligence panel (REN-205) consumes, so the same renderer composes both.

Read-only this wave. The /config endpoint surfaces the static scheduler configuration (weights, capability filters, sandbox/LLM provider state) plus the rolling tail of recent decisions; the /explain/<sessionID> endpoint returns the full per-session decision trace.

Package daemon handle_workarea.go — HTTP handlers for the /api/daemon/workareas* operator surface.

Wave 9 / Track A3 / ADR-2026-05-07-daemon-http-control-api.md §D4a.

Routes:

GET    /api/daemon/workareas                            list active + archived
GET    /api/daemon/workareas/<id>                       inspect (active or archived)
POST   /api/daemon/workareas/<archiveID>/restore        201 on success
GET    /api/daemon/workareas/<idA>/diff/<idB>           JSON or NDJSON

The streaming-NDJSON variant on /diff/ kicks in when the entry count exceeds the daemon's configured workarea.diffStreamingThreshold (default 1000 per ADR D4a). Below that, the response is a single WorkareaDiffEnvelope JSON object.

Package daemon kit_install_git.go — git-source kit fetcher (Wave 12 / Theme C / S3).

The fetcher clones the operator-provided git URL into a temp directory, locates the kit manifest (and its sibling `.sigstore` bundle, when present), and exposes both as on-disk paths so KitRegistry.Install can run the trust-gated verifier against the freshly-fetched material before persisting it into kit.scanPaths[0].

Design notes

  • Uses go-git/v5 (pure-Go) so the daemon does not depend on a `git` binary on the operator's PATH. Public-host or file:// URLs are both accepted; tests rely on file:// fixtures.
  • When KitInstallSource.ManifestPath is empty the fetcher walks the cloned tree for *.kit.toml files and selects the first one that parses cleanly. This matches the audit § 2.1 step 3 contract: "walk repo for *.kit.toml, pick the first; multi-manifest support is a Wave 13+ extension per 005-kit-manifest-spec.md".
  • Caller MUST defer the returned cleanup func; the temp tree is persisted only long enough for the registry to copy what it needs into the configured scanPath.
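
The "walk the tree, pick the first manifest that parses" rule from the design notes might look roughly like this. It is a sketch only: findManifest, nonEmpty and demo are hypothetical names, the parse callback stands in for real TOML parsing, and skipping the .git directory is an assumption rather than documented behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

var errManifestNotFound = errors.New("no usable *.kit.toml found")

// findManifest walks root in lexical order and returns the first *.kit.toml
// file that the parse callback accepts.
func findManifest(root string, parse func(path string) error) (string, error) {
	var found string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() {
			if d.Name() == ".git" {
				return filepath.SkipDir // assumption: repository metadata is not searched
			}
			return nil
		}
		if !strings.HasSuffix(d.Name(), ".kit.toml") || parse(path) != nil {
			return nil // not a manifest, or one that fails to parse
		}
		found = path
		return filepath.SkipAll
	})
	if err != nil {
		return "", err
	}
	if found == "" {
		return "", errManifestNotFound
	}
	return found, nil
}

// nonEmpty is a placeholder parser: it only rejects empty files.
func nonEmpty(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if len(data) == 0 {
		return errors.New("empty manifest")
	}
	return nil
}

// demo builds a throwaway tree with one rejected and one accepted manifest.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "kit-src-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	if err := os.WriteFile(filepath.Join(root, "a.kit.toml"), nil, 0o600); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(root, "b.kit.toml"), []byte("[kit]\n"), 0o600); err != nil {
		return "", err
	}
	path, err := findManifest(root, nonEmpty)
	if err != nil {
		return "", err
	}
	return filepath.Base(path), nil
}

func main() {
	fmt.Println(demo())
}
```

Because filepath.WalkDir visits entries in lexical order, "first" is deterministic for a given tree; in the demo the empty a.kit.toml is skipped and b.kit.toml is selected.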

Errors

  • ErrKitInstallSourceFetchFailed — clone failed (network, auth, ref not found, etc.). Wrapped with the underlying go-git error.
  • ErrKitInstallManifestNotFound — clone succeeded but no usable `*.kit.toml` exists at the configured ManifestPath (or anywhere in the tree when ManifestPath was empty).

Package daemon kit_registry.go — minimal in-process Kit registry that scans the filesystem for installed kit manifests and exposes them via the operator control API.

This is the OSS-execution-layer's "Local manifests" registry source from the federation list in 005-kit-manifest-spec.md § "Registry sources" (item 1). Other registry sources (bundled, rensei, tessl, agentskills, community) are not implemented in this wave; the /api/daemon/kit-sources endpoint returns a static descriptor list surfacing the federation order.

Scan path defaults to ~/.rensei/kits/*.kit.toml. Multiple paths may be declared via daemon.yaml's optional `kit.scanPaths` override.

Behaviour:

  • Empty registry (no scan path entries, no .kit.toml files) → empty list, HTTP 200.
  • Malformed manifests log a warning via slog and are excluded from the listing rather than failing the whole request.
  • Enable/disable state is persisted to a sidecar file at ~/.rensei/kits/.state.json so toggle outcomes survive daemon restarts. The file is created on first toggle.
  • Install is currently a stub returning ErrKitInstallUnimplemented; fetching kits from a remote registry is deferred until the federation sources land.
  • Verify-signature returns KitTrustUnsigned for all kits in this wave (signing is partially implemented per the ADR caveat).

Package daemon kit_trust.go — sigstore bundle-mode kit signature verifier (Wave 12 / Theme C / S2).

The verifier consumes a sibling `<manifest>.sigstore` file (Q1 of WAVE12_PLAN — "bundle file shape: sibling .sigstore"), validates it against the configured trust root, and reports back a populated afclient.KitSignatureResult. Three trust outcomes:

  • KitTrustSignedVerified — bundle present and verifies against the trust root + issuer set.
  • KitTrustSignedUnverified — bundle present but verification failed (tampered manifest, untrusted issuer, expired chain, etc.).
  • KitTrustUnsigned — no sibling .sigstore file exists.

At install time the verifier outcome maps to a trust gate. The gate runs in the registry's Install path, NOT here — see the ErrKitTrustGateRejected sentinel in kit_registry.go and the trustOverride: "allowed-this-once" handling per audit § 1.3 / § 2.2.

Trust modes (§ "Signing and trust" in 002-provider-base-contract.md):

  • permissive — verifier still runs and reports state, but never blocks Install. OSS default per Q2 of WAVE12_PLAN.
  • signed-by-allowlist — Install rejects KitTrustUnsigned and KitTrustSignedUnverified.
  • attested — same as allowlist for Wave 12 (the SLSA attestation graph hookup lands in Wave 13+).
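
The mapping from trust mode and verifier outcome to an install decision can be summarised in a few lines. This is a sketch under assumptions: the type names, the outcome string values, and the fail-closed handling of unknown modes are illustrative, and the real gate lives in the registry's Install path.

```go
package main

import "fmt"

type trustMode string
type trustOutcome string

const (
	modePermissive trustMode = "permissive"
	modeAllowlist  trustMode = "signed-by-allowlist"
	modeAttested   trustMode = "attested"

	// Outcome values are placeholders for the three KitTrust* states.
	outcomeVerified   trustOutcome = "signed-verified"
	outcomeUnverified trustOutcome = "signed-unverified"
	outcomeUnsigned   trustOutcome = "unsigned"
)

// allowInstall reports whether the gate lets an install proceed.
// overrideOnce models the trustOverride: "allowed-this-once" request field.
func allowInstall(mode trustMode, outcome trustOutcome, overrideOnce bool) bool {
	switch mode {
	case modePermissive:
		return true // verifier result is reported but never blocks
	case modeAllowlist, modeAttested:
		return outcome == outcomeVerified || overrideOnce
	default:
		return false // assumption: unknown modes fail closed
	}
}

func main() {
	fmt.Println(allowInstall(modePermissive, outcomeUnsigned, false))
	fmt.Println(allowInstall(modeAllowlist, outcomeUnsigned, false))
	fmt.Println(allowInstall(modeAllowlist, outcomeUnsigned, true))
	fmt.Println(allowInstall(modeAttested, outcomeVerified, false))
}
```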

The embedded trust root is the public Sigstore production trust root (https://raw.githubusercontent.com/sigstore/sigstore-go/main/examples/trusted-root-public-good.json). It will be replaced with a Rensei-published trust root once the productionized signing CI from REN-1344 emits a Rensei-signed Fulcio + Rekor cert chain (Wave 13+ work).

Q-audit-2 resolution (taken 2026-05-07 by /loop coordinator): trust-actor lookup falls back to os.Getuid() when daemon.yaml's `trust.actor` is empty. The trustOverride audit log is best-effort identification; the override is still timestamped and key fields (kitId, signerId) are always populated.

Package daemon routing_state.go — in-process routing trace store and configuration projector for the /api/daemon/routing/* surface (Wave 9 / A4).

The OSS daemon does not yet ship a real cross-provider scheduler in production. The store therefore defines the shape the eventual scheduler will record decisions through, and the read paths used by the HTTP handlers in handle_routing.go.

See ADR-2026-05-07-daemon-http-control-api.md §D4 for the wire contract, 004-sandbox-capability-matrix.md for the cross-provider scheduler model, and the forward reference at /api/daemon/routing/explain/<sessionID> in the same doc.

Package daemon implements the long-running rensei-daemon runtime in Go.

The daemon is a single-machine, multi-project supervisor that:

  • Registers itself with the orchestrator (dial-out) and exchanges a one-time rsp_live_* token for a scoped JWT.
  • Sends a periodic heartbeat to the orchestrator.
  • Accepts inbound work specs (sessions) and spawns worker child processes.
  • Exposes an HTTP control API on 127.0.0.1:7734 for the af / rensei CLI.
  • Optionally self-updates by drain → fetch → verify → swap → restart.

Architecture reference:

rensei-architecture/004-sandbox-capability-matrix.md §Local daemon mode
rensei-architecture/011-local-daemon-fleet.md

This is the public package surface — downstream binaries can import it directly to embed the daemon runtime under their own command tree. The afcli package re-exports the runtime as the `daemon run` subcommand.

This package is the Go port of agentfactory/packages/daemon/src (REN-1408). The TS package @renseiai/daemon is deprecated; final removal is scheduled for cycle 6 after the smoke harness has soaked for 7 nights.

Package daemon workarea_archive.go — on-disk workarea archive registry powering the Layer-3 workarea operator surface.

Wave 9 / Track A3 / ADR-2026-05-07-daemon-http-control-api.md §D4a.

Archive layout. Each archive is a directory under the daemon's archive root (default ~/.rensei/workareas/<archiveID>/) containing:

manifest.json   — metadata sidecar (id, sessionId, createdAt,
                  sizeBytes, sourceProvider, capabilities,
                  disposition); free-form extra fields permitted.
tree/           — the workarea filesystem snapshot. Diffs and
                  restores walk this subtree only; everything outside
                  it (manifest.json, daemon-private bookkeeping) is
                  ignored. The well-known .rensei/ directory under
                  tree/ is also excluded from diff walks per ADR D4a.

The registry is stateless w.r.t. process lifecycle — every call hits disk. That's fine: archive directories are small in count (operator scale), the OS dentry cache absorbs repeated listings, and avoiding in-memory state means the daemon never serves a stale view after an out-of-band write to ~/.rensei/workareas/.

Constants

const CapacityRefreshInterval = 60 * time.Second

CapacityRefreshInterval is how often the daemon re-emits its capacity snapshot. Mirrors the TS CAPACITY_REFRESH_INTERVAL_MS = 60_000.

const DefaultHTTPHost = "127.0.0.1"

DefaultHTTPHost is the bind address for the control HTTP server.

const DefaultHTTPPort = 7734

DefaultHTTPPort is the port the daemon's control HTTP server binds to. Keep in sync with afclient.DefaultDaemonConfig (port 7734).

const DefaultRoutingRingBufferSize = 50

DefaultRoutingRingBufferSize is the maximum number of recent routing decisions retained for the GetConfig view. The explain endpoint key is per-session and bounded by the same ring — a session whose decision has fallen out of the ring returns 404.
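
To make the eviction behaviour concrete, here is a minimal fixed-size ring with a per-session lookup. The names decision, decisionRing, Record and Explain are hypothetical; the real RoutingTraceStore may index decisions differently, and this sketch assumes a size greater than zero.

```go
package main

import "fmt"

// decision is a hypothetical stand-in for a recorded routing decision.
type decision struct {
	SessionID string
	Provider  string
}

// decisionRing keeps the most recent len(buf) decisions.
type decisionRing struct {
	buf   []decision
	next  int // slot that the next Record call overwrites
	count int // number of valid entries, at most len(buf)
}

// newDecisionRing allocates a ring; size must be greater than zero.
func newDecisionRing(size int) *decisionRing {
	return &decisionRing{buf: make([]decision, size)}
}

// Record stores d, overwriting the oldest entry once the ring is full.
func (r *decisionRing) Record(d decision) {
	r.buf[r.next] = d
	r.next = (r.next + 1) % len(r.buf)
	if r.count < len(r.buf) {
		r.count++
	}
}

// Explain returns the newest retained decision for sessionID, if any.
func (r *decisionRing) Explain(sessionID string) (decision, bool) {
	for i := 1; i <= r.count; i++ {
		idx := (r.next - i + len(r.buf)) % len(r.buf)
		if r.buf[idx].SessionID == sessionID {
			return r.buf[idx], true
		}
	}
	return decision{}, false
}

func main() {
	ring := newDecisionRing(50)
	for i := 0; i <= 50; i++ { // 51 decisions, one more than the ring holds
		ring.Record(decision{SessionID: fmt.Sprintf("session-%d", i), Provider: "stub"})
	}
	_, oldest := ring.Explain("session-0")
	_, newest := ring.Explain("session-50")
	fmt.Println(oldest, newest)
}
```

A lookup miss here is what the HTTP layer would translate into the 404 mentioned above.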

const ExitCodeRestart = 3

ExitCodeRestart is the exit code the daemon uses to signal the supervisor "restart requested" after a successful binary swap. The launchd plist / systemd unit treats code 3 as a clean restart, not a crash.

const HeartbeatDefaultInterval = 30 * time.Second

HeartbeatDefaultInterval is the fallback heartbeat cadence when the orchestrator does not return one in RegisterResponse. The TS path uses 30s as the fallback; we keep that here, but `15s` is the canonical SLO target.

const RegisterEndpoint = "/api/workers/register"

RegisterEndpoint is the relative path on the platform.

const RuntimeTokenRefreshEndpoint = "/api/workers/refresh-token"

RuntimeTokenRefreshEndpoint is the (probed) platform endpoint the daemon hits to refresh an expired runtime JWT WITHOUT re-registering. The platform owes a handler at this path that:

  • accepts the registration token in the Authorization: Bearer header
  • takes the existing workerId in the URL path
  • mints a fresh runtime JWT bound to the SAME workerId
  • returns { runtimeToken, runtimeTokenExpiresAt, heartbeatInterval, pollInterval }

As of 2026-05-03 this endpoint does NOT exist on the platform side; see REN-1481 platform-companion. Until it ships the daemon probes this URL, observes a 404, and falls back to full re-register (which mints a new workerId, the bug REN-1481 originally documented). When the platform side ships the endpoint the daemon picks it up automatically with no further changes.

const UpdateCDNBase = "https://updates.rensei.dev"

UpdateCDNBase is the base URL for the rensei CDN that hosts release manifests and binaries.

Variables

var (
	// ErrArchiveNotFound — the named archive id is not present on disk.
	ErrArchiveNotFound = errors.New("workarea archive not found")
	// ErrArchiveCorrupted — the archive exists but its manifest is
	// missing/malformed, or the tree directory cannot be walked.
	ErrArchiveCorrupted = errors.New("workarea archive corrupted")
	// ErrArchiveExists — restore would collide with an existing archive
	// entry on disk (never reached today; archives are immutable, but
	// the check is here for a future "archive on restore" code path).
	ErrArchiveExists = errors.New("workarea archive already exists")
)

WorkareaArchiveErrCode is the sentinel set used by the registry for programmatic error discrimination at the HTTP layer. Wrapped with %w so handlers can errors.Is() against them.

var DefaultRoutingWeights = afclient.RoutingWeights{Cost: 0.7, Latency: 0.3}

DefaultRoutingWeights are the cost/latency scoring weights described in 004-sandbox-capability-matrix.md §"Open questions" — 70/30 cost/latency is the documented default. The store returns these on every GetConfig call until a tenant config layer overrides them in a future wave.

var ErrKitInstallManifestNotFound = errors.New("kit install: manifest not found in fetched source")

ErrKitInstallManifestNotFound is returned when the source fetch succeeds but no *.kit.toml is locatable inside the fetched tree (or at the operator-provided KitInstallSource.ManifestPath). Maps to HTTP 422.

var ErrKitInstallSourceFetchFailed = errors.New("kit install: source fetch failed")

ErrKitInstallSourceFetchFailed is returned when the configured source fetcher fails (e.g., git clone error, network failure, unreachable remote, missing ref). Maps to HTTP 502.

var ErrKitInstallUnimplemented = errors.New("kit install: remote registry fetch not implemented in this wave")

ErrKitInstallUnimplemented is returned by KitRegistry.Install for the Wave-9 backward-compat path: a request body with no `source` block (the shape the Wave-9 smoke + handler tests POST). Wave 12 / Phase 4 keeps this sentinel reserved for that empty-body case so existing 501 assertions stay green; new federation-source kinds (tessl, agentskills, community) return ErrKitSourceFederationUnimplemented instead.

var ErrKitNotFound = errors.New("kit not found")

ErrKitNotFound is returned when a kit id is not present in the registry.

var ErrKitSourceFederationUnimplemented = errors.New("kit install: federation source kind not yet implemented")

ErrKitSourceFederationUnimplemented is returned when KitInstallRequest names a federation source kind (`tessl` / `agentskills` / `community`) that the daemon does not yet know how to fetch from. Maps to HTTP 501 — the descriptor list returned by /api/daemon/kit-sources continues to surface those kinds so operators can see the federation order.

Federation cross-repo wave is REN-1308 follow-up.

var ErrKitSourceNotFound = errors.New("kit source not found")

ErrKitSourceNotFound is returned when a kit-source name is not known.

var ErrKitTrustGateRejected = errors.New("kit install: trust gate rejected (signed-by-allowlist requires verified signature)")

ErrKitTrustGateRejected is returned by KitRegistry.Install when the configured trust mode (signed-by-allowlist or attested) refuses an unsigned or signed-but-unverified kit. Maps to HTTP 403. The trustOverride: "allowed-this-once" install field bypasses this gate for a single request (audit-logged); see kit_trust.go.

var Version = "dev"

Version is the daemon binary version reported in DaemonStatus and in the registration payload.

Now a `var` (was `const`) so the binary's main can override it via `-ldflags "-X github.com/RenseiAI/agentfactory-tui/daemon.Version=$VERSION"` at build time, OR a downstream embedder (e.g. rensei-tui's daemon run command) can pass its own version via `Options.Version` at daemon construction. The const form pinned the value to whatever agentfactory-tui's source had at vendor time, which left the `rensei-daemon-run` HTTP /api/daemon/status endpoint reporting an outdated string forever — confusing operators who saw e.g. `0.7.1` even after upgrading both binaries past it.

Default is `"dev"` so an unreleased build (or a vendored copy that forgot to inject) is obvious in status output.

Functions

func DefaultConfigPath

func DefaultConfigPath() string

DefaultConfigPath returns the canonical path to ~/.rensei/daemon.yaml.

func DefaultJWTPath

func DefaultJWTPath() string

DefaultJWTPath returns the canonical path to the cached JWT.

func DefaultKitScanPath added in v0.7.0

func DefaultKitScanPath() string

DefaultKitScanPath returns the canonical scan path for installed kits (~/.rensei/kits). Used when daemon.yaml does not declare kit.scanPaths.

func DeriveDefaultMachineID

func DeriveDefaultMachineID() string

DeriveDefaultMachineID returns a hostname-derived identifier suitable for machine.id when the user has not set one.

func IsNewerVersion

func IsNewerVersion(candidate, current string) bool

IsNewerVersion returns true if candidate is strictly newer than current according to semver-prefix comparison. Falls back to lexicographic compare if either string is not a parseable semver prefix.
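
One plausible reading of "semver-prefix comparison with a lexicographic fallback" is sketched below. The helpers parsePrefix and isNewer are hypothetical, and details such as accepting a leading "v" and ignoring any "-prerelease" or "+build" suffix are assumptions; the package's actual parsing rules may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePrefix extracts MAJOR.MINOR.PATCH from the front of v, ignoring a
// leading "v" and anything after the first "-" or "+".
func parsePrefix(v string) ([3]int, bool) {
	var out [3]int
	v = strings.TrimPrefix(v, "v")
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, false
		}
		out[i] = n
	}
	return out, true
}

// isNewer reports whether candidate is strictly newer than current.
func isNewer(candidate, current string) bool {
	c, okC := parsePrefix(candidate)
	cur, okCur := parsePrefix(current)
	if !okC || !okCur {
		return candidate > current // lexicographic fallback
	}
	for i := range c {
		if c[i] != cur[i] {
			return c[i] > cur[i]
		}
	}
	return false // equal versions are not "newer"
}

func main() {
	fmt.Println(isNewer("0.7.10", "0.7.8"))
	fmt.Println(isNewer("0.7.8", "0.7.8"))
	fmt.Println(isNewer("v1.0.0", "0.9.9"))
}
```

The numeric comparison is what makes 0.7.10 rank above 0.7.8, which a plain string comparison would get wrong.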

func ResolvePlatformSuffix

func ResolvePlatformSuffix() string

ResolvePlatformSuffix returns "<arch>-<os>" suitable for the CDN binary filename, e.g. "arm64-darwin", "amd64-linux".
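
Assuming the suffix is derived from the Go runtime's own architecture and OS identifiers, which matches the examples given, the computation reduces to a one-liner. The helper name platformSuffix is illustrative.

```go
package main

import (
	"fmt"
	"runtime"
)

// platformSuffix joins architecture and OS in the documented "<arch>-<os>" order.
func platformSuffix(goarch, goos string) string {
	return goarch + "-" + goos
}

func main() {
	fmt.Println(platformSuffix(runtime.GOARCH, runtime.GOOS))
}
```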

func SaveCachedJWT

func SaveCachedJWT(jwtPath string, resp *RegisterResponse, now time.Time) error

SaveCachedJWT atomically writes the response to jwtPath with 0o600 perms.

func ShouldSkipWizard

func ShouldSkipWizard() bool

ShouldSkipWizard returns true when the wizard should be bypassed:

  • stdin is not a TTY, OR
  • RENSEI_DAEMON_SKIP_WIZARD is set.
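
The two conditions can be sketched as follows. The character-device check is a common TTY heuristic and may not be what the package uses; treating the variable as "set" whenever it is present, even if empty, is also an assumption. The names stdinIsTTY and shouldSkip are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// stdinIsTTY reports whether stdin looks like a terminal (character device).
func stdinIsTTY() bool {
	info, err := os.Stdin.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

// shouldSkip returns true when stdin is not a TTY or the skip variable is set.
// lookupEnv is injected so the decision can be exercised without touching the
// real environment.
func shouldSkip(isTTY bool, lookupEnv func(string) (string, bool)) bool {
	_, set := lookupEnv("RENSEI_DAEMON_SKIP_WIZARD")
	return !isTTY || set
}

func main() {
	fmt.Println(shouldSkip(stdinIsTTY(), os.LookupEnv))
}
```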

func WipeCachedJWT added in v0.7.2

func WipeCachedJWT(jwtPath string) (bool, error)

WipeCachedJWT removes the cached JWT file at jwtPath. Returns wiped=true when the file existed and was removed, wiped=false when there was no cache to remove (idempotent — safe to call from uninstall paths on systems that never had the daemon installed).

Why this exists: Register() short-circuits with the cached JWT whenever the file is present, even when the workerId in it has been invalidated by the orchestrator (worker row deleted, registration token rotated, org migrated, manual cleanup, …). Without an explicit wipe, the daemon polls the dead workerId every poll interval forever — the token-refresh fallback re-mints credentials for the same dead id rather than triggering a true re-registration. Install / uninstall paths should call this so a fresh registration handshake happens on the next daemon boot.
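
The idempotent contract (wiped=true only when a file was actually removed) can be sketched with os.Remove and errors.Is. The helper names wipeFile and demo are illustrative, not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// wipeFile removes path. A missing file is not an error; it yields wiped=false.
func wipeFile(path string) (wiped bool, err error) {
	err = os.Remove(path)
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, fs.ErrNotExist):
		return false, nil
	default:
		return false, err
	}
}

// demo wipes the same throwaway file twice to show the idempotent behaviour.
func demo() (first, second bool, err error) {
	dir, err := os.MkdirTemp("", "jwt-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "daemon.jwt")
	if err = os.WriteFile(path, []byte("{}"), 0o600); err != nil {
		return false, false, err
	}
	if first, err = wipeFile(path); err != nil {
		return first, false, err
	}
	second, err = wipeFile(path)
	return first, second, err
}

func main() {
	fmt.Println(demo())
}
```

The first call reports true and the second reports false, both without error, which is the behaviour an uninstall path can rely on.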

func WriteConfig

func WriteConfig(path string, cfg *Config) error

WriteConfig atomically writes cfg to path (tmp file + rename), creating parent directories as needed.
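
The tmp-file-plus-rename pattern named here, which SaveCachedJWT presumably shares given its atomic 0o600 write, can be sketched as below. The helper names and the 0o700 directory mode are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temp file in the target directory and then
// renames it over path, so readers never observe a partially written file.
func writeFileAtomic(path string, data []byte, perm os.FileMode) error {
	dir := filepath.Dir(path)
	if err := os.MkdirAll(dir, 0o700); err != nil { // directory mode is an assumption
		return err
	}
	tmp, err := os.CreateTemp(dir, filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer os.Remove(tmpName) // cleans up on failure; harmless after a successful rename
	if err := writeAndClose(tmp, data, perm); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// writeAndClose writes data, applies perm, flushes to disk, and closes f.
func writeAndClose(f *os.File, data []byte, perm os.FileMode) error {
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	if err := f.Chmod(perm); err != nil {
		f.Close()
		return err
	}
	if err := f.Sync(); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// demo writes into a not-yet-existing subdirectory and reads the file back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "atomic-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "nested", "daemon.yaml")
	if err := writeFileAtomic(path, []byte("example: true\n"), 0o600); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	fmt.Print(demo())
}
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what makes the final step atomic on POSIX systems.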

Types

type ActiveWorkareaProvider added in v0.7.0

type ActiveWorkareaProvider interface {
	ActiveWorkareas() []afclient.WorkareaSummary
}

ActiveWorkareaProvider exposes the daemon's live pool members in the canonical wire shape so List can union them with on-disk archives. Implementations MUST return a stable order. Empty list (zero pool members) is a perfectly valid, non-error response.

type AutoUpdateConfig

type AutoUpdateConfig struct {
	Channel             UpdateChannel  `yaml:"channel"             json:"channel"`
	Schedule            UpdateSchedule `yaml:"schedule"            json:"schedule"`
	DrainTimeoutSeconds int            `yaml:"drainTimeoutSeconds" json:"drainTimeoutSeconds"`
}

AutoUpdateConfig is the auto-update preferences block.

type BinaryVerifier

type BinaryVerifier interface {
	Verify(ctx context.Context, contentHash, signatureValue string) (valid bool, reason string)
}

BinaryVerifier is a narrow signature-verification interface. The default production verifier rejects all signatures (until REN-1314 ships a Go sigstore adapter). Tests can inject a passing verifier.

type CachedJWT

type CachedJWT struct {
	WorkerID              string `json:"workerId"`
	RuntimeToken          string `json:"runtimeToken"`
	HeartbeatInterval     int    `json:"heartbeatInterval"` // ms
	PollInterval          int    `json:"pollInterval"`      // ms
	RuntimeTokenExpiresAt string `json:"runtimeTokenExpiresAt,omitempty"`
	CachedAt              string `json:"cachedAt"`

	// Legacy fields retained so old cache files written before REN-1422
	// still load successfully. Newer writes only populate the canonical
	// platform-named fields above.
	LegacyRuntimeJWT               string `json:"runtimeJwt,omitempty"`
	LegacyHeartbeatIntervalSeconds int    `json:"heartbeatIntervalSeconds,omitempty"`
	LegacyPollIntervalSeconds      int    `json:"pollIntervalSeconds,omitempty"`
}

CachedJWT is the on-disk cache entry. We persist this between daemon runs so re-registration is skipped while the runtime token is fresh.
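
How the legacy fields might be folded into the canonical ones at load time is sketched below. This is an assumption about loader behaviour, not a description of LoadCachedJWT; the type is a trimmed copy of the struct above and the function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cachedJWT is a trimmed copy of the documented struct for illustration.
type cachedJWT struct {
	WorkerID          string `json:"workerId"`
	RuntimeToken      string `json:"runtimeToken"`
	HeartbeatInterval int    `json:"heartbeatInterval"` // ms
	PollInterval      int    `json:"pollInterval"`      // ms

	LegacyRuntimeJWT               string `json:"runtimeJwt,omitempty"`
	LegacyHeartbeatIntervalSeconds int    `json:"heartbeatIntervalSeconds,omitempty"`
	LegacyPollIntervalSeconds      int    `json:"pollIntervalSeconds,omitempty"`
}

// normalize copies legacy values into the canonical fields when the canonical
// ones are empty, converting seconds to milliseconds.
func normalize(c *cachedJWT) {
	if c.RuntimeToken == "" {
		c.RuntimeToken = c.LegacyRuntimeJWT
	}
	if c.HeartbeatInterval == 0 {
		c.HeartbeatInterval = c.LegacyHeartbeatIntervalSeconds * 1000
	}
	if c.PollInterval == 0 {
		c.PollInterval = c.LegacyPollIntervalSeconds * 1000
	}
}

// parseCache decodes a cache payload and applies the legacy fallback.
func parseCache(raw []byte) (*cachedJWT, error) {
	var c cachedJWT
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, err
	}
	normalize(&c)
	return &c, nil
}

func main() {
	old := []byte(`{"workerId":"w1","runtimeJwt":"tok","heartbeatIntervalSeconds":30,"pollIntervalSeconds":5}`)
	c, err := parseCache(old)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(c.RuntimeToken, c.HeartbeatInterval, c.PollInterval)
}
```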

func LoadCachedJWT

func LoadCachedJWT(jwtPath string) (*CachedJWT, error)

LoadCachedJWT reads ~/.rensei/daemon.jwt. Returns (nil, nil) when the file does not exist or cannot be parsed.

type CapacityConfig

type CapacityConfig struct {
	MaxConcurrentSessions int                `yaml:"maxConcurrentSessions"     json:"maxConcurrentSessions"`
	MaxVCpuPerSession     int                `yaml:"maxVCpuPerSession"         json:"maxVCpuPerSession"`
	MaxMemoryMbPerSession int                `yaml:"maxMemoryMbPerSession"     json:"maxMemoryMbPerSession"`
	ReservedForSystem     ReservedSystemSpec `yaml:"reservedForSystem"         json:"reservedForSystem"`
	// PoolMaxDiskGb is the LRU-eviction trigger for the workarea pool.
	// 0 means no limit. (REN-1334.)
	PoolMaxDiskGb int `yaml:"poolMaxDiskGb,omitempty" json:"poolMaxDiskGb,omitempty"`
}

CapacityConfig is the resource envelope declared in daemon.yaml.

type CloneStrategy

type CloneStrategy string

CloneStrategy controls how the daemon clones a project repo for new workarea pool members.

const (
	CloneShallow   CloneStrategy = "shallow"
	CloneFull      CloneStrategy = "full"
	CloneReference CloneStrategy = "reference-clone"
)

Clone strategy constants.

type Config

type Config struct {
	APIVersion    string               `yaml:"apiVersion"             json:"apiVersion"`
	Kind          string               `yaml:"kind"                   json:"kind"`
	Machine       MachineConfig        `yaml:"machine"                json:"machine"`
	Capacity      CapacityConfig       `yaml:"capacity"               json:"capacity"`
	Projects      []ProjectConfig      `yaml:"projects,omitempty"     json:"projects,omitempty"`
	Orchestrator  OrchestratorConfig   `yaml:"orchestrator"           json:"orchestrator"`
	AutoUpdate    AutoUpdateConfig     `yaml:"autoUpdate"             json:"autoUpdate"`
	Observability *ObservabilityConfig `yaml:"observability,omitempty" json:"observability,omitempty"`
	// Workarea holds Layer-3 workarea-surface tunables (archive root,
	// diff streaming threshold). Optional; populated with defaults if
	// absent.
	Workarea WorkareaConfig `yaml:"workarea,omitempty"     json:"workarea,omitempty"`
	// Kit holds Layer-4 kit-surface tunables (scan paths). Optional;
	// applyDefaults seeds ScanPaths to [DefaultKitScanPath()] when
	// absent. Per ADR-2026-05-07 § D4.
	Kit KitConfig `yaml:"kit,omitempty"          json:"kit,omitempty"`
	// Trust holds the daemon-wide signature-verification policy
	// (sigstore bundle-mode verifier mode + issuer allowlist + audit
	// actor). Optional; applyDefaults seeds Mode to
	// TrustModePermissive when absent. Per WAVE12_PLAN Q2 and
	// 002-provider-base-contract.md § "Signing and trust". Lives on
	// Config (not on KitConfig) because the trust mode applies across
	// all plugin families per 015-plugin-spec.md § "Auth + trust".
	Trust TrustConfig `yaml:"trust,omitempty"        json:"trust,omitempty"`
}

Config is the in-memory representation of ~/.rensei/daemon.yaml. The wire schema mirrors the TS DaemonConfig (rensei-architecture/004 §Configuration shape).

func BuildDefaultConfigFromExisting

func BuildDefaultConfigFromExisting(existing *Config, configPath string) (*Config, error)

BuildDefaultConfigFromExisting returns a default Config (or the existing one) and optionally persists it to configPath.

func DefaultConfig

func DefaultConfig() *Config

DefaultConfig returns a minimal Config suitable as a starting point when the wizard is skipped. Capacity defaults are derived from runtime info.

func LoadConfig

func LoadConfig(path string) (*Config, error)

LoadConfig reads daemon.yaml from path. Returns (nil, nil) when the file does not exist (so callers can branch into the setup wizard / default).
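The (nil, nil) return lets a caller tell "no file yet" apart from a real read error. The sketch below illustrates that branching with a stand-in loader, not the package's own LoadConfig: `rawConfig`, `loadRaw`, and `resolve` are hypothetical names, and YAML parsing is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// rawConfig is a hypothetical stand-in for *Config; YAML parsing is omitted.
type rawConfig struct{ Data []byte }

// loadRaw follows the documented contract: (nil, nil) when the file does not
// exist, a non-nil error for any other read failure.
func loadRaw(path string) (*rawConfig, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, fmt.Errorf("read %s: %w", path, err)
	}
	return &rawConfig{Data: data}, nil
}

// resolve reports which branch a caller would take for the given path.
func resolve(path string) (string, error) {
	cfg, err := loadRaw(path)
	switch {
	case err != nil:
		return "", err
	case cfg == nil:
		return "default", nil // no file yet: run the wizard or use defaults
	default:
		return "loaded", nil
	}
}

func main() {
	branch, err := resolve("/nonexistent/daemon.yaml")
	fmt.Println(branch, err)
}
```

Checking the error first and the nil config second keeps a permission failure from being mistaken for a first run.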

func RunSetupWizard

func RunSetupWizard(opts WizardOptions) (*Config, error)

RunSetupWizard runs the interactive first-run wizard (or returns the non-interactive default when stdin is not a TTY).

type Daemon

type Daemon struct {
	// contains filtered or unexported fields
}

Daemon is the top-level supervisor. It owns the loaded Config, the HeartbeatService, the WorkerSpawner, and (optionally) the AutoUpdater.

func New

func New(opts Options) *Daemon

New constructs a Daemon. Call Start() to bring it online.

func (*Daemon) AcceptWork

func (d *Daemon) AcceptWork(spec SessionSpec) (*SessionHandle, error)

AcceptWork dispatches a session spec to the spawner.

func (*Daemon) AcceptWorkWithDetail added in v0.5.0

func (d *Daemon) AcceptWorkWithDetail(spec SessionSpec, detail *SessionDetail) (*SessionHandle, error)

AcceptWorkWithDetail dispatches a session spec to the spawner and records the per-session detail used by the spawned `af agent run` process. Pass nil detail when the caller does not have one (legacy tests); the spawner falls through to env-only inputs.

Detail is stored before spawning and removed when the spawner emits the corresponding SessionEventEnded event so stale credentials do not linger in memory.
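The store-then-remove lifecycle described above can be pictured as a small mutex-guarded map. This is only a sketch under assumptions: `detail`, `detailStore`, and its methods are hypothetical stand-ins, and the real Daemon keeps this state in unexported fields that are not shown in this documentation.

```go
package main

import (
	"fmt"
	"sync"
)

// detail is a hypothetical stand-in for SessionDetail.
type detail struct{ AuthToken string }

// detailStore keeps per-session detail only while the session is live.
type detailStore struct {
	mu sync.Mutex
	m  map[string]*detail
}

func newDetailStore() *detailStore {
	return &detailStore{m: make(map[string]*detail)}
}

// put records detail before the worker is spawned; a nil detail is skipped,
// which models the env-only fallback.
func (s *detailStore) put(id string, d *detail) {
	if d == nil {
		return
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[id] = d
}

// get returns the stored detail, or (nil, false) when none is recorded.
func (s *detailStore) get(id string) (*detail, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	d, ok := s.m[id]
	return d, ok
}

// onEnded drops the detail when the session-ended event arrives, so the
// credentials do not outlive the session.
func (s *detailStore) onEnded(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.m, id)
}

func main() {
	s := newDetailStore()
	s.put("sess-1", &detail{AuthToken: "example-token"})
	_, live := s.get("sess-1")
	s.onEnded("sess-1")
	_, after := s.get("sess-1")
	fmt.Println(live, after)
}
```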

func (*Daemon) ActiveSessions

func (d *Daemon) ActiveSessions() []SessionHandle

ActiveSessions returns a snapshot of in-flight session handles.

func (*Daemon) Config

func (d *Daemon) Config() *Config

Config returns a copy of the loaded config (or nil if not started).

func (*Daemon) Done

func (d *Daemon) Done() <-chan struct{}

Done returns a channel that is closed when the daemon has fully stopped.

func (*Daemon) EffectiveVersion added in v0.7.8

func (d *Daemon) EffectiveVersion() string

EffectiveVersion returns the version string the daemon should report in HTTP status / heartbeat / registration payloads. Resolution order: `Options.Version` (downstream embedder override) → package `Version` (which itself is "dev" unless overridden via `-ldflags -X .../daemon.Version=…`). An empty `Options.Version` falls through to the package `Version`.
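The resolution order amounts to a single fallback. A minimal sketch, assuming a local `pkgVersion` variable in place of the package-level `Version` and a hypothetical `effectiveVersion` helper:

```go
package main

import "fmt"

// pkgVersion stands in for the package-level Version variable, which the
// documentation says is "dev" unless overridden at link time.
var pkgVersion = "dev"

// effectiveVersion mirrors the documented order: the embedder override wins,
// otherwise the package variable is used.
func effectiveVersion(optVersion string) string {
	if optVersion != "" {
		return optVersion
	}
	return pkgVersion
}

func main() {
	fmt.Println(effectiveVersion(""))
	fmt.Println(effectiveVersion("1.2.3"))
}
```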

func (*Daemon) Pause

func (d *Daemon) Pause()

Pause stops accepting new work without draining.

func (*Daemon) Resume

func (d *Daemon) Resume()

Resume re-enables accepting work.

func (*Daemon) RoutingTraces added in v0.7.0

func (d *Daemon) RoutingTraces() *RoutingTraceStore

RoutingTraces returns the daemon's in-process routing trace store. The eventual cross-provider scheduler records its decisions here via store.RecordDecision; the /api/daemon/routing/* HTTP surface reads from it. Exposed so test harnesses (and a future scheduler wire-up) can drive recordings without reaching through internal fields. (Wave 9 / A4.)

func (*Daemon) SessionDetail added in v0.5.0

func (d *Daemon) SessionDetail(sessionID string) (*SessionDetail, bool)

SessionDetail returns the stored per-session detail for the given session id, or (nil, false) if no detail is recorded. Used by the HTTP server's /api/daemon/sessions/<id> handler.

func (*Daemon) SetWorkareaArchiveRegistry added in v0.7.0

func (d *Daemon) SetWorkareaArchiveRegistry(reg *WorkareaArchiveRegistry)

SetWorkareaArchiveRegistry replaces the daemon's archive registry with the provided one. Used by tests and by the future pool wire-up (REN-1280) to inject an ActiveWorkareaProvider that sees the live pool.

func (*Daemon) Start

func (d *Daemon) Start(ctx context.Context) error

Start brings the daemon online: load config (or wizard), register, start heartbeat, and start the spawner. The HTTP server is NOT started here; callers do that explicitly via Server.Start so they can pick the bind.

func (*Daemon) StartedAt

func (d *Daemon) StartedAt() time.Time

StartedAt returns the daemon's UTC start time (zero before Start()).

func (*Daemon) State

func (d *Daemon) State() State

State returns the current lifecycle state.

func (*Daemon) Stop

func (d *Daemon) Stop(_ context.Context) error

Stop performs a graceful shutdown: drain in-flight sessions, stop loops, and transition to stopped. The context is currently unused but is retained for future use (e.g. cancelling drain via ctx.Done).

func (*Daemon) Update

func (d *Daemon) Update(ctx context.Context) (*UpdateResult, error)

Update triggers a manual auto-update check.

Behavior: drain → fetch manifest → verify → swap → exit (3). If no update is available the call is idempotent and the daemon transitions back to running. If signature verification fails, the swap is aborted and an error is returned. The caller (HTTP handler) typically returns the outcome to the client and may then call Stop().

func (*Daemon) WorkerID

func (d *Daemon) WorkerID() string

WorkerID returns the assigned worker ID (empty until registered).

type EvictHandler

type EvictHandler interface {
	Evict(ctx context.Context, req afclient.EvictPoolRequest) (*afclient.EvictPoolResponse, error)
}

EvictHandler executes a pool eviction request and returns the response.

type HeartbeatOptions

type HeartbeatOptions struct {
	WorkerID        string
	Hostname        string
	OrchestratorURL string
	// RuntimeJWT is the runtime token (a JWT) returned by /api/workers/register
	// and sent in Authorization: Bearer on every heartbeat.
	RuntimeJWT      string
	IntervalSeconds int
	GetActiveCount  func() int
	GetMaxCount     func() int
	GetStatus       func() RegistrationStatus
	Region          string

	// HTTPClient is the client used for the real-endpoint call.
	HTTPClient *http.Client
	// LogWarn is called when the real-endpoint call fails (transient
	// failures are non-fatal — the platform will detect via missed
	// heartbeats and Redis TTL expiry).
	LogWarn func(format string, args ...any)
	// Now provides the heartbeat sentAt timestamp.
	Now func() time.Time
	// OnHeartbeat is invoked after each heartbeat payload is composed
	// (whether or not the network call succeeded). Used by tests and
	// observability.
	OnHeartbeat func(payload HeartbeatPayload)
	// OnReregister is called when the runtime token is rejected (HTTP 401)
	// or the worker is reported missing (HTTP 404 — likely Redis TTL
	// expired). Implementations re-issue Register() against the platform
	// and return the fresh worker id + runtime token. Returning a non-nil
	// error leaves the heartbeat in its prior state and logs via LogWarn;
	// the next tick retries the heartbeat with the stale token (which will
	// fail again and re-trigger this path).
	//
	// Required when the daemon runs against a real platform; tests that
	// only exercise the local stub path can leave it nil.
	OnReregister func(ctx context.Context) (workerID, runtimeJWT string, err error)
}

HeartbeatOptions configure a HeartbeatService.

type HeartbeatPayload

type HeartbeatPayload struct {
	WorkerID       string             `json:"workerId"`
	Hostname       string             `json:"hostname"`
	Status         RegistrationStatus `json:"status"`
	ActiveSessions int                `json:"activeSessions"`
	MaxSessions    int                `json:"maxSessions"`
	Region         string             `json:"region,omitempty"`
	SentAt         string             `json:"sentAt"`
}

HeartbeatPayload is the body sent on each heartbeat POST to /api/workers/<id>/heartbeat.

type HeartbeatService

type HeartbeatService struct {
	// contains filtered or unexported fields
}

HeartbeatService manages the periodic heartbeat goroutine. It is safe to Start / Stop multiple times; consecutive Starts are idempotent.

func NewHeartbeatService

func NewHeartbeatService(opts HeartbeatOptions) *HeartbeatService

NewHeartbeatService constructs a HeartbeatService from opts. Required callbacks are GetActiveCount, GetMaxCount, and GetStatus.

func (*HeartbeatService) CurrentCredentials added in v0.4.0

func (h *HeartbeatService) CurrentCredentials() (workerID, runtimeJWT string)

CurrentCredentials returns the worker id and runtime JWT currently in use. After a re-register (triggered by a 401 or 404 response), they may differ from the values passed at construction time.

func (*HeartbeatService) IsRunning

func (h *HeartbeatService) IsRunning() bool

IsRunning reports whether the heartbeat goroutine is active.

func (*HeartbeatService) LastPayload

func (h *HeartbeatService) LastPayload() HeartbeatPayload

LastPayload returns the most recently composed heartbeat payload (for debugging / status surfaces).

func (*HeartbeatService) Start

func (h *HeartbeatService) Start()

Start launches the heartbeat goroutine. It sends an immediate heartbeat, then continues at IntervalSeconds. Subsequent calls are no-ops.
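One way to get "immediate first beat, then every interval, repeated Start is a no-op" is a mutex-guarded stop channel. The sketch below is an illustration only: `beater` and `newBeater` are hypothetical names, and the real HeartbeatService internals are not shown in this documentation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// beater is a hypothetical stand-in for HeartbeatService.
type beater struct {
	mu       sync.Mutex
	interval time.Duration
	beat     func()
	stop     chan struct{}
	done     chan struct{}
}

func newBeater(intervalSeconds int, beat func()) *beater {
	return &beater{interval: time.Duration(intervalSeconds) * time.Second, beat: beat}
}

// Start launches the loop once; calls made while it is running are no-ops.
func (b *beater) Start() {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.stop != nil {
		return
	}
	b.stop = make(chan struct{})
	b.done = make(chan struct{})
	go b.loop(b.stop, b.done)
}

func (b *beater) loop(stop <-chan struct{}, done chan<- struct{}) {
	defer close(done)
	b.beat() // immediate first heartbeat
	tk := time.NewTicker(b.interval)
	defer tk.Stop()
	for {
		select {
		case <-tk.C:
			b.beat()
		case <-stop:
			return
		}
	}
}

// Stop ends the loop and waits for it to exit; safe to call repeatedly.
func (b *beater) Stop() {
	b.mu.Lock()
	stop, done := b.stop, b.done
	b.stop, b.done = nil, nil
	b.mu.Unlock()
	if stop == nil {
		return
	}
	close(stop)
	<-done
}

func main() {
	beats := make(chan struct{}, 8)
	b := newBeater(3600, func() { beats <- struct{}{} })
	b.Start()
	b.Start() // no-op while running
	<-beats   // the immediate heartbeat
	b.Stop()
	b.Stop() // no-op once stopped
	fmt.Println("extra beats:", len(beats))
}
```

Passing the channels into the goroutine as arguments, instead of reading them from the struct, keeps a later Start from racing with a loop that is still shutting down.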

func (*HeartbeatService) Stop

func (h *HeartbeatService) Stop()

Stop terminates the heartbeat goroutine. Safe to call multiple times.

type KitConfig added in v0.7.1

type KitConfig struct {
	// ScanPaths is the ordered list of directories the kit registry walks
	// to find installed kits. Empty / absent means [DefaultKitScanPath()]
	// (resolved by applyDefaults).
	ScanPaths []string `yaml:"scanPaths,omitempty" json:"scanPaths,omitempty"`
}

KitConfig configures the Layer-4 kit operator surface — the scan paths the daemon walks to discover installed kits. Wave 11 / ADR-2026-05-07 § D4. ScanPaths are evaluated in declaration order; the first entry is also where the .state.json sidecar (enable/disable toggles) lives. A leading `~/` is expanded to the user's home directory by NewKitRegistry.
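The `~/` expansion can be sketched as follows. `expandHome` is a hypothetical helper, not the code NewKitRegistry runs, and the home directory is passed in explicitly so the example stays deterministic (a real caller would likely obtain it from os.UserHomeDir).

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// expandHome replaces a leading "~/" with home. Paths without that exact
// prefix (including "~user/...") are returned unchanged.
func expandHome(p, home string) string {
	if strings.HasPrefix(p, "~/") {
		return filepath.Join(home, p[2:])
	}
	return p
}

func main() {
	fmt.Println(expandHome("~/kits", "/home/alice"))
	fmt.Println(expandHome("/opt/kits", "/home/alice"))
}
```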

type KitRegistry added in v0.7.0

type KitRegistry struct {
	// contains filtered or unexported fields
}

KitRegistry is a minimal in-process Kit registry.

Methods are safe for concurrent use. The registry rescans on every List call so newly-installed manifests appear without a daemon restart; this is acceptable for an operator-facing surface where call volume is low.

func NewKitRegistry added in v0.7.0

func NewKitRegistry(scanPaths []string) *KitRegistry

NewKitRegistry constructs a KitRegistry with permissive trust mode.

scanPaths defaults to []string{DefaultKitScanPath()} when nil or empty. The first scan path is also where the .state.json sidecar lives.

Equivalent to NewKitRegistryWithTrust(scanPaths, TrustConfig{Mode: TrustModePermissive}). Callers wiring trust modes (or an issuer allowlist) from daemon.yaml should use NewKitRegistryWithTrust.

func NewKitRegistryWithTrust added in v0.7.1

func NewKitRegistryWithTrust(scanPaths []string, trust TrustConfig) *KitRegistry

NewKitRegistryWithTrust constructs a KitRegistry with the given trust configuration. Used by Server.kitRegistryOrEmpty to thread the daemon.Config().Trust block into the registry.

If the verifier fails to construct (e.g., the embedded trust root JSON fails to parse), a permissive verifier with no trusted material is installed instead — every signed manifest reports SignedUnverified, every unsigned reports Unsigned, and the install gate behaves as if Mode=Permissive. The construction error is logged via slog.Warn so operators can diagnose.

func (*KitRegistry) Disable added in v0.7.0

func (r *KitRegistry) Disable(id string) (afclient.Kit, error)

Disable marks the kit disabled in the persisted state. Returns the updated Kit summary or ErrKitNotFound when the id is unknown.

func (*KitRegistry) DisableSource added in v0.7.0

func (r *KitRegistry) DisableSource(name string) (afclient.KitRegistrySource, error)

DisableSource toggles a registry source off.

func (*KitRegistry) Enable added in v0.7.0

func (r *KitRegistry) Enable(id string) (afclient.Kit, error)

Enable marks the kit active in the persisted state. Returns the updated Kit summary or ErrKitNotFound when the id is unknown.

func (*KitRegistry) EnableSource added in v0.7.0

func (r *KitRegistry) EnableSource(name string) (afclient.KitRegistrySource, error)

EnableSource toggles a registry source on. Returns ErrKitSourceNotFound if the name is not in the federation list.

func (*KitRegistry) Get added in v0.7.0

Get returns the full manifest for a single kit id. Returns ErrKitNotFound when the id is not registered.

func (*KitRegistry) Install added in v0.7.0

Install fetches a kit from the operator-supplied source, runs the trust-gated verifier against the freshly-fetched manifest, and (when the gate allows) persists the manifest + sibling .sigstore bundle into the first configured scan path.

Behaviour by request shape (audit § 2.1, § 2.2):

  • req.Source == nil — the Wave-9 backward-compat path. Returns ErrKitInstallUnimplemented (HTTP 501) so the existing Wave-9 smoke + handler tests posting `{}` keep their assertions intact.
  • req.Source.Kind == "git" — clone source.URL @ source.Ref into a temp dir (via gitKitFetcher), locate the manifest, run the verifier, gate on r.verifier.config.Mode, persist into scanPaths[0]. Errors map to ErrKitInstallSourceFetchFailed (502) or ErrKitInstallManifestNotFound (422).
  • req.Source.Kind == "tessl" / "agentskills" / "community" — federation cross-repo wave (REN-1308 follow-up). Returns ErrKitSourceFederationUnimplemented (HTTP 501).
  • Any other kind — wrapped fmt error (handler-mapped to 400).

Trust override: `req.TrustOverride == "allowed-this-once"` bypasses the gate for a single install with structured slog audit logging. Otherwise an unsigned/unverified manifest under a non-permissive trust mode returns ErrKitTrustGateRejected (HTTP 403).
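The error-to-status mapping described above could be expressed in a handler with errors.Is. In this sketch the sentinel variables are local stand-ins that reuse the names from the text, `installStatus` and `errUnsupportedKind` are hypothetical, and the 200 returned on success is an assumption (the source does not state the success status).

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Local stand-ins for the package's sentinel errors named in the text.
var (
	ErrKitInstallUnimplemented          = errors.New("kit install unimplemented")
	ErrKitInstallSourceFetchFailed      = errors.New("kit source fetch failed")
	ErrKitInstallManifestNotFound       = errors.New("kit manifest not found")
	ErrKitSourceFederationUnimplemented = errors.New("kit federation unimplemented")
	ErrKitTrustGateRejected             = errors.New("kit trust gate rejected")

	// errUnsupportedKind is hypothetical; it models "any other kind".
	errUnsupportedKind = errors.New("unsupported source kind")
)

// installStatus maps an Install error to the HTTP status the text describes.
func installStatus(err error) int {
	switch {
	case err == nil:
		return http.StatusOK // assumption: success status is not documented
	case errors.Is(err, ErrKitInstallUnimplemented),
		errors.Is(err, ErrKitSourceFederationUnimplemented):
		return http.StatusNotImplemented // 501
	case errors.Is(err, ErrKitInstallSourceFetchFailed):
		return http.StatusBadGateway // 502
	case errors.Is(err, ErrKitInstallManifestNotFound):
		return http.StatusUnprocessableEntity // 422
	case errors.Is(err, ErrKitTrustGateRejected):
		return http.StatusForbidden // 403
	default:
		return http.StatusBadRequest // 400 for any other wrapped error
	}
}

func main() {
	wrapped := fmt.Errorf("clone failed: %w", ErrKitInstallSourceFetchFailed)
	fmt.Println(installStatus(wrapped))
	fmt.Println(installStatus(errUnsupportedKind))
}
```

Using errors.Is instead of `==` means the mapping still works when Install wraps a sentinel with extra context.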

Manifest persistence uses the atomic tmp-then-rename pattern to match the kit_state writer at saveStateLocked. The on-disk filename is `<sanitizedID>.kit.toml` where slashes in the manifest's `kit.id` are replaced with `__` (the manifest's internal `kit.id` retains the canonical slash form).
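A minimal sketch of the filename rule and the tmp-then-rename write, assuming hypothetical helpers `manifestFilename`, `writeAtomic`, and `demo`; the kit id and manifest body are made up for illustration, and this is not the package's own writer.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// manifestFilename derives the on-disk name: slashes in the kit id become
// "__" and the ".kit.toml" suffix is appended.
func manifestFilename(kitID string) string {
	return strings.ReplaceAll(kitID, "/", "__") + ".kit.toml"
}

// writeAtomic writes data to a temp file in dir and then renames it into
// place, so readers never observe a partially written manifest.
func writeAtomic(dir, name string, data []byte) error {
	tmp, err := os.CreateTemp(dir, name+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer os.Remove(tmpName) // fails harmlessly once the rename has succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, filepath.Join(dir, name))
}

// demo persists a manifest into a scratch directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "kits-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	name := manifestFilename("rensei/go-kit") // hypothetical kit id
	if err := writeAtomic(dir, name, []byte("[kit]\n")); err != nil {
		return "", err
	}
	data, err := os.ReadFile(filepath.Join(dir, name))
	return string(data), err
}

func main() {
	fmt.Println(manifestFilename("rensei/go-kit"))
	content, err := demo()
	fmt.Printf("%q %v\n", content, err)
}
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes the swap atomic on POSIX systems.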

func (*KitRegistry) List added in v0.7.0

func (r *KitRegistry) List() []afclient.Kit

List returns all installed kits across all scan paths. Malformed manifests log a warning and are excluded. Empty scan paths return an empty slice with no error.

func (*KitRegistry) ListSources added in v0.7.0

func (r *KitRegistry) ListSources() []afclient.KitRegistrySource

ListSources returns the registry source descriptors in federation order. Persisted disable state from .state.json is applied to the Enabled flag.

func (*KitRegistry) ScanPaths added in v0.7.0

func (r *KitRegistry) ScanPaths() []string

ScanPaths returns the registry's scan paths in declaration order.

func (*KitRegistry) VerifySignature added in v0.7.0

func (r *KitRegistry) VerifySignature(id string) (afclient.KitSignatureResult, error)

VerifySignature returns a KitSignatureResult for the kit, driven by the sigstore bundle-mode verifier (Wave 12 / S2). The verifier reads the sibling `<manifest>.sigstore` file alongside the kit manifest; a missing bundle returns KitTrustUnsigned with OK: true. Verification outcomes map to KitTrustSignedVerified / KitTrustSignedUnverified; see kit_trust.go for the full state machine.

type MachineConfig

type MachineConfig struct {
	ID     string `yaml:"id"               json:"id"`
	Region string `yaml:"region,omitempty" json:"region,omitempty"`
}

MachineConfig captures the machine identity block from daemon.yaml.

type ObservabilityConfig

type ObservabilityConfig struct {
	LogFormat   string `yaml:"logFormat,omitempty"   json:"logFormat,omitempty"`
	LogPath     string `yaml:"logPath,omitempty"     json:"logPath,omitempty"`
	MetricsPort int    `yaml:"metricsPort,omitempty" json:"metricsPort,omitempty"`
}

ObservabilityConfig holds optional log/metrics tuning.

type Options

type Options struct {
	// ConfigPath is where to load / persist daemon.yaml. Defaults to
	// DefaultConfigPath().
	ConfigPath string
	// JWTPath is where to cache the runtime JWT. Defaults to
	// DefaultJWTPath().
	JWTPath string
	// SkipWizard, when true, prevents the interactive wizard from running
	// even when stdin is a TTY. The default config (or existing config) is
	// used instead.
	SkipWizard bool
	// SkipRegistration, when true, skips the registration call (used when
	// the daemon is being started in setup-only or config-only modes).
	SkipRegistration bool
	// SpawnerOptions overrides the default spawner options. The Projects
	// and MaxConcurrentSessions fields are populated automatically from
	// loaded config.
	SpawnerOptions SpawnerOptions
	// HTTPHost overrides the default control server bind address.
	HTTPHost string
	// HTTPPort overrides the default control server port.
	//
	// Zero means "ephemeral port": the listener binds 127.0.0.1:0 and
	// the kernel picks a free port. The effective bound port is then
	// available via Server.Addr() after Server.Start succeeds.
	// Production callers (afcli/daemon_run.go) substitute the
	// well-known DefaultHTTPPort (7734) themselves before constructing
	// Options so operator behaviour is preserved; the daemon library
	// itself does NOT auto-fill — leaving zero-as-ephemeral makes
	// parallel tests collision-free under -race.
	HTTPPort int
	// PoolStatsProvider returns the current workarea pool snapshot. May be
	// nil — the /api/daemon/pool/stats endpoint will return an empty
	// snapshot in that case (acceptance criterion: pool integration is
	// optional in the runtime port; full WorkareaProvider wiring is REN-1280).
	PoolStatsProvider PoolStatsProvider
	// EvictHandler handles pool eviction requests. May be nil; the endpoint
	// returns 501 in that case.
	EvictHandler EvictHandler
	// ProviderRegistry exposes the daemon's locally-registered AgentRuntime
	// providers (claude/codex/ollama/opencode/gemini/amp/stub) to the
	// /api/daemon/providers* surface. May be nil — the endpoint will then
	// return an empty list with PartialCoverage=true, which is the correct
	// behaviour for a daemon that has not yet wired its runtime registry.
	// Wave 9 / ADR-2026-05-07-daemon-http-control-api.md §D4.
	ProviderRegistry ProviderRegistry

	// Version overrides the package-level `Version` for status reporting.
	// Empty falls back to the package var (which itself defaults to "dev"
	// unless the build injected via -ldflags). Downstream embedders that
	// ship their own binary (e.g. the rensei daemon) should set this to
	// their own version string so /api/daemon/status reports the
	// running binary, not whatever string agentfactory-tui's vendored
	// source had at the time.
	Version string
}

Options configure a Daemon.
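The zero-as-ephemeral behaviour described for HTTPPort relies on standard kernel port allocation. The sketch below shows that mechanism with the standard library only; `listenLocal` is a hypothetical helper and is not how Server.Start is implemented.

```go
package main

import (
	"fmt"
	"net"
)

// listenLocal binds 127.0.0.1 on the given port. Port 0 asks the kernel for
// a free port; the effective port is read back from the listener address.
func listenLocal(port int) (net.Listener, int, error) {
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
	if err != nil {
		return nil, 0, err
	}
	return ln, ln.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	ln, port, err := listenLocal(0)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("bound ephemeral port:", port)
}
```

Two listeners opened this way at the same time receive different ports, which is why parallel tests do not collide.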

type OrchestratorConfig

type OrchestratorConfig struct {
	URL       string `yaml:"url"                 json:"url"`
	AuthToken string `yaml:"authToken,omitempty" json:"authToken,omitempty"`
}

OrchestratorConfig is the orchestrator URL + registration token block.

type PollHTTPError added in v0.4.1

type PollHTTPError struct {
	Status int
	Body   string
}

PollHTTPError is returned by callPollEndpoint for non-2xx responses so the loop can branch on the HTTP status (401 → re-register).
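A caller can branch on the status with errors.As. In this sketch `pollHTTPError` mirrors the exported struct's fields, but the Error() message format and the `needsReregister` helper are assumptions; treating 404 like 401 follows the PollOptions.OnReregister description.

```go
package main

import (
	"errors"
	"fmt"
)

// pollHTTPError mirrors the PollHTTPError fields shown above.
type pollHTTPError struct {
	Status int
	Body   string
}

// Error uses a hypothetical message format.
func (e *pollHTTPError) Error() string {
	return fmt.Sprintf("poll: HTTP %d: %s", e.Status, e.Body)
}

// needsReregister reports whether err carries a status that should trigger
// the re-register path (401 or 404).
func needsReregister(err error) bool {
	var he *pollHTTPError
	if errors.As(err, &he) {
		return he.Status == 401 || he.Status == 404
	}
	return false
}

func main() {
	err := fmt.Errorf("poll tick: %w", &pollHTTPError{Status: 401, Body: "token expired"})
	fmt.Println(needsReregister(err))
	fmt.Println(needsReregister(errors.New("connection refused")))
}
```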

func (*PollHTTPError) Error added in v0.4.1

func (e *PollHTTPError) Error() string

type PollOptions added in v0.4.1

type PollOptions struct {
	WorkerID        string
	OrchestratorURL string
	RuntimeJWT      string
	IntervalSeconds int

	// HTTPClient is the client used for poll calls. Defaults to a 30s-timeout
	// http.Client.
	HTTPClient *http.Client
	// LogWarn is called for transient poll failures. Defaults to no-op.
	LogWarn func(format string, args ...any)
	// LogInfo is called when work is dispatched / re-register fires.
	LogInfo func(format string, args ...any)
	// OnWork is invoked for each item returned in the work[] slice. Errors are
	// logged at warn and do not stop the loop. Required.
	OnWork func(item PollWorkItem) error
	// OnReregister is called on HTTP 401 (runtime JWT expired) or 404 (worker
	// fell out of Redis). Implementations re-issue Register() and return the
	// fresh worker id + runtime token. The poll loop swaps credentials and
	// continues. Returning an error logs it; the loop retries on the next tick.
	OnReregister func(ctx context.Context) (workerID, runtimeJWT string, err error)
}

PollOptions configure a single poll loop run.

type PollResponse added in v0.4.1

type PollResponse struct {
	Work              []PollWorkItem `json:"work"`
	HasInboxMessages  bool           `json:"hasInboxMessages,omitempty"`
	PreClaimed        bool           `json:"preClaimed,omitempty"`
	ClaimedSessionIDs []string       `json:"claimedSessionIds,omitempty"`
}

PollResponse is the body of GET /api/workers/<id>/poll. Only the fields the daemon currently consumes are decoded; unknown fields are ignored.

type PollService added in v0.4.1

type PollService struct {
	// contains filtered or unexported fields
}

PollService manages the periodic poll goroutine. Like HeartbeatService it is safe to Start / Stop multiple times; consecutive Starts are idempotent.

func NewPollService added in v0.4.1

func NewPollService(opts PollOptions) *PollService

NewPollService constructs a PollService from opts. OnWork must be non-nil.

func (*PollService) IsRunning added in v0.4.1

func (p *PollService) IsRunning() bool

IsRunning reports whether the poll goroutine is active.

func (*PollService) Start added in v0.4.1

func (p *PollService) Start()

Start launches the poll goroutine. Subsequent calls are no-ops.

func (*PollService) Stop added in v0.4.1

func (p *PollService) Stop()

Stop terminates the poll goroutine. Safe to call multiple times.

type PollStageBudget added in v0.5.5

type PollStageBudget struct {
	MaxDurationSeconds int   `json:"maxDurationSeconds,omitempty"`
	MaxSubAgents       int   `json:"maxSubAgents,omitempty"`
	MaxTokens          int64 `json:"maxTokens,omitempty"`
}

PollStageBudget mirrors the platform's StageBudget shape so the daemon can decode + forward it without depending on the runner package (cardinal package-architecture rule: daemon does not import runner). The runner re-types this into prompt.StageBudget when it constructs the QueuedWork. (REN-1485 / REN-1487.)

type PollWorkItem added in v0.4.1

type PollWorkItem struct {
	SessionID    string            `json:"sessionId"`
	ProjectName  string            `json:"projectName,omitempty"`
	Repository   string            `json:"repository,omitempty"`
	Ref          string            `json:"ref,omitempty"`
	Priority     int               `json:"priority,omitempty"`
	Env          map[string]string `json:"env,omitempty"`
	MaxDuration  int               `json:"maxDurationSeconds,omitempty"`
	Resources    *SessionResources `json:"resources,omitempty"`
	QueuedAt     int64             `json:"queuedAt,omitempty"`
	ProjectScope string            `json:"projectScope,omitempty"`

	// REN-1461 / F.2.8 — enriched fields the platform may send so the
	// `af agent run` worker has the runner context it needs without
	// requiring a separate platform fetch. Optional during the rollout
	// window; absent fields fall through to the default render path.
	IssueID           string                  `json:"issueId,omitempty"`
	IssueIdentifier   string                  `json:"issueIdentifier,omitempty"`
	LinearSessionID   string                  `json:"linearSessionId,omitempty"`
	ProviderSessionID string                  `json:"providerSessionId,omitempty"`
	OrganizationID    string                  `json:"organizationId,omitempty"`
	WorkType          string                  `json:"workType,omitempty"`
	PromptContext     string                  `json:"promptContext,omitempty"`
	Body              string                  `json:"body,omitempty"`
	Title             string                  `json:"title,omitempty"`
	MentionContext    string                  `json:"mentionContext,omitempty"`
	ParentContext     string                  `json:"parentContext,omitempty"`
	Branch            string                  `json:"branch,omitempty"`
	ResolvedProfile   *SessionResolvedProfile `json:"resolvedProfile,omitempty"`

	// REN-1485 / REN-1487 Phase 2 stage-driven SDLC fields. Populated
	// by the platform's `agent.dispatch_stage` action; absent when the
	// work was queued by the legacy `agent.dispatch_to_queue` action.
	// Round-trip opaquely on the QueuedWork JSON; the daemon forwards
	// them onto SessionDetail without interpreting them.
	StagePrompt        string           `json:"stagePrompt,omitempty"`
	StageID            string           `json:"stageId,omitempty"`
	StageBudget        *PollStageBudget `json:"stageBudget,omitempty"`
	StageLifecycle     map[string]any   `json:"stageLifecycle,omitempty"`
	StageSourceEventID string           `json:"stageSourceEventId,omitempty"`
}

PollWorkItem mirrors one element of the platform's poll response `work[]` array. The platform serves GET /api/workers/<id>/poll and returns:

{
  work: QueuedWork[],
  inboxMessages: { [sessionId]: InboxMessage[] },
  hasInboxMessages: boolean,
  preClaimed: boolean,
  claimedSessionIds: string[],
  gitCredentials: { token, cloneUrl, expiresAt }[],
}

QueuedWork carries the session-spec fields the daemon needs to dispatch a session to the spawner. Field names follow the platform wire shape (camelCase).

QueuedAt is a Unix-millisecond epoch number on the wire — the platform's QueuedWork interface (packages/agentfactory-server work-queue.ts) defines it as `queuedAt: number`, and the Redis-stored session payload confirms a numeric value (e.g. 1777658441780). v0.4.1 mistakenly typed it as `string`, which caused the daemon's poll loop to fail decoding ("cannot unmarshal number into Go struct field PollWorkItem.work.queuedAt of type string") and silently drop pre-claimed sessions.
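The decoding failure is easy to reproduce with encoding/json. In this sketch `workItem` and `legacyWorkItem` are trimmed, hypothetical stand-ins for PollWorkItem (current typing and the v0.4.1 typing respectively), and the payload reuses the example value from the text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// workItem is a trimmed stand-in for PollWorkItem with the numeric typing.
type workItem struct {
	SessionID string `json:"sessionId"`
	QueuedAt  int64  `json:"queuedAt,omitempty"`
}

// legacyWorkItem reproduces the v0.4.1 string typing.
type legacyWorkItem struct {
	SessionID string `json:"sessionId"`
	QueuedAt  string `json:"queuedAt,omitempty"`
}

func decode(body string) (workItem, error) {
	var it workItem
	err := json.Unmarshal([]byte(body), &it)
	return it, err
}

func decodeLegacy(body string) error {
	var it legacyWorkItem
	return json.Unmarshal([]byte(body), &it)
}

func main() {
	const body = `{"sessionId":"s1","queuedAt":1777658441780}`
	item, err := decode(body)
	fmt.Println(item.SessionID, time.UnixMilli(item.QueuedAt).UTC(), err)
	fmt.Println("legacy decode error:", decodeLegacy(body))
}
```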

type PoolCapacityGuard added in v0.7.0

type PoolCapacityGuard interface {
	// CheckCapacity returns nil + zero retryAfter when a new member
	// fits, or a non-zero retryAfter and an explanatory error when the
	// pool is saturated.
	CheckCapacity() (retryAfter time.Duration, err error)
}

PoolCapacityGuard tells Restore whether a fresh pool member can be admitted. Returning a non-zero retryAfter indicates saturation — Restore propagates that to the HTTP handler as 503 + Retry-After.
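How a handler might translate the guard's answer into 503 + Retry-After is sketched below. `capacityGuard` copies the interface shape; `fixedGuard`, `restoreHandler`, `probe`, the `/restore` path, and the 200 success status are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"net/http"
	"net/http/httptest"
	"strconv"
	"time"
)

// capacityGuard has the same method set as PoolCapacityGuard.
type capacityGuard interface {
	CheckCapacity() (retryAfter time.Duration, err error)
}

// fixedGuard is a hypothetical guard that is saturated when retrySeconds > 0.
type fixedGuard struct{ retrySeconds int }

func (g fixedGuard) CheckCapacity() (time.Duration, error) {
	if g.retrySeconds > 0 {
		return time.Duration(g.retrySeconds) * time.Second, errors.New("workarea pool saturated")
	}
	return 0, nil
}

// restoreHandler answers 503 with Retry-After (whole seconds, rounded up)
// when the guard reports saturation.
func restoreHandler(g capacityGuard) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		retryAfter, err := g.CheckCapacity()
		if retryAfter > 0 {
			msg := "pool saturated"
			if err != nil {
				msg = err.Error()
			}
			secs := int(math.Ceil(retryAfter.Seconds()))
			w.Header().Set("Retry-After", strconv.Itoa(secs))
			http.Error(w, msg, http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK) // assumption: success status is not documented
	}
}

// probe runs one request through the handler and returns status + header.
func probe(g capacityGuard) (int, string) {
	rec := httptest.NewRecorder()
	restoreHandler(g)(rec, httptest.NewRequest(http.MethodPost, "/restore", nil))
	return rec.Code, rec.Header().Get("Retry-After")
}

func main() {
	fmt.Println(probe(fixedGuard{retrySeconds: 30}))
	fmt.Println(probe(fixedGuard{}))
}
```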

type PoolStatsProvider

type PoolStatsProvider interface {
	Stats(ctx context.Context) (*afclient.WorkareaPoolStats, error)
}

PoolStatsProvider returns a workarea pool snapshot.

type PrefixWriterFunc

type PrefixWriterFunc func(workerID, line string)

PrefixWriterFunc adapts a function to PrefixedWriter.

func (PrefixWriterFunc) WriteWorkerLine

func (f PrefixWriterFunc) WriteWorkerLine(workerID, line string)

WriteWorkerLine implements PrefixedWriter.

type PrefixedWriter

type PrefixedWriter interface {
	WriteWorkerLine(workerID, line string)
}

PrefixedWriter is the minimal sink interface used by the spawner to emit child stdout/stderr. Implementations are responsible for prefixing each line with the worker tag.
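PrefixWriterFunc follows the same adapter idiom as http.HandlerFunc: a plain function value satisfies the interface. The sketch redeclares both types locally so it compiles on its own; `emit`, `collect`, and the `[workerID]` prefix format are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the two declarations shown above.
type PrefixedWriter interface {
	WriteWorkerLine(workerID, line string)
}

type PrefixWriterFunc func(workerID, line string)

// WriteWorkerLine calls the underlying function.
func (f PrefixWriterFunc) WriteWorkerLine(workerID, line string) { f(workerID, line) }

// emit is a hypothetical consumer that only knows about the interface.
func emit(w PrefixedWriter, workerID string, lines ...string) {
	for _, l := range lines {
		w.WriteWorkerLine(workerID, l)
	}
}

// collect adapts a closure into a PrefixedWriter and gathers its output.
func collect(workerID string, lines ...string) string {
	var sb strings.Builder
	sink := PrefixWriterFunc(func(id, line string) {
		fmt.Fprintf(&sb, "[%s] %s\n", id, line)
	})
	emit(sink, workerID, lines...)
	return sb.String()
}

func main() {
	fmt.Print(collect("worker-1", "cloning repo", "running agent"))
}
```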

type ProjectConfig

type ProjectConfig struct {
	ID            string        `yaml:"id"                       json:"id"`
	Repository    string        `yaml:"repository"               json:"repository"`
	CloneStrategy CloneStrategy `yaml:"cloneStrategy,omitempty"  json:"cloneStrategy,omitempty"`
	Git           *ProjectGit   `yaml:"git,omitempty"            json:"git,omitempty"`
}

ProjectConfig describes one entry in the project allowlist.

func (*ProjectConfig) UnmarshalYAML added in v0.4.0

func (p *ProjectConfig) UnmarshalYAML(node *yaml.Node) error

UnmarshalYAML accepts either the canonical `repository` key or the legacy `repoUrl` key (pre-REN-1419 daemon.yaml files written by older versions of `rensei project allow`). When the legacy key is found a one-line warning is logged so operators know to rewrite the file; this back-compat shim is scheduled for removal one release after the canonical writer ships.

type ProjectGit

type ProjectGit struct {
	CredentialHelper string `yaml:"credentialHelper,omitempty" json:"credentialHelper,omitempty"`
	SSHKey           string `yaml:"sshKey,omitempty"           json:"sshKey,omitempty"`
}

ProjectGit captures per-project credential helper / SSH key hints.

type ProviderRegistry added in v0.7.0

type ProviderRegistry interface {
	// Names returns the sorted list of registered provider name strings.
	// Each name is the canonical agent.ProviderName string (e.g. "claude",
	// "codex"). Order is stable across calls.
	Names() []string
	// Capabilities returns the typed capability struct serialised to a
	// flat map[string]any for the named provider. ok is false when the
	// provider is not registered. The map shape matches the JSON encoding
	// of agent.Capabilities so the wire shape on /api/daemon/providers
	// matches the contract.
	Capabilities(name string) (caps map[string]any, ok bool)
}

ProviderRegistry is the minimal read-only view of the runner's in-process AgentRuntime registry the /api/daemon/providers handler consumes. The daemon imports a satisfying type from runner.Registry — the interface keeps this package free of a runner import cycle. (Wave 9 / A1.)

type RefreshTokenResult added in v0.5.5

type RefreshTokenResult struct {
	// Mode is the path the refresh actually took: "refresh" (platform
	// honoured the refresh probe and minted a new JWT bound to the
	// same workerId), "reregister" (probe returned 404 / endpoint
	// missing — the daemon fell back to full POST /api/workers/register
	// and got a NEW workerId), or "error" (both paths failed).
	Mode string

	// WorkerID is the worker id in effect after the refresh attempt.
	// On Mode=refresh this is the SAME workerId; on Mode=reregister
	// it's a fresh one.
	WorkerID string

	// RuntimeToken is the fresh runtime JWT.
	RuntimeToken string

	// RegistrationTokenSwapped is true when Mode=reregister produced a
	// different workerId. Operators care about this signal because the
	// platform forgets the old workerId after a fresh registration —
	// any in-flight heartbeats / polls keyed on it 404 until the daemon
	// swaps credentials. (REN-1481 root cause.)
	RegistrationTokenSwapped bool

	// Reason is the structured reason the refresh path was taken
	// (e.g. "runtime-token-expired", "worker-not-found"). Surfaces in
	// the [runtime-token] log line.
	Reason string
}

RefreshTokenResult is the outcome of an attempted runtime-token refresh. The OnReregister callback wired into HeartbeatService and PollService synthesises one of these per attempt; logged via the `[runtime-token]` structured line.

func RefreshRuntimeToken added in v0.5.5

func RefreshRuntimeToken(
	ctx context.Context,
	regOpts RegistrationOptions,
	currentWorkerID string,
	reason string,
) (*RefreshTokenResult, error)

RefreshRuntimeToken attempts to refresh the daemon's runtime JWT without re-registering — i.e. preserving the workerId. This is the REN-1481 fix path. Behaviour:

  1. Probe POST /api/workers/<id>/refresh-token with the registration token in the Authorization: Bearer header. On 200, the platform has minted a fresh JWT bound to the same workerId — best case.
  2. On 404 (endpoint missing — current platform-side state) or 405 (method not allowed), fall through to FULL re-register via Register(ForceReregister=true). The runtime token gets refreshed but at the cost of a new workerId.
  3. On any other failure (5xx, network, 401-on-registration-token), return an error. Caller logs + retries on next tick.

This is the only path that should call Register() with ForceReregister=true outside boot. All in-flight 401/404 detection in HeartbeatService / PollService routes through here so the `[runtime-token]` log line is the single source of truth for operators investigating the 5-minute cycle in REN-1481.
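
The three-way branch above can be sketched as a pure function of the probe's HTTP status. This is only an illustration: `classifyRefreshStatus` is a hypothetical helper, not part of the package, and it does not model network failures or the outcome of the follow-up Register call.

```go
package main

import (
	"fmt"
	"net/http"
)

// classifyRefreshStatus is a hypothetical helper (not part of the daemon
// package). It maps the status of the refresh-token probe to the next step
// described in the RefreshRuntimeToken docs.
func classifyRefreshStatus(status int) string {
	switch status {
	case http.StatusOK:
		return "refresh" // fresh JWT, same workerId
	case http.StatusNotFound, http.StatusMethodNotAllowed:
		return "reregister" // endpoint missing: fall through to full re-register
	default:
		return "error" // 5xx, 401, ...: caller logs and retries on the next tick
	}
}

func main() {
	for _, s := range []int{200, 404, 405, 401, 503} {
		fmt.Println(s, classifyRefreshStatus(s))
	}
}
```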

type RegisterRequest

type RegisterRequest struct {
	MachineID string   `json:"machineId,omitempty"`
	Hostname  string   `json:"hostname"`
	Capacity  int      `json:"capacity"`
	Version   string   `json:"version,omitempty"`
	Projects  []string `json:"projects,omitempty"`
}

RegisterRequest is the JSON body sent on POST /api/workers/register.

The platform contract (see platform/src/app/api/workers/register/route.ts):

{ machineId?: string, hostname: string, capacity: number, version?: string, projects?: string[] }

The registration token is sent in the Authorization: Bearer header, NOT in the body. Status / region / capabilities / activeAgentCount are not part of the platform contract — they live in the heartbeat payload, or are inferred from the project's Linear tracker bindings on the server side.
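
As a rough sketch of that contract, the request below carries the token in the Authorization header and only the documented fields in the JSON body. The helper name `newRegisterRequest`, the local `registerRequest` copy, and the example URL and token are assumptions for illustration; the real Register function may build its request differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// registerRequest is a local copy of the documented wire shape so the
// sketch compiles on its own.
type registerRequest struct {
	MachineID string   `json:"machineId,omitempty"`
	Hostname  string   `json:"hostname"`
	Capacity  int      `json:"capacity"`
	Version   string   `json:"version,omitempty"`
	Projects  []string `json:"projects,omitempty"`
}

// newRegisterRequest is a hypothetical helper: the token goes in the
// Authorization header, never in the body.
func newRegisterRequest(baseURL, token string, body registerRequest) (*http.Request, error) {
	buf, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/workers/register", bytes.NewReader(buf))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder URL and token, for illustration only.
	req, err := newRegisterRequest("https://platform.example", "rsk_live_example",
		registerRequest{Hostname: "devbox", Capacity: 4})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```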

type RegisterResponse

type RegisterResponse struct {
	WorkerID              string `json:"workerId"`
	HeartbeatInterval     int    `json:"heartbeatInterval"` // ms
	PollInterval          int    `json:"pollInterval"`      // ms
	RuntimeToken          string `json:"runtimeToken"`
	RuntimeTokenExpiresAt string `json:"runtimeTokenExpiresAt,omitempty"`
}

RegisterResponse is the JSON response from POST /api/workers/register.

Platform contract:

{ workerId, heartbeatInterval (ms), pollInterval (ms),
  runtimeToken, runtimeTokenExpiresAt }

Field names mirror the wire shape; helper methods provide seconds-based accessors used by the heartbeat scheduler.

func Register

Register dials the platform (or the stub path) and returns a RegisterResponse. The cache at jwtPath is consulted first unless opts.ForceReregister is set.

Real-platform registration is the default. The stub path is taken when:

  • RENSEI_DAEMON_FORCE_STUB env is set (e.g. =1), OR
  • the orchestrator URL is "file://...", OR
  • the registration token does not start with rsp_live_ or rsk_live_.

REN-1444 (v0.4.1) inverted the env-gate from opt-in to opt-out. The previous default required RENSEI_DAEMON_REAL_REGISTRATION=1 in the launchd plist; with that env unset, a daemon configured with a real rsk_live_* token would silently fall back to stub mode and never register against the platform.
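
The stub-path conditions listed above can be read as a single predicate. The sketch below is an assumption-laden illustration: `useStubPath` is a hypothetical name, and treating any non-empty RENSEI_DAEMON_FORCE_STUB value as "set" is a guess at the env semantics.

```go
package main

import (
	"fmt"
	"strings"
)

// useStubPath is a hypothetical predicate mirroring the documented rules.
// forceStub stands for the value of RENSEI_DAEMON_FORCE_STUB; this sketch
// assumes any non-empty value counts as set.
func useStubPath(forceStub, orchestratorURL, token string) bool {
	if forceStub != "" {
		return true
	}
	if strings.HasPrefix(orchestratorURL, "file://") {
		return true
	}
	// Real registration requires a live-token prefix.
	return !strings.HasPrefix(token, "rsp_live_") && !strings.HasPrefix(token, "rsk_live_")
}

func main() {
	fmt.Println(useStubPath("", "https://platform.example", "rsk_live_abc"))
	fmt.Println(useStubPath("1", "https://platform.example", "rsk_live_abc"))
	fmt.Println(useStubPath("", "file:///tmp/stub", "rsk_live_abc"))
	fmt.Println(useStubPath("", "https://platform.example", "dev-token"))
}
```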

func (*RegisterResponse) HeartbeatIntervalSeconds

func (r *RegisterResponse) HeartbeatIntervalSeconds() int

HeartbeatIntervalSeconds returns the heartbeat cadence in seconds (rounded up). The platform reports the cadence in milliseconds; daemon code that schedules tickers historically worked in seconds.

func (*RegisterResponse) PollIntervalSeconds

func (r *RegisterResponse) PollIntervalSeconds() int

PollIntervalSeconds returns the poll cadence in seconds (rounded up).
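
Both accessors round a millisecond value up to whole seconds. A minimal sketch of that rounding, assuming non-negative input (the helper name `ceilSeconds` is hypothetical and the real methods may be implemented differently):

```go
package main

import "fmt"

// ceilSeconds converts milliseconds to whole seconds, rounding up.
// Hypothetical helper; assumes ms is non-negative.
func ceilSeconds(ms int) int {
	return (ms + 999) / 1000
}

func main() {
	fmt.Println(ceilSeconds(30000), ceilSeconds(1500), ceilSeconds(1)) // 30 2 1
}
```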

type RegistrationOptions

type RegistrationOptions struct {
	OrchestratorURL   string
	RegistrationToken string
	MachineID         string
	Hostname          string
	Version           string
	MaxAgents         int
	Capabilities      []string
	Region            string
	JWTPath           string
	ForceReregister   bool

	// HTTPClient is the client used when the real (non-stub) path is taken.
	// Defaults to http.DefaultClient with a 10s timeout.
	HTTPClient *http.Client
	// Now lets tests deterministically clock the cached-at timestamp.
	Now func() time.Time
}

RegistrationOptions configure a single Register call.

type RegistrationStatus

type RegistrationStatus string

RegistrationStatus is the worker-status string sent to the orchestrator in the heartbeat payload. Mirrors the TS DaemonRegistrationStatus.

const (
	RegistrationIdle     RegistrationStatus = "idle"
	RegistrationBusy     RegistrationStatus = "busy"
	RegistrationDraining RegistrationStatus = "draining"
)

Registration status constants.

type ReservedSystemSpec

type ReservedSystemSpec struct {
	VCpu     int `yaml:"vCpu"     json:"vCpu"`
	MemoryMb int `yaml:"memoryMb" json:"memoryMb"`
}

ReservedSystemSpec describes resources reserved for the host OS.

type RoutingTraceStore added in v0.7.0

type RoutingTraceStore struct {
	// contains filtered or unexported fields
}

RoutingTraceStore is the in-process record of routing decisions. The scheduler (or, in this wave, the test harness) feeds it via RecordDecision; HTTP handlers read via GetConfig and Explain.

The store is safe for concurrent use.

func NewRoutingTraceStore added in v0.7.0

func NewRoutingTraceStore(ringSize int) *RoutingTraceStore

NewRoutingTraceStore constructs a store with the given ring-buffer size. ringSize ≤ 0 falls back to DefaultRoutingRingBufferSize.

func (*RoutingTraceStore) Explain added in v0.7.0

Explain returns the recorded decision and trace for sessionID. Returns false when the session has no recorded decision (or the decision has been evicted from the ring).

func (*RoutingTraceStore) GetConfig added in v0.7.0

func (s *RoutingTraceStore) GetConfig(providerNames []string, capturedAt time.Time) afclient.RoutingConfig

GetConfig builds the wire-shape RoutingConfig for the /api/daemon/routing/config endpoint. It composes the static portions (weights, capability filters, sandbox/LLM provider state) with the rolling RecentDecisions tail.

The provider-state surfaces are seeded from the runner.Registry's Names() (passed in via providerNames) — this represents AgentRuntime providers. The sandbox state lists only "local" because that's the only OSS-shipped sandbox in this wave. Both lists default to Thompson-Sampling priors (alpha=1, beta=1) when no decisions have been recorded.

capturedAt sets the snapshot timestamp; pass time.Now().UTC() in production.

func (*RoutingTraceStore) Len added in v0.7.0

func (s *RoutingTraceStore) Len() int

Len returns the current number of recorded decisions in the ring buffer. Test-only helper.

func (*RoutingTraceStore) RecordDecision added in v0.7.0

func (s *RoutingTraceStore) RecordDecision(decision afclient.RoutingDecision, trace []afclient.RoutingTraceStep)

RecordDecision appends decision + trace to the store. If the store is already at ring capacity, the oldest entry is evicted from both the ring and the per-session lookup. Recording with an empty SessionID is allowed (the ring still tracks it) but the explain lookup is keyed by SessionID, so an unkeyed entry is invisible to Explain.
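
The eviction behaviour described above (bounded history plus a per-session lookup) might look roughly like the sketch below. All names are hypothetical stand-ins, the fallback capacity of 8 is a placeholder rather than the value of DefaultRoutingRingBufferSize, the buffer is slice-backed rather than a true circular array, and locking is omitted for brevity.

```go
package main

import "fmt"

// decision is a trimmed stand-in for afclient.RoutingDecision.
type decision struct {
	SessionID string
	Provider  string
}

type entry struct {
	seq int
	d   decision
}

// traceRing is a hypothetical bounded store with a per-session index.
type traceRing struct {
	capacity  int
	nextSeq   int
	entries   []entry          // oldest first
	bySession map[string]entry // lookup used by explain
}

func newTraceRing(capacity int) *traceRing {
	if capacity <= 0 {
		capacity = 8 // placeholder default for this sketch only
	}
	return &traceRing{capacity: capacity, bySession: make(map[string]entry)}
}

func (r *traceRing) record(d decision) {
	if len(r.entries) == r.capacity {
		oldest := r.entries[0]
		r.entries = r.entries[1:]
		// Drop the lookup only if it still points at the evicted entry.
		if cur, ok := r.bySession[oldest.d.SessionID]; ok && cur.seq == oldest.seq {
			delete(r.bySession, oldest.d.SessionID)
		}
	}
	e := entry{seq: r.nextSeq, d: d}
	r.nextSeq++
	r.entries = append(r.entries, e)
	if d.SessionID != "" { // unkeyed entries stay invisible to explain
		r.bySession[d.SessionID] = e
	}
}

func (r *traceRing) explain(sessionID string) (decision, bool) {
	e, ok := r.bySession[sessionID]
	return e.d, ok
}

func main() {
	r := newTraceRing(2)
	r.record(decision{SessionID: "a", Provider: "p1"})
	r.record(decision{SessionID: "b", Provider: "p2"})
	r.record(decision{SessionID: "c", Provider: "p1"}) // evicts "a"
	_, okA := r.explain("a")
	_, okC := r.explain("c")
	fmt.Println(len(r.entries), okA, okC)
}
```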

type Server

type Server struct {
	// contains filtered or unexported fields
}

Server is the daemon's HTTP control API. It wraps a Daemon and exposes the endpoints consumed by `af daemon …` and `rensei daemon …`.

func NewServer

func NewServer(d *Daemon) *Server

NewServer builds an HTTP server for d. The handler is registered but the server is not yet listening — call Start to bind.

func (*Server) Addr

func (s *Server) Addr() string

Addr returns the address the server is bound to (after Start succeeds).

func (*Server) Shutdown

func (s *Server) Shutdown(ctx context.Context) error

Shutdown gracefully shuts down the HTTP server.

func (*Server) Start

func (s *Server) Start() (<-chan error, error)

Start binds the listener and serves in a goroutine. Errors during accept are reported via the returned channel — callers should select on it alongside their own shutdown signal.
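
The "select on the error channel alongside a shutdown signal" pattern can be illustrated with a plain net/http server. This is a stand-in, not the daemon's Server: `start` and `serveUntil` are hypothetical helpers that imitate the documented contract (bind synchronously, serve in a goroutine, report serve errors on a channel).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// start binds addr, serves in a goroutine, and reports unexpected serve
// errors on the returned channel.
func start(srv *http.Server, addr string) (<-chan error, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	errCh := make(chan error, 1)
	go func() {
		if err := srv.Serve(ln); err != nil && !errors.Is(err, http.ErrServerClosed) {
			errCh <- err
		}
	}()
	return errCh, nil
}

// serveUntil runs a server until it fails or stop is closed.
func serveUntil(addr string, stop <-chan struct{}) error {
	srv := &http.Server{Handler: http.NotFoundHandler()}
	errCh, err := start(srv, addr)
	if err != nil {
		return err
	}
	select {
	case err := <-errCh:
		return err
	case <-stop:
		ctx, cancel := context.WithTimeout(context.Background(), time.Second)
		defer cancel()
		return srv.Shutdown(ctx)
	}
}

func main() {
	stop := make(chan struct{})
	time.AfterFunc(50*time.Millisecond, func() { close(stop) }) // stands in for a signal
	fmt.Println("exit:", serveUntil("127.0.0.1:0", stop))
}
```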

type SessionDetail added in v0.5.0

type SessionDetail struct {
	// SessionID is the platform session UUID. Always populated.
	SessionID string `json:"sessionId"`

	// IssueID is the Linear issue UUID this session was triggered for.
	IssueID string `json:"issueId,omitempty"`

	// IssueIdentifier is the human-readable Linear identifier
	// (e.g. "REN-1457").
	IssueIdentifier string `json:"issueIdentifier,omitempty"`

	// LinearSessionID is the Linear-side agent-session id.
	LinearSessionID string `json:"linearSessionId,omitempty"`

	// ProviderSessionID is the provider-native session id when this
	// is a resume (e.g. Claude session UUID).
	ProviderSessionID string `json:"providerSessionId,omitempty"`

	// ProjectName is the canonical Linear project identifier.
	ProjectName string `json:"projectName,omitempty"`

	// OrganizationID is the Rensei tenant UUID.
	OrganizationID string `json:"organizationId,omitempty"`

	// Repository is the git URL (or owner/name slug) the agent should
	// operate on.
	Repository string `json:"repository,omitempty"`

	// Ref is the base branch / ref to check out from.
	Ref string `json:"ref,omitempty"`

	// WorkType is the workflow discriminant ("development", "qa",
	// "research", ...).
	WorkType string `json:"workType,omitempty"`

	// PromptContext is the rendered Linear issue context block produced
	// by the platform-side dispatcher.
	PromptContext string `json:"promptContext,omitempty"`

	// Body is the raw Linear issue description text.
	Body string `json:"body,omitempty"`

	// Title is the Linear issue title.
	Title string `json:"title,omitempty"`

	// MentionContext is the optional user-mention text from the Linear
	// agent-session create event.
	MentionContext string `json:"mentionContext,omitempty"`

	// ParentContext is the optional parent-issue context block built
	// by the coordinator when this session is a sub-agent.
	ParentContext string `json:"parentContext,omitempty"`

	// Branch is the working branch the agent should create/use.
	Branch string `json:"branch,omitempty"`

	// ResolvedProfile carries the model-profile knobs the platform
	// resolved before queueing this work. Daemon stores opaquely.
	ResolvedProfile *SessionResolvedProfile `json:"resolvedProfile,omitempty"`

	// WorkerID is the daemon worker id that claimed this session.
	WorkerID string `json:"workerId,omitempty"`

	// AuthToken is the runtime JWT the runner uses for platform API
	// calls (heartbeat, result post). Scoped to this worker.
	AuthToken string `json:"authToken,omitempty"`

	// PlatformURL is the base URL of the platform.
	PlatformURL string `json:"platformUrl,omitempty"`

	// StagePrompt is the pre-rendered user-prompt body the platform
	// dispatcher built from the stage prompt template. When present
	// the runner uses it verbatim and skips the embedded user template.
	StagePrompt string `json:"stagePrompt,omitempty"`

	// StageID is the canonical stage id (e.g. "research",
	// "development", "qa"). Used for log correlation + env injection.
	StageID string `json:"stageId,omitempty"`

	// StageBudget is the per-stage runtime budget the runner enforces.
	StageBudget *PollStageBudget `json:"stageBudget,omitempty"`

	// StageLifecycle is the lifecycle config for the workflow this
	// stage instance belongs to. Forwarded opaquely on WORK_RESULT.
	StageLifecycle map[string]any `json:"stageLifecycle,omitempty"`

	// StageSourceEventID is the source CloudEvent id the stage trigger
	// normaliser emitted. Carried for end-to-end audit correlation.
	StageSourceEventID string `json:"stageSourceEventId,omitempty"`
}

SessionDetail is the per-session payload `af agent run` reads from the daemon's local control HTTP API on spawn. It carries the full runner-side QueuedWork shape (issue context, resolved profile, branch) plus the platform-side credentials the runner needs to talk back (auth token, platform URL, worker id).

The daemon stores one SessionDetail per accepted session in an in-memory map. A spawned `af agent run` process fetches its detail via GET /api/daemon/sessions/<id> at start-up.

Wire shape: JSON, camelCase tags. Forward-compat — new fields can be added freely; clients ignore unknown fields.
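
The forward-compatibility rule relies on Go's default JSON decoding, which skips keys that have no matching struct field. A small sketch, using a trimmed local copy of a few documented fields (`sessionDetail`, `decodeDetail`, and the `futureField` key are illustrative only):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sessionDetail is a trimmed local copy of a few documented fields.
type sessionDetail struct {
	SessionID   string `json:"sessionId"`
	Repository  string `json:"repository,omitempty"`
	PlatformURL string `json:"platformUrl,omitempty"`
}

// decodeDetail decodes a payload; json.Unmarshal ignores unknown keys
// unless a Decoder with DisallowUnknownFields is used.
func decodeDetail(raw []byte) (sessionDetail, error) {
	var d sessionDetail
	err := json.Unmarshal(raw, &d)
	return d, err
}

func main() {
	raw := []byte(`{"sessionId":"s-1","repository":"owner/name","futureField":42}`)
	d, err := decodeDetail(raw)
	fmt.Println(d.SessionID, d.Repository, err)
}
```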

type SessionEvent

type SessionEvent struct {
	Kind    SessionEventKind
	Handle  SessionHandle
	Spec    SessionSpec
	ExitErr error
}

SessionEvent is emitted on the spawner's events channel.

type SessionEventKind

type SessionEventKind string

SessionEventKind identifies the kind of SessionEvent.

const (
	SessionEventStarted SessionEventKind = "started"
	SessionEventEnded   SessionEventKind = "ended"
)

Session event kind constants.

type SessionHandle

type SessionHandle struct {
	SessionID  string       `json:"sessionId"`
	PID        int          `json:"pid"`
	AcceptedAt string       `json:"acceptedAt"`
	State      SessionState `json:"state"`
}

SessionHandle is the daemon-side handle for an in-flight session.

type SessionResolvedProfile added in v0.5.0

type SessionResolvedProfile struct {
	Provider       string         `json:"provider,omitempty"`
	Runner         string         `json:"runner,omitempty"`
	Model          string         `json:"model,omitempty"`
	Effort         string         `json:"effort,omitempty"`
	CredentialID   string         `json:"credentialId,omitempty"`
	ProviderConfig map[string]any `json:"providerConfig,omitempty"`
}

SessionResolvedProfile mirrors runner.ResolvedProfile but lives in the daemon package to avoid an import cycle (the daemon package must stay independent of the runner package — `af agent run` constructs its own runner from this opaque payload).

type SessionResources

type SessionResources struct {
	VCpu     int `json:"vCpu,omitempty"`
	MemoryMB int `json:"memoryMb,omitempty"`
}

SessionResources is the optional resource request on a SessionSpec.

type SessionSpec

type SessionSpec struct {
	SessionID          string            `json:"sessionId"`
	Repository         string            `json:"repository"`
	Ref                string            `json:"ref"`
	Resources          *SessionResources `json:"resources,omitempty"`
	Env                map[string]string `json:"env,omitempty"`
	MaxDurationSeconds int               `json:"maxDurationSeconds,omitempty"`
}

SessionSpec is an inbound work specification dispatched by the orchestrator. Subset of SandboxSpec from 004 relevant to the daemon's session-dispatch path.

type SessionState

type SessionState string

SessionState is the lifecycle of a single worker child process spawned for an accepted session.

const (
	SessionStarting   SessionState = "starting"
	SessionRunning    SessionState = "running"
	SessionCompleted  SessionState = "completed"
	SessionFailed     SessionState = "failed"
	SessionTerminated SessionState = "terminated"
)

Session state constants.

type SpawnerOptions

type SpawnerOptions struct {
	Projects              []ProjectConfig
	MaxConcurrentSessions int
	// WorkerCommand is the command to run for each accepted session. The
	// caller may pass arbitrary args; the session-specific environment is
	// added on top of os.Environ() at spawn time.
	//
	// When empty, a short-lived /bin/sh stub is used that prints
	// "session-started:<id>" and exits 0 — sufficient for testing the
	// daemon's accept/lifecycle path without launching real worker binaries.
	WorkerCommand []string
	// BaseEnv is the environment injected into every worker process.
	BaseEnv map[string]string

	// Now lets tests deterministically clock acceptedAt timestamps.
	Now func() time.Time
	// StdoutPrefixWriter is where worker stdout is forwarded with a
	// "[worker:<id>]" prefix. Defaults to os.Stdout. Set to io.Discard in tests.
	StdoutPrefixWriter PrefixedWriter
	StderrPrefixWriter PrefixedWriter
}

SpawnerOptions configure a WorkerSpawner.

type State

type State string

State is the lifecycle state of a Daemon instance.

const (
	StateStopped  State = "stopped"
	StateStarting State = "starting"
	StateRunning  State = "running"
	StatePaused   State = "paused"
	StateDraining State = "draining"
	StateUpdating State = "updating"
)

Lifecycle state constants.

type TrustConfig added in v0.7.1

type TrustConfig struct {
	// Mode is one of permissive | signed-by-allowlist | attested.
	// Empty defaults to permissive (set by applyDefaults).
	Mode TrustMode `yaml:"mode,omitempty" json:"mode,omitempty"`

	// IssuerSet is an OPTIONAL allowlist of OIDC subject identities
	// (Fulcio SAN) the operator considers trusted. Empty = trust any
	// signer the embedded trust root can validate (the bundle's chain
	// must still verify; this just skips the SAN allowlist filter).
	IssuerSet []string `yaml:"issuerSet,omitempty" json:"issuerSet,omitempty"`

	// Actor is the operator-declared identity used in the trustOverride
	// audit log entry. When empty the actor falls back to
	// fmt.Sprintf("uid:%d", os.Getuid()) per coordinator decision
	// Q-audit-2 (2026-05-07). The override is also timestamped and
	// names the kitId + signerId, so this field is best-effort.
	Actor string `yaml:"actor,omitempty" json:"actor,omitempty"`
}

TrustConfig is the daemon-wide trust policy. Lives on Config (NOT on KitConfig) per audit § 1.2: the trust mode applies across plugin families per 015-plugin-spec.md § "Auth + trust", not just kits.
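
The Actor fallback described in the field comment can be sketched as below. `resolveActor` is a hypothetical helper; only the `fmt.Sprintf("uid:%d", os.Getuid())` expression comes from the documentation.

```go
package main

import (
	"fmt"
	"os"
)

// resolveActor returns the operator-declared actor, or the documented
// uid-based fallback when none is configured. Hypothetical helper.
func resolveActor(configured string) string {
	if configured != "" {
		return configured
	}
	return fmt.Sprintf("uid:%d", os.Getuid())
}

func main() {
	fmt.Println(resolveActor("alice@example.com"))
	fmt.Println(resolveActor(""))
}
```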

type TrustMode added in v0.7.1

type TrustMode string

TrustMode is the operator-configured policy for how the install gate reacts to verifier outcomes.

const (
	// TrustModePermissive allows install regardless of verifier outcome.
	// The verifier still runs and the trust state is reported; this
	// matches OSS-execution-layer expectations vs the npm/pip/cargo
	// precedent. Default per Q2 of WAVE12_PLAN.
	TrustModePermissive TrustMode = "permissive"
	// TrustModeSignedByAllowlist rejects unsigned and unverified kits at
	// install time; verified-signed kits whose signer matches the
	// configured issuer set install normally.
	TrustModeSignedByAllowlist TrustMode = "signed-by-allowlist"
	// TrustModeAttested is allowlist + (future) SLSA attestation-graph
	// requirement. Wave 12 treats it as an alias for allowlist; the
	// attestation requirement lands in Wave 13+ alongside the SLSA
	// provenance parser.
	TrustModeAttested TrustMode = "attested"
)

Trust modes accepted on daemon.yaml `trust.mode`.
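
One way to read the three modes together with the IssuerSet rule is as a single install decision. The sketch below is an interpretation, not the package's gate: `allowInstall` and its parameters are hypothetical, `verified` stands for "the bundle chain validated against the trust root", and rejecting unrecognised modes is an assumption.

```go
package main

import "fmt"

type trustMode string

const (
	permissive        trustMode = "permissive"
	signedByAllowlist trustMode = "signed-by-allowlist"
	attested          trustMode = "attested"
)

// allowInstall is a hypothetical gate. An empty issuerSet skips the SAN
// allowlist filter, as documented on TrustConfig.IssuerSet.
func allowInstall(mode trustMode, verified bool, signer string, issuerSet []string) bool {
	switch mode {
	case permissive, "": // empty defaults to permissive
		return true
	case signedByAllowlist, attested: // attested is an alias in Wave 12
		if !verified {
			return false
		}
		if len(issuerSet) == 0 {
			return true
		}
		for _, id := range issuerSet {
			if id == signer {
				return true
			}
		}
		return false
	default:
		return false // assumption: unknown modes are rejected
	}
}

func main() {
	fmt.Println(allowInstall(permissive, false, "", nil))
	fmt.Println(allowInstall(signedByAllowlist, true, "ci@example.com", []string{"ci@example.com"}))
	fmt.Println(allowInstall(signedByAllowlist, true, "other@example.com", []string{"ci@example.com"}))
}
```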

type UpdateChannel

type UpdateChannel string

UpdateChannel is the release channel for the auto-updater.

const (
	ChannelStable UpdateChannel = "stable"
	ChannelBeta   UpdateChannel = "beta"
	ChannelMain   UpdateChannel = "main"
)

Update channel constants.

type UpdateResult

type UpdateResult struct {
	Updated bool   `json:"updated"`
	Version string `json:"version"`
	Reason  string `json:"reason"`
}

UpdateResult describes the outcome of a RunUpdate call.

type UpdateSchedule

type UpdateSchedule string

UpdateSchedule is the cadence the supervisor wakes the daemon to check.

const (
	ScheduleNightly   UpdateSchedule = "nightly"
	ScheduleOnRelease UpdateSchedule = "on-release"
	ScheduleManual    UpdateSchedule = "manual"
)

Update schedule constants.

type Updater

type Updater struct {
	// contains filtered or unexported fields
}

Updater runs the full update flow: check → fetch → verify → swap → restart.

func NewUpdater

func NewUpdater(opts UpdaterOptions) *Updater

NewUpdater returns an Updater with sane defaults.

func (*Updater) BuildBinaryURL

func (u *Updater) BuildBinaryURL(channel UpdateChannel, version string) string

BuildBinaryURL returns the binary URL for a channel/version.

func (*Updater) BuildManifestURL

func (u *Updater) BuildManifestURL(channel UpdateChannel) string

BuildManifestURL returns the manifest URL for a channel.

func (*Updater) BuildSignatureURL

func (u *Updater) BuildSignatureURL(binURL string) string

BuildSignatureURL returns the signature URL for a binary URL.

func (*Updater) CheckForUpdate

func (u *Updater) CheckForUpdate(ctx context.Context) (*VersionManifest, error)

CheckForUpdate fetches the version manifest and returns it iff a strictly newer version is available. Returns (nil, nil) when up-to-date.

func (*Updater) RunUpdate

func (u *Updater) RunUpdate(ctx context.Context) (*UpdateResult, error)

RunUpdate executes the complete update flow. When successful and SkipExit is false, it calls ExitFn(ExitCodeRestart) and does not return.

type UpdaterOptions

type UpdaterOptions struct {
	CurrentVersion    string
	CurrentBinaryPath string
	Config            AutoUpdateConfig

	// HTTPClient is the client used to fetch the manifest, binary, and
	// signature. Defaults to a 60s-timeout client.
	HTTPClient *http.Client
	// Verifier is the binary-signature verifier. Defaults to
	// alwaysFailVerifier (production-safe — no real swaps until configured).
	Verifier BinaryVerifier
	// SkipExit, when true, prevents the swap step from calling os.Exit. Used
	// by tests and by callers that want to handle the restart explicitly.
	SkipExit bool
	// ExitFn allows tests to inject a fake exit. Called only when SkipExit
	// is false. Defaults to os.Exit.
	ExitFn func(int)
	// CDNBase overrides UpdateCDNBase (test injection).
	CDNBase string
	// PlatformSuffix overrides the auto-detected suffix (test injection).
	PlatformSuffix string
}

UpdaterOptions configure an Updater.

type VersionManifest

type VersionManifest struct {
	Version    string `json:"version"`
	SHA256     string `json:"sha256"`
	ReleasedAt string `json:"releasedAt"`
}

VersionManifest is the schema of <channel>/latest.json.
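
If the SHA256 field holds the hex digest of the released binary (an assumption; the documentation does not spell out the encoding), a downloaded artifact could be checked against it roughly as follows. `digestOf` and `matchesManifest` are hypothetical helpers, and this check is separate from the signature verification performed by BinaryVerifier.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// digestOf returns the lowercase hex SHA-256 digest of data.
func digestOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// matchesManifest compares the digest of data with the manifest value,
// ignoring case and surrounding whitespace.
func matchesManifest(data []byte, manifestSHA256 string) bool {
	return strings.EqualFold(digestOf(data), strings.TrimSpace(manifestSHA256))
}

func main() {
	binary := []byte("example binary contents")
	want := digestOf(binary)
	fmt.Println(matchesManifest(binary, want))
	fmt.Println(matchesManifest([]byte("tampered"), want))
}
```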

type WizardOptions

type WizardOptions struct {
	// Existing is an existing config (if any) used as defaults.
	Existing *Config
	// ConfigPath is where to write the resulting config. Empty means do not
	// persist.
	ConfigPath string
	// Stdin is the TTY input. Defaults to os.Stdin.
	Stdin io.Reader
	// Stdout is where prompts are printed. Defaults to os.Stdout.
	Stdout io.Writer
	// IsTTY overrides the auto-detected TTY status. When false (and not
	// explicitly set true), the wizard returns the default config without
	// prompting.
	IsTTY *bool
	// SkipWizard, when true, returns DefaultConfig (or Existing) without
	// prompting. Mirrors the RENSEI_DAEMON_SKIP_WIZARD env var.
	SkipWizard bool
	// CPUCount overrides runtime.NumCPU() (test injection).
	CPUCount int
	// MemoryMB overrides total-memory detection (test injection). 0 means
	// "use a sensible default".
	MemoryMB int
	// DetectGitRemote returns the cwd's git remote URL or "" if none. Tests
	// inject a stub.
	DetectGitRemote func() string
}

WizardOptions configure the interactive setup wizard.

type WorkareaArchiveOptions added in v0.7.0

type WorkareaArchiveOptions struct {
	// Root is the directory the registry scans. Empty selects the
	// default ~/.rensei/workareas.
	Root string
	// ActiveProvider is the live pool view; may be nil (archives-only
	// list, see ActiveWorkareaProvider).
	ActiveProvider ActiveWorkareaProvider
	// PoolGuard is consulted on Restore. May be nil — restore proceeds
	// without a saturation check.
	PoolGuard PoolCapacityGuard
}

WorkareaArchiveOptions configures a registry.

type WorkareaArchiveRegistry added in v0.7.0

type WorkareaArchiveRegistry struct {
	// contains filtered or unexported fields
}

WorkareaArchiveRegistry is the on-disk archive index. Construct via NewWorkareaArchiveRegistry. Methods are safe for concurrent use.

func NewWorkareaArchiveRegistry added in v0.7.0

func NewWorkareaArchiveRegistry(opts WorkareaArchiveOptions) *WorkareaArchiveRegistry

NewWorkareaArchiveRegistry constructs a registry against the given archive root. The directory is NOT created at construction time — missing-or-empty roots return an empty list (HTTP 200) per ADR D4a.

func (*WorkareaArchiveRegistry) CountDiff added in v0.7.0

func (r *WorkareaArchiveRegistry) CountDiff(idA, idB string) (int, error)

CountDiff returns the number of differing entries between two archives without buffering or streaming them. The handler uses this to pick JSON vs NDJSON before opening the response stream.

func (*WorkareaArchiveRegistry) Diff added in v0.7.0

Diff returns the structured per-path delta between two archives. Both ids MUST resolve to archives (live diffs are out of scope per ADR D4a). Walks are deterministic — entries are sorted by path. The well-known .rensei/ subtree under each archive's tree/ root is excluded.

func (*WorkareaArchiveRegistry) DiffStream added in v0.7.0

DiffStream emits diff entries through the supplied callback as they are computed. The callback receives one entry at a time; if it returns a non-nil error the walk halts and the error is returned. After all entries are emitted DiffStream returns the aggregate summary so callers can write the trailing NDJSON line.

The streaming variant exists so the HTTP handler can switch its Content-Type on entry count without buffering the entire diff.

func (*WorkareaArchiveRegistry) Get added in v0.7.0

Get returns the full archive record for the named id. The Workarea Kind field is set to WorkareaKindArchived. Returns ErrArchiveNotFound when the id is absent.

func (*WorkareaArchiveRegistry) List added in v0.7.0

func (r *WorkareaArchiveRegistry) List() (active, archived []afclient.WorkareaSummary, err error)

List walks the archive root and returns the union of on-disk archives (ordered deterministically by id) and the active pool members reported by the configured ActiveWorkareaProvider, if any. A missing or empty root is NOT an error; the response is simply (empty active + empty archived).

func (*WorkareaArchiveRegistry) Restore added in v0.7.0

Restore materialises an archive into a fresh active pool member. The returned Workarea has Kind=Active and a NEW id distinct from the archive id (archives are immutable per ADR D4a). The tree/ subtree is copied to a per-restore directory under the archive root's sibling "restored/" so operators can find the materialised state from the daemon's host filesystem.

IntoSessionID conflicts return ErrConflict; saturation returns ErrUnavailable + a non-zero retryAfter; corrupted archives return ErrArchiveCorrupted; missing archives return ErrArchiveNotFound.

func (*WorkareaArchiveRegistry) Root added in v0.7.0

func (r *WorkareaArchiveRegistry) Root() string

Root returns the archive root directory the registry scans. Exposed for tests and operators surfacing the path.

type WorkareaConfig added in v0.7.0

type WorkareaConfig struct {
	// ArchiveRoot is the directory the daemon scans for archived workareas.
	// Default ~/.rensei/workareas (resolved at runtime by the handler if
	// empty).
	ArchiveRoot string `yaml:"archiveRoot,omitempty" json:"archiveRoot,omitempty"`
	// DiffStreamingThreshold is the entry count above which the diff
	// endpoint switches from a single JSON envelope to NDJSON streaming.
	// Default 1000 per ADR D4a.
	DiffStreamingThreshold int `yaml:"diffStreamingThreshold,omitempty" json:"diffStreamingThreshold,omitempty"`
}

WorkareaConfig configures the Layer-3 workarea operator surface — archive root scan path, diff streaming threshold. Wave 9 / ADR-2026-05-07.
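
The threshold rule can be sketched as a simple comparison. `useNDJSON` is a hypothetical helper; treating an unset (zero) threshold as the documented default of 1000, and "above" as strictly greater, are assumptions about the handler's behaviour.

```go
package main

import "fmt"

const defaultDiffThreshold = 1000 // documented default per ADR D4a

// useNDJSON reports whether a diff with entryCount entries should be
// streamed as NDJSON instead of returned as a single JSON envelope.
func useNDJSON(entryCount, threshold int) bool {
	if threshold <= 0 {
		threshold = defaultDiffThreshold // assumption: unset means default
	}
	return entryCount > threshold
}

func main() {
	fmt.Println(useNDJSON(1000, 0), useNDJSON(1001, 0), useNDJSON(50, 10))
}
```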

type WorkerSpawner

type WorkerSpawner struct {
	// contains filtered or unexported fields
}

WorkerSpawner manages the lifecycle of worker child processes.

func NewWorkerSpawner

func NewWorkerSpawner(opts SpawnerOptions) *WorkerSpawner

NewWorkerSpawner constructs a spawner. Workers will not be spawned until AcceptWork is called.

func (*WorkerSpawner) AcceptWork

func (s *WorkerSpawner) AcceptWork(spec SessionSpec) (*SessionHandle, error)

AcceptWork validates the spec, spawns a worker, and returns its handle.

func (*WorkerSpawner) ActiveCount

func (s *WorkerSpawner) ActiveCount() int

ActiveCount returns the number of in-flight sessions.

func (*WorkerSpawner) ActiveSessions

func (s *WorkerSpawner) ActiveSessions() []SessionHandle

ActiveSessions returns a snapshot of the current session handles.

func (*WorkerSpawner) ActiveWorkareas added in v0.7.1

func (s *WorkerSpawner) ActiveWorkareas() []afclient.WorkareaSummary

ActiveWorkareas projects the spawner's in-flight sessions onto the canonical afclient.WorkareaSummary wire shape so the WorkareaArchiveRegistry can union live-pool members with on-disk archives in the GET /api/daemon/workareas response (Wave 11 / S5; ADR-2026-05-07-daemon-http-control-api.md §D4a).

The projection is pull-based — the spawner holds no separate workarea map; each call materialises summaries from the live `sessions` map under the same `mu` lock that ActiveSessions uses. ProjectID is resolved via the project allowlist using the same matcher AcceptWork applies. The summary's ID is the spawner's session id so /api/daemon/workareas/<id> reaches the live entry.

Output is sorted by SessionID for deterministic test assertions.

func (*WorkerSpawner) Drain

func (s *WorkerSpawner) Drain(timeout time.Duration) error

Drain waits for all in-flight sessions to exit, then returns nil. After timeout, remaining sessions receive SIGTERM via context cancellation and the function returns an error indicating how many were forcibly stopped.
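
A wait-with-timeout drain of this shape is commonly built from a WaitGroup, a done channel, and a cancellable context. The sketch below simulates workers with goroutines rather than child processes; `pool`, `spawn`, and `drain` are hypothetical names and the real spawner's internals may differ.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// pool is a hypothetical stand-in for the spawner's session bookkeeping.
type pool struct {
	mu     sync.Mutex
	active int
	wg     sync.WaitGroup
	ctx    context.Context
	cancel context.CancelFunc
}

func newPool() *pool {
	ctx, cancel := context.WithCancel(context.Background())
	return &pool{ctx: ctx, cancel: cancel}
}

// spawn simulates a worker that exits after d or when the context is cancelled.
func (p *pool) spawn(d time.Duration) {
	p.mu.Lock()
	p.active++
	p.mu.Unlock()
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		select {
		case <-time.After(d):
		case <-p.ctx.Done():
		}
		p.mu.Lock()
		p.active--
		p.mu.Unlock()
	}()
}

// drain waits up to timeout, then cancels the stragglers and reports how many.
func (p *pool) drain(timeout time.Duration) error {
	done := make(chan struct{})
	go func() {
		p.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return nil
	case <-time.After(timeout):
		p.mu.Lock()
		n := p.active
		p.mu.Unlock()
		p.cancel()
		<-done // wait for cancelled workers to exit
		return fmt.Errorf("drain: %d session(s) forcibly stopped", n)
	}
}

func main() {
	p := newPool()
	p.spawn(10 * time.Millisecond)
	fmt.Println(p.drain(time.Second))
}
```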

func (*WorkerSpawner) IsAccepting

func (s *WorkerSpawner) IsAccepting() bool

IsAccepting reports whether the spawner is currently accepting work.

func (*WorkerSpawner) On

func (s *WorkerSpawner) On(fn func(SessionEvent))

On registers a session-event listener. Listeners are invoked synchronously from the spawner goroutine; do not block them.

func (*WorkerSpawner) Pause

func (s *WorkerSpawner) Pause()

Pause stops accepting new work but leaves running sessions alive.

func (*WorkerSpawner) Resume

func (s *WorkerSpawner) Resume()

Resume restores accepting state.

func (*WorkerSpawner) SetMaxConcurrentSessions added in v0.6.0

func (s *WorkerSpawner) SetMaxConcurrentSessions(n int) error

SetMaxConcurrentSessions updates the local session capacity used for future AcceptWork decisions. Existing sessions are never interrupted.
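
Taken together, Pause, Resume, and SetMaxConcurrentSessions describe an admission gate: new work is accepted only while the spawner is accepting and below capacity, and lowering the capacity never interrupts running sessions. A minimal sketch under those assumptions follows; `gate` and its methods are hypothetical, and rejecting capacities below 1 is a guess at why the real method returns an error.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// gate is a hypothetical admission gate, not the spawner's real state.
type gate struct {
	mu        sync.Mutex
	accepting bool
	max       int
	active    int
}

func newGate(max int) *gate { return &gate{accepting: true, max: max} }

func (g *gate) pause() {
	g.mu.Lock()
	g.accepting = false
	g.mu.Unlock()
}

func (g *gate) resume() {
	g.mu.Lock()
	g.accepting = true
	g.mu.Unlock()
}

// setMax changes capacity for future decisions only.
func (g *gate) setMax(n int) error {
	if n < 1 {
		return errors.New("capacity must be at least 1") // assumption
	}
	g.mu.Lock()
	g.max = n
	g.mu.Unlock()
	return nil
}

// tryAccept admits one session if the gate is open and below capacity.
func (g *gate) tryAccept() bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !g.accepting || g.active >= g.max {
		return false
	}
	g.active++
	return true
}

func main() {
	g := newGate(1)
	fmt.Println(g.tryAccept(), g.tryAccept())
	_ = g.setMax(2)
	fmt.Println(g.tryAccept())
}
```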
