daemon

package
v0.9.5
Published: May 28, 2026 License: MIT Imports: 37 Imported by: 0

README

daemon/ — long-running rensei-daemon runtime

Status: Wave 6 / Phase F.2.8 (REN-1461). Public package; the af daemon … CLI surface is in afcli/daemon.go. Architecture: rensei-architecture/004-sandbox-capability-matrix.md §Local daemon mode + 011-local-daemon-fleet.md.

The daemon is a single-machine, multi-project supervisor that:

  1. Registers itself with the platform (/api/workers/register) and exchanges a one-time rsp_live_* token for a scoped runtime JWT.
  2. Sends a periodic heartbeat (/api/workers/<id>/heartbeat) and polls for queued work (/api/workers/<id>/poll).
  3. Accepts inbound SessionSpec payloads and spawns a worker child process per accepted session.
  4. Exposes a localhost-only HTTP control API on 127.0.0.1:7734 for the af CLI and for the spawned worker children themselves.
  5. Optionally self-updates by drain → fetch → verify → swap → restart.

Spawn flow (F.2.8)

        ┌────────────────────┐
        │ platform.poll()    │  GET /api/workers/<id>/poll
        └──────────┬─────────┘
                   │ work[] item
                   ▼
        ┌────────────────────┐
        │ Daemon.AcceptWork  │
        │   WithDetail()     │
        └──────────┬─────────┘
                   │ stores SessionDetail
                   ▼
        ┌────────────────────┐
        │ WorkerSpawner.spawn│  exec.CommandContext(<af>, "agent", "run")
        │                    │  env: RENSEI_SESSION_ID=<id>,
        │                    │       RENSEI_REPOSITORY=<repo>, …
        └──────────┬─────────┘
                   │
                   ▼
        ┌────────────────────┐
        │ af agent run       │  GET 127.0.0.1:7734/api/daemon/sessions/<id>
        │   (afcli/agent_run)│  → SessionDetail with QueuedWork shape +
        │                    │     AuthToken + PlatformURL + WorkerID
        └──────────┬─────────┘
                   │ runner.Run(ctx, qw)
                   ▼
        ┌────────────────────┐
        │ runner orchestrator│  worktree → spawn provider → events →
        │                    │  tail recovery → result.Post
        └────────────────────┘

The af binary registered by daemon install doubles as both the daemon supervisor (af daemon run) and the per-session worker (af agent run) — the same binary, different subcommands. The WorkerCommand defaults to [<self-exe>, "agent", "run"] resolved via os.Executable(); operators rarely override this.

SessionDetail lifecycle

  • Set: Daemon.AcceptWorkWithDetail records the detail in an in-memory map keyed by session id when the poll loop dispatches a work item.
  • Read: GET /api/daemon/sessions/<id> (handled by daemon/server.go::handleSessionDetail) returns the JSON payload to the spawned af agent run worker.
  • Delete: the spawner emits SessionEventEnded when the worker child process exits; the daemon's listener removes the entry from the store so stale auth tokens do not linger in memory.
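
The set/read/delete lifecycle above amounts to a mutex-guarded map keyed by session id. The sketch below assumes a simplified `sessionDetail` type and a hypothetical `detailStore`; the real store's names and fields may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// sessionDetail is a hypothetical, trimmed stand-in for SessionDetail.
type sessionDetail struct {
	SessionID string
	AuthToken string
}

// detailStore is an in-memory map guarded by a read/write mutex.
type detailStore struct {
	mu sync.RWMutex
	m  map[string]sessionDetail
}

func newDetailStore() *detailStore {
	return &detailStore{m: make(map[string]sessionDetail)}
}

// Set records the detail when a work item is accepted.
func (s *detailStore) Set(d sessionDetail) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[d.SessionID] = d
}

// Get serves the worker's lookup; ok is false once the entry is gone.
func (s *detailStore) Get(id string) (sessionDetail, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	d, ok := s.m[id]
	return d, ok
}

// Delete drops the entry (and its auth token) when the session ends.
func (s *detailStore) Delete(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.m, id)
}

func main() {
	store := newDetailStore()
	store.Set(sessionDetail{SessionID: "sess-1", AuthToken: "token"})
	_, ok := store.Get("sess-1")
	fmt.Println("present after Set:", ok)
	store.Delete("sess-1")
	_, ok = store.Get("sess-1")
	fmt.Println("present after Delete:", ok)
}
```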

Repository URL resolution (REN-1464 / v0.5.2)

SessionDetail.repository is resolved from the daemon.yaml project allowlist by pollItemToSessionDetail (in poll.go). The runner uses this URL for git clone.

The platform's QueuedWork wire shape historically carries a projectName slug (e.g. "smoke-alpha") with no separate repository URL — slugs are not clonable. When the poll item arrives the daemon runs the same matcher as WorkerSpawner.findProjectLocked (REN-1448): by id, by repository, or by URL-suffix. The matching entry's repository field is substituted into SessionDetail.repository, and the canonical id is mirrored back into SessionDetail.projectName so downstream code that reads RENSEI_PROJECT_ID sees a stable value.

If no allowlist entry matches, the daemon falls back to whatever the platform sent (preserving prior behaviour) and emits a Warn log, "no allowlist match for projectName, falling back to as-given repo string", so the misconfiguration is visible. Downstream, WorkerSpawner.AcceptWork will then reject the spec with "repository ... is not in the project allowlist", but the explicit log makes the resolution-time failure observable separately from the spawn-time rejection.
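
The resolution and fallback rules above can be sketched as below. The `project` type, `findProject`, and `resolve` are hypothetical names, and the URL-suffix rule is an assumption (the repository URL, with any trailing ".git" removed, ends in "/" plus the incoming name); the real matcher may define the suffix match differently.

```go
package main

import (
	"fmt"
	"strings"
)

// project is a hypothetical stand-in for one daemon.yaml allowlist entry.
type project struct {
	ID         string
	Repository string
}

// findProject tries an exact id or repository match first, then the
// assumed URL-suffix rule.
func findProject(allow []project, name string) (project, bool) {
	for _, p := range allow {
		if p.ID == name || p.Repository == name {
			return p, true
		}
	}
	for _, p := range allow {
		trimmed := strings.TrimSuffix(p.Repository, ".git")
		if strings.HasSuffix(trimmed, "/"+name) {
			return p, true
		}
	}
	return project{}, false
}

// resolve substitutes the clonable repository and canonical id on a match,
// and otherwise passes the platform-provided string through unchanged.
func resolve(allow []project, name string) (repo, projectName string, matched bool) {
	if p, ok := findProject(allow, name); ok {
		return p.Repository, p.ID, true
	}
	return name, name, false
}

func main() {
	allow := []project{{ID: "smoke-alpha", Repository: "https://github.com/acme/smoke-alpha.git"}}
	repo, id, ok := resolve(allow, "smoke-alpha")
	fmt.Println(repo, id, ok)
	repo, id, ok = resolve(allow, "unknown-slug")
	fmt.Println(repo, id, ok)
}
```

A caller would log at Warn level when `matched` is false, which is the resolution-time signal the paragraph above describes.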

HTTP control API

Localhost-only (binds 127.0.0.1). Endpoints:

Method + Path                        Purpose
GET    /api/daemon/status            Daemon lifecycle state, version, uptime, sessions
GET    /api/daemon/stats             Capacity envelope, worker stats, allowed projects
POST   /api/daemon/pause             Stop accepting new work
POST   /api/daemon/resume            Resume accepting work
POST   /api/daemon/stop              Graceful stop
POST   /api/daemon/drain             Drain in-flight work
POST   /api/daemon/update            Trigger manual update check
POST   /api/daemon/capacity          Update a config key (e.g. capacity.poolMaxDiskGb)
GET    /api/daemon/pool/stats        Workarea pool snapshot
POST   /api/daemon/pool/evict        Evict pool members
GET    /api/daemon/sessions          List active session handles
POST   /api/daemon/sessions          Accept a session (test entrypoint)
GET    /api/daemon/sessions/<id>     F.2.8: per-session detail for the spawned worker
GET    /api/daemon/heartbeat         Most-recent heartbeat payload
GET    /api/daemon/doctor            Aggregated health snapshot
GET    /healthz                      Liveness probe
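
As an illustration of calling the control API from Go, the sketch below fetches /api/daemon/status with a short timeout and prints the raw body. The `daemonURL` and `getStatus` helpers are hypothetical, and the response is left undecoded because its JSON shape is not specified in this section.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

const (
	daemonHost = "127.0.0.1" // documented bind address
	daemonPort = 7734        // documented default port
)

// daemonURL builds an absolute URL for a control-API path.
func daemonURL(path string) string {
	return fmt.Sprintf("http://%s:%d%s", daemonHost, daemonPort, path)
}

// getStatus returns the raw status payload, or an error when the daemon
// is unreachable or answers with a non-200 status.
func getStatus(ctx context.Context, client *http.Client) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, daemonURL("/api/daemon/status"), nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	body, err := getStatus(ctx, http.DefaultClient)
	if err != nil {
		fmt.Fprintln(os.Stderr, "daemon not reachable:", err)
		return
	}
	fmt.Println(string(body))
}
```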

Operator runbook — debugging a stuck session

When a session appears wedged in the dashboard:

  1. Daemon log: af daemon logs --follow (default ~/.rensei/daemon.log). Look for the worker spawner lines showing pid=… and the matching [child stdout sessionID=<id>] (INFO) and [child stderr sessionID=<id>] (WARN) records from the spawned af agent run worker. Spawn output is wired to slog by default as of v0.5.1 (REN-1463); earlier daemons drained child stdio silently.
  2. Session detail: curl http://127.0.0.1:7734/api/daemon/sessions/<id> to confirm the detail is recorded. A 404 here means either that the daemon never accepted the work (look for poll errors in the daemon log) or that the session has already terminated and been cleaned up.
  3. af agent run log: the worker child writes its own slog output to stderr. The daemon's spawner captures both streams under [child stdout|stderr sessionID=<id>]; the same lines appear inline in af daemon logs and in the platform's session-activity stream.
  4. Provider logs: when the runner reaches step 8 (spawn provider), the per-provider subprocess is the next layer (claude JSONL on stdout, codex JSON-RPC over stdio). The provider package's README explains how to capture those streams (PROVIDER_DEBUG=1 for claude, CODEX_LOG_LEVEL=debug for codex).
  5. Platform-side state: curl http://app.rensei.ai/api/sessions/<id> (with bearer auth) to confirm the platform sees the session in the expected state. A divergence between the daemon's view (still active) and the platform's view (already terminal) usually indicates a missed result.Post; re-run af daemon stats to see whether the poller has retried.
  6. Worktree state: ~/.rensei/worktrees/<sessionId>/.agent/ contains the per-session state.json snapshot and the events.jsonl audit log. Look here when the agent emitted no visible output but the session is marked failed.

Failure modes the daemon classifies (high-level)

Symptom                                          Where it surfaces
WorkerCommand falls through to /bin/sh stub      worker spawner warn line in daemon log
Daemon HTTP unreachable from worker child        af agent run preflight error, exit code 2
Session detail expired between fetch attempts    af agent run preflight error, exit code 2
Provider probe failed at runner startup          af agent run Warn log "claude provider unavailable";
                                                 falls through to stub if the session asked for stub,
                                                 otherwise the runner's Resolve fails with
                                                 FailureProviderResolve
Worker child exited with non-zero                SessionEventEnded with ExitErr non-nil; daemon emits
                                                 the failure to its log

See runner/README.md for the runner-level failure-mode table that the daemon receives via result.Post payloads.

Tests

# Unit + smoke
go test -race ./daemon/...

# F.2.8 wire-path integration test (requires git on PATH)
go test -tags=f28_integration ./afcli/...

Documentation

Overview

Package daemon handle_capabilities.go — HTTP handler for the GET /api/daemon/capabilities endpoint.

Exposes the local daemon's detected substrate capabilities to clients (rensei-tui, debugging tools, CI smoke tests). The response shape mirrors the provides[] array sent to POST /api/workers/register so consumers can verify what was advertised without re-detecting.

Architecture reference:

rensei-architecture/ADR-2026-05-12-capacity-pools-and-substrate-resolution.md
§ Stream H sub-lane — agentfactory-tui daemon pool awareness

Package daemon handle_kit.go — HTTP handlers for the /api/daemon/kits* and /api/daemon/kit-sources* surfaces.

Wave-9 A2 — see ADR-2026-05-07-daemon-http-control-api.md § D1 for the canonical route list. Path-prefix dispatch follows the same pattern used by handleSessionDetail in server.go.

Package daemon handle_provider.go — HTTP handlers for the /api/daemon/providers* operator surface. Wave 9 / A1.

The handlers expose the daemon's in-process AgentRuntime registry (claude/codex/ollama/opencode/gemini/amp/stub) as JSON. The remaining seven Provider Families (Sandbox, Workarea, VCS, IssueTracker, Deployment, AgentRegistry, Kit) return empty until per-family registries land in a future wave. The endpoint MUST emit PartialCoverage=true and CoveredFamilies=["agent-runtime"] so consumers render the "other families coming" caveat without sniffing for emptiness — see ADR-2026-05-07-daemon-http-control-api.md §D4.

Package daemon handle_routing.go — HTTP handlers for the /api/daemon/routing/* operator surface. Wave 9 / A4.

The handlers expose the daemon's RoutingTraceStore as JSON. The wire shape is locked in rensei-architecture/ADR-2026-05-07-daemon-http-control-api.md §D4 and matches the surfaces the SaaS dashboard's Routing Intelligence panel (REN-205) consumes, so the same renderer composes both.

Read-only this wave. The /config endpoint surfaces the static scheduler configuration (weights, capability filters, sandbox/LLM provider state) plus the rolling tail of recent decisions; the /explain/<sessionID> endpoint returns the full per-session decision trace.

Package daemon handle_workarea.go — HTTP handlers for the /api/daemon/workareas* operator surface.

Wave 9 / Track A3 / ADR-2026-05-07-daemon-http-control-api.md §D4a.

Routes:

GET    /api/daemon/workareas                            list active + archived
GET    /api/daemon/workareas/<id>                       inspect (active or archived)
POST   /api/daemon/workareas/<archiveID>/restore        201 on success
GET    /api/daemon/workareas/<idA>/diff/<idB>           JSON or NDJSON

The streaming-NDJSON variant on /diff/ kicks in when the entry count exceeds the daemon's configured workarea.diffStreamingThreshold (default 1000 per ADR D4a). Below that, the response is a single WorkareaDiffEnvelope JSON object.

Package daemon kit_install_git.go — git-source kit fetcher (Wave 12 / Theme C / S3).

The fetcher clones the operator-provided git URL into a temp directory, locates the kit manifest (and its sibling `.sigstore` bundle, when present), and exposes both as on-disk paths so KitRegistry.Install can run the trust-gated verifier against the freshly-fetched material before persisting it into kit.scanPaths[0].

Design notes

  • Uses go-git/v5 (pure-Go) so the daemon does not depend on a `git` binary on the operator's PATH. Public-host or file:// URLs are both accepted; tests rely on file:// fixtures.
  • When KitInstallSource.ManifestPath is empty the fetcher walks the cloned tree for *.kit.toml files and selects the first one that parses cleanly. This matches the audit § 2.1 step 3 contract: "walk repo for *.kit.toml, pick the first; multi-manifest support is a Wave 13+ extension per 005-kit-manifest-spec.md".
  • Caller MUST defer the returned cleanup func; the temp tree is persisted only long enough for the registry to copy what it needs into the configured scanPath.

Errors

  • ErrKitInstallSourceFetchFailed — clone failed (network, auth, ref not found, etc.). Wrapped with the underlying go-git error.
  • ErrKitInstallManifestNotFound — clone succeeded but no usable `*.kit.toml` exists at the configured ManifestPath (or anywhere in the tree when ManifestPath was empty).

Package daemon kit_registry.go — minimal in-process Kit registry that scans the filesystem for installed kit manifests and exposes them via the operator control API.

This is the OSS-execution-layer's "Local manifests" registry source from the federation list in 005-kit-manifest-spec.md § "Registry sources" (item 1). Other registry sources (bundled, rensei, tessl, agentskills, community) are not implemented in this wave; the /api/daemon/kit-sources endpoint returns a static descriptor list surfacing the federation order.

Scan path defaults to ~/.rensei/kits/*.kit.toml. Multiple paths may be declared via daemon.yaml's optional `kit.scanPaths` override.

Behaviour:

  • Empty registry (no scan path entries, no .kit.toml files) → empty list, HTTP 200.
  • Malformed manifests log a warning via slog and are excluded from the listing rather than failing the whole request.
  • Enable/disable state is persisted to a sidecar file at ~/.rensei/kits/.state.json so toggle outcomes survive daemon restarts. The file is created on first toggle.
  • Install is currently a stub returning ErrKitInstallUnimplemented; fetching kits from a remote registry is deferred until the federation sources land.
  • Verify-signature returns KitTrustUnsigned for all kits in this wave (signing is partially implemented per the ADR caveat).

Package daemon kit_trust.go — sigstore bundle-mode kit signature verifier (Wave 12 / Theme C / S2).

The verifier consumes a sibling `<manifest>.sigstore` file (Q1 of WAVE12_PLAN — "bundle file shape: sibling .sigstore"), validates it against the configured trust root, and reports back a populated afclient.KitSignatureResult. Three trust outcomes:

  • KitTrustSignedVerified — bundle present and verifies against the trust root + issuer set.
  • KitTrustSignedUnverified — bundle present but verification failed (tampered manifest, untrusted issuer, expired chain, etc.).
  • KitTrustUnsigned — no sibling .sigstore file exists.

At install time the verifier outcome maps to a trust gate. The gate runs in the registry's Install path, NOT here — see the ErrKitTrustGateRejected sentinel in kit_registry.go and the trustOverride: "allowed-this-once" handling per audit § 1.3 / § 2.2.

Trust modes (§ "Signing and trust" in 002-provider-base-contract.md):

  • permissive — verifier still runs and reports state, but never blocks Install. OSS default per Q2 of WAVE12_PLAN.
  • signed-by-allowlist — Install rejects KitTrustUnsigned and KitTrustSignedUnverified.
  • attested — same as allowlist for Wave 12 (the SLSA attestation graph hookup lands in Wave 13+).
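
Combining the three outcomes with the three modes gives a small decision function. The sketch below is illustrative only: the type names and the outcome string values are hypothetical, and the single-request "allowed-this-once" override is modelled as a plain string argument.

```go
package main

import "fmt"

type trustMode string
type trustOutcome string

const (
	modePermissive trustMode = "permissive"
	modeAllowlist  trustMode = "signed-by-allowlist"
	modeAttested   trustMode = "attested"

	// Hypothetical string values; the real constants live in afclient.
	outcomeVerified   trustOutcome = "signed-verified"
	outcomeUnverified trustOutcome = "signed-unverified"
	outcomeUnsigned   trustOutcome = "unsigned"
)

// gateAllows reports whether Install may proceed. Permissive never blocks;
// the stricter modes require a verified signature unless the request
// carries the one-shot override.
func gateAllows(mode trustMode, outcome trustOutcome, override string) bool {
	if mode == modePermissive {
		return true
	}
	if outcome == outcomeVerified {
		return true
	}
	return override == "allowed-this-once"
}

func main() {
	fmt.Println(gateAllows(modePermissive, outcomeUnsigned, ""))
	fmt.Println(gateAllows(modeAllowlist, outcomeUnsigned, ""))
	fmt.Println(gateAllows(modeAttested, outcomeUnverified, "allowed-this-once"))
}
```

Treating any unrecognised mode as strict is a deliberate choice in this sketch, so that a typo in configuration fails closed.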

The embedded trust root is the public Sigstore production trust root (https://raw.githubusercontent.com/sigstore/sigstore-go/main/examples/trusted-root-public-good.json). It will be replaced with a Rensei-published trust root once the productionized signing CI from REN-1344 emits a Rensei-signed Fulcio + Rekor cert chain (Wave 13+ work).

Q-audit-2 resolution (taken 2026-05-07 by /loop coordinator): trust-actor lookup falls back to os.Getuid() when daemon.yaml's `trust.actor` is empty. The trustOverride audit log is best-effort identification; the override is still timestamped and key fields (kitId, signerId) are always populated.

Package daemon routing_state.go — in-process routing trace store and configuration projector for the /api/daemon/routing/* surface (Wave 9 / A4).

The OSS daemon does not yet ship a real cross-provider scheduler in production. The store therefore defines the shape the eventual scheduler will record decisions through, and the read paths used by the HTTP handlers in handle_routing.go.

See ADR-2026-05-07-daemon-http-control-api.md §D4 for the wire contract, 004-sandbox-capability-matrix.md for the cross-provider scheduler model, and the forward reference at /api/daemon/routing/explain/<sessionID> in the same doc.

Package daemon implements the long-running rensei-daemon runtime in Go.

The daemon is a single-machine, multi-project supervisor that:

  • Registers itself with the orchestrator (dial-out) and exchanges a one-time rsp_live_* token for a scoped JWT.
  • Sends a periodic heartbeat to the orchestrator.
  • Accepts inbound work specs (sessions) and spawns worker child processes.
  • Exposes an HTTP control API on 127.0.0.1:7734 for the af / rensei CLI.
  • Optionally self-updates by drain → fetch → verify → swap → restart.

Architecture reference:

rensei-architecture/004-sandbox-capability-matrix.md §Local daemon mode
rensei-architecture/011-local-daemon-fleet.md

This is the public package surface — downstream binaries can import it directly to embed the daemon runtime under their own command tree. The afcli package re-exports the runtime as the `daemon run` subcommand.

This package is the Go port of agentfactory/packages/daemon/src (REN-1408). The TS package @renseiai/daemon is deprecated; final removal is scheduled for cycle 6 after the smoke harness has soaked for 7 nights.

Package daemon workarea_archive.go — on-disk workarea archive registry powering the Layer-3 workarea operator surface.

Wave 9 / Track A3 / ADR-2026-05-07-daemon-http-control-api.md §D4a.

Archive layout. Each archive is a directory under the daemon's archive root (default ~/.rensei/workareas/<archiveID>/) containing:

manifest.json   — metadata sidecar (id, sessionId, createdAt,
                  sizeBytes, sourceProvider, capabilities,
                  disposition); free-form extra fields permitted.
tree/           — the workarea filesystem snapshot. Diffs and
                  restores walk this subtree only; everything outside
                  it (manifest.json, daemon-private bookkeeping) is
                  ignored. The well-known .rensei/ directory under
                  tree/ is also excluded from diff walks per ADR D4a.

The registry is stateless w.r.t. process lifecycle — every call hits disk. That's fine: archive directories are small in count (operator scale), the OS dentry cache absorbs repeated listings, and avoiding in-memory state means the daemon never serves a stale view after an out-of-band write to ~/.rensei/workareas/.

Constants

const CapacityRefreshInterval = 60 * time.Second

CapacityRefreshInterval is how often the daemon re-emits its capacity snapshot. Mirrors the TS CAPACITY_REFRESH_INTERVAL_MS = 60_000.

const DefaultHTTPHost = "127.0.0.1"

DefaultHTTPHost is the bind address for the control HTTP server.

const DefaultHTTPPort = 7734

DefaultHTTPPort is the port the daemon's control HTTP server binds to. Keep in sync with afclient.DefaultDaemonConfig (port 7734).

const DefaultRoutingRingBufferSize = 50

DefaultRoutingRingBufferSize is the maximum number of recent routing decisions retained for the GetConfig view. The explain endpoint key is per-session and bounded by the same ring — a session whose decision has fallen out of the ring returns 404.
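
A fixed-size ring explains why an old session's explain lookup eventually misses. The sketch below uses hypothetical `decision` and `ring` types and is not safe for concurrent use; a real store would add locking.

```go
package main

import "fmt"

// decision is a hypothetical, trimmed routing-decision record.
type decision struct {
	SessionID string
	Provider  string
}

// ring keeps the most recent len(buf) decisions, overwriting the oldest.
type ring struct {
	buf  []decision
	next int
	full bool
}

func newRing(size int) *ring {
	if size < 1 {
		size = 1
	}
	return &ring{buf: make([]decision, size)}
}

// Add stores d, evicting the oldest entry once the ring is full.
func (r *ring) Add(d decision) {
	r.buf[r.next] = d
	r.next = (r.next + 1) % len(r.buf)
	if r.next == 0 {
		r.full = true
	}
}

// Find returns the retained decision for sessionID; ok is false when the
// decision was never recorded or has been overwritten (the 404 case).
func (r *ring) Find(sessionID string) (decision, bool) {
	n := r.next
	if r.full {
		n = len(r.buf)
	}
	for i := 0; i < n; i++ {
		if r.buf[i].SessionID == sessionID {
			return r.buf[i], true
		}
	}
	return decision{}, false
}

func main() {
	r := newRing(2)
	r.Add(decision{SessionID: "s1", Provider: "claude"})
	r.Add(decision{SessionID: "s2", Provider: "codex"})
	r.Add(decision{SessionID: "s3", Provider: "stub"})
	_, ok := r.Find("s1")
	fmt.Println("s1 still retained:", ok)
}
```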

const ExitCodeRestart = 3

ExitCodeRestart is the exit code the daemon uses to signal the supervisor "restart requested" after a successful binary swap. The launchd plist / systemd unit treats code 3 as a clean restart, not a crash.

const HeartbeatDefaultInterval = 30 * time.Second

HeartbeatDefaultInterval is the fallback heartbeat cadence when the orchestrator does not return one in RegisterResponse. The TS path uses 30s as the fallback; we keep that here, but `15s` is the canonical SLO target.

const RegisterEndpoint = "/api/workers/register"

RegisterEndpoint is the relative path on the platform.

const RuntimeTokenRefreshEndpoint = "/api/workers/refresh-token"

RuntimeTokenRefreshEndpoint is the (probed) platform endpoint the daemon hits to refresh an expired runtime JWT WITHOUT re-registering. The platform owes a handler at this path that:

  • accepts the registration token in the Authorization: Bearer header
  • takes the existing workerId in the URL path
  • mints a fresh runtime JWT bound to the SAME workerId
  • returns { runtimeToken, runtimeTokenExpiresAt, heartbeatInterval, pollInterval }

As of 2026-05-03 this endpoint does NOT exist on the platform side (see REN-1481 platform-companion). Until it ships, the daemon probes this URL, observes a 404, and falls back to full re-register (which mints a new workerId, the bug REN-1481 originally documented). When the platform side ships the endpoint, the daemon picks it up automatically with no further changes.

const UpdateCDNBase = "https://updates.rensei.dev"

UpdateCDNBase is the base URL for the rensei CDN that hosts release manifests and binaries.

Variables

var (
	// ErrArchiveNotFound — the named archive id is not present on disk.
	ErrArchiveNotFound = errors.New("workarea archive not found")
	// ErrArchiveCorrupted — the archive exists but its manifest is
	// missing/malformed, or the tree directory cannot be walked.
	ErrArchiveCorrupted = errors.New("workarea archive corrupted")
	// ErrArchiveExists — restore would collide with an existing archive
	// entry on disk (never reached today; archives are immutable, but
	// the check is here for a future "archive on restore" code path).
	ErrArchiveExists = errors.New("workarea archive already exists")
)

WorkareaArchiveErrCode is the sentinel set used by the registry for programmatic error discrimination at the HTTP layer. Wrapped with %w so handlers can errors.Is() against them.

var DefaultRoutingWeights = afclient.RoutingWeights{Cost: 0.7, Latency: 0.3}

DefaultRoutingWeights are the cost/latency scoring weights described in 004-sandbox-capability-matrix.md §"Open questions" — 70/30 cost/latency is the documented default. The store returns these on every GetConfig call until a tenant config layer overrides them in a future wave.

var ErrKitInstallManifestNotFound = errors.New("kit install: manifest not found in fetched source")

ErrKitInstallManifestNotFound is returned when the source fetch succeeds but no *.kit.toml is locatable inside the fetched tree (or at the operator-provided KitInstallSource.ManifestPath). Maps to HTTP 422.

var ErrKitInstallSourceFetchFailed = errors.New("kit install: source fetch failed")

ErrKitInstallSourceFetchFailed is returned when the configured source fetcher fails (e.g., git clone error, network failure, unreachable remote, missing ref). Maps to HTTP 502.

var ErrKitInstallUnimplemented = errors.New("kit install: remote registry fetch not implemented in this wave")

ErrKitInstallUnimplemented is returned by KitRegistry.Install for the Wave-9 backward-compat path: a request body with no `source` block (the shape the Wave-9 smoke + handler tests POST). Wave 12 / Phase 4 keeps this sentinel reserved for that empty-body case so existing 501 assertions stay green; new federation-source kinds (tessl, agentskills, community) return ErrKitSourceFederationUnimplemented instead.

var ErrKitNotFound = errors.New("kit not found")

ErrKitNotFound is returned when a kit id is not present in the registry.

var ErrKitSourceFederationUnimplemented = errors.New("kit install: federation source kind not yet implemented")

ErrKitSourceFederationUnimplemented is returned when KitInstallRequest names a federation source kind (`tessl` / `agentskills` / `community`) that the daemon does not yet know how to fetch from. Maps to HTTP 501 — the descriptor list returned by /api/daemon/kit-sources continues to surface those kinds so operators can see the federation order.

Federation cross-repo wave is REN-1308 follow-up.

var ErrKitSourceNotFound = errors.New("kit source not found")

ErrKitSourceNotFound is returned when a kit-source name is not known.

var ErrKitTrustGateRejected = errors.New("kit install: trust gate rejected (signed-by-allowlist requires verified signature)")

ErrKitTrustGateRejected is returned by KitRegistry.Install when the configured trust mode (signed-by-allowlist or attested) refuses an unsigned or signed-but-unverified kit. Maps to HTTP 403. The trustOverride: "allowed-this-once" install field bypasses this gate for a single request (audit-logged); see kit_trust.go.

var Version = "dev"

Version is the daemon binary version reported in DaemonStatus and in the registration payload.

Now a `var` (was `const`) so the binary's main can override it via `-ldflags "-X github.com/RenseiAI/donmai/daemon.Version=$VERSION"` at build time, OR a downstream embedder (e.g. rensei-tui's daemon run command) can pass its own version via `Options.Version` at daemon construction. The const form pinned the value to whatever agentfactory-tui's source had at vendor time, which left the `rensei-daemon-run` HTTP /api/daemon/status endpoint reporting an outdated string forever — confusing operators who saw e.g. `0.7.1` even after upgrading both binaries past it.

Default is `"dev"` so an unreleased build (or a vendored copy that forgot to inject) is obvious in status output.

Functions

func DefaultConfigPath

func DefaultConfigPath() string

DefaultConfigPath returns the path to daemon.yaml, resolving to ~/.donmai/daemon.yaml for new installs with a one-release fallback to ~/.rensei/daemon.yaml when the legacy directory still exists.

func DefaultJWTPath

func DefaultJWTPath() string

DefaultJWTPath returns the path to the cached JWT, resolving to ~/.donmai/daemon.jwt for new installs with a one-release fallback to ~/.rensei/daemon.jwt when the legacy directory still exists.

func DefaultKitScanPath

func DefaultKitScanPath() string

DefaultKitScanPath returns the path to the installed-kits directory, resolving to ~/.donmai/kits for new installs with a one-release fallback to ~/.rensei/kits when the legacy directory still exists.

func DeriveDefaultMachineID

func DeriveDefaultMachineID() string

DeriveDefaultMachineID returns a hostname-derived identifier suitable for machine.id when the user has not set one.

func IsNewerVersion

func IsNewerVersion(candidate, current string) bool

IsNewerVersion returns true if candidate is strictly newer than current according to semver-prefix comparison. Falls back to lexicographic compare if either string is not a parseable semver prefix.
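
One plausible reading of "semver-prefix comparison with a lexicographic fallback" is sketched below. The exact parsing rules of IsNewerVersion are not given here, so `parsePrefix` and `isNewer` are hypothetical and assume an optional leading "v", exactly three numeric components, and that any pre-release or build suffix is ignored.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePrefix extracts major.minor.patch from strings such as "v1.2.3"
// or "1.2.3-rc1". ok is false when the prefix is not three integers.
func parsePrefix(s string) (parts [3]int, ok bool) {
	s = strings.TrimPrefix(s, "v")
	if i := strings.IndexAny(s, "-+"); i >= 0 {
		s = s[:i] // drop pre-release / build metadata
	}
	fields := strings.Split(s, ".")
	if len(fields) != 3 {
		return parts, false
	}
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil || n < 0 {
			return parts, false
		}
		parts[i] = n
	}
	return parts, true
}

// isNewer reports whether candidate is strictly newer than current,
// comparing numerically when both parse and lexicographically otherwise.
func isNewer(candidate, current string) bool {
	a, okA := parsePrefix(candidate)
	b, okB := parsePrefix(current)
	if !okA || !okB {
		return candidate > current
	}
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return false
}

func main() {
	fmt.Println(isNewer("0.10.0", "0.9.5"))
	fmt.Println(isNewer("0.9.5", "0.9.5"))
}
```

The numeric comparison matters for cases like 0.10.0 versus 0.9.5, where a purely lexicographic compare would give the wrong answer.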

func ResolvePlatformSuffix

func ResolvePlatformSuffix() string

ResolvePlatformSuffix returns "<arch>-<os>" suitable for the CDN binary filename, e.g. "arm64-darwin", "amd64-linux".

func SaveCachedJWT

func SaveCachedJWT(jwtPath string, resp *RegisterResponse, now time.Time) error

SaveCachedJWT atomically writes the response to jwtPath with 0o600 perms.

func ShouldSkipWizard

func ShouldSkipWizard() bool

ShouldSkipWizard returns true when the wizard should be bypassed:

  • stdin is not a TTY, OR
  • RENSEI_DAEMON_SKIP_WIZARD is set.

func WipeCachedJWT

func WipeCachedJWT(jwtPath string) (bool, error)

WipeCachedJWT removes the cached JWT file at jwtPath. Returns wiped=true when the file existed and was removed, wiped=false when there was no cache to remove (idempotent — safe to call from uninstall paths on systems that never had the daemon installed).

Why this exists: Register() short-circuits with the cached JWT whenever the file is present, even when the workerId in it has been invalidated by the orchestrator (worker row deleted, registration token rotated, org migrated, manual cleanup, …). Without an explicit wipe, the daemon polls the dead workerId every poll interval forever — the token-refresh fallback re-mints credentials for the same dead id rather than triggering a true re-registration. Install / uninstall paths should call this so a fresh registration handshake happens on the next daemon boot.

func WriteConfig

func WriteConfig(path string, cfg *Config) error

WriteConfig atomically writes cfg to path (tmp file + rename), creating parent directories as needed.

Types

type ActiveWorkareaProvider

type ActiveWorkareaProvider interface {
	ActiveWorkareas() []afclient.WorkareaSummary
}

ActiveWorkareaProvider exposes the daemon's live pool members in the canonical wire shape so List can union them with on-disk archives. Implementations MUST return a stable order. Empty list (zero pool members) is a perfectly valid, non-error response.

type AutoUpdateConfig

type AutoUpdateConfig struct {
	Channel             UpdateChannel  `yaml:"channel"             json:"channel"`
	Schedule            UpdateSchedule `yaml:"schedule"            json:"schedule"`
	DrainTimeoutSeconds int            `yaml:"drainTimeoutSeconds" json:"drainTimeoutSeconds"`
}

AutoUpdateConfig is the auto-update preferences block.

type BinaryVerifier

type BinaryVerifier interface {
	Verify(ctx context.Context, contentHash, signatureValue string) (valid bool, reason string)
}

BinaryVerifier is a narrow signature-verification interface. The default production verifier rejects all signatures (until REN-1314 ships a Go sigstore adapter). Tests can inject a passing verifier.

type CachedJWT

type CachedJWT struct {
	WorkerID              string `json:"workerId"`
	RuntimeToken          string `json:"runtimeToken"`
	HeartbeatInterval     int    `json:"heartbeatInterval"` // ms
	PollInterval          int    `json:"pollInterval"`      // ms
	RuntimeTokenExpiresAt string `json:"runtimeTokenExpiresAt,omitempty"`
	CachedAt              string `json:"cachedAt"`

	// Legacy fields retained so old cache files written before REN-1422
	// still load successfully. Newer writes only populate the canonical
	// platform-named fields above.
	LegacyRuntimeJWT               string `json:"runtimeJwt,omitempty"`
	LegacyHeartbeatIntervalSeconds int    `json:"heartbeatIntervalSeconds,omitempty"`
	LegacyPollIntervalSeconds      int    `json:"pollIntervalSeconds,omitempty"`
}

CachedJWT is the on-disk cache entry. We persist this between daemon runs so re-registration is skipped while the runtime token is fresh.

func LoadCachedJWT

func LoadCachedJWT(jwtPath string) (*CachedJWT, error)

LoadCachedJWT reads the cached JWT file at jwtPath (see DefaultJWTPath for the default location). Returns (nil, nil) when the file does not exist or cannot be parsed.

type CapabilitiesResponse

type CapabilitiesResponse struct {
	// Provides is the substrate capability set detected at daemon startup.
	// Each entry corresponds to a SubstrateCapabilityDeclaration.runtimeKinds
	// value (e.g. "native", "npm", "python-pip"). This matches the provides[]
	// array sent to POST /api/workers/register.
	Provides []ProvideCapability `json:"provides"`
	// Timestamp is the RFC3339 UTC time when this response was generated.
	Timestamp string `json:"timestamp"`
}

CapabilitiesResponse is the JSON response from GET /api/daemon/capabilities.

type CapacityConfig

type CapacityConfig struct {
	MaxConcurrentSessions int                `yaml:"maxConcurrentSessions"     json:"maxConcurrentSessions"`
	MaxVCpuPerSession     int                `yaml:"maxVCpuPerSession"         json:"maxVCpuPerSession"`
	MaxMemoryMbPerSession int                `yaml:"maxMemoryMbPerSession"     json:"maxMemoryMbPerSession"`
	ReservedForSystem     ReservedSystemSpec `yaml:"reservedForSystem"         json:"reservedForSystem"`
	// PoolMaxDiskGb is the LRU-eviction trigger for the workarea pool.
	// 0 means no limit. (REN-1334.)
	PoolMaxDiskGb int `yaml:"poolMaxDiskGb,omitempty" json:"poolMaxDiskGb,omitempty"`
}

CapacityConfig is the resource envelope declared in daemon.yaml.

type CloneStrategy

type CloneStrategy string

CloneStrategy controls how the daemon clones a project repo for new workarea pool members.

const (
	CloneShallow   CloneStrategy = "shallow"
	CloneFull      CloneStrategy = "full"
	CloneReference CloneStrategy = "reference-clone"
)

Clone strategy constants.

type Config

type Config struct {
	APIVersion    string               `yaml:"apiVersion"             json:"apiVersion"`
	Kind          string               `yaml:"kind"                   json:"kind"`
	Machine       MachineConfig        `yaml:"machine"                json:"machine"`
	Capacity      CapacityConfig       `yaml:"capacity"               json:"capacity"`
	Projects      []ProjectConfig      `yaml:"projects,omitempty"     json:"projects,omitempty"`
	Orchestrator  OrchestratorConfig   `yaml:"orchestrator"           json:"orchestrator"`
	AutoUpdate    AutoUpdateConfig     `yaml:"autoUpdate"             json:"autoUpdate"`
	Observability *ObservabilityConfig `yaml:"observability,omitempty" json:"observability,omitempty"`
	// Workarea holds Layer-3 workarea-surface tunables (archive root,
	// diff streaming threshold). Optional; populated with defaults if
	// absent.
	Workarea WorkareaConfig `yaml:"workarea,omitempty"     json:"workarea,omitempty"`
	// Kit holds Layer-4 kit-surface tunables (scan paths). Optional;
	// applyDefaults seeds ScanPaths to [DefaultKitScanPath()] when
	// absent. Per ADR-2026-05-07 § D4.
	Kit KitConfig `yaml:"kit,omitempty"          json:"kit,omitempty"`
	// Trust holds the daemon-wide signature-verification policy
	// (sigstore bundle-mode verifier mode + issuer allowlist + audit
	// actor). Optional; applyDefaults seeds Mode to
	// TrustModePermissive when absent. Per WAVE12_PLAN Q2 and
	// 002-provider-base-contract.md § "Signing and trust". Lives on
	// Config (not on KitConfig) because the trust mode applies across
	// all plugin families per 015-plugin-spec.md § "Auth + trust".
	Trust TrustConfig `yaml:"trust,omitempty"        json:"trust,omitempty"`
}

Config is the in-memory representation of ~/.rensei/daemon.yaml. The wire schema mirrors the TS DaemonConfig (rensei-architecture/004 §Configuration shape).

func BuildDefaultConfigFromExisting

func BuildDefaultConfigFromExisting(existing *Config, configPath string) (*Config, error)

BuildDefaultConfigFromExisting returns a default Config (or the existing one) and optionally persists it to configPath.

func DefaultConfig

func DefaultConfig() *Config

DefaultConfig returns a minimal Config suitable as a starting point when the wizard is skipped. Capacity defaults are derived from runtime info.

func LoadConfig

func LoadConfig(path string) (*Config, error)

LoadConfig reads daemon.yaml from path. Returns (nil, nil) when the file does not exist (so callers can branch into the setup wizard / default).

func RunSetupWizard

func RunSetupWizard(opts WizardOptions) (*Config, error)

RunSetupWizard runs the interactive first-run wizard (or returns the non-interactive default when stdin is not a TTY).

type Daemon

type Daemon struct {
	// contains filtered or unexported fields
}

Daemon is the top-level supervisor. It owns the loaded Config, the HeartbeatService, the WorkerSpawner, and (optionally) the AutoUpdater.

func New

func New(opts Options) *Daemon

New constructs a Daemon. Call Start() to bring it online.

func (*Daemon) AcceptWork

func (d *Daemon) AcceptWork(spec SessionSpec) (*SessionHandle, error)

AcceptWork dispatches a session spec to the spawner.

func (*Daemon) AcceptWorkWithDetail

func (d *Daemon) AcceptWorkWithDetail(spec SessionSpec, detail *SessionDetail) (*SessionHandle, error)

AcceptWorkWithDetail dispatches a session spec to the spawner and records the per-session detail used by the spawned `donmai agent run` process. Pass nil detail when the caller does not have one (legacy tests); the spawner falls through to env-only inputs.

Detail is stored before spawning and removed when the spawner emits the corresponding SessionEventEnded event so stale credentials do not linger in memory.

func (*Daemon) ActiveSessions

func (d *Daemon) ActiveSessions() []SessionHandle

ActiveSessions returns a snapshot of in-flight session handles.

func (*Daemon) Config

func (d *Daemon) Config() *Config

Config returns a copy of the loaded config (or nil if not started).

func (*Daemon) Done

func (d *Daemon) Done() <-chan struct{}

Done returns a channel that is closed when the daemon has fully stopped.

func (*Daemon) EffectiveVersion

func (d *Daemon) EffectiveVersion() string

EffectiveVersion returns the version string the daemon should report in HTTP status / heartbeat / registration payloads. Resolution order: `Options.Version` (downstream embedder override) → package `Version` (which itself is "dev" unless overridden via `-ldflags -X .../daemon.Version=…`). An empty `Options.Version` falls through to the package `Version`.
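The resolution order can be sketched in a few lines. `pkgVersion` and `effectiveVersion` are hypothetical names for this illustration; the real method reads the daemon's own Options and package variable.

```go
package main

import "fmt"

// pkgVersion stands in for the package-level Version variable, which is
// described as "dev" unless overridden at link time.
var pkgVersion = "dev"

// effectiveVersion prefers the embedder-supplied option and falls back to
// the package variable when the option is empty.
func effectiveVersion(optVersion string) string {
	if optVersion != "" {
		return optVersion
	}
	return pkgVersion
}

func main() {
	fmt.Println(effectiveVersion(""))      // falls through to pkgVersion
	fmt.Println(effectiveVersion("1.4.2")) // embedder override wins
}
```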

func (*Daemon) HostStatus

func (d *Daemon) HostStatus() *HostStatusDetail

HostStatus returns the most recent hostStatus reported by the platform in a heartbeat response. nil until at least one beat has been ACK'd (or the platform predates Phase 2e). Phase 2e of 2026-05-18-daemon-config-sync-DESIGN.md.

af daemon stats can surface this so an operator sees "your pool was deleted — re-register against pool X" without parsing daemon.log.

func (*Daemon) Pause

func (d *Daemon) Pause()

Pause stops accepting new work without draining.

func (*Daemon) Resume

func (d *Daemon) Resume()

Resume re-enables accepting work.

func (*Daemon) RoutingTraces

func (d *Daemon) RoutingTraces() *RoutingTraceStore

RoutingTraces returns the daemon's in-process routing trace store. The eventual cross-provider scheduler records its decisions here via store.RecordDecision; the /api/daemon/routing/* HTTP surface reads from it. Exposed so test harnesses (and a future scheduler wire-up) can drive recordings without reaching through internal fields. (Wave 9 / A4.)

func (*Daemon) SessionDetail

func (d *Daemon) SessionDetail(sessionID string) (*SessionDetail, bool)

SessionDetail returns the stored per-session detail for the given session id, or (nil, false) if no detail is recorded. Used by the HTTP server's /api/daemon/sessions/<id> handler.

func (*Daemon) SetWorkareaArchiveRegistry

func (d *Daemon) SetWorkareaArchiveRegistry(reg *WorkareaArchiveRegistry)

SetWorkareaArchiveRegistry replaces the daemon's archive registry with the provided one. Used by tests + by the future pool wire-up (REN-1280) to inject an ActiveWorkareaProvider that sees the live pool.

func (*Daemon) Start

func (d *Daemon) Start(ctx context.Context) error

Start brings the daemon online: load config (or wizard), register, start heartbeat, and start the spawner. The HTTP server is NOT started here; callers do that explicitly via Server.Start so they can pick the bind.

func (*Daemon) StartedAt

func (d *Daemon) StartedAt() time.Time

StartedAt returns the daemon's UTC start time (zero before Start()).

func (*Daemon) State

func (d *Daemon) State() State

State returns the current lifecycle state.

func (*Daemon) Stop

func (d *Daemon) Stop(_ context.Context) error

Stop performs a graceful shutdown: it drains spawned work, halts the heartbeat and poller loops, closes the yaml watcher, and transitions to StateStopped. The context is currently unused and is retained for future use (e.g. cancelling the drain via ctx.Done). Stop is safe to call concurrently or repeatedly: the whole body is gated by stopOnce, so a deferred Stop() in a test fixture racing with an HTTP /stop handler is benign.
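The stopOnce gating, together with a Done channel, can be sketched as below. `supervisor` is a hypothetical stand-in for the daemon; the real shutdown body does far more than increment a counter.

```go
package main

import (
	"fmt"
	"sync"
)

// supervisor is a hypothetical stand-in for the daemon's shutdown state.
type supervisor struct {
	stopOnce sync.Once
	done     chan struct{}
	stops    int // how many times the shutdown body actually ran
}

func newSupervisor() *supervisor {
	return &supervisor{done: make(chan struct{})}
}

// Stop runs the shutdown body at most once, however many callers race.
func (s *supervisor) Stop() {
	s.stopOnce.Do(func() {
		s.stops++ // draining, halting loops, closing watchers would go here
		close(s.done)
	})
}

// Done returns a channel that is closed once shutdown has completed.
func (s *supervisor) Done() <-chan struct{} { return s.done }

// raceStops calls Stop from n goroutines and reports how often the body ran.
func raceStops(n int) int {
	s := newSupervisor()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Stop()
		}()
	}
	wg.Wait()
	<-s.Done()
	return s.stops
}

func main() {
	fmt.Println(raceStops(8)) // prints 1
}
```

Closing the channel inside the same `Do` call also guards against a double close, which would otherwise panic.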

func (*Daemon) SubstrateCapabilities

func (d *Daemon) SubstrateCapabilities() []internaldaemon.SubstrateCapability

SubstrateCapabilities returns the substrate capabilities detected at daemon startup. The slice is nil before Start() is called and non-nil afterwards (even when no optional toolchains were found — the always-present set is returned). The returned slice is a copy; callers may mutate it freely. (ADR-2026-05-12-capacity-pools-and-substrate-resolution.md §H.)

func (*Daemon) Update

func (d *Daemon) Update(ctx context.Context) (*UpdateResult, error)

Update triggers a manual auto-update check.

Behavior: drain → fetch manifest → verify → swap → exit (3). If no update is available the call is idempotent and the daemon transitions back to running. If signature verification fails, the swap is aborted and an error is returned. The caller (HTTP handler) typically returns the outcome to the client and may then call Stop().

func (*Daemon) WorkerID

func (d *Daemon) WorkerID() string

WorkerID returns the assigned worker ID (empty until registered).

type EvictHandler

type EvictHandler interface {
	Evict(ctx context.Context, req afclient.EvictPoolRequest) (*afclient.EvictPoolResponse, error)
}

EvictHandler executes a pool eviction request and returns the response.

type HeartbeatMutationFailure

type HeartbeatMutationFailure struct {
	ID    string `json:"id"`
	Error string `json:"error"`
}

HeartbeatMutationFailure is sent in the request body's mutationFailures[] to ACK a queued daemon-config mutation that failed locally.

type HeartbeatOptions

type HeartbeatOptions struct {
	WorkerID        string
	Hostname        string
	OrchestratorURL string
	// RuntimeJWT is the runtime token (a JWT) returned by /api/workers/register
	// and sent in Authorization: Bearer on every heartbeat.
	RuntimeJWT      string
	IntervalSeconds int
	GetActiveCount  func() int
	GetMaxCount     func() int
	GetStatus       func() RegistrationStatus
	Region          string

	// GetAllowlist returns the daemon's current project allowlist entries
	// (derived from cfg.Projects). Called every beat so a hot yaml reload
	// (when that lands) or in-process mutation is reflected in the next
	// heartbeat. Returning nil is the canonical "no projects configured"
	// signal and triggers an empty AllowlistHash. Optional — callers that
	// don't care about allowlist sync can leave it nil.
	//
	// Phase 1d of 2026-05-18-daemon-config-sync-DESIGN.md.
	GetAllowlist func() []ProjectAllowlistEntry

	// OnPendingMutations is invoked when the platform attaches one or more
	// queued daemon-config mutations to a heartbeat response. The callback
	// is expected to apply each mutation against daemon.yaml and return
	// which ones succeeded (appliedIDs) and which failed
	// (failures). The HeartbeatService buffers these and includes them in
	// the NEXT beat's appliedMutations[] / mutationFailures[] fields so
	// the platform can ACK and emit audit events.
	//
	// Optional — leave nil to ignore platform-initiated mutations (the
	// daemon will keep working off its yaml as-edited locally). Phase 2c.
	OnPendingMutations func(ctx context.Context, mutations []PendingMutation) (appliedIDs []string, failures []HeartbeatMutationFailure)

	// OnHostStatus is invoked when the platform's heartbeat response
	// reports a non-ok hostStatus (e.g. pool_deleted). The daemon can
	// use this to surface re-register guidance in `donmai daemon stats` or
	// to enter a non-claiming state. Called with the latest status on
	// every beat that includes one, so callers can rely on it to clear
	// (status='ok') as well.
	//
	// Optional — leave nil to ignore. Phase 2e.
	OnHostStatus func(detail HostStatusDetail)

	// HTTPClient is the client used for the real-endpoint call.
	HTTPClient *http.Client
	// LogWarn is called when the real-endpoint call fails (transient
	// failures are non-fatal — the platform will detect via missed
	// heartbeats and Redis TTL expiry).
	LogWarn func(format string, args ...any)
	// Now provides the heartbeat sentAt timestamp.
	Now func() time.Time
	// OnHeartbeat is invoked after each heartbeat payload is composed
	// (whether or not the network call succeeded). Used by tests and
	// observability.
	OnHeartbeat func(payload HeartbeatPayload)
	// OnReregister is called when the runtime token is rejected (HTTP 401)
	// or the worker is reported missing (HTTP 404 — likely Redis TTL
	// expired). Implementations re-issue Register() against the platform
	// and return the fresh worker id + runtime token. Returning a non-nil
	// error leaves the heartbeat in its prior state and logs via LogWarn;
	// the next tick retries the heartbeat with the stale token (which will
	// fail again and re-trigger this path).
	//
	// reason is the structured failure reason ("worker-not-found",
	// "runtime-token-expired", "unauthorized", "auth-failure"). Callers
	// should pass it through to RefreshRuntimeToken so the correct
	// recovery path is taken — in particular, "worker-not-found" skips
	// the JWT refresh probe and goes directly to full re-registration
	// (creating a new Redis entry), while "runtime-token-expired" tries
	// the refresh probe first to preserve the workerId.
	//
	// Required when the daemon runs against a real platform; tests that
	// only exercise the local stub path can leave it nil.
	OnReregister func(ctx context.Context, reason string) (workerID, runtimeJWT string, err error)
}

HeartbeatOptions configure a HeartbeatService.

type HeartbeatPayload

type HeartbeatPayload struct {
	WorkerID       string             `json:"workerId"`
	Hostname       string             `json:"hostname"`
	Status         RegistrationStatus `json:"status"`
	ActiveSessions int                `json:"activeSessions"`
	MaxSessions    int                `json:"maxSessions"`
	Region         string             `json:"region,omitempty"`
	SentAt         string             `json:"sentAt"`

	// AllowlistHash is the SHA-256 of the daemon's current project
	// allowlist (see allowlist_report.go). Sent on every beat so the
	// platform can detect drift cheaply. Empty string when the daemon
	// has no projects configured.
	//
	// Phase 1d of 2026-05-18-daemon-config-sync-DESIGN.md.
	AllowlistHash string `json:"allowlistHash,omitempty"`

	// Allowlist is the full structured allowlist payload. Included only
	// when AllowlistHash changes from the platform's last-known value
	// (the daemon caches its previously-reported hash and includes the
	// list only on first beat or on change). Steady-state overhead per
	// beat is the 64-byte hash + ~8 bytes of JSON framing.
	Allowlist []ProjectAllowlistEntry `json:"allowlist,omitempty"`
}

HeartbeatPayload is the body sent on POST /api/workers/<id>/heartbeat.
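The "hash on every beat, full list only on change" behaviour described in the field comments can be sketched like this. It is an illustration under stated assumptions: `entry` is a hypothetical allowlist entry, JSON is assumed as the canonical form being hashed (the real scheme lives in allowlist_report.go), and the cached hash is updated unconditionally rather than on a confirmed send.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// entry is a hypothetical allowlist entry for this sketch.
type entry struct {
	Repository string `json:"repository"`
}

// beat is a trimmed payload carrying only the allowlist fields.
type beat struct {
	AllowlistHash string  `json:"allowlistHash,omitempty"`
	Allowlist     []entry `json:"allowlist,omitempty"`
}

// hashAllowlist returns the hex SHA-256 of the JSON-encoded list, or ""
// when no projects are configured.
func hashAllowlist(list []entry) string {
	if len(list) == 0 {
		return ""
	}
	raw, err := json.Marshal(list)
	if err != nil {
		panic(err)
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])
}

// reporter remembers the hash it last sent so the full list is attached
// only on the first beat or after a change.
type reporter struct {
	sentOnce bool
	lastHash string
}

func (r *reporter) compose(list []entry) beat {
	h := hashAllowlist(list)
	b := beat{AllowlistHash: h}
	if !r.sentOnce || h != r.lastHash {
		b.Allowlist = list
	}
	r.sentOnce, r.lastHash = true, h
	return b
}

func main() {
	r := &reporter{}
	list := []entry{{Repository: "github.com/example/repo"}}
	first := r.compose(list)
	second := r.compose(list)
	fmt.Println(len(first.AllowlistHash), len(first.Allowlist), len(second.Allowlist))
}
```

The hex encoding of a SHA-256 digest is 64 characters, which is where the 64-byte steady-state figure comes from.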

type HeartbeatService

type HeartbeatService struct {
	// contains filtered or unexported fields
}

HeartbeatService manages the periodic heartbeat goroutine. It is safe to Start / Stop multiple times; consecutive Starts are idempotent.

func NewHeartbeatService

func NewHeartbeatService(opts HeartbeatOptions) *HeartbeatService

NewHeartbeatService constructs a HeartbeatService from opts. Required callbacks are GetActiveCount, GetMaxCount, and GetStatus.

func (*HeartbeatService) CurrentCredentials

func (h *HeartbeatService) CurrentCredentials() (workerID, runtimeJWT string)

CurrentCredentials returns the worker id and runtime JWT currently in use. They may differ from the values passed at construction time after a re-register (triggered by HTTP 401 or 404).

func (*HeartbeatService) IsRunning

func (h *HeartbeatService) IsRunning() bool

IsRunning reports whether the heartbeat goroutine is active.

func (*HeartbeatService) LastPayload

func (h *HeartbeatService) LastPayload() HeartbeatPayload

LastPayload returns the most recently composed heartbeat payload (for debugging / status surfaces).

func (*HeartbeatService) Start

func (h *HeartbeatService) Start()

Start launches the heartbeat goroutine. It sends an immediate heartbeat, then continues at IntervalSeconds. Subsequent calls are no-ops.

func (*HeartbeatService) Stop

func (h *HeartbeatService) Stop()

Stop terminates the heartbeat goroutine. Safe to call multiple times.

type HostStatusDetail

type HostStatusDetail struct {
	Status            string   `json:"status"` // ok | pool_deleted | pool_draining | pool_disabled | unauthorized
	RecommendedAction string   `json:"recommendedAction,omitempty"`
	CandidatePoolIDs  []string `json:"candidatePoolIds,omitempty"`
}

HostStatusDetail mirrors the platform's wire shape for hostStatus in the heartbeat response. The daemon uses this to decide whether to keep claiming work or surface a re-register recommendation.

type KitConfig

type KitConfig struct {
	// ScanPaths is the ordered list of directories the kit registry walks
	// to find installed kits. Empty / absent means [DefaultKitScanPath()]
	// (resolved by applyDefaults).
	ScanPaths []string `yaml:"scanPaths,omitempty" json:"scanPaths,omitempty"`
}

KitConfig configures the Layer-4 kit operator surface — the scan paths the daemon walks to discover installed kits. Wave 11 / ADR-2026-05-07 § D4. ScanPaths are evaluated in declaration order; the first entry is also where the .state.json sidecar (enable/disable toggles) lives. A leading `~/` is expanded to the user's home directory by NewKitRegistry.
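The `~/` expansion and the sidecar location can be sketched as below. `expandHome` and `stateFile` are hypothetical helpers, not the package's internals, and the paths are illustrative. The home directory is passed in to keep the sketch deterministic; a caller would typically obtain it from os.UserHomeDir.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// expandHome replaces a leading "~/" with home and leaves other paths alone.
func expandHome(path, home string) string {
	if strings.HasPrefix(path, "~/") {
		return filepath.Join(home, path[2:])
	}
	return path
}

// stateFile returns where the .state.json sidecar would live: in the first
// scan path, after expansion. scanPaths must be non-empty.
func stateFile(scanPaths []string, home string) string {
	return filepath.Join(expandHome(scanPaths[0], home), ".state.json")
}

func main() {
	paths := []string{"~/kits", "/opt/kits"} // illustrative scan paths
	fmt.Println(expandHome(paths[0], "/home/op"))
	fmt.Println(stateFile(paths, "/home/op"))
}
```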

type KitRegistry

type KitRegistry struct {
	// contains filtered or unexported fields
}

KitRegistry is a minimal in-process Kit registry.

Methods are safe for concurrent use. The registry rescans on every List call so newly-installed manifests appear without a daemon restart; this is acceptable for an operator-facing surface where call volume is low.

func NewKitRegistry

func NewKitRegistry(scanPaths []string) *KitRegistry

NewKitRegistry constructs a KitRegistry with permissive trust mode.

scanPaths defaults to []string{DefaultKitScanPath()} when nil or empty. The first scan path is also where the .state.json sidecar lives.

Equivalent to NewKitRegistryWithTrust(scanPaths, TrustConfig{Mode: TrustModePermissive}). Callers wiring trust modes (or an issuer allowlist) from daemon.yaml should use NewKitRegistryWithTrust.

func NewKitRegistryWithTrust

func NewKitRegistryWithTrust(scanPaths []string, trust TrustConfig) *KitRegistry

NewKitRegistryWithTrust constructs a KitRegistry with the given trust configuration. Used by Server.kitRegistryOrEmpty to thread the daemon.Config().Trust block into the registry.

If the verifier fails to construct (e.g., the embedded trust root JSON fails to parse), a permissive verifier with no trusted material is installed instead — every signed manifest reports SignedUnverified, every unsigned reports Unsigned, and the install gate behaves as if Mode=Permissive. The construction error is logged via slog.Warn so operators can diagnose.

func (*KitRegistry) Disable

func (r *KitRegistry) Disable(id string) (afclient.Kit, error)

Disable marks the kit disabled in the persisted state. Returns the updated Kit summary or ErrKitNotFound when the id is unknown.

func (*KitRegistry) DisableSource

func (r *KitRegistry) DisableSource(name string) (afclient.KitRegistrySource, error)

DisableSource toggles a registry source off.

func (*KitRegistry) Enable

func (r *KitRegistry) Enable(id string) (afclient.Kit, error)

Enable marks the kit active in the persisted state. Returns the updated Kit summary or ErrKitNotFound when the id is unknown.

func (*KitRegistry) EnableSource

func (r *KitRegistry) EnableSource(name string) (afclient.KitRegistrySource, error)

EnableSource toggles a registry source on. Returns ErrKitSourceNotFound if the name is not in the federation list.

func (*KitRegistry) Get

Get returns the full manifest for a single kit id. Returns ErrKitNotFound when the id is not registered.

func (*KitRegistry) Install

Install fetches a kit from the operator-supplied source, runs the trust-gated verifier against the freshly-fetched manifest, and (when the gate allows) persists the manifest + sibling .sigstore bundle into the first configured scan path.

Behaviour by request shape (audit § 2.1, § 2.2):

  • req.Source == nil — the Wave-9 backward-compat path. Returns ErrKitInstallUnimplemented (HTTP 501) so the existing Wave-9 smoke + handler tests posting `{}` keep their assertions intact.
  • req.Source.Kind == "git" — clone source.URL @ source.Ref into a temp dir (via gitKitFetcher), locate the manifest, run the verifier, gate on r.verifier.config.Mode, persist into scanPaths[0]. Errors map to ErrKitInstallSourceFetchFailed (502) or ErrKitInstallManifestNotFound (422).
  • req.Source.Kind == "tessl" / "agentskills" / "community" — federation cross-repo wave (REN-1308 follow-up). Returns ErrKitSourceFederationUnimplemented (HTTP 501).
  • Any other kind — wrapped fmt error (handler-mapped to 400).

Trust override: `req.TrustOverride == "allowed-this-once"` bypasses the gate for a single install with structured slog audit logging. Otherwise an unsigned/unverified manifest under a non-permissive trust mode returns ErrKitTrustGateRejected (HTTP 403).

Manifest persistence uses the atomic tmp-then-rename pattern to match the kit_state writer at saveStateLocked. The on-disk filename is `<sanitizedID>.kit.toml` where slashes in the manifest's `kit.id` are replaced with `__` (the manifest's internal `kit.id` retains the canonical slash form).
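The filename mapping and the tmp-then-rename write can be sketched as follows. `manifestFilename` and `writeAtomic` are hypothetical helpers written for this illustration, and the kit id is made up; the package's own writer may differ in detail.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// manifestFilename maps a kit id to its on-disk name: slashes become "__"
// and the ".kit.toml" suffix is appended.
func manifestFilename(kitID string) string {
	return strings.ReplaceAll(kitID, "/", "__") + ".kit.toml"
}

// writeAtomic writes data to a temp file in dir and renames it into place,
// so readers never observe a half-written manifest.
func writeAtomic(dir, name string, data []byte) error {
	tmp, err := os.CreateTemp(dir, name+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), filepath.Join(dir, name))
}

func main() {
	dir, err := os.MkdirTemp("", "kits")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	name := manifestFilename("example/go-kit")
	body := []byte("[kit]\nid = \"example/go-kit\"\n") // id keeps its slash form
	if err := writeAtomic(dir, name, body); err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes it atomic on POSIX systems.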

func (*KitRegistry) List

func (r *KitRegistry) List() []afclient.Kit

List returns all installed kits across all scan paths. Malformed manifests log a warning and are excluded. Empty scan paths yield an empty slice.

func (*KitRegistry) ListSources

func (r *KitRegistry) ListSources() []afclient.KitRegistrySource

ListSources returns the federation order's registry source descriptors. Persisted disable state from .state.json is applied to the Enabled flag.

func (*KitRegistry) ScanPaths

func (r *KitRegistry) ScanPaths() []string

ScanPaths returns the registry's scan paths in declaration order.

func (*KitRegistry) VerifySignature

func (r *KitRegistry) VerifySignature(id string) (afclient.KitSignatureResult, error)

VerifySignature returns a KitSignatureResult for the kit, driven by the sigstore bundle-mode verifier (Wave 12 / S2). The verifier reads the sibling `<manifest>.sigstore` file alongside the kit manifest; missing-bundle returns KitTrustUnsigned with OK: true. Verification outcomes map to KitTrustSignedVerified / KitTrustSignedUnverified; see kit_trust.go for the full state machine.

type MachineConfig

type MachineConfig struct {
	ID     string `yaml:"id"               json:"id"`
	Region string `yaml:"region,omitempty" json:"region,omitempty"`
}

MachineConfig captures the machine identity block from daemon.yaml.

type ObservabilityConfig

type ObservabilityConfig struct {
	LogFormat   string `yaml:"logFormat,omitempty"   json:"logFormat,omitempty"`
	LogPath     string `yaml:"logPath,omitempty"     json:"logPath,omitempty"`
	MetricsPort int    `yaml:"metricsPort,omitempty" json:"metricsPort,omitempty"`
}

ObservabilityConfig holds optional log/metrics tuning.

type Options

type Options struct {
	// ConfigPath is where to load / persist daemon.yaml. Defaults to
	// DefaultConfigPath().
	ConfigPath string
	// JWTPath is where to cache the runtime JWT. Defaults to
	// DefaultJWTPath().
	JWTPath string
	// SkipWizard, when true, prevents the interactive wizard from running
	// even when stdin is a TTY. The default config (or existing config) is
	// used instead.
	SkipWizard bool
	// SkipRegistration, when true, skips the registration call (used when
	// the daemon is being started in setup-only or config-only modes).
	SkipRegistration bool
	// SpawnerOptions overrides the default spawner options. The Projects
	// and MaxConcurrentSessions fields are populated automatically from
	// loaded config.
	SpawnerOptions SpawnerOptions
	// HTTPHost overrides the default control server bind address.
	HTTPHost string
	// HTTPPort overrides the default control server port.
	//
	// Zero means "ephemeral port": the listener binds 127.0.0.1:0 and
	// the kernel picks a free port. The effective bound port is then
	// available via Server.Addr() after Server.Start succeeds.
	// Production callers (afcli/daemon_run.go) substitute the
	// well-known DefaultHTTPPort (7734) themselves before constructing
	// Options so operator behaviour is preserved; the daemon library
	// itself does NOT auto-fill — leaving zero-as-ephemeral makes
	// parallel tests collision-free under -race.
	HTTPPort int
	// PoolStatsProvider returns the current workarea pool snapshot. May be
	// nil — the /api/daemon/pool/stats endpoint will return an empty
	// snapshot in that case (acceptance criterion: pool integration is
	// optional in the runtime port; full WorkareaProvider wiring is REN-1280).
	PoolStatsProvider PoolStatsProvider
	// EvictHandler handles pool eviction requests. May be nil; the endpoint
	// returns 501 in that case.
	EvictHandler EvictHandler
	// ProviderRegistry exposes the daemon's locally-registered AgentRuntime
	// providers (claude/codex/ollama/opencode/gemini/amp/stub) to the
	// /api/daemon/providers* surface. May be nil — the endpoint will then
	// return an empty list with PartialCoverage=true, which is the correct
	// behaviour for a daemon that has not yet wired its runtime registry.
	// Wave 9 / ADR-2026-05-07-daemon-http-control-api.md §D4.
	ProviderRegistry ProviderRegistry

	// Version overrides the package-level `Version` for status reporting.
	// Empty falls back to the package var (which itself defaults to "dev"
	// unless the build injected via -ldflags). Downstream embedders that
	// ship their own binary (e.g. the rensei daemon) should set this to
	// their own version string so /api/daemon/status reports the
	// running binary, not whatever string agentfactory-tui's vendored
	// source had at the time.
	Version string
}

Options configure a Daemon.
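The zero-as-ephemeral behaviour of HTTPPort rests on a standard library feature: binding port 0 lets the kernel pick a free port. A minimal sketch, with `listenLocal` as a hypothetical helper (the daemon exposes the bound address through Server.Addr() instead):

```go
package main

import (
	"fmt"
	"net"
)

// listenLocal binds 127.0.0.1 on the given port; port 0 asks the kernel
// for a free one. It returns the listener and the port actually bound.
func listenLocal(port int) (net.Listener, int, error) {
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
	if err != nil {
		return nil, 0, err
	}
	return ln, ln.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	ln, port, err := listenLocal(0)
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	fmt.Println("bound port is non-zero:", port != 0)
}
```

Two tests that each pass 0 get distinct ports, which is why leaving the default unfilled avoids collisions in parallel runs.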

type OrchestratorConfig

type OrchestratorConfig struct {
	URL       string `yaml:"url"                 json:"url"`
	AuthToken string `yaml:"authToken,omitempty" json:"authToken,omitempty"`
}

OrchestratorConfig is the orchestrator URL + registration token block.

type PendingMutation

type PendingMutation struct {
	ID          string          `json:"id"`
	Op          string          `json:"op"` // project.add | project.remove
	Params      json.RawMessage `json:"params"`
	RequestedAt string          `json:"requestedAt"`
	RequestedBy string          `json:"requestedBy"`
}

PendingMutation mirrors the platform's serializePendingMutation wire shape — included in the heartbeat response so the daemon can apply queued proposals and ACK on the next beat. Phase 2 of 2026-05-18-daemon-config-sync-DESIGN.md.

type PollHTTPError

type PollHTTPError struct {
	Status int
	Body   string
}

PollHTTPError is returned by callPollEndpoint for non-2xx responses so the loop can branch on the HTTP status (for example, 401 or 404 → re-register).

func (*PollHTTPError) Error

func (e *PollHTTPError) Error() string
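Branching on the status is a natural fit for errors.As. The sketch below uses a local mirror type, `pollHTTPError`, and a hypothetical `needsReregister` helper; the error message format is an assumption, and the 401/404 rule follows the OnReregister description in PollOptions.

```go
package main

import (
	"errors"
	"fmt"
)

// pollHTTPError is a local mirror of PollHTTPError for this sketch.
type pollHTTPError struct {
	Status int
	Body   string
}

func (e *pollHTTPError) Error() string {
	return fmt.Sprintf("poll failed: HTTP %d: %s", e.Status, e.Body)
}

// needsReregister reports whether err carries a status that signals
// expired credentials (401) or a missing worker entry (404).
func needsReregister(err error) bool {
	var httpErr *pollHTTPError
	if !errors.As(err, &httpErr) {
		return false
	}
	return httpErr.Status == 401 || httpErr.Status == 404
}

func main() {
	wrapped := fmt.Errorf("tick: %w", &pollHTTPError{Status: 401, Body: "expired"})
	fmt.Println(needsReregister(wrapped))               // true
	fmt.Println(needsReregister(errors.New("timeout"))) // false
}
```

errors.As unwraps through `%w`, so the check keeps working if the loop adds context to the error before inspecting it.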

type PollOptions

type PollOptions struct {
	WorkerID        string
	OrchestratorURL string
	RuntimeJWT      string
	IntervalSeconds int

	// HTTPClient is the client used for poll calls. Defaults to a 30s-timeout
	// http.Client.
	HTTPClient *http.Client
	// LogWarn is called for transient poll failures. Defaults to no-op.
	LogWarn func(format string, args ...any)
	// LogInfo is called when work is dispatched / re-register fires.
	LogInfo func(format string, args ...any)
	// OnWork is invoked for each item returned in the work[] slice. Errors are
	// logged at warn and do not stop the loop. Required.
	OnWork func(item PollWorkItem) error
	// OnReregister is called on HTTP 401 (runtime JWT expired) or 404 (worker
	// fell out of Redis). Implementations re-issue Register() and return the
	// fresh worker id + runtime token. The poll loop swaps credentials and
	// continues. On error, the failure is logged and the loop retries on
	// the next tick.
	//
	// reason is the structured failure reason ("worker-not-found",
	// "runtime-token-expired", "unauthorized", "auth-failure"). Pass it
	// through to RefreshRuntimeToken so the correct recovery path is taken
	// — "worker-not-found" skips the JWT refresh probe and goes directly to
	// full re-registration to create a new Redis entry.
	OnReregister func(ctx context.Context, reason string) (workerID, runtimeJWT string, err error)
}

PollOptions configure a single poll loop run.

type PollResponse

type PollResponse struct {
	Work              []PollWorkItem `json:"work"`
	HasInboxMessages  bool           `json:"hasInboxMessages,omitempty"`
	PreClaimed        bool           `json:"preClaimed,omitempty"`
	ClaimedSessionIDs []string       `json:"claimedSessionIds,omitempty"`
}

PollResponse is the body of GET /api/workers/<id>/poll. Only the fields the daemon currently consumes are decoded; unknown fields are ignored.
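The "unknown fields are ignored" behaviour is the default for encoding/json. A small sketch with trimmed local mirrors (`workItem`, `pollResponse`) and a made-up response body:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// workItem and pollResponse are trimmed local mirrors for this sketch.
type workItem struct {
	SessionID string `json:"sessionId"`
}

type pollResponse struct {
	Work       []workItem `json:"work"`
	PreClaimed bool       `json:"preClaimed,omitempty"`
}

// decodePoll decodes a poll body; keys with no matching struct field
// (gitCredentials here) are skipped by encoding/json without an error.
func decodePoll(body string) (pollResponse, error) {
	var resp pollResponse
	err := json.Unmarshal([]byte(body), &resp)
	return resp, err
}

func main() {
	body := `{"work":[{"sessionId":"s-1"}],"preClaimed":true,"gitCredentials":[]}`
	resp, err := decodePoll(body)
	fmt.Println(len(resp.Work), resp.PreClaimed, err) // 1 true <nil>
}
```

The flip side is that a field the platform starts sending is silently lost until the struct gains a matching member.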

type PollService

type PollService struct {
	// contains filtered or unexported fields
}

PollService manages the periodic poll goroutine. Like HeartbeatService it is safe to Start / Stop multiple times; consecutive Starts are idempotent.

func NewPollService

func NewPollService(opts PollOptions) *PollService

NewPollService constructs a PollService from opts. OnWork must be non-nil.

func (*PollService) IsRunning

func (p *PollService) IsRunning() bool

IsRunning reports whether the poll goroutine is active.

func (*PollService) Start

func (p *PollService) Start()

Start launches the poll goroutine. Subsequent calls are no-ops.

func (*PollService) Stop

func (p *PollService) Stop()

Stop terminates the poll goroutine. Safe to call multiple times.

type PollStageBudget

type PollStageBudget struct {
	MaxDurationSeconds int   `json:"maxDurationSeconds,omitempty"`
	MaxSubAgents       int   `json:"maxSubAgents,omitempty"`
	MaxTokens          int64 `json:"maxTokens,omitempty"`
}

PollStageBudget mirrors the platform's StageBudget shape so the daemon can decode + forward it without depending on the runner package (cardinal package-architecture rule: daemon does not import runner). The runner re-types this into prompt.StageBudget when it constructs the QueuedWork. (REN-1485 / REN-1487.)

type PollWorkItem

type PollWorkItem struct {
	SessionID    string            `json:"sessionId"`
	ProjectName  string            `json:"projectName,omitempty"`
	Repository   string            `json:"repository,omitempty"`
	Ref          string            `json:"ref,omitempty"`
	Priority     int               `json:"priority,omitempty"`
	Env          map[string]string `json:"env,omitempty"`
	MaxDuration  int               `json:"maxDurationSeconds,omitempty"`
	Resources    *SessionResources `json:"resources,omitempty"`
	QueuedAt     int64             `json:"queuedAt,omitempty"`
	ProjectScope string            `json:"projectScope,omitempty"`

	// REN-1461 / F.2.8 — enriched fields the platform may send so the
	// `donmai agent run` worker has the runner context it needs without
	// requiring a separate platform fetch. Optional during the rollout
	// window; absent fields fall through to the default render path.
	IssueID           string                  `json:"issueId,omitempty"`
	IssueIdentifier   string                  `json:"issueIdentifier,omitempty"`
	LinearSessionID   string                  `json:"linearSessionId,omitempty"`
	ProviderSessionID string                  `json:"providerSessionId,omitempty"`
	OrganizationID    string                  `json:"organizationId,omitempty"`
	WorkType          string                  `json:"workType,omitempty"`
	PromptContext     string                  `json:"promptContext,omitempty"`
	Body              string                  `json:"body,omitempty"`
	Title             string                  `json:"title,omitempty"`
	MentionContext    string                  `json:"mentionContext,omitempty"`
	ParentContext     string                  `json:"parentContext,omitempty"`
	Branch            string                  `json:"branch,omitempty"`
	ResolvedProfile   *SessionResolvedProfile `json:"resolvedProfile,omitempty"`
	ModelProfile      *SessionModelProfile    `json:"modelProfile,omitempty"`

	// REN-1485 / REN-1487 Phase 2 stage-driven SDLC fields. Populated
	// by the platform's `agent.dispatch_stage` action; absent when the
	// work was queued by the legacy `agent.dispatch_to_queue` action.
	// Round-trip opaquely on the QueuedWork JSON; the daemon forwards
	// them onto SessionDetail without interpreting them.
	StagePrompt        string           `json:"stagePrompt,omitempty"`
	StageID            string           `json:"stageId,omitempty"`
	StageBudget        *PollStageBudget `json:"stageBudget,omitempty"`
	StageLifecycle     map[string]any   `json:"stageLifecycle,omitempty"`
	StageSourceEventID string           `json:"stageSourceEventId,omitempty"`

	// SystemPromptOverride is the per-session platform-supplied system
	// prompt that replaces the runner's default system_base.tmpl render
	// when non-empty. The leaf consumer at `prompt/builder.go` already
	// reads `qw.SystemPromptOverride`; this struct field is the
	// wire-shape forwarder. Without it, encoding/json silently discards
	// the platform's field (unknown keys are ignored by default):
	// SUP-1840 backlog-writer sessions fell through to system_base.tmpl
	// and produced developer-style behavior (`pnpm af-linear`,
	// Bash/Write/Edit churn).
	SystemPromptOverride string `json:"systemPromptOverride,omitempty"`

	// DisallowedTools is the platform-supplied set of tool-name patterns
	// the credential-injection layer stamps onto QueuedWork via
	// stampCredSurfaceDisallowedTools(). Without this field,
	// encoding/json silently discards the platform's value (unknown keys
	// are ignored by default); the runner's spec_translation.go then
	// never appends the per-workType restrictions to
	// Spec.DisallowedTools. Mirror of the v0.9.3 SystemPromptOverride
	// fix: opaque forwarder only, no new logic.
	DisallowedTools []string `json:"disallowedTools,omitempty"`
}

PollWorkItem mirrors one element of the platform's poll response `work[]` array. The platform serves GET /api/workers/<id>/poll and returns:

{
  work: QueuedWork[],
  inboxMessages: { [sessionId]: InboxMessage[] },
  hasInboxMessages: boolean,
  preClaimed: boolean,
  claimedSessionIds: string[],
  gitCredentials: { token, cloneUrl, expiresAt }[],
}

QueuedWork carries the session-spec fields the daemon needs to dispatch a session to the spawner. Field names follow the platform wire shape (camelCase).

QueuedAt is a Unix-millisecond epoch number on the wire — the platform's QueuedWork interface (packages/agentfactory-server work-queue.ts) defines it as `queuedAt: number`, and the Redis-stored session payload confirms a numeric value (e.g. 1777658441780). v0.4.1 mistakenly typed it as `string`, which caused the daemon's poll loop to fail decoding ("cannot unmarshal number into Go struct field PollWorkItem.work.queuedAt of type string") and silently drop pre-claimed sessions.
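
The decoding behaviour described above can be reproduced with the standard library alone. The sketch below is illustrative only: `workItem` and `legacyWorkItem` are hypothetical two-field stand-ins for PollWorkItem, and `samplePayload` is an invented poll element. It shows a numeric `queuedAt` decoding into `int64`, the same payload failing against a `string`-typed field, and a key with no matching struct field being ignored without an error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// workItem is a hypothetical two-field stand-in for PollWorkItem.
type workItem struct {
	SessionID string `json:"sessionId"`
	QueuedAt  int64  `json:"queuedAt,omitempty"`
}

// legacyWorkItem reproduces the v0.4.1 mistake: queuedAt typed as string.
type legacyWorkItem struct {
	SessionID string `json:"sessionId"`
	QueuedAt  string `json:"queuedAt,omitempty"`
}

// samplePayload is an invented poll element. systemPromptOverride has no
// matching field on either struct, so encoding/json ignores it.
const samplePayload = `{"sessionId":"s-1","queuedAt":1777658441780,"systemPromptOverride":"x"}`

// decodeCurrent decodes with the numeric queuedAt field.
func decodeCurrent(payload string) (workItem, error) {
	var w workItem
	err := json.Unmarshal([]byte(payload), &w)
	return w, err
}

// decodeLegacy decodes with the string-typed queuedAt field.
func decodeLegacy(payload string) error {
	var w legacyWorkItem
	return json.Unmarshal([]byte(payload), &w)
}

func main() {
	w, err := decodeCurrent(samplePayload)
	fmt.Println("current:", w.QueuedAt, err)
	fmt.Println("legacy failed:", decodeLegacy(samplePayload) != nil)
}
```

Because unknown keys are dropped silently, a missing forwarder field shows up only as missing behaviour downstream, never as a decode error.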

type PoolCapacityGuard

type PoolCapacityGuard interface {
	// CheckCapacity returns nil + zero retryAfter when a new member
	// fits, or a non-zero retryAfter and an explanatory error when the
	// pool is saturated.
	CheckCapacity() (retryAfter time.Duration, err error)
}

PoolCapacityGuard tells Restore whether a fresh pool member can be admitted. Returning a non-zero retryAfter indicates saturation — Restore propagates that to the HTTP handler as 503 + Retry-After.
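
A minimal sketch of how a guard result might be translated into 503 + Retry-After. Everything except the interface itself is an assumption made for illustration: `fixedGuard`, `restoreStatus`, the `/restore` path, and the choice to round the header value up to whole seconds are not taken from the package.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"net/http"
	"net/http/httptest"
	"strconv"
	"time"
)

// PoolCapacityGuard is copied from the package documentation.
type PoolCapacityGuard interface {
	CheckCapacity() (retryAfter time.Duration, err error)
}

// fixedGuard is a hypothetical guard: the pool is saturated once inUse
// reaches limit, and callers are told to retry after a fixed backoff.
type fixedGuard struct {
	inUse, limit int
	backoff      time.Duration
}

func (g fixedGuard) CheckCapacity() (time.Duration, error) {
	if g.inUse < g.limit {
		return 0, nil
	}
	return g.backoff, errors.New("workarea pool saturated")
}

// restoreStatus runs a stand-in handler against g and reports the HTTP
// status and Retry-After header it produced.
func restoreStatus(g PoolCapacityGuard) (int, string) {
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		retryAfter, err := g.CheckCapacity()
		if retryAfter > 0 {
			msg := "pool saturated"
			if err != nil {
				msg = err.Error()
			}
			secs := int(math.Ceil(retryAfter.Seconds()))
			w.Header().Set("Retry-After", strconv.Itoa(secs))
			http.Error(w, msg, http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/restore", nil))
	return rec.Code, rec.Header().Get("Retry-After")
}

func main() {
	code, retry := restoreStatus(fixedGuard{inUse: 2, limit: 2, backoff: 1500 * time.Millisecond})
	fmt.Println(code, retry)
}
```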

type PoolStatsProvider

type PoolStatsProvider interface {
	Stats(ctx context.Context) (*afclient.WorkareaPoolStats, error)
}

PoolStatsProvider returns a workarea pool snapshot.

type PrefixWriterFunc

type PrefixWriterFunc func(workerID, line string)

PrefixWriterFunc adapts a function to PrefixedWriter.

func (PrefixWriterFunc) WriteWorkerLine

func (f PrefixWriterFunc) WriteWorkerLine(workerID, line string)

WriteWorkerLine implements PrefixedWriter.

type PrefixedWriter

type PrefixedWriter interface {
	WriteWorkerLine(workerID, line string)
}

PrefixedWriter is the minimal sink interface used by the spawner to emit child stdout/stderr. Implementations are responsible for prefixing each line with the worker tag.
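
The adapter follows the same idiom as `http.HandlerFunc`. In the sketch below the two type declarations match the documentation; the method body (calling `f` directly) is an assumption about the implementation, and `collect` and `render` are hypothetical helpers showing a sink that applies the worker tag itself.

```go
package main

import (
	"fmt"
	"strings"
)

// PrefixedWriter and PrefixWriterFunc mirror the documented declarations.
type PrefixedWriter interface {
	WriteWorkerLine(workerID, line string)
}

type PrefixWriterFunc func(workerID, line string)

// WriteWorkerLine calls f; this body is assumed, in the style of
// http.HandlerFunc.
func (f PrefixWriterFunc) WriteWorkerLine(workerID, line string) { f(workerID, line) }

// collect returns a PrefixedWriter that appends tagged lines to buf.
func collect(buf *strings.Builder) PrefixedWriter {
	return PrefixWriterFunc(func(workerID, line string) {
		fmt.Fprintf(buf, "[worker:%s] %s\n", workerID, line)
	})
}

// render writes each line through the sink and returns the tagged output.
func render(workerID string, lines ...string) string {
	var buf strings.Builder
	w := collect(&buf)
	for _, l := range lines {
		w.WriteWorkerLine(workerID, l)
	}
	return buf.String()
}

func main() {
	fmt.Print(render("w1", "cloning repository", "session started"))
}
```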

type ProjectAllowlistEntry

type ProjectAllowlistEntry struct {
	ID         string `json:"id"`
	Repository string `json:"repository"`
}

ProjectAllowlistEntry is the wire shape for a single allowlisted project reported by the daemon. Mirrors the on-disk daemon.yaml `projects[]` shape (daemon/config.go ProjectConfig) but trimmed to the fields the platform needs for display + routing-decision visibility.

type ProjectConfig

type ProjectConfig struct {
	ID            string        `yaml:"id"                       json:"id"`
	Repository    string        `yaml:"repository"               json:"repository"`
	CloneStrategy CloneStrategy `yaml:"cloneStrategy,omitempty"  json:"cloneStrategy,omitempty"`
	Git           *ProjectGit   `yaml:"git,omitempty"            json:"git,omitempty"`
}

ProjectConfig describes one entry in the project allowlist.

func (*ProjectConfig) UnmarshalYAML

func (p *ProjectConfig) UnmarshalYAML(node *yaml.Node) error

UnmarshalYAML accepts either the canonical `repository` key or the legacy `repoUrl` key (pre-REN-1419 daemon.yaml files written by older versions of `rensei project allow`). When the legacy key is found a one-line warning is logged so operators know to rewrite the file; this back-compat shim is scheduled for removal one release after the canonical writer ships.

type ProjectGit

type ProjectGit struct {
	CredentialHelper string `yaml:"credentialHelper,omitempty" json:"credentialHelper,omitempty"`
	SSHKey           string `yaml:"sshKey,omitempty"           json:"sshKey,omitempty"`
}

ProjectGit captures per-project credential helper / SSH key hints.

type ProvideCapability

type ProvideCapability struct {
	Kind string `json:"kind"`
}

ProvideCapability is a single entry in the RegisterRequest.Provides array. It mirrors the SubstrateCapability wire shape used by the internal capability detector but is kept in the daemon package (public API surface) to avoid a cross-package import cycle between daemon and internal/daemon.

type ProviderRegistry

type ProviderRegistry interface {
	// Names returns the sorted list of registered provider name strings.
	// Each name is the canonical agent.ProviderName string (e.g. "claude",
	// "codex"). Order is stable across calls.
	Names() []string
	// Capabilities returns the typed capability struct serialised to a
	// flat map[string]any for the named provider. ok is false when the
	// provider is not registered. The map shape matches the JSON encoding
	// of agent.Capabilities so the wire shape on /api/daemon/providers
	// matches the contract.
	Capabilities(name string) (caps map[string]any, ok bool)
}

ProviderRegistry is the minimal read-only view of the runner's in-process AgentRuntime registry the /api/daemon/providers handler consumes. The daemon imports a satisfying type from runner.Registry — the interface keeps this package free of a runner import cycle. (Wave 9 / A1.)
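
For tests or local experiments, a map-backed type is enough to satisfy the interface. The sketch below is not runner.Registry; `mapRegistry` and the capability keys are hypothetical. It shows the two documented properties: sorted, stable `Names()` output and an `ok` flag for unknown providers.

```go
package main

import (
	"fmt"
	"sort"
)

// ProviderRegistry mirrors the documented interface.
type ProviderRegistry interface {
	Names() []string
	Capabilities(name string) (caps map[string]any, ok bool)
}

// mapRegistry is a hypothetical in-memory registry keyed by provider name.
type mapRegistry map[string]map[string]any

// Names returns the registered names in sorted order, so repeated calls
// produce the same sequence despite Go's randomised map iteration.
func (m mapRegistry) Names() []string {
	names := make([]string, 0, len(m))
	for n := range m {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// Capabilities returns the flat capability map and whether name is known.
func (m mapRegistry) Capabilities(name string) (map[string]any, bool) {
	caps, ok := m[name]
	return caps, ok
}

// sampleRegistry builds a registry with invented capability keys.
func sampleRegistry() ProviderRegistry {
	return mapRegistry{
		"codex":  {"supportsResume": false},
		"claude": {"supportsResume": true},
	}
}

func main() {
	reg := sampleRegistry()
	fmt.Println(reg.Names())
	_, ok := reg.Capabilities("gemini")
	fmt.Println(ok)
}
```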

type RefreshTokenResult

type RefreshTokenResult struct {
	// Mode is the path the refresh actually took: "refresh" (platform
	// honoured the refresh probe and minted a new JWT bound to the
	// same workerId), "reregister" (probe returned 404 / endpoint
	// missing — the daemon fell back to full POST /api/workers/register
	// and got a NEW workerId), or "error" (both paths failed).
	Mode string

	// WorkerID is the worker id in effect after the refresh attempt.
	// On Mode=refresh this is the SAME workerId; on Mode=reregister
	// it's a fresh one.
	WorkerID string

	// RuntimeToken is the fresh runtime JWT.
	RuntimeToken string

	// RegistrationTokenSwapped is true when Mode=reregister produced a
	// different workerId. Operators care about this signal because the
	// platform forgets the old workerId after a fresh registration —
	// any in-flight heartbeats / polls keyed on it 404 until the daemon
	// swaps credentials. (REN-1481 root cause.)
	RegistrationTokenSwapped bool

	// Reason is the structured reason the refresh path was taken
	// (e.g. "runtime-token-expired", "worker-not-found"). Surfaces in
	// the [runtime-token] log line.
	Reason string
}

RefreshTokenResult is the outcome of an attempted runtime-token refresh. The OnReregister callback wired into HeartbeatService and PollService synthesises one of these per attempt; logged via the `[runtime-token]` structured line.

func RefreshRuntimeToken

func RefreshRuntimeToken(
	ctx context.Context,
	regOpts RegistrationOptions,
	currentWorkerID string,
	reason string,
) (*RefreshTokenResult, error)

RefreshRuntimeToken attempts to refresh the daemon's runtime JWT without re-registering — i.e. preserving the workerId. This is the REN-1481 fix path. Behaviour:

  1. When reason is "worker-not-found" (HTTP 404 on poll or heartbeat), the worker's Redis registration entry has expired — the runtime token itself is still valid, but the platform has no record of this worker. Probing the refresh endpoint would return a fresh JWT for the SAME workerId, which would loop forever. Skip the probe and go directly to full re-register to create a new Redis entry.
  2. Otherwise, probe POST /api/workers/<id>/refresh-token with the registration token in the Authorization: Bearer header. On 200, the platform mints a fresh JWT bound to the same workerId — best case.
  3. On 404 (endpoint missing — current platform-side state) or 405 (method not allowed) from the refresh probe, fall through to FULL re-register via Register(ForceReregister=true). The runtime token gets refreshed but at the cost of a new workerId.
  4. On any other failure (5xx, network, 401-on-registration-token), return an error. Caller logs + retries on next tick.

This is the only path that should call Register() with ForceReregister=true outside boot. All in-flight 401/404 detection in HeartbeatService / PollService routes through here so the `[runtime-token]` log line is the single source of truth for operators investigating the 5-minute cycle in REN-1481.
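
The four-step behaviour can be condensed into a small decision function. This is a sketch of the documented branching only, not the package's code: `refreshPath` is a hypothetical name, it performs no HTTP calls, and it assumes the full re-register succeeds whenever that branch is taken.

```go
package main

import (
	"fmt"
	"net/http"
)

// refreshPath reports which Mode the documented flow would produce and
// whether the workerId changes. probeStatus is the HTTP status of the
// refresh probe; it is ignored when reason short-circuits the probe.
func refreshPath(reason string, probeStatus int) (mode string, newWorkerID bool) {
	// Step 1: the platform has no record of the worker, so a refreshed JWT
	// for the same workerId would keep failing. Skip the probe.
	if reason == "worker-not-found" {
		return "reregister", true
	}
	switch probeStatus {
	case http.StatusOK:
		// Step 2: fresh JWT bound to the same workerId.
		return "refresh", false
	case http.StatusNotFound, http.StatusMethodNotAllowed:
		// Step 3: endpoint missing or method not allowed, full re-register.
		return "reregister", true
	default:
		// Step 4: any other failure is surfaced to the caller.
		return "error", false
	}
}

func main() {
	mode, swapped := refreshPath("runtime-token-expired", http.StatusNotFound)
	fmt.Println(mode, swapped)
}
```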

type RegisterRequest

type RegisterRequest struct {
	MachineID string              `json:"machineId,omitempty"`
	Hostname  string              `json:"hostname"`
	Capacity  int                 `json:"capacity"`
	Version   string              `json:"version,omitempty"`
	Projects  []string            `json:"projects,omitempty"`
	Provides  []ProvideCapability `json:"provides,omitempty"`

	// DaemonProjects is the structured project allowlist read from the
	// daemon's local config (daemon.yaml's projects[]). Each entry carries
	// the project id and resolved repository URL the daemon enforces at
	// WorkerSpawner.AcceptWork time. Distinct from the legacy `Projects`
	// []string above, which the platform overwrites with Linear-resolved
	// names for registration-token auth (see platform/src/app/api/workers/
	// register/route.ts:265).
	//
	// Phase 1c of 2026-05-18-daemon-config-sync-DESIGN.md — read-only mirror;
	// platform persists into worker_hosts.allowed_projects jsonb so the
	// capacity UI can surface "this host serves projects X, Y, Z" without
	// SSH-ing to the host. Omitted when the daemon yaml has no projects[]
	// entries; the platform falls back to "unknown / unrestricted" semantics.
	DaemonProjects []ProjectAllowlistEntry `json:"daemonProjects,omitempty"`
}

RegisterRequest is the JSON body sent on POST /api/workers/register.

The platform contract (see platform/src/app/api/workers/register/route.ts):

{ machineId?: string, hostname: string, capacity: number, version?: string,
  projects?: string[], provides?: { kind: string }[] }

The registration token is sent in the Authorization: Bearer header, NOT in the body. Status / region / activeAgentCount are not part of the platform contract — they live in the heartbeat payload, or are inferred from the project's Linear tracker bindings on the server side.

provides[] carries the substrate capabilities detected by the daemon at startup. Each entry has a `kind` field matching the platform v1 SubstrateCapabilityDeclaration.runtimeKinds enum (ADR-2026-05-12-capacity-pools-and-substrate-resolution.md §2). The platform stamps pool-level capability on worker_hosts.capabilities when this field is present. Omitted on older daemon versions; the platform falls back to provider-class defaults in that case.
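
A sketch of building the registration request by hand, to make the body/header split concrete. `registerRequest` is a trimmed copy of the documented struct (DaemonProjects omitted), `newRegisterRequest` is a hypothetical helper, and the `Content-Type` header, the base URL, and the token value are assumptions for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// ProvideCapability mirrors the documented wire shape.
type ProvideCapability struct {
	Kind string `json:"kind"`
}

// registerRequest is a trimmed copy of RegisterRequest.
type registerRequest struct {
	MachineID string              `json:"machineId,omitempty"`
	Hostname  string              `json:"hostname"`
	Capacity  int                 `json:"capacity"`
	Version   string              `json:"version,omitempty"`
	Projects  []string            `json:"projects,omitempty"`
	Provides  []ProvideCapability `json:"provides,omitempty"`
}

// newRegisterRequest encodes body as JSON and puts the registration token
// in the Authorization header, never in the body.
func newRegisterRequest(baseURL, token string, body registerRequest) (*http.Request, string, error) {
	raw, err := json.Marshal(body)
	if err != nil {
		return nil, "", err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/workers/register", bytes.NewReader(raw))
	if err != nil {
		return nil, "", err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json") // assumed, not documented
	return req, string(raw), nil
}

func main() {
	body := registerRequest{Hostname: "build-01", Capacity: 2,
		Provides: []ProvideCapability{{Kind: "native"}}}
	req, raw, err := newRegisterRequest("https://platform.example", "rsp_live_example", body)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
	fmt.Println(raw)
}
```

Fields tagged `omitempty` disappear from the payload when unset, which is how an older daemon that never populates `provides` stays wire-compatible.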

type RegisterResponse

type RegisterResponse struct {
	WorkerID              string `json:"workerId"`
	HeartbeatInterval     int    `json:"heartbeatInterval"` // ms
	PollInterval          int    `json:"pollInterval"`      // ms
	RuntimeToken          string `json:"runtimeToken"`
	RuntimeTokenExpiresAt string `json:"runtimeTokenExpiresAt,omitempty"`
}

RegisterResponse is the JSON response from POST /api/workers/register.

Platform contract:

{ workerId, heartbeatInterval (ms), pollInterval (ms),
  runtimeToken, runtimeTokenExpiresAt }

Field names mirror the wire shape; helper methods provide seconds-based accessors used by the heartbeat scheduler.

func Register

Register dials the platform (or takes the stub path) and returns a RegisterResponse. The cache at opts.JWTPath is consulted first unless opts.ForceReregister is set.

Real-platform registration is the default. The stub path is taken when:

  • RENSEI_DAEMON_FORCE_STUB env is set (e.g. =1), OR
  • the orchestrator URL is "file://...", OR
  • the registration token does not start with rsp_live_ or rsk_live_.

REN-1444 (v0.4.1) inverted the env-gate from opt-in to opt-out. The previous default required RENSEI_DAEMON_REAL_REGISTRATION=1 in the launchd plist; with that env unset, a daemon configured with a real rsk_live_* token would silently fall back to stub mode and never register against the platform.
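
The three stub-path conditions can be written as a single predicate. `useStubPath` is a hypothetical helper, not the package's function, and it assumes that "set" means a non-empty value for RENSEI_DAEMON_FORCE_STUB.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// useStubPath reports whether registration would take the stub path under
// the documented rules. forceStubEnv is the value of
// RENSEI_DAEMON_FORCE_STUB; treating any non-empty value as "set" is an
// assumption.
func useStubPath(forceStubEnv, orchestratorURL, token string) bool {
	if forceStubEnv != "" {
		return true
	}
	if strings.HasPrefix(orchestratorURL, "file://") {
		return true
	}
	realToken := strings.HasPrefix(token, "rsp_live_") || strings.HasPrefix(token, "rsk_live_")
	return !realToken
}

func main() {
	stub := useStubPath(os.Getenv("RENSEI_DAEMON_FORCE_STUB"),
		"https://platform.example", "rsk_live_example")
	fmt.Println("stub path:", stub)
}
```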

func (*RegisterResponse) HeartbeatIntervalSeconds

func (r *RegisterResponse) HeartbeatIntervalSeconds() int

HeartbeatIntervalSeconds returns the heartbeat cadence in seconds (rounded up). The platform reports the cadence in milliseconds; daemon code that schedules tickers historically worked in seconds.

func (*RegisterResponse) PollIntervalSeconds

func (r *RegisterResponse) PollIntervalSeconds() int

PollIntervalSeconds returns the poll cadence in seconds (rounded up).
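
Rounding up matters for sub-second remainders: truncating 1500 ms would give a 1 s ticker and poll faster than the platform asked for. A minimal sketch of the conversion, where `ceilSeconds` is a hypothetical helper and the handling of non-positive input is an assumption:

```go
package main

import "fmt"

// ceilSeconds converts a millisecond cadence to whole seconds, rounding up.
// Non-positive input returns 0; the package may handle that case differently.
func ceilSeconds(ms int) int {
	if ms <= 0 {
		return 0
	}
	return (ms + 999) / 1000
}

func main() {
	for _, ms := range []int{30000, 1500, 1} {
		fmt.Println(ms, "ms ->", ceilSeconds(ms), "s")
	}
}
```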

type RegistrationOptions

type RegistrationOptions struct {
	OrchestratorURL   string
	RegistrationToken string
	MachineID         string
	Hostname          string
	Version           string
	MaxAgents         int
	Capabilities      []string
	Region            string
	JWTPath           string
	ForceReregister   bool

	// Provides is the substrate capability set the daemon advertises to the
	// platform at registration time. Each entry corresponds to a
	// SubstrateCapabilityDeclaration.runtimeKinds value (e.g. "native",
	// "npm", "python-pip"). When nil the field is omitted from the wire
	// payload and the platform falls back to provider-class defaults.
	// (ADR-2026-05-12-capacity-pools-and-substrate-resolution.md §2, Stream H.)
	Provides []ProvideCapability

	// DaemonProjects is the structured allowlist reported to the platform
	// for read-only mirroring (Phase 1c of daemon-config-sync DESIGN).
	// Populate from cfg.Projects at the call site.
	DaemonProjects []ProjectAllowlistEntry

	// HTTPClient is the client used when the real (non-stub) path is taken.
	// Defaults to http.DefaultClient with a 10s timeout.
	HTTPClient *http.Client
	// Now lets tests deterministically clock the cached-at timestamp.
	Now func() time.Time
}

RegistrationOptions configure a single Register call.

type RegistrationStatus

type RegistrationStatus string

RegistrationStatus is the worker-status string sent to the orchestrator in the heartbeat payload. Mirrors the TS DaemonRegistrationStatus.

const (
	RegistrationIdle     RegistrationStatus = "idle"
	RegistrationBusy     RegistrationStatus = "busy"
	RegistrationDraining RegistrationStatus = "draining"
)

Registration status constants.

type ReservedSystemSpec

type ReservedSystemSpec struct {
	VCpu     int `yaml:"vCpu"     json:"vCpu"`
	MemoryMb int `yaml:"memoryMb" json:"memoryMb"`
}

ReservedSystemSpec describes resources reserved for the host OS.

type RoutingTraceStore

type RoutingTraceStore struct {
	// contains filtered or unexported fields
}

RoutingTraceStore is the in-process record of routing decisions. The scheduler (or, in this wave, the test harness) feeds it via RecordDecision; HTTP handlers read via GetConfig and Explain.

The store is safe for concurrent use.

func NewRoutingTraceStore

func NewRoutingTraceStore(ringSize int) *RoutingTraceStore

NewRoutingTraceStore constructs a store with the given ring-buffer size. ringSize ≤ 0 falls back to DefaultRoutingRingBufferSize.

func (*RoutingTraceStore) Explain

Explain returns the recorded decision and trace for sessionID. Returns false when the session has no recorded decision (or the decision has been evicted from the ring).

func (*RoutingTraceStore) GetConfig

func (s *RoutingTraceStore) GetConfig(providerNames []string, capturedAt time.Time) afclient.RoutingConfig

GetConfig builds the wire-shape RoutingConfig for the /api/daemon/routing/config endpoint. It composes the static portions (weights, capability filters, sandbox/LLM provider state) with the rolling RecentDecisions tail.

The provider-state surfaces are seeded from the runner.Registry's Names() (passed in via providerNames) — this represents AgentRuntime providers. The sandbox state lists only "local" because that's the only OSS-shipped sandbox in this wave. Both lists default to Thompson-Sampling priors (alpha=1, beta=1) when no decisions have been recorded.

capturedAt sets the snapshot timestamp; pass time.Now().UTC() in production.

func (*RoutingTraceStore) Len

func (s *RoutingTraceStore) Len() int

Len returns the current number of recorded decisions in the ring buffer. Test-only helper.

func (*RoutingTraceStore) RecordDecision

func (s *RoutingTraceStore) RecordDecision(decision afclient.RoutingDecision, trace []afclient.RoutingTraceStep)

RecordDecision appends decision + trace to the store. If the store is already at ring capacity, the oldest entry is evicted from both the ring and the per-session lookup. Recording with an empty SessionID is allowed (the ring still tracks it) but the explain lookup is keyed by SessionID, so an unkeyed entry is invisible to Explain.
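
The eviction semantics can be illustrated with a small ring plus lookup map. This is a simplified sketch, not the store's implementation: `decision` stands in for afclient.RoutingDecision, the trace is omitted, the fallback size of 64 is invented, and the sequence-number guard on eviction is a design choice of this example.

```go
package main

import (
	"fmt"
	"sync"
)

// decision is a hypothetical stand-in for afclient.RoutingDecision.
type decision struct {
	SessionID string
	Provider  string
}

type entry struct {
	seq uint64
	d   decision
}

type traceStore struct {
	mu        sync.Mutex
	size      int
	next      uint64
	ring      []entry
	bySession map[string]entry
}

func newTraceStore(size int) *traceStore {
	if size <= 0 {
		size = 64 // invented fallback; the real default's value is not shown
	}
	return &traceStore{size: size, bySession: make(map[string]entry)}
}

// record appends d, evicting the oldest entry when the ring is full.
func (s *traceStore) record(d decision) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.ring) == s.size {
		oldest := s.ring[0]
		s.ring = s.ring[1:]
		// Drop the lookup entry only if it still refers to the evicted record.
		if cur, ok := s.bySession[oldest.d.SessionID]; ok && cur.seq == oldest.seq {
			delete(s.bySession, oldest.d.SessionID)
		}
	}
	e := entry{seq: s.next, d: d}
	s.next++
	s.ring = append(s.ring, e)
	if d.SessionID != "" { // unkeyed entries stay in the ring only
		s.bySession[d.SessionID] = e
	}
}

// explain looks up the recorded decision for sessionID.
func (s *traceStore) explain(sessionID string) (decision, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.bySession[sessionID]
	return e.d, ok
}

// length returns the number of entries currently in the ring.
func (s *traceStore) length() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.ring)
}

func main() {
	s := newTraceStore(2)
	s.record(decision{SessionID: "a", Provider: "claude"})
	s.record(decision{Provider: "codex"}) // empty SessionID
	s.record(decision{SessionID: "c", Provider: "claude"})
	_, ok := s.explain("a")
	fmt.Println(s.length(), ok)
}
```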

type Server

type Server struct {
	// contains filtered or unexported fields
}

Server is the daemon's HTTP control API. It wraps a Daemon and exposes the endpoints consumed by `donmai daemon …` and `rensei daemon …`.

func NewServer

func NewServer(d *Daemon) *Server

NewServer builds an HTTP server for d. The handler is registered but the server is not yet listening — call Start to bind.

func (*Server) Addr

func (s *Server) Addr() string

Addr returns the address the server is bound to (after Start succeeds).

func (*Server) Shutdown

func (s *Server) Shutdown(ctx context.Context) error

Shutdown gracefully shuts down the HTTP server.

func (*Server) Start

func (s *Server) Start() (<-chan error, error)

Start binds the listener and serves in a goroutine. Errors during accept are reported via the returned channel — callers should select on it alongside their own shutdown signal.
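
The bind-then-serve shape and the caller-side select can be sketched with net/http directly. The code below does not use the daemon's Server; `start` and `runUntil` are hypothetical helpers, and the ephemeral loopback port and 2 s shutdown timeout are arbitrary choices for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// start binds addr synchronously so bind failures are returned directly,
// then serves in a goroutine and reports later failures on the channel.
func start(srv *http.Server, addr string) (string, <-chan error, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return "", nil, err
	}
	errCh := make(chan error, 1)
	go func() {
		// ErrServerClosed is the normal result of Shutdown, not a failure.
		if err := srv.Serve(ln); err != nil && !errors.Is(err, http.ErrServerClosed) {
			errCh <- err
		}
		close(errCh)
	}()
	return ln.Addr().String(), errCh, nil
}

// runUntil serves until stop fires or the server fails, whichever is first.
func runUntil(stop <-chan struct{}) error {
	srv := &http.Server{Handler: http.NotFoundHandler()}
	_, errCh, err := start(srv, "127.0.0.1:0")
	if err != nil {
		return err
	}
	select {
	case err := <-errCh:
		return err
	case <-stop:
		ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
		defer cancel()
		return srv.Shutdown(ctx)
	}
}

func main() {
	stop := make(chan struct{})
	go func() {
		time.Sleep(50 * time.Millisecond)
		close(stop)
	}()
	fmt.Println("shutdown error:", runUntil(stop))
}
```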

type SessionDetail

type SessionDetail struct {
	// SessionID is the platform session UUID. Always populated.
	SessionID string `json:"sessionId"`

	// IssueID is the Linear issue UUID this session was triggered for.
	IssueID string `json:"issueId,omitempty"`

	// IssueIdentifier is the human-readable Linear identifier
	// (e.g. "REN-1457").
	IssueIdentifier string `json:"issueIdentifier,omitempty"`

	// LinearSessionID is the Linear-side agent-session id.
	LinearSessionID string `json:"linearSessionId,omitempty"`

	// ProviderSessionID is the provider-native session id when this
	// is a resume (e.g. Claude session UUID).
	ProviderSessionID string `json:"providerSessionId,omitempty"`

	// ProjectName is the canonical Linear project identifier.
	ProjectName string `json:"projectName,omitempty"`

	// OrganizationID is the Rensei tenant UUID.
	OrganizationID string `json:"organizationId,omitempty"`

	// Repository is the git URL (or owner/name slug) the agent should
	// operate on.
	Repository string `json:"repository,omitempty"`

	// Ref is the base branch / ref to check out from.
	Ref string `json:"ref,omitempty"`

	// WorkType is the workflow discriminant ("development", "qa",
	// "research", ...).
	WorkType string `json:"workType,omitempty"`

	// PromptContext is the rendered Linear issue context block produced
	// by the platform-side dispatcher.
	PromptContext string `json:"promptContext,omitempty"`

	// Body is the raw Linear issue description text.
	Body string `json:"body,omitempty"`

	// Title is the Linear issue title.
	Title string `json:"title,omitempty"`

	// MentionContext is the optional user-mention text from the Linear
	// agent-session create event.
	MentionContext string `json:"mentionContext,omitempty"`

	// ParentContext is the optional parent-issue context block built
	// by the coordinator when this session is a sub-agent.
	ParentContext string `json:"parentContext,omitempty"`

	// Branch is the working branch the agent should create/use.
	Branch string `json:"branch,omitempty"`

	// ResolvedProfile carries the model-profile knobs the platform
	// resolved before queueing this work. Daemon stores opaquely.
	ResolvedProfile *SessionResolvedProfile `json:"resolvedProfile,omitempty"`

	// ModelProfile is the richer, fully-rendered model-profile the
	// platform passes with each dispatch when workType+model-profile
	// routing is active (ADR-2026-05-12-worktype-and-model-profile-
	// routing). When present it supersedes ResolvedProfile.Provider /
	// Model / Effort in the runner. Forwarded opaquely by the daemon.
	ModelProfile *SessionModelProfile `json:"modelProfile,omitempty"`

	// WorkerID is the daemon worker id that claimed this session.
	WorkerID string `json:"workerId,omitempty"`

	// AuthToken is the runtime JWT the runner uses for platform API
	// calls (heartbeat, result post). Scoped to this worker.
	AuthToken string `json:"authToken,omitempty"`

	// PlatformURL is the base URL of the platform.
	PlatformURL string `json:"platformUrl,omitempty"`

	// StagePrompt is the pre-rendered user-prompt body the platform
	// dispatcher built from the stage prompt template. When present
	// the runner uses it verbatim and skips the embedded user template.
	StagePrompt string `json:"stagePrompt,omitempty"`

	// StageID is the canonical stage id (e.g. "research",
	// "development", "qa"). Used for log correlation + env injection.
	StageID string `json:"stageId,omitempty"`

	// StageBudget is the per-stage runtime budget the runner enforces.
	StageBudget *PollStageBudget `json:"stageBudget,omitempty"`

	// StageLifecycle is the lifecycle config for the workflow this
	// stage instance belongs to. Forwarded opaquely on WORK_RESULT.
	StageLifecycle map[string]any `json:"stageLifecycle,omitempty"`

	// StageSourceEventID is the source CloudEvent id the stage trigger
	// normaliser emitted. Carried for end-to-end audit correlation.
	StageSourceEventID string `json:"stageSourceEventId,omitempty"`

	// SystemPromptOverride forwards the per-session platform-supplied
	// system prompt from PollWorkItem onto the runner's QueuedWork.
	// Read by `prompt/builder.go` (already wired) — this field closes
	// the daemon→runner wire-shape gap. SUP-1840 precedent.
	SystemPromptOverride string `json:"systemPromptOverride,omitempty"`

	// DisallowedTools forwards the platform-stamped credential-surface
	// tool restrictions from PollWorkItem onto the runner's QueuedWork.
	// Consumed by runner/spec_translation.go (already wired via 70bf4c0)
	// — this field closes the daemon→runner wire-shape gap.
	// Mirror of the v0.9.3 SystemPromptOverride fix.
	DisallowedTools []string `json:"disallowedTools,omitempty"`
}

SessionDetail is the per-session payload `donmai agent run` reads from the daemon's local control HTTP API on spawn. It carries the full runner-side QueuedWork shape (issue context, resolved profile, branch) plus the platform-side credentials the runner needs to talk back (auth token, platform URL, worker id).

The daemon stores one SessionDetail per accepted session in an in-memory map. A spawned `donmai agent run` process fetches its detail via GET /api/daemon/sessions/<id> at start-up.

Wire shape: JSON, camelCase tags. Forward-compat — new fields can be added freely; clients ignore unknown fields.
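
A sketch of the fetch from the worker's side, using an in-memory handler instead of the real daemon. `sessionDetail` is a deliberately partial client-side struct, `sessionsHandler` and `fetchDetail` are hypothetical, and the 404 for unknown ids is an assumption. The server here emits a field the client struct does not declare, which decodes cleanly, illustrating the forward-compatibility note.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// sessionDetail is a partial client-side view of SessionDetail.
type sessionDetail struct {
	SessionID string `json:"sessionId"`
	Branch    string `json:"branch,omitempty"`
	WorkerID  string `json:"workerId,omitempty"`
}

const sessionsPrefix = "/api/daemon/sessions/"

// sessionsHandler serves raw JSON documents from an in-memory map.
func sessionsHandler(store map[string]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		doc, ok := store[strings.TrimPrefix(r.URL.Path, sessionsPrefix)]
		if !ok {
			http.NotFound(w, r) // assumed behaviour for unknown ids
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, doc)
	})
}

// fetchDetail issues GET /api/daemon/sessions/<id> against h in-process.
func fetchDetail(h http.Handler, id string) (sessionDetail, int, error) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, sessionsPrefix+id, nil))
	var d sessionDetail
	if rec.Code != http.StatusOK {
		return d, rec.Code, nil
	}
	err := json.NewDecoder(rec.Body).Decode(&d)
	return d, rec.Code, err
}

// sampleStore holds one invented session; stageId is unknown to the client.
func sampleStore() map[string]string {
	return map[string]string{
		"s-1": `{"sessionId":"s-1","branch":"feat/x","workerId":"w-9","stageId":"qa"}`,
	}
}

func main() {
	d, code, err := fetchDetail(sessionsHandler(sampleStore()), "s-1")
	fmt.Println(code, d.Branch, err)
}
```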

type SessionEvent

type SessionEvent struct {
	Kind    SessionEventKind
	Handle  SessionHandle
	Spec    SessionSpec
	ExitErr error
}

SessionEvent is emitted on the spawner's events channel.

type SessionEventKind

type SessionEventKind string

SessionEventKind identifies the kind of SessionEvent.

const (
	SessionEventStarted SessionEventKind = "started"
	SessionEventEnded   SessionEventKind = "ended"
)

Session event kind constants.

type SessionHandle

type SessionHandle struct {
	SessionID  string       `json:"sessionId"`
	PID        int          `json:"pid"`
	AcceptedAt string       `json:"acceptedAt"`
	State      SessionState `json:"state"`
}

SessionHandle is the daemon-side handle for an in-flight session.

type SessionModelProfile

type SessionModelProfile struct {
	// ID is the model_profile row UUID (e.g. "mp_01jt5...").
	ID string `json:"id"`

	// ProviderID is the canonical provider family (e.g. "claude", "codex",
	// "gemini", "ollama").
	ProviderID string `json:"providerId"`

	// Model is the model variant within the provider family.
	Model string `json:"model"`

	// Mode is the reasoning-effort/speed tier (e.g. "xhigh").
	Mode string `json:"mode,omitempty"`

	// Context is the context-window size in tokens required for this
	// dispatch. Zero means "use the model default".
	Context int `json:"context,omitempty"`

	// MaxOutputTokens is the per-response output-token budget. Zero
	// means "use the model default".
	MaxOutputTokens int `json:"maxOutputTokens,omitempty"`
}

SessionModelProfile mirrors runner.ResolvedModelProfile but lives in the daemon package to avoid an import cycle. It carries the richer fully-rendered model-profile the platform resolves via the three-axis workType + model-profile routing algorithm (ADR-2026-05-12-worktype-and-model-profile-routing). The daemon forwards it opaquely; `donmai agent run` bridges it into runner.ResolvedModelProfile via detailToQueuedWork.

type SessionResolvedProfile

type SessionResolvedProfile struct {
	Provider       string         `json:"provider,omitempty"`
	Runner         string         `json:"runner,omitempty"`
	Model          string         `json:"model,omitempty"`
	Effort         string         `json:"effort,omitempty"`
	CredentialID   string         `json:"credentialId,omitempty"`
	ProviderConfig map[string]any `json:"providerConfig,omitempty"`
}

SessionResolvedProfile mirrors runner.ResolvedProfile but lives in the daemon package to avoid an import cycle (the daemon package must stay independent of the runner package — `donmai agent run` constructs its own runner from this opaque payload).

type SessionResources

type SessionResources struct {
	VCpu     int `json:"vCpu,omitempty"`
	MemoryMB int `json:"memoryMb,omitempty"`
}

SessionResources is the optional resource request on a SessionSpec.

type SessionSpec

type SessionSpec struct {
	SessionID          string            `json:"sessionId"`
	Repository         string            `json:"repository"`
	Ref                string            `json:"ref"`
	Resources          *SessionResources `json:"resources,omitempty"`
	Env                map[string]string `json:"env,omitempty"`
	MaxDurationSeconds int               `json:"maxDurationSeconds,omitempty"`
}

SessionSpec is an inbound work specification dispatched by the orchestrator. Subset of SandboxSpec from 004 relevant to the daemon's session-dispatch path.

type SessionState

type SessionState string

SessionState is the lifecycle of a single worker child process spawned for an accepted session.

const (
	SessionStarting   SessionState = "starting"
	SessionRunning    SessionState = "running"
	SessionCompleted  SessionState = "completed"
	SessionFailed     SessionState = "failed"
	SessionTerminated SessionState = "terminated"
)

Session state constants.

type SpawnerOptions

type SpawnerOptions struct {
	Projects              []ProjectConfig
	MaxConcurrentSessions int
	// WorkerCommand is the command to run for each accepted session. The
	// caller may pass arbitrary args; the session-specific environment is
	// added on top of os.Environ() at spawn time.
	//
	// When empty, a short-lived /bin/sh stub is used that prints
	// "session-started:<id>" and exits 0 — sufficient for testing the
	// daemon's accept/lifecycle path without launching real worker binaries.
	WorkerCommand []string
	// BaseEnv is the environment injected into every worker process.
	BaseEnv map[string]string

	// OnPreSpawn is an optional hook invoked once per spawn, immediately
	// before the child process is exec'd. It receives the final SessionSpec
	// and the env slice that would otherwise be exec'd, and returns the env
	// slice that will actually be exec'd. Returning nil is equivalent to
	// returning the input unchanged.
	//
	// Callers may use this to layer per-session env entries (e.g.,
	// credentials resolved at spawn time) over the spawner's BaseEnv.
	// BaseEnv is set once at spawner construction and cannot express
	// per-session values; this hook is the extension point for callers
	// that need to compute env entries from the inbound SessionSpec.
	//
	// The hook runs AFTER the BaseEnv + SessionSpec.Env composition, so
	// returned entries can both add new keys and override BaseEnv keys.
	//
	// The hook MUST NOT block on I/O paths that can hang indefinitely.
	// Spawn latency budget is on the order of 250ms; if the hook needs
	// to do I/O, it should have its own timeout.
	OnPreSpawn func(spec SessionSpec, env []string) []string

	// Now lets tests deterministically clock acceptedAt timestamps.
	Now func() time.Time
	// StdoutPrefixWriter is where worker stdout is forwarded with a
	// "[worker:<id>]" prefix. Defaults to os.Stdout. Set to io.Discard in
	// tests.
	StdoutPrefixWriter PrefixedWriter
	StderrPrefixWriter PrefixedWriter
}

SpawnerOptions configure a WorkerSpawner.
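
The OnPreSpawn contract (runs after BaseEnv + SessionSpec.Env composition, nil means unchanged, returned entries may override earlier keys) can be sketched as follows. `composeEnv`, `effective`, and `tokenHook` are hypothetical, the `SessionSpec` here is trimmed to two fields, and `os.Environ()` is left out to keep the output deterministic. The override relies on os/exec using only the last value for a duplicated key in `Cmd.Env`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// SessionSpec is trimmed to the two fields this sketch needs.
type SessionSpec struct {
	SessionID string
	Env       map[string]string
}

// appendSorted appends KEY=VALUE entries in key order so output is stable.
func appendSorted(env []string, m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		env = append(env, k+"="+m[k])
	}
	return env
}

// composeEnv layers BaseEnv, then SessionSpec.Env, then the hook's result.
// A nil hook, or a hook returning nil, leaves the composed env unchanged.
func composeEnv(base map[string]string, spec SessionSpec, hook func(SessionSpec, []string) []string) []string {
	env := appendSorted(nil, base)
	env = appendSorted(env, spec.Env)
	if hook != nil {
		if out := hook(spec, env); out != nil {
			env = out
		}
	}
	return env
}

// effective returns the value a child would see for key: the last entry
// wins, which is how os/exec resolves duplicate keys in Cmd.Env.
func effective(env []string, key string) string {
	val := ""
	for _, kv := range env {
		if k, v, ok := strings.Cut(kv, "="); ok && k == key {
			val = v
		}
	}
	return val
}

// tokenHook is a hypothetical per-session hook that overrides API_TOKEN.
func tokenHook(spec SessionSpec, env []string) []string {
	return append(env, "API_TOKEN=token-for-"+spec.SessionID)
}

func main() {
	base := map[string]string{"API_TOKEN": "shared", "LOG_LEVEL": "info"}
	env := composeEnv(base, SessionSpec{SessionID: "s-1"}, tokenHook)
	fmt.Println(effective(env, "API_TOKEN"), effective(env, "LOG_LEVEL"))
}
```

A real hook that resolves credentials over the network would wrap that call in its own timeout, in line with the latency budget noted in the field comment.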

type State

type State string

State is the lifecycle state of a Daemon instance.

const (
	StateStopped  State = "stopped"
	StateStarting State = "starting"
	StateRunning  State = "running"
	StatePaused   State = "paused"
	StateDraining State = "draining"
	StateUpdating State = "updating"
)

Lifecycle state constants.

type TrustConfig

type TrustConfig struct {
	// Mode is one of permissive | signed-by-allowlist | attested.
	// Empty defaults to permissive (set by applyDefaults).
	Mode TrustMode `yaml:"mode,omitempty" json:"mode,omitempty"`

	// IssuerSet is an OPTIONAL allowlist of OIDC subject identities
	// (Fulcio SAN) the operator considers trusted. Empty = trust any
	// signer the embedded trust root can validate (the bundle's chain
	// must still verify; this just skips the SAN allowlist filter).
	IssuerSet []string `yaml:"issuerSet,omitempty" json:"issuerSet,omitempty"`

	// Actor is the operator-declared identity used in the trustOverride
	// audit log entry. When empty the actor falls back to
	// fmt.Sprintf("uid:%d", os.Getuid()) per coordinator decision
	// Q-audit-2 (2026-05-07). The override is also timestamped and
	// names the kitId + signerId, so this field is best-effort.
	Actor string `yaml:"actor,omitempty" json:"actor,omitempty"`
}

TrustConfig is the daemon-wide trust policy. Lives on Config (NOT on KitConfig) per audit § 1.2: the trust mode applies across plugin families per 015-plugin-spec.md § "Auth + trust", not just kits.

type TrustMode

type TrustMode string

TrustMode is the operator-configured policy for how the install gate reacts to verifier outcomes.

const (
	// TrustModePermissive allows install regardless of verifier outcome.
	// The verifier still runs and the trust state is reported; this
	// matches OSS-execution-layer expectations vs the npm/pip/cargo
	// precedent. Default per Q2 of WAVE12_PLAN.
	TrustModePermissive TrustMode = "permissive"
	// TrustModeSignedByAllowlist rejects unsigned and unverified kits at
	// install time; verified-signed kits whose signer matches the
	// configured issuer set install normally.
	TrustModeSignedByAllowlist TrustMode = "signed-by-allowlist"
	// TrustModeAttested is allowlist + (future) SLSA attestation-graph
	// requirement. Wave 12 treats it as an alias for allowlist; the
	// attestation requirement lands in Wave 13+ alongside the SLSA
	// provenance parser.
	TrustModeAttested TrustMode = "attested"
)

Trust modes accepted on daemon.yaml `trust.mode`.
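To make the three modes concrete, here is a rough sketch of how an install gate might combine the mode, the issuer set, and a verifier outcome. The `verifierOutcome` type and the `allowInstall` function are hypothetical and exist only for this example; the real gate may differ, in particular in how it treats unknown modes.

```go
package main

import "fmt"

// TrustMode and its constants are redeclared so the sketch compiles alone.
type TrustMode string

const (
	TrustModePermissive        TrustMode = "permissive"
	TrustModeSignedByAllowlist TrustMode = "signed-by-allowlist"
	TrustModeAttested          TrustMode = "attested"
)

// verifierOutcome is a hypothetical summary of what the verifier reported.
type verifierOutcome struct {
	Verified bool   // signature chain validated against the trust root
	SignerID string // signer identity; empty for unsigned kits
}

// allowInstall is a hypothetical gate, not the package's implementation.
func allowInstall(mode TrustMode, issuerSet []string, out verifierOutcome) bool {
	switch mode {
	case "", TrustModePermissive:
		// Empty defaults to permissive; install proceeds regardless.
		return true
	case TrustModeSignedByAllowlist, TrustModeAttested:
		// Wave 12 treats attested as an alias for the allowlist mode.
		if !out.Verified {
			return false
		}
		if len(issuerSet) == 0 {
			return true // empty set: any validated signer is accepted
		}
		for _, id := range issuerSet {
			if id == out.SignerID {
				return true
			}
		}
		return false
	default:
		return false // assumption: unknown modes fail closed
	}
}

func main() {
	signed := verifierOutcome{Verified: true, SignerID: "release@example.com"}
	fmt.Println(allowInstall(TrustModePermissive, nil, verifierOutcome{}))
	fmt.Println(allowInstall(TrustModeSignedByAllowlist, nil, verifierOutcome{}))
	fmt.Println(allowInstall(TrustModeSignedByAllowlist, []string{"other@example.com"}, signed))
	fmt.Println(allowInstall(TrustModeAttested, []string{"release@example.com"}, signed))
}
```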

type UpdateChannel

type UpdateChannel string

UpdateChannel is the release channel for the auto-updater.

const (
	ChannelStable UpdateChannel = "stable"
	ChannelBeta   UpdateChannel = "beta"
	ChannelMain   UpdateChannel = "main"
)

Update channel constants.

type UpdateResult

type UpdateResult struct {
	Updated bool   `json:"updated"`
	Version string `json:"version"`
	Reason  string `json:"reason"`
}

UpdateResult describes the outcome of a RunUpdate call.

type UpdateSchedule

type UpdateSchedule string

UpdateSchedule is the cadence the supervisor wakes the daemon to check.

const (
	ScheduleNightly   UpdateSchedule = "nightly"
	ScheduleOnRelease UpdateSchedule = "on-release"
	ScheduleManual    UpdateSchedule = "manual"
)

Update schedule constants.

type Updater

type Updater struct {
	// contains filtered or unexported fields
}

Updater runs the full update flow: check → fetch → verify → swap → restart.

func NewUpdater

func NewUpdater(opts UpdaterOptions) *Updater

NewUpdater returns an Updater with sane defaults.

func (*Updater) BuildBinaryURL

func (u *Updater) BuildBinaryURL(channel UpdateChannel, version string) string

BuildBinaryURL returns the binary URL for a channel/version.

func (*Updater) BuildManifestURL

func (u *Updater) BuildManifestURL(channel UpdateChannel) string

BuildManifestURL returns the manifest URL for a channel.

func (*Updater) BuildSignatureURL

func (u *Updater) BuildSignatureURL(binURL string) string

BuildSignatureURL returns the signature URL for a binary URL.

func (*Updater) CheckForUpdate

func (u *Updater) CheckForUpdate(ctx context.Context) (*VersionManifest, error)

CheckForUpdate fetches the version manifest and returns it iff a strictly newer version is available. Returns (nil, nil) when up-to-date.
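The "strictly newer" check could look roughly like the sketch below. It assumes plain `vMAJOR.MINOR.PATCH` version strings with no pre-release suffix; the actual comparison rules are not shown in this section, and `parseVersion` and `isStrictlyNewer` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "v1.2.3" or "1.2.3" into three integers.
// Pre-release and build suffixes are not handled in this sketch.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("want MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("bad component %q in %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// isStrictlyNewer reports whether candidate is greater than current.
// Equal versions return false, matching the "up-to-date" case.
func isStrictlyNewer(current, candidate string) (bool, error) {
	cur, err := parseVersion(current)
	if err != nil {
		return false, err
	}
	cand, err := parseVersion(candidate)
	if err != nil {
		return false, err
	}
	for i := range cur {
		if cand[i] != cur[i] {
			return cand[i] > cur[i], nil
		}
	}
	return false, nil
}

func main() {
	for _, candidate := range []string{"v0.10.0", "v0.9.5", "v0.9.4"} {
		newer, err := isStrictlyNewer("v0.9.5", candidate)
		fmt.Println(candidate, newer, err)
	}
}
```

Components are compared numerically, so `v0.10.0` ranks above `v0.9.5`, which a plain string comparison would get wrong.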

func (*Updater) RunUpdate

func (u *Updater) RunUpdate(ctx context.Context) (*UpdateResult, error)

RunUpdate executes the complete update flow. When successful and SkipExit is false, it calls ExitFn(ExitCodeRestart) and does not return.

type UpdaterOptions

type UpdaterOptions struct {
	CurrentVersion    string
	CurrentBinaryPath string
	Config            AutoUpdateConfig

	// HTTPClient is the client used to fetch the manifest, binary, and
	// signature. Defaults to a 60s-timeout client.
	HTTPClient *http.Client
	// Verifier is the binary-signature verifier. Defaults to
	// alwaysFailVerifier (production-safe — no real swaps until configured).
	Verifier BinaryVerifier
	// SkipExit, when true, prevents the swap step from calling os.Exit. Used
	// by tests and by callers that want to handle the restart explicitly.
	SkipExit bool
	// ExitFn allows tests to inject a fake exit. Called only when SkipExit
	// is false. Defaults to os.Exit.
	ExitFn func(int)
	// CDNBase overrides UpdateCDNBase (test injection).
	CDNBase string
	// PlatformSuffix overrides the auto-detected suffix (test injection).
	PlatformSuffix string
}

UpdaterOptions configure an Updater.
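The SkipExit and ExitFn fields follow a common pattern for making a process exit testable. The sketch below shows that pattern in isolation with a local `options` type; `exitCodeRestart` is a placeholder value chosen for the example, since the package's ExitCodeRestart constant is not shown in this section.

```go
package main

import (
	"fmt"
	"os"
)

// exitCodeRestart is a placeholder; the real ExitCodeRestart may differ.
const exitCodeRestart = 75

// options mirrors only the two exit-related fields of UpdaterOptions.
type options struct {
	SkipExit bool
	ExitFn   func(int)
}

// finishSwap is a hypothetical final step of an update: it exits with the
// restart code unless the caller opted out via SkipExit.
func finishSwap(o options) {
	if o.SkipExit {
		return // caller handles the restart itself
	}
	exit := o.ExitFn
	if exit == nil {
		exit = os.Exit // documented default
	}
	exit(exitCodeRestart)
}

func main() {
	var codes []int
	record := func(code int) { codes = append(codes, code) }

	finishSwap(options{ExitFn: record})                 // fake exit is called
	finishSwap(options{SkipExit: true, ExitFn: record}) // no exit at all
	fmt.Println("recorded exit codes:", codes)
}
```

Injecting the exit function lets a test assert on the exit code without terminating the test binary.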

type VersionManifest

type VersionManifest struct {
	Version    string `json:"version"`
	SHA256     string `json:"sha256"`
	ReleasedAt string `json:"releasedAt"`
}

VersionManifest is the schema of <channel>/latest.json.

type WizardOptions

type WizardOptions struct {
	// Existing is an existing config (if any) used as defaults.
	Existing *Config
	// ConfigPath is where to write the resulting config. Empty means do not
	// persist.
	ConfigPath string
	// Stdin is the TTY input. Defaults to os.Stdin.
	Stdin io.Reader
	// Stdout is where prompts are printed. Defaults to os.Stdout.
	Stdout io.Writer
	// IsTTY overrides the auto-detected TTY status. When false (and not
	// explicitly set true), the wizard returns the default config without
	// prompting.
	IsTTY *bool
	// SkipWizard, when true, returns DefaultConfig (or Existing) without
	// prompting. Mirrors the RENSEI_DAEMON_SKIP_WIZARD env var.
	SkipWizard bool
	// CPUCount overrides runtime.NumCPU() (test injection).
	CPUCount int
	// MemoryMB overrides total-memory detection (test injection). 0 means
	// "use a sensible default".
	MemoryMB int
	// DetectGitRemote returns the cwd's git remote URL or "" if none. Tests
	// inject a stub.
	DetectGitRemote func() string
}

WizardOptions configure the interactive setup wizard.
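The interaction between SkipWizard and the `*bool` IsTTY override can be read as a small decision function. The sketch below is one plausible reading of the field comments; `shouldPrompt` is a hypothetical name and the `detected` parameter stands in for whatever auto-detection the wizard performs.

```go
package main

import "fmt"

// shouldPrompt is a hypothetical helper: it decides whether the wizard
// would prompt interactively.
func shouldPrompt(skipWizard bool, isTTY *bool, detected bool) bool {
	if skipWizard {
		return false // SkipWizard always wins
	}
	if isTTY != nil {
		return *isTTY // explicit override replaces auto-detection
	}
	return detected // nil pointer: fall back to detection
}

func main() {
	yes, no := true, false
	fmt.Println(shouldPrompt(false, nil, true))  // detected TTY, no override
	fmt.Println(shouldPrompt(false, &no, true))  // override forces no prompt
	fmt.Println(shouldPrompt(false, &yes, false)) // override forces prompt
	fmt.Println(shouldPrompt(true, &yes, true))  // skip beats everything
}
```

A pointer is used rather than a plain bool so that "not set" can be told apart from an explicit false.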

type WorkareaArchiveOptions

type WorkareaArchiveOptions struct {
	// Root is the directory the registry scans. Empty selects the
	// default ~/.rensei/workareas.
	Root string
	// ActiveProvider is the live pool view; may be nil (archives-only
	// list, see ActiveWorkareaProvider).
	ActiveProvider ActiveWorkareaProvider
	// PoolGuard is consulted on Restore. May be nil — restore proceeds
	// without a saturation check.
	PoolGuard PoolCapacityGuard
}

WorkareaArchiveOptions configures a registry.

type WorkareaArchiveRegistry

type WorkareaArchiveRegistry struct {
	// contains filtered or unexported fields
}

WorkareaArchiveRegistry is the on-disk archive index. Construct via NewWorkareaArchiveRegistry. Methods are safe for concurrent use.

func NewWorkareaArchiveRegistry

func NewWorkareaArchiveRegistry(opts WorkareaArchiveOptions) *WorkareaArchiveRegistry

NewWorkareaArchiveRegistry constructs a registry against the given archive root. The directory is NOT created at construction time — missing-or-empty roots return an empty list (HTTP 200) per ADR D4a.

func (*WorkareaArchiveRegistry) CountDiff

func (r *WorkareaArchiveRegistry) CountDiff(idA, idB string) (int, error)

CountDiff returns the number of differing entries between two archives without buffering or streaming them. The handler uses this to pick JSON vs NDJSON before opening the response stream.
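A handler could use the count to choose its response format along these lines. This is a sketch under assumptions: `pickDiffContentType` is a hypothetical helper, the media type strings are the conventional ones rather than values confirmed by the source, and the default of 1000 comes from the WorkareaConfig.DiffStreamingThreshold comment.

```go
package main

import "fmt"

// defaultDiffStreamingThreshold follows the documented default of 1000.
const defaultDiffStreamingThreshold = 1000

// pickDiffContentType is a hypothetical helper: it returns the media type
// a handler might use for a diff with the given number of entries.
func pickDiffContentType(count, threshold int) string {
	if threshold <= 0 {
		threshold = defaultDiffStreamingThreshold // unset config value
	}
	if count > threshold {
		return "application/x-ndjson" // assumed NDJSON media type
	}
	return "application/json"
}

func main() {
	fmt.Println(pickDiffContentType(12, 0))
	fmt.Println(pickDiffContentType(1000, 0))
	fmt.Println(pickDiffContentType(1001, 0))
	fmt.Println(pickDiffContentType(50, 10))
}
```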

func (*WorkareaArchiveRegistry) Diff

Diff returns the structured per-path delta between two archives. Both ids MUST resolve to archives (live diffs are out of scope per ADR D4a). Walks are deterministic — entries are sorted by path. The well-known .rensei/ subtree under each archive's tree/ root is excluded.

func (*WorkareaArchiveRegistry) DiffStream

DiffStream emits diff entries through the supplied callback as they are computed. The callback receives one entry at a time; if it returns a non-nil error the walk halts and the error is returned. After all entries are emitted DiffStream returns the aggregate summary so callers can write the trailing NDJSON line.

The streaming variant exists so the HTTP handler can switch its Content-Type on entry count without buffering the entire diff.

func (*WorkareaArchiveRegistry) Get

Get returns the full archive record for the named id. The Workarea Kind field is set to WorkareaKindArchived. Returns ErrArchiveNotFound when the id is absent.

func (*WorkareaArchiveRegistry) List

func (r *WorkareaArchiveRegistry) List() (active, archived []afclient.WorkareaSummary, err error)

List walks the archive root and returns the union of on-disk archives (ordered deterministically by id) and the active pool members reported by the configured ActiveWorkareaProvider, if any. A missing or empty root is NOT an error; the response is just (empty active + empty archived).

func (*WorkareaArchiveRegistry) Restore

Restore materialises an archive into a fresh active pool member. The returned Workarea has Kind=Active and a NEW id distinct from the archive id (archives are immutable per ADR D4a). The tree/ subtree is copied to a per-restore directory under the archive root's sibling "restored/" so operators can find the materialised state from the daemon's host filesystem.

IntoSessionID conflicts return ErrConflict; saturation returns ErrUnavailable + a non-zero retryAfter; corrupted archives return ErrArchiveCorrupted; missing archives return ErrArchiveNotFound.

func (*WorkareaArchiveRegistry) Root

func (r *WorkareaArchiveRegistry) Root() string

Root returns the archive root directory the registry scans. Exposed for tests and operators surfacing the path.

type WorkareaConfig

type WorkareaConfig struct {
	// ArchiveRoot is the directory the daemon scans for archived workareas.
	// Default ~/.rensei/workareas (resolved at runtime by the handler if
	// empty).
	ArchiveRoot string `yaml:"archiveRoot,omitempty" json:"archiveRoot,omitempty"`
	// DiffStreamingThreshold is the entry count above which the diff
	// endpoint switches from a single JSON envelope to NDJSON streaming.
	// Default 1000 per ADR D4a.
	DiffStreamingThreshold int `yaml:"diffStreamingThreshold,omitempty" json:"diffStreamingThreshold,omitempty"`
}

WorkareaConfig configures the Layer-3 workarea operator surface — archive root scan path, diff streaming threshold. Wave 9 / ADR-2026-05-07.

type WorkerSpawner

type WorkerSpawner struct {
	// contains filtered or unexported fields
}

WorkerSpawner manages the lifecycle of worker child processes.

func NewWorkerSpawner

func NewWorkerSpawner(opts SpawnerOptions) *WorkerSpawner

NewWorkerSpawner constructs a spawner. Workers will not be spawned until AcceptWork is called.

func (*WorkerSpawner) AcceptWork

func (s *WorkerSpawner) AcceptWork(spec SessionSpec) (*SessionHandle, error)

AcceptWork validates the spec, spawns a worker, and returns its handle.

func (*WorkerSpawner) ActiveCount

func (s *WorkerSpawner) ActiveCount() int

ActiveCount returns the number of in-flight sessions.

func (*WorkerSpawner) ActiveSessions

func (s *WorkerSpawner) ActiveSessions() []SessionHandle

ActiveSessions returns a snapshot of the current session handles.

func (*WorkerSpawner) ActiveWorkareas

func (s *WorkerSpawner) ActiveWorkareas() []afclient.WorkareaSummary

ActiveWorkareas projects the spawner's in-flight sessions onto the canonical afclient.WorkareaSummary wire shape so the WorkareaArchiveRegistry can union live-pool members with on-disk archives in the GET /api/daemon/workareas response (Wave 11 / S5; ADR-2026-05-07-daemon-http-control-api.md §D4a).

The projection is pull-based — the spawner holds no separate workarea map; each call materialises summaries from the live `sessions` map under the same `mu` lock that ActiveSessions uses. ProjectID is resolved via the project allowlist using the same matcher AcceptWork applies. The summary's ID is the spawner's session id so /api/daemon/workareas/<id> reaches the live entry.

Output is sorted by SessionID for deterministic test assertions.
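A pull-based projection of this kind might be structured as below. Everything here is a simplified stand-in: the `session` and `workareaSummary` types, the repository-to-project map, and the method name are hypothetical, and the real matcher used by AcceptWork is not shown in this section.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// session and workareaSummary are hypothetical, reduced shapes.
type session struct {
	ID         string
	Repository string
}

type workareaSummary struct {
	ID        string
	ProjectID string
}

type spawner struct {
	mu       sync.Mutex
	sessions map[string]session
	projects map[string]string // repository -> project id (assumed matcher)
}

// activeWorkareas builds summaries on demand from the live sessions map,
// holding the same lock that guards the map, and sorts by session id.
func (s *spawner) activeWorkareas() []workareaSummary {
	s.mu.Lock()
	defer s.mu.Unlock()

	out := make([]workareaSummary, 0, len(s.sessions))
	for id, sess := range s.sessions {
		out = append(out, workareaSummary{ID: id, ProjectID: s.projects[sess.Repository]})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}

func main() {
	s := &spawner{
		sessions: map[string]session{
			"sess-2": {ID: "sess-2", Repository: "github.com/acme/web"},
			"sess-1": {ID: "sess-1", Repository: "github.com/acme/api"},
		},
		projects: map[string]string{
			"github.com/acme/api": "proj-api",
			"github.com/acme/web": "proj-web",
		},
	}
	for _, w := range s.activeWorkareas() {
		fmt.Println(w.ID, w.ProjectID)
	}
}
```

Sorting after collection matters because Go map iteration order is randomised.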

func (*WorkerSpawner) Drain

func (s *WorkerSpawner) Drain(timeout time.Duration) error

Drain waits for all in-flight sessions to exit, then returns nil. If the timeout elapses first, the remaining sessions receive SIGTERM via context cancellation and the function returns an error indicating how many were forcibly stopped.

func (*WorkerSpawner) IsAccepting

func (s *WorkerSpawner) IsAccepting() bool

IsAccepting reports whether the spawner is currently accepting work.

func (*WorkerSpawner) On

func (s *WorkerSpawner) On(fn func(SessionEvent))

On registers a session-event listener. Listeners are invoked synchronously from the spawner goroutine; do not block them.

func (*WorkerSpawner) Pause

func (s *WorkerSpawner) Pause()

Pause stops accepting new work but leaves running sessions alive.

func (*WorkerSpawner) Resume

func (s *WorkerSpawner) Resume()

Resume restores accepting state.

func (*WorkerSpawner) SetMaxConcurrentSessions

func (s *WorkerSpawner) SetMaxConcurrentSessions(n int) error

SetMaxConcurrentSessions updates the local session capacity used for future AcceptWork decisions. Existing sessions are never interrupted.

func (*WorkerSpawner) SetProjects

func (s *WorkerSpawner) SetProjects(projects []ProjectConfig)

SetProjects atomically swaps the spawner's project allowlist used by AcceptWork's findProjectLocked check. Existing in-flight sessions continue against whichever project they were dispatched under — the new list governs only future AcceptWork calls.

Phase 2c of 2026-05-18-daemon-config-sync-DESIGN.md — wired by the mutation-applier so platform-driven project.add / project.remove proposals take effect on the very next claim without a daemon restart.

A defensive copy is taken so subsequent mutations to the caller's slice (e.g. daemon.go reusing a single buffer) don't race the spawner.
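The defensive-copy behaviour is easy to demonstrate in isolation. The sketch below uses a hypothetical `projectConfig` with a single field and a local `spawner` type; it shows that mutating the caller's slice after the swap does not affect the stored allowlist.

```go
package main

import (
	"fmt"
	"sync"
)

// projectConfig is a hypothetical, reduced stand-in for ProjectConfig.
type projectConfig struct {
	Repository string
}

type spawner struct {
	mu       sync.Mutex
	projects []projectConfig
}

// setProjects copies the input before storing it, so later writes to the
// caller's slice cannot race with or alter the allowlist.
func (s *spawner) setProjects(projects []projectConfig) {
	cp := make([]projectConfig, len(projects))
	copy(cp, projects)

	s.mu.Lock()
	s.projects = cp
	s.mu.Unlock()
}

// allowed reports whether repo is on the current allowlist.
func (s *spawner) allowed(repo string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, p := range s.projects {
		if p.Repository == repo {
			return true
		}
	}
	return false
}

func main() {
	buf := []projectConfig{{Repository: "github.com/acme/app"}}
	s := &spawner{}
	s.setProjects(buf)

	buf[0].Repository = "github.com/acme/other" // caller reuses its buffer

	fmt.Println(s.allowed("github.com/acme/app"))
	fmt.Println(s.allowed("github.com/acme/other"))
}
```

Note that `copy` is shallow: it is sufficient here because the element type holds only a string, but elements containing slices or maps would need a deeper copy.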
