Documentation ¶
Overview ¶
Package daemon handle_capabilities.go — HTTP handler for the GET /api/daemon/capabilities endpoint.
Exposes the local daemon's detected substrate capabilities to clients (rensei-tui, debugging tools, CI smoke tests). The response shape mirrors the provides[] array sent to POST /api/workers/register so consumers can verify what was advertised without re-detecting.
Architecture reference:
rensei-architecture/ADR-2026-05-12-capacity-pools-and-substrate-resolution.md § Stream H sub-lane — agentfactory-tui daemon pool awareness
Package daemon handle_kit.go — HTTP handlers for the /api/daemon/kits* and /api/daemon/kit-sources* surfaces.
Wave-9 A2 — see ADR-2026-05-07-daemon-http-control-api.md § D1 for the canonical route list. Path-prefix dispatch follows the same pattern used by handleSessionDetail in server.go.
Package daemon handle_provider.go — HTTP handlers for the /api/daemon/providers* operator surface. Wave 9 / A1.
The handlers expose the daemon's in-process AgentRuntime registry (claude/codex/ollama/opencode/gemini/amp/stub) as JSON. The remaining seven Provider Families (Sandbox, Workarea, VCS, IssueTracker, Deployment, AgentRegistry, Kit) return empty until per-family registries land in a future wave. The endpoint MUST emit PartialCoverage=true and CoveredFamilies=["agent-runtime"] so consumers render the "other families coming" caveat without sniffing for emptiness — see ADR-2026-05-07-daemon-http-control-api.md §D4.
Package daemon handle_routing.go — HTTP handlers for the /api/daemon/routing/* operator surface. Wave 9 / A4.
The handlers expose the daemon's RoutingTraceStore as JSON. The wire shape is locked in rensei-architecture/ADR-2026-05-07-daemon-http-control-api.md §D4 and matches the surfaces the SaaS dashboard's Routing Intelligence panel (REN-205) consumes, so the same renderer composes both.
Read-only this wave. The /config endpoint surfaces the static scheduler configuration (weights, capability filters, sandbox/LLM provider state) plus the rolling tail of recent decisions; the /explain/<sessionID> endpoint returns the full per-session decision trace.
Package daemon handle_workarea.go — HTTP handlers for the /api/daemon/workareas* operator surface.
Wave 9 / Track A3 / ADR-2026-05-07-daemon-http-control-api.md §D4a.
Routes:
    GET  /api/daemon/workareas                       list active + archived
    GET  /api/daemon/workareas/<id>                  inspect (active or archived)
    POST /api/daemon/workareas/<archiveID>/restore   201 on success
    GET  /api/daemon/workareas/<idA>/diff/<idB>      JSON or NDJSON
The streaming-NDJSON variant on /diff/ kicks in when the entry count exceeds the daemon's configured workarea.diffStreamingThreshold (default 1000 per ADR D4a). Below that, the response is a single WorkareaDiffEnvelope JSON object.
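The threshold rule above can be sketched as a small decision function. This is only an illustration: `diffContentType`, the media type strings, and the fallback for an unset threshold are assumptions, not the handler's actual code.

```go
package main

import "fmt"

// defaultDiffStreamingThreshold mirrors the documented default (1000 per ADR D4a).
const defaultDiffStreamingThreshold = 1000

// diffContentType is a hypothetical helper: it returns the media type a handler
// might pick once it knows how many entries the diff contains.
func diffContentType(entryCount, threshold int) string {
	if threshold <= 0 {
		threshold = defaultDiffStreamingThreshold // assumed fallback for an unset value
	}
	if entryCount > threshold {
		return "application/x-ndjson" // stream one entry per line
	}
	return "application/json" // single WorkareaDiffEnvelope object
}

func main() {
	for _, n := range []int{10, 1000, 1001} {
		fmt.Printf("%d entries -> %s\n", n, diffContentType(n, 0))
	}
}
```

Streaming starts only when the count exceeds the threshold, so a diff with exactly 1000 entries still returns the single JSON envelope in this sketch.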
Package daemon kit_install_git.go — git-source kit fetcher (Wave 12 / Theme C / S3).
The fetcher clones the operator-provided git URL into a temp directory, locates the kit manifest (and its sibling `.sigstore` bundle, when present), and exposes both as on-disk paths so KitRegistry.Install can run the trust-gated verifier against the freshly-fetched material before persisting it into kit.scanPaths[0].
Design notes
- Uses go-git/v5 (pure-Go) so the daemon does not depend on a `git` binary on the operator's PATH. Public-host or file:// URLs are both accepted; tests rely on file:// fixtures.
- When KitInstallSource.ManifestPath is empty the fetcher walks the cloned tree for *.kit.toml files and selects the first one that parses cleanly. This matches the audit § 2.1 step 3 contract: "walk repo for *.kit.toml, pick the first; multi-manifest support is a Wave 13+ extension per 005-kit-manifest-spec.md".
- Caller MUST defer the returned cleanup func; the temp tree is kept only long enough for the registry to copy what it needs into the configured scanPath.
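The "first manifest that parses cleanly" rule from the design notes above could look roughly like the sketch below. All names here (`findFirstManifest`, `nonEmpty`, `demoFind`, the sentinel errors) are hypothetical, and `nonEmpty` is a placeholder for real TOML parsing, which is not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

var (
	errManifestNotFound = errors.New("manifest not found") // stand-in sentinel
	errStopWalk         = errors.New("stop walk")          // internal early-exit marker
)

// findFirstManifest walks root in lexical order and returns the first
// *.kit.toml file accepted by parses.
func findFirstManifest(root string, parses func(path string) bool) (string, error) {
	found := ""
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.HasSuffix(d.Name(), ".kit.toml") {
			return nil
		}
		if parses(path) {
			found = path
			return errStopWalk
		}
		return nil
	})
	if err != nil && !errors.Is(err, errStopWalk) {
		return "", err
	}
	if found == "" {
		return "", errManifestNotFound
	}
	return found, nil
}

// nonEmpty is a placeholder for manifest validation (assumption).
func nonEmpty(path string) bool {
	data, err := os.ReadFile(path)
	return err == nil && len(data) > 0
}

// demoFind builds a throwaway tree with one unparseable and one usable manifest.
func demoFind() (string, error) {
	root, err := os.MkdirTemp("", "kit-clone-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root) // mirrors the "caller defers cleanup" contract
	if err := os.WriteFile(filepath.Join(root, "a.kit.toml"), nil, 0o600); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(root, "b.kit.toml"), []byte("id = \"demo\"\n"), 0o600); err != nil {
		return "", err
	}
	path, err := findFirstManifest(root, nonEmpty)
	if err != nil {
		return "", err
	}
	return filepath.Base(path), nil
}

func main() {
	name, err := demoFind()
	fmt.Println(name, err)
}
```

Because `filepath.WalkDir` visits entries in lexical order, the selection is deterministic for a given tree.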
Errors
- ErrKitInstallSourceFetchFailed — clone failed (network, auth, ref not found, etc.). Wrapped with the underlying go-git error.
- ErrKitInstallManifestNotFound — clone succeeded but no usable `*.kit.toml` exists at the configured ManifestPath (or anywhere in the tree when ManifestPath was empty).
Package daemon kit_registry.go — minimal in-process Kit registry that scans the filesystem for installed kit manifests and exposes them via the operator control API.
This is the OSS-execution-layer's "Local manifests" registry source from the federation list in 005-kit-manifest-spec.md § "Registry sources" (item 1). Other registry sources (bundled, rensei, tessl, agentskills, community) are not implemented in this wave; the /api/daemon/kit-sources endpoint returns a static descriptor list surfacing the federation order.
Scan path defaults to ~/.rensei/kits/*.kit.toml. Multiple paths may be declared via daemon.yaml's optional `kit.scanPaths` override.
Behaviour:
- Empty registry (no scan path entries, no .kit.toml files) → empty list, HTTP 200.
- Malformed manifests log a warning via slog and are excluded from the listing rather than failing the whole request.
- Enable/disable state is persisted to a sidecar file at ~/.rensei/kits/.state.json so toggle outcomes survive daemon restarts. The file is created on first toggle.
- Install is currently a stub returning ErrKitInstallUnimplemented; fetching kits from a remote registry is deferred until the federation sources land.
- Verify-signature returns KitTrustUnsigned for all kits in this wave (signing is partially implemented per the ADR caveat).
Package daemon kit_trust.go — sigstore bundle-mode kit signature verifier (Wave 12 / Theme C / S2).
The verifier consumes a sibling `<manifest>.sigstore` file (Q1 of WAVE12_PLAN — "bundle file shape: sibling .sigstore"), validates it against the configured trust root, and reports back a populated afclient.KitSignatureResult. Three trust outcomes:
- KitTrustSignedVerified — bundle present and verifies against the trust root + issuer set.
- KitTrustSignedUnverified — bundle present but verification failed (tampered manifest, untrusted issuer, expired chain, etc.).
- KitTrustUnsigned — no sibling .sigstore file exists.
At install time the verifier outcome maps to a trust gate. The gate runs in the registry's Install path, NOT here — see the ErrKitTrustGateRejected sentinel in kit_registry.go and the trustOverride: "allowed-this-once" handling per audit § 1.3 / § 2.2.
Trust modes (§ "Signing and trust" in 002-provider-base-contract.md):
- permissive — verifier still runs and reports state, but never blocks Install. OSS default per Q2 of WAVE12_PLAN.
- signed-by-allowlist — Install rejects KitTrustUnsigned and KitTrustSignedUnverified.
- attested — same as allowlist for Wave 12 (the SLSA attestation graph hookup lands in Wave 13+).
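The mapping from trust mode and verifier outcome to an install decision can be sketched as follows. The type names, string values, and the fail-closed default are assumptions made for illustration; the real gate lives in the registry's Install path and is not reproduced here.

```go
package main

import "fmt"

// Hypothetical local stand-ins for the package's TrustMode and trust outcomes.
type trustMode string
type trustOutcome string

const (
	modePermissive trustMode = "permissive"
	modeAllowlist  trustMode = "signed-by-allowlist"
	modeAttested   trustMode = "attested"

	outcomeVerified   trustOutcome = "signed-verified"
	outcomeUnverified trustOutcome = "signed-unverified"
	outcomeUnsigned   trustOutcome = "unsigned"
)

// installAllowed applies the documented rules: permissive never blocks,
// allowlist and attested require a verified signature, and a one-time
// override bypasses the gate for a single request.
func installAllowed(mode trustMode, outcome trustOutcome, overrideOnce bool) bool {
	if overrideOnce {
		return true
	}
	switch mode {
	case modePermissive:
		return true
	case modeAllowlist, modeAttested:
		return outcome == outcomeVerified
	default:
		return false // assumption: unknown modes fail closed
	}
}

func main() {
	fmt.Println(installAllowed(modePermissive, outcomeUnsigned, false))
	fmt.Println(installAllowed(modeAllowlist, outcomeUnverified, false))
	fmt.Println(installAllowed(modeAttested, outcomeVerified, false))
}
```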
The embedded trust root is the public Sigstore production trust root (https://raw.githubusercontent.com/sigstore/sigstore-go/main/examples/trusted-root-public-good.json). It will be replaced with a Rensei-published trust root once the productionized signing CI from REN-1344 emits a Rensei-signed Fulcio + Rekor cert chain (Wave 13+ work).
Q-audit-2 resolution (taken 2026-05-07 by /loop coordinator): trust-actor lookup falls back to os.Getuid() when daemon.yaml's `trust.actor` is empty. Actor identification in the trustOverride audit log is therefore best-effort, but every override is still timestamped and the key fields (kitId, signerId) are always populated.
Package daemon routing_state.go — in-process routing trace store and configuration projector for the /api/daemon/routing/* surface (Wave 9 / A4).
The OSS daemon does not yet ship a real cross-provider scheduler in production. The store therefore defines the shape the eventual scheduler will record decisions through, and the read paths used by the HTTP handlers in handle_routing.go.
See ADR-2026-05-07-daemon-http-control-api.md §D4 for the wire contract, 004-sandbox-capability-matrix.md for the cross-provider scheduler model, and the forward reference at /api/daemon/routing/explain/<sessionID> in the same doc.
Package daemon implements the long-running rensei-daemon runtime in Go.
The daemon is a single-machine, multi-project supervisor that:
- Registers itself with the orchestrator (dial-out) and exchanges a one-time rsp_live_* token for a scoped JWT.
- Sends a periodic heartbeat to the orchestrator.
- Accepts inbound work specs (sessions) and spawns worker child processes.
- Exposes an HTTP control API on 127.0.0.1:7734 for the af / rensei CLI.
- Optionally self-updates by drain → fetch → verify → swap → restart.
Architecture reference:
rensei-architecture/004-sandbox-capability-matrix.md §Local daemon mode
rensei-architecture/011-local-daemon-fleet.md
This is the public package surface — downstream binaries can import it directly to embed the daemon runtime under their own command tree. The afcli package re-exports the runtime as the `daemon run` subcommand.
This package is the Go port of agentfactory/packages/daemon/src (REN-1408). The TS package @renseiai/daemon is deprecated; final removal is scheduled for cycle 6 after the smoke harness has soaked for 7 nights.
Package daemon workarea_archive.go — on-disk workarea archive registry powering the Layer-3 workarea operator surface.
Wave 9 / Track A3 / ADR-2026-05-07-daemon-http-control-api.md §D4a.
Archive layout. Each archive is a directory under the daemon's archive root (default ~/.rensei/workareas/<archiveID>/) containing:
    manifest.json — metadata sidecar (id, sessionId, createdAt, sizeBytes,
                    sourceProvider, capabilities, disposition); free-form
                    extra fields permitted.
    tree/         — the workarea filesystem snapshot. Diffs and restores walk
                    this subtree only; everything outside it (manifest.json,
                    daemon-private bookkeeping) is ignored. The well-known
                    .rensei/ directory under tree/ is also excluded from diff
                    walks per ADR D4a.
The registry is stateless w.r.t. process lifecycle — every call hits disk. That's fine: archive directories are small in count (operator scale), the OS dentry cache absorbs repeated listings, and avoiding in-memory state means the daemon never serves a stale view after an out-of-band write to ~/.rensei/workareas/.
Index ¶
- Constants
- Variables
- func DefaultConfigPath() string
- func DefaultJWTPath() string
- func DefaultKitScanPath() string
- func DeriveDefaultMachineID() string
- func IsNewerVersion(candidate, current string) bool
- func ResolvePlatformSuffix() string
- func SaveCachedJWT(jwtPath string, resp *RegisterResponse, now time.Time) error
- func ShouldSkipWizard() bool
- func WipeCachedJWT(jwtPath string) (bool, error)
- func WriteConfig(path string, cfg *Config) error
- type ActiveWorkareaProvider
- type AutoUpdateConfig
- type BinaryVerifier
- type CachedJWT
- type CapabilitiesResponse
- type CapacityConfig
- type CloneStrategy
- type Config
- type Daemon
- func (d *Daemon) AcceptWork(spec SessionSpec) (*SessionHandle, error)
- func (d *Daemon) AcceptWorkWithDetail(spec SessionSpec, detail *SessionDetail) (*SessionHandle, error)
- func (d *Daemon) ActiveSessions() []SessionHandle
- func (d *Daemon) Config() *Config
- func (d *Daemon) Done() <-chan struct{}
- func (d *Daemon) EffectiveVersion() string
- func (d *Daemon) HostStatus() *HostStatusDetail
- func (d *Daemon) Pause()
- func (d *Daemon) Resume()
- func (d *Daemon) RoutingTraces() *RoutingTraceStore
- func (d *Daemon) SessionDetail(sessionID string) (*SessionDetail, bool)
- func (d *Daemon) SetWorkareaArchiveRegistry(reg *WorkareaArchiveRegistry)
- func (d *Daemon) Start(ctx context.Context) error
- func (d *Daemon) StartedAt() time.Time
- func (d *Daemon) State() State
- func (d *Daemon) Stop(_ context.Context) error
- func (d *Daemon) SubstrateCapabilities() []internaldaemon.SubstrateCapability
- func (d *Daemon) Update(ctx context.Context) (*UpdateResult, error)
- func (d *Daemon) WorkerID() string
- type EvictHandler
- type HeartbeatMutationFailure
- type HeartbeatOptions
- type HeartbeatPayload
- type HeartbeatService
- type HostStatusDetail
- type KitConfig
- type KitRegistry
- func (r *KitRegistry) Disable(id string) (afclient.Kit, error)
- func (r *KitRegistry) DisableSource(name string) (afclient.KitRegistrySource, error)
- func (r *KitRegistry) Enable(id string) (afclient.Kit, error)
- func (r *KitRegistry) EnableSource(name string) (afclient.KitRegistrySource, error)
- func (r *KitRegistry) Get(id string) (afclient.KitManifest, error)
- func (r *KitRegistry) Install(id string, req afclient.KitInstallRequest) (afclient.KitInstallResult, error)
- func (r *KitRegistry) List() []afclient.Kit
- func (r *KitRegistry) ListSources() []afclient.KitRegistrySource
- func (r *KitRegistry) ScanPaths() []string
- func (r *KitRegistry) VerifySignature(id string) (afclient.KitSignatureResult, error)
- type MachineConfig
- type ObservabilityConfig
- type Options
- type OrchestratorConfig
- type PendingMutation
- type PollHTTPError
- type PollOptions
- type PollResponse
- type PollService
- type PollStageBudget
- type PollWorkItem
- type PoolCapacityGuard
- type PoolStatsProvider
- type PrefixWriterFunc
- type PrefixedWriter
- type ProjectAllowlistEntry
- type ProjectConfig
- type ProjectGit
- type ProvideCapability
- type ProviderRegistry
- type RefreshTokenResult
- type RegisterRequest
- type RegisterResponse
- type RegistrationOptions
- type RegistrationStatus
- type ReservedSystemSpec
- type RoutingTraceStore
- func (s *RoutingTraceStore) Explain(sessionID string) (afclient.RoutingDecision, []afclient.RoutingTraceStep, bool)
- func (s *RoutingTraceStore) GetConfig(providerNames []string, capturedAt time.Time) afclient.RoutingConfig
- func (s *RoutingTraceStore) Len() int
- func (s *RoutingTraceStore) RecordDecision(decision afclient.RoutingDecision, trace []afclient.RoutingTraceStep)
- type Server
- type SessionDetail
- type SessionEvent
- type SessionEventKind
- type SessionHandle
- type SessionModelProfile
- type SessionResolvedProfile
- type SessionResources
- type SessionSpec
- type SessionState
- type SpawnerOptions
- type State
- type TrustConfig
- type TrustMode
- type UpdateChannel
- type UpdateResult
- type UpdateSchedule
- type Updater
- func (u *Updater) BuildBinaryURL(channel UpdateChannel, version string) string
- func (u *Updater) BuildManifestURL(channel UpdateChannel) string
- func (u *Updater) BuildSignatureURL(binURL string) string
- func (u *Updater) CheckForUpdate(ctx context.Context) (*VersionManifest, error)
- func (u *Updater) RunUpdate(ctx context.Context) (*UpdateResult, error)
- type UpdaterOptions
- type VersionManifest
- type WizardOptions
- type WorkareaArchiveOptions
- type WorkareaArchiveRegistry
- func (r *WorkareaArchiveRegistry) CountDiff(idA, idB string) (int, error)
- func (r *WorkareaArchiveRegistry) Diff(idA, idB string) (*afclient.WorkareaDiffResult, error)
- func (r *WorkareaArchiveRegistry) DiffStream(idA, idB string, emit func(afclient.WorkareaDiffEntry) error) (*afclient.WorkareaDiffSummary, error)
- func (r *WorkareaArchiveRegistry) Get(id string) (*afclient.Workarea, error)
- func (r *WorkareaArchiveRegistry) List() (active, archived []afclient.WorkareaSummary, err error)
- func (r *WorkareaArchiveRegistry) Restore(archiveID string, req afclient.WorkareaRestoreRequest) (*afclient.Workarea, time.Duration, error)
- func (r *WorkareaArchiveRegistry) Root() string
- type WorkareaConfig
- type WorkerSpawner
- func (s *WorkerSpawner) AcceptWork(spec SessionSpec) (*SessionHandle, error)
- func (s *WorkerSpawner) ActiveCount() int
- func (s *WorkerSpawner) ActiveSessions() []SessionHandle
- func (s *WorkerSpawner) ActiveWorkareas() []afclient.WorkareaSummary
- func (s *WorkerSpawner) Drain(timeout time.Duration) error
- func (s *WorkerSpawner) IsAccepting() bool
- func (s *WorkerSpawner) On(fn func(SessionEvent))
- func (s *WorkerSpawner) Pause()
- func (s *WorkerSpawner) Resume()
- func (s *WorkerSpawner) SetMaxConcurrentSessions(n int) error
- func (s *WorkerSpawner) SetProjects(projects []ProjectConfig)
Constants ¶
const CapacityRefreshInterval = 60 * time.Second
CapacityRefreshInterval is how often the daemon re-emits its capacity snapshot. Mirrors the TS CAPACITY_REFRESH_INTERVAL_MS = 60_000.
const DefaultHTTPHost = "127.0.0.1"
DefaultHTTPHost is the bind address for the control HTTP server.
const DefaultHTTPPort = 7734
DefaultHTTPPort is the port the daemon's control HTTP server binds to. Keep in sync with afclient.DefaultDaemonConfig (port 7734).
const DefaultRoutingRingBufferSize = 50
DefaultRoutingRingBufferSize is the maximum number of recent routing decisions retained for the GetConfig view. The explain endpoint key is per-session and bounded by the same ring — a session whose decision has fallen out of the ring returns 404.
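The consequence of sharing one bounded ring between the recent-decisions view and the per-session lookup can be illustrated with a minimal ring buffer. This is a sketch under assumed names (`ring`, `decision`, `explain`), not the RoutingTraceStore implementation.

```go
package main

import "fmt"

// decision is a hypothetical, trimmed-down routing decision record.
type decision struct {
	SessionID string
	Provider  string
}

// ring keeps the most recent len(buf) decisions and overwrites the oldest.
type ring struct {
	buf  []decision
	next int  // slot the next record call writes to
	full bool // true once the buffer has wrapped at least once
}

func newRing(size int) *ring {
	if size < 1 {
		size = 1
	}
	return &ring{buf: make([]decision, size)}
}

func (r *ring) record(d decision) {
	r.buf[r.next] = d
	r.next = (r.next + 1) % len(r.buf)
	if r.next == 0 {
		r.full = true
	}
}

func (r *ring) len() int {
	if r.full {
		return len(r.buf)
	}
	return r.next
}

// explain reports false once a session's decision has been overwritten,
// which is the case an HTTP handler would translate into a 404.
func (r *ring) explain(sessionID string) (decision, bool) {
	for i := 0; i < r.len(); i++ {
		if r.buf[i].SessionID == sessionID {
			return r.buf[i], true
		}
	}
	return decision{}, false
}

func main() {
	r := newRing(2)
	for _, id := range []string{"s1", "s2", "s3"} {
		r.record(decision{SessionID: id, Provider: "stub"})
	}
	_, ok := r.explain("s1")
	fmt.Println("s1 still present:", ok)
}
```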
const ExitCodeRestart = 3
ExitCodeRestart is the exit code the daemon uses to signal the supervisor "restart requested" after a successful binary swap. The launchd plist / systemd unit treats code 3 as a clean restart, not a crash.
const HeartbeatDefaultInterval = 30 * time.Second
HeartbeatDefaultInterval is the fallback heartbeat cadence when the orchestrator does not return one in RegisterResponse. The TS path uses 30s as the fallback; we keep that here, but `15s` is the canonical SLO target.
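A minimal sketch of the fallback, assuming the orchestrator reports its interval in milliseconds (as the cached JWT fields elsewhere on this page do) and that a zero or negative value means "not provided". The helper name `heartbeatInterval` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// heartbeatDefaultInterval mirrors the documented 30s fallback.
const heartbeatDefaultInterval = 30 * time.Second

// heartbeatInterval converts a server-provided interval in milliseconds,
// falling back to the default when none was returned (assumed to be <= 0).
func heartbeatInterval(serverMs int) time.Duration {
	if serverMs <= 0 {
		return heartbeatDefaultInterval
	}
	return time.Duration(serverMs) * time.Millisecond
}

func main() {
	fmt.Println(heartbeatInterval(0))
	fmt.Println(heartbeatInterval(15000))
}
```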
const RegisterEndpoint = "/api/workers/register"
RegisterEndpoint is the relative path on the platform.
const RuntimeTokenRefreshEndpoint = "/api/workers/refresh-token"
RuntimeTokenRefreshEndpoint is the (probed) platform endpoint the daemon hits to refresh an expired runtime JWT WITHOUT re-registering. The platform owes a handler at this path that:
- accepts the registration token in the Authorization: Bearer header
- takes the existing workerId in the URL path
- mints a fresh runtime JWT bound to the SAME workerId
- returns { runtimeToken, runtimeTokenExpiresAt, heartbeatInterval, pollInterval }
As of 2026-05-03 this endpoint does NOT exist on the platform side; see REN-1481 platform-companion. Until it ships, the daemon probes this URL, observes a 404, and falls back to a full re-register (which mints a new workerId, the bug REN-1481 originally documented). Once the platform side ships the endpoint, the daemon picks it up automatically with no further changes.
const UpdateCDNBase = "https://updates.rensei.dev"
UpdateCDNBase is the base URL for the rensei CDN that hosts release manifests and binaries.
Variables ¶
var (
	// ErrArchiveNotFound — the named archive id is not present on disk.
	ErrArchiveNotFound = errors.New("workarea archive not found")
	// ErrArchiveCorrupted — the archive exists but its manifest is
	// missing/malformed, or the tree directory cannot be walked.
	ErrArchiveCorrupted = errors.New("workarea archive corrupted")
	// ErrArchiveExists — restore would collide with an existing archive
	// entry on disk (never reached today; archives are immutable, but
	// the check is here for a future "archive on restore" code path).
	ErrArchiveExists = errors.New("workarea archive already exists")
)
WorkareaArchiveErrCode is the sentinel set used by the registry for programmatic error discrimination at the HTTP layer. Wrapped with %w so handlers can errors.Is() against them.
var DefaultRoutingWeights = afclient.RoutingWeights{Cost: 0.7, Latency: 0.3}
DefaultRoutingWeights are the cost/latency scoring weights described in 004-sandbox-capability-matrix.md §"Open questions" — 70/30 cost/latency is the documented default. The store returns these on every GetConfig call until a tenant config layer overrides them in a future wave.
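To make the 70/30 split concrete, here is one way such weights could be combined into a score. The formula (a weighted sum over normalized inputs where lower is better) is an assumption for illustration; the scheduler's actual scoring model is defined in 004-sandbox-capability-matrix.md and is not reproduced here.

```go
package main

import "fmt"

// weights is a hypothetical stand-in for afclient.RoutingWeights.
type weights struct {
	Cost    float64
	Latency float64
}

// score combines normalized cost and latency (each assumed to be in [0, 1],
// lower is better) into a single weighted value.
func score(w weights, normCost, normLatency float64) float64 {
	return w.Cost*normCost + w.Latency*normLatency
}

func main() {
	w := weights{Cost: 0.7, Latency: 0.3}
	cheapSlow := score(w, 0.2, 0.9)
	priceyFast := score(w, 0.8, 0.1)
	fmt.Printf("cheap+slow=%.2f pricey+fast=%.2f\n", cheapSlow, priceyFast)
}
```

With cost weighted at 0.7, the cheaper but slower candidate scores lower (better) than the faster but pricier one in this sketch.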
var ErrKitInstallManifestNotFound = errors.New("kit install: manifest not found in fetched source")
ErrKitInstallManifestNotFound is returned when the source fetch succeeds but no *.kit.toml is locatable inside the fetched tree (or at the operator-provided KitInstallSource.ManifestPath). Maps to HTTP 422.
var ErrKitInstallSourceFetchFailed = errors.New("kit install: source fetch failed")
ErrKitInstallSourceFetchFailed is returned when the configured source fetcher fails (e.g., git clone error, network failure, unreachable remote, missing ref). Maps to HTTP 502.
var ErrKitInstallUnimplemented = errors.New("kit install: remote registry fetch not implemented in this wave")
ErrKitInstallUnimplemented is returned by KitRegistry.Install for the Wave-9 backward-compat path: a request body with no `source` block (the shape the Wave-9 smoke + handler tests POST). Wave 12 / Phase 4 keeps this sentinel reserved for that empty-body case so existing 501 assertions stay green; new federation-source kinds (tessl, agentskills, community) return ErrKitSourceFederationUnimplemented instead.
var ErrKitNotFound = errors.New("kit not found")
ErrKitNotFound is returned when a kit id is not present in the registry.
var ErrKitSourceFederationUnimplemented = errors.New("kit install: federation source kind not yet implemented")
ErrKitSourceFederationUnimplemented is returned when KitInstallRequest names a federation source kind (`tessl` / `agentskills` / `community`) that the daemon does not yet know how to fetch from. Maps to HTTP 501 — the descriptor list returned by /api/daemon/kit-sources continues to surface those kinds so operators can see the federation order.
Federation cross-repo wave is REN-1308 follow-up.
var ErrKitSourceNotFound = errors.New("kit source not found")
ErrKitSourceNotFound is returned when a kit-source name is not known.
var ErrKitTrustGateRejected = errors.New("kit install: trust gate rejected (signed-by-allowlist requires verified signature)")
ErrKitTrustGateRejected is returned by KitRegistry.Install when the configured trust mode (signed-by-allowlist or attested) refuses an unsigned or signed-but-unverified kit. Maps to HTTP 403. The trustOverride: "allowed-this-once" install field bypasses this gate for a single request (audit-logged); see kit_trust.go.
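The status codes named for the kit-install sentinels above can be collected into a single mapping that relies on errors.Is, so wrapped errors still match. The sketch uses local stand-in sentinels and hypothetical helpers (`statusFor`, `wrap`); the 200 and 500 branches are assumptions, since this page only documents the 403, 422, 501, and 502 cases.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Local stand-ins for the package's exported sentinels.
var (
	errManifestNotFound        = errors.New("kit install: manifest not found in fetched source")
	errSourceFetchFailed       = errors.New("kit install: source fetch failed")
	errInstallUnimplemented    = errors.New("kit install: remote registry fetch not implemented in this wave")
	errFederationUnimplemented = errors.New("kit install: federation source kind not yet implemented")
	errTrustGateRejected       = errors.New("kit install: trust gate rejected")
)

// statusFor maps an install error to the documented HTTP status.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK // assumption: success status is not documented here
	case errors.Is(err, errManifestNotFound):
		return http.StatusUnprocessableEntity // 422
	case errors.Is(err, errSourceFetchFailed):
		return http.StatusBadGateway // 502
	case errors.Is(err, errInstallUnimplemented), errors.Is(err, errFederationUnimplemented):
		return http.StatusNotImplemented // 501
	case errors.Is(err, errTrustGateRejected):
		return http.StatusForbidden // 403
	default:
		return http.StatusInternalServerError // assumption for unmapped errors
	}
}

// wrap adds context with %w so the sentinel stays discoverable via errors.Is.
func wrap(err error) error {
	return fmt.Errorf("install kit: %w", err)
}

func main() {
	fmt.Println(statusFor(wrap(errSourceFetchFailed)))
	fmt.Println(statusFor(wrap(errTrustGateRejected)))
}
```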
var Version = "dev"
Version is the daemon binary version reported in DaemonStatus and in the registration payload.
Now a `var` (was `const`) so the binary's main can override it via `-ldflags "-X github.com/RenseiAI/donmai/daemon.Version=$VERSION"` at build time, OR a downstream embedder (e.g. rensei-tui's daemon run command) can pass its own version via `Options.Version` at daemon construction. The const form pinned the value to whatever agentfactory-tui's source had at vendor time, which left the `rensei-daemon-run` HTTP /api/daemon/status endpoint reporting an outdated string forever — confusing operators who saw e.g. `0.7.1` even after upgrading both binaries past it.
Default is `"dev"` so an unreleased build (or a vendored copy that forgot to inject) is obvious in status output.
Functions ¶
func DefaultConfigPath ¶
func DefaultConfigPath() string
DefaultConfigPath returns the path to daemon.yaml, resolving to ~/.donmai/daemon.yaml for new installs with a one-release fallback to ~/.rensei/daemon.yaml when the legacy directory still exists.
func DefaultJWTPath ¶
func DefaultJWTPath() string
DefaultJWTPath returns the path to the cached JWT, resolving to ~/.donmai/daemon.jwt for new installs with a one-release fallback to ~/.rensei/daemon.jwt when the legacy directory still exists.
func DefaultKitScanPath ¶
func DefaultKitScanPath() string
DefaultKitScanPath returns the path to the installed-kits directory, resolving to ~/.donmai/kits for new installs with a one-release fallback to ~/.rensei/kits when the legacy directory still exists.
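The three Default*Path helpers above share the same new-directory-with-legacy-fallback rule. A minimal sketch follows, assuming the legacy directory is chosen only when it exists and the new one has not been created yet; the documentation does not spell out that precedence, and `resolveStatePath`, `isDir`, and `demoResolve` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func isDir(p string) bool {
	info, err := os.Stat(p)
	return err == nil && info.IsDir()
}

// resolveStatePath picks between <home>/.donmai and the legacy <home>/.rensei.
func resolveStatePath(home, name string) string {
	newDir := filepath.Join(home, ".donmai")
	legacyDir := filepath.Join(home, ".rensei")
	// Assumption: legacy wins only when the new directory is absent.
	if !isDir(newDir) && isDir(legacyDir) {
		return filepath.Join(legacyDir, name)
	}
	return filepath.Join(newDir, name)
}

// demoResolve reports the directory chosen for a fresh home and for a home
// that only contains the legacy directory.
func demoResolve() (fresh, legacy string, err error) {
	freshHome, err := os.MkdirTemp("", "home-fresh-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(freshHome)
	legacyHome, err := os.MkdirTemp("", "home-legacy-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(legacyHome)
	if err := os.Mkdir(filepath.Join(legacyHome, ".rensei"), 0o700); err != nil {
		return "", "", err
	}
	dirOf := func(p string) string { return filepath.Base(filepath.Dir(p)) }
	return dirOf(resolveStatePath(freshHome, "daemon.yaml")),
		dirOf(resolveStatePath(legacyHome, "daemon.yaml")), nil
}

func main() {
	fresh, legacy, err := demoResolve()
	fmt.Println(fresh, legacy, err)
}
```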
func DeriveDefaultMachineID ¶
func DeriveDefaultMachineID() string
DeriveDefaultMachineID returns a hostname-derived identifier suitable for machine.id when the user has not set one.
func IsNewerVersion ¶
func IsNewerVersion(candidate, current string) bool
IsNewerVersion returns true if candidate is strictly newer than current according to semver-prefix comparison. Falls back to lexicographic compare if either string is not a parseable semver prefix.
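One possible reading of "semver-prefix comparison with a lexicographic fallback" is sketched below. The parsing rules (optional leading "v", up to three numeric components, pre-release and build suffixes ignored) are assumptions, and `parsePrefix` and `isNewer` are hypothetical names rather than the package's code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePrefix extracts up to three numeric components from a version string.
func parsePrefix(v string) ([3]int, bool) {
	var out [3]int
	v = strings.TrimPrefix(v, "v")
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i] // assumption: pre-release/build metadata is ignored
	}
	parts := strings.Split(v, ".")
	if len(parts) > 3 {
		return out, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, false
		}
		out[i] = n
	}
	return out, true
}

// isNewer reports whether candidate is strictly newer than current.
func isNewer(candidate, current string) bool {
	a, okA := parsePrefix(candidate)
	b, okB := parsePrefix(current)
	if !okA || !okB {
		return candidate > current // lexicographic fallback
	}
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return false // equal versions are not "strictly newer"
}

func main() {
	fmt.Println(isNewer("0.10.0", "0.9.0"))
	fmt.Println(isNewer("0.9.0", "0.9.0"))
}
```

The numeric comparison matters for cases such as 0.10.0 versus 0.9.0, where a plain string comparison would give the wrong answer.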
func ResolvePlatformSuffix ¶
func ResolvePlatformSuffix() string
ResolvePlatformSuffix returns "<arch>-<os>" suitable for the CDN binary filename, e.g. "arm64-darwin", "amd64-linux".
func SaveCachedJWT ¶
func SaveCachedJWT(jwtPath string, resp *RegisterResponse, now time.Time) error
SaveCachedJWT atomically writes the response to jwtPath with 0o600 perms.
func ShouldSkipWizard ¶
func ShouldSkipWizard() bool
ShouldSkipWizard returns true when the wizard should be bypassed:
- stdin is not a TTY, OR
- RENSEI_DAEMON_SKIP_WIZARD is set.
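The two conditions above can be sketched with the standard library alone. The character-device check is a common heuristic rather than a precise TTY test, and "set" is interpreted here as "present in the environment, even if empty"; both are assumptions, as are the helper names.

```go
package main

import (
	"fmt"
	"os"
)

// stdinIsTTY uses the character-device heuristic (assumption: good enough
// for a sketch; a dedicated terminal-detection package is more precise).
func stdinIsTTY() bool {
	info, err := os.Stdin.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

// shouldSkip is the pure decision: skip when stdin is not a TTY or the
// environment variable is set.
func shouldSkip(isTTY, envSet bool) bool {
	return !isTTY || envSet
}

func main() {
	_, envSet := os.LookupEnv("RENSEI_DAEMON_SKIP_WIZARD")
	fmt.Println("skip wizard:", shouldSkip(stdinIsTTY(), envSet))
}
```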
func WipeCachedJWT ¶
func WipeCachedJWT(jwtPath string) (bool, error)
WipeCachedJWT removes the cached JWT file at jwtPath. Returns wiped=true when the file existed and was removed, wiped=false when there was no cache to remove (idempotent — safe to call from uninstall paths on systems that never had the daemon installed).
Why this exists: Register() short-circuits with the cached JWT whenever the file is present, even when the workerId in it has been invalidated by the orchestrator (worker row deleted, registration token rotated, org migrated, manual cleanup, …). Without an explicit wipe, the daemon polls the dead workerId every poll interval forever — the token-refresh fallback re-mints credentials for the same dead id rather than triggering a true re-registration. Install / uninstall paths should call this so a fresh registration handshake happens on the next daemon boot.
func WriteConfig ¶
func WriteConfig(path string, cfg *Config) error
WriteConfig atomically writes cfg to path (tmp file + rename), creating parent directories as needed.
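The tmp-file-plus-rename pattern mentioned above generally looks like the sketch below. `writeAtomic` and `demoWrite` are hypothetical helpers, and the 0o755 parent-directory mode is an assumption; the package may use different permissions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAtomic writes data to a temp file in the target directory and renames
// it into place, so readers never observe a partially written file.
func writeAtomic(path string, data []byte, perm os.FileMode) error {
	dir := filepath.Dir(path)
	if err := os.MkdirAll(dir, 0o755); err != nil { // assumed directory mode
		return err
	}
	tmp, err := os.CreateTemp(dir, ".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer os.Remove(tmpName) // fails harmlessly once the rename has succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Chmod(perm); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// demoWrite writes into a nested directory that does not exist yet and
// reads the result back.
func demoWrite() (string, error) {
	root, err := os.MkdirTemp("", "atomic-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	target := filepath.Join(root, "nested", "daemon.yaml")
	if err := writeAtomic(target, []byte("kind: example\n"), 0o600); err != nil {
		return "", err
	}
	data, err := os.ReadFile(target)
	return string(data), err
}

func main() {
	content, err := demoWrite()
	fmt.Printf("%q %v\n", content, err)
}
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what makes the swap atomic on POSIX systems.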
Types ¶
type ActiveWorkareaProvider ¶
type ActiveWorkareaProvider interface {
ActiveWorkareas() []afclient.WorkareaSummary
}
ActiveWorkareaProvider exposes the daemon's live pool members in the canonical wire shape so List can union them with on-disk archives. Implementations MUST return a stable order. Empty list (zero pool members) is a perfectly valid, non-error response.
type AutoUpdateConfig ¶
type AutoUpdateConfig struct {
Channel UpdateChannel `yaml:"channel" json:"channel"`
Schedule UpdateSchedule `yaml:"schedule" json:"schedule"`
DrainTimeoutSeconds int `yaml:"drainTimeoutSeconds" json:"drainTimeoutSeconds"`
}
AutoUpdateConfig is the auto-update preferences block.
type BinaryVerifier ¶
type BinaryVerifier interface {
Verify(ctx context.Context, contentHash, signatureValue string) (valid bool, reason string)
}
BinaryVerifier is a narrow signature-verification interface. The default production verifier rejects all signatures (until REN-1314 ships a Go sigstore adapter). Tests can inject a passing verifier.
type CachedJWT ¶
type CachedJWT struct {
WorkerID string `json:"workerId"`
RuntimeToken string `json:"runtimeToken"`
HeartbeatInterval int `json:"heartbeatInterval"` // ms
PollInterval int `json:"pollInterval"` // ms
RuntimeTokenExpiresAt string `json:"runtimeTokenExpiresAt,omitempty"`
CachedAt string `json:"cachedAt"`
// Legacy fields retained so old cache files written before REN-1422
// still load successfully. Newer writes only populate the canonical
// platform-named fields above.
LegacyRuntimeJWT string `json:"runtimeJwt,omitempty"`
LegacyHeartbeatIntervalSeconds int `json:"heartbeatIntervalSeconds,omitempty"`
LegacyPollIntervalSeconds int `json:"pollIntervalSeconds,omitempty"`
}
CachedJWT is the on-disk cache entry. We persist this between daemon runs so re-registration is skipped while the runtime token is fresh.
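The legacy fields suggest a normalization step when an old cache file is loaded. The sketch below shows one plausible shape, assuming the legacy interval fields are in seconds (as their names imply) and are converted to milliseconds; the struct is a trimmed copy and `normalize` and `decode` are hypothetical helpers, not the package's loader.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cachedJWT is a trimmed stand-in for the CachedJWT struct above.
type cachedJWT struct {
	RuntimeToken      string `json:"runtimeToken"`
	HeartbeatInterval int    `json:"heartbeatInterval"` // ms
	PollInterval      int    `json:"pollInterval"`      // ms

	LegacyRuntimeJWT               string `json:"runtimeJwt,omitempty"`
	LegacyHeartbeatIntervalSeconds int    `json:"heartbeatIntervalSeconds,omitempty"`
	LegacyPollIntervalSeconds      int    `json:"pollIntervalSeconds,omitempty"`
}

// normalize copies legacy values into the canonical fields when the
// canonical ones are empty (assumed behaviour).
func normalize(c *cachedJWT) {
	if c.RuntimeToken == "" {
		c.RuntimeToken = c.LegacyRuntimeJWT
	}
	if c.HeartbeatInterval == 0 {
		c.HeartbeatInterval = c.LegacyHeartbeatIntervalSeconds * 1000
	}
	if c.PollInterval == 0 {
		c.PollInterval = c.LegacyPollIntervalSeconds * 1000
	}
}

// decode parses a cache file body and normalizes it.
func decode(raw string) (cachedJWT, error) {
	var c cachedJWT
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		return cachedJWT{}, err
	}
	normalize(&c)
	return c, nil
}

func main() {
	c, err := decode(`{"runtimeJwt":"tok","heartbeatIntervalSeconds":30,"pollIntervalSeconds":5}`)
	fmt.Println(c.RuntimeToken, c.HeartbeatInterval, c.PollInterval, err)
}
```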
func LoadCachedJWT ¶
LoadCachedJWT reads ~/.rensei/daemon.jwt. Returns (nil, nil) when the file does not exist or cannot be parsed.
type CapabilitiesResponse ¶
type CapabilitiesResponse struct {
// Provides is the substrate capability set detected at daemon startup.
// Each entry corresponds to a SubstrateCapabilityDeclaration.runtimeKinds
// value (e.g. "native", "npm", "python-pip"). This matches the provides[]
// array sent to POST /api/workers/register.
Provides []ProvideCapability `json:"provides"`
// Timestamp is the RFC3339 UTC time when this response was generated.
Timestamp string `json:"timestamp"`
}
CapabilitiesResponse is the JSON response from GET /api/daemon/capabilities.
type CapacityConfig ¶
type CapacityConfig struct {
MaxConcurrentSessions int `yaml:"maxConcurrentSessions" json:"maxConcurrentSessions"`
MaxVCpuPerSession int `yaml:"maxVCpuPerSession" json:"maxVCpuPerSession"`
MaxMemoryMbPerSession int `yaml:"maxMemoryMbPerSession" json:"maxMemoryMbPerSession"`
ReservedForSystem ReservedSystemSpec `yaml:"reservedForSystem" json:"reservedForSystem"`
// PoolMaxDiskGb is the LRU-eviction trigger for the workarea pool.
// 0 means no limit. (REN-1334.)
PoolMaxDiskGb int `yaml:"poolMaxDiskGb,omitempty" json:"poolMaxDiskGb,omitempty"`
}
CapacityConfig is the resource envelope declared in daemon.yaml.
type CloneStrategy ¶
type CloneStrategy string
CloneStrategy controls how the daemon clones a project repo for new workarea pool members.
const (
	CloneShallow   CloneStrategy = "shallow"
	CloneFull      CloneStrategy = "full"
	CloneReference CloneStrategy = "reference-clone"
)
Clone strategy constants.
type Config ¶
type Config struct {
APIVersion string `yaml:"apiVersion" json:"apiVersion"`
Kind string `yaml:"kind" json:"kind"`
Machine MachineConfig `yaml:"machine" json:"machine"`
Capacity CapacityConfig `yaml:"capacity" json:"capacity"`
Projects []ProjectConfig `yaml:"projects,omitempty" json:"projects,omitempty"`
Orchestrator OrchestratorConfig `yaml:"orchestrator" json:"orchestrator"`
AutoUpdate AutoUpdateConfig `yaml:"autoUpdate" json:"autoUpdate"`
Observability *ObservabilityConfig `yaml:"observability,omitempty" json:"observability,omitempty"`
// Workarea holds Layer-3 workarea-surface tunables (archive root,
// diff streaming threshold). Optional; populated with defaults if
// absent.
Workarea WorkareaConfig `yaml:"workarea,omitempty" json:"workarea,omitempty"`
// Kit holds Layer-4 kit-surface tunables (scan paths). Optional;
// applyDefaults seeds ScanPaths to [DefaultKitScanPath()] when
// absent. Per ADR-2026-05-07 § D4.
Kit KitConfig `yaml:"kit,omitempty" json:"kit,omitempty"`
// Trust holds the daemon-wide signature-verification policy
// (sigstore bundle-mode verifier mode + issuer allowlist + audit
// actor). Optional; applyDefaults seeds Mode to
// TrustModePermissive when absent. Per WAVE12_PLAN Q2 and
// 002-provider-base-contract.md § "Signing and trust". Lives on
// Config (not on KitConfig) because the trust mode applies across
// all plugin families per 015-plugin-spec.md § "Auth + trust".
Trust TrustConfig `yaml:"trust,omitempty" json:"trust,omitempty"`
}
Config is the in-memory representation of ~/.rensei/daemon.yaml. The wire schema mirrors the TS DaemonConfig (rensei-architecture/004 §Configuration shape).
func BuildDefaultConfigFromExisting ¶
BuildDefaultConfigFromExisting returns a default Config (or the existing one) and optionally persists it to configPath.
func DefaultConfig ¶
func DefaultConfig() *Config
DefaultConfig returns a minimal Config suitable as a starting point when the wizard is skipped. Capacity defaults are derived from runtime info.
func LoadConfig ¶
LoadConfig reads daemon.yaml from path. Returns (nil, nil) when the file does not exist (so callers can branch into the setup wizard / default).
func RunSetupWizard ¶
func RunSetupWizard(opts WizardOptions) (*Config, error)
RunSetupWizard runs the interactive first-run wizard (or returns the non-interactive default when stdin is not a TTY).
type Daemon ¶
type Daemon struct {
// contains filtered or unexported fields
}
Daemon is the top-level supervisor. It owns the loaded Config, the HeartbeatService, the WorkerSpawner, and (optionally) the AutoUpdater.
func (*Daemon) AcceptWork ¶
func (d *Daemon) AcceptWork(spec SessionSpec) (*SessionHandle, error)
AcceptWork dispatches a session spec to the spawner.
func (*Daemon) AcceptWorkWithDetail ¶
func (d *Daemon) AcceptWorkWithDetail(spec SessionSpec, detail *SessionDetail) (*SessionHandle, error)
AcceptWorkWithDetail dispatches a session spec to the spawner and records the per-session detail used by the spawned `donmai agent run` process. Pass nil detail when the caller does not have one (legacy tests); the spawner falls through to env-only inputs.
Detail is stored before spawning and removed when the spawner emits the corresponding SessionEventEnded event so stale credentials do not linger in memory.
func (*Daemon) ActiveSessions ¶
func (d *Daemon) ActiveSessions() []SessionHandle
ActiveSessions returns a snapshot of in-flight session handles.
func (*Daemon) Done ¶
func (d *Daemon) Done() <-chan struct{}
Done returns a channel that is closed when the daemon has fully stopped.
func (*Daemon) EffectiveVersion ¶
EffectiveVersion returns the version string the daemon should report in HTTP status / heartbeat / registration payloads. Resolution order: `Options.Version` (downstream embedder override) → package `Version` (which itself is "dev" unless overridden via `-ldflags -X .../daemon.Version=…`). Empty option = fall through.
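A minimal sketch of the resolution order described above. `effectiveVersion` and `pkgVersion` are illustrative names that stand in for the method and the package-level `Version` variable; the real method takes no argument and reads `Options.Version` itself.

```go
package main

import "fmt"

// pkgVersion stands in for the package-level Version variable, which the
// docs say is "dev" unless overridden at link time.
var pkgVersion = "dev"

// effectiveVersion applies the documented order: a non-empty option wins,
// otherwise the package-level value is used.
func effectiveVersion(optVersion string) string {
	if optVersion != "" {
		return optVersion
	}
	return pkgVersion
}

func main() {
	fmt.Println(effectiveVersion(""))
	fmt.Println(effectiveVersion("1.4.2"))
}
```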
func (*Daemon) HostStatus ¶
func (d *Daemon) HostStatus() *HostStatusDetail
HostStatus returns the most recent hostStatus reported by the platform in a heartbeat response. nil until at least one beat has been ACK'd (or the platform predates Phase 2e). Phase 2e of 2026-05-18-daemon-config-sync-DESIGN.md.
af daemon stats can surface this so an operator sees "your pool was deleted — re-register against pool X" without parsing daemon.log.
func (*Daemon) RoutingTraces ¶
func (d *Daemon) RoutingTraces() *RoutingTraceStore
RoutingTraces returns the daemon's in-process routing trace store. The eventual cross-provider scheduler records its decisions here via store.RecordDecision; the /api/daemon/routing/* HTTP surface reads from it. Exposed so test harnesses (and a future scheduler wire-up) can drive recordings without reaching through internal fields. (Wave 9 / A4.)
func (*Daemon) SessionDetail ¶
func (d *Daemon) SessionDetail(sessionID string) (*SessionDetail, bool)
SessionDetail returns the stored per-session detail for the given session id, or (nil, false) if no detail is recorded. Used by the HTTP server's /api/daemon/sessions/<id> handler.
func (*Daemon) SetWorkareaArchiveRegistry ¶
func (d *Daemon) SetWorkareaArchiveRegistry(reg *WorkareaArchiveRegistry)
SetWorkareaArchiveRegistry replaces the daemon's archive registry with the provided one. Used by tests + by the future pool wire-up (REN-1280) to inject an ActiveWorkareaProvider that sees the live pool.
func (*Daemon) Start ¶
Start brings the daemon online: load config (or wizard), register, start heartbeat, and start the spawner. The HTTP server is NOT started here; callers do that explicitly via Server.Start so they can pick the bind.
func (*Daemon) Stop ¶
Stop performs a graceful shutdown: it drains in-flight sessions, halts the heartbeat/poller loops, closes the yaml watcher, and transitions to StateStopped. The context is currently unused but is retained for future use (e.g. cancelling the drain via ctx.Done). Safe to call concurrently or repeatedly: the whole body is gated by stopOnce, so a deferred Stop() in a test fixture racing with an HTTP /stop handler is benign.
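The stopOnce gating can be sketched with sync.Once. The `supervisor` type below is a hypothetical stand-in for Daemon (whose fields are unexported); it only demonstrates that racing callers run the shutdown body once.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// supervisor is a hypothetical stand-in for Daemon.
type supervisor struct {
	stopOnce sync.Once
	drains   atomic.Int32 // counts how often the shutdown body ran
	done     chan struct{}
}

func newSupervisor() *supervisor {
	return &supervisor{done: make(chan struct{})}
}

// Stop runs the shutdown body at most once, however many callers race.
func (s *supervisor) Stop() {
	s.stopOnce.Do(func() {
		s.drains.Add(1) // draining sessions and halting loops would go here
		close(s.done)
	})
}

// stopConcurrently calls Stop from n goroutines and reports how many
// times the shutdown body actually executed.
func stopConcurrently(n int) int32 {
	s := newSupervisor()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Stop()
		}()
	}
	wg.Wait()
	<-s.done // closed exactly once, so this never blocks after Stop
	return s.drains.Load()
}

func main() {
	fmt.Println("shutdown bodies run:", stopConcurrently(8))
}
```

Closing the channel inside the Once body is what makes a Done-style channel safe to expose: a second close would panic, and the gate rules that out.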
func (*Daemon) SubstrateCapabilities ¶
func (d *Daemon) SubstrateCapabilities() []internaldaemon.SubstrateCapability
SubstrateCapabilities returns the substrate capabilities detected at daemon startup. The slice is nil before Start() is called and non-nil afterwards (even when no optional toolchains were found — the always-present set is returned). The returned slice is a copy; callers may mutate it freely. (ADR-2026-05-12-capacity-pools-and-substrate-resolution.md §H.)
func (*Daemon) Update ¶
func (d *Daemon) Update(ctx context.Context) (*UpdateResult, error)
Update triggers a manual auto-update check.
Behavior: drain → fetch manifest → verify → swap → exit (3). If no update is available the call is idempotent and the daemon transitions back to running. If signature verification fails, the swap is aborted and an error is returned. The caller (HTTP handler) typically returns the outcome to the client and may then call Stop().
type EvictHandler ¶
type EvictHandler interface {
Evict(ctx context.Context, req afclient.EvictPoolRequest) (*afclient.EvictPoolResponse, error)
}
EvictHandler executes a pool eviction request and returns the response.
type HeartbeatMutationFailure ¶
HeartbeatMutationFailure is sent in the request body's mutationFailures[] to ACK a queued daemon-config mutation that failed locally.
type HeartbeatOptions ¶
type HeartbeatOptions struct {
WorkerID string
Hostname string
OrchestratorURL string
// RuntimeJWT is the runtime token (a JWT) returned by /api/workers/register
// and sent in Authorization: Bearer on every heartbeat.
RuntimeJWT string
IntervalSeconds int
GetActiveCount func() int
GetMaxCount func() int
GetStatus func() RegistrationStatus
Region string
// GetAllowlist returns the daemon's current project allowlist entries
// (derived from cfg.Projects). Called every beat so a hot yaml reload
// (when that lands) or in-process mutation reflects in the next
// heartbeat. Returning nil is the canonical "no projects configured"
// signal and triggers an empty AllowlistHash. Optional — callers that
// don't care about allowlist sync can leave it nil.
//
// Phase 1d of 2026-05-18-daemon-config-sync-DESIGN.md.
GetAllowlist func() []ProjectAllowlistEntry
// OnPendingMutations is invoked when the platform attaches one or more
// queued daemon-config mutations to a heartbeat response. The callback
// is expected to apply each mutation against daemon.yaml and return
// which ones succeeded (appliedIDs) and which failed
// (failures). The HeartbeatService buffers these and includes them in
// the NEXT beat's appliedMutations[] / mutationFailures[] fields so
// the platform can ACK and emit audit events.
//
// Optional — leave nil to ignore platform-initiated mutations (the
// daemon will keep working off its yaml as-edited locally). Phase 2c.
OnPendingMutations func(ctx context.Context, mutations []PendingMutation) (appliedIDs []string, failures []HeartbeatMutationFailure)
// OnHostStatus is invoked when the platform's heartbeat response
// reports a non-ok hostStatus (e.g. pool_deleted). The daemon can
// use this to surface re-register guidance in `donmai daemon stats` or
// to enter a non-claiming state. Called with the latest status on
// every beat that includes one, so callers can rely on it to clear
// (status='ok') as well.
//
// Optional — leave nil to ignore. Phase 2e.
OnHostStatus func(detail HostStatusDetail)
// HTTPClient is the client used for the real-endpoint call.
HTTPClient *http.Client
// LogWarn is called when the real-endpoint call fails (transient
// failures are non-fatal — the platform will detect via missed
// heartbeats and Redis TTL expiry).
LogWarn func(format string, args ...any)
// Now provides the heartbeat sentAt timestamp.
Now func() time.Time
// OnHeartbeat is invoked after each heartbeat payload is composed
// (whether or not the network call succeeded). Used by tests and
// observability.
OnHeartbeat func(payload HeartbeatPayload)
// OnReregister is called when the runtime token is rejected (HTTP 401)
// or the worker is reported missing (HTTP 404 — likely Redis TTL
// expired). Implementations re-issue Register() against the platform
// and return the fresh worker id + runtime token. Returning a non-nil
// error leaves the heartbeat in its prior state and logs via LogWarn;
// the next tick retries the heartbeat with the stale token (which will
// fail again and re-trigger this path).
//
// reason is the structured failure reason ("worker-not-found",
// "runtime-token-expired", "unauthorized", "auth-failure"). Callers
// should pass it through to RefreshRuntimeToken so the correct
// recovery path is taken — in particular, "worker-not-found" skips
// the JWT refresh probe and goes directly to full re-registration
// (creating a new Redis entry), while "runtime-token-expired" tries
// the refresh probe first to preserve the workerId.
//
// Required when the daemon runs against a real platform; tests that
// only exercise the local stub path can leave it nil.
OnReregister func(ctx context.Context, reason string) (workerID, runtimeJWT string, err error)
}
HeartbeatOptions configure a HeartbeatService.
type HeartbeatPayload ¶
type HeartbeatPayload struct {
WorkerID string `json:"workerId"`
Hostname string `json:"hostname"`
Status RegistrationStatus `json:"status"`
ActiveSessions int `json:"activeSessions"`
MaxSessions int `json:"maxSessions"`
Region string `json:"region,omitempty"`
SentAt string `json:"sentAt"`
// AllowlistHash is the SHA-256 of the daemon's current project
// allowlist (see allowlist_report.go). Sent on every beat so the
// platform can detect drift cheaply. Empty string when the daemon
// has no projects configured.
//
// Phase 1d of 2026-05-18-daemon-config-sync-DESIGN.md.
AllowlistHash string `json:"allowlistHash,omitempty"`
// Allowlist is the full structured allowlist payload. Included only
// when AllowlistHash changes from the platform's last-known value
// (the daemon caches its previously-reported hash and includes the
// list only on first beat or on change). Steady-state overhead per
// beat is the 64-byte hash + ~8 bytes of JSON framing.
Allowlist []ProjectAllowlistEntry `json:"allowlist,omitempty"`
}
HeartbeatPayload is the body sent on POST /v1/daemon/heartbeat.
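The hash-then-list behaviour of AllowlistHash and Allowlist can be sketched as follows. This is an illustration under stated assumptions: the `entry` fields, the sorted-JSON canonical form, and the `reporter` type are all hypothetical, since the actual encoding lives in allowlist_report.go and is not shown here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// entry is a hypothetical stand-in for ProjectAllowlistEntry.
type entry struct {
	ID         string `json:"id"`
	Repository string `json:"repository"`
}

// allowlistHash returns "" for an empty list, else the hex SHA-256 of a
// canonical (sorted) JSON encoding. The canonical form is an assumption.
func allowlistHash(entries []entry) string {
	if len(entries) == 0 {
		return ""
	}
	sorted := append([]entry(nil), entries...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].ID < sorted[j].ID })
	data, err := json.Marshal(sorted)
	if err != nil {
		return "" // not expected for this struct; kept for completeness
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// reporter remembers the last hash it sent so the full list is attached
// only on the first beat or after a change.
type reporter struct {
	lastHash string
	sentOnce bool
}

// next returns the hash for this beat plus the list to attach (nil when
// the hash is unchanged since the previous beat).
func (r *reporter) next(entries []entry) (hash string, list []entry) {
	hash = allowlistHash(entries)
	if !r.sentOnce || hash != r.lastHash {
		list = entries
	}
	r.lastHash, r.sentOnce = hash, true
	return hash, list
}

func main() {
	r := &reporter{}
	projects := []entry{{ID: "web", Repository: "github.com/acme/web"}}
	for beat := 1; beat <= 2; beat++ {
		hash, list := r.next(projects)
		fmt.Printf("beat %d: hash prefix=%s entries attached=%d\n", beat, hash[:8], len(list))
	}
}
```

Sorting before hashing keeps the hash stable when the yaml lists the same projects in a different order, which is one way to avoid spurious drift reports.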
type HeartbeatService ¶
type HeartbeatService struct {
// contains filtered or unexported fields
}
HeartbeatService manages the periodic heartbeat goroutine. It is safe to Start / Stop multiple times; consecutive Starts are idempotent.
func NewHeartbeatService ¶
func NewHeartbeatService(opts HeartbeatOptions) *HeartbeatService
NewHeartbeatService constructs a HeartbeatService from opts. Required callbacks are GetActiveCount, GetMaxCount, and GetStatus.
func (*HeartbeatService) CurrentCredentials ¶
func (h *HeartbeatService) CurrentCredentials() (workerID, runtimeJWT string)
CurrentCredentials returns the worker id and runtime JWT currently in use. They may differ from the values passed at construction time after a re-register (triggered by HTTP 401 or 404).
func (*HeartbeatService) IsRunning ¶
func (h *HeartbeatService) IsRunning() bool
IsRunning reports whether the heartbeat goroutine is active.
func (*HeartbeatService) LastPayload ¶
func (h *HeartbeatService) LastPayload() HeartbeatPayload
LastPayload returns the most recently composed heartbeat payload (for debugging / status surfaces).
func (*HeartbeatService) Start ¶
func (h *HeartbeatService) Start()
Start launches the heartbeat goroutine. It sends an immediate heartbeat, then continues at IntervalSeconds. Subsequent calls are no-ops.
func (*HeartbeatService) Stop ¶
func (h *HeartbeatService) Stop()
Stop terminates the heartbeat goroutine. Safe to call multiple times.
type HostStatusDetail ¶
type HostStatusDetail struct {
Status string `json:"status"` // ok | pool_deleted | pool_draining | pool_disabled | unauthorized
RecommendedAction string `json:"recommendedAction,omitempty"`
CandidatePoolIDs []string `json:"candidatePoolIds,omitempty"`
}
HostStatusDetail mirrors the platform's wire shape for hostStatus in the heartbeat response. The daemon uses this to decide whether to keep claiming work or surface a re-register recommendation.
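A short sketch of decoding a hostStatus value and acting on it. The struct is copied from the definition above so the example compiles on its own; `shouldClaim` is a hypothetical policy (only "ok" keeps claiming), not the daemon's documented behaviour, and the pool id is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hostStatusDetail copies the documented struct so the sketch compiles alone.
type hostStatusDetail struct {
	Status            string   `json:"status"`
	RecommendedAction string   `json:"recommendedAction,omitempty"`
	CandidatePoolIDs  []string `json:"candidatePoolIds,omitempty"`
}

// shouldClaim is a hypothetical policy: only "ok" keeps the daemon claiming.
func shouldClaim(d hostStatusDetail) bool {
	return d.Status == "ok"
}

// decodeHostStatus parses one hostStatus JSON object.
func decodeHostStatus(body string) (hostStatusDetail, error) {
	var d hostStatusDetail
	err := json.Unmarshal([]byte(body), &d)
	return d, err
}

func main() {
	d, err := decodeHostStatus(`{"status":"pool_deleted","recommendedAction":"re-register","candidatePoolIds":["pool-x"]}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if !shouldClaim(d) {
		fmt.Printf("stop claiming (%s); candidates: %v\n", d.Status, d.CandidatePoolIDs)
	}
}
```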
type KitConfig ¶
type KitConfig struct {
// ScanPaths is the ordered list of directories the kit registry walks
// to find installed kits. Empty / absent means [DefaultKitScanPath()]
// (resolved by applyDefaults).
ScanPaths []string `yaml:"scanPaths,omitempty" json:"scanPaths,omitempty"`
}
KitConfig configures the Layer-4 kit operator surface — the scan paths the daemon walks to discover installed kits. Wave 11 / ADR-2026-05-07 § D4. ScanPaths are evaluated in declaration order; the first entry is also where the .state.json sidecar (enable/disable toggles) lives. A leading `~/` is expanded to the user's home directory by NewKitRegistry.
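The leading `~/` expansion can be sketched as below. `expandHome` is a hypothetical helper, not the function NewKitRegistry uses internally, and the example scan paths are made up.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// expandHome replaces a leading "~/" with home. Any other path, including
// a bare "~" or "~user/...", passes through unchanged in this sketch.
func expandHome(path, home string) string {
	if strings.HasPrefix(path, "~/") {
		return filepath.Join(home, strings.TrimPrefix(path, "~/"))
	}
	return path
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("no home directory:", err)
		return
	}
	for _, p := range []string{"~/kits", "/opt/kits"} {
		fmt.Println(expandHome(p, home))
	}
}
```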
type KitRegistry ¶
type KitRegistry struct {
// contains filtered or unexported fields
}
KitRegistry is a minimal in-process Kit registry.
Methods are safe for concurrent use. The registry rescans on every List call so newly-installed manifests appear without a daemon restart; this is acceptable for an operator-facing surface where call volume is low.
func NewKitRegistry ¶
func NewKitRegistry(scanPaths []string) *KitRegistry
NewKitRegistry constructs a KitRegistry with permissive trust mode.
scanPaths defaults to []string{DefaultKitScanPath()} when nil or empty. The first scan path is also where the .state.json sidecar lives.
Equivalent to NewKitRegistryWithTrust(scanPaths, TrustConfig{Mode: TrustModePermissive}). Callers wiring trust modes (or an issuer allowlist) from daemon.yaml should use NewKitRegistryWithTrust.
func NewKitRegistryWithTrust ¶
func NewKitRegistryWithTrust(scanPaths []string, trust TrustConfig) *KitRegistry
NewKitRegistryWithTrust constructs a KitRegistry with the given trust configuration. Used by Server.kitRegistryOrEmpty to thread the daemon.Config().Trust block into the registry.
If the verifier fails to construct (e.g., the embedded trust root JSON fails to parse), a permissive verifier with no trusted material is installed instead — every signed manifest reports SignedUnverified, every unsigned reports Unsigned, and the install gate behaves as if Mode=Permissive. The construction error is logged via slog.Warn so operators can diagnose.
func (*KitRegistry) Disable ¶
func (r *KitRegistry) Disable(id string) (afclient.Kit, error)
Disable marks the kit disabled in the persisted state. Returns the updated Kit summary or ErrKitNotFound when the id is unknown.
func (*KitRegistry) DisableSource ¶
func (r *KitRegistry) DisableSource(name string) (afclient.KitRegistrySource, error)
DisableSource toggles a registry source off.
func (*KitRegistry) Enable ¶
func (r *KitRegistry) Enable(id string) (afclient.Kit, error)
Enable marks the kit active in the persisted state. Returns the updated Kit summary or ErrKitNotFound when the id is unknown.
func (*KitRegistry) EnableSource ¶
func (r *KitRegistry) EnableSource(name string) (afclient.KitRegistrySource, error)
EnableSource toggles a registry source on. Returns ErrKitSourceNotFound if the name is not in the federation list.
func (*KitRegistry) Get ¶
func (r *KitRegistry) Get(id string) (afclient.KitManifest, error)
Get returns the full manifest for a single kit id. Returns ErrKitNotFound when the id is not registered.
func (*KitRegistry) Install ¶
func (r *KitRegistry) Install(id string, req afclient.KitInstallRequest) (afclient.KitInstallResult, error)
Install fetches a kit from the operator-supplied source, runs the trust-gated verifier against the freshly-fetched manifest, and (when the gate allows) persists the manifest + sibling .sigstore bundle into the first configured scan path.
Behaviour by request shape (audit § 2.1, § 2.2):
- req.Source == nil — the Wave-9 backward-compat path. Returns ErrKitInstallUnimplemented (HTTP 501) so the existing Wave-9 smoke + handler tests posting `{}` keep their assertions intact.
- req.Source.Kind == "git" — clone source.URL @ source.Ref into a temp dir (via gitKitFetcher), locate the manifest, run the verifier, gate on r.verifier.config.Mode, persist into scanPaths[0]. Errors map to ErrKitInstallSourceFetchFailed (502) or ErrKitInstallManifestNotFound (422).
- req.Source.Kind == "tessl" / "agentskills" / "community" — federation cross-repo wave (REN-1308 follow-up). Returns ErrKitSourceFederationUnimplemented (HTTP 501).
- Any other kind — wrapped fmt error (handler-mapped to 400).
Trust override: `req.TrustOverride == "allowed-this-once"` bypasses the gate for a single install with structured slog audit logging. Otherwise an unsigned/unverified manifest under a non-permissive trust mode returns ErrKitTrustGateRejected (HTTP 403).
Manifest persistence uses the atomic tmp-then-rename pattern to match the kit_state writer at saveStateLocked. The on-disk filename is `<sanitizedID>.kit.toml` where slashes in the manifest's `kit.id` are replaced with `__` (the manifest's internal `kit.id` retains the canonical slash form).
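The filename mapping and the tmp-then-rename write can be sketched as follows. `kitFileName`, `writeAtomic`, and `roundTrip` are hypothetical helpers; the registry may apply further sanitization beyond the slash replacement documented above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// kitFileName maps a kit id to its on-disk name: slashes become "__".
func kitFileName(kitID string) string {
	return strings.ReplaceAll(kitID, "/", "__") + ".kit.toml"
}

// writeAtomic writes data to a temp file in dir, then renames it into
// place so readers never observe a half-written manifest.
func writeAtomic(dir, name string, data []byte) (string, error) {
	tmp, err := os.CreateTemp(dir, name+".tmp-*")
	if err != nil {
		return "", err
	}
	tmpPath := tmp.Name()
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		os.Remove(tmpPath)
		return "", err
	}
	if err := tmp.Close(); err != nil {
		os.Remove(tmpPath)
		return "", err
	}
	final := filepath.Join(dir, name)
	if err := os.Rename(tmpPath, final); err != nil {
		os.Remove(tmpPath)
		return "", err
	}
	return final, nil
}

// roundTrip persists a manifest into a scratch directory and reads it back.
func roundTrip(kitID, manifest string) (name, content string, err error) {
	dir, err := os.MkdirTemp("", "kits-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	final, err := writeAtomic(dir, kitFileName(kitID), []byte(manifest))
	if err != nil {
		return "", "", err
	}
	data, err := os.ReadFile(final)
	if err != nil {
		return "", "", err
	}
	return filepath.Base(final), string(data), nil
}

func main() {
	name, content, err := roundTrip("acme/tools", "[kit]\nid = \"acme/tools\"\n")
	if err != nil {
		fmt.Println("persist failed:", err)
		return
	}
	fmt.Printf("%s (%d bytes)\n", name, len(content))
}
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes the swap atomic on POSIX systems.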
func (*KitRegistry) List ¶
func (r *KitRegistry) List() []afclient.Kit
List returns all installed kits across all scan paths. Malformed manifests log a warning and are excluded. Empty scan paths return an empty slice with no error.
func (*KitRegistry) ListSources ¶
func (r *KitRegistry) ListSources() []afclient.KitRegistrySource
ListSources returns the federation order's registry source descriptors. Persisted disable state from .state.json is applied to the Enabled flag.
func (*KitRegistry) ScanPaths ¶
func (r *KitRegistry) ScanPaths() []string
ScanPaths returns the registry's scan paths in declaration order.
func (*KitRegistry) VerifySignature ¶
func (r *KitRegistry) VerifySignature(id string) (afclient.KitSignatureResult, error)
VerifySignature returns a KitSignatureResult for the kit, driven by the sigstore bundle-mode verifier (Wave 12 / S2). The verifier reads the sibling `<manifest>.sigstore` file alongside the kit manifest; missing-bundle returns KitTrustUnsigned with OK: true. Verification outcomes map to KitTrustSignedVerified / KitTrustSignedUnverified; see kit_trust.go for the full state machine.
type MachineConfig ¶
type MachineConfig struct {
ID string `yaml:"id" json:"id"`
Region string `yaml:"region,omitempty" json:"region,omitempty"`
}
MachineConfig captures the machine identity block from daemon.yaml.
type ObservabilityConfig ¶
type ObservabilityConfig struct {
LogFormat string `yaml:"logFormat,omitempty" json:"logFormat,omitempty"`
LogPath string `yaml:"logPath,omitempty" json:"logPath,omitempty"`
MetricsPort int `yaml:"metricsPort,omitempty" json:"metricsPort,omitempty"`
}
ObservabilityConfig holds optional log/metrics tuning.
type Options ¶
type Options struct {
// ConfigPath is where to load / persist daemon.yaml. Defaults to
// DefaultConfigPath().
ConfigPath string
// JWTPath is where to cache the runtime JWT. Defaults to
// DefaultJWTPath().
JWTPath string
// SkipWizard, when true, prevents the interactive wizard from running
// even when stdin is a TTY. The default config (or existing config) is
// used instead.
SkipWizard bool
// SkipRegistration, when true, skips the registration call (used when
// the daemon is being started in setup-only or config-only modes).
SkipRegistration bool
// SpawnerOptions overrides the default spawner options. The Projects
// and MaxConcurrentSessions fields are populated automatically from
// loaded config.
SpawnerOptions SpawnerOptions
// HTTPHost overrides the default control server bind address.
HTTPHost string
// HTTPPort overrides the default control server port.
//
// Zero means "ephemeral port": the listener binds 127.0.0.1:0 and
// the kernel picks a free port. The effective bound port is then
// available via Server.Addr() after Server.Start succeeds.
// Production callers (afcli/daemon_run.go) substitute the
// well-known DefaultHTTPPort (7734) themselves before constructing
// Options so operator behaviour is preserved; the daemon library
// itself does NOT auto-fill — leaving zero-as-ephemeral makes
// parallel tests collision-free under -race.
HTTPPort int
// PoolStatsProvider returns the current workarea pool snapshot. May be
// nil — the /api/daemon/pool/stats endpoint will return an empty
// snapshot in that case (acceptance criterion: pool integration is
// optional in the runtime port; full WorkareaProvider wiring is REN-1280).
PoolStatsProvider PoolStatsProvider
// EvictHandler handles pool eviction requests. May be nil; the endpoint
// returns 501 in that case.
EvictHandler EvictHandler
// ProviderRegistry exposes the daemon's locally-registered AgentRuntime
// providers (claude/codex/ollama/opencode/gemini/amp/stub) to the
// /api/daemon/providers* surface. May be nil — the endpoint will then
// return an empty list with PartialCoverage=true, which is the correct
// behaviour for a daemon that has not yet wired its runtime registry.
// Wave 9 / ADR-2026-05-07-daemon-http-control-api.md §D4.
ProviderRegistry ProviderRegistry
// Version overrides the package-level `Version` for status reporting.
// Empty falls back to the package var (which itself defaults to "dev"
// unless the build injected via -ldflags). Downstream embedders that
// ship their own binary (e.g. the rensei daemon) should set this to
// their own version string so /api/daemon/status reports the
// running binary, not whatever string agentfactory-tui's vendored
// source had at the time.
Version string
}
Options configure a Daemon.
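The zero-means-ephemeral behaviour of HTTPPort relies on standard TCP semantics and can be sketched with the net package alone. `listen` is a hypothetical helper; in the daemon the effective port is read via Server.Addr() as noted in the field comment.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// listen binds host:port. Port 0 asks the kernel for a free port, and the
// bound port is recovered from the listener's address.
func listen(host string, port int) (net.Listener, int, error) {
	ln, err := net.Listen("tcp", net.JoinHostPort(host, strconv.Itoa(port)))
	if err != nil {
		return nil, 0, err
	}
	return ln, ln.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	ln, port, err := listen("127.0.0.1", 0)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("kernel-assigned port:", port)
}
```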
type OrchestratorConfig ¶
type OrchestratorConfig struct {
URL string `yaml:"url" json:"url"`
AuthToken string `yaml:"authToken,omitempty" json:"authToken,omitempty"`
}
OrchestratorConfig is the orchestrator URL + registration token block.
type PendingMutation ¶
type PendingMutation struct {
ID string `json:"id"`
Op string `json:"op"` // project.add | project.remove
Params json.RawMessage `json:"params"`
RequestedAt string `json:"requestedAt"`
RequestedBy string `json:"requestedBy"`
}
PendingMutation mirrors the platform's serializePendingMutation wire shape — included in the heartbeat response so the daemon can apply queued proposals and ACK on the next beat. Phase 2 of 2026-05-18-daemon-config-sync-DESIGN.md.
type PollHTTPError ¶
PollHTTPError is returned by callPollEndpoint for non-2xx responses so the loop can branch on the HTTP status (401 → re-register).
func (*PollHTTPError) Error ¶
func (e *PollHTTPError) Error() string
type PollOptions ¶
type PollOptions struct {
WorkerID string
OrchestratorURL string
RuntimeJWT string
IntervalSeconds int
// HTTPClient is the client used for poll calls. Defaults to a 30s-timeout
// http.Client.
HTTPClient *http.Client
// LogWarn is called for transient poll failures. Defaults to no-op.
LogWarn func(format string, args ...any)
// LogInfo is called when work is dispatched / re-register fires.
LogInfo func(format string, args ...any)
// OnWork is invoked for each item returned in the work[] slice. Errors are
// logged at warn and do not stop the loop. Required.
OnWork func(item PollWorkItem) error
// OnReregister is called on HTTP 401 (runtime JWT expired) or 404 (worker
// fell out of Redis). Implementations re-issue Register() and return the
// fresh worker id + runtime token. The poll loop swaps credentials and
// continues. Returning an error logs it; the loop retries on the next tick.
//
// reason is the structured failure reason ("worker-not-found",
// "runtime-token-expired", "unauthorized", "auth-failure"). Pass it
// through to RefreshRuntimeToken so the correct recovery path is taken
// — "worker-not-found" skips the JWT refresh probe and goes directly to
// full re-registration to create a new Redis entry.
OnReregister func(ctx context.Context, reason string) (workerID, runtimeJWT string, err error)
}
PollOptions configure a single poll loop run.
type PollResponse ¶
type PollResponse struct {
Work []PollWorkItem `json:"work"`
HasInboxMessages bool `json:"hasInboxMessages,omitempty"`
PreClaimed bool `json:"preClaimed,omitempty"`
ClaimedSessionIDs []string `json:"claimedSessionIds,omitempty"`
}
PollResponse is the body of GET /api/workers/<id>/poll. Only the fields the daemon currently consumes are decoded; unknown fields are ignored.
type PollService ¶
type PollService struct {
// contains filtered or unexported fields
}
PollService manages the periodic poll goroutine. Like HeartbeatService it is safe to Start / Stop multiple times; consecutive Starts are idempotent.
func NewPollService ¶
func NewPollService(opts PollOptions) *PollService
NewPollService constructs a PollService from opts. OnWork must be non-nil.
func (*PollService) IsRunning ¶
func (p *PollService) IsRunning() bool
IsRunning reports whether the poll goroutine is active.
func (*PollService) Start ¶
func (p *PollService) Start()
Start launches the poll goroutine. Subsequent calls are no-ops.
func (*PollService) Stop ¶
func (p *PollService) Stop()
Stop terminates the poll goroutine. Safe to call multiple times.
type PollStageBudget ¶
type PollStageBudget struct {
MaxDurationSeconds int `json:"maxDurationSeconds,omitempty"`
MaxSubAgents int `json:"maxSubAgents,omitempty"`
MaxTokens int64 `json:"maxTokens,omitempty"`
}
PollStageBudget mirrors the platform's StageBudget shape so the daemon can decode + forward it without depending on the runner package (cardinal package-architecture rule: daemon does not import runner). The runner re-types this into prompt.StageBudget when it constructs the QueuedWork. (REN-1485 / REN-1487.)
type PollWorkItem ¶
type PollWorkItem struct {
SessionID string `json:"sessionId"`
ProjectName string `json:"projectName,omitempty"`
Repository string `json:"repository,omitempty"`
Ref string `json:"ref,omitempty"`
Priority int `json:"priority,omitempty"`
Env map[string]string `json:"env,omitempty"`
MaxDuration int `json:"maxDurationSeconds,omitempty"`
Resources *SessionResources `json:"resources,omitempty"`
QueuedAt int64 `json:"queuedAt,omitempty"`
ProjectScope string `json:"projectScope,omitempty"`
// REN-1461 / F.2.8 — enriched fields the platform may send so the
// `donmai agent run` worker has the runner context it needs without
// requiring a separate platform fetch. Optional during the rollout
// window; absent fields fall through to the default render path.
IssueID string `json:"issueId,omitempty"`
IssueIdentifier string `json:"issueIdentifier,omitempty"`
LinearSessionID string `json:"linearSessionId,omitempty"`
ProviderSessionID string `json:"providerSessionId,omitempty"`
OrganizationID string `json:"organizationId,omitempty"`
WorkType string `json:"workType,omitempty"`
PromptContext string `json:"promptContext,omitempty"`
Body string `json:"body,omitempty"`
Title string `json:"title,omitempty"`
MentionContext string `json:"mentionContext,omitempty"`
ParentContext string `json:"parentContext,omitempty"`
Branch string `json:"branch,omitempty"`
ResolvedProfile *SessionResolvedProfile `json:"resolvedProfile,omitempty"`
ModelProfile *SessionModelProfile `json:"modelProfile,omitempty"`
// REN-1485 / REN-1487 Phase 2 stage-driven SDLC fields. Populated
// by the platform's `agent.dispatch_stage` action; absent when the
// work was queued by the legacy `agent.dispatch_to_queue` action.
// Round-trip opaquely on the QueuedWork JSON; the daemon forwards
// them onto SessionDetail without interpreting them.
StagePrompt string `json:"stagePrompt,omitempty"`
StageID string `json:"stageId,omitempty"`
StageBudget *PollStageBudget `json:"stageBudget,omitempty"`
StageLifecycle map[string]any `json:"stageLifecycle,omitempty"`
StageSourceEventID string `json:"stageSourceEventId,omitempty"`
}
PollWorkItem mirrors one element of the platform's poll response `work[]` array. The platform serves GET /api/workers/<id>/poll and returns:
{
work: QueuedWork[],
inboxMessages: { [sessionId]: InboxMessage[] },
hasInboxMessages: boolean,
preClaimed: boolean,
claimedSessionIds: string[],
gitCredentials: { token, cloneUrl, expiresAt }[],
}
QueuedWork carries the session-spec fields the daemon needs to dispatch a session to the spawner. Field names follow the platform wire shape (camelCase).
QueuedAt is a Unix-millisecond epoch number on the wire — the platform's QueuedWork interface (packages/agentfactory-server work-queue.ts) defines it as `queuedAt: number`, and the Redis-stored session payload confirms a numeric value (e.g. 1777658441780). v0.4.1 mistakenly typed it as `string`, which caused the daemon's poll loop to fail decoding ("cannot unmarshal number into Go struct field PollWorkItem.work.queuedAt of type string") and silently drop pre-claimed sessions.
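The decoding failure described above is easy to reproduce with encoding/json. The sketch below trims the types to the two fields involved; `workItem`, `legacyItem`, and the sample body are illustrative, with the numeric value taken from the example in the text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// workItem keeps only the fields this sketch needs, with the correct typing.
type workItem struct {
	SessionID string `json:"sessionId"`
	QueuedAt  int64  `json:"queuedAt,omitempty"`
}

// legacyItem reproduces the mistaken string typing for comparison.
type legacyItem struct {
	SessionID string `json:"sessionId"`
	QueuedAt  string `json:"queuedAt,omitempty"`
}

const body = `{"work":[{"sessionId":"s-1","queuedAt":1777658441780}],"preClaimed":true}`

// decodeQueuedAt decodes the body and converts the epoch to a time.Time.
func decodeQueuedAt() (time.Time, error) {
	var resp struct {
		Work []workItem `json:"work"`
	}
	if err := json.Unmarshal([]byte(body), &resp); err != nil {
		return time.Time{}, err
	}
	return time.UnixMilli(resp.Work[0].QueuedAt).UTC(), nil
}

// decodeLegacy attempts the same decode with the string-typed field.
func decodeLegacy() error {
	var resp struct {
		Work []legacyItem `json:"work"`
	}
	return json.Unmarshal([]byte(body), &resp)
}

func main() {
	queued, err := decodeQueuedAt()
	fmt.Println("int64 field:", queued.Format(time.RFC3339), err)
	fmt.Println("string field:", decodeLegacy())
}
```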
type PoolCapacityGuard ¶
type PoolCapacityGuard interface {
// CheckCapacity returns nil + zero retryAfter when a new member
// fits, or a non-zero retryAfter and an explanatory error when the
// pool is saturated.
CheckCapacity() (retryAfter time.Duration, err error)
}
PoolCapacityGuard tells Restore whether a fresh pool member can be admitted. Returning a non-zero retryAfter indicates saturation — Restore propagates that to the HTTP handler as 503 + Retry-After.
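How a saturation signal might surface as 503 plus Retry-After can be sketched with net/http. The interface is copied from above; `fixedGuard`, `restoreHandler`, the 202 success status, the 500 fallback, and the round-up-to-seconds rule are assumptions for illustration, not the daemon's handler.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"time"
)

// capacityGuard copies the documented interface.
type capacityGuard interface {
	CheckCapacity() (retryAfter time.Duration, err error)
}

// fixedGuard is a hypothetical guard that always reports the same state.
type fixedGuard struct{ retryAfter time.Duration }

func (g fixedGuard) CheckCapacity() (time.Duration, error) {
	if g.retryAfter > 0 {
		return g.retryAfter, errors.New("pool saturated")
	}
	return 0, nil
}

// restoreHandler maps a non-zero retryAfter to 503 + Retry-After (whole
// seconds, rounded up) and admits the request otherwise.
func restoreHandler(g capacityGuard) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		retryAfter, err := g.CheckCapacity()
		if retryAfter > 0 {
			secs := int((retryAfter + time.Second - 1) / time.Second)
			w.Header().Set("Retry-After", strconv.Itoa(secs))
			w.WriteHeader(http.StatusServiceUnavailable)
			if err != nil {
				fmt.Fprintln(w, err.Error())
			}
			return
		}
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	}
}

// probeMillis runs one request against a guard reporting ms milliseconds.
func probeMillis(ms int) (status int, retryAfter string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/restore", nil)
	guard := fixedGuard{retryAfter: time.Duration(ms) * time.Millisecond}
	restoreHandler(guard).ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("Retry-After")
}

func main() {
	status, retry := probeMillis(1500)
	fmt.Println(status, retry)
}
```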
type PoolStatsProvider ¶
type PoolStatsProvider interface {
Stats(ctx context.Context) (*afclient.WorkareaPoolStats, error)
}
PoolStatsProvider returns a workarea pool snapshot.
type PrefixWriterFunc ¶
type PrefixWriterFunc func(workerID, line string)
PrefixWriterFunc adapts a function to PrefixedWriter.
func (PrefixWriterFunc) WriteWorkerLine ¶
func (f PrefixWriterFunc) WriteWorkerLine(workerID, line string)
WriteWorkerLine implements PrefixedWriter.
type PrefixedWriter ¶
type PrefixedWriter interface {
WriteWorkerLine(workerID, line string)
}
PrefixedWriter is the minimal sink interface used by the spawner to emit child stdout/stderr. Implementations are responsible for prefixing each line with the worker tag.
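PrefixWriterFunc follows the same adapter idiom as http.HandlerFunc: a function type with a method that calls itself. The sketch below redeclares both types locally so it compiles alone; the "[workerID] line" format and the `collect` and `render` helpers are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixedWriter and prefixWriterFunc mirror the documented declarations.
type prefixedWriter interface {
	WriteWorkerLine(workerID, line string)
}

type prefixWriterFunc func(workerID, line string)

// WriteWorkerLine calls the underlying function, satisfying prefixedWriter.
func (f prefixWriterFunc) WriteWorkerLine(workerID, line string) {
	f(workerID, line)
}

// collect returns a writer that appends "[workerID] line" to buf.
func collect(buf *strings.Builder) prefixedWriter {
	return prefixWriterFunc(func(workerID, line string) {
		fmt.Fprintf(buf, "[%s] %s\n", workerID, line)
	})
}

// render feeds lines through the adapter and returns the tagged output.
func render(workerID string, lines ...string) string {
	var buf strings.Builder
	w := collect(&buf)
	for _, l := range lines {
		w.WriteWorkerLine(workerID, l)
	}
	return buf.String()
}

func main() {
	fmt.Print(render("w1", "starting", "done"))
}
```

The adapter lets tests pass a closure wherever the spawner expects a sink, without declaring a named type per test.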
type ProjectAllowlistEntry ¶
ProjectAllowlistEntry is the wire shape for a single allowlisted project reported by the daemon. Mirrors the on-disk daemon.yaml `projects[]` shape (daemon/config.go ProjectConfig) but trimmed to the fields the platform needs for display + routing-decision visibility.
type ProjectConfig ¶
type ProjectConfig struct {
ID string `yaml:"id" json:"id"`
Repository string `yaml:"repository" json:"repository"`
CloneStrategy CloneStrategy `yaml:"cloneStrategy,omitempty" json:"cloneStrategy,omitempty"`
Git *ProjectGit `yaml:"git,omitempty" json:"git,omitempty"`
}
ProjectConfig describes one entry in the project allowlist.
func (*ProjectConfig) UnmarshalYAML ¶
func (p *ProjectConfig) UnmarshalYAML(node *yaml.Node) error
UnmarshalYAML accepts either the canonical `repository` key or the legacy `repoUrl` key (pre-REN-1419 daemon.yaml files written by older versions of `rensei project allow`). When the legacy key is found a one-line warning is logged so operators know to rewrite the file; this back-compat shim is scheduled for removal one release after the canonical writer ships.
type ProjectGit ¶
type ProjectGit struct {
CredentialHelper string `yaml:"credentialHelper,omitempty" json:"credentialHelper,omitempty"`
SSHKey string `yaml:"sshKey,omitempty" json:"sshKey,omitempty"`
}
ProjectGit captures per-project credential helper / SSH key hints.
type ProvideCapability ¶
type ProvideCapability struct {
Kind string `json:"kind"`
}
ProvideCapability is a single entry in the RegisterRequest.Provides array. It mirrors the SubstrateCapability wire shape used by the internal capability detector but is kept in the daemon package (public API surface) to avoid a cross-package import cycle between daemon and internal/daemon.
type ProviderRegistry ¶
type ProviderRegistry interface {
// Names returns the sorted list of registered provider name strings.
// Each name is the canonical agent.ProviderName string (e.g. "claude",
// "codex"). Order is stable across calls.
Names() []string
// Capabilities returns the typed capability struct serialised to a
// flat map[string]any for the named provider. ok is false when the
// provider is not registered. The map shape matches the JSON encoding
// of agent.Capabilities so the wire shape on /api/daemon/providers
// matches the contract.
Capabilities(name string) (caps map[string]any, ok bool)
}
ProviderRegistry is the minimal read-only view of the runner's in-process AgentRuntime registry the /api/daemon/providers handler consumes. The daemon imports a satisfying type from runner.Registry — the interface keeps this package free of a runner import cycle. (Wave 9 / A1.)
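A map-backed type is enough to satisfy this interface in tests. The sketch below redeclares the interface locally; `mapRegistry` and the "streaming" capability key are hypothetical and do not reflect the real agent.Capabilities encoding.

```go
package main

import (
	"fmt"
	"sort"
)

// providerRegistry mirrors the documented interface.
type providerRegistry interface {
	Names() []string
	Capabilities(name string) (caps map[string]any, ok bool)
}

// mapRegistry is a hypothetical in-memory implementation for tests.
type mapRegistry map[string]map[string]any

// Names returns the registered names sorted, so order is stable across calls.
func (m mapRegistry) Names() []string {
	names := make([]string, 0, len(m))
	for n := range m {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// Capabilities returns the stored map, with ok=false for unknown names.
func (m mapRegistry) Capabilities(name string) (map[string]any, bool) {
	caps, ok := m[name]
	return caps, ok
}

// Compile-time check that mapRegistry satisfies the interface.
var _ providerRegistry = mapRegistry(nil)

func main() {
	reg := mapRegistry{
		"codex":  {"streaming": true},
		"claude": {"streaming": true},
	}
	for _, name := range reg.Names() {
		caps, _ := reg.Capabilities(name)
		fmt.Println(name, caps)
	}
}
```

Sorting in Names matters because Go map iteration order is randomized; without it the "stable across calls" requirement would not hold.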
type RefreshTokenResult ¶
type RefreshTokenResult struct {
// Mode is the path the refresh actually took: "refresh" (platform
// honoured the refresh probe and minted a new JWT bound to the
// same workerId), "reregister" (probe returned 404 / endpoint
// missing — the daemon fell back to full POST /api/workers/register
// and got a NEW workerId), or "error" (both paths failed).
Mode string
// WorkerID is the worker id in effect after the refresh attempt.
// On Mode=refresh this is the SAME workerId; on Mode=reregister
// it's a fresh one.
WorkerID string
// RuntimeToken is the fresh runtime JWT.
RuntimeToken string
// RegistrationTokenSwapped is true when Mode=reregister produced a
// different workerId. Operators care about this signal because the
// platform forgets the old workerId after a fresh registration —
// any in-flight heartbeats / polls keyed on it 404 until the daemon
// swaps credentials. (REN-1481 root cause.)
RegistrationTokenSwapped bool
// Reason is the structured reason the refresh path was taken
// (e.g. "runtime-token-expired", "worker-not-found"). Surfaces in
// the [runtime-token] log line.
Reason string
}
RefreshTokenResult is the outcome of an attempted runtime-token refresh. The OnReregister callback wired into HeartbeatService and PollService synthesises one of these per attempt; logged via the `[runtime-token]` structured line.
func RefreshRuntimeToken ¶
func RefreshRuntimeToken(ctx context.Context, regOpts RegistrationOptions, currentWorkerID string, reason string) (*RefreshTokenResult, error)
RefreshRuntimeToken attempts to refresh the daemon's runtime JWT without re-registering — i.e. preserving the workerId. This is the REN-1481 fix path. Behaviour:
- When reason is "worker-not-found" (HTTP 404 on poll or heartbeat), the worker's Redis registration entry has expired — the runtime token itself is still valid, but the platform has no record of this worker. Probing the refresh endpoint would return a fresh JWT for the SAME workerId, which would loop forever. Skip the probe and go directly to full re-register to create a new Redis entry.
- Otherwise, probe POST /api/workers/<id>/refresh-token with the registration token in the Authorization: Bearer header. On 200, the platform mints a fresh JWT bound to the same workerId — best case.
- On 404 (endpoint missing — current platform-side state) or 405 (method not allowed) from the refresh probe, fall through to FULL re-register via Register(ForceReregister=true). The runtime token gets refreshed but at the cost of a new workerId.
- On any other failure (5xx, network, 401-on-registration-token), return an error. Caller logs + retries on next tick.
This is the only path that should call Register() with ForceReregister=true outside boot. All in-flight 401/404 detection in HeartbeatService / PollService routes through here so the `[runtime-token]` log line is the single source of truth for operators investigating the 5-minute cycle in REN-1481.
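The branching described in the list above can be summarised as a small pure function. This is only a sketch of the decision table, not the package's code: `choosePath` and the `refreshPath` type are hypothetical names, and the real function also performs the HTTP probe and the re-registration itself.

```go
package main

import (
	"fmt"
	"net/http"
)

// refreshPath is a hypothetical label matching the documented Mode values.
type refreshPath string

const (
	pathRefresh    refreshPath = "refresh"
	pathReregister refreshPath = "reregister"
	pathError      refreshPath = "error"
)

// choosePath maps the refresh reason and the probe's HTTP status to a path.
// probeStatus is ignored when reason is "worker-not-found", because the
// documented behaviour skips the probe entirely in that case.
func choosePath(reason string, probeStatus int) refreshPath {
	if reason == "worker-not-found" {
		return pathReregister
	}
	switch probeStatus {
	case http.StatusOK:
		return pathRefresh
	case http.StatusNotFound, http.StatusMethodNotAllowed:
		return pathReregister
	default:
		// 5xx, 401 on the registration token, and so on: the caller logs and retries.
		return pathError
	}
}

func main() {
	fmt.Println(choosePath("runtime-token-expired", http.StatusOK))
	fmt.Println(choosePath("runtime-token-expired", http.StatusNotFound))
	fmt.Println(choosePath("worker-not-found", http.StatusOK))
}
```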
type RegisterRequest ¶
type RegisterRequest struct {
MachineID string `json:"machineId,omitempty"`
Hostname string `json:"hostname"`
Capacity int `json:"capacity"`
Version string `json:"version,omitempty"`
Projects []string `json:"projects,omitempty"`
Provides []ProvideCapability `json:"provides,omitempty"`
// DaemonProjects is the structured project allowlist read from the
// daemon's local config (daemon.yaml's projects[]). Each entry carries
// the project id and resolved repository URL the daemon enforces at
// WorkerSpawner.AcceptWork time. Distinct from the legacy `Projects`
// []string above, which the platform overwrites with Linear-resolved
// names for registration-token auth (see platform/src/app/api/workers/
// register/route.ts:265).
//
// Phase 1c of 2026-05-18-daemon-config-sync-DESIGN.md — read-only mirror;
// platform persists into worker_hosts.allowed_projects jsonb so the
// capacity UI can surface "this host serves projects X, Y, Z" without
// SSH-ing to the host. Omitted when the daemon yaml has no projects[]
// entries; the platform falls back to "unknown / unrestricted" semantics.
DaemonProjects []ProjectAllowlistEntry `json:"daemonProjects,omitempty"`
}
RegisterRequest is the JSON body sent on POST /api/workers/register.
The platform contract (see platform/src/app/api/workers/register/route.ts):
{ machineId?: string, hostname: string, capacity: number, version?: string,
projects?: string[], provides?: []{ kind: string } }
The registration token is sent in the Authorization: Bearer header, NOT in the body. Status / region / activeAgentCount are not part of the platform contract — they live in the heartbeat payload, or are inferred from the project's Linear tracker bindings on the server side.
provides[] carries the substrate capabilities detected by the daemon at startup. Each entry has a `kind` field matching the platform v1 SubstrateCapabilityDeclaration.runtimeKinds enum (ADR-2026-05-12-capacity-pools-and-substrate-resolution.md §2). The platform stamps pool-level capability on worker_hosts.capabilities when this field is present. Omitted on older daemon versions; the platform falls back to provider-class defaults in that case.
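To make the wire contract concrete, the following sketch builds the registration request with local mirror types. It assumes ProvideCapability carries only a `kind` field, omits DaemonProjects because its entry type is not shown here, and uses a made-up base URL and token; the Content-Type header is likewise an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// provideCapability mirrors one provides[] entry (assumed to carry only `kind`).
type provideCapability struct {
	Kind string `json:"kind"`
}

// registerRequest mirrors the documented JSON tags, minus DaemonProjects.
type registerRequest struct {
	MachineID string              `json:"machineId,omitempty"`
	Hostname  string              `json:"hostname"`
	Capacity  int                 `json:"capacity"`
	Version   string              `json:"version,omitempty"`
	Projects  []string            `json:"projects,omitempty"`
	Provides  []provideCapability `json:"provides,omitempty"`
}

// encodeRegisterBody renders the JSON body; omitempty fields drop out when unset.
func encodeRegisterBody(r registerRequest) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// newRegisterRequest puts the registration token in the Authorization header,
// never in the body.
func newRegisterRequest(baseURL, token string, r registerRequest) (*http.Request, error) {
	body, err := encodeRegisterBody(r)
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/api/workers/register"
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json") // assumption, not stated in the contract
	return req, nil
}

func main() {
	r := registerRequest{Hostname: "build-host", Capacity: 2, Provides: []provideCapability{{Kind: "native"}}}
	req, err := newRegisterRequest("https://platform.example", "rsk_live_example", r)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
}
```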
type RegisterResponse ¶
type RegisterResponse struct {
WorkerID string `json:"workerId"`
HeartbeatInterval int `json:"heartbeatInterval"` // ms
PollInterval int `json:"pollInterval"` // ms
RuntimeToken string `json:"runtimeToken"`
RuntimeTokenExpiresAt string `json:"runtimeTokenExpiresAt,omitempty"`
}
RegisterResponse is the JSON response from POST /api/workers/register.
Platform contract:
{ workerId, heartbeatInterval (ms), pollInterval (ms),
runtimeToken, runtimeTokenExpiresAt }
Field names mirror the wire shape; helper methods provide seconds-based accessors used by the heartbeat scheduler.
func Register ¶
func Register(ctx context.Context, opts RegistrationOptions) (*RegisterResponse, error)
Register dials the platform (or the stub path) and returns a RegisterResponse. The cache at jwtPath is consulted first unless opts.ForceReregister is set.
Real-platform registration is the default. The stub path is taken when:
- RENSEI_DAEMON_FORCE_STUB env is set (e.g. =1), OR
- the orchestrator URL is "file://...", OR
- the registration token does not start with rsp_live_ or rsk_live_.
REN-1444 (v0.4.1) inverted the env-gate from opt-in to opt-out. The previous default required RENSEI_DAEMON_REAL_REGISTRATION=1 in the launchd plist; with that env unset, a daemon configured with a real rsk_live_* token would silently fall back to stub mode and never register against the platform.
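The three stub-path conditions can be expressed as one predicate. The sketch below is an illustration under assumptions: `useStubPath` is a hypothetical name, and "env is set" is interpreted as "non-empty", which the source does not spell out.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// useStubPath reports whether registration would take the stub path,
// following the three documented conditions. forceStub is the value of
// RENSEI_DAEMON_FORCE_STUB; treating any non-empty value as "set" is an assumption.
func useStubPath(forceStub, orchestratorURL, token string) bool {
	if forceStub != "" {
		return true
	}
	if strings.HasPrefix(orchestratorURL, "file://") {
		return true
	}
	live := strings.HasPrefix(token, "rsp_live_") || strings.HasPrefix(token, "rsk_live_")
	return !live
}

func main() {
	force := os.Getenv("RENSEI_DAEMON_FORCE_STUB")
	fmt.Println(useStubPath(force, "https://platform.example", "rsk_live_example"))
}
```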
func (*RegisterResponse) HeartbeatIntervalSeconds ¶
func (r *RegisterResponse) HeartbeatIntervalSeconds() int
HeartbeatIntervalSeconds returns the heartbeat cadence in seconds (rounded up). The platform reports the cadence in milliseconds; daemon code that schedules tickers historically worked in seconds.
func (*RegisterResponse) PollIntervalSeconds ¶
func (r *RegisterResponse) PollIntervalSeconds() int
PollIntervalSeconds returns the poll cadence in seconds (rounded up).
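Both accessors round a millisecond cadence up to whole seconds. A minimal sketch of that arithmetic follows; `msToSecondsCeil` is a hypothetical helper, and returning 0 for non-positive input is an assumption rather than documented behaviour.

```go
package main

import "fmt"

// msToSecondsCeil converts milliseconds to seconds, rounding up so a
// sub-second remainder never shortens the interval.
func msToSecondsCeil(ms int) int {
	if ms <= 0 {
		return 0 // assumption: non-positive cadences map to zero
	}
	return (ms + 999) / 1000
}

func main() {
	fmt.Println(msToSecondsCeil(30000)) // 30
	fmt.Println(msToSecondsCeil(1500))  // 2
}
```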
type RegistrationOptions ¶
type RegistrationOptions struct {
OrchestratorURL string
RegistrationToken string
MachineID string
Hostname string
Version string
MaxAgents int
Capabilities []string
Region string
JWTPath string
ForceReregister bool
// Provides is the substrate capability set the daemon advertises to the
// platform at registration time. Each entry corresponds to a
// SubstrateCapabilityDeclaration.runtimeKinds value (e.g. "native",
// "npm", "python-pip"). When nil the field is omitted from the wire
// payload and the platform falls back to provider-class defaults.
// (ADR-2026-05-12-capacity-pools-and-substrate-resolution.md §2, Stream H.)
Provides []ProvideCapability
// DaemonProjects is the structured allowlist reported to the platform
// for read-only mirroring (Phase 1c of daemon-config-sync DESIGN).
// Populate from cfg.Projects at the call site.
DaemonProjects []ProjectAllowlistEntry
// HTTPClient is the client used when the real (non-stub) path is taken.
// Defaults to http.DefaultClient with a 10s timeout.
HTTPClient *http.Client
// Now lets tests deterministically clock the cached-at timestamp.
Now func() time.Time
}
RegistrationOptions configure a single Register call.
type RegistrationStatus ¶
type RegistrationStatus string
RegistrationStatus is the worker-status string sent to the orchestrator in the heartbeat payload. Mirrors the TS DaemonRegistrationStatus.
const (
RegistrationIdle RegistrationStatus = "idle"
RegistrationBusy RegistrationStatus = "busy"
RegistrationDraining RegistrationStatus = "draining"
)
Registration status constants.
type ReservedSystemSpec ¶
type ReservedSystemSpec struct {
VCpu int `yaml:"vCpu" json:"vCpu"`
MemoryMb int `yaml:"memoryMb" json:"memoryMb"`
}
ReservedSystemSpec describes resources reserved for the host OS.
type RoutingTraceStore ¶
type RoutingTraceStore struct {
// contains filtered or unexported fields
}
RoutingTraceStore is the in-process record of routing decisions. The scheduler (or, in this wave, the test harness) feeds it via RecordDecision; HTTP handlers read via GetConfig and Explain.
The store is safe for concurrent use.
func NewRoutingTraceStore ¶
func NewRoutingTraceStore(ringSize int) *RoutingTraceStore
NewRoutingTraceStore constructs a store with the given ring-buffer size. ringSize ≤ 0 falls back to DefaultRoutingRingBufferSize.
func (*RoutingTraceStore) Explain ¶
func (s *RoutingTraceStore) Explain(sessionID string) (afclient.RoutingDecision, []afclient.RoutingTraceStep, bool)
Explain returns the recorded decision and trace for sessionID. Returns false when the session has no recorded decision (or the decision has been evicted from the ring).
func (*RoutingTraceStore) GetConfig ¶
func (s *RoutingTraceStore) GetConfig(providerNames []string, capturedAt time.Time) afclient.RoutingConfig
GetConfig builds the wire-shape RoutingConfig for the /api/daemon/routing/config endpoint. It composes the static portions (weights, capability filters, sandbox/LLM provider state) with the rolling RecentDecisions tail.
The provider-state surfaces are seeded from the runner.Registry's Names() (passed in via providerNames) — this represents AgentRuntime providers. The sandbox state lists only "local" because that's the only OSS-shipped sandbox in this wave. Both lists default to Thompson-Sampling priors (alpha=1, beta=1) when no decisions have been recorded.
capturedAt sets the snapshot timestamp; pass time.Now().UTC() in production.
func (*RoutingTraceStore) Len ¶
func (s *RoutingTraceStore) Len() int
Len returns the current number of recorded decisions in the ring buffer. Test-only helper.
func (*RoutingTraceStore) RecordDecision ¶
func (s *RoutingTraceStore) RecordDecision(decision afclient.RoutingDecision, trace []afclient.RoutingTraceStep)
RecordDecision appends decision + trace to the store. If the store is already at ring capacity, the oldest entry is evicted from both the ring and the per-session lookup. Recording with an empty SessionID is allowed (the ring still tracks it) but the explain lookup is keyed by SessionID, so an unkeyed entry is invisible to Explain.
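The eviction rule (oldest entry leaves both the ring and the per-session lookup, and unkeyed entries never reach the lookup) can be sketched as follows. This is not the package's implementation: the types are simplified stand-ins, a slice plays the role of the ring for brevity, and the fallback size of 64 is a placeholder because the value of DefaultRoutingRingBufferSize is not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// decision is a simplified stand-in for afclient.RoutingDecision.
type decision struct {
	SessionID string
	Provider  string
}

type entry struct {
	seq uint64 // distinguishes repeated records for the same session
	d   decision
}

type traceStore struct {
	mu        sync.Mutex
	size      int
	nextSeq   uint64
	ring      []entry // oldest first
	bySession map[string]entry
}

func newTraceStore(size int) *traceStore {
	if size <= 0 {
		size = 64 // placeholder default, not the real constant
	}
	return &traceStore{size: size, bySession: make(map[string]entry)}
}

// record appends d, evicting the oldest entry from both structures when full.
func (s *traceStore) record(d decision) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.ring) == s.size {
		oldest := s.ring[0]
		s.ring = s.ring[1:]
		// Drop the lookup only if it still points at the evicted entry.
		if cur, ok := s.bySession[oldest.d.SessionID]; ok && cur.seq == oldest.seq {
			delete(s.bySession, oldest.d.SessionID)
		}
	}
	s.nextSeq++
	e := entry{seq: s.nextSeq, d: d}
	s.ring = append(s.ring, e)
	if d.SessionID != "" { // unkeyed entries stay invisible to explain
		s.bySession[d.SessionID] = e
	}
}

func (s *traceStore) explain(sessionID string) (decision, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.bySession[sessionID]
	return e.d, ok
}

func (s *traceStore) count() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.ring)
}

func main() {
	s := newTraceStore(2)
	for _, id := range []string{"a", "b", "c"} {
		s.record(decision{SessionID: id, Provider: "claude"})
	}
	_, ok := s.explain("a")
	fmt.Println("a still explainable:", ok, "entries:", s.count())
}
```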
type Server ¶
type Server struct {
// contains filtered or unexported fields
}
Server is the daemon's HTTP control API. It wraps a Daemon and exposes the endpoints consumed by `donmai daemon …` and `rensei daemon …`.
func NewServer ¶
NewServer builds an HTTP server for d. The handler is registered but the server is not yet listening — call Start to bind.
type SessionDetail ¶
type SessionDetail struct {
// SessionID is the platform session UUID. Always populated.
SessionID string `json:"sessionId"`
// IssueID is the Linear issue UUID this session was triggered for.
IssueID string `json:"issueId,omitempty"`
// IssueIdentifier is the human-readable Linear identifier
// (e.g. "REN-1457").
IssueIdentifier string `json:"issueIdentifier,omitempty"`
// LinearSessionID is the Linear-side agent-session id.
LinearSessionID string `json:"linearSessionId,omitempty"`
// ProviderSessionID is the provider-native session id when this
// is a resume (e.g. Claude session UUID).
ProviderSessionID string `json:"providerSessionId,omitempty"`
// ProjectName is the canonical Linear project identifier.
ProjectName string `json:"projectName,omitempty"`
// OrganizationID is the Rensei tenant UUID.
OrganizationID string `json:"organizationId,omitempty"`
// Repository is the git URL (or owner/name slug) the agent should
// operate on.
Repository string `json:"repository,omitempty"`
// Ref is the base branch / ref to check out from.
Ref string `json:"ref,omitempty"`
// WorkType is the workflow discriminant ("development", "qa",
// "research", ...).
WorkType string `json:"workType,omitempty"`
// PromptContext is the rendered Linear issue context block produced
// by the platform-side dispatcher.
PromptContext string `json:"promptContext,omitempty"`
// Body is the raw Linear issue description text.
Body string `json:"body,omitempty"`
// Title is the Linear issue title.
Title string `json:"title,omitempty"`
// MentionContext is the optional user-mention text from the Linear
// agent-session create event.
MentionContext string `json:"mentionContext,omitempty"`
// ParentContext is the optional parent-issue context block built
// by the coordinator when this session is a sub-agent.
ParentContext string `json:"parentContext,omitempty"`
// Branch is the working branch the agent should create/use.
Branch string `json:"branch,omitempty"`
// ResolvedProfile carries the model-profile knobs the platform
// resolved before queueing this work. Daemon stores opaquely.
ResolvedProfile *SessionResolvedProfile `json:"resolvedProfile,omitempty"`
// ModelProfile is the richer, fully-rendered model-profile the
// platform passes with each dispatch when workType+model-profile
// routing is active (ADR-2026-05-12-worktype-and-model-profile-
// routing). When present it supersedes ResolvedProfile.Provider /
// Model / Effort in the runner. Forwarded opaquely by the daemon.
ModelProfile *SessionModelProfile `json:"modelProfile,omitempty"`
// WorkerID is the daemon worker id that claimed this session.
WorkerID string `json:"workerId,omitempty"`
// AuthToken is the runtime JWT the runner uses for platform API
// calls (heartbeat, result post). Scoped to this worker.
AuthToken string `json:"authToken,omitempty"`
// PlatformURL is the base URL of the platform.
PlatformURL string `json:"platformUrl,omitempty"`
// StagePrompt is the pre-rendered user-prompt body the platform
// dispatcher built from the stage prompt template. When present
// the runner uses it verbatim and skips the embedded user template.
StagePrompt string `json:"stagePrompt,omitempty"`
// StageID is the canonical stage id (e.g. "research",
// "development", "qa"). Used for log correlation + env injection.
StageID string `json:"stageId,omitempty"`
// StageBudget is the per-stage runtime budget the runner enforces.
StageBudget *PollStageBudget `json:"stageBudget,omitempty"`
// StageLifecycle is the lifecycle config for the workflow this
// stage instance belongs to. Forwarded opaquely on WORK_RESULT.
StageLifecycle map[string]any `json:"stageLifecycle,omitempty"`
// StageSourceEventID is the source CloudEvent id the stage trigger
// normaliser emitted. Carried for end-to-end audit correlation.
StageSourceEventID string `json:"stageSourceEventId,omitempty"`
}
SessionDetail is the per-session payload `donmai agent run` reads from the daemon's local control HTTP API on spawn. It carries the full runner-side QueuedWork shape (issue context, resolved profile, branch) plus the platform-side credentials the runner needs to talk back (auth token, platform URL, worker id).
The daemon stores one SessionDetail per accepted session in an in-memory map. A spawned `donmai agent run` process fetches its detail via GET /api/daemon/sessions/<id> at start-up.
Wire shape: JSON, camelCase tags. Forward-compat — new fields can be added freely; clients ignore unknown fields.
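The forward-compatibility rule relies on decoders skipping fields they do not know. The sketch below shows that behaviour with Go's encoding/json, using a hypothetical three-field subset of SessionDetail and an invented `someFutureField` key.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sessionDetail mirrors a small subset of the documented fields.
type sessionDetail struct {
	SessionID string `json:"sessionId"`
	Branch    string `json:"branch,omitempty"`
	WorkType  string `json:"workType,omitempty"`
}

// decodeDetail uses json.Unmarshal, which ignores unknown object keys by default.
func decodeDetail(raw string) (sessionDetail, error) {
	var d sessionDetail
	err := json.Unmarshal([]byte(raw), &d)
	return d, err
}

func main() {
	raw := `{"sessionId":"s-1","branch":"feature/x","someFutureField":{"a":1}}`
	d, err := decodeDetail(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d.SessionID, d.Branch)
}
```

A client that opts into json.Decoder's DisallowUnknownFields would reject such payloads, so it is best avoided on this surface.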
type SessionEvent ¶
type SessionEvent struct {
Kind SessionEventKind
Handle SessionHandle
Spec SessionSpec
ExitErr error
}
SessionEvent is emitted on the spawner's events channel.
type SessionEventKind ¶
type SessionEventKind string
SessionEventKind identifies the kind of SessionEvent.
const (
SessionEventStarted SessionEventKind = "started"
SessionEventEnded SessionEventKind = "ended"
)
Session event kind constants.
type SessionHandle ¶
type SessionHandle struct {
SessionID string `json:"sessionId"`
PID int `json:"pid"`
AcceptedAt string `json:"acceptedAt"`
State SessionState `json:"state"`
}
SessionHandle is the daemon-side handle for an in-flight session.
type SessionModelProfile ¶
type SessionModelProfile struct {
// ID is the model_profile row UUID (e.g. "mp_01jt5...").
ID string `json:"id"`
// ProviderID is the canonical provider family (e.g. "claude", "codex",
// "gemini", "ollama").
ProviderID string `json:"providerId"`
// Model is the model variant within the provider family.
Model string `json:"model"`
// Mode is the reasoning-effort/speed tier (e.g. "xhigh").
Mode string `json:"mode,omitempty"`
// Context is the context-window size in tokens required for this
// dispatch. Zero means "use the model default".
Context int `json:"context,omitempty"`
// MaxOutputTokens is the per-response output-token budget. Zero
// means "use the model default".
MaxOutputTokens int `json:"maxOutputTokens,omitempty"`
}
SessionModelProfile mirrors runner.ResolvedModelProfile but lives in the daemon package to avoid an import cycle. It carries the richer fully-rendered model-profile the platform resolves via the three-axis workType + model-profile routing algorithm (ADR-2026-05-12-worktype-and-model-profile-routing). The daemon forwards it opaquely; `donmai agent run` bridges it into runner.ResolvedModelProfile via detailToQueuedWork.
type SessionResolvedProfile ¶
type SessionResolvedProfile struct {
Provider string `json:"provider,omitempty"`
Runner string `json:"runner,omitempty"`
Model string `json:"model,omitempty"`
Effort string `json:"effort,omitempty"`
CredentialID string `json:"credentialId,omitempty"`
ProviderConfig map[string]any `json:"providerConfig,omitempty"`
}
SessionResolvedProfile mirrors runner.ResolvedProfile but lives in the daemon package to avoid an import cycle (the daemon package must stay independent of the runner package — `donmai agent run` constructs its own runner from this opaque payload).
type SessionResources ¶
type SessionResources struct {
VCpu int `json:"vCpu,omitempty"`
MemoryMB int `json:"memoryMb,omitempty"`
}
SessionResources is the optional resource request on a SessionSpec.
type SessionSpec ¶
type SessionSpec struct {
SessionID string `json:"sessionId"`
Repository string `json:"repository"`
Ref string `json:"ref"`
Resources *SessionResources `json:"resources,omitempty"`
Env map[string]string `json:"env,omitempty"`
MaxDurationSeconds int `json:"maxDurationSeconds,omitempty"`
}
SessionSpec is an inbound work specification dispatched by the orchestrator. Subset of SandboxSpec from 004 relevant to the daemon's session-dispatch path.
type SessionState ¶
type SessionState string
SessionState is the lifecycle of a single worker child process spawned for an accepted session.
const (
SessionStarting SessionState = "starting"
SessionRunning SessionState = "running"
SessionCompleted SessionState = "completed"
SessionFailed SessionState = "failed"
SessionTerminated SessionState = "terminated"
)
Session state constants.
type SpawnerOptions ¶
type SpawnerOptions struct {
Projects []ProjectConfig
MaxConcurrentSessions int
// WorkerCommand is the command to run for each accepted session. The
// caller may pass arbitrary args; the session-specific environment is
// added on top of os.Environ() at spawn time.
//
// When empty, a short-lived /bin/sh stub is used that prints
// "session-started:<id>" and exits 0 — sufficient for testing the
// daemon's accept/lifecycle path without launching real worker binaries.
WorkerCommand []string
// BaseEnv is the environment injected into every worker process.
BaseEnv map[string]string
// OnPreSpawn is an optional hook invoked once per spawn, immediately
// before the child process is exec'd. It receives the final SessionSpec
// and the env slice that would otherwise be exec'd, and returns the env
// slice that will actually be exec'd. Returning nil is equivalent to
// returning the input unchanged.
//
// Callers may use this to layer per-session env entries (e.g.,
// credentials resolved at spawn time) over the spawner's BaseEnv.
// BaseEnv is set once at spawner construction and cannot express
// per-session values; this hook is the extension point for callers
// that need to compute env entries from the inbound SessionSpec.
//
// The hook runs AFTER the BaseEnv + SessionSpec.Env composition, so
// returned entries can both add new keys and override BaseEnv keys.
//
// The hook MUST NOT block on I/O paths that can hang indefinitely.
// Spawn latency budget is on the order of 250ms; if the hook needs
// to do I/O, it should have its own timeout.
OnPreSpawn func(spec SessionSpec, env []string) []string
// Now lets tests deterministically clock acceptedAt timestamps.
Now func() time.Time
// Stdout is where worker stdout is forwarded with a "[worker:<id>]"
// prefix. Defaults to os.Stdout. Set to io.Discard in tests.
StdoutPrefixWriter PrefixedWriter
StderrPrefixWriter PrefixedWriter
}
SpawnerOptions configure a WorkerSpawner.
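A hook matching the OnPreSpawn signature might look like the sketch below. It uses a local `sessionSpec` stand-in and a made-up variable name, `EXAMPLE_REPOSITORY`; the helper replaces an existing key in place instead of relying on how duplicate keys are resolved at exec time.

```go
package main

import (
	"fmt"
	"strings"
)

// sessionSpec is a stand-in for the subset of SessionSpec the hook reads.
type sessionSpec struct {
	SessionID  string
	Repository string
}

// setEnv overwrites key if present in env, otherwise appends it.
func setEnv(env []string, key, value string) []string {
	prefix := key + "="
	for i, kv := range env {
		if strings.HasPrefix(kv, prefix) {
			env[i] = prefix + value
			return env
		}
	}
	return append(env, prefix+value)
}

// preSpawn has the shape func(spec, env) []string expected by OnPreSpawn.
// It does no I/O, so it cannot stall the spawn path.
func preSpawn(spec sessionSpec, env []string) []string {
	return setEnv(env, "EXAMPLE_REPOSITORY", spec.Repository) // hypothetical variable
}

func main() {
	env := []string{"PATH=/usr/bin", "EXAMPLE_REPOSITORY=from-base-env"}
	out := preSpawn(sessionSpec{SessionID: "s-1", Repository: "github.com/acme/app"}, env)
	fmt.Println(out)
}
```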
type TrustConfig ¶
type TrustConfig struct {
// Mode is one of permissive | signed-by-allowlist | attested.
// Empty defaults to permissive (set by applyDefaults).
Mode TrustMode `yaml:"mode,omitempty" json:"mode,omitempty"`
// IssuerSet is an OPTIONAL allowlist of OIDC subject identities
// (Fulcio SAN) the operator considers trusted. Empty = trust any
// signer the embedded trust root can validate (the bundle's chain
// must still verify; this just skips the SAN allowlist filter).
IssuerSet []string `yaml:"issuerSet,omitempty" json:"issuerSet,omitempty"`
// Actor is the operator-declared identity used in the trustOverride
// audit log entry. When empty the actor falls back to
// fmt.Sprintf("uid:%d", os.Getuid()) per coordinator decision
// Q-audit-2 (2026-05-07). The override is also timestamped and
// names the kitId + signerId, so this field is best-effort.
Actor string `yaml:"actor,omitempty" json:"actor,omitempty"`
}
TrustConfig is the daemon-wide trust policy. Lives on Config (NOT on KitConfig) per audit § 1.2: the trust mode applies across plugin families per 015-plugin-spec.md § "Auth + trust", not just kits.
type TrustMode ¶
type TrustMode string
TrustMode is the operator-configured policy for how the install gate reacts to verifier outcomes.
const (
// TrustModePermissive allows install regardless of verifier outcome.
// The verifier still runs and the trust state is reported; this
// matches OSS-execution-layer expectations vs the npm/pip/cargo
// precedent. Default per Q2 of WAVE12_PLAN.
TrustModePermissive TrustMode = "permissive"
// TrustModeSignedByAllowlist rejects unsigned and unverified kits at
// install time; verified-signed kits whose signer matches the
// configured issuer set install normally.
TrustModeSignedByAllowlist TrustMode = "signed-by-allowlist"
// TrustModeAttested is allowlist + (future) SLSA attestation-graph
// requirement. Wave 12 treats it as an alias for allowlist; the
// attestation requirement lands in Wave 13+ alongside the SLSA
// provenance parser.
TrustModeAttested TrustMode = "attested"
)
Trust modes accepted on daemon.yaml `trust.mode`.
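Read together with TrustConfig.IssuerSet, the three modes imply an install-gate decision roughly like the sketch below. `installAllowed` is a hypothetical function, the verifier outcome is reduced to a single boolean, and failing closed on an unrecognised mode is an assumption.

```go
package main

import "fmt"

type trustMode string

const (
	modePermissive        trustMode = "permissive"
	modeSignedByAllowlist trustMode = "signed-by-allowlist"
	modeAttested          trustMode = "attested"
)

// installAllowed sketches the gate: verified means the signature chain
// validated; signer is the identity extracted from the bundle.
func installAllowed(mode trustMode, verified bool, signer string, issuerSet []string) bool {
	switch mode {
	case "", modePermissive: // empty defaults to permissive
		return true
	case modeSignedByAllowlist, modeAttested: // attested is an alias in Wave 12
		if !verified {
			return false
		}
		if len(issuerSet) == 0 {
			return true // empty set: any signer the trust root validates
		}
		for _, allowed := range issuerSet {
			if allowed == signer {
				return true
			}
		}
		return false
	default:
		return false // assumption: unknown modes fail closed
	}
}

func main() {
	issuers := []string{"release@example.com"}
	fmt.Println(installAllowed(modePermissive, false, "", nil))
	fmt.Println(installAllowed(modeSignedByAllowlist, true, "release@example.com", issuers))
	fmt.Println(installAllowed(modeSignedByAllowlist, true, "someone@example.com", issuers))
}
```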
type UpdateChannel ¶
type UpdateChannel string
UpdateChannel is the release channel for the auto-updater.
const (
ChannelStable UpdateChannel = "stable"
ChannelBeta UpdateChannel = "beta"
ChannelMain UpdateChannel = "main"
)
Update channel constants.
type UpdateResult ¶
type UpdateResult struct {
Updated bool `json:"updated"`
Version string `json:"version"`
Reason string `json:"reason"`
}
UpdateResult describes the outcome of a runUpdate call.
type UpdateSchedule ¶
type UpdateSchedule string
UpdateSchedule is the cadence the supervisor wakes the daemon to check.
const (
ScheduleNightly UpdateSchedule = "nightly"
ScheduleOnRelease UpdateSchedule = "on-release"
ScheduleManual UpdateSchedule = "manual"
)
Update schedule constants.
type Updater ¶
type Updater struct {
// contains filtered or unexported fields
}
Updater runs the full update flow: check → fetch → verify → swap → restart.
func NewUpdater ¶
func NewUpdater(opts UpdaterOptions) *Updater
NewUpdater returns an Updater with sane defaults.
func (*Updater) BuildBinaryURL ¶
func (u *Updater) BuildBinaryURL(channel UpdateChannel, version string) string
BuildBinaryURL returns the binary URL for a channel/version.
func (*Updater) BuildManifestURL ¶
func (u *Updater) BuildManifestURL(channel UpdateChannel) string
BuildManifestURL returns the manifest URL for a channel.
func (*Updater) BuildSignatureURL ¶
BuildSignatureURL returns the signature URL for a binary URL.
func (*Updater) CheckForUpdate ¶
func (u *Updater) CheckForUpdate(ctx context.Context) (*VersionManifest, error)
CheckForUpdate fetches the version manifest and returns it iff a strictly newer version is available. Returns (nil, nil) when up-to-date.
type UpdaterOptions ¶
type UpdaterOptions struct {
CurrentVersion string
CurrentBinaryPath string
Config AutoUpdateConfig
// HTTPClient is the client used to fetch the manifest, binary, and
// signature. Defaults to a 60s-timeout client.
HTTPClient *http.Client
// Verifier is the binary-signature verifier. Defaults to
// alwaysFailVerifier (production-safe — no real swaps until configured).
Verifier BinaryVerifier
// SkipExit, when true, prevents the swap step from calling os.Exit. Used
// by tests and by callers that want to handle the restart explicitly.
SkipExit bool
// ExitFn allows tests to inject a fake exit. Called only when SkipExit
// is false. Defaults to os.Exit.
ExitFn func(int)
// CDNBase overrides UpdateCDNBase (test injection).
CDNBase string
// PlatformSuffix overrides the auto-detected suffix (test injection).
PlatformSuffix string
}
UpdaterOptions configure an Updater.
type VersionManifest ¶
type VersionManifest struct {
Version string `json:"version"`
SHA256 string `json:"sha256"`
ReleasedAt string `json:"releasedAt"`
}
VersionManifest is the schema of <channel>/latest.json.
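For readers consuming latest.json directly, the sketch below decodes a manifest and compares its sha256 field with the digest of some downloaded bytes. It assumes the field holds a hex-encoded SHA-256 of the binary; how the Updater itself uses the field (as opposed to the BinaryVerifier signature check) is not documented here, and the sample payload is made up.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

// versionManifest mirrors the documented JSON tags.
type versionManifest struct {
	Version    string `json:"version"`
	SHA256     string `json:"sha256"`
	ReleasedAt string `json:"releasedAt"`
}

func parseManifest(raw string) (versionManifest, error) {
	var m versionManifest
	err := json.Unmarshal([]byte(raw), &m)
	return m, err
}

// digestMatches reports whether data hashes to the manifest's sha256 value,
// compared case-insensitively as hex (encoding is an assumption).
func digestMatches(m versionManifest, data []byte) bool {
	sum := sha256.Sum256(data)
	return strings.EqualFold(hex.EncodeToString(sum[:]), m.SHA256)
}

func main() {
	raw := `{"version":"1.2.3","sha256":"2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824","releasedAt":"2026-05-18T00:00:00Z"}`
	m, err := parseManifest(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.Version, digestMatches(m, []byte("hello")))
}
```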
type WizardOptions ¶
type WizardOptions struct {
// Existing is an existing config (if any) used as defaults.
Existing *Config
// ConfigPath is where to write the resulting config. Empty means do not
// persist.
ConfigPath string
// Stdin is the TTY input. Defaults to os.Stdin.
Stdin io.Reader
// Stdout is where prompts are printed. Defaults to os.Stdout.
Stdout io.Writer
// IsTTY overrides the auto-detected TTY status. When false (and not
// explicitly set true), the wizard returns the default config without
// prompting.
IsTTY *bool
// SkipWizard, when true, returns DefaultConfig (or Existing) without
// prompting. Mirrors the RENSEI_DAEMON_SKIP_WIZARD env var.
SkipWizard bool
// CPUCount overrides runtime.NumCPU() (test injection).
CPUCount int
// MemoryMB overrides total-memory detection (test injection). 0 means
// "use a sensible default".
MemoryMB int
// DetectGitRemote returns the cwd's git remote URL or "" if none. Tests
// inject a stub.
DetectGitRemote func() string
}
WizardOptions configure the interactive setup wizard.
type WorkareaArchiveOptions ¶
type WorkareaArchiveOptions struct {
// Root is the directory the registry scans. Empty selects the
// default ~/.rensei/workareas.
Root string
// ActiveProvider is the live pool view; may be nil (archives-only
// list, see ActiveWorkareaProvider).
ActiveProvider ActiveWorkareaProvider
// PoolGuard is consulted on Restore. May be nil — restore proceeds
// without a saturation check.
PoolGuard PoolCapacityGuard
}
WorkareaArchiveOptions configures a registry.
type WorkareaArchiveRegistry ¶
type WorkareaArchiveRegistry struct {
// contains filtered or unexported fields
}
WorkareaArchiveRegistry is the on-disk archive index. Construct via NewWorkareaArchiveRegistry. Methods are safe for concurrent use.
func NewWorkareaArchiveRegistry ¶
func NewWorkareaArchiveRegistry(opts WorkareaArchiveOptions) *WorkareaArchiveRegistry
NewWorkareaArchiveRegistry constructs a registry against the given archive root. The directory is NOT created at construction time — missing-or-empty roots return an empty list (HTTP 200) per ADR D4a.
func (*WorkareaArchiveRegistry) CountDiff ¶
func (r *WorkareaArchiveRegistry) CountDiff(idA, idB string) (int, error)
CountDiff returns the number of differing entries between two archives without buffering or streaming them. The handler uses this to pick JSON vs NDJSON before opening the response stream.
func (*WorkareaArchiveRegistry) Diff ¶
func (r *WorkareaArchiveRegistry) Diff(idA, idB string) (*afclient.WorkareaDiffResult, error)
Diff returns the structured per-path delta between two archives. Both ids MUST resolve to archives (live diffs are out of scope per ADR D4a). Walks are deterministic — entries are sorted by path. The well-known .rensei/ subtree under each archive's tree/ root is excluded.
func (*WorkareaArchiveRegistry) DiffStream ¶
func (r *WorkareaArchiveRegistry) DiffStream(idA, idB string, emit func(afclient.WorkareaDiffEntry) error) (*afclient.WorkareaDiffSummary, error)
DiffStream emits diff entries through the supplied callback as they are computed. The callback receives one entry at a time; if it returns a non-nil error the walk halts and the error is returned. After all entries are emitted DiffStream returns the aggregate summary so callers can write the trailing NDJSON line.
The streaming variant exists so the HTTP handler can switch its Content-Type on entry count without buffering the entire diff.
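How a handler might combine CountDiff, the streaming threshold, and the emit callback is sketched below. Everything here is illustrative: the entry and summary types are simplified stand-ins with invented fields, and the "json"/"ndjson" labels stand in for whatever Content-Type values the real handler sends. The fallback of 1000 matches the default documented for WorkareaConfig.DiffStreamingThreshold.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// diffEntry and diffSummary are simplified stand-ins for the afclient types.
type diffEntry struct {
	Path   string `json:"path"`
	Status string `json:"status"`
}

type diffSummary struct {
	Total int `json:"total"`
}

// pickDiffEncoding chooses NDJSON only when count exceeds the threshold.
func pickDiffEncoding(count, threshold int) string {
	if threshold <= 0 {
		threshold = 1000 // documented default
	}
	if count > threshold {
		return "ndjson"
	}
	return "json"
}

// renderNDJSON writes one JSON object per line, then the summary as the
// trailing line. json.Encoder.Encode appends the newline itself.
func renderNDJSON(entries []diffEntry, summary diffSummary) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	for _, e := range entries {
		if err := enc.Encode(e); err != nil {
			return "", err
		}
	}
	if err := enc.Encode(summary); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	fmt.Println(pickDiffEncoding(1001, 0))
	out, err := renderNDJSON([]diffEntry{{Path: "a.txt", Status: "added"}}, diffSummary{Total: 1})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```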
func (*WorkareaArchiveRegistry) Get ¶
func (r *WorkareaArchiveRegistry) Get(id string) (*afclient.Workarea, error)
Get returns the full archive record for the named id. The Workarea Kind field is set to WorkareaKindArchived. Returns ErrArchiveNotFound when the id is absent.
func (*WorkareaArchiveRegistry) List ¶
func (r *WorkareaArchiveRegistry) List() (active, archived []afclient.WorkareaSummary, err error)
List walks the archive root and returns the union of on-disk archives (ordered deterministically by id) and the active pool members reported by the configured ActiveWorkareaProvider, if any. A missing or empty root is NOT an error; the response is simply empty active plus empty archived.
func (*WorkareaArchiveRegistry) Restore ¶
func (r *WorkareaArchiveRegistry) Restore(archiveID string, req afclient.WorkareaRestoreRequest) (*afclient.Workarea, time.Duration, error)
Restore materialises an archive into a fresh active pool member. The returned Workarea has Kind=Active and a NEW id distinct from the archive id (archives are immutable per ADR D4a). The tree/ subtree is copied to a per-restore directory under the archive root's sibling "restored/" so operators can find the materialised state from the daemon's host filesystem.
IntoSessionID conflicts return ErrConflict; saturation returns ErrUnavailable + a non-zero retryAfter; corrupted archives return ErrArchiveCorrupted; missing archives return ErrArchiveNotFound.
func (*WorkareaArchiveRegistry) Root ¶
func (r *WorkareaArchiveRegistry) Root() string
Root returns the archive root directory the registry scans. Exposed for tests and operators surfacing the path.
type WorkareaConfig ¶
type WorkareaConfig struct {
// ArchiveRoot is the directory the daemon scans for archived workareas.
// Default ~/.rensei/workareas (resolved at runtime by the handler if
// empty).
ArchiveRoot string `yaml:"archiveRoot,omitempty" json:"archiveRoot,omitempty"`
// DiffStreamingThreshold is the entry count above which the diff
// endpoint switches from a single JSON envelope to NDJSON streaming.
// Default 1000 per ADR D4a.
DiffStreamingThreshold int `yaml:"diffStreamingThreshold,omitempty" json:"diffStreamingThreshold,omitempty"`
}
WorkareaConfig configures the Layer-3 workarea operator surface — archive root scan path, diff streaming threshold. Wave 9 / ADR-2026-05-07.
type WorkerSpawner ¶
type WorkerSpawner struct {
// contains filtered or unexported fields
}
WorkerSpawner manages the lifecycle of worker child processes.
func NewWorkerSpawner ¶
func NewWorkerSpawner(opts SpawnerOptions) *WorkerSpawner
NewWorkerSpawner constructs a spawner. Workers will not be spawned until AcceptWork is called.
func (*WorkerSpawner) AcceptWork ¶
func (s *WorkerSpawner) AcceptWork(spec SessionSpec) (*SessionHandle, error)
AcceptWork validates the spec, spawns a worker, and returns its handle.
func (*WorkerSpawner) ActiveCount ¶
func (s *WorkerSpawner) ActiveCount() int
ActiveCount returns the number of in-flight sessions.
func (*WorkerSpawner) ActiveSessions ¶
func (s *WorkerSpawner) ActiveSessions() []SessionHandle
ActiveSessions returns a snapshot of the current session handles.
func (*WorkerSpawner) ActiveWorkareas ¶
func (s *WorkerSpawner) ActiveWorkareas() []afclient.WorkareaSummary
ActiveWorkareas projects the spawner's in-flight sessions onto the canonical afclient.WorkareaSummary wire shape so the WorkareaArchiveRegistry can union live-pool members with on-disk archives in the GET /api/daemon/workareas response (Wave 11 / S5; ADR-2026-05-07-daemon-http-control-api.md §D4a).
The projection is pull-based — the spawner holds no separate workarea map; each call materialises summaries from the live `sessions` map under the same `mu` lock that ActiveSessions uses. ProjectID is resolved via the project allowlist using the same matcher AcceptWork applies. The summary's ID is the spawner's session id so /api/daemon/workareas/<id> reaches the live entry.
Output is sorted by SessionID for deterministic test assertions.
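The pull-based projection described above can be sketched as follows. All types here (session, summary, spawner) and the repo-keyed projects map are hypothetical simplifications; the real afclient.WorkareaSummary fields and the allowlist matcher are not shown in this section.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// session and summary are simplified, hypothetical stand-ins for the
// spawner's session record and afclient.WorkareaSummary.
type session struct{ ID, Repo string }
type summary struct{ ID, SessionID, ProjectID string }

type spawner struct {
	mu       sync.Mutex
	sessions map[string]session
	projects map[string]string // repo -> project id (assumed matcher)
}

// activeWorkareas builds summaries on demand from the live sessions map,
// under the same lock, and sorts them by SessionID.
func (s *spawner) activeWorkareas() []summary {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]summary, 0, len(s.sessions))
	for id, sess := range s.sessions {
		out = append(out, summary{
			ID:        id, // the summary ID is the session id
			SessionID: id,
			ProjectID: s.projects[sess.Repo],
		})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].SessionID < out[j].SessionID })
	return out
}

func main() {
	s := &spawner{
		sessions: map[string]session{
			"s2": {ID: "s2", Repo: "org/b"},
			"s1": {ID: "s1", Repo: "org/a"},
		},
		projects: map[string]string{"org/a": "proj-a", "org/b": "proj-b"},
	}
	for _, w := range s.activeWorkareas() {
		fmt.Println(w.ID, w.ProjectID)
	}
}
```

Because nothing is cached, each call reflects the sessions map at that instant, at the cost of rebuilding the slice every time. The sort is what makes the result stable despite Go's randomised map iteration order.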
func (*WorkerSpawner) Drain ¶
func (s *WorkerSpawner) Drain(timeout time.Duration) error
Drain waits for all in-flight sessions to exit and then returns. If the timeout elapses first, the remaining sessions receive SIGTERM via context cancellation and Drain returns an error indicating how many were forcibly stopped.
func (*WorkerSpawner) IsAccepting ¶
func (s *WorkerSpawner) IsAccepting() bool
IsAccepting reports whether the spawner is currently accepting work.
func (*WorkerSpawner) On ¶
func (s *WorkerSpawner) On(fn func(SessionEvent))
On registers a session-event listener. Listeners are invoked synchronously from the spawner goroutine; do not block them.
func (*WorkerSpawner) Pause ¶
func (s *WorkerSpawner) Pause()
Pause stops accepting new work but leaves running sessions alive.
func (*WorkerSpawner) SetMaxConcurrentSessions ¶
func (s *WorkerSpawner) SetMaxConcurrentSessions(n int) error
SetMaxConcurrentSessions updates the local session capacity used for future AcceptWork decisions. Existing sessions are never interrupted.
func (*WorkerSpawner) SetProjects ¶
func (s *WorkerSpawner) SetProjects(projects []ProjectConfig)
SetProjects atomically swaps the spawner's project allowlist used by AcceptWork's findProjectLocked check. Existing in-flight sessions continue against whichever project they were dispatched under — the new list governs only future AcceptWork calls.
Phase 2c of 2026-05-18-daemon-config-sync-DESIGN.md — wired by the mutation-applier so platform-driven project.add / project.remove proposals take effect on the very next claim without a daemon restart.
A defensive copy is taken so subsequent mutations to the caller's slice (e.g. daemon.go reusing a single buffer) don't race the spawner.
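The copy-then-swap approach described above might look like the sketch below. The ProjectConfig field and the spawner type are hypothetical simplifications of the real ones.

```go
package main

import (
	"fmt"
	"sync"
)

// ProjectConfig is a stand-in with a single hypothetical field.
type ProjectConfig struct{ ID string }

type spawner struct {
	mu       sync.Mutex
	projects []ProjectConfig
}

// setProjects copies the caller's slice before swapping it in under the
// lock, so later writes to the caller's buffer cannot affect the spawner.
func (s *spawner) setProjects(projects []ProjectConfig) {
	cp := make([]ProjectConfig, len(projects))
	copy(cp, projects)
	s.mu.Lock()
	s.projects = cp
	s.mu.Unlock()
}

func main() {
	s := &spawner{}
	buf := []ProjectConfig{{ID: "a"}}
	s.setProjects(buf)

	buf[0].ID = "b" // caller reuses its buffer

	s.mu.Lock()
	fmt.Println(s.projects[0].ID) // prints "a"
	s.mu.Unlock()
}
```

Note that copy is shallow: if the element type held slices, maps, or pointers, those would still be shared with the caller and would need their own copies.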
Source Files ¶
- allowlist_report.go
- auto_update.go
- child_log.go
- config.go
- daemon.go
- handle_capabilities.go
- handle_kit.go
- handle_provider.go
- handle_routing.go
- handle_workarea.go
- heartbeat.go
- io_helpers.go
- kit_install_git.go
- kit_registry.go
- kit_trust.go
- mutation_apply.go
- poll.go
- registration.go
- routing_state.go
- runtime_token.go
- server.go
- session_detail.go
- setup_wizard.go
- types.go
- workarea_archive.go
- worker_command.go
- worker_spawner.go
- yaml_watcher.go