Documentation ¶
Overview ¶
compact.go owns session compaction: when the live message list grows past CompactionConfig.AutoCompactInputTokens the agent collapses the older portion into a synthetic summary message, preserving a configurable tail of recent turns. Compaction must not split a tool_use from its matching tool_result — see adjustBoundary in T-204.
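The tool_use/tool_result constraint can be illustrated with a minimal sketch. The msg type and the adjustCut helper below are hypothetical stand-ins; the real adjustBoundary may use a different signature and may move the boundary in the other direction. The sketch assumes tool results directly follow the assistant message that issued them.

```go
package main

import "fmt"

// msg is a hypothetical, simplified stand-in for llm.Message.
type msg struct {
	Role      string // "user", "assistant", or "tool"
	ToolUseID string // set on an assistant tool_use and on its tool_result
}

// adjustCut takes a proposed cut index (messages[:cut] is summarized,
// messages[cut:] is preserved) and moves it earlier while the preserved tail
// would start with a tool result. It stops on the assistant message that
// issued the call, so the pair stays together in the preserved tail.
func adjustCut(messages []msg, cut int) int {
	for cut > 0 && cut < len(messages) && messages[cut].Role == "tool" {
		cut--
	}
	return cut
}

func main() {
	messages := []msg{
		{Role: "user"},
		{Role: "assistant", ToolUseID: "t1"}, // tool_use
		{Role: "tool", ToolUseID: "t1"},      // matching tool_result
		{Role: "assistant"},
		{Role: "user"},
	}
	fmt.Println(adjustCut(messages, 2)) // prints 1: the cut moved before the tool_use
	fmt.Println(adjustCut(messages, 3)) // prints 3: already on a safe boundary
}
```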
compact_summary.go owns the deterministic local summarizer that CompactSession folds removed messages through. It makes no LLM call: the summary is rule-based, so compaction is fast, free, and reproducible (the same input yields the same summary across restarts).
events.go defines the tagged union of events that the agent pushes through its Events() channel. It replaces the old Callbacks struct: instead of the UI registering function pointers and the agent calling them synchronously on its goroutine, the UI consumes a typed event stream and translates each event to its own representation (tea.Msg in the TUI, fprintf in the CLI).
Why a sealed interface over a struct-of-functions:
- Order is explicit: events arrive in the order the agent emits them, so the consumer never has to reason about callback interleaving across goroutines.
- Adding a new event kind is one type declaration plus a consumer case, with the compiler flagging missing handling anywhere a switch lists known variants.
- The CLI and TUI adapters are just two consumers of the same channel — no parallel implementations of the same function table to keep in sync.
Permission ask is special. The agent must block until the user answers, so EventPermissionAsk carries a buffered Reply channel. The consumer writes a PermissionResponse on Reply; the agent goroutine is parked on that receive.
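The sealed-interface pattern and the blocking permission ask can be sketched as follows. The type names mirror this package, but every field, the runAgent function, and the consume function are assumptions made for illustration, not the package's actual definitions.

```go
package main

import "fmt"

// Event is sealed: only types in this package can implement the unexported
// marker method, so the set of variants is closed.
type Event interface{ isEvent() }

type EventTextDelta struct{ Text string }

type PermissionResponse struct{ Allow bool }

type EventPermissionAsk struct {
	Tool  string
	Reply chan PermissionResponse // buffered, so the consumer's send never blocks
}

type EventDone struct{}

func (EventTextDelta) isEvent()     {}
func (EventPermissionAsk) isEvent() {}
func (EventDone) isEvent()          {}

// runAgent plays the agent goroutine: it emits events in order and parks on
// Reply until the consumer answers the permission ask.
func runAgent(events chan<- Event) {
	events <- EventTextDelta{Text: "about to run bash"}
	ask := EventPermissionAsk{Tool: "bash", Reply: make(chan PermissionResponse, 1)}
	events <- ask
	resp := <-ask.Reply // blocked here until the user answers
	if resp.Allow {
		events <- EventTextDelta{Text: "bash allowed"}
	} else {
		events <- EventTextDelta{Text: "bash denied"}
	}
	events <- EventDone{}
}

// consume is the UI side: a single type switch translates each event.
func consume(events <-chan Event, allow bool) []string {
	var out []string
	for ev := range events {
		switch e := ev.(type) {
		case EventTextDelta:
			out = append(out, e.Text)
		case EventPermissionAsk:
			e.Reply <- PermissionResponse{Allow: allow}
		case EventDone:
			return out
		}
	}
	return out
}

func main() {
	events := make(chan Event, 256)
	go runAgent(events)
	fmt.Println(consume(events, true)) // prints [about to run bash bash allowed]
}
```

A CLI and a TUI adapter would differ only in what each case of the switch does with the event.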
loop.go holds helpers that sit between the CLI and the agent loop, handling one-shot compound operations (e.g. --compact) that need to drive the agent without entering interactive or one-shot prompt mode.
semantic_compact.go adds LLM-powered semantic compaction on top of the deterministic local summarizer. When context pressure hits the semantic threshold, the agent calls deepseek-v4-flash (thinking disabled) to produce a richer summary. Falls back to deterministic compaction on failure.
Package agent is the deepseekcode ReAct loop. It owns turn boundaries, tool dispatch, callback fan-out, and stop conditions.
The shape closely mirrors charmbracelet/crush's internal/agent/agent.go (callback table + StopWhen []StopCondition). The three load-bearing patches from docs/design.md §6.4 are applied here: stream/present split, finish-reason override, and two-tier timeout (the timeout is applied at the llm.Client layer; this package just configures it).
internal/agent/warm.go
Index ¶
- Constants
- func AffectedPathsFor(reg *tools.Registry, call llm.ToolCall) []string
- func CheckBudget(policy BudgetPolicy, state BudgetState, projectedCNY float64) (allow bool, warn bool)
- func ContextPressure(messages []llm.Message, maxContextTokens int, charsPerToken float64) float64
- func EstimateInputTokens(messages []llm.Message, charsPerToken float64) int
- func EstimateTokens(messages []llm.Message) int
- func EstimateTokensCalibrated(messages []llm.Message, charsPerToken float64) int
- func IsLikelyWarm(lastFingerprint, curFingerprint string, sinceLastUse, ttl time.Duration) bool
- func ProjectedTurnCostCNY(model string, req llm.Request, charsPerToken, cacheHitRate float64, ...) float64
- func RunCompact(ctx context.Context, a *Agent, w io.Writer) error
- func ShouldCompact(messages []llm.Message, cfg CompactionConfig, charsPerToken float64) (ok bool, fromIdx, toIdx int)
- func ShouldSemanticCompact(pressure float64, cfg SemanticCompactionConfig) string
- func WarmthNotice(warm bool, sinceLastUse time.Duration) string
- type Agent
- func (a *Agent) AskQuestion(ctx context.Context, req tools.QuestionRequest) (tools.QuestionResponse, error)
- func (a *Agent) AttachChildTraceSink(child *Agent) *TraceSinkHandle
- func (a *Agent) AttachTraceSink(w io.Writer) *TraceSinkHandle
- func (a *Agent) BranchAt(ctx context.Context, nameOrTurn string, wt *worktree.Manager) (BranchResult, error)
- func (a *Agent) Bus() *Bus
- func (a *Agent) CancelJob(id string) error
- func (a *Agent) Close()
- func (a *Agent) Compact(ctx context.Context) (compacted bool, err error)
- func (a *Agent) CurrentEpochID() string
- func (a *Agent) EmitInfo(msg string)
- func (a *Agent) EnableEscalation(model string)
- func (a *Agent) EnterPlan(_ context.Context) error
- func (a *Agent) Events() <-chan Event
- func (a *Agent) ExitPlan(_ context.Context, plan string) error
- func (a *Agent) ForceCompact(ctx context.Context)
- func (a *Agent) HasActiveBackgroundWork() bool
- func (a *Agent) JobStatus(id string, tailLines int) (tools.Status, error)
- func (a *Agent) ReconcileUndo(ctx context.Context, n int) (int, error)
- func (a *Agent) RecordCheckpoint(name string) int
- func (a *Agent) ReloadSkills(cwd, home string) (ReloadResult, error)
- func (a *Agent) RequestStop()
- func (a *Agent) Run(ctx context.Context, userPrompt string) (reason StopReason, err error)
- func (a *Agent) StartBashJob(ctx context.Context, command string, usePTY bool, timeoutMs int, ...) (string, error)
- func (a *Agent) StaticPrefixFingerprint() string
- func (a *Agent) Steer(text string)
- func (a *Agent) SwitchProfile(p agents.AgentProfile) *PrefixEpoch
- func (a *Agent) ToolIsReadOnly(name string) bool
- func (a *Agent) Transcript() []byte
- func (a *Agent) WaitChildTraces(timeout time.Duration)
- type BranchResult
- type BudgetPolicy
- type BudgetState
- type Bus
- type CapabilitySet
- type CheckpointIndex
- type CompactionConfig
- type CompactionResult
- type EpochComponents
- type EpochManager
- func (m *EpochManager) CreateEpoch(reason string, components EpochComponents) *PrefixEpoch
- func (m *EpochManager) CurrentEpoch() *PrefixEpoch
- func (m *EpochManager) DetectDrift(components EpochComponents) []PendingChange
- func (m *EpochManager) ExpectedCacheMiss() bool
- func (m *EpochManager) FreezeEpoch()
- func (m *EpochManager) InitEpoch(reason string, components EpochComponents) *PrefixEpoch
- func (m *EpochManager) IsFrozen() bool
- func (m *EpochManager) PendingChanges() []PendingChange
- func (m *EpochManager) RecordPendingChange(change PendingChange)
- func (m *EpochManager) SetBus(bus *Bus)
- func (m *EpochManager) SwitchEpoch(reason string, components EpochComponents) *PrefixEpoch
- type Event
- type EventBackgroundJobFinish
- type EventBackgroundJobStart
- type EventBudget
- type EventCompaction
- type EventCompactionWarning
- type EventDone
- type EventDriftBlocked
- type EventEnvelope
- type EventEpochCreated
- type EventEpochFrozen
- type EventEpochSwitched
- type EventEscalated
- type EventHookFired
- type EventInfo
- type EventPendingChange
- type EventPermissionAsk
- type EventPermissionDenied
- type EventPlanUpdate
- type EventQuestionAsk
- type EventReasoningDelta
- type EventReasoningEnd
- type EventReasoningStart
- type EventRepair
- type EventRetry
- type EventSemanticCompaction
- type EventStepFinish
- type EventSubagentFinish
- type EventSubagentStart
- type EventTextDelta
- type EventToolCallDelta
- type EventToolCallResult
- type EventToolCallStart
- type Job
- type JobKind
- type JobRegistry
- func (r *JobRegistry) Cancel(id string) bool
- func (r *JobRegistry) Close()
- func (r *JobRegistry) Finish(id string, state JobState, summary string)
- func (r *JobRegistry) Get(id string) (*Job, bool)
- func (r *JobRegistry) HasActive() bool
- func (r *JobRegistry) JobStatus(id string, tailLines int) (Status, error)
- func (r *JobRegistry) List() []*Job
- func (r *JobRegistry) Start(parent context.Context, kind JobKind, description string) (*Job, context.Context)
- type JobState
- type LoopSpawner
- type MessageTruncator
- type PendingChange
- type PendingChangeKind
- type PermissionResponse
- type Persister
- type PlanItem
- type PrefixEpoch
- type ReceiptAppender
- type ReloadResult
- type SemanticCompactionConfig
- type SemanticCompactionResult
- type Spawner
- type Status
- type StepRecord
- type StopCondition
- type StopReason
- type SubResult
- type SubTask
- type Subscription
- type TraceSink
- type TraceSinkHandle
- type VerifyHook
Constants ¶
const (
BudgetKindWarning = "warning" // crossed WarnCNY
BudgetKindBlocked = "blocked" // crossed HardCNY; the turn is refused
BudgetKindUnpriced = "unpriced" // model has no pricing table; gate can't price it
)
Budget-gate event kinds (EventBudget.Kind), T1.3.
const DefaultSystemPrompt = `` /* 742-byte string literal not displayed */
DefaultSystemPrompt is the cache-stable system prompt. It must not change between turns; that would invalidate the prompt cache and blow the cost story. Versioned by binary release, not by session.
const EventProtocolVersion = 1
EventProtocolVersion is the current envelope version. Increment when the EventEnvelope shape changes in a backward-incompatible way.
const MaxContextTokens = 1_000_000
MaxContextTokens is the default maximum context window size for DeepSeek V4 models (1M context). Used by ContextPressure to compute the usage ratio. Override via Agent.MaxContextTokens.
Variables ¶
This section is empty.
Functions ¶
func AffectedPathsFor ¶
func AffectedPathsFor(reg *tools.Registry, call llm.ToolCall) []string
AffectedPathsFor returns the static affected paths for a tool call (used by the snapshot manager). Bash returns nil since its effects are unknown statically; the destructive-bash check + permission prompt are the safety net there.
func CheckBudget ¶
func CheckBudget(policy BudgetPolicy, state BudgetState, projectedCNY float64) (allow bool, warn bool)
CheckBudget evaluates whether a model turn should proceed given the policy, current state, and projected cost for the upcoming turn.
Returns:
- allow: true if the turn may proceed; false if hard limit is exceeded
- warn: true if this call should emit a warning (first time crossing WarnCNY)
Pure function: does not modify state. The caller updates BudgetState based on the returned flags.
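A minimal sketch of such a pure gate, and of the caller updating state from the returned flags, might look like this. WarnCNY and HardCNY are named in the budget constants above; the BudgetState fields (SpentCNY, Warned), the strict ">" comparison, and the "zero disables" rule per threshold are assumptions.

```go
package main

import "fmt"

type BudgetPolicy struct {
	WarnCNY float64 // soft threshold; assumed: 0 disables the warning
	HardCNY float64 // hard threshold; assumed: 0 disables the block
}

// BudgetState fields are hypothetical; the real struct may differ.
type BudgetState struct {
	SpentCNY float64 // cost accumulated so far
	Warned   bool    // a warning was already emitted this session
}

// checkBudget is pure: it reads policy and state and returns flags only.
func checkBudget(p BudgetPolicy, s BudgetState, projectedCNY float64) (allow, warn bool) {
	total := s.SpentCNY + projectedCNY
	if p.HardCNY > 0 && total > p.HardCNY {
		return false, false // hard limit exceeded: refuse the turn
	}
	warn = p.WarnCNY > 0 && total > p.WarnCNY && !s.Warned
	return true, warn
}

func main() {
	policy := BudgetPolicy{WarnCNY: 1.0, HardCNY: 2.0}
	state := BudgetState{SpentCNY: 0.75}

	allow, warn := checkBudget(policy, state, 0.5)
	fmt.Println(allow, warn) // prints true true: first crossing of WarnCNY

	// The caller, not checkBudget, records that the warning fired.
	if warn {
		state.Warned = true
	}
	state.SpentCNY += 0.5

	allow, warn = checkBudget(policy, state, 0.25)
	fmt.Println(allow, warn) // prints true false: already warned

	allow, warn = checkBudget(policy, state, 1.0)
	fmt.Println(allow, warn) // prints false false: would cross HardCNY
}
```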
func ContextPressure ¶
func ContextPressure(messages []llm.Message, maxContextTokens int, charsPerToken float64) float64
ContextPressure returns the current context usage ratio (0-1).
func EstimateInputTokens ¶ added in v0.3.6
func EstimateInputTokens(messages []llm.Message, charsPerToken float64) int
EstimateInputTokens returns the exact tokenizer count of messages when the embedded V4 tokenizer is available, else the calibrated heuristic. This is the preferred token estimator for compaction triggering and budget projection.
func EstimateTokens ¶
func EstimateTokens(messages []llm.Message) int
EstimateTokens returns the cold-start (UTF-8 byte ÷ 4) token estimate. It is the uncalibrated wrapper over EstimateTokensCalibrated; callers holding a learned per-session ratio should use the calibrated form directly.
func EstimateTokensCalibrated ¶
func EstimateTokensCalibrated(messages []llm.Message, charsPerToken float64) int
EstimateTokensCalibrated estimates the token count of a message list using the given chars-per-token ratio — a per-session value learned from provider usage frames (see Agent.calibrateCharsPerToken). A non-positive ratio falls back to the cold-start prior. Used only for compaction triggering and the pre-stream budget projection — never for cost computation. Cheap, deterministic, no tokenizer dependency.
Rounding-locus note: this divides the SUMMED char count once, whereas the historical EstimateTokens floored each block's len/4 independently. For the existing strict test inputs every block length is a multiple of 4, so the results are identical; for other inputs the two differ by at most (blocks-1) tokens — negligible on a 1M window.
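The rounding-locus difference can be checked with a small worked example. Both functions below are illustrative stand-ins that treat each block as a plain string and measure it with len (bytes); they are not the package's estimators.

```go
package main

import "fmt"

// perBlockFloor mirrors the historical behaviour described above: floor each
// block's len/4 independently, then sum.
func perBlockFloor(blocks []string) int {
	total := 0
	for _, b := range blocks {
		total += len(b) / 4
	}
	return total
}

// summedOnce sums the lengths first and divides once by the ratio.
// A non-positive ratio falls back to the cold-start prior of 4.
func summedOnce(blocks []string, charsPerToken float64) int {
	if charsPerToken <= 0 {
		charsPerToken = 4
	}
	chars := 0
	for _, b := range blocks {
		chars += len(b)
	}
	return int(float64(chars) / charsPerToken)
}

func main() {
	// Three blocks of 7 bytes each: 1+1+1 per block, but 21/4 = 5 when summed.
	odd := []string{"abcdefg", "hijklmn", "opqrstu"}
	fmt.Println(perBlockFloor(odd), summedOnce(odd, 4)) // prints 3 5

	// Lengths that are multiples of 4 give identical results.
	even := []string{"abcd", "efghijkl"}
	fmt.Println(perBlockFloor(even), summedOnce(even, 4)) // prints 3 3
}
```

With three blocks the two estimates differ by 2, which is the (blocks-1) bound stated above.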
func IsLikelyWarm ¶ added in v0.4.0
func IsLikelyWarm(lastFingerprint, curFingerprint string, sinceLastUse, ttl time.Duration) bool
func ProjectedTurnCostCNY ¶
func ProjectedTurnCostCNY(model string, req llm.Request, charsPerToken, cacheHitRate float64, staticResidualTokens int) float64
ProjectedTurnCostCNY returns a pre-stream cost estimate for one model turn. It runs before DeepSeek returns authoritative cache hit/miss usage, so it prices the prompt against the rolling session cache-hit rate (T4.2): the fraction cacheHitRate of input tokens is priced at the cheap cache-hit rate and the rest as cache miss.
The input token count uses the per-session calibrated chars-per-token ratio (charsPerToken<=0 falls back to the cold-start char/4 prior), so the gate tracks the real prompt size on CJK/code-heavy sessions (T4.1).
staticResidualTokens is the learned server-side static-prefix token cost (system+tool-schema template) not visible to the local tokenizer. Zero until learned from the first real usage frame (see Agent.learnStaticResidual).
cacheHitRate is clamped to [0,1] and the hit-token split is floored, so the projection is biased toward cache miss. cacheHitRate==0 (the cold-start floor, before any usage is observed) reproduces the all-miss estimate exactly — the gate is never looser than the conservative default.
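The clamp-and-floor split can be sketched as below. The function names and the per-million prices are placeholders chosen for easy arithmetic; they are not DeepSeek's actual pricing and not this package's internals.

```go
package main

import (
	"fmt"
	"math"
)

// splitTokens clamps rate to [0,1] and floors the hit share, so any rounding
// error lands on the (more expensive) miss side.
func splitTokens(inputTokens int, cacheHitRate float64) (hit, miss int) {
	rate := math.Min(1, math.Max(0, cacheHitRate))
	hit = int(math.Floor(float64(inputTokens) * rate))
	return hit, inputTokens - hit
}

// projectedCost prices hit and miss tokens separately. Prices are per one
// million tokens and are hypothetical inputs here.
func projectedCost(inputTokens int, cacheHitRate, hitPricePerM, missPricePerM float64) float64 {
	hit, miss := splitTokens(inputTokens, cacheHitRate)
	return (float64(hit)*hitPricePerM + float64(miss)*missPricePerM) / 1e6
}

func main() {
	const hitPrice, missPrice = 0.5, 2.0 // placeholder prices, not real rates

	fmt.Println(splitTokens(3, 0.5))                                // prints 1 2
	fmt.Println(projectedCost(1_000_000, 0, hitPrice, missPrice))   // prints 2
	fmt.Println(projectedCost(1_000_000, 0.5, hitPrice, missPrice)) // prints 1.25
	fmt.Println(projectedCost(1_000_000, 1.5, hitPrice, missPrice)) // prints 0.5
}
```

A rate of 0 reproduces the all-miss estimate, and an out-of-range rate such as 1.5 is clamped before it can make the projection cheaper than an all-hit prompt.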
func RunCompact ¶ added in v0.4.0
func RunCompact(ctx context.Context, a *Agent, w io.Writer) error
RunCompact performs a forced compaction on an already-constructed Agent and reports the outcome to w. It is the implementation of the `dsc --compact` flag path.
Error handling:
- If a.Compact returns a non-nil error the error is returned to the caller (CLI will print "dsc: …" and exit 1).
- The compacted bool distinguishes "nothing to compact" from a real compaction so the output message is accurate.
func ShouldCompact ¶
func ShouldCompact(messages []llm.Message, cfg CompactionConfig, charsPerToken float64) (ok bool, fromIdx, toIdx int)
ShouldCompact decides whether the message list has grown enough to merit compaction. Returns ok=true with the proposed [fromIdx, toIdx) window of messages to summarize. The window is not yet boundary-safe — callers must run adjustBoundary (T-204) before deleting anything.
Returns ok=false (and zero indices) when:
- estimated tokens are below AutoCompactInputTokens, or
- len(messages) <= preserve*2 (nothing meaningful to compact)
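The two early-return rules above can be sketched as follows. Messages are plain strings here, PreserveRecent is a hypothetical field name for the preserve count, and the proposed window [0, len-preserve) is an assumption; the real function may choose a different window.

```go
package main

import "fmt"

type compactionConfig struct {
	AutoCompactInputTokens int
	PreserveRecent         int // hypothetical name for the preserved tail size
}

// estimate is a stand-in token estimator: total bytes divided by the ratio.
func estimate(messages []string, charsPerToken float64) int {
	if charsPerToken <= 0 {
		charsPerToken = 4
	}
	chars := 0
	for _, m := range messages {
		chars += len(m)
	}
	return int(float64(chars) / charsPerToken)
}

// shouldCompact returns ok=false with zero indices when the list is too short
// or too small; otherwise it proposes everything before the preserved tail.
func shouldCompact(messages []string, cfg compactionConfig, charsPerToken float64) (ok bool, fromIdx, toIdx int) {
	if len(messages) <= cfg.PreserveRecent*2 {
		return false, 0, 0
	}
	if estimate(messages, charsPerToken) < cfg.AutoCompactInputTokens {
		return false, 0, 0
	}
	return true, 0, len(messages) - cfg.PreserveRecent
}

func main() {
	msg := "0123456789012345678901234567890123456789" // 40 bytes, about 10 tokens
	messages := []string{msg, msg, msg, msg, msg, msg}
	cfg := compactionConfig{AutoCompactInputTokens: 50, PreserveRecent: 2}

	fmt.Println(shouldCompact(messages, cfg, 4))     // prints true 0 4
	fmt.Println(shouldCompact(messages[:4], cfg, 4)) // prints false 0 0
}
```

The returned window is only a proposal; as noted above, it still has to pass through the boundary adjustment before anything is removed.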
func ShouldSemanticCompact ¶
func ShouldSemanticCompact(pressure float64, cfg SemanticCompactionConfig) string
ShouldSemanticCompact decides whether semantic compaction should fire. Returns the action: "none", "warn", "compact", or "protect".
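One way to map a pressure ratio to these four actions is a descending threshold ladder. The config fields (WarnAt, CompactAt, ProtectAt), their ordering, and the "zero value disables" behaviour are all assumptions for this sketch; the real SemanticCompactionConfig may be shaped differently.

```go
package main

import "fmt"

// semanticConfig is a hypothetical stand-in for SemanticCompactionConfig.
type semanticConfig struct {
	WarnAt    float64 // pressure at which to warn
	CompactAt float64 // pressure at which to run semantic compaction
	ProtectAt float64 // pressure at which to take the protective action
}

// action checks the highest threshold first so the most severe action wins.
func action(pressure float64, cfg semanticConfig) string {
	switch {
	case cfg == (semanticConfig{}):
		return "none" // zero value: semantic compaction disabled
	case pressure >= cfg.ProtectAt:
		return "protect"
	case pressure >= cfg.CompactAt:
		return "compact"
	case pressure >= cfg.WarnAt:
		return "warn"
	default:
		return "none"
	}
}

func main() {
	cfg := semanticConfig{WarnAt: 0.5, CompactAt: 0.75, ProtectAt: 0.9}
	for _, p := range []float64{0.25, 0.5, 0.8, 0.95} {
		fmt.Println(p, action(p, cfg))
	}
}
```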
func WarmthNotice ¶ added in v0.4.0
func WarmthNotice(warm bool, sinceLastUse time.Duration) string
IsLikelyWarm reports whether the current static-prefix fingerprint probably still has a hot DeepSeek disk cache from a prior session: same fingerprint and last used within ttl. DeepSeek clears unused prefix caches in hours-to-days, so this is a best-effort hint (used to message "first turn was a cache hit"), never a guarantee.
WarmthNotice returns a one-line human notice when warm==true, or "" when cold. sinceLastUse is rounded to minutes (if <1h) or hours.
Types ¶
type Agent ¶
type Agent struct {
Client *llm.Client
Tools *tools.Registry
Permissions *permissions.Policy
// Persister, if non-nil, receives session and snapshot bookkeeping
// alongside the in-memory Messages list. nil = ephemeral session
// (the -p one-shot mode runs this way).
Persister Persister
// Model is the active main-loop model (e.g. deepseek-v4-flash).
// Changed mid-session via /models.
Model string
Thinking bool
// ThinkingMode, when set (on|off|adaptive), overrides per-turn thinking
// selection for cost experiments (env DEEPSEEKCODE_THINKING_MODE). Empty
// keeps the legacy Thinking/AutoReasoning behavior. "off" never thinks;
// "adaptive" reasons on the first turn (plan) or a repair turn, terse
// otherwise — cutting the 2x-priced reasoning_content tax on routine turns.
ThinkingMode string
// Temperature and TopP, when non-nil, are sent on every model request as
// the OpenAI-shaped sampling controls. nil (the default) omits the field
// entirely so the main-loop wire bytes — and thus the cache fingerprint —
// are unchanged. Sub-agents set these from their def frontmatter (T7.1).
Temperature *float64
TopP *float64
// EscalationModel, when non-empty and different from Model, enables
// model-driven escalation (T2.3): a turn is re-issued once on this model
// when the assistant emits a <<<NEEDS_PRO>>> self-declaration or the
// per-turn repair-error count crosses escalationRepairThreshold. Empty (the
// default) disables escalation entirely — the mechanism is a no-op and adds
// no wire bytes. The marker *contract* (telling the model the marker exists)
// is a separate, opt-in system-prompt addition (see escalationContract); the
// detection here works regardless and never moves the Prefix Fingerprint.
EscalationModel string
// AutoReasoning enables per-turn thinking selection via
// llm.SelectThinking. When true, runStep calls SelectThinking
// with the last user message text to decide thinking on/off.
AutoReasoning bool
// ReasoningEffort is the configured DeepSeek V4 reasoning effort
// level (low, medium, high, max). Carried into every main-loop
// request when thinking is enabled. Empty means omit from wire.
ReasoningEffort llm.ReasoningEffort
// AutoRoute enables the pre-turn cost-aware router (internal/routing):
// per-turn model + effort chosen from the user message and repair signal.
// It never moves the Prefix Fingerprint (model/effort are not in the static
// prefix). Requires EscalationModel set for the pro tier to be reachable.
AutoRoute bool
// AutoClarify gates vague prompts through internal/routing.NeedsClarification
// before spending a (possibly pro/max) turn.
AutoClarify bool
// UserID is an optional DeepSeek field for abuse monitoring and
// enterprise attribution. Empty means omitted from wire.
UserID string
// DisablePrefixEpoch disables the PrefixEpoch feature (for benchmarking).
DisablePrefixEpoch bool
// DisableSemanticCompaction disables semantic (LLM) compaction (for benchmarking).
DisableSemanticCompaction bool
// System is the system prompt. Cache-stable across turns by design.
System string
// PromptBuilder, when non-nil, overrides System with the builder's
// output at the start of Run. The builder owns the static + dynamic
// split (see internal/prompt); the agent just calls Build() to get
// the assembled string. nil → System stays as configured.
PromptBuilder *prompt.SystemPromptBuilder
// CompactionCfg controls when the running message list gets
// collapsed into a synthetic summary. Initialized to
// DefaultCompactionConfig in New; override fields before Run.
CompactionCfg CompactionConfig
// SemanticCfg controls semantic (LLM-powered) compaction.
// Initialized to defaultSemanticCompactionConfig in New;
// override fields before Run. Zero value disables semantic
// compaction (falls back to deterministic only).
SemanticCfg SemanticCompactionConfig
// MaxContextTokens is the maximum context window size for
// context pressure computation. Default: 128_000.
MaxContextTokens int
// HookRunner dispatches lifecycle hooks (PreToolUse, PostToolUse,
// SessionStart, SessionEnd). nil = hooks disabled.
HookRunner *hooks.Runner
// StopWhen runs after each step; first match wins. Defaults below.
StopWhen []StopCondition
// Messages is the conversation. The agent appends user messages,
// assistant turns, and tool results here.
Messages []llm.Message
// StepTimeout, if non-zero, caps the duration of a single step
// (one model turn + tool execution). 0 = no per-step limit.
StepTimeout time.Duration
// MaxToolCalls is the hard cap on total tool calls per session.
// Warns at 80% via OnInfo. 0 = unlimited.
MaxToolCalls int
// IsSubagent is true when this agent was spawned by a parent
// via LoopSpawner. It disables thinking in sub-agents (via
// SelectThinking) and may be used to adjust other behaviors.
IsSubagent bool
// Spawner, when non-nil, enables sub-agent dispatch from slash
// commands that declare agent: or subtask: true. Set by the
// assembly layer (cmd/dsc or TUI) after construction.
Spawner tools.Spawner
// Jobs manages background jobs (async sub-agents and background_bash).
// Initialized in New; closed via defer in Run.
Jobs *JobRegistry
// BudgetPolicy and BudgetState control session cost gating.
// Zero values (default) disable budget checks entirely.
BudgetPolicy BudgetPolicy
BudgetState BudgetState
// Skills is the skill metadata store. nil = no skills loaded.
Skills *skills.Store
// PostEditDiagnostics, when non-nil, is called after each step that
// includes mutating tool calls. It receives the deduplicated list of
// affected file paths and returns a formatted diagnostics string. An
// empty return means no feedback is appended. The returned string is
// injected as a synthetic user message so the model sees it on its
// next turn, following the same pattern as injectLoopBreakNudge.
PostEditDiagnostics func(ctx context.Context, paths []string) string
// Verify, when non-nil, runs a shell command after each step that
// includes mutating tool calls. When the command exits non-zero, the
// synthesized feedback is injected as a synthetic user message so the
// model can fix the reported errors on its next turn.
//
// Verify is also consulted at model-stop time (when the model emits no
// tool calls): if the hook passes, the stop reason is promoted from
// StopModelDone to StopVerifiedDone; if it fails, the feedback is
// injected into a.Messages before returning so the caller (or a
// re-entered loop) has context about the failure. This is the single
// wiring point for verification — do not set a separate VerifyCmd field.
Verify *VerifyHook
// MCPRegistry is the MCP tool registry. nil = no MCP servers.
// Its SchemaHash feeds the epoch's mcp_schema_hash, so startup MCP
// discovery is part of the frozen prefix and mid-session schema
// changes surface as pending changes rather than live drift.
MCPRegistry *mcp.Registry
// Profile is the active first-class agent profile. nil means the
// implicit "default" profile. The profile name feeds the epoch's
// agent_profile_hash; switching profiles via SwitchProfile creates a
// new epoch (one expected cache miss) rather than mutating the live one.
Profile *agents.AgentProfile
// ActiveTiers controls which tool tiers are sent to the model. The
// agent uses Tools.AsLLMToolsFiltered(ActiveTiers...) when building
// requests; a nil/empty slice means "no filter" (all registered tools
// are exposed). The constructor defaults this to [TierCore] (see New).
ActiveTiers []tools.ToolTier
// contains filtered or unexported fields
}
Agent is one running ReAct loop. Construct with New, drive with Run.
Agent is *not* safe for concurrent use within a single session. The TUI wraps it in a goroutine and a consumer reads events from Events() to drive the UI.
func New ¶
New returns an Agent with sensible defaults for v0.1.
The Events channel is buffered at 256: roughly 4 seconds at a 60 tok/s burst rate. Streaming deltas don't block the model goroutine unless the consumer falls more than that behind, which would only happen if the UI goroutine were stuck — an upstream bug we'd want to surface.
func (*Agent) AskQuestion ¶
func (a *Agent) AskQuestion(ctx context.Context, req tools.QuestionRequest) (tools.QuestionResponse, error)
AskQuestion implements tools.Questioner. It emits an EventQuestionAsk and blocks until the consumer replies or ctx is cancelled.
func (*Agent) AttachChildTraceSink ¶
func (a *Agent) AttachChildTraceSink(child *Agent) *TraceSinkHandle
AttachChildTraceSink wires a subagent's event bus into the parent's trace writer, stamping every child record with agent_role="subagent" and the parent's current epoch_id. It is a no-op (returns nil) when the parent has no trace sink attached, so normal interactive/CLI runs are unaffected. The returned handle's Wait blocks until the child's EventDone is processed; the caller closes it after the subagent's Run returns.
func (*Agent) AttachTraceSink ¶
func (a *Agent) AttachTraceSink(w io.Writer) *TraceSinkHandle
AttachTraceSink subscribes a root TraceSink to the agent's bus and starts a drain goroutine. The returned handle's Wait blocks until the run's EventDone is processed, so callers can flush a JSONL file before exit. The sink is also retained on the agent so spawned subagents can tee their own epoch/usage events into the same trace (see AttachChildTraceSink).
func (*Agent) BranchAt ¶ added in v0.4.0
func (a *Agent) BranchAt(ctx context.Context, nameOrTurn string, wt *worktree.Manager) (BranchResult, error)
BranchAt forks the current session at the step identified by nameOrTurn (a checkpoint name or a decimal step index). It:
- Resolves nameOrTurn to a (stepIdx, messageCount) boundary via resolveBranchBoundary.
- Creates a new git worktree via wt.Create (branch name derived from nameOrTurn).
- Returns BranchResult so the TUI/CLI can open the new worktree directory.
BranchAt does NOT truncate a.Messages: the current session continues unaffected. The caller is responsible for launching a new dsc process or TUI session rooted at BranchResult.WorktreePath.
wt may be nil, in which case BranchAt resolves the boundary but skips worktree creation (useful for --dry-run or non-git repos).
func (*Agent) Bus ¶
func (a *Agent) Bus() *Bus
Bus returns the agent's event bus. Additional consumers (loggers, parity recorders, future daemons) subscribe via Bus().Subscribe to receive versioned EventEnvelope values. The primary consumer (TUI/CLI) should continue using Events() for backward compatibility.
func (*Agent) Close ¶
func (a *Agent) Close()
Close releases agent resources. It cancels all running background jobs and should be called when the session ends (not per prompt turn).
func (*Agent) Compact ¶ added in v0.4.0
func (a *Agent) Compact(ctx context.Context) (compacted bool, err error)
Compact forces an immediate compaction (honoring the preserve count) and reports whether a compaction was performed. It is the exported companion to ForceCompact, intended for CLI paths such as `dsc --compact` and for testing: callers that need to distinguish "nothing to compact" from a real compaction can inspect the bool; errors surface as a non-nil second return.
Internally it borrows ForceCompact's threshold-lowering trick, but wraps a snapshot of the message-list length so it can report whether the list actually shrank.
func (*Agent) CurrentEpochID ¶
func (a *Agent) CurrentEpochID() string
CurrentEpochID returns the current epoch's ID, or "" when no epoch has been initialized yet. Used to stamp a subagent's child trace with the parent epoch it ran under.
func (*Agent) EmitInfo ¶
func (a *Agent) EmitInfo(msg string)
EmitInfo pushes an out-of-band notice onto the event stream. Used by adjacent components (e.g. llm.Client.OnRetry) that don't otherwise hold the event channel but want to surface user-visible status.
func (*Agent) EnableEscalation ¶
func (a *Agent) EnableEscalation(model string)
EnableEscalation turns on model-driven escalation to the given model and adds the marker contract to the static system prompt so the model knows to emit <<<NEEDS_PRO>>>. Call it BEFORE Run (before epoch #1 freezes) so the contract is part of the frozen, fingerprinted prefix; it is inserted just before prompt.DynamicContextBoundary (when present) so per-turn dynamic context still follows it. Adding the contract deliberately moves the Prefix Fingerprint for this session — the model name is the only interpolant, so it stays byte-stable across turns — while the default (escalation off) leaves DefaultSystemPrompt and the committed cache-stable golden untouched. No-op when model is empty or already the active model. NOTE: when a PromptBuilder is set it rebuilds a.System each turn, overwriting this injection; such assemblies must add the contract through the builder's static section instead.
func (*Agent) EnterPlan ¶
func (a *Agent) EnterPlan(_ context.Context) error
EnterPlan transitions the agent into plan mode. While in plan mode only read-only tools, question, and plan_exit are available. Calling EnterPlan when already in plan mode returns an error.
func (*Agent) Events ¶
func (a *Agent) Events() <-chan Event
Events returns the receive end of the agent-lifetime event stream. Consume from one goroutine; the agent guarantees in-order delivery. The channel is never closed by the agent — multiple Run calls share it. Consumers should select against their own ctx.Done() to exit cleanly during shutdown.
func (*Agent) ExitPlan ¶
func (a *Agent) ExitPlan(_ context.Context, plan string) error
ExitPlan transitions the agent out of plan mode, restoring the original tool registry and permissions policy. plan is the finalized plan text (unused here; consumed by the plan_exit tool itself). Calling ExitPlan when not in plan mode returns an error.
func (*Agent) ForceCompact ¶
func (a *Agent) ForceCompact(ctx context.Context)
ForceCompact runs maybeCompact with a temporarily-lowered token threshold so the user's /compact slash command can fire even when the message list hasn't hit the auto threshold. The preserve count is honored — too-short transcripts still no-op.
func (*Agent) HasActiveBackgroundWork ¶
func (a *Agent) HasActiveBackgroundWork() bool
HasActiveBackgroundWork reports whether any background job (async subagent or background_bash) is still running. ReloadSkills mutates state those detached goroutines read — notably the shared skill store via skill_read — so a caller must refuse a reload while this is true. a.running alone covers only the main loop, not work that outlives the parent turn.
func (*Agent) ReconcileUndo ¶
func (a *Agent) ReconcileUndo(ctx context.Context, n int) (int, error)
ReconcileUndo rolls the transcript back by n completed steps so the model's view matches the files a /undo just reverted (today /undo reverts files only; a.Messages and disk still claim the reverted turn succeeded). It truncates a.Messages to the boundary the first undone step started from (StepRecord.MessageCount), trims a.steps in lockstep, and — when the Persister supports it — truncates the persisted messages to the same boundary.
It refuses to cross a compaction: compaction renumbers a.Messages, so boundaries recorded by steps below compactionFloor are stale and truncating to them would corrupt the transcript. The caller (TUI) must ensure the agent is not running, since ReconcileUndo mutates a.Messages that runStep reads.
The Static Prefix (system + tools) is untouched, so the cache fingerprint is byte-identical across the undo and the 50x discount survives the rewind. Returns the number of body messages removed.
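The truncation and the compaction guard can be sketched on a simplified session. The session type, its field layout, and the error messages are assumptions; only StepRecord.MessageCount, the lockstep trimming, and the compaction-floor refusal come from the description above. Persistence is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// stepRecord notes how many messages existed when the step started.
type stepRecord struct{ MessageCount int }

// session is a hypothetical, reduced stand-in for the Agent's undo state.
type session struct {
	messages        []string
	steps           []stepRecord
	compactionFloor int // steps with an index below this predate a compaction
}

// reconcileUndo drops the last n steps and truncates messages to the boundary
// at which the first undone step started. It returns the messages removed.
func (s *session) reconcileUndo(n int) (int, error) {
	if n <= 0 || n > len(s.steps) {
		return 0, errors.New("undo count out of range")
	}
	first := len(s.steps) - n
	if first < s.compactionFloor {
		return 0, errors.New("undo would cross a compaction")
	}
	boundary := s.steps[first].MessageCount
	if boundary > len(s.messages) {
		return 0, errors.New("stale step boundary")
	}
	removed := len(s.messages) - boundary
	s.messages = s.messages[:boundary]
	s.steps = s.steps[:first]
	return removed, nil
}

func main() {
	s := &session{
		messages: []string{"user", "asst1", "tool1", "asst2", "tool2", "asst3", "tool3"},
		steps:    []stepRecord{{MessageCount: 1}, {MessageCount: 3}, {MessageCount: 5}},
	}
	removed, err := s.reconcileUndo(2)
	fmt.Println(removed, err, len(s.messages), len(s.steps)) // prints 4 <nil> 3 1
}
```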
func (*Agent) RecordCheckpoint ¶ added in v0.4.0
func (a *Agent) RecordCheckpoint(name string) int
RecordCheckpoint implements tools.CheckpointRecorder. It associates name with the current step count so /branch and --resume-at can find it.
func (*Agent) ReloadSkills ¶
func (a *Agent) ReloadSkills(cwd, home string) (ReloadResult, error)
ReloadSkills re-scans the skill directories under cwd (then home), refreshes the in-place skill store, rebuilds the model-visible system prompt, and — only when the rebuilt prefix actually moves — mints a new prefix epoch so the edit takes effect mid-session.
This is the deliberate, user-triggered exception to skills.LoadScan's session-start-only rule: the skill directory is normally frozen for the whole session to protect DeepSeek's 50x prompt cache. /reload-skills trades exactly one cache miss (the new epoch's first turn) for the model seeing edited skills now instead of next session.
The store is mutated in place (skills.Store.ReplaceFrom). a.Skills and the skill_read dispatcher share one *Store pointer, so the capability set used for drift detection and on-demand skill-body lookups both pick up the reloaded skills atomically.
Concurrency: the caller MUST ensure no turn is in flight — neither the main loop (a.running) NOR any background job (HasActiveBackgroundWork). ReloadSkills mutates a.System, the shared skill store (which runStep's capability set and the skill_read tool both read — including from a still-live async subagent), and the epoch state, none of which is safe to touch while a reader runs. The TUI gates this behind both checks; /undo's guard covers only the main loop, so reload's guard is deliberately stronger.
func (*Agent) RequestStop ¶
func (a *Agent) RequestStop()
RequestStop marks the current run as explicitly stopped by the user, so a subsequent context cancellation is reported as StopUserRequested rather than the ambient StopContextCancel. Callers invoke it immediately before cancelling the run's context (e.g. the TUI's ctrl+c handler). Safe to call from a goroutine other than the one driving Run.
func (*Agent) Run ¶
func (a *Agent) Run(ctx context.Context, userPrompt string) (reason StopReason, err error)
Run drives the loop until a stop condition fires or context cancels. Returns the StopReason and any infrastructure error.
The userPrompt is appended as a user message. To resume without a new user prompt (e.g. after a tool result the model needs to react to), pass "".
Run defers an EventDone emit so the consumer sees a strict terminator AFTER every other event from this turn. Bypassing the events channel for the "done" signal used to race trailing deltas and leave the UI's chrome stuck on "writing…" — never do that.
func (*Agent) StartBashJob ¶
func (a *Agent) StartBashJob(ctx context.Context, command string, usePTY bool, timeoutMs int, sb sandbox.Sandbox, profile sandbox.Profile) (string, error)
StartBashJob implements tools.JobController. It starts a background bash job and returns immediately with the job ID.
func (*Agent) StaticPrefixFingerprint ¶ added in v0.4.0
StaticPrefixFingerprint returns the combined SHA-256 fingerprint of the agent's current static prefix (system prompt + tool schemas). It is the same value that feeds the DeepSeek cache key and the prefix-epoch hash, so it can be persisted cross-session to detect whether the cache is still warm. Returns "" before the first Run (no tools resolved yet) — callers must treat an empty string as "unknown / cold".
func (*Agent) Steer ¶ added in v0.4.0
Steer queues a user instruction to be injected at the next step boundary of the in-flight Run, redirecting the turn without aborting it. Safe to call from a goroutine other than the one driving Run. Empty text is ignored.
func (*Agent) SwitchProfile ¶
func (a *Agent) SwitchProfile(p agents.AgentProfile) *PrefixEpoch
SwitchProfile makes p the active agent profile. It applies the profile's tool tiers and model, then creates a new PrefixEpoch via the epoch manager. The first turn of the new epoch is expected to miss cache (ExpectedCacheMiss); subsequent same-epoch turns stay cache-stable. Returns the new epoch. A no-op when an epoch hasn't been initialized yet (the first runStep will pick up the profile when it creates epoch #1).
func (*Agent) ToolIsReadOnly ¶ added in v0.4.0
ToolIsReadOnly reports whether the named tool declares itself read-only via the tools.ReadOnlyHint interface. Returns false for unknown tools.
func (*Agent) Transcript ¶
Transcript returns a compact wire-format snapshot of recent messages for the Duet builtin hook. The snapshot is bounded so it does not needlessly inflate pro's context; for v0.1 it contains the last 8 messages.
func (*Agent) WaitChildTraces ¶
WaitChildTraces blocks until every tracked subagent trace handle has flushed its child's EventDone, or the shared deadline elapses, then closes them. A one-shot run calls this before closing the root trace so an async (`task` with async:true) subagent's child epoch is flushed instead of being lost when the process exits. No-op when no subagent trace was attached.
If a handle times out, the child never reached EventDone, so its trace is partial. Rather than close it silently, a `child_trace_incomplete` record is written so the gate fails closed instead of trusting a cut-off child.
type BranchResult ¶ added in v0.4.0
type BranchResult struct {
WorktreePath string
Branch string
// StepIdx is the zero-based index into a.steps of the resolved boundary.
StepIdx int
MessageCount int
}
BranchResult is returned by BranchAt to the caller (TUI/CLI).
type BudgetPolicy ¶
type BudgetPolicy struct {
WarnCNY float64 // emit warning when projected spend >= WarnCNY
HardCNY float64 // block turn when projected spend >= HardCNY
}
BudgetPolicy configures session cost thresholds. Zero values disable the corresponding gate.
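A minimal sketch of how the two thresholds could gate a turn, assuming the gate compares a single projected-spend figure against each threshold. The `gate` function and its string results are hypothetical; only the struct fields and the "zero disables" rule come from the documentation above.

```go
package main

import "fmt"

// BudgetPolicy mirrors the documented struct.
type BudgetPolicy struct {
	WarnCNY float64
	HardCNY float64
}

// gate is a hypothetical helper that returns "block", "warn", or "ok" for a
// projected spend. A zero threshold disables the corresponding gate.
func gate(p BudgetPolicy, projectedCNY float64) string {
	if p.HardCNY > 0 && projectedCNY >= p.HardCNY {
		return "block"
	}
	if p.WarnCNY > 0 && projectedCNY >= p.WarnCNY {
		return "warn"
	}
	return "ok"
}

func main() {
	p := BudgetPolicy{WarnCNY: 5, HardCNY: 10}
	fmt.Println(gate(p, 1), gate(p, 5), gate(p, 12))
	fmt.Println(gate(BudgetPolicy{}, 1000)) // both gates disabled
}
```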
type BudgetState ¶
type BudgetState struct {
SpentCNY float64
Warned bool
// Rolling cache-hit accounting over every billed turn this session. Used
// to discount the pre-stream cost projection by the realized cache-hit
// rate (T4.2). Only frames with input tokens (hit+miss>0) fold in, so an
// empty/absent usage frame can't skew the rate.
CacheHitTokens int
CacheMissTokens int
// UnknownModelWarned makes the "model has no known pricing, so the budget
// gate can't cost-gate it" warning fire at most once per session.
UnknownModelWarned bool
}
BudgetState tracks cumulative spend and whether a warning has already been emitted for this session.
func (*BudgetState) FoldCacheUsage ¶
func (s *BudgetState) FoldCacheUsage(hitTokens, missTokens int)
FoldCacheUsage adds one turn's realized cache hit/miss token counts into the rolling session accounting. Turns with no input tokens are ignored so they can't move the rate.
func (BudgetState) SessionCacheHitRate ¶
func (s BudgetState) SessionCacheHitRate() float64
SessionCacheHitRate is the rolling cache-hit fraction in [0,1] over all input tokens billed this session. It returns 0 before any usage is observed (cold start), so a projection discounted by it floors to all-miss — never looser than the conservative default. The result is clamped defensively.
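The fold-then-rate arithmetic described above can be illustrated with a small stand-in. This is a sketch under the assumption that the rate is hits divided by hits plus misses; the `cacheStats` type and its method names are hypothetical and only mirror the two counters on BudgetState.

```go
package main

import "fmt"

// cacheStats is a hypothetical stand-in for BudgetState's two counters.
type cacheStats struct {
	HitTokens  int
	MissTokens int
}

// fold adds one turn's counts, ignoring turns with no input tokens so they
// cannot move the rate.
func (s *cacheStats) fold(hit, miss int) {
	if hit+miss <= 0 {
		return
	}
	s.HitTokens += hit
	s.MissTokens += miss
}

// hitRate returns hits / (hits + misses), clamped to [0,1]. It returns 0
// before any usage is observed, which is the conservative all-miss case.
func (s cacheStats) hitRate() float64 {
	total := s.HitTokens + s.MissTokens
	if total <= 0 {
		return 0
	}
	r := float64(s.HitTokens) / float64(total)
	if r < 0 {
		return 0
	}
	if r > 1 {
		return 1
	}
	return r
}

func main() {
	var s cacheStats
	fmt.Printf("%.2f\n", s.hitRate()) // cold start
	s.fold(0, 0)                      // ignored: no input tokens
	s.fold(900, 100)
	s.fold(300, 700)
	fmt.Printf("%.2f\n", s.hitRate()) // 1200 hits out of 2000 tokens
}
```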
type Bus ¶
type Bus struct {
// contains filtered or unexported fields
}
Bus is a multi-consumer fan-out for agent events. Subscribers receive every published event as an EventEnvelope in publish order.
Ordinary events are delivered non-blocking: if a subscriber's buffer is full the event is dropped and its Dropped counter increments. Reply-carrying events (EventPermissionAsk, EventQuestionAsk) are delivered blocking — they carry a reply channel the agent goroutine parks on, so dropping them would deadlock the agent.
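The two delivery modes can be sketched with a `select` that has a `default` branch for the non-blocking case. This is an illustration only: the `subscriber` type, its `Dropped` field layout, and the string events are hypothetical stand-ins for the real Subscription and EventEnvelope.

```go
package main

import "fmt"

// subscriber is a hypothetical stand-in for a Bus subscription.
type subscriber struct {
	C       chan string
	Dropped int
}

// deliver sends ev without blocking. If the buffer is full, the event is
// dropped and the counter is incremented.
func (s *subscriber) deliver(ev string) {
	select {
	case s.C <- ev:
	default:
		s.Dropped++
	}
}

// deliverBlocking is for reply-carrying events, which must never be dropped
// because the sender is waiting on the reply.
func (s *subscriber) deliverBlocking(ev string) {
	s.C <- ev
}

func main() {
	s := &subscriber{C: make(chan string, 2)}
	for _, ev := range []string{"a", "b", "c"} {
		s.deliver(ev)
	}
	fmt.Println(len(s.C), s.Dropped) // two buffered, one dropped
}
```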
func (*Bus) Close ¶
func (b *Bus) Close()
Close shuts down the Bus and closes every subscriber channel. Publish after Close is a no-op.
func (*Bus) Publish ¶
Publish wraps ev in an EventEnvelope (with a monotonic Seq and current time) and fans it out to every subscriber. The caller must not hold any locks that a subscriber's reading goroutine might also need.
func (*Bus) Subscribe ¶
func (b *Bus) Subscribe(buffer int) *Subscription
Subscribe adds a consumer and returns its Subscription. buffer is the channel capacity; buffer <= 0 defaults to 256 (matching the legacy events channel). The caller must drain C or Unsubscribe to avoid back-pressure on reply events.
func (*Bus) Unsubscribe ¶
func (b *Bus) Unsubscribe(s *Subscription)
Unsubscribe removes the subscription and closes its channel. Safe to call multiple times; subsequent calls are no-ops.
type CapabilitySet ¶
type CapabilitySet struct {
ProfileID string
Skills *skills.Store // nil when no skill store is configured
MCPTools []mcp.McpToolMeta // nil when no MCP servers are connected
}
CapabilitySet is the latent capability identity behind a PrefixEpoch: the inputs that determine *which* StaticPrefix gets built (the active agent profile, the skill catalog, the connected MCP tools) but which are not themselves the model-visible bytes. EpochManager watches it to record pending changes; it is deliberately NOT part of the Prefix Fingerprint — see /CONTEXT.md and docs/adr/0001-prefix-fingerprint-is-model-visible-bytes-only.
Skill and active-MCP changes also move the fingerprint (the skill directory is rendered into the system prompt; active MCP tools are in the tool set), so they are reported here as one fine-grained pending change rather than also as a raw "system"/"tools" change — that is the double-report the cache-epoch review flagged.
type CheckpointIndex ¶ added in v0.4.0
type CheckpointIndex struct {
// contains filtered or unexported fields
}
CheckpointIndex is a concurrency-safe name→stepIdx registry. It is owned by Agent and reset on each Run.
func (*CheckpointIndex) Lookup ¶ added in v0.4.0
func (c *CheckpointIndex) Lookup(name string) (stepIdx int, ok bool)
Lookup returns the step index for name. ok is false if name is unknown.
func (*CheckpointIndex) Names ¶ added in v0.4.0
func (c *CheckpointIndex) Names() []string
Names returns all recorded checkpoint names in sorted order.
func (*CheckpointIndex) Record ¶ added in v0.4.0
func (c *CheckpointIndex) Record(name string, stepIdx int)
Record associates name with stepIdx, overwriting any prior association.
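A concurrency-safe registry with these three methods could look like the sketch below. It assumes a mutex-guarded map, which is one common way to get the documented behaviour; the lowercase `checkpointIndex` type is a hypothetical re-implementation, not the package's code.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// checkpointIndex is a hypothetical name-to-step registry.
type checkpointIndex struct {
	mu sync.RWMutex
	m  map[string]int
}

// Record associates name with stepIdx, overwriting any prior association.
func (c *checkpointIndex) Record(name string, stepIdx int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.m == nil {
		c.m = make(map[string]int)
	}
	c.m[name] = stepIdx
}

// Lookup returns the step index for name; ok is false if name is unknown.
func (c *checkpointIndex) Lookup(name string) (stepIdx int, ok bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	stepIdx, ok = c.m[name]
	return stepIdx, ok
}

// Names returns all recorded names in sorted order.
func (c *checkpointIndex) Names() []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	names := make([]string, 0, len(c.m))
	for n := range c.m {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	var idx checkpointIndex
	idx.Record("before-refactor", 3)
	idx.Record("after-tests", 7)
	idx.Record("before-refactor", 5) // overwrites the earlier step
	fmt.Println(idx.Lookup("before-refactor"))
	fmt.Println(idx.Names())
}
```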
type CompactionConfig ¶
type CompactionConfig struct {
// PreserveRecentMessages is how many trailing messages stay
// outside the compaction window (default 4).
PreserveRecentMessages int
// MaxEstimatedTokens caps the compacted summary's own token
// budget — used by the summarizer to truncate (default 10_000).
MaxEstimatedTokens int
// AutoCompactInputTokens is the trigger threshold: once the
// estimated token count of the full message list exceeds this
// value, compaction fires (default 100_000; override via env
// DEEPSEEKCODE_AUTO_COMPACT_INPUT_TOKENS).
AutoCompactInputTokens int
// CacheUnit, when > 0, is the measured DeepSeek cache-unit boundary
// (tokens) used to align the rebuilt post-compaction tail (§3.6). After a
// compaction the live transcript is [summary || kept-tail]; padding the
// summary message so the whole rebuilt body lands on a cache-unit multiple
// maximizes the reusable, fully-persisted portion on the next turn. 0
// (default) = disabled, in which case CompactSession is byte-identical to
// its pre-§3.6 behavior. Comes from a cacheprobe measurement; it never
// touches the frozen prefix (the summary is an assistant body message).
CacheUnit int
}
CompactionConfig controls when and how the agent compacts its running message list. Values flow in via Agent.CompactionCfg; the agent reads them under no lock — set them once at construction.
func DefaultCompactionConfig ¶
func DefaultCompactionConfig() CompactionConfig
DefaultCompactionConfig returns the default config. The AutoCompactInputTokens value can be overridden at process start via DEEPSEEKCODE_AUTO_COMPACT_INPUT_TOKENS — malformed values fall back to the default rather than crash.
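The fall-back-on-malformed behaviour can be sketched as below. The helper name `autoCompactThreshold` is hypothetical, and treating non-positive values as malformed is an assumption made for the example; the documentation only states that malformed values fall back to the default of 100_000.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultAutoCompact is the documented default trigger threshold.
const defaultAutoCompact = 100_000

// autoCompactThreshold parses the raw environment value. Empty or malformed
// input falls back to the default instead of failing. Rejecting non-positive
// values is an assumption of this sketch.
func autoCompactThreshold(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return defaultAutoCompact
	}
	return n
}

func main() {
	raw := os.Getenv("DEEPSEEKCODE_AUTO_COMPACT_INPUT_TOKENS")
	fmt.Println(autoCompactThreshold(raw))
}
```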
type CompactionResult ¶
type CompactionResult struct {
Summary string
FromIdx, ToIdx int
RemovedCount int
SummaryMessage llm.Message
KeptMessages []llm.Message
}
CompactionResult is what CompactSession produces. Summary == "" means "no compaction performed" — the caller should leave the message list untouched.
func CompactSession ¶
func CompactSession(messages []llm.Message, cfg CompactionConfig, charsPerToken float64) CompactionResult
CompactSession runs the full pipeline: ShouldCompact → adjustBoundary → summarize. Returns a CompactionResult with Summary == "" when no compaction was performed (caller must check Summary before mutating its message list).
CompactSession does NOT persist — the caller wires the result into Persister.ReplaceWithCompaction (T-209) and replaces its in-memory a.Messages slice.
type EpochComponents ¶
type EpochComponents struct {
AgentProfileID string
Model string
ReasoningEffort string
StaticSystem string
FewShots []llm.Message
ToolSpecs []llm.Tool
Capability CapabilitySet
}
EpochComponents is the input for creating a PrefixEpoch. StaticSystem, ToolSpecs (and, when folded in, FewShots) are the model-visible bytes that determine the Prefix Fingerprint; Capability is the latent identity used only for pending-change detection.
type EpochManager ¶
type EpochManager struct {
// contains filtered or unexported fields
}
EpochManager manages PrefixEpoch lifecycle.
func NewEpochManager ¶
func NewEpochManager() *EpochManager
func (*EpochManager) CreateEpoch ¶
func (m *EpochManager) CreateEpoch(reason string, components EpochComponents) *PrefixEpoch
CreateEpoch builds a new PrefixEpoch from components but does not make it current. To make an epoch current, use SwitchEpoch or, for the first epoch of a session, InitEpoch.
func (*EpochManager) CurrentEpoch ¶
func (m *EpochManager) CurrentEpoch() *PrefixEpoch
CurrentEpoch returns the current epoch. Returns nil if no epoch has been initialized.
func (*EpochManager) DetectDrift ¶
func (m *EpochManager) DetectDrift(components EpochComponents) []PendingChange
DetectDrift records the latent capability deltas (profile / skills / MCP) between the frozen epoch and the live components as pending changes, using canonical comparisons. Model-visible byte drift is NOT detected here — it is caught per turn by llm.PrefixMonitor and treated as a bug, not a pending change. Returns the newly detected changes. See docs/adr/0001.
func (*EpochManager) ExpectedCacheMiss ¶
func (m *EpochManager) ExpectedCacheMiss() bool
ExpectedCacheMiss returns true on the first turn after an epoch switch. Returns false on subsequent turns. Call once per turn — it clears the flag on read.
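The read-and-clear semantics explain why the method should be called once per turn. A minimal sketch, assuming a mutex-guarded boolean; the `missFlag` type and `markSwitched` method are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// missFlag is a hypothetical stand-in for the manager's one-shot flag.
type missFlag struct {
	mu      sync.Mutex
	pending bool
}

// markSwitched arms the flag, as an epoch switch would.
func (f *missFlag) markSwitched() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.pending = true
}

// expectedCacheMiss returns the flag and clears it in the same critical
// section, so only the first read after a switch sees true.
func (f *missFlag) expectedCacheMiss() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	v := f.pending
	f.pending = false
	return v
}

func main() {
	var f missFlag
	f.markSwitched()
	first := f.expectedCacheMiss()
	second := f.expectedCacheMiss()
	fmt.Println(first, second)
}
```

A second call within the same turn would therefore observe false even though the turn is the first of its epoch.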
func (*EpochManager) FreezeEpoch ¶
func (m *EpochManager) FreezeEpoch()
FreezeEpoch marks the current epoch as immutable after the first model request and captures its FrozenTools and FrozenSystem.
func (*EpochManager) InitEpoch ¶
func (m *EpochManager) InitEpoch(reason string, components EpochComponents) *PrefixEpoch
InitEpoch creates and sets the initial epoch. Called once at session start. Panics if called when an epoch already exists.
func (*EpochManager) IsFrozen ¶
func (m *EpochManager) IsFrozen() bool
IsFrozen reports whether the epoch is frozen.
func (*EpochManager) PendingChanges ¶
func (m *EpochManager) PendingChanges() []PendingChange
PendingChanges returns a copy of the pending changes list.
func (*EpochManager) RecordPendingChange ¶
func (m *EpochManager) RecordPendingChange(change PendingChange)
RecordPendingChange records a mutation that occurred after the epoch was frozen. The change is not applied to the current epoch.
func (*EpochManager) SetBus ¶
func (m *EpochManager) SetBus(bus *Bus)
SetBus attaches an event bus for epoch lifecycle events.
func (*EpochManager) SwitchEpoch ¶
func (m *EpochManager) SwitchEpoch(reason string, components EpochComponents) *PrefixEpoch
SwitchEpoch creates a new epoch, makes it current, and resets the frozen/pending state. The first turn of the new epoch will report ExpectedCacheMiss() = true.
type Event ¶
type Event interface {
// contains filtered or unexported methods
}
Event is the sealed interface implemented by every event the agent emits. Type-switch on the concrete type at the consumer.
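The sealing technique and the consumer-side type switch can be sketched as follows. The names here (`event`, `textDelta`, `done`, `render`) are hypothetical miniatures; the real package defines many more variants, and its unexported method is not shown in the documentation.

```go
package main

import "fmt"

// event is sealed: the unexported method prevents types in other packages
// from implementing it.
type event interface{ isEvent() }

type textDelta struct{ Text string }
type done struct{ Reason string }

func (textDelta) isEvent() {}
func (done) isEvent()      {}

// render is a consumer: it switches on the concrete type and translates
// each variant into its own representation.
func render(ev event) string {
	switch e := ev.(type) {
	case textDelta:
		return "text: " + e.Text
	case done:
		return "done: " + e.Reason
	default:
		return "unknown event"
	}
}

func main() {
	for _, ev := range []event{textDelta{Text: "hi"}, done{Reason: "end_turn"}} {
		fmt.Println(render(ev))
	}
}
```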
type EventBackgroundJobFinish ¶
EventBackgroundJobFinish signals that a background job has completed.
type EventBackgroundJobStart ¶
EventBackgroundJobStart signals that a background job has started.
type EventBudget ¶
EventBudget reports a session-budget gate decision (T1.3 — promoted from a stringly-typed EventInfo so "warned" vs "blocked" vs "unpriced" are distinguishable for analytics and programmatic gating). ProjectedCNY/SpentCNY are the gate's inputs (ProjectedCNY is 0 for the unpriced kind, where it cannot be computed); Model is the model being gated. Traced with type budget.warning / budget.blocked / budget.unpriced.
type EventCompaction ¶
EventCompaction reports that the agent collapsed messages [FromIdx, ToIdx) into a single Summary message, freeing RemovedCount slots from the live transcript. Wired in Phase 2.
type EventCompactionWarning ¶
EventCompactionWarning is emitted when context pressure crosses the warning threshold (default 75%). The UI can show a status indicator or prepare pinned facts for an upcoming compaction.
type EventDone ¶
type EventDone struct {
Reason StopReason
Err error
}
EventDone is the agent's "this Run is finished" signal. Emitted via defer from Run, so it travels the same channel as every other event and arrives in strict order AFTER the final EventStepFinish. This is load-bearing: routing the "done" signal through a separate goroutine + tea.Msg path used to race past trailing text deltas, leaving the UI's chrome stuck on "writing…" because a late delta would re-fire BeginWriting after the reset.
type EventDriftBlocked ¶
EventDriftBlocked signals that an unauthorized prefix drift was detected and blocked within a frozen epoch.
type EventEnvelope ¶
type EventEnvelope struct {
Version int // = EventProtocolVersion
Seq uint64 // monotonic, assigned by Bus
At time.Time // publish moment
Event Event // the concrete event
}
EventEnvelope wraps an Event with versioning, sequence number, and timestamp for multi-consumer fan-out on the Bus. It does NOT implement Event itself — it is a container, not an event.
type EventEpochCreated ¶
type EventEpochCreated struct {
EpochID string
StaticPrefixHash string
ToolsHash string
Reason string
}
EventEpochCreated signals that a new PrefixEpoch was created.
type EventEpochFrozen ¶
type EventEpochFrozen struct {
EpochID string
}
EventEpochFrozen signals that the current PrefixEpoch was frozen after the first model request.
type EventEpochSwitched ¶
type EventEpochSwitched struct {
OldEpochID string
NewEpochID string
StaticPrefixHash string
ToolsHash string
Reason string
}
EventEpochSwitched signals an explicit epoch switch.
type EventEscalated ¶
EventEscalated reports that the current turn was re-issued on a stronger model (the Two-Model escalation). Trigger is "marker" (the model emitted a <<<NEEDS_PRO>>> self-declaration) or "repair_errors" (the per-turn repair failure count crossed the threshold). FromModel/ToModel record the switch. Traced with type policy.escalated.
type EventHookFired ¶
type EventHookFired struct {
HookName string
Event string // PreToolUse / PostToolUse / ...
Decision string // allow / deny / continue / ask
Reason string
Dur time.Duration
}
EventHookFired reports that a registered hook ran. Decision is one of allow / deny / continue / ask; Reason is the hook's free-form explanation. Surfaced so the UI can show a `[hook] …` chat line. Wired in Phase 3.
type EventInfo ¶
type EventInfo struct{ Text string }
EventInfo is an out-of-band notice (retry attempt, validator skipped, tool-call rate warning). Surfaced as a chat line.
type EventPendingChange ¶
type EventPendingChange struct {
EpochID string
Kind PendingChangeKind
Description string
}
EventPendingChange signals that a component change was detected after the epoch was frozen. The change is recorded but not applied.
type EventPermissionAsk ¶
type EventPermissionAsk struct {
Check permissions.Check
Reply chan<- PermissionResponse
}
EventPermissionAsk requests user approval for a tool call. The consumer MUST send a PermissionResponse on Reply — the agent goroutine blocks on the receive. Reply is buffered (cap 1) so the consumer can send without serialization concerns.
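The request/reply handshake can be sketched with a capacity-1 reply channel. This is an illustration under simplifying assumptions: `permissionAsk`, `permissionResponse`, and `askPermission` are hypothetical miniatures, and the tool name stands in for the real permissions.Check.

```go
package main

import "fmt"

// permissionResponse and permissionAsk are hypothetical miniatures of the
// documented types.
type permissionResponse struct{ Allow bool }

type permissionAsk struct {
	Tool  string
	Reply chan<- permissionResponse
}

// askPermission plays the agent side: it publishes the ask and parks on the
// reply. The reply channel has capacity 1, so the consumer's send never
// blocks even if the agent has not reached its receive yet.
func askPermission(events chan<- permissionAsk, tool string) bool {
	reply := make(chan permissionResponse, 1)
	events <- permissionAsk{Tool: tool, Reply: reply}
	return (<-reply).Allow
}

func main() {
	events := make(chan permissionAsk, 1)
	// Consumer side: answer exactly one ask.
	go func() {
		ask := <-events
		ask.Reply <- permissionResponse{Allow: ask.Tool == "read_file"}
	}()
	fmt.Println(askPermission(events, "read_file"))
}
```

The same shape applies to EventQuestionAsk, which carries a reply channel of a different response type.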
type EventPermissionDenied ¶
EventPermissionDenied reports a tool call refused by the permission layer (T1.3 — promoted from EventInfo). ByRule distinguishes an explicit deny-rule match from a policy-tier denial; Reason is the human-readable cause. Traced with type permission.denied.
type EventPlanUpdate ¶ added in v0.4.0
type EventPlanUpdate struct{ Items []PlanItem }
EventPlanUpdate reports the agent's plan (todo list) was replaced. Published by TodoWrite via its PlanPublisher hook (wired in Agent construction).
type EventQuestionAsk ¶
type EventQuestionAsk struct {
Questions []tools.Question
Reply chan<- tools.QuestionResponse
}
EventQuestionAsk requests the user answer one or more questions. The consumer MUST send a QuestionResponse on Reply — the agent goroutine blocks on the receive. Reply is buffered (cap 1) so the consumer can send without serialization concerns.
type EventReasoningDelta ¶
type EventReasoningDelta struct{ Text string }
EventReasoningDelta appends to the active reasoning block.
type EventReasoningEnd ¶
type EventReasoningEnd struct{}
EventReasoningEnd closes the active reasoning block.
type EventReasoningStart ¶
type EventReasoningStart struct{}
EventReasoningStart opens a new reasoning block.
type EventRepair ¶
type EventRepair struct {
Kind string
Tool string
CallID string
Message string
BeforeHash string
AfterHash string
}
EventRepair reports a tool-call repair action (args completed, recovered, suppressed, or schema-complex). Published by the repair integration layer in runStep after model streaming finishes.
type EventRetry ¶ added in v0.4.0
type EventRetry struct{ Attempt, Max int }
EventRetry reports a transient mid-stream re-issue attempt (T1.4). Attempt is 1-based for the first re-issue; Max is the configured ceiling.
type EventSemanticCompaction ¶
type EventSemanticCompaction struct {
FromIdx, ToIdx int
UsedSemantic bool
SummaryCost float64
FallbackReason string
StaticPrefixHashBefore string
StaticPrefixHashAfter string
}
EventSemanticCompaction reports that semantic compaction ran. UsedSemantic is true when the LLM summary was used; false when deterministic fallback was used. SummaryCost is the cost of the LLM call (0 when fallback). FallbackReason is set when the LLM call failed and deterministic compaction was used instead.
StaticPrefixHashBefore/After are the measured static-prefix fingerprints (system + tools) of the frozen baseline and of the request compaction actually fed the model. They are emitted into the trace so the benchmark can verify compaction did not move the prefix — instead of the agent asserting stability with a hardcoded boolean. When the freeze override is intact they are equal; a regression that summarized against the live (non-frozen) prompt makes them diverge and fails the gate.
type EventStepFinish ¶
type EventStepFinish struct {
Reason StopReason
Usage llm.Usage
Model string
}
EventStepFinish ends one ReAct step. The consumer updates its status counters / cost HUD here. Model is the model that produced the step (an escalated turn reports the stronger model); empty means the loop model.
type EventSubagentFinish ¶
EventSubagentFinish signals that a sub-agent run completed (successfully or not).
type EventSubagentStart ¶
EventSubagentStart signals that a sub-agent spawn has begun.
type EventTextDelta ¶
type EventTextDelta struct{ Text string }
EventTextDelta appends to the active assistant-text block.
type EventToolCallDelta ¶ added in v0.4.0
EventToolCallDelta carries incremental tool stdout for a running call. Only published by tool runners that genuinely stream (e.g. bash stdout); tools that return whole results never emit it.
type EventToolCallResult ¶
EventToolCallResult carries the result of an executed tool call.
type EventToolCallStart ¶
EventToolCallStart announces a tool call (model decided to run a tool; permission gates haven't fired yet).
type Job ¶
type Job struct {
ID string
Kind JobKind
Description string
StartedAt time.Time
FinishedAt time.Time
State JobState
Summary string
// contains filtered or unexported fields
}
Job represents a running or completed background job.
func (*Job) AppendOutput ¶
AppendOutput appends data to the job's ring buffer, dropping old data if the buffer would exceed maxBytes.
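The drop-oldest behaviour can be sketched with a byte slice that keeps only the newest bytes. This is a simplified tail buffer rather than a true ring buffer, and the `tailBuffer` type, its fields, and its `write` method are hypothetical; the real Job keeps its buffer in unexported fields.

```go
package main

import "fmt"

// tailBuffer is a hypothetical bounded buffer that keeps the newest bytes.
type tailBuffer struct {
	buf      []byte
	maxBytes int
}

// write adds data and, if the buffer would exceed maxBytes, drops the oldest
// bytes so that only the most recent maxBytes remain.
func (b *tailBuffer) write(data []byte) {
	b.buf = append(b.buf, data...)
	if over := len(b.buf) - b.maxBytes; over > 0 {
		// Copy the tail so the dropped prefix can be garbage collected.
		b.buf = append([]byte(nil), b.buf[over:]...)
	}
}

func main() {
	b := &tailBuffer{maxBytes: 8}
	b.write([]byte("hello "))
	b.write([]byte("world!"))
	fmt.Printf("%q\n", b.buf) // the last 8 bytes of "hello world!"
}
```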
type JobRegistry ¶
type JobRegistry struct {
// contains filtered or unexported fields
}
JobRegistry manages background jobs for an agent.
func NewJobRegistry ¶
func NewJobRegistry() *JobRegistry
NewJobRegistry creates a new, empty job registry.
func (*JobRegistry) Cancel ¶
func (r *JobRegistry) Cancel(id string) bool
Cancel cancels the job with the given ID. Returns true if the job was found and canceled, false otherwise. Calling Cancel on an already-canceled or finished job returns false.
func (*JobRegistry) Close ¶
func (r *JobRegistry) Close()
Close cancels all running jobs and waits up to 2 seconds for them to finish. Any jobs still running after the grace period are marked as JobCanceled.
func (*JobRegistry) Finish ¶
func (r *JobRegistry) Finish(id string, state JobState, summary string)
Finish marks a job as completed with the given state and summary. It is safe to call multiple times; only the first call takes effect.
func (*JobRegistry) Get ¶
func (r *JobRegistry) Get(id string) (*Job, bool)
Get returns the job with the given ID and true, or nil and false if no such job exists.
func (*JobRegistry) HasActive ¶
func (r *JobRegistry) HasActive() bool
HasActive reports whether any job is currently in the JobRunning state. /reload-skills uses it (via Agent.HasActiveBackgroundWork) to refuse a skill reload while an async subagent or background_bash job is still live: those goroutines outlive the parent loop and can read the shared skill store that the reload mutates in place, so a.running alone does not cover them.
func (*JobRegistry) JobStatus ¶
func (r *JobRegistry) JobStatus(id string, tailLines int) (Status, error)
JobStatus returns status information for the job with the given ID, in a form suitable for returning as a tools.Status.
func (*JobRegistry) List ¶
func (r *JobRegistry) List() []*Job
List returns all jobs in the registry.
func (*JobRegistry) Start ¶
func (r *JobRegistry) Start(parent context.Context, kind JobKind, description string) (*Job, context.Context)
Start creates a new job with State=JobRunning and returns it along with a derived context. The caller holds the context and runs the actual work; when done, call Finish to lock in the final state.
type LoopSpawner ¶
type LoopSpawner struct {
Client *llm.Client
Parent *Agent
Defs map[string]agents.AgentDef
MaxDepth int // 0 → default 2
// WT is the worktree manager for git-worktree isolation.
// nil = worktree path disabled; def.Worktree==true degrades to normal spawn.
WT *worktree.Manager
// Locks provides branch-level mutual exclusion for worktree operations.
Locks worktree.BranchLocker
// contains filtered or unexported fields
}
LoopSpawner implements tools.Spawner by running a child Agent loop in the same process. It derives a child Registry and Policy from the parent Agent, respecting agent-def tool whitelists and depth limits.
func (*LoopSpawner) Spawn ¶
func (s *LoopSpawner) Spawn(ctx context.Context, req tools.SpawnRequest) (tools.SpawnResult, error)
type MessageTruncator ¶
type MessageTruncator interface {
TruncateMessages(ctx context.Context, keepCount int) (int, error)
// PersistedMessageCount returns the count of persisted body messages.
// ReconcileUndo compares it to len(a.Messages) to confirm the in-memory
// transcript is index-aligned with disk before truncating disk — alignment
// breaks after a resume, a branch, or a dangling-tool-call repair insert,
// where an in-memory boundary would delete the wrong persisted rows.
PersistedMessageCount(ctx context.Context) (int, error)
}
MessageTruncator is an optional Persister capability: drop persisted body messages with index >= keepCount so disk matches the in-memory transcript after an /undo (T3.5). internal/session.Persister implements it; it is checked via type assertion so non-persisting agents and test fakes need not implement it (mirrors ReceiptAppender).
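The optional-capability check via type assertion can be sketched as below. The interfaces are simplified: the context parameter is omitted, and the meaning of the returned int (rows removed) is an assumption of this example. All type and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// persister is a cut-down stand-in for the required interface.
type persister interface {
	SessionID() string
}

// messageTruncator is the optional capability, simplified (no context).
type messageTruncator interface {
	TruncateMessages(keepCount int) (int, error)
}

// memPersister does not persist, so it does not implement the capability.
type memPersister struct{}

func (memPersister) SessionID() string { return "mem" }

// diskPersister implements the capability over a fake row count.
type diskPersister struct{ rows int }

func (d *diskPersister) SessionID() string { return "disk" }

func (d *diskPersister) TruncateMessages(keepCount int) (int, error) {
	if keepCount < 0 || keepCount > d.rows {
		return 0, errors.New("keepCount out of range")
	}
	removed := d.rows - keepCount
	d.rows = keepCount
	return removed, nil
}

// truncateIfSupported uses the capability only when p provides it.
func truncateIfSupported(p persister, keepCount int) (removed int, supported bool, err error) {
	t, ok := p.(messageTruncator)
	if !ok {
		return 0, false, nil
	}
	removed, err = t.TruncateMessages(keepCount)
	return removed, true, err
}

func main() {
	fmt.Println(truncateIfSupported(memPersister{}, 2))
	fmt.Println(truncateIfSupported(&diskPersister{rows: 5}, 2))
}
```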
type PendingChange ¶
type PendingChange struct {
Kind PendingChangeKind
Description string
DetectedAt time.Time
}
PendingChange is a detected mutation that is blocked from model-visibility until an explicit epoch switch.
func CapabilityDiff ¶
func CapabilityDiff(oldCS, newCS CapabilitySet) []PendingChange
CapabilityDiff returns the pending changes between two capability sets using canonical comparisons (skills.Store.Diff and mcp.CompareToolLists), so cache-irrelevant noise — a reordered MCP tool list, or reordered JSON-Schema keys within a tool — never registers as a change.
type PendingChangeKind ¶
type PendingChangeKind string
PendingChangeKind identifies the type of change detected after an epoch was frozen.
const (
PendingSystemChanged PendingChangeKind = "system_changed"
PendingToolAdded PendingChangeKind = "tool_added"
PendingToolRemoved PendingChangeKind = "tool_removed"
PendingToolSchemaChanged PendingChangeKind = "tool_schema_changed"
PendingSkillAdded PendingChangeKind = "skill_added"
PendingSkillRemoved PendingChangeKind = "skill_removed"
PendingSkillBodyChanged PendingChangeKind = "skill_body_changed"
PendingMCPToolAdded PendingChangeKind = "mcp_tool_added"
PendingMCPToolRemoved PendingChangeKind = "mcp_tool_removed"
PendingMCPToolSchemaChanged PendingChangeKind = "mcp_tool_schema_changed"
PendingAgentProfileChanged PendingChangeKind = "agent_profile_changed"
PendingFewShotsChanged PendingChangeKind = "few_shots_changed"
)
type PermissionResponse ¶
type PermissionResponse struct {
Allow bool
PersistPattern bool // when true (bash + "always"), persist to allowlist
}
PermissionResponse is the UI's answer to an EventPermissionAsk, sent on its Reply channel.
type Persister ¶
type Persister interface {
// SessionID is the session this agent is associated with.
SessionID() string
// AppendUserMessage records a user turn as a typed block slice
// (typically one TextBlock; multimodal extensions later).
AppendUserMessage(ctx context.Context, blocks []llm.ContentBlock) (int, error)
// AppendAssistant records an assistant turn as a typed block slice
// (Thinking, Text, ToolUse — order matches model emission).
AppendAssistant(ctx context.Context, blocks []llm.ContentBlock, model string, usage llm.Usage) (int, error)
// AppendToolResult records the result of one tool_use, identified
// by toolUseID. isError true marks an infrastructure failure so
// renderers can color it.
AppendToolResult(ctx context.Context, toolUseID string, content string, isError bool) (int, error)
// TakeSnapshot snapshots the given paths before a mutating tool runs.
// stepIdx is the message index of the assistant turn that contained
// the tool call; the snapshot manager uses it to namespace files on
// disk so /undo can revert a specific step.
TakeSnapshot(stepIdx int, paths []string) (int, error)
// SetActiveModel persists the result of a /models switch.
SetActiveModel(ctx context.Context, model string) error
// ReplaceWithCompaction atomically deletes messages in [fromIdx,
// toIdx) and inserts a synthetic summary message at fromIdx,
// renumbering subsequent messages to keep idx contiguous. Returns
// the idx of the inserted summary. The full transactional
// implementation lands in Phase 2 (T-208); for now this method
// exists so the rest of the system can compile against the final
// interface shape.
ReplaceWithCompaction(ctx context.Context, fromIdx, toIdx int, summary string) (int, error)
}
Persister abstracts the session/snapshot bookkeeping the agent needs. internal/session.Persister and internal/snapshots.Manager satisfy this. Keeping the interface here lets the agent stay decoupled.
type PlanItem ¶ added in v0.4.0
PlanItem is one entry of the agent's structured plan, decoupled from tools.TodoItem so consumers (gateway/SPA) need not import tools. Status is the Contract value set: pending | in_progress | done.
type PrefixEpoch ¶
type PrefixEpoch struct {
EpochID string
AgentProfileID string
Model string
ReasoningEffort string
StaticSystem string
FewShots []llm.Message
ToolSpecs []llm.Tool
// Capability is the latent identity (profile/skills/MCP) frozen with this
// epoch. It drives pending-change detection but is NOT in StaticPrefixHash.
Capability CapabilitySet
CreatedAt time.Time
CreatedReason string
ComponentHashes map[string]string
// StaticPrefixHash is the Prefix Fingerprint: the canonical hash of the
// model-visible bytes (system + tools) — the DeepSeek cache key. Latent
// capability state is intentionally excluded (see docs/adr/0001).
StaticPrefixHash string
// FrozenTools and FrozenSystem capture the tool list and system
// prompt at the moment FreezeEpoch is called. When the epoch is
// frozen, runStep and maybeCompact MUST use these instead of the
// live values to guarantee cache-stable prefixes.
FrozenTools []llm.Tool
FrozenSystem string
}
PrefixEpoch is a frozen model-visible prefix snapshot. Once frozen (after the first model request), it cannot change. Changes to tools, skills, MCP, the system prompt, etc. become pending changes that are visible in receipts but not model-visible until an explicit epoch switch.
type ReceiptAppender ¶
type ReceiptAppender interface {
AppendReceipt(ctx context.Context, kind session.ReceiptKind, payload json.RawMessage) (int64, error)
}
ReceiptAppender is an optional interface that Persister implementations can satisfy to support transcript receipt persistence. The agent checks for this interface via type assertion and uses it when available.
type ReloadResult ¶
type ReloadResult struct {
// Changes is the skill-level diff (added / removed / body_changed) between
// the previously loaded skills and the freshly re-scanned set.
Changes []skills.SkillChange
// FingerprintMoved reports whether the rebuilt model-visible system prompt
// differs from the pre-reload one. True means the bytes DeepSeek caches
// actually changed, so a new epoch was minted and the next turn will miss
// cache exactly once. False means nothing the model can see changed and no
// epoch was switched — so the reload costs no cache miss.
FingerprintMoved bool
// OldEpochID / NewEpochID are populated only when an epoch switch occurred
// (FingerprintMoved && an epoch already existed).
OldEpochID string
NewEpochID string
}
ReloadResult summarizes an Agent.ReloadSkills (the /reload-skills command).
type SemanticCompactionConfig ¶
type SemanticCompactionConfig struct {
// WarnThreshold is the context ratio (0-1) at which to emit
// a warning and prepare pinned facts. Default: 0.75
WarnThreshold float64
// CompactThreshold is the context ratio at which to attempt
// semantic compaction. Default: 0.80
CompactThreshold float64
// ProtectionThreshold is the context ratio at which to enter
// protection mode (preserve task continuity over full history).
// Default: 0.90
ProtectionThreshold float64
// SummaryModel is the model to use for semantic summaries.
// Default: "deepseek-v4-flash"
SummaryModel string
// SummaryTimeout is the timeout for the summary request.
// Default: 15 seconds
SummaryTimeout time.Duration
// MaxSummaryTokens caps the semantic summary length.
// Default: 2000
MaxSummaryTokens int
}
SemanticCompactionConfig controls semantic compaction behavior.
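A minimal sketch of applying the documented defaults, assuming zero-valued fields mean "use the default". The struct below is a local copy for illustration, and withDefaults is a hypothetical helper, not part of the package API.

```go
package main

import (
	"fmt"
	"time"
)

// SemanticCompactionConfig is a local copy of the documented struct.
type SemanticCompactionConfig struct {
	WarnThreshold       float64
	CompactThreshold    float64
	ProtectionThreshold float64
	SummaryModel        string
	SummaryTimeout      time.Duration
	MaxSummaryTokens    int
}

// withDefaults (hypothetical) fills zero-valued fields with the defaults
// listed in the field comments and leaves explicit values untouched.
func withDefaults(c SemanticCompactionConfig) SemanticCompactionConfig {
	if c.WarnThreshold == 0 {
		c.WarnThreshold = 0.75
	}
	if c.CompactThreshold == 0 {
		c.CompactThreshold = 0.80
	}
	if c.ProtectionThreshold == 0 {
		c.ProtectionThreshold = 0.90
	}
	if c.SummaryModel == "" {
		c.SummaryModel = "deepseek-v4-flash"
	}
	if c.SummaryTimeout == 0 {
		c.SummaryTimeout = 15 * time.Second
	}
	if c.MaxSummaryTokens == 0 {
		c.MaxSummaryTokens = 2000
	}
	return c
}

func main() {
	// Override one threshold; everything else takes the documented default.
	cfg := withDefaults(SemanticCompactionConfig{CompactThreshold: 0.85})
	fmt.Printf("%+v\n", cfg)
}
```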
type SemanticCompactionResult ¶
type SemanticCompactionResult struct {
Summary string
FromIdx, ToIdx int
RemovedCount int
SummaryMessage llm.Message
KeptMessages []llm.Message
UsedSemantic bool // true if LLM summary was used
SummaryCost float64 // cost of the LLM call, if any
FallbackReason string // why deterministic fallback was used
}
SemanticCompactionResult is what SemanticCompact produces.
func CompactWithSemantic ¶
func CompactWithSemantic(
	ctx context.Context,
	messages []llm.Message,
	client *llm.Client,
	systemPrompt string,
	tools []llm.Tool,
	compCfg CompactionConfig,
	semanticCfg SemanticCompactionConfig,
	maxContextTokens int,
) SemanticCompactionResult
CompactWithSemantic checks context pressure and decides between no compaction, a warning, semantic compaction (LLM), or deterministic fallback. It returns a SemanticCompactionResult with Summary == "" when no compaction was performed.
The action decision:
- "none": below all thresholds → no compaction
- "warn": above warn threshold → warning only
- "compact": above compact threshold → semantic compaction
- "protect": above protection threshold → semantic compaction (same as compact for now)
When semantic compaction fails, it falls back to the deterministic CompactSession. The caller should check UsedSemantic and FallbackReason to report the outcome.
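The action decision listed above can be sketched as a threshold ladder. This is an illustration under assumptions: the helper decideAction and the thresholds type are hypothetical, the ratio is taken to be used tokens divided by maxContextTokens, and thresholds are treated as inclusive.

```go
package main

import "fmt"

// thresholds holds the three context ratios from SemanticCompactionConfig.
type thresholds struct{ Warn, Compact, Protect float64 }

// decideAction (hypothetical) maps context pressure to one of the four
// documented actions, checking the highest threshold first.
func decideAction(usedTokens, maxContextTokens int, t thresholds) string {
	if maxContextTokens <= 0 {
		return "none" // no meaningful ratio can be computed
	}
	ratio := float64(usedTokens) / float64(maxContextTokens)
	switch {
	case ratio >= t.Protect:
		return "protect"
	case ratio >= t.Compact:
		return "compact"
	case ratio >= t.Warn:
		return "warn"
	default:
		return "none"
	}
}

func main() {
	def := thresholds{Warn: 0.75, Compact: 0.80, Protect: 0.90}
	for _, used := range []int{50000, 76000, 85000, 95000} {
		fmt.Println(used, decideAction(used, 100000, def))
	}
}
```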
func SemanticCompact ¶
func SemanticCompact(
	ctx context.Context,
	messages []llm.Message,
	client *llm.Client,
	systemPrompt string,
	tools []llm.Tool,
	cfg SemanticCompactionConfig,
) SemanticCompactionResult
SemanticCompact attempts LLM-powered compaction, falling back to deterministic compaction on failure. The LLM call:
- Uses deepseek-v4-flash
- Disables thinking
- Has a 15 second timeout
- Reuses the same static system prefix
- Preserves pinned skills and constraints
- Preserves current objective
- Preserves negative constraints
- Preserves changed file paths
- Preserves recent tool evidence
- Records its own usage and cost
type Spawner ¶
Spawner is the v0.2 interface for subagent dispatch. v0.1 leaves it unimplemented; reserving the type makes the v0.2 addition additive rather than a refactor.
type Status ¶
type Status struct {
ID string
Kind string
State string
StartedAt time.Time
FinishedAt time.Time
Summary string
Tail string
DroppedBytes int64
TotalLines int
Truncated bool
}
Status is the public status struct returned to tools.
type StepRecord ¶
type StepRecord struct {
FinishReason string
Usage llm.Usage
ToolCalls []llm.ToolCall
EpochID string
StaticPrefixHash string
ExpectedCacheMiss bool
// Model is the model that actually produced this step. It usually equals
// the loop model, but an escalated turn (T2.3) records the stronger model
// so cost/trace attribution follows the turn, not the static loop model.
Model string
// MessageCount is len(a.Messages) captured BEFORE this step's model turn —
// the transcript boundary this step started from. /undo (T3.5) truncates
// a.Messages back to the boundary of the first undone step so the model's
// view matches the reverted files. Boundaries recorded before a compaction
// are stale (compaction renumbers messages), so undo refuses to cross one.
MessageCount int
// Snapshotted is true when this step took a file snapshot (i.e. it ran a
// mutating tool). /undo counts SNAPSHOTS, not steps — snapshots are sparse
// (read-only steps take none) — so ReconcileUndo walks snapshotted steps to
// find the same boundary the snapshot manager reverts files to (T3.5).
Snapshotted bool
}
StepRecord captures one step's outcome so stop conditions can reason across history.
type StopCondition ¶
type StopCondition func(steps []StepRecord) (stop bool, reason StopReason)
StopCondition examines recent history and returns (true, reason) when the loop should terminate. The agent evaluates the conditions after each step and stops on the first one that fires.
func LoopDetection ¶
func LoopDetection(window, maxRepeats int) StopCondition
LoopDetection breaks the loop when the same tool call (name + arg hash) appears `maxRepeats` times within the last `window` steps. Crush calls this in internal/agent/loop_detection.go; we use the same shape. Default v0.1: window=5, maxRepeats=3.
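A minimal sketch of the repeat check, assuming a tool call is identified by its name plus a SHA-256 hash of its raw argument string. ToolCall and StepRecord here are trimmed local stand-ins (the real llm.ToolCall fields are not shown in this documentation), and callKey and loopDetected are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ToolCall is a hypothetical stand-in for llm.ToolCall.
type ToolCall struct {
	Name      string
	Arguments string // raw JSON arguments, assumed to be a string
}

// StepRecord keeps only the field this sketch needs.
type StepRecord struct{ ToolCalls []ToolCall }

// callKey identifies a call by tool name plus a short hash of its arguments.
func callKey(c ToolCall) string {
	sum := sha256.Sum256([]byte(c.Arguments))
	return c.Name + ":" + hex.EncodeToString(sum[:8])
}

// loopDetected reports whether any call key appears at least maxRepeats
// times within the last `window` steps.
func loopDetected(steps []StepRecord, window, maxRepeats int) bool {
	if window <= 0 || maxRepeats <= 0 {
		return false
	}
	if len(steps) > window {
		steps = steps[len(steps)-window:]
	}
	counts := make(map[string]int)
	for _, s := range steps {
		for _, c := range s.ToolCalls {
			k := callKey(c)
			counts[k]++
			if counts[k] >= maxRepeats {
				return true
			}
		}
	}
	return false
}

func main() {
	same := StepRecord{ToolCalls: []ToolCall{{Name: "read_file", Arguments: `{"path":"a.go"}`}}}
	other := StepRecord{ToolCalls: []ToolCall{{Name: "read_file", Arguments: `{"path":"b.go"}`}}}
	fmt.Println(loopDetected([]StepRecord{same, other, same, same}, 5, 3))
	fmt.Println(loopDetected([]StepRecord{same, other, same}, 5, 3))
}
```

Hashing the arguments keeps the key small while still distinguishing calls to the same tool with different inputs.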
func MaxSteps ¶
func MaxSteps(n int) StopCondition
MaxSteps caps total agent steps in a single Run. Default 50 in v0.1.
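The StopCondition contract (evaluate after each step, stop on the first condition that fires) can be sketched as below. The types are trimmed local copies; maxSteps assumes the cap fires once len(steps) reaches n, and modelDone and firstStop are hypothetical helpers added only for the illustration.

```go
package main

import "fmt"

type StopReason int

// A subset of the documented constants, in the documented order.
const (
	StopUnknown StopReason = iota
	StopModelDone
	StopMaxSteps
)

// StepRecord keeps only the field this sketch needs.
type StepRecord struct{ FinishReason string }

type StopCondition func(steps []StepRecord) (stop bool, reason StopReason)

// maxSteps fires once the number of recorded steps reaches n (assumed inclusive).
func maxSteps(n int) StopCondition {
	return func(steps []StepRecord) (bool, StopReason) {
		if len(steps) >= n {
			return true, StopMaxSteps
		}
		return false, StopUnknown
	}
}

// modelDone (hypothetical) fires when the latest step finished with "stop".
func modelDone() StopCondition {
	return func(steps []StepRecord) (bool, StopReason) {
		if n := len(steps); n > 0 && steps[n-1].FinishReason == "stop" {
			return true, StopModelDone
		}
		return false, StopUnknown
	}
}

// firstStop evaluates conditions in the order given and returns the first
// one that fires; later conditions are not consulted.
func firstStop(steps []StepRecord, conds ...StopCondition) (bool, StopReason) {
	for _, c := range conds {
		if stop, reason := c(steps); stop {
			return true, reason
		}
	}
	return false, StopUnknown
}

func main() {
	steps := []StepRecord{{FinishReason: "tool_calls"}, {FinishReason: "tool_calls"}}
	stop, reason := firstStop(steps, modelDone(), maxSteps(2))
	fmt.Println(stop, reason == StopMaxSteps)
}
```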
type StopReason ¶
type StopReason int
StopReason describes why the loop terminated.
const (
	StopUnknown       StopReason = iota
	StopModelDone     // finish_reason!=tool_calls and no tool calls
	StopMaxSteps      // step cap exceeded
	StopLoopDetected  // same tool call repeated too many times
	StopContextCancel // ctx.Err()
	StopUserRequested // explicit cancellation from TUI
	StopStepTimeout   // per-step deadline exceeded (non-success)
	StopVerifiedDone  // model done AND verify command passed
)
func (StopReason) IsSuccess ¶
func (r StopReason) IsSuccess() bool
IsSuccess reports whether a stop reason represents a clean, complete run (the model finished on its own). Every other reason — cancellation, a step timeout, a loop or step-cap halt, or an unknown/error exit — is a non-success termination and must not be rendered or recorded as "done".
func (StopReason) String ¶
func (r StopReason) String() string
type Subscription ¶
type Subscription struct {
C <-chan EventEnvelope
// contains filtered or unexported fields
}
Subscription is one consumer's view of the Bus. C delivers events in publish order. Callers that stop reading should Unsubscribe to avoid leaking the goroutine that would otherwise block on the full channel.
func (*Subscription) Dropped ¶
func (s *Subscription) Dropped() uint64
Dropped returns the number of events dropped for this subscription because its buffer was full. Only non-reply events are dropped; reply-carrying events block until the consumer reads them.
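The drop-versus-block behavior can be sketched with a non-blocking send. This is not the Bus implementation: the subscription type, its deliver method, and the NeedsReply marker are hypothetical, chosen only to show how a per-subscription drop counter could be maintained.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// EventEnvelope is a stand-in; NeedsReply is a hypothetical marker for
// reply-carrying events such as a permission ask.
type EventEnvelope struct {
	Name       string
	NeedsReply bool
}

type subscription struct {
	ch      chan EventEnvelope
	dropped atomic.Uint64
}

func newSubscription(buf int) *subscription {
	return &subscription{ch: make(chan EventEnvelope, buf)}
}

// deliver blocks for reply-carrying events and drops everything else
// when the buffer is full, counting each drop.
func (s *subscription) deliver(ev EventEnvelope) {
	if ev.NeedsReply {
		s.ch <- ev // must not be lost: wait for the consumer
		return
	}
	select {
	case s.ch <- ev:
	default:
		s.dropped.Add(1)
	}
}

// Dropped returns how many events were discarded for this subscription.
func (s *subscription) Dropped() uint64 { return s.dropped.Load() }

func main() {
	s := newSubscription(1)
	for i := 0; i < 3; i++ {
		s.deliver(EventEnvelope{Name: "token"})
	}
	fmt.Println("buffered:", len(s.ch), "dropped:", s.Dropped())
}
```

A consumer can poll Dropped to detect that its view of the stream is incomplete, for example to show a "some events were skipped" notice.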
type TraceSink ¶
type TraceSink struct {
// contains filtered or unexported fields
}
TraceSink converts an agent's event stream into JSONL trace records. It subscribes to a bus and writes one record per epoch lifecycle event, per turn (prefix snapshot + usage), per compaction, and per blocked drift. The trace is the source of truth for the benchmark's cache-reliability gate.
Construct the root via Agent.AttachTraceSink (wires the subscription, a drain goroutine, and a handle the caller waits on after Run). Subagent child sinks are derived via newChildTraceSink and share the root's writer.
func NewTraceSink ¶
NewTraceSink builds a root sink writing JSONL to w. model is used to price usage records (cost_cny). Every record is stamped with a per-run run_id and agent_role="root" so the benchmark can distinguish the root epoch from any subagent epochs when judging parent/subagent cache pollution.
type TraceSinkHandle ¶
type TraceSinkHandle struct {
// contains filtered or unexported fields
}
TraceSinkHandle lets the caller wait for the sink to finish processing after Run returns (EventDone is the terminator) and unsubscribe.
func (*TraceSinkHandle) Close ¶
func (h *TraceSinkHandle) Close()
Close unsubscribes the sink from the bus. Safe to call after Wait.
func (*TraceSinkHandle) Wait ¶
func (h *TraceSinkHandle) Wait()
Wait blocks until the agent's terminating EventDone has been processed.
func (*TraceSinkHandle) WaitTimeout ¶
func (h *TraceSinkHandle) WaitTimeout(d time.Duration) bool
WaitTimeout blocks until EventDone is processed or d elapses, whichever is first. It returns true when EventDone was processed (the agent finished cleanly) and false on timeout — a timed-out child trace is partial, which the caller must surface rather than silently close.
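The Wait / WaitTimeout contract can be sketched with a done channel that is closed once EventDone has been processed. The handle type below is a hypothetical stand-in for TraceSinkHandle, whose fields are unexported.

```go
package main

import (
	"fmt"
	"time"
)

// handle is a hypothetical stand-in; done is closed after EventDone is handled.
type handle struct{ done chan struct{} }

// Wait blocks until the terminating event has been processed.
func (h *handle) Wait() { <-h.done }

// WaitTimeout returns true if done closes before d elapses, false otherwise.
func (h *handle) WaitTimeout(d time.Duration) bool {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-h.done:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	finished := &handle{done: make(chan struct{})}
	close(finished.done) // simulate EventDone having been processed
	fmt.Println("finished cleanly:", finished.WaitTimeout(time.Second))

	stuck := &handle{done: make(chan struct{})}
	if !stuck.WaitTimeout(10 * time.Millisecond) {
		// Surface the partial trace instead of silently closing.
		fmt.Println("trace is partial: timed out waiting for EventDone")
	}
}
```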
type VerifyHook ¶ added in v0.4.0
type VerifyHook struct {
// Cmd is the shell command to run (e.g. "go build ./..." or "go test ./...").
// Executed via sh -c so shell expansion is supported.
Cmd string
// Shell is the interpreter. Defaults to "sh" if empty.
Shell string
}
VerifyHook runs a shell command after mutating steps and reports whether the working tree is in a good state. It is configured via Agent.Verify and called from two points in the agent loop:
- After every step that includes at least one mutating tool call (via maybeRunVerifyHook): on failure the feedback is injected as a synthetic user message so the model can fix errors on its next turn.
- At model-stop time when the model emits no tool calls: on success the stop reason is promoted from StopModelDone to StopVerifiedDone; on failure the feedback is injected into a.Messages before returning.
When Cmd is empty, the hook is disabled and always reports pass.
func (*VerifyHook) Run ¶ added in v0.4.0
func (h *VerifyHook) Run(ctx context.Context) (feedback string, passed bool)
Run executes the verify command and returns (feedback, passed).
- passed=true, feedback="" when the command exits 0 or Cmd is empty.
- passed=false, feedback=<synthesized message> when the command exits non-zero.
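A minimal sketch of the Run contract using os/exec. The verifyHook type mirrors the documented fields, but the feedback wording is an assumption (the documentation only says the message is synthesized), and check is a hypothetical convenience wrapper for the demo.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
)

// verifyHook is a local copy of the documented VerifyHook fields.
type verifyHook struct {
	Cmd   string
	Shell string
}

// run executes Cmd via "<shell> -c" and reports (feedback, passed).
// An empty Cmd disables the hook and always passes.
func (h *verifyHook) run(ctx context.Context) (feedback string, passed bool) {
	if h.Cmd == "" {
		return "", true
	}
	shell := h.Shell
	if shell == "" {
		shell = "sh"
	}
	out, err := exec.CommandContext(ctx, shell, "-c", h.Cmd).CombinedOutput()
	if err == nil {
		return "", true
	}
	// Assumed feedback format: command, error, then combined output.
	return fmt.Sprintf("verify command %q failed: %v\n%s", h.Cmd, err, out), false
}

// check (hypothetical) runs a command with a background context.
func check(cmd string) (string, bool) {
	return (&verifyHook{Cmd: cmd}).run(context.Background())
}

func main() {
	for _, cmd := range []string{"", "echo broken >&2; exit 3"} {
		feedback, passed := check(cmd)
		fmt.Printf("cmd=%q passed=%v feedback=%q\n", cmd, passed, feedback)
	}
}
```

Capturing stdout and stderr together gives the model the compiler or test output it needs to repair the failure on its next turn.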