Documentation ¶
Overview ¶
Package toolselectlearn is the tool-selection active-learning loop (D-06/D-07, spike 057), the second consumer of the shared internal/activelearn mechanism after the reasoning learner. It detects mis-routed turns cheaply: a turn is flagged when a shell/fs fallback tool was used, or when the tool used differs from the ranker's top-1. Confident cases are labeled by the free ranker, and the low-margin tail is escalated to the existing DeepSeek router (the two-tier oracle). Confirmed (query-embedding -> tool) exemplars are then persisted to :ToolSelectionExample so that the tool_search ranker's per-tool centroids self-improve. All of this is asynchronous and off the hot path: the runner calls Observe(request, usedTool) after tool execution, and the call never blocks the turn.
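The detection rule described above can be sketched as a pure function. This is a minimal illustration, not the package's code: the fallbackTools set, the tool names in it, and the misrouted helper are all hypothetical, since the real fallback list is not shown in this documentation.

```go
package main

import "fmt"

// fallbackTools is a hypothetical set of shell/fs fallback tool names.
// The real package's list is not part of this documentation.
var fallbackTools = map[string]bool{"shell": true, "fs_read": true}

// misrouted reports whether a turn looks mis-routed: either a fallback tool
// was used, or the ranker had an opinion (rankOK) and its top-1 differs from
// the tool that was actually used. When rankOK is false, only the
// embedding-free fallback heuristic applies.
func misrouted(usedTool, top1 string, rankOK bool) bool {
	if fallbackTools[usedTool] {
		return true
	}
	return rankOK && top1 != usedTool
}

func main() {
	fmt.Println(misrouted("shell", "grep_search", true))       // fallback tool used
	fmt.Println(misrouted("grep_search", "grep_search", true)) // ranker agrees
	fmt.Println(misrouted("grep_search", "web_fetch", false))  // ranker unwired
}
```

Keeping the fallback check first means the cheap heuristic still fires when the ranker is unwired or its bank is empty.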
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	Embedder    semindex.Embedder // embeds the flagged request for the saved exemplar
	Ranker      Ranker            // free ranker: detection top-1 + the free-tier confident label
	Teacher     Teacher           // DeepSeek escalation oracle (kill-switched); nil => free-only
	Saver       Saver             // persists confirmed (query -> tool) to :ToolSelectionExample
	Refresh     func()            // re-folds the per-tool centroids after a save (e.g. ranker.RefreshLearned)
	MarginFloor float64           // activelearn queue margin floor; 0 => default
	Queue       int               // activelearn queue depth; 0 => default
}
Config wires a Learner. Detector inputs (Ranker + Embedder) and the two-tier labeling I/O (Ranker for the free tier, Teacher for the escalation tier, Saver for persistence) are supplied by the composition root. A nil Embedder or Saver yields a nil Learner (the runner then simply observes nothing).
type ExampleLoader ¶
type ExampleLoader interface {
	LoadExamples(ctx context.Context) ([]toolselectstore.LabeledVec, error)
}
ExampleLoader loads the full confirmed-example set for the Refresh re-fold. *toolselectstore.Store satisfies it (LoadExamples). The runner's Refresh hook type-asserts the Saver to this so the loop re-folds the per-tool centroids after a save without toolselectlearn importing toolselectstore.
type Learner ¶
type Learner struct {
	// contains filtered or unexported fields
}
Learner is the tool-selection self-improvement worker. It mirrors reasoninglearn.Learner: a thin wrapper over the shared internal/activelearn core, supplying only the tool-selection-specific I/O (the mis-route detector, the granite embedder, the two-tier oracle, the Neo4j saver).
Observe(request, usedTool) is the runner's post-turn capture entry. It is a NON-BLOCKING handoff (CR-01): it only enqueues the raw signal onto a bounded, drop-on-full channel and returns immediately, so the synchronous, lock-held turn path never waits on embed/network I/O. A dedicated intake goroutine then runs BOTH the mis-route detection (which embeds) AND the exemplar embed off the turn, deriving its context from the learner's lifetime so Close cancels any in-flight embed.
func New ¶
New starts the worker. Returns nil when the Embedder or Saver is missing (the runner then has no learner attached and Observe is a no-op): both are required — the embedder turns a flagged request into the exemplar vector, the saver persists it. The Teacher may be nil (free-tier-only labeling).
func (*Learner) Close ¶
func (l *Learner) Close()
Close stops both the intake worker and the inner activelearn worker and waits for in-flight work to finish (goleak-clean). It cancels the shared lifetime ctx (aborting any in-flight embed), then joins the intake worker, then closes the inner core. Safe to call multiple times and on a nil Learner.
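The shutdown order described for Close (cancel the lifetime context, join the intake worker, close the inner core) can be sketched as follows. This is an illustration under assumptions, not the package's implementation: the learner struct, its fields, newLearner, and the closeCore hook standing in for the activelearn core are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// learner is a hypothetical stand-in for the real Learner's unexported state.
type learner struct {
	cancel    context.CancelFunc // cancels the shared lifetime ctx
	intake    sync.WaitGroup     // tracks the intake goroutine
	closeOnce sync.Once          // makes Close idempotent
	closeCore func()             // stands in for closing the inner activelearn core
}

func newLearner(closeCore func()) *learner {
	ctx, cancel := context.WithCancel(context.Background())
	l := &learner{cancel: cancel, closeCore: closeCore}
	l.intake.Add(1)
	go func() {
		defer l.intake.Done()
		<-ctx.Done() // a real worker would also select on its intake queue
	}()
	return l
}

// Close is safe on a nil receiver and safe to call more than once.
func (l *learner) Close() {
	if l == nil {
		return
	}
	l.closeOnce.Do(func() {
		l.cancel()      // 1. abort any in-flight work bound to the lifetime ctx
		l.intake.Wait() // 2. join the intake worker
		l.closeCore()   // 3. close the inner core last
	})
}

func main() {
	closed := 0
	l := newLearner(func() { closed++ })
	l.Close()
	l.Close() // second call is a no-op

	var none *learner
	none.Close() // nil receiver is a no-op

	fmt.Println("core closed", closed, "time(s)")
}
```

Joining the intake worker before closing the core means nothing can hand new work to the core after it has shut down.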
func (*Learner) Observe ¶
Observe is the runner's post-tool-execution capture site (Open-Q #3). It runs on the synchronous, lock-held turn path, so it MUST NOT block: it only hands the raw (request, usedTool) signal to the intake worker over a bounded channel and returns. A nil learner and a full queue both drop silently (best-effort — the same request can be re-flagged on a later turn). NO embed, ranking, or network I/O happens here (CR-01: the detect+embed work runs on the intake worker, off the turn goroutine).
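A non-blocking, drop-on-full handoff of this kind is commonly written in Go as a select with a default case. The sketch below shows that pattern only; the signal and learner types and the intake field are hypothetical names, and the real Observe signature is not shown in this documentation.

```go
package main

import "fmt"

// signal is a hypothetical carrier for the raw (request, usedTool) pair.
type signal struct {
	request  string
	usedTool string
}

// learner is a hypothetical stand-in holding only the bounded intake channel.
type learner struct {
	intake chan signal
}

// Observe never blocks: it enqueues the signal if there is room and
// silently drops it otherwise. A nil learner is a no-op.
func (l *learner) Observe(request, usedTool string) {
	if l == nil {
		return
	}
	select {
	case l.intake <- signal{request: request, usedTool: usedTool}:
		// handed off to the intake worker
	default:
		// queue full: drop (best-effort)
	}
}

func main() {
	l := &learner{intake: make(chan signal, 1)}
	l.Observe("list the repo files", "shell")
	l.Observe("find TODO comments", "shell") // queue is full, so this is dropped
	fmt.Println("queued signals:", len(l.intake))

	var none *learner
	none.Observe("ignored", "shell") // nil learner: no-op, no panic
}
```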
type Ranker ¶
type Ranker interface {
	Rank(ctx context.Context, query string) (top1 string, margin float64, ok bool)
}
Ranker is the narrow seam the loop uses to ask the free semantic ranker two things about a request: (1) "what tool would you choose?" (top1, for the mis-route disagreement signal) and (2) "how confident are you?" (the top-2 margin, for the two-tier free-vs-escalate label gate). It is satisfied by the tool_search semindex ranker (the per-tool centroid bank). ok is false when the ranker is unwired or the bank is empty — then the detector uses only the embedding-free shell/fs heuristic, and the oracle escalates (no free label is trustworthy).
Rank takes a ctx (CR-01): ranking embeds the deferred corpus + the request, and that I/O runs entirely off the turn goroutine on the learner's intake/activelearn workers, bounded by the learner's lifetime ctx so Close aborts any in-flight embed.
type Saver ¶
Saver persists one oracle-confirmed (query -> tool) example. *toolselectstore.Store satisfies it.
type Teacher ¶
Teacher is the narrow seam for the EXISTING DeepSeek router used as the escalation oracle (Req-8: no new sidecar). The runner supplies a concrete adapter that runs the router prompt via llm.Client (mirroring runner.reasoningOracle: MaxTokens:32, Temperature:0, ToolChoice:"none", Reasoning.Enabled=false). Label returns the confirmed tool name for the request, or ok=false on any failure/decline.