toolselectlearn

package
v1.0.1 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 20, 2026 License: MIT Imports: 7 Imported by: 0

Documentation

Overview

Package toolselectlearn implements the tool-selection active-learning loop (D-06/D-07, spike 057). It is the second consumer of the shared internal/activelearn mechanism, after the reasoning learner. The loop detects mis-routed turns cheaply: a turn is flagged when a shell/fs fallback tool was used, or when the tool used differs from the ranker's top-1. It labels the confident cases with the free ranker and escalates the low-margin tail to the existing DeepSeek router (the two-tier oracle), then persists confirmed (query-embedding -> tool) exemplars to :ToolSelectionExample so that the tool_search ranker's per-tool centroids self-improve. All of this runs asynchronously, off the hot path: the runner calls Observe(request, usedTool) after tool execution, and the call never blocks the turn.
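The detection rule described above can be sketched in a few lines. This is only an illustration, not the package's code: the fallbackTools set, the tool names, and the misRouted helper are hypothetical, and the real detector obtains top-1 from the Ranker rather than taking it as an argument.

```go
package main

import "fmt"

// fallbackTools is a hypothetical set of shell/fs fallback tool names;
// the real list is not given in the documentation.
var fallbackTools = map[string]bool{"shell": true, "fs_read": true}

// misRouted reports whether a turn looks mis-routed: either a fallback
// tool was used, or the ranker produced a usable top-1 that differs from
// the tool actually used. When rankerOK is false only the embedding-free
// fallback heuristic applies.
func misRouted(usedTool, top1 string, rankerOK bool) bool {
	if fallbackTools[usedTool] {
		return true
	}
	return rankerOK && usedTool != top1
}

func main() {
	fmt.Println(misRouted("shell", "shell", true))           // fallback tool used
	fmt.Println(misRouted("web_search", "calendar", true))   // disagrees with top-1
	fmt.Println(misRouted("web_search", "web_search", true)) // agrees with top-1
	fmt.Println(misRouted("web_search", "", false))          // ranker unavailable
}
```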


Types

type Config

type Config struct {
	Embedder    semindex.Embedder // embeds the flagged request for the saved exemplar
	Ranker      Ranker            // free ranker: detection top-1 + the free-tier confident label
	Teacher     Teacher           // DeepSeek escalation oracle (kill-switched); nil => free-only
	Saver       Saver             // persists confirmed (query -> tool) to :ToolSelectionExample
	Refresh     func()            // re-folds the per-tool centroids after a save (e.g. ranker.RefreshLearned)
	MarginFloor float64           // activelearn queue margin floor; 0 => default
	Queue       int               // activelearn queue depth; 0 => default
}

Config wires a Learner. Detector inputs (Ranker + Embedder) and the two-tier labeling I/O (Ranker for the free tier, Teacher for the escalation tier, Saver for persistence) are supplied by the composition root. A nil Embedder or Saver yields a nil Learner (the runner then simply observes nothing).

type ExampleLoader

type ExampleLoader interface {
	LoadExamples(ctx context.Context) ([]toolselectstore.LabeledVec, error)
}

ExampleLoader loads the full confirmed-example set for the Refresh re-fold. *toolselectstore.Store satisfies it (LoadExamples). The runner's Refresh hook type-asserts the Saver to this so the loop re-folds the per-tool centroids after a save without toolselectlearn importing toolselectstore.

type Learner

type Learner struct {
	// contains filtered or unexported fields
}

Learner is the tool-selection self-improvement worker. It mirrors reasoninglearn.Learner: a thin wrapper over the shared internal/activelearn core, supplying only the tool-selection-specific I/O (the mis-route detector, the granite embedder, the two-tier oracle, the Neo4j saver).

Observe(request, usedTool) is the runner's post-tool-execution capture entry point. It is a non-blocking handoff (CR-01): it only enqueues the raw signal onto a bounded, drop-on-full channel and returns immediately, so the synchronous, lock-held turn path never waits on embedding or network I/O. A dedicated intake goroutine then runs both the mis-route detection (which embeds) and the exemplar embed off the turn path. The goroutine derives its context from the learner's lifetime, so Close cancels any in-flight embed.

func New

func New(cfg Config) *Learner

New starts the worker. Returns nil when the Embedder or Saver is missing (the runner then has no learner attached and Observe is a no-op): both are required — the embedder turns a flagged request into the exemplar vector, the saver persists it. The Teacher may be nil (free-tier-only labeling).

func (*Learner) Close

func (l *Learner) Close()

Close stops both the intake worker and the inner activelearn worker and waits for in-flight work to finish (goleak-clean). It cancels the shared lifetime ctx (aborting any in-flight embed), then joins the intake worker, then closes the inner core. Safe to call multiple times and on a nil Learner.
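The shutdown order described above (cancel, then join) and the nil-safe, repeatable Close can be sketched as follows. The worker type and start function are hypothetical, and the sketch leaves out the final step of closing the inner activelearn core.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// worker is a hypothetical stand-in for the learner's intake side.
type worker struct {
	cancel context.CancelFunc
	wg     sync.WaitGroup
	once   sync.Once
}

// start launches one goroutine bound to a cancellable lifetime context.
func start() *worker {
	ctx, cancel := context.WithCancel(context.Background())
	w := &worker{cancel: cancel}
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		<-ctx.Done() // stand-in for the intake loop; exits on cancel
	}()
	return w
}

// Close cancels the lifetime context and waits for the goroutine to exit.
// sync.Once makes repeated calls harmless, and the nil check makes it safe
// on a nil receiver.
func (w *worker) Close() {
	if w == nil {
		return
	}
	w.once.Do(func() {
		w.cancel()
		w.wg.Wait()
	})
}

func main() {
	w := start()
	w.Close()
	w.Close() // second call does nothing

	var none *worker
	none.Close() // nil receiver is fine

	fmt.Println("closed")
}
```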

func (*Learner) Observe

func (l *Learner) Observe(request, usedTool string)

Observe is the runner's post-tool-execution capture site (Open-Q #3). It runs on the synchronous, lock-held turn path, so it MUST NOT block: it only hands the raw (request, usedTool) signal to the intake worker over a bounded channel and returns. A nil learner and a full queue both drop silently (best-effort — the same request can be re-flagged on a later turn). NO embed, ranking, or network I/O happens here (CR-01: the detect+embed work runs on the intake worker, off the turn goroutine).
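The bounded, drop-on-full handoff is a standard Go idiom: a select with a default case over a buffered channel. A minimal sketch, assuming a hypothetical intake type and an observe method that returns a bool purely so the drop is visible (the real Observe returns nothing):

```go
package main

import "fmt"

// signal is the raw (request, usedTool) pair handed to the intake worker.
type signal struct{ request, usedTool string }

// intake is a hypothetical stand-in for the learner's intake queue.
type intake struct{ ch chan signal }

// observe enqueues without blocking. It reports whether the signal was
// accepted; a nil receiver or a full buffer drops the signal.
func (in *intake) observe(request, usedTool string) bool {
	if in == nil {
		return false
	}
	select {
	case in.ch <- signal{request, usedTool}:
		return true
	default: // buffer full: drop rather than block the caller
		return false
	}
}

func main() {
	in := &intake{ch: make(chan signal, 2)} // capacity 2, no consumer running
	fmt.Println(in.observe("a", "shell"), in.observe("b", "shell"), in.observe("c", "shell"))

	var none *intake
	fmt.Println(none.observe("d", "shell"))
}
```

Because nothing drains the channel in this sketch, the third call finds the buffer full and is dropped, which is the behaviour the turn path relies on.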

type Ranker

type Ranker interface {
	Rank(ctx context.Context, query string) (top1 string, margin float64, ok bool)
}

Ranker is the narrow seam the loop uses to ask the free semantic ranker two things about a request: (1) "what tool would you choose?" (top1, for the mis-route disagreement signal) and (2) "how confident are you?" (the top-2 margin, for the two-tier free-vs-escalate label gate). It is satisfied by the tool_search semindex ranker (the per-tool centroid bank). ok is false when the ranker is unwired or the bank is empty — then the detector uses only the embedding-free shell/fs heuristic, and the oracle escalates (no free label is trustworthy).

Rank takes a ctx (CR-01): ranking embeds the deferred corpus + the request, and that I/O runs entirely off the turn goroutine on the learner's intake/activelearn workers, bounded by the learner's lifetime ctx so Close aborts any in-flight embed.
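To make the top-1 and margin outputs concrete, here is a sketch of ranking against a per-tool centroid bank. It rests on several assumptions not stated in the documentation: cosine similarity as the score, margin defined as the best score minus the runner-up, and a margin of 0 when only one tool is present. It also takes an already-embedded query vector, whereas the real Rank takes a ctx and the query string.

```go
package main

import (
	"fmt"
	"math"
)

// centroid is a hypothetical bank entry: one mean vector per tool.
type centroid struct {
	tool string
	vec  []float64
}

// cosine returns the cosine similarity of a and b, or 0 for mismatched
// lengths or zero vectors.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// rank returns the best-scoring tool and the gap to the runner-up.
// ok is false when the bank is empty.
func rank(bank []centroid, q []float64) (top1 string, margin float64, ok bool) {
	if len(bank) == 0 {
		return "", 0, false
	}
	best, second := math.Inf(-1), math.Inf(-1)
	for _, c := range bank {
		s := cosine(c.vec, q)
		switch {
		case s > best:
			second, best, top1 = best, s, c.tool
		case s > second:
			second = s
		}
	}
	if len(bank) < 2 {
		return top1, 0, true // no runner-up: report no separation (assumption)
	}
	return top1, best - second, true
}

func main() {
	bank := []centroid{
		{"web_search", []float64{1, 0}},
		{"calendar", []float64{0, 1}},
	}
	fmt.Println(rank(bank, []float64{1, 0})) // clear winner, large margin
	fmt.Println(rank(bank, []float64{1, 1})) // tie between tools, zero margin
	fmt.Println(rank(nil, []float64{1, 0}))  // empty bank, ok is false
}
```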

type Saver

type Saver interface {
	Save(ctx context.Context, query, tool string, vec []float64) error
}

Saver persists one oracle-confirmed (query -> tool) example. *toolselectstore.Store satisfies it.

type Teacher

type Teacher interface {
	Label(ctx context.Context, request string) (tool string, ok bool)
}

Teacher is the narrow seam for the EXISTING DeepSeek router used as the escalation oracle (Req-8: no new sidecar). The runner supplies a concrete adapter that runs the router prompt via llm.Client (mirroring runner.reasoningOracle: MaxTokens:32, Temperature:0, ToolChoice:"none", Reasoning.Enabled=false). Label returns the confirmed tool name for the request, or ok=false on any failure/decline.
