Documentation
Overview ¶
Package llm implements Excise's v0.3 opt-in rerank against a local Ollama host. Everything in this package is reachable only when the user passes `--llm`; the v0.2 heuristic path never imports it.
The trust contract documented in the README is enforced here:
- the only outbound HTTP call goes to the host configured in excise.toml (default http://localhost:11434)
- timeouts are bounded by the caller-supplied context (default 20s)
- any failure (network, HTTP status, JSON parse) is mapped to ErrLLMUnavailable so callers can fall back deterministically without having to recognise every failure mode
We intentionally use net/http with no third-party SDK — Ollama's /api/generate is a single endpoint, the JSON shape is tiny, and adding a dependency is more code to audit than the 30 lines below.
remote.go — v0.4 opt-in remote rerank backend.
This file adds a SECOND implementation of the existing Reranker interface (see rerank.go) for the three remote providers the plan calls out:
- openai — POST <base>/v1/chat/completions (OpenAI chat shape)
- openrouter — same wire shape as OpenAI (OpenRouter is OpenAI-compatible)
- anthropic — POST <base>/v1/messages (Anthropic messages shape)
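As a rough illustration of the three wire shapes listed above, the sketch below picks the endpoint path and auth headers per provider. The helper name `buildRequest`, the default base URLs, and the `anthropic-version` value are assumptions taken from the providers' public APIs, not from this package's source.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"strings"
)

// defaultBase holds ASSUMED provider defaults; the package's real defaults may differ.
var defaultBase = map[string]string{
	"openai":     "https://api.openai.com",
	"openrouter": "https://openrouter.ai/api",
	"anthropic":  "https://api.anthropic.com",
}

// buildRequest is a hypothetical helper: it returns a POST request aimed at
// the provider's chat endpoint with that provider's auth headers set.
func buildRequest(provider, baseURL, apiKey string, body []byte) (*http.Request, error) {
	base, ok := defaultBase[provider]
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", provider)
	}
	if baseURL != "" {
		base = baseURL // explicit override wins over the provider default
	}
	base = strings.TrimRight(base, "/")

	path := "/v1/chat/completions" // openai and openrouter share this shape
	if provider == "anthropic" {
		path = "/v1/messages"
	}
	req, err := http.NewRequest(http.MethodPost, base+path, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if provider == "anthropic" {
		req.Header.Set("x-api-key", apiKey)
		req.Header.Set("anthropic-version", "2023-06-01")
	} else {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	for _, p := range []string{"openai", "openrouter", "anthropic"} {
		req, err := buildRequest(p, "", "sk-example", []byte(`{}`))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(p, "->", req.URL.String())
	}
}
```

Keeping the provider switch in one small function means the request body encoding and the response parsing can vary per provider without touching the transport code.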
It is a NEW package member, not a refactor: cmd/excise selects which concrete Reranker to construct, and the rerank call site is unchanged.
Trust contract (README + plan §v0.4 trust contract):
- The remote path is reachable ONLY when the user sets backend=remote AND supplies a key. The default backend stays local Ollama.
- The destination host is echoed to stderr on EVERY remote call so an outbound call is never silent.
- Any failure (auth, timeout, non-2xx, parse) maps to ErrLLMUnavailable — the SAME sentinel the Ollama path returns — so the caller's existing fallback branch works unchanged.
We use net/http with no provider SDK: each provider is one POST with a small JSON body, and a dependency would be more to audit than the code below.
Constants ¶
This section is empty.
Variables ¶
ErrLLMUnavailable is the single sentinel error this package returns when anything goes wrong: HTTP failure, non-2xx response, timeout, body too short, JSON parse failure. Callers compare with errors.Is.
Functions ¶
This section is empty.
Types ¶
type OllamaClient ¶
type OllamaClient struct {
	Host    string        // base URL, e.g. http://localhost:11434
	Model   string        // model tag, e.g. llama3.2
	Timeout time.Duration // per-call timeout; 0 means no limit (not recommended)
	HTTP    *http.Client  // optional injection for tests
}
OllamaClient is the tiny HTTP shim around Ollama's /api/generate. Construction is cheap and concurrency-safe; one instance per rerank call is fine.
func NewOllamaClient ¶
func NewOllamaClient(host, model string, timeout time.Duration) *OllamaClient
NewOllamaClient is the canonical constructor.
type OllamaReranker ¶
type OllamaReranker struct {
	Client *OllamaClient
}
OllamaReranker is the production implementation. It defers all transport to OllamaClient and all parsing to parseRerankReply, then merges the returned scores/reasons back onto the original TurnScore records.
If the LLM returns a turn id we don't recognise, that entry is dropped (we never invent turns). If the LLM omits a turn id that was in the input, we keep that turn at its heuristic position with an empty reason: partial information is better than total fallback.
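One way to read the merge rules above, as a sketch only. The types `turnScore` and `reply` and the function `merge` are hypothetical stand-ins, and the ordering policy is an assumption: turns the LLM scored are sorted by LLM score within the slots they already occupied, while omitted turns stay at their original index. The package's real merge may order differently.

```go
package main

import (
	"fmt"
	"sort"
)

// turnScore and reply are hypothetical stand-ins for the real record types.
type turnScore struct {
	TurnID    string
	Score     float64
	LLMReason string
}

type reply struct {
	TurnID string
	Score  float64
	Reason string
}

// merge applies LLM replies onto a copy of the heuristic shortlist.
func merge(shortlist []turnScore, replies []reply) []turnScore {
	known := make(map[string]bool, len(shortlist))
	for _, t := range shortlist {
		known[t.TurnID] = true
	}
	byID := make(map[string]reply)
	for _, r := range replies {
		if !known[r.TurnID] {
			continue // unknown turn id: dropped, never invented
		}
		if _, dup := byID[r.TurnID]; !dup {
			byID[r.TurnID] = r // first reply for an id wins
		}
	}

	out := make([]turnScore, len(shortlist))
	copy(out, shortlist)
	var slots []int        // indices of turns the LLM scored
	var ranked []turnScore // those turns, with LLM score and reason applied
	for i, t := range out {
		r, ok := byID[t.TurnID]
		if !ok {
			out[i].LLMReason = "" // omitted turn: stays put, empty reason
			continue
		}
		t.Score, t.LLMReason = r.Score, r.Reason
		slots = append(slots, i)
		ranked = append(ranked, t)
	}
	sort.SliceStable(ranked, func(a, b int) bool { return ranked[a].Score > ranked[b].Score })
	for k, i := range slots {
		out[i] = ranked[k]
	}
	return out
}

func main() {
	shortlist := []turnScore{{"a", 0.9, ""}, {"b", 0.8, ""}, {"c", 0.7, ""}}
	replies := []reply{{"c", 0.95, "key decision"}, {"a", 0.4, "setup"}, {"zzz", 1.0, "invented"}}
	for _, t := range merge(shortlist, replies) {
		fmt.Printf("%s %.2f %q\n", t.TurnID, t.Score, t.LLMReason)
	}
}
```

In this run, "zzz" is discarded, "b" keeps index 1 with an empty reason, and "c" and "a" swap between the two slots they held.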
func NewOllamaReranker ¶
func NewOllamaReranker(client *OllamaClient) *OllamaReranker
NewOllamaReranker is the canonical constructor.
type RemoteClient ¶ added in v0.4.0
type RemoteClient struct {
// contains filtered or unexported fields
}
RemoteClient is the tiny HTTP shim around a provider's chat endpoint.
func NewRemoteClient ¶ added in v0.4.0
func NewRemoteClient(cfg RemoteConfig) *RemoteClient
NewRemoteClient is the canonical constructor.
func (*RemoteClient) Generate ¶ added in v0.4.0
Generate sends the prompt to the configured provider and returns the model's text response. Every error path maps to ErrLLMUnavailable (wrapped with the cause for diagnostics) so the caller falls back deterministically.
The destination host is echoed to stderr BEFORE the request is sent — the trust posture is that an outbound call is never silent, even if it then fails.
type RemoteConfig ¶ added in v0.4.0
type RemoteConfig struct {
	Provider string        // openai | anthropic | openrouter
	BaseURL  string        // optional override; empty → provider default
	Model    string        // model id, e.g. gpt-4o-mini / claude-3-5-haiku-latest
	APIKey   string        // already-resolved key (never logged)
	Timeout  time.Duration // per-call timeout; 0 means no limit
	HTTP     *http.Client  // optional injection for tests
	Stderr   io.Writer     // where the host-echo line is written; nil → os.Stderr
}
RemoteConfig is the resolved transport configuration for a remote call. It is deliberately flat so cmd/excise can build it straight from the config.LLM block + CLI overrides.
type RemoteReranker ¶ added in v0.4.0
type RemoteReranker struct {
	Client *RemoteClient
}
RemoteReranker implements the same Reranker interface as OllamaReranker. It renders the identical rerank prompt (prompt.go) and parses the reply with the identical parser, so the only difference from the Ollama path is transport + auth.
func NewRemoteReranker ¶ added in v0.4.0
func NewRemoteReranker(client *RemoteClient) *RemoteReranker
NewRemoteReranker is the canonical constructor.
type Reranker ¶
type Reranker interface {
	Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)
}
Reranker is the v0.3 interface — Rerank takes the v0.2 heuristic shortlist and returns a reordered copy with each entry's LLMReason populated.
The interface is intentionally tiny so adding a v0.4 remote backend (or a stub for tests) is a single method, not a re-wiring of the CLI.
func NewStubReranker ¶
func NewStubReranker(err error, picks []struct {
	TurnID string
	Score  float64
	Reason string
}) Reranker
NewStubReranker is a test helper that returns a Reranker which:
- on err != nil, returns err immediately (use to exercise the fallback path)
- otherwise merges the supplied (turn_id, score, reason) tuples onto the incoming shortlist via the same mergeReplies path the real reranker uses.