llm

package
v0.4.0 Latest
Published: Jun 20, 2026 | License: MIT | Imports: 15 | Imported by: 0

Documentation

Overview

Package llm implements Excise's v0.3 opt-in rerank against a local Ollama host. Everything in this package is reachable only when the user passes `--llm`; the v0.2 heuristic path never imports it.

The trust contract documented in the README is enforced here:

  • the only outbound HTTP call goes to the host configured in excise.toml (default http://localhost:11434)
  • timeouts are bounded by the caller-supplied context (default 20s)
  • any failure (network, HTTP status, JSON parse) is mapped to ErrLLMUnavailable so callers can fall back deterministically without having to recognise every failure mode

We intentionally use net/http with no third-party SDK — Ollama's /api/generate is a single endpoint, the JSON shape is tiny, and adding a dependency is more code to audit than the 30 lines below.

remote.go — v0.4 opt-in remote rerank backend.

This file adds a SECOND implementation of the existing Reranker interface (see rerank.go) for the three remote providers the plan calls out:

  • openai — POST <base>/v1/chat/completions (OpenAI chat shape)
  • openrouter — same wire shape as OpenAI (OpenRouter is OpenAI-compatible)
  • anthropic — POST <base>/v1/messages (Anthropic messages shape)

It is a NEW package member, not a refactor: cmd/excise selects which concrete Reranker to construct, and the rerank call site is unchanged.

Trust contract (README + plan §v0.4 trust contract):

  • The remote path is reachable ONLY when the user sets backend=remote AND supplies a key. The default backend stays local Ollama.
  • The destination host is echoed to stderr on EVERY remote call so an outbound call is never silent.
  • Any failure (auth, timeout, non-2xx, parse) maps to ErrLLMUnavailable — the SAME sentinel the Ollama path returns — so the caller's existing fallback branch works unchanged.

We use net/http with no provider SDK: each provider is one POST with a small JSON body, and a dependency would be more to audit than the code below.

Index

Constants

This section is empty.

Variables

var ErrLLMUnavailable = errors.New("llm: backend unavailable")

ErrLLMUnavailable is the single sentinel error this package returns when anything goes wrong: HTTP failure, non-2xx response, timeout, body too short, JSON parse failure. Callers compare with errors.Is.

Functions

This section is empty.

Types

type OllamaClient

type OllamaClient struct {
	Host    string        // base URL, e.g. http://localhost:11434
	Model   string        // model tag, e.g. llama3.2
	Timeout time.Duration // per-call timeout; 0 means no limit (not recommended)
	HTTP    *http.Client  // optional injection for tests
}

OllamaClient is the tiny HTTP shim around Ollama's /api/generate. Construction is cheap and concurrency-safe; one instance per rerank call is fine.

func NewOllamaClient

func NewOllamaClient(host, model string, timeout time.Duration) *OllamaClient

NewOllamaClient is the canonical constructor.

func (*OllamaClient) Generate

func (c *OllamaClient) Generate(ctx context.Context, prompt string) (string, error)

Generate sends a single prompt and returns the model's full text response. Any error path maps to ErrLLMUnavailable wrapped with the original cause for diagnostic logging, so callers can `errors.Is(err, ErrLLMUnavailable)` without checking every leaf.

type OllamaReranker

type OllamaReranker struct {
	Client *OllamaClient
}

OllamaReranker is the production implementation. It defers all transport to OllamaClient and all parsing to parseRerankReply, then merges the returned scores/reasons back onto the original TurnScore records.

If the LLM returns a turn id we don't recognise, that entry is dropped (we never invent turns). If the LLM's reply omits a turn id that was present in the input shortlist, we keep that turn at its heuristic position with an empty reason: partial information is better than a total fallback.
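
The merge rules above can be sketched as follows. This is one plausible reading, not the package's mergeReplies: the turnScore and reply types are simplified stand-ins, and the sketch assumes that "heuristic position" means an omitted turn keeps its heuristic score and is ordered by it alongside the LLM-scored turns.

```go
package main

import (
	"fmt"
	"sort"
)

// turnScore is a simplified stand-in for suggest.TurnScore.
type turnScore struct {
	TurnID    string
	Score     float64
	LLMReason string
}

// reply is a hypothetical parsed entry from the LLM's answer.
type reply struct {
	TurnID string
	Score  float64
	Reason string
}

// merge copies the shortlist, overlays the LLM's scores and reasons, and
// reorders by score. The input slice is left untouched.
func merge(shortlist []turnScore, replies []reply) []turnScore {
	out := make([]turnScore, len(shortlist))
	copy(out, shortlist)

	index := make(map[string]int, len(out))
	for i, t := range out {
		index[t.TurnID] = i
	}
	for _, r := range replies {
		i, ok := index[r.TurnID]
		if !ok {
			continue // unknown id: dropped, we never invent turns
		}
		out[i].Score = r.Score
		out[i].LLMReason = r.Reason
	}
	// Omitted turns still carry their heuristic score and an empty reason.
	// Assumption: both score sources share one scale, so one sort suffices.
	sort.SliceStable(out, func(a, b int) bool { return out[a].Score > out[b].Score })
	return out
}

func main() {
	shortlist := []turnScore{{"t1", 0.9, ""}, {"t2", 0.5, ""}, {"t3", 0.4, ""}}
	replies := []reply{
		{"t3", 0.95, "root cause"},
		{"t9", 0.99, "not in the shortlist"},
	}
	for _, t := range merge(shortlist, replies) {
		fmt.Printf("%s %.2f %q\n", t.TurnID, t.Score, t.LLMReason)
	}
}
```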

func NewOllamaReranker

func NewOllamaReranker(client *OllamaClient) *OllamaReranker

NewOllamaReranker is the canonical constructor.

func (*OllamaReranker) Rerank

func (r *OllamaReranker) Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)

Rerank implements the Reranker contract.

type RemoteClient added in v0.4.0

type RemoteClient struct {
	// contains filtered or unexported fields
}

RemoteClient is the tiny HTTP shim around a provider's chat endpoint.

func NewRemoteClient added in v0.4.0

func NewRemoteClient(cfg RemoteConfig) *RemoteClient

NewRemoteClient is the canonical constructor.

func (*RemoteClient) Generate added in v0.4.0

func (c *RemoteClient) Generate(ctx context.Context, prompt string) (string, error)

Generate sends the prompt to the configured provider and returns the model's text response. Every error path maps to ErrLLMUnavailable (wrapped with the cause for diagnostics) so the caller falls back deterministically.

The destination host is echoed to stderr BEFORE the request is sent — the trust posture is that an outbound call is never silent, even if it then fails.

type RemoteConfig added in v0.4.0

type RemoteConfig struct {
	Provider string        // openai | anthropic | openrouter
	BaseURL  string        // optional override; empty → provider default
	Model    string        // model id, e.g. gpt-4o-mini / claude-3-5-haiku-latest
	APIKey   string        // already-resolved key (never logged)
	Timeout  time.Duration // per-call timeout; 0 means no limit
	HTTP     *http.Client  // optional injection for tests
	Stderr   io.Writer     // where the host-echo line is written; nil → os.Stderr
}

RemoteConfig is the resolved transport configuration for a remote call. It is deliberately flat so cmd/excise can build it straight from the config.LLM block + CLI overrides.

type RemoteReranker added in v0.4.0

type RemoteReranker struct {
	Client *RemoteClient
}

RemoteReranker implements the same Reranker interface as OllamaReranker. It renders the identical rerank prompt (prompt.go) and parses the reply with the identical parser, so the only difference from the Ollama path is transport + auth.

func NewRemoteReranker added in v0.4.0

func NewRemoteReranker(client *RemoteClient) *RemoteReranker

NewRemoteReranker is the canonical constructor.

func (*RemoteReranker) Rerank added in v0.4.0

func (r *RemoteReranker) Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)

Rerank implements the Reranker contract.

type Reranker

type Reranker interface {
	Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)
}

Reranker is the v0.3 interface — Rerank takes the v0.2 heuristic shortlist and returns a reordered copy with each entry's LLMReason populated.

The interface is intentionally tiny so adding a v0.4 remote backend (or a stub for tests) is a single method, not a re-wiring of the CLI.

func NewStubReranker

func NewStubReranker(err error, picks []struct {
	TurnID string
	Score  float64
	Reason string
}) Reranker

NewStubReranker is a test helper that returns a Reranker which:

  • on err != nil, returns err immediately (use to exercise the fallback path)
  • otherwise merges the supplied (turn_id, score, reason) tuples onto the incoming shortlist via the same mergeReplies path the real reranker uses.
