llm

package

Version: v0.4.0 (latest). Published: Jun 20, 2026. License: MIT. Imports: 15. Imported by: 0.

Repository

github.com/SuperMarioYL/excise

Documentation ¶

Overview ¶

Package llm implements Excise's v0.3 opt-in rerank against a local Ollama host. Everything in this package is reachable only when the user passes `--llm`; the v0.2 heuristic path never imports it.

The trust contract documented in the README is enforced here:

- the only outbound HTTP call goes to the host configured in excise.toml (default http://localhost:11434)
- timeouts are bounded by the caller-supplied context (default 20s)
- any failure (network, HTTP status, JSON parse) is mapped to ErrLLMUnavailable so callers can fall back deterministically without having to recognise every failure mode

We intentionally use net/http with no third-party SDK — Ollama's /api/generate is a single endpoint, the JSON shape is tiny, and adding a dependency is more code to audit than the 30 lines below.
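The single-endpoint call described above can be sketched as follows. This is a minimal illustration, not the package's actual code: `generate`, `demo`, `isUnavailable`, and `errUnavailable` are hypothetical names, and the in-process `httptest` server only stands in for a real Ollama host so the example runs without one. The request body (`model`, `prompt`, `stream: false`) and the `response` field follow Ollama's documented /api/generate shape.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// errUnavailable is a local stand-in for the package's ErrLLMUnavailable.
var errUnavailable = errors.New("llm: backend unavailable")

// generate POSTs one prompt to <host>/api/generate and returns the reply text.
// Every failure is wrapped around errUnavailable so callers need one check.
func generate(ctx context.Context, host, model, prompt string) (string, error) {
	body, err := json.Marshal(map[string]any{
		"model":  model,
		"prompt": prompt,
		"stream": false, // ask for one JSON object instead of a stream
	})
	if err != nil {
		return "", fmt.Errorf("%w: encode: %v", errUnavailable, err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, host+"/api/generate", bytes.NewReader(body))
	if err != nil {
		return "", fmt.Errorf("%w: request: %v", errUnavailable, err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", fmt.Errorf("%w: transport: %v", errUnavailable, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return "", fmt.Errorf("%w: status %d", errUnavailable, resp.StatusCode)
	}
	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", fmt.Errorf("%w: decode: %v", errUnavailable, err)
	}
	return out.Response, nil
}

// demo runs generate against an in-process fake that answers with the given status.
func demo(status int) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if status != http.StatusOK {
			http.Error(w, "boom", status)
			return
		}
		fmt.Fprint(w, `{"response":"ok"}`)
	}))
	defer srv.Close()
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
	defer cancel()
	return generate(ctx, srv.URL, "llama3.2", "ping")
}

// isUnavailable reports whether err carries the sentinel anywhere in its chain.
func isUnavailable(err error) bool { return errors.Is(err, errUnavailable) }

func main() {
	text, err := demo(http.StatusOK)
	fmt.Println(text, err)
	_, err = demo(http.StatusInternalServerError)
	fmt.Println(isUnavailable(err))
}
```

Wrapping with `%w` keeps the sentinel matchable through `errors.Is` while the `%v` suffix preserves the cause for logs.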

remote.go — v0.4 opt-in remote rerank backend.

This file adds a SECOND implementation of the existing Reranker interface (see rerank.go) for the three remote providers the plan calls out:

- openai — POST <base>/v1/chat/completions (OpenAI chat shape)
- openrouter — same wire shape as OpenAI (OpenRouter is OpenAI-compatible)
- anthropic — POST <base>/v1/messages (Anthropic messages shape)

It is a NEW package member, not a refactor: cmd/excise selects which concrete Reranker to construct, and the rerank call site is unchanged.

Trust contract (README + plan §v0.4 trust contract):

- The remote path is reachable ONLY when the user sets backend=remote AND supplies a key. The default backend stays local Ollama.
- The destination host is echoed to stderr on EVERY remote call so an outbound call is never silent.
- Any failure (auth, timeout, non-2xx, parse) maps to ErrLLMUnavailable — the SAME sentinel the Ollama path returns — so the caller's existing fallback branch works unchanged.

We use net/http with no provider SDK: each provider is one POST with a small JSON body, and a dependency would be more to audit than the code below.
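To illustrate how small the two wire shapes are, here is a hedged sketch that prepares (without sending) the POST for each provider. `buildRequest` and `describe` are hypothetical helpers, the base URL `https://example.invalid` and the `max_tokens` value are placeholders, and the package's real request construction may differ. The header names (`Authorization: Bearer`, `x-api-key`, `anthropic-version`) come from the providers' public API documentation.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage is the role/content pair both wire shapes share.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// buildRequest prepares, but does not send, the single POST for a provider.
func buildRequest(ctx context.Context, provider, base, model, key, prompt string) (*http.Request, error) {
	payload := map[string]any{
		"model":    model,
		"messages": []chatMessage{{Role: "user", Content: prompt}},
	}
	path := "/v1/chat/completions" // openai and openrouter share this shape
	switch provider {
	case "openai", "openrouter":
		// defaults above already match
	case "anthropic":
		path = "/v1/messages"
		payload["max_tokens"] = 1024 // the messages API requires an explicit cap
	default:
		return nil, fmt.Errorf("unknown provider %q", provider)
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, base+path, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if provider == "anthropic" {
		req.Header.Set("x-api-key", key)
		req.Header.Set("anthropic-version", "2023-06-01")
	} else {
		req.Header.Set("Authorization", "Bearer "+key)
	}
	return req, nil
}

// describe summarises a prepared request without revealing the key.
func describe(provider string) string {
	req, err := buildRequest(context.Background(), provider,
		"https://example.invalid", "some-model", "sk-placeholder", "hi")
	if err != nil {
		return "error: " + err.Error()
	}
	auth := "Authorization"
	if req.Header.Get("x-api-key") != "" {
		auth = "x-api-key"
	}
	return req.Method + " " + req.URL.Path + " auth=" + auth
}

func main() {
	for _, p := range []string{"openai", "openrouter", "anthropic"} {
		fmt.Println(p, "->", describe(p))
	}
}
```

The only per-provider differences in this sketch are the path, one extra body field, and the auth headers, which is consistent with the case made above for skipping provider SDKs.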

Index ¶

Variables
type OllamaClient
- func NewOllamaClient(host, model string, timeout time.Duration) *OllamaClient
- func (c *OllamaClient) Generate(ctx context.Context, prompt string) (string, error)
type OllamaReranker
- func NewOllamaReranker(client *OllamaClient) *OllamaReranker
- func (r *OllamaReranker) Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)
type RemoteClient
- func NewRemoteClient(cfg RemoteConfig) *RemoteClient
- func (c *RemoteClient) Generate(ctx context.Context, prompt string) (string, error)
type RemoteConfig
type RemoteReranker
- func NewRemoteReranker(client *RemoteClient) *RemoteReranker
- func (r *RemoteReranker) Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)
type Reranker
- func NewStubReranker(err error, picks []struct{ ... }) Reranker

Constants ¶

This section is empty.

Variables ¶

var ErrLLMUnavailable = errors.New("llm: backend unavailable")

ErrLLMUnavailable is the single sentinel error this package returns when anything goes wrong: HTTP failure, non-2xx response, timeout, body too short, JSON parse failure. Callers compare with errors.Is.
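A caller-side fallback built on that sentinel might look like the sketch below. The names `rerankOrFallback`, `rerankFunc`, `offline`, and `reversed` are hypothetical, turns are reduced to plain strings, and `errUnavailable` is a local stand-in for ErrLLMUnavailable; the real call site in cmd/excise is not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a local stand-in for the package's ErrLLMUnavailable.
var errUnavailable = errors.New("llm: backend unavailable")

// rerankFunc is a simplified stand-in for a Reranker's Rerank method.
type rerankFunc func(shortlist []string) ([]string, error)

// rerankOrFallback returns the reranked list, or the untouched heuristic
// shortlist when the backend reports the sentinel. Other errors propagate.
func rerankOrFallback(rerank rerankFunc, shortlist []string) (out []string, fellBack bool, err error) {
	out, err = rerank(shortlist)
	switch {
	case err == nil:
		return out, false, nil
	case errors.Is(err, errUnavailable):
		return shortlist, true, nil
	default:
		return nil, false, err
	}
}

// offline simulates a backend whose transport failed; the cause is wrapped.
func offline(_ []string) ([]string, error) {
	return nil, fmt.Errorf("%w: dial tcp 127.0.0.1:11434: connection refused", errUnavailable)
}

// reversed simulates a backend that succeeded and changed the order.
func reversed(in []string) ([]string, error) {
	out := make([]string, len(in))
	for i, v := range in {
		out[len(in)-1-i] = v
	}
	return out, nil
}

func main() {
	heuristic := []string{"turn-3", "turn-7", "turn-1"}
	out, fellBack, err := rerankOrFallback(offline, heuristic)
	fmt.Println(out, fellBack, err)
	out, fellBack, err = rerankOrFallback(reversed, heuristic)
	fmt.Println(out, fellBack, err)
}
```

Because the cause is wrapped rather than replaced, `errors.Is` still matches while the log line keeps the underlying detail.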

Functions ¶

This section is empty.

Types ¶

type OllamaClient ¶

type OllamaClient struct {
	Host    string        // base URL, e.g. http://localhost:11434
	Model   string        // model tag, e.g. llama3.2
	Timeout time.Duration // per-call timeout; 0 means no limit (not recommended)
	HTTP    *http.Client  // optional injection for tests
}

OllamaClient is the tiny HTTP shim around Ollama's /api/generate. Construction is cheap and concurrency-safe; one instance per rerank call is fine.

func NewOllamaClient ¶

func NewOllamaClient(host, model string, timeout time.Duration) *OllamaClient

NewOllamaClient is the canonical constructor.

func (*OllamaClient) Generate ¶

func (c *OllamaClient) Generate(ctx context.Context, prompt string) (string, error)

Generate sends a single prompt and returns the model's full text response. Every error path returns ErrLLMUnavailable wrapped around the original cause (kept for diagnostic logging), so callers can check `errors.Is(err, ErrLLMUnavailable)` without inspecting each underlying failure.

type OllamaReranker ¶

type OllamaReranker struct {
	Client *OllamaClient
}

OllamaReranker is the production implementation. It defers all transport to OllamaClient and all parsing to parseRerankReply, then merges the returned scores/reasons back onto the original TurnScore records.

If the LLM returns a turn id we don't recognise, that entry is dropped (we never invent turns). If the LLM omits a turn id that was present in the input shortlist, we keep that turn at its heuristic position with an empty reason: partial information is better than a total fallback.
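One possible reading of these merge rules is sketched below. It is an interpretation, not the package's mergeReplies: `turnScore` and `pick` are simplified stand-ins for suggest.TurnScore and the parsed reply, and the choice to let omitted turns keep their original slot while reranked turns fill the remaining slots in descending score order is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// turnScore is a simplified stand-in for suggest.TurnScore.
type turnScore struct {
	TurnID    string
	Score     float64
	LLMReason string
}

// pick is a stand-in for one parsed (turn id, score, reason) reply entry.
type pick struct {
	TurnID string
	Score  float64
	Reason string
}

// merge applies picks to the shortlist. Unknown ids are dropped; turns the
// reply omitted stay in their original slot with an empty reason.
func merge(shortlist []turnScore, picks []pick) []turnScore {
	index := make(map[string]int, len(shortlist))
	for i, t := range shortlist {
		index[t.TurnID] = i
	}
	picked := make(map[string]bool, len(picks))
	var ranked []turnScore
	for _, p := range picks {
		i, known := index[p.TurnID]
		if !known || picked[p.TurnID] {
			continue // never invent turns; ignore duplicate entries
		}
		picked[p.TurnID] = true
		t := shortlist[i]
		t.Score, t.LLMReason = p.Score, p.Reason
		ranked = append(ranked, t)
	}
	sort.SliceStable(ranked, func(a, b int) bool { return ranked[a].Score > ranked[b].Score })

	out := make([]turnScore, 0, len(shortlist))
	next := 0
	for _, t := range shortlist {
		if picked[t.TurnID] {
			out = append(out, ranked[next]) // slot goes to the next-best reranked turn
			next++
		} else {
			out = append(out, t) // omitted by the model: keep heuristic position
		}
	}
	return out
}

// example merges a reply that reorders t1/t3, omits t2, and invents t9.
func example() []turnScore {
	shortlist := []turnScore{{TurnID: "t1", Score: 0.9}, {TurnID: "t2", Score: 0.8}, {TurnID: "t3", Score: 0.7}}
	picks := []pick{{"t3", 0.95, "large stale output"}, {"t9", 0.99, "ghost"}, {"t1", 0.4, "still relevant"}}
	return merge(shortlist, picks)
}

func main() {
	for _, t := range example() {
		fmt.Printf("%s %.2f %q\n", t.TurnID, t.Score, t.LLMReason)
	}
}
```

In this example the result keeps three entries: t3 moves to the front, t2 stays in the middle with an empty reason, and the unknown t9 never appears.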

func NewOllamaReranker ¶

func NewOllamaReranker(client *OllamaClient) *OllamaReranker

NewOllamaReranker is the canonical constructor.

func (*OllamaReranker) Rerank ¶

func (r *OllamaReranker) Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)

Rerank implements the Reranker contract.

type RemoteClient ¶ added in v0.4.0

type RemoteClient struct {
	// contains filtered or unexported fields
}

RemoteClient is the tiny HTTP shim around a provider's chat endpoint.

func NewRemoteClient ¶ added in v0.4.0

func NewRemoteClient(cfg RemoteConfig) *RemoteClient

NewRemoteClient is the canonical constructor.

func (*RemoteClient) Generate ¶ added in v0.4.0

func (c *RemoteClient) Generate(ctx context.Context, prompt string) (string, error)

Generate sends the prompt to the configured provider and returns the model's text response. Every error path maps to ErrLLMUnavailable (wrapped with the cause for diagnostics) so the caller falls back deterministically.

The destination host is echoed to stderr BEFORE the request is sent — the trust posture is that an outbound call is never silent, even if it then fails.
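The echo-before-send step can be sketched as below. `echoHost` and `echoed` are hypothetical helpers and the exact wording of the stderr line is an assumption; only the behaviour (host written first, nil writer falling back to os.Stderr, key never printed) is taken from this documentation.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/url"
	"os"
)

// echoHost writes the destination host to w before any request is made.
// A nil writer falls back to os.Stderr. Only the host is printed: no path,
// no query string, and never the API key.
func echoHost(w io.Writer, rawURL string) error {
	if w == nil {
		w = os.Stderr
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return err
	}
	_, err = fmt.Fprintf(w, "excise: remote rerank request to %s\n", u.Host)
	return err
}

// echoed captures the echo line in memory, the way a test might inject a writer.
func echoed(rawURL string) string {
	var buf bytes.Buffer
	if err := echoHost(&buf, rawURL); err != nil {
		return ""
	}
	return buf.String()
}

func main() {
	// With a nil writer the line goes to the process's stderr.
	if err := echoHost(nil, "https://api.example.invalid/v1/messages"); err != nil {
		fmt.Println("echo failed:", err)
	}
}
```

Taking an io.Writer rather than writing to os.Stderr directly is what lets a test assert that the echo happened.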

type RemoteConfig ¶ added in v0.4.0

type RemoteConfig struct {
	Provider string        // openai | anthropic | openrouter
	BaseURL  string        // optional override; empty → provider default
	Model    string        // model id, e.g. gpt-4o-mini / claude-3-5-haiku-latest
	APIKey   string        // already-resolved key (never logged)
	Timeout  time.Duration // per-call timeout; 0 means no limit
	HTTP     *http.Client  // optional injection for tests
	Stderr   io.Writer     // where the host-echo line is written; nil → os.Stderr
}

RemoteConfig is the resolved transport configuration for a remote call. It is deliberately flat so cmd/excise can build it straight from the config.LLM block + CLI overrides.
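The "0 means no limit" convention on Timeout could be honoured with a context helper along these lines. This is an assumption about one reasonable approach: `withCallTimeout` and `hasDeadline` are hypothetical names, and the package may instead set a timeout on its http.Client.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// withCallTimeout derives a per-call context. A non-positive duration adds
// no deadline, so only the caller's own context can bound the call.
func withCallTimeout(parent context.Context, d time.Duration) (context.Context, context.CancelFunc) {
	if d <= 0 {
		return context.WithCancel(parent)
	}
	return context.WithTimeout(parent, d)
}

// hasDeadline reports whether a context derived with d carries a deadline.
func hasDeadline(d time.Duration) bool {
	ctx, cancel := withCallTimeout(context.Background(), d)
	defer cancel()
	_, ok := ctx.Deadline()
	return ok
}

func main() {
	fmt.Println(hasDeadline(0))                // no limit configured
	fmt.Println(hasDeadline(20 * time.Second)) // bounded call
}
```

If the parent context already has an earlier deadline, context.WithTimeout keeps the earlier one, so a caller-supplied bound is never loosened.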

type RemoteReranker ¶ added in v0.4.0

type RemoteReranker struct {
	Client *RemoteClient
}

RemoteReranker implements the same Reranker interface as OllamaReranker. It renders the identical rerank prompt (prompt.go) and parses the reply with the identical parser, so the only difference from the Ollama path is transport + auth.

func NewRemoteReranker ¶ added in v0.4.0

func NewRemoteReranker(client *RemoteClient) *RemoteReranker

NewRemoteReranker is the canonical constructor.

func (*RemoteReranker) Rerank ¶ added in v0.4.0

func (r *RemoteReranker) Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)

Rerank implements the Reranker contract.

type Reranker ¶

type Reranker interface {
	Rerank(ctx context.Context, s *session.Session, shortlist []suggest.TurnScore) ([]suggest.TurnScore, error)
}

Reranker is the v0.3 interface — Rerank takes the v0.2 heuristic shortlist and returns a reordered copy with each entry's LLMReason populated.

The interface is intentionally tiny so adding a v0.4 remote backend (or a stub for tests) is a single method, not a re-wiring of the CLI.

func NewStubReranker ¶

func NewStubReranker(err error, picks []struct {
	TurnID string
	Score  float64
	Reason string
}) Reranker

NewStubReranker is a test helper that returns a Reranker which:

- if err != nil, returns err immediately (use this to exercise the fallback path)
- otherwise merges the supplied (turn_id, score, reason) tuples onto the incoming shortlist via the same mergeReplies path the real reranker uses.
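The one-method design makes a test double short to write. The sketch below uses simplified stand-ins: `reranker`, `turnScore`, `stubReranker`, and `run` are hypothetical, the *session.Session parameter is omitted, and the stub only annotates reasons rather than going through mergeReplies as NewStubReranker does.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// turnScore is a simplified stand-in for suggest.TurnScore.
type turnScore struct {
	TurnID    string
	LLMReason string
}

// reranker mirrors the shape of the Reranker interface, minus the session.
type reranker interface {
	Rerank(ctx context.Context, shortlist []turnScore) ([]turnScore, error)
}

// stubReranker either fails with err or attaches canned reasons.
type stubReranker struct {
	err     error
	reasons map[string]string
}

// Compile-time check that the stub satisfies the interface.
var _ reranker = stubReranker{}

func (s stubReranker) Rerank(_ context.Context, shortlist []turnScore) ([]turnScore, error) {
	if s.err != nil {
		return nil, s.err // exercise the caller's fallback path
	}
	out := make([]turnScore, len(shortlist))
	copy(out, shortlist) // return a copy; leave the input untouched
	for i := range out {
		out[i].LLMReason = s.reasons[out[i].TurnID]
	}
	return out, nil
}

// run drives the stub in either its failing or its succeeding mode.
func run(fail bool) ([]turnScore, error) {
	var r reranker = stubReranker{reasons: map[string]string{"t1": "stale tool output"}}
	if fail {
		r = stubReranker{err: errors.New("llm: backend unavailable")}
	}
	return r.Rerank(context.Background(), []turnScore{{TurnID: "t1"}, {TurnID: "t2"}})
}

func main() {
	out, err := run(false)
	fmt.Println(out, err)
	out, err = run(true)
	fmt.Println(out, err)
}
```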
