Documentation
Overview
Command eval runs mneme's prompt-evaluation harness against a live LLM and prints a per-metric table plus an aggregate score for each prompt version. The aggregate score is the number we trust when we change the extraction prompt.
Usage:
    go run ./cmd/eval                  # uses .env / MNEME_* env, fake embedder
    go run ./cmd/eval -model x -k 5    # override model and search depth
The command is intentionally kept outside `go test ./...`, which must stay offline, because a run needs network access and an API key.
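As a rough illustration of how the `-model` and `-k` overrides could sit on top of `MNEME_*` environment settings, here is a minimal sketch using the standard `flag` package. It is not mneme's actual code: the `Config` type, the `parseConfig` helper, the variable names `MNEME_MODEL` and `MNEME_API_KEY`, and the default depth of 10 are all assumptions made for this example, and loading `.env` is left out.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// Config holds the settings for one evaluation run.
// The field names are illustrative assumptions, not mneme's real types.
type Config struct {
	Model  string // model to evaluate against
	K      int    // search depth
	APIKey string // credential for the live LLM
}

// parseConfig reads defaults from the environment and lets command-line
// flags override them. MNEME_MODEL and MNEME_API_KEY are hypothetical
// variable names chosen for this sketch.
func parseConfig(args []string, getenv func(string) string) (Config, error) {
	cfg := Config{
		Model:  getenv("MNEME_MODEL"),
		APIKey: getenv("MNEME_API_KEY"),
	}

	fs := flag.NewFlagSet("eval", flag.ContinueOnError)
	fs.StringVar(&cfg.Model, "model", cfg.Model, "model to evaluate against (overrides env)")
	fs.IntVar(&cfg.K, "k", 10, "search depth (the default of 10 is an assumption)")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}

	// A live run cannot proceed without a key, so fail early and clearly.
	if cfg.APIKey == "" {
		return Config{}, errors.New("no API key set: a live run needs network access and a key")
	}
	if cfg.K <= 0 {
		return Config{}, fmt.Errorf("invalid -k %d: search depth must be positive", cfg.K)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig(os.Args[1:], os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "eval:", err)
		os.Exit(1)
	}
	fmt.Printf("model=%s k=%d\n", cfg.Model, cfg.K)
}
```

Passing `getenv` as a parameter keeps the parsing logic testable offline, which fits the constraint that `go test ./...` must not touch the network.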