bench

package
v0.1.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 10, 2026 License: AGPL-3.0 Imports: 12 Imported by: 0

Documentation

Overview

Package bench implements the `bubblefish bench` command: throughput, latency, and retrieval-evaluation benchmarks against a running Nexus daemon.

All benchmarks are HTTP-client-based (not in-process) as required by Tech Spec Section 13.4.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func MRR

func MRR(retrieved []string, expected map[string]bool) float64

MRR computes Mean Reciprocal Rank. It returns 1/(rank of first relevant result) or 0 if no relevant results are found.

func NDCG

func NDCG(retrieved []string, expectedOrder []string) float64

NDCG computes Normalized Discounted Cumulative Gain. expectedOrder is the ideal ranking. retrieved is the actual ranking from the system.

func Percentile

func Percentile(sorted []float64, p float64) float64

Percentile computes the p-th percentile from a sorted slice of float64 values. Returns 0 if the slice is empty. p must be in [0, 100].

func PrecisionRecall

func PrecisionRecall(retrieved []string, expected map[string]bool) (float64, float64, int)

PrecisionRecall computes precision and recall given a list of retrieved IDs and a set of expected (relevant) IDs. Returns (precision, recall, relevant_count).

func Run

func Run(opts Options) (interface{}, error)

Run executes the benchmark specified by opts and returns the JSON-serializable result. It writes machine-readable output to opts.OutputFile if set.

Types

type EvalResult

type EvalResult struct {
	Mode      string  `json:"mode"`
	Precision float64 `json:"precision"`
	Recall    float64 `json:"recall"`
	MRR       float64 `json:"mrr"`
	NDCG      float64 `json:"ndcg"`
	Expected  int     `json:"expected"`
	Retrieved int     `json:"retrieved"`
	Relevant  int     `json:"relevant"`
}

EvalResult holds the outcome of an eval benchmark.

type LatencyResult

type LatencyResult struct {
	Mode              string             `json:"mode"`
	Requests          int                `json:"requests"`
	DurationMs        float64            `json:"duration_ms"`
	P50Ms             float64            `json:"p50_ms"`
	P95Ms             float64            `json:"p95_ms"`
	P99Ms             float64            `json:"p99_ms"`
	PerStageLatencyMs map[string]float64 `json:"per_stage_latency_ms"`
	StagesHit         []string           `json:"stages_hit"`
	CacheHitRatio     float64            `json:"cache_hit_ratio"`
	Errors            int                `json:"errors"`
}

LatencyResult holds the outcome of a latency benchmark.

type Options

type Options struct {
	// Mode is one of "throughput", "latency", or "eval".
	Mode string
	// URL is the base URL of the running Nexus daemon (e.g. http://127.0.0.1:8000).
	URL string
	// N is the number of requests to issue.
	N int
	// Concurrency is the number of concurrent workers (throughput mode only).
	Concurrency int
	// Source is the source name for write requests.
	Source string
	// Destination is the destination name for read requests.
	Destination string
	// APIKey is the data-plane API key.
	APIKey string
	// AdminKey is the admin token (needed for debug_stages in latency mode).
	AdminKey string
	// GoldenFile is the path to a known-good JSON for eval mode.
	GoldenFile string
	// Query is the search query string for latency and eval modes.
	Query string
	// OutputFile is the optional path for machine-readable JSON results.
	OutputFile string
	// Logger for structured output.
	Logger *slog.Logger
}

Options configures a benchmark run.

type ThroughputResult

type ThroughputResult struct {
	Mode        string  `json:"mode"`
	Requests    int     `json:"requests"`
	Concurrency int     `json:"concurrency"`
	DurationMs  float64 `json:"duration_ms"`
	ReqPerSec   float64 `json:"req_per_sec"`
	P50Ms       float64 `json:"p50_ms"`
	P95Ms       float64 `json:"p95_ms"`
	P99Ms       float64 `json:"p99_ms"`
	Errors      int     `json:"errors"`
}

ThroughputResult holds the outcome of a throughput benchmark.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL