bed

package module
v0.0.0-...-8b8aa9b Latest
Warning

This package is not in the latest version of its module.

Published: Dec 14, 2025 License: MIT Imports: 12 Imported by: 0

README

bed - Semantic Search CLI with NDCG@10 Eval

Fast semantic search for directories with GPU-accelerated CAGRA indexing and NDCG@10 benchmarking.

Features

  • Directory Search: Index and search through code/docs directories
  • GPU CAGRA: CUDA-accelerated graph-based ANN search
  • NDCG@10 Eval: Standard IR benchmark metric for ranking quality
  • Recall@K: Traditional retrieval quality metric
  • Performance: P50/P95 latency, QPS metrics

Quick Start

# Build CLI (the path below is the author's checkout; substitute your own)
cd /home/lee/code/gobed
go build -o bed-cli cmd/bed/main.go

# Search a directory
./bed-cli -dir ./docs -q "GPU acceleration" -k 10

# Run benchmark with NDCG@10
./bed-cli -dir ./docs -bench -queries 100 -k 10

API Usage

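import (
	"fmt"
	"log"
	"github.com/lee101/gobed"
	"github.com/lee101/gobed/bed"
)

// CPU eval with NDCG@10
cfg := bed.EvalConfig{K: 10, NumQueries: 100, Warmup: 10}
result, err := bed.RunEval(model, docs, cfg)
if err != nil {
	log.Fatal(err)
}
fmt.Printf("NDCG@10: %.4f, Recall@10: %.4f\n", result.NDCGAtK, result.RecallAtK)

// GPU CAGRA eval (EvalGPUResult has no NDCGAtK field; it reports recall and latency)
gpuCfg := gobed.DefaultGPUCagraConfig()
gpuResult, err := bed.RunEvalGPU(model, docs, cfg, gpuCfg)
if err != nil {
	log.Fatal(err)
}
fmt.Printf("Recall@10: %.4f, QPS: %.1f\n", gpuResult.RecallAtK, gpuResult.QPS)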

Metrics

  • NDCG@K: Normalized Discounted Cumulative Gain - scores a ranking by graded relevance, discounting relevant docs that appear lower in the top-K
  • Recall@K: Fraction of all relevant docs that appear in the top-K results
  • P50/P95: Latency percentiles
  • QPS: Queries per second

GPU Setup

# Build GPU CAGRA library
make gpu-build

# Build with GPU tags
go build -tags="gpu" -o bed-cli cmd/bed/main.go

# Set library path (point this at the gpu directory of your own checkout)
export LD_LIBRARY_PATH=/home/lee/code/gobed/gpu:$LD_LIBRARY_PATH

# Run with GPU
./bed-cli -bench -gpu -queries 1000

Files

  • ndcg.go - NDCG@K metric implementation
  • eval.go - CPU evaluation harness
  • eval_gpu.go - GPU evaluation harness
  • fsindexer.go - Directory/file indexer
  • cmd/bed/main.go - CLI tool

Test

go run cmd/test_ndcg/main.go

Output:

NDCG@5: 0.9995
Recall@5: 1.0000
P50: 0.08ms, QPS: 11988.1

Documentation

Functions

func NDCG

func NDCG(relevances map[int]float64, predicted []int, k int) float64

NDCG computes Normalized Discounted Cumulative Gain at K. relevances maps doc ID to relevance score (higher = more relevant). predicted is the ordered list of predicted doc IDs.

func NDCGAtK

func NDCGAtK(goldRelevances []map[int]float64, predicted [][]int, k int) float64

NDCGAtK computes NDCG@K for multiple queries. goldRelevances[i] is the relevance map for query i. predicted[i] is the predicted ranking for query i.
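
To make the two signatures above concrete, here is a minimal sketch of single-query and multi-query NDCG. It is not the package's source: the names ndcgSketch and meanNDCGSketch are hypothetical, and the linear gain, the log2 position discount, and the arithmetic mean across queries are assumptions (some implementations use an exponential gain of 2^rel - 1 instead).

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// dcg sums gains discounted by log2(rank+1), with ranks starting at 1.
func dcg(gains []float64) float64 {
	total := 0.0
	for i, g := range gains {
		total += g / math.Log2(float64(i)+2)
	}
	return total
}

// ndcgSketch follows the documented NDCG signature. Linear gain and a log2
// discount are assumptions, not confirmed details of the package.
func ndcgSketch(relevances map[int]float64, predicted []int, k int) float64 {
	if k <= 0 {
		return 0
	}
	n := k
	if n > len(predicted) {
		n = len(predicted)
	}
	got := make([]float64, n)
	for i, id := range predicted[:n] {
		got[i] = relevances[id] // IDs missing from the map contribute 0
	}
	// Ideal ranking: all relevance scores in descending order, cut at k.
	ideal := make([]float64, 0, len(relevances))
	for _, r := range relevances {
		ideal = append(ideal, r)
	}
	sort.Sort(sort.Reverse(sort.Float64Slice(ideal)))
	if len(ideal) > k {
		ideal = ideal[:k]
	}
	idcg := dcg(ideal)
	if idcg == 0 {
		return 0
	}
	return dcg(got) / idcg
}

// meanNDCGSketch averages per-query scores (assumed aggregation).
func meanNDCGSketch(gold []map[int]float64, predicted [][]int, k int) float64 {
	n := len(gold)
	if len(predicted) < n {
		n = len(predicted)
	}
	if n == 0 {
		return 0
	}
	sum := 0.0
	for i := 0; i < n; i++ {
		sum += ndcgSketch(gold[i], predicted[i], k)
	}
	return sum / float64(n)
}

func main() {
	gold := []map[int]float64{
		{1: 3, 2: 2, 3: 1},
		{7: 1},
	}
	predicted := [][]int{
		{1, 2, 3}, // matches the ideal order
		{9, 8, 6}, // misses the only relevant doc
	}
	fmt.Printf("query 0: %.4f\n", ndcgSketch(gold[0], predicted[0], 3))
	fmt.Printf("query 1: %.4f\n", ndcgSketch(gold[1], predicted[1], 3))
	fmt.Printf("mean:    %.4f\n", meanNDCGSketch(gold, predicted, 3))
}
```

A ranking that matches the ideal order scores 1, and a ranking with no relevant docs in the top K scores 0, which is why the mean over the two example queries lands halfway between.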

Types

type EvalConfig

type EvalConfig struct {
	K          int
	NumQueries int
	Warmup     int
}

EvalConfig captures basic evaluation knobs.

type EvalGPUResult

type EvalGPUResult struct {
	K, NumQueries int
	P50SearchMs   float64
	P95SearchMs   float64
	P50EndToEndMs float64
	P95EndToEndMs float64
	QPS           float64
	RecallAtK     float64
}

func RunEvalGPU

func RunEvalGPU(model *gobed.EmbeddingModel, docs []gobed.Document, base EvalConfig, graph gobed.GPUCagraConfig) (EvalGPUResult, error)

type EvalResult

type EvalResult struct {
	K            int
	NumQueries   int
	P50LatencyMs float64
	P95LatencyMs float64
	QPS          float64
	RecallAtK    float64
	NDCGAtK      float64
}

EvalResult summarizes latency, throughput, recall, and NDCG metrics.

func RunEval

func RunEval(model *gobed.EmbeddingModel, docs []gobed.Document, cfg EvalConfig) (EvalResult, error)

RunEval builds an index with provided documents, then evaluates search latency and recall@K against a brute-force baseline on CPU. Embeddings are computed once for the baseline/queries.
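
A brute-force baseline of the kind RunEval compares against can be sketched as an exhaustive scan over all document embeddings. The function names below are hypothetical, and cosine similarity over float32 vectors is an assumption; the package may use a different similarity measure or vector type.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two vectors of equal length.
// Zero vectors yield 0 rather than NaN.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// bruteForceTopK scores every document against the query and returns the
// indices of the k most similar documents, best first.
func bruteForceTopK(query []float32, docs [][]float32, k int) []int {
	type scored struct {
		idx   int
		score float64
	}
	all := make([]scored, len(docs))
	for i, d := range docs {
		all[i] = scored{idx: i, score: cosine(query, d)}
	}
	// Stable sort keeps the original order for ties.
	sort.SliceStable(all, func(i, j int) bool { return all[i].score > all[j].score })
	if k > len(all) {
		k = len(all)
	}
	if k < 0 {
		k = 0
	}
	out := make([]int, k)
	for i := range out {
		out[i] = all[i].idx
	}
	return out
}

func main() {
	docs := [][]float32{{1, 0}, {0, 1}, {1, 1}}
	query := []float32{1, 0.1}
	fmt.Println(bruteForceTopK(query, docs, 2))
}
```

Because the scan is exact, its top-K list can serve as the ground truth when measuring the recall of an approximate index.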

type FSIndexer

type FSIndexer struct {
	// contains filtered or unexported fields
}

FSIndexer indexes a filesystem tree into a VectorIndex.

func NewFSIndexer

func NewFSIndexer(root string, model *gobed.EmbeddingModel, cfg gobed.VectorIndexConfig) (*FSIndexer, error)

NewFSIndexer creates a filesystem indexer with a fresh VectorIndex.

func (*FSIndexer) CreatedAt

func (f *FSIndexer) CreatedAt() time.Time

CreatedAt returns the current time for simple logging; it is reserved for future persistent state.

func (*FSIndexer) Index

func (f *FSIndexer) Index() *gobed.VectorIndex

Index exposes the underlying vector index (read-only usage recommended).

func (*FSIndexer) IndexAll

func (f *FSIndexer) IndexAll() (int, error)

IndexAll walks the root and indexes supported text files. It chunks files into paragraphs (blank-line separated) to improve recall.
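
The paragraph chunking described above might look like the following sketch. The chunk type and splitParagraphs function are hypothetical; they record a byte offset and length per paragraph, in the spirit of the Offset and Length fields on FileDoc, but the package's actual chunking rules are not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk is one blank-line-separated paragraph with its byte position.
type chunk struct {
	Offset int // byte offset of the paragraph in the file
	Length int // paragraph length in bytes
	Text   string
}

// splitParagraphs scans content line by line and emits one chunk per run of
// non-blank lines. Lines containing only whitespace count as blank.
func splitParagraphs(content string) []chunk {
	var chunks []chunk
	start := -1 // offset where the current paragraph began, -1 if none
	end := 0    // offset just past the last non-blank line
	flush := func() {
		if start >= 0 {
			chunks = append(chunks, chunk{start, end - start, content[start:end]})
			start = -1
		}
	}
	pos := 0
	for pos < len(content) {
		lineEnd, next := len(content), len(content)
		if nl := strings.IndexByte(content[pos:], '\n'); nl >= 0 {
			lineEnd = pos + nl
			next = lineEnd + 1
		}
		if strings.TrimSpace(content[pos:lineEnd]) == "" {
			flush() // a blank line closes the current paragraph
		} else {
			if start < 0 {
				start = pos
			}
			end = lineEnd
		}
		pos = next
	}
	flush() // close a paragraph that runs to end of input
	return chunks
}

func main() {
	for _, c := range splitParagraphs("alpha\nbeta\n\n\ngamma\n") {
		fmt.Printf("offset=%d length=%d text=%q\n", c.Offset, c.Length, c.Text)
	}
}
```

Keeping the offset and length alongside the text lets a search hit be mapped back to the exact span of the source file.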

func (*FSIndexer) Reindex

func (f *FSIndexer) Reindex() (int, error)

Reindex updates files whose mtime has changed. For now, it rebuilds all entries for each changed file.

func (*FSIndexer) Root

func (f *FSIndexer) Root() string

Root returns the root path of the indexer.

type FileDoc

type FileDoc struct {
	ID      int
	Path    string
	Offset  int // byte offset in file
	Length  int // bytes
	Text    string
	ModTime int64 // file mtime unix
}

FileDoc represents a chunk of a file indexed as a document.

Directories

Path Synopsis
benchmark_final.go - Final comprehensive benchmark for bed tool
cmd
benchmark command
tests
cmd command
test_gpu_benchmark.go - RTX 3090 performance test