Documentation ¶
Overview ¶
Package search provides full-text indexing with BM25 scoring.
Package search - HNSW configuration and quality presets.
Package search provides HNSW vector indexing for fast approximate nearest neighbor search.
HNSW Delete/Update Policy:
Delete:
- Remove() tombstones a vector via a dense `deleted []bool` flag
- Neighbor lists are not eagerly rewired (tombstones keep deletes cheap)
- Entry point is re-selected if the removed node was the entry point
Update:
- Current policy: Remove() + Add() pattern
- Call Remove(id) then Add(id, newVector) to update a vector
- This ensures the graph structure is correctly maintained
- Future: A dedicated Update() method may be added for efficiency
Graph Quality:
- High-churn workloads (many updates/deletes) can degrade graph quality
- Periodic rebuilds are recommended (see NORNICDB_VECTOR_ANN_REBUILD_INTERVAL)
- Rebuilds restore optimal graph structure and improve recall
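The tombstone mechanics above can be sketched in a few lines. This is a toy illustration of the policy (dense `deleted []bool` flags, Remove+Add updates, tombstone ratio driving rebuilds), not the package's actual HNSW code; the `toyIndex` type and its fields are invented here:

```go
package main

import "fmt"

// toyIndex illustrates tombstone deletes: Remove flips a flag instead of
// rewiring neighbor lists, and searches would skip flagged entries.
type toyIndex struct {
	ids     []string
	vecs    [][]float32
	deleted []bool // dense tombstone flags, as described above
	pos     map[string]int
}

func newToyIndex() *toyIndex { return &toyIndex{pos: map[string]int{}} }

func (t *toyIndex) Add(id string, vec []float32) {
	t.pos[id] = len(t.ids)
	t.ids = append(t.ids, id)
	t.vecs = append(t.vecs, vec)
	t.deleted = append(t.deleted, false)
}

// Remove tombstones the slot; it stays allocated until a rebuild.
func (t *toyIndex) Remove(id string) {
	if i, ok := t.pos[id]; ok {
		t.deleted[i] = true
		delete(t.pos, id)
	}
}

// Update follows the documented Remove() + Add() pattern.
func (t *toyIndex) Update(id string, vec []float32) {
	t.Remove(id)
	t.Add(id, vec)
}

// TombstoneRatio mirrors the idea behind HNSWIndex.TombstoneRatio:
// deleted slots / total slots, a signal for when to rebuild.
func (t *toyIndex) TombstoneRatio() float64 {
	if len(t.deleted) == 0 {
		return 0
	}
	n := 0
	for _, d := range t.deleted {
		if d {
			n++
		}
	}
	return float64(n) / float64(len(t.deleted))
}

func main() {
	idx := newToyIndex()
	idx.Add("a", []float32{1, 0})
	idx.Add("b", []float32{0, 1})
	idx.Update("a", []float32{0.5, 0.5}) // tombstones the old slot, appends a new one
	fmt.Printf("tombstone ratio: %.2f\n", idx.TombstoneRatio())
}
```

High-churn workloads accumulate tombstoned slots this way, which is why periodic rebuilds are recommended.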
Package search - Kalman filter integration for stable search ranking.
This file provides the KalmanAdapter that enhances the search service with:
- Similarity score smoothing: Reduce noise in vector similarity scores
- Ranking stability: Prevent results from jumping around between queries
- Trend detection: Identify documents becoming more/less relevant
- Latency prediction: Smooth query latency for better resource planning
Why Smooth Search Scores? ¶
Vector similarity scores can be noisy due to:
- Embedding model variability (same query, slightly different embeddings)
- Approximation in ANN (approximate nearest neighbor) indexes
- Edge cases where documents are equidistant from query
Kalman filtering smooths these variations to provide:
- More consistent rankings across similar queries
- Gradual relevance transitions (not sudden jumps)
- Confidence estimates for borderline results
Integration Architecture ¶
┌─────────────────────────────────────────────────────────────┐
│                    Kalman Search Adapter                    │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐   ┌─────────────────────────────────┐   │
│ │ Search Service  │──▶│ Kalman Filter (scores)          │   │
│ │ (raw scores)    │   │ - Per-document filters          │   │
│ └────────┬────────┘   │ - Score history smoothing       │   │
│          │            └─────────────────────────────────┘   │
│          ▼                                                  │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Ranking Stabilizer                                      │ │
│ │ • Detect score ties                                     │ │
│ │ • Apply stability boost to consistent docs              │ │
│ │ • Prevent rank oscillation                              │ │
│ └─────────────────────────────────────────────────────────┘ │
│          │                                                  │
│          ▼                                                  │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Latency Predictor                                       │ │
│ │ • Smooth query latency measurements                     │ │
│ │ • Predict P99 latency                                   │ │
│ │ • Resource scaling suggestions                          │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
ELI12 (Explain Like I'm 12) ¶
Imagine you're searching Google for "cute cats":
**Without Kalman:**
- Search 1: Cat A is #1, Cat B is #2
- Search 2: Cat B is #1, Cat A is #2 (they swapped!)
- Search 3: Cat C is #1 (where did that come from?!)
**With Kalman:**
- Kalman remembers: "Cat A has been top 3 for a while"
- When Cat B briefly gets a tiny score boost, Kalman says: "That's probably noise. Cat A stays #1 until I see a REAL trend."
- Result: Rankings stay consistent, you find things where you expect them!
It's like having a wise librarian who remembers "this book was popular before" instead of reshuffling the entire library every time someone asks a question.
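The smoothing idea can be shown with a toy scalar Kalman filter. This is not the package's KalmanSearchAdapter (which keeps one filter per document and exposes trend queries); the type and parameter names below are invented for illustration:

```go
package main

import "fmt"

// kalman1D is a minimal scalar Kalman filter for a roughly constant
// signal, here a document's "true" similarity score observed with noise.
type kalman1D struct {
	x, p float64 // state estimate and its variance
	q, r float64 // process noise and measurement noise
}

// update folds in one noisy observation z and returns the smoothed score.
func (k *kalman1D) update(z float64) float64 {
	k.p += k.q             // predict: uncertainty grows by process noise
	g := k.p / (k.p + k.r) // Kalman gain: how much to trust z
	k.x += g * (z - k.x)   // correct the estimate toward the observation
	k.p *= 1 - g           // corrected estimate is less uncertain
	return k.x
}

func main() {
	f := &kalman1D{x: 0.80, p: 1, q: 0.001, r: 0.05}
	// Noisy observations of a score that is "really" about 0.80.
	for _, z := range []float64{0.82, 0.78, 0.86, 0.79, 0.81} {
		fmt.Printf("raw %.2f -> smoothed %.3f\n", z, f.update(z))
	}
}
```

With a small process noise q, a brief spike in z moves the smoothed score only slightly, which is exactly why Cat A keeps its rank until a real trend emerges.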
Package search - K-means cluster routing candidate generator.
This file implements KMeansCandidateGen, which uses k-means clustering to route queries to the most relevant clusters, then generates candidates from those clusters. This is optimal for very large datasets (N > 100K) where cluster routing provides significant speedup over HNSW.
Trigger Policies:
K-means clustering is triggered automatically:
- After bulk loads: When BuildIndexes() completes, clustering runs automatically
- Periodic clustering: Background timer runs clustering at regular intervals (configurable via clustering interval)
- Manual trigger: Call TriggerClustering() after bulk data loading
The candidate generator automatically uses k-means routing when:
- Clustering is enabled (EnableClustering() called)
- ClusterIndex is clustered (Cluster() has been run)
- Dataset is large enough to benefit (typically N > 100K)
For smaller datasets, the pipeline automatically falls back to HNSW or brute-force.
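The routing step itself is simple: assign the query to its nearest centroid and search only that cluster's members. A minimal sketch of the idea (assuming unit-normalized vectors so dot product ranks by cosine; not the package's KMeansCandidateGen):

```go
package main

import "fmt"

// dot assumes unit-normalized vectors, so a higher dot product means
// higher cosine similarity.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// routeQuery picks the centroid most similar to the query. A real
// pipeline would then scan only that cluster's members (or the top few
// clusters) instead of the whole dataset.
func routeQuery(query []float32, centroids [][]float32) int {
	best, bestScore := -1, float32(-2)
	for i, c := range centroids {
		if s := dot(query, c); s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	centroids := [][]float32{{1, 0}, {0, 1}}
	q := []float32{0.9, 0.1}
	fmt.Println("route to cluster", routeQuery(q, centroids)) // prints "route to cluster 0"
}
```

The speedup comes from scanning one cluster of roughly N/k vectors instead of all N, which is why the benefit only shows up once N is large.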
Package search provides cross-encoder reranking for improved search quality.
Cross-encoder reranking is a two-stage retrieval approach:
Stage 1 (Fast): Bi-encoder retrieval
- Vector similarity + BM25
- Uses pre-computed embeddings for O(log N) search
- Returns top-K candidates (e.g., 100)
Stage 2 (Accurate): Cross-encoder reranking
- Passes (query, document) pairs through a cross-encoder model
- Model sees both query and document together → more accurate
- Re-scores and re-ranks the top-K candidates
- Returns final top-N results (e.g., 10)
Why Cross-Encoder?
Bi-encoders (like embeddings) encode query and document separately:
query_emb = encode(query)
doc_emb   = encode(document)
score     = cosine(query_emb, doc_emb)
Cross-encoders encode them together:
score = cross_encode(query, document) // sees interaction!
The cross-encoder can capture fine-grained semantic relationships that bi-encoders miss, but it's slower (O(N) vs O(log N)).
ELI12 (Explain Like I'm 12):
Imagine finding a book in a library:
Bi-encoder (Stage 1): Like using the card catalog. Fast lookup by category/keywords, but might miss nuances.
Cross-encoder (Stage 2): Like actually reading each book's summary. More accurate, but takes longer. So we only do it for the top candidates from Stage 1.
Reference: Nogueira & Cho (2019) "Passage Re-ranking with BERT"
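The two-stage shape can be sketched end to end. This is a toy: a plain float64 stands in for the Stage-1 bi-encoder score, and term overlap stands in for the cross-encoder (the real one is a learned model); the `doc`, `crossScore`, and `rerank` names are invented here:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type doc struct {
	ID, Text string
	Score    float64
}

// crossScore is a stand-in for the cross-encoder: unlike a bi-encoder,
// it sees query and document together. Here it just counts shared terms.
func crossScore(query string, d doc) float64 {
	n := 0.0
	text := strings.ToLower(d.Text)
	for _, w := range strings.Fields(strings.ToLower(query)) {
		if strings.Contains(text, w) {
			n++
		}
	}
	return n
}

// rerank keeps the Stage-1 top-K, then spends the expensive scorer on
// only those K candidates before the final sort.
func rerank(query string, stage1 []doc, topK int) []doc {
	sort.SliceStable(stage1, func(i, j int) bool { return stage1[i].Score > stage1[j].Score })
	if len(stage1) > topK {
		stage1 = stage1[:topK]
	}
	for i := range stage1 {
		stage1[i].Score = crossScore(query, stage1[i])
	}
	sort.SliceStable(stage1, func(i, j int) bool { return stage1[i].Score > stage1[j].Score })
	return stage1
}

func main() {
	docs := []doc{
		{"a", "passage re-ranking with BERT", 0.90},
		{"b", "cooking pasta at home", 0.88},
		{"c", "BERT passage retrieval notes", 0.85},
	}
	for _, d := range rerank("BERT passage", docs, 3) {
		fmt.Printf("%s %.0f\n", d.ID, d.Score)
	}
}
```

Note how "b" ranked high in Stage 1 but drops to the bottom once the scorer actually looks at the query and text together — the O(N) cost is paid only for the K survivors.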
Package search provides unified hybrid search with Reciprocal Rank Fusion (RRF).
This package implements the same hybrid search approach used by production systems like Azure AI Search, Elasticsearch, and Weaviate: combining vector similarity search with BM25 full-text search using Reciprocal Rank Fusion.
Search Capabilities:
- Vector similarity search (cosine similarity with HNSW index)
- BM25 full-text search (keyword matching with TF-IDF)
- RRF hybrid search (fuses vector + BM25 results)
- Adaptive weighting based on query characteristics
- Automatic fallback when one method fails
Example Usage:
// Create search service
svc := search.NewService(storageEngine)
// Build indexes from existing nodes
if err := svc.BuildIndexes(ctx); err != nil {
log.Fatal(err)
}
// Perform hybrid search
query := "machine learning algorithms"
embedding := embedder.Embed(ctx, query) // Get from embed package
opts := search.DefaultSearchOptions()
response, err := svc.Search(ctx, query, embedding, opts)
if err != nil {
log.Fatal(err)
}
for _, result := range response.Results {
fmt.Printf("[%.3f] %s\n", result.RRFScore, result.Title)
}
How RRF Works:
RRF (Reciprocal Rank Fusion) combines rankings from multiple search methods. Instead of merging scores directly (which can be incomparable), RRF uses rank positions to create a unified ranking.
Formula: RRF_score = Σ (weight / (k + rank))
Where:
- k is a constant (typically 60) to reduce the impact of high ranks
- rank is the position in the result list (1-indexed)
- weight allows emphasizing one method over another
Example: A document ranked #1 in vector search and #3 in BM25:
RRF = (1.0 / (60 + 1)) + (1.0 / (60 + 3))
= (1.0 / 61) + (1.0 / 63)
= 0.0164 + 0.0159
= 0.0323
Documents that appear in both result sets get boosted scores.
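The formula and the worked example above translate directly into a few lines of Go (a sketch; the function name `rrfScore` is ours, not the package's API):

```go
package main

import "fmt"

// rrfScore computes RRF_score = Σ weight / (k + rank) over the ranks a
// document received from each method. Ranks are 1-indexed; a document
// missing from a method's list simply contributes no term.
func rrfScore(ranks []int, weights []float64, k float64) float64 {
	score := 0.0
	for i, r := range ranks {
		score += weights[i] / (k + float64(r))
	}
	return score
}

func main() {
	// The worked example: #1 in vector search, #3 in BM25, k = 60.
	s := rrfScore([]int{1, 3}, []float64{1.0, 1.0}, 60)
	fmt.Printf("RRF = %.4f\n", s) // 1/61 + 1/63 → prints "RRF = 0.0323"
}
```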
ELI12 (Explain Like I'm 12):
Imagine two friends ranking pizza places:
- Friend A (vector search) ranks by taste similarity to your favorite
- Friend B (BM25) ranks by matching your description "spicy pepperoni"
They might disagree! Friend A says place X is #1 (tastes similar), while Friend B says it's #5 (doesn't match keywords well).
RRF solves this by:
1. If a place appears in BOTH lists, it gets bonus points
2. Higher ranks (being at the top) give more points
3. The magic number 60 prevents #1 from completely dominating
This way, a place that's #2 in both lists beats a place that's #1 in one but missing from the other!
Index persistence and WAL alignment:
NornicDB storage uses a write-ahead log (WAL) for durability; graph state is recovered from WAL (and snapshots) on startup. Search indexes (BM25 and vector) are built or loaded after storage recovery, so they always reflect the WAL-consistent state. Index files use semver format versioning (Qdrant-style): only indexes with the same format version are loaded; older or newer versions are rejected so the caller can rebuild. Persisted indexes are saved on a debounced schedule after mutations and on shutdown.
Result caching:
Search() results are cached in-process by query + options (limit, types, rerank, MMR, etc.), with the same semantics as the Cypher query cache: LRU eviction (default 1000 entries), TTL (default 5 minutes), and full invalidation on IndexNode/RemoveNode so results stay correct after index changes. All call paths (HTTP search API, Cypher vector procedures, MCP, etc.) share this cache, so repeated identical searches return immediately.
Package search provides file-backed vector storage for memory-efficient indexing.
VectorFileStore implements append-only vector storage: vectors are written to a .vec file and only id→offset metadata is kept in RAM. This allows BuildIndexes to index large datasets without holding 2–3× vector data in memory. Vectors are stored normalized (one copy per id, cosine-only) per the indexing-memory plan.
Package search provides vector indexing with cosine similarity search.
This file implements a simple but effective vector similarity search index using brute-force cosine similarity calculation. While not as sophisticated as HNSW or other approximate methods, it provides exact results and is suitable for moderate-sized datasets.
Key Features:
- Exact cosine similarity search
- Automatic vector normalization for performance
- Thread-safe concurrent operations
- Context-aware search with cancellation
- Configurable similarity thresholds
Example Usage:
// Create vector index for 1024-dimensional embeddings
index := search.NewVectorIndex(1024)
// Add vectors to the index
embedding1 := make([]float32, 1024)
// ... populate embedding1 ...
index.Add("doc-1", embedding1)
embedding2 := make([]float32, 1024)
// ... populate embedding2 ...
index.Add("doc-2", embedding2)
// Search for similar vectors
query := make([]float32, 1024)
// ... populate query vector ...
results, err := index.Search(ctx, query, 10, 0.7)
if err != nil {
log.Fatal(err)
}
// Process results (sorted by similarity)
for _, result := range results {
fmt.Printf("ID: %s, Similarity: %.3f\n", result.ID, result.Score)
}
Algorithm Details:
The index uses cosine similarity, which measures the cosine of the angle between two vectors. It's particularly effective for high-dimensional embeddings where magnitude is less important than direction.
Cosine Similarity Formula:
similarity = (A · B) / (||A|| × ||B||)
Where:
- A · B is the dot product of vectors A and B
- ||A|| is the magnitude (L2 norm) of vector A
- ||B|| is the magnitude (L2 norm) of vector B
Optimization:
Vectors are normalized when added to the index, so cosine similarity becomes just the dot product, making search much faster.
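The optimization is easy to verify: after normalizing both vectors to unit length, their dot product equals the full cosine formula. A self-contained sketch (helper names are ours, not the index's API):

```go
package main

import (
	"fmt"
	"math"
)

// normalize scales v to unit length (L2 norm 1) in place, so a later
// dot product against another unit vector is exactly cosine similarity.
func normalize(v []float32) {
	var sum float64
	for _, x := range v {
		sum += float64(x) * float64(x)
	}
	n := float32(math.Sqrt(sum))
	if n == 0 {
		return
	}
	for i := range v {
		v[i] /= n
	}
}

// dot is the only work left per comparison at search time.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

func main() {
	a := []float32{3, 4}
	b := []float32{4, 3}
	normalize(a) // (0.6, 0.8)
	normalize(b) // (0.8, 0.6)
	// cosine of (3,4) vs (4,3) is (A·B)/(||A||×||B||) = 24/25 = 0.96
	fmt.Printf("%.2f\n", dot(a, b)) // prints "0.96"
}
```

Paying the normalization cost once per Add removes the two magnitude computations and the division from every comparison during Search.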
Performance Characteristics:
- Add: O(d) where d is vector dimensions
- Search: O(n×d) where n is number of vectors
- Memory: O(n×d) for storing normalized vectors
- Thread-safe: Uses RWMutex for concurrent access
When to Use:
✅ Exact similarity search required
✅ Moderate dataset sizes (<100K vectors)
✅ High-dimensional embeddings (>100 dimensions)
✅ Simplicity and reliability preferred
❌ Very large datasets (>1M vectors)
❌ Approximate search acceptable
❌ Sub-linear search time required
Future Improvements:
- HNSW implementation for O(log n) search
- GPU acceleration for large batches
- Quantization for memory efficiency
- Incremental index updates
ELI12 (Explain Like I'm 12):
Think of vector similarity like comparing the "direction" things are pointing:
**Vectors**: Like arrows pointing in different directions in space. Each arrow represents a document, image, or piece of text.
**Cosine similarity**: Measures how much two arrows point in the same direction. If they point exactly the same way, similarity = 1. If they point opposite ways, similarity = -1.
**Normalization**: We make all arrows the same length so we only care about direction, not how "strong" they are.
**Search**: When you give us a query arrow, we check how similar its direction is to all the arrows we've stored, then give you the most similar ones.
It's like finding documents that "point in the same direction" as your search!
Package search - Vector search pipeline with CandidateGen + ExactScore interfaces.
This file implements the unified vector search pipeline as described in the hybrid-ann-plan.md. It provides:
- CandidateGenerator interface for approximate candidate generation (brute/HNSW)
- ExactScorer interface for exact scoring of candidates (CPU/GPU)
- Auto strategy selection (brute vs HNSW based on dataset size)
- GPU-accelerated exact scoring when available
Index ¶
- Constants
- Variables
- func BuildIVFPQFromVectorStore(ctx context.Context, vfs *VectorFileStore, profile IVFPQProfile, ...) (*IVFPQIndex, *IVFPQBuildStats, error)
- func ConvertSearchIndexDirFromGobToMsgpack(rootDir string) error
- func DefaultBM25Engine() string
- func DeriveIVFCentroidsFromClusters(hnswPath string, vectorLookup VectorLookup) (centroids [][]float32, idToCluster map[string]int, err error)
- func SaveIVFHNSW(hnswPath string, clusterHNSW map[int]*HNSWIndex) error
- func SaveIVFPQBundle(basePath string, idx *IVFPQIndex) error
- type ANNQuality
- type ANNResult
- type BruteForceCandidateGen
- type BuildProgress
- type CPUExactScorer
- type Candidate
- type CandidateGenerator
- type CompressedANNProfile
- type CompressedActivationDiagnostic
- type CrossEncoder
- func (ce *CrossEncoder) Config() *CrossEncoderConfig
- func (ce *CrossEncoder) Enabled() bool
- func (ce *CrossEncoder) IsAvailable(ctx context.Context) bool
- func (ce *CrossEncoder) Name() string
- func (ce *CrossEncoder) Rerank(ctx context.Context, query string, candidates []RerankCandidate) ([]RerankResult, error)
- type CrossEncoderConfig
- type ExactScorer
- type FileStoreBruteForceCandidateGen
- type FulltextBatchEntry
- type FulltextIndex
- func (f *FulltextIndex) Clear()
- func (f *FulltextIndex) Count() int
- func (f *FulltextIndex) GetDocument(id string) (string, bool)
- func (f *FulltextIndex) Index(id string, text string)
- func (f *FulltextIndex) IndexBatch(entries []FulltextBatchEntry)
- func (f *FulltextIndex) IsDirty() bool
- func (f *FulltextIndex) LexicalSeedDocIDs(maxTerms, docsPerTerm int) []string
- func (f *FulltextIndex) Load(path string) error
- func (f *FulltextIndex) PhraseSearch(phrase string, limit int) []indexResult
- func (f *FulltextIndex) Remove(id string)
- func (f *FulltextIndex) Save(path string) error
- func (f *FulltextIndex) SaveNoCopy(path string) error
- func (f *FulltextIndex) Search(query string, limit int) []indexResult
- type FulltextIndexV2
- func (f *FulltextIndexV2) Clear()
- func (f *FulltextIndexV2) Count() int
- func (f *FulltextIndexV2) GetDocument(id string) (string, bool)
- func (f *FulltextIndexV2) Index(id string, text string)
- func (f *FulltextIndexV2) IndexBatch(entries []FulltextBatchEntry)
- func (f *FulltextIndexV2) IsDirty() bool
- func (f *FulltextIndexV2) LexicalSeedDocIDs(maxTerms, docsPerTerm int) []string
- func (f *FulltextIndexV2) Load(path string) error
- func (f *FulltextIndexV2) PhraseSearch(phrase string, limit int) []indexResult
- func (f *FulltextIndexV2) Remove(id string)
- func (f *FulltextIndexV2) Save(path string) error
- func (f *FulltextIndexV2) SaveNoCopy(path string) error
- func (f *FulltextIndexV2) Search(query string, limit int) []indexResult
- type GPUBruteForceCandidateGen
- type GPUExactScorer
- type GPUKMeansCandidateGen
- type HNSWCandidateGen
- type HNSWConfig
- type HNSWIndex
- func (h *HNSWIndex) Add(id string, vec []float32) error
- func (h *HNSWIndex) Clear()
- func (h *HNSWIndex) Config() HNSWConfig
- func (h *HNSWIndex) GetDimensions() int
- func (h *HNSWIndex) Remove(id string)
- func (h *HNSWIndex) Save(path string) error
- func (h *HNSWIndex) Search(ctx context.Context, query []float32, k int, minSimilarity float64) ([]ANNResult, error)
- func (h *HNSWIndex) SearchWithEf(ctx context.Context, query []float32, k int, minSimilarity float64, ef int) ([]ANNResult, error)
- func (h *HNSWIndex) SetVectorLookup(lookup VectorLookup)
- func (h *HNSWIndex) ShouldRebuild() bool
- func (h *HNSWIndex) Size() int
- func (h *HNSWIndex) TombstoneRatio() float64
- func (h *HNSWIndex) Update(id string, vec []float32) error
- type HNSWQualityPreset
- type IVFHNSWCandidateGen
- type IVFPQBuildStats
- type IVFPQCandidateGen
- type IVFPQIndex
- type IVFPQProfile
- type IdentityExactScorer
- type KMeansCandidateGen
- type KalmanSearchAdapter
- func (ka *KalmanSearchAdapter) GetDocumentRelevanceTrend(docID string) float64
- func (ka *KalmanSearchAdapter) GetFallingDocuments(maxVelocity float64) []string
- func (ka *KalmanSearchAdapter) GetLatencyTrend() float64
- func (ka *KalmanSearchAdapter) GetPredictedLatency(stepsAhead int) float64
- func (ka *KalmanSearchAdapter) GetRisingDocuments(minVelocity float64) []string
- func (ka *KalmanSearchAdapter) GetService() *Service
- func (ka *KalmanSearchAdapter) GetStats() SearchAdapterStats
- func (ka *KalmanSearchAdapter) Reset()
- func (ka *KalmanSearchAdapter) Search(ctx context.Context, query string, embedding []float32, opts *SearchOptions) (*SearchResponse, error)
- type KalmanSearchConfig
- type LLMFunc
- type LLMReranker
- type LLMRerankerConfig
- type LocalReranker
- type LocalRerankerConfig
- type NodeIterator
- type RerankCandidate
- type RerankResult
- type RerankScorer
- type Reranker
- type ScoredCandidate
- type SearchAdapterStats
- type SearchCandidate
- type SearchMetrics
- type SearchOptions
- type SearchResponse
- type SearchResult
- type Service
- func (s *Service) BM25Engine() string
- func (s *Service) BuildInProgress() bool
- func (s *Service) BuildIndexes(ctx context.Context) error
- func (s *Service) ClearVectorIndex()
- func (s *Service) ClusterStats() *gpu.ClusterStats
- func (s *Service) ClusteringInProgress() bool
- func (s *Service) CrossEncoderAvailable(ctx context.Context) bool
- func (s *Service) CurrentStrategy() string
- func (s *Service) EmbeddingCount() int
- func (s *Service) EnableClustering(gpuManager *gpu.Manager, numClusters int)
- func (s *Service) GetBuildProgress() BuildProgress
- func (s *Service) GetDefaultMinSimilarity() float64
- func (s *Service) GetMinEmbeddingsForClustering() int
- func (s *Service) IndexNode(node *storage.Node) error
- func (s *Service) IsClusteringEnabled() bool
- func (s *Service) IsReady() bool
- func (s *Service) PersistIndexesToDisk()
- func (s *Service) RemoveNode(nodeID storage.NodeID) error
- func (s *Service) RerankCandidates(ctx context.Context, query string, candidates []RerankCandidate, ...) ([]RerankResult, error)
- func (s *Service) RerankerAvailable(ctx context.Context) bool
- func (s *Service) Search(ctx context.Context, query string, embedding []float32, opts *SearchOptions) (*SearchResponse, error)
- func (s *Service) SetCrossEncoder(ce *CrossEncoder)
- func (s *Service) SetDefaultMinSimilarity(threshold float64)
- func (s *Service) SetFulltextIndexPath(path string)
- func (s *Service) SetGPUManager(manager *gpu.Manager)
- func (s *Service) SetHNSWIndexPath(path string)
- func (s *Service) SetMinEmbeddingsForClustering(threshold int)
- func (s *Service) SetPersistenceEnabled(enabled bool)
- func (s *Service) SetReranker(r Reranker)
- func (s *Service) SetVectorIndexPath(path string)
- func (s *Service) TriggerClustering(ctx context.Context) error
- func (s *Service) VectorIndexDimensions() int
- func (s *Service) VectorQueryNodes(ctx context.Context, queryEmbedding []float32, spec VectorQuerySpec) ([]VectorQueryHit, error)
- func (s *Service) VectorSearchCandidates(ctx context.Context, embedding []float32, opts *SearchOptions) ([]SearchCandidate, error)
- type VectorFileStore
- func (v *VectorFileStore) Add(id string, vec []float32) error
- func (v *VectorFileStore) Close() error
- func (v *VectorFileStore) CompactIfNeeded() (bool, error)
- func (v *VectorFileStore) Count() int
- func (v *VectorFileStore) GetBuildIndexedCount() int64
- func (v *VectorFileStore) GetDimensions() int
- func (v *VectorFileStore) GetVector(id string) ([]float32, bool)
- func (v *VectorFileStore) Has(id string) bool
- func (v *VectorFileStore) IterateChunked(chunkSize int, fn func(ids []string, vecs [][]float32) error) error
- func (v *VectorFileStore) Load() error
- func (v *VectorFileStore) Remove(id string) bool
- func (v *VectorFileStore) Save() error
- func (v *VectorFileStore) SetBuildIndexedCount(n int64)
- func (v *VectorFileStore) Sync() error
- type VectorFileStoreMeta
- type VectorGetter
- type VectorIndex
- func (v *VectorIndex) Add(id string, vec []float32) error
- func (v *VectorIndex) Clear()
- func (v *VectorIndex) Count() int
- func (v *VectorIndex) GetDimensions() int
- func (v *VectorIndex) GetVector(id string) ([]float32, bool)
- func (v *VectorIndex) HasVector(id string) bool
- func (v *VectorIndex) Load(path string) error
- func (v *VectorIndex) Remove(id string)
- func (v *VectorIndex) Save(path string) error
- func (v *VectorIndex) Search(ctx context.Context, query []float32, limit int, minSimilarity float64) ([]indexResult, error)
- type VectorLookup
- type VectorQueryHit
- type VectorQuerySpec
- type VectorSearchPipeline
Constants ¶
const (
	// EnvSearchBM25Engine selects the BM25 engine implementation.
	// Supported values: "v1", "v2". Default: "v2".
	EnvSearchBM25Engine = "NORNICDB_SEARCH_BM25_ENGINE"

	// EnvSearchLogTimings enables per-query timing logs for search stages.
	// When true, logs vector/BM25/fusion/total timing to help identify bottlenecks.
	EnvSearchLogTimings = "NORNICDB_SEARCH_LOG_TIMINGS"

	BM25EngineV1 = "v1"
	BM25EngineV2 = "v2"
)
const (
	// NSmallMax is the maximum dataset size for which brute-force search is used.
	// Below this threshold, brute-force is often faster than ANN overhead.
	NSmallMax = 5000

	// CandidateMultiplier determines how many candidates to generate relative to k.
	// Formula: C = max(k * CandidateMultiplier, 200)
	CandidateMultiplier = 20

	// MaxCandidates is the hard cap on candidate set size.
	MaxCandidates = 5000
)
Configuration constants for the vector search pipeline.
const DefaultMinEmbeddingsForClustering = 1000
DefaultMinEmbeddingsForClustering is the default minimum number of embeddings needed before k-means clustering provides any benefit. Below this threshold, brute-force search is faster than cluster overhead.
This value can be overridden per-service using SetMinEmbeddingsForClustering().
Performance Scaling (Real-World Benchmarks):
- <1000 embeddings: Clustering overhead > speedup benefit
- 2,000 embeddings: ~14% faster with clustering
- 4,500 embeddings: ~26% faster with clustering
- 10,000+ embeddings: 10-50x faster with clustering
Tuning Guidelines:
- 1000 (default): Safe for most workloads, proven performance benefit
- 500-1000: Use for latency-sensitive apps (14-26% speedup range)
- 100-500: Testing or small datasets (verify clustering works)
- 2000+: Very large datasets (maximize speedup, delay until more data)
Environment Variable: NORNICDB_KMEANS_MIN_EMBEDDINGS (overrides default)
const DefaultVectorDimensions = 1024
DefaultVectorDimensions is the embedding size used by NewService (e.g. bge-m3). Use NewServiceWithDimensions or schema/query-derived dimensions when your model differs.
Variables ¶
var (
ErrDimensionMismatch = errors.New("vector dimension mismatch")
)
var ErrSearchIndexBuilding = errors.New("search index being built, please try again when they are complete")
var SearchableProperties = []string{
"content",
"text",
"title",
"name",
"description",
"path",
"workerRole",
"requirements",
}
SearchableProperties defines PRIORITY properties for full-text search ranking. These properties are indexed first for better BM25 ranking. Note: ALL node properties are indexed, but these get priority weighting. These match Neo4j's fulltext index configuration.
Functions ¶
func BuildIVFPQFromVectorStore ¶
func BuildIVFPQFromVectorStore(ctx context.Context, vfs *VectorFileStore, profile IVFPQProfile, seedDocIDs []string) (*IVFPQIndex, *IVFPQBuildStats, error)
BuildIVFPQFromVectorStore builds a full IVF/PQ index from VectorFileStore.
func ConvertSearchIndexDirFromGobToMsgpack ¶
func ConvertSearchIndexDirFromGobToMsgpack(rootDir string) error
ConvertSearchIndexDirFromGobToMsgpack converts search index .gob files under rootDir to msgpack and removes the .gob extension (writes to bm25, vectors, hnsw, hnsw_ivf/0, 1, ...). rootDir should be the search index root (e.g. data/test/search). Back up the directory before running.
func DefaultBM25Engine ¶
func DefaultBM25Engine() string
DefaultBM25Engine returns the configured BM25 engine from environment. Valid values: "v1", "v2". Invalid/empty values fall back to "v2".
func DeriveIVFCentroidsFromClusters ¶
func DeriveIVFCentroidsFromClusters(hnswPath string, vectorLookup VectorLookup) (centroids [][]float32, idToCluster map[string]int, err error)
DeriveIVFCentroidsFromClusters builds centroids and idToCluster from existing hnsw_ivf/ cluster files (numeric names 0, 1, 2, ...) and vectors from the vector index. No separate centroid file. Returns (nil, nil, nil) if no cluster files exist or derivation fails.
func SaveIVFHNSW ¶
func SaveIVFHNSW(hnswPath string, clusterHNSW map[int]*HNSWIndex) error
SaveIVFHNSW persists per-cluster HNSW indexes to disk under hnsw_ivf/ alongside hnsw. hnswPath is the full path to the single HNSW file (e.g. data/search/dbname/hnsw); per-cluster files are written to hnsw_ivf/0, 1, 2, ... (no extension). Each cluster is saved as graph-only.
func SaveIVFPQBundle ¶
func SaveIVFPQBundle(basePath string, idx *IVFPQIndex) error
SaveIVFPQBundle persists an IVFPQ index as an atomic multipart bundle.
Types ¶
type ANNQuality ¶
type ANNQuality string
ANNQuality selects the high-level ANN strategy mode.
const (
	ANNQualityFast       ANNQuality = "fast"
	ANNQualityBalanced   ANNQuality = "balanced"
	ANNQualityAccurate   ANNQuality = "accurate"
	ANNQualityCompressed ANNQuality = "compressed"
)
func ANNQualityFromEnv ¶
func ANNQualityFromEnv() ANNQuality
ANNQualityFromEnv parses the global ANN quality selector. Unknown values default to fast to preserve historical behavior.
type ANNResult ¶
ANNResult is a minimal search result from the ANN index (HNSW).
This intentionally stays small (ID + float32 score) to keep per-request allocations and copy costs low. Higher-level layers can enrich results as needed (labels, properties, etc.).
type BruteForceCandidateGen ¶
type BruteForceCandidateGen struct {
// contains filtered or unexported fields
}
BruteForceCandidateGen implements CandidateGenerator using brute-force search.
This is optimal for small datasets (N < NSmallMax) where ANN overhead dominates brute-force computation time.
func NewBruteForceCandidateGen ¶
func NewBruteForceCandidateGen(vectorIndex *VectorIndex) *BruteForceCandidateGen
NewBruteForceCandidateGen creates a new brute-force candidate generator.
func (*BruteForceCandidateGen) SearchCandidates ¶
func (b *BruteForceCandidateGen) SearchCandidates(ctx context.Context, query []float32, k int, minSimilarity float64) ([]Candidate, error)
SearchCandidates generates candidates using brute-force search.
type BuildProgress ¶
type BuildProgress struct {
Ready bool
Building bool
Phase string
ProcessedNodes int64
TotalNodes int64
RateNodesPerSec float64
ETASeconds int64 // -1 when unknown
}
BuildProgress reports in-progress indexing state for UI/ops visibility.
type CPUExactScorer ¶
type CPUExactScorer struct {
// contains filtered or unexported fields
}
CPUExactScorer implements ExactScorer using CPU-based exact scoring.
Uses SIMD-optimized dot product for cosine similarity computation.
func NewCPUExactScorer ¶
func NewCPUExactScorer(getter VectorGetter) *CPUExactScorer
NewCPUExactScorer creates a new CPU-based exact scorer. getter can be *VectorIndex or any type that implements GetVector (e.g. file-store lookup adapter).
func (*CPUExactScorer) ScoreCandidates ¶
func (c *CPUExactScorer) ScoreCandidates(ctx context.Context, query []float32, candidates []Candidate) ([]ScoredCandidate, error)
ScoreCandidates computes exact scores for candidates using CPU.
type CandidateGenerator ¶
type CandidateGenerator interface {
// SearchCandidates generates candidate vectors for the given query.
//
// Parameters:
// - ctx: Context for cancellation
// - query: Normalized query vector
// - k: Desired number of results
// - minSimilarity: Minimum similarity threshold (approximate, may be relaxed)
//
// Returns:
// - candidates: Candidate IDs with approximate scores (may be more than k)
// - error: Context cancellation or other errors
SearchCandidates(ctx context.Context, query []float32, k int, minSimilarity float64) ([]Candidate, error)
}
CandidateGenerator generates candidate vectors for approximate search.
Implementations:
- BruteForceCandidateGen: Exact search over all vectors (for small N)
- HNSWCandidateGen: Approximate search using HNSW graph (for large N)
The generator returns candidate IDs and approximate scores. These candidates will be re-scored exactly by ExactScorer before final ranking.
type CompressedANNProfile ¶
type CompressedANNProfile struct {
Quality ANNQuality
Active bool
Dimensions int
VectorCount int
IVFLists int
PQSegments int
PQBits int
NProbe int
RerankTopK int
TrainingSampleMax int
KMeansMaxIterations int
SeedMaxTerms int
SeedDocsPerTerm int
RoutingMode string
Diagnostics []CompressedActivationDiagnostic
}
CompressedANNProfile is the resolved runtime contract for compressed ANN mode.
func ResolveCompressedANNProfile ¶
func ResolveCompressedANNProfile(vectorCount, dimensions int, vectorStoreReady bool) CompressedANNProfile
ResolveCompressedANNProfile resolves compressed ANN settings and readiness. Shared knobs are reused directly where semantically compatible.
type CompressedActivationDiagnostic ¶
CompressedActivationDiagnostic captures deterministic activation decisions.
type CrossEncoder ¶
type CrossEncoder struct {
// contains filtered or unexported fields
}
CrossEncoder performs cross-encoder reranking.
func NewCrossEncoder ¶
func NewCrossEncoder(config *CrossEncoderConfig) *CrossEncoder
NewCrossEncoder creates a new cross-encoder reranker.
func (*CrossEncoder) Config ¶
func (ce *CrossEncoder) Config() *CrossEncoderConfig
Config returns the current configuration.
func (*CrossEncoder) Enabled ¶
func (ce *CrossEncoder) Enabled() bool
func (*CrossEncoder) IsAvailable ¶
func (ce *CrossEncoder) IsAvailable(ctx context.Context) bool
IsAvailable checks if the reranking service is available.
func (*CrossEncoder) Name ¶
func (ce *CrossEncoder) Name() string
func (*CrossEncoder) Rerank ¶
func (ce *CrossEncoder) Rerank(ctx context.Context, query string, candidates []RerankCandidate) ([]RerankResult, error)
Rerank takes a query and candidates, returns reranked results.
type CrossEncoderConfig ¶
type CrossEncoderConfig struct {
// Enabled turns on cross-encoder reranking
Enabled bool
// APIURL is the reranking service endpoint
// Supports: Cohere, HuggingFace TEI, local models
APIURL string
// APIKey for authentication (if required)
APIKey string
// Model name (e.g., "cross-encoder/ms-marco-MiniLM-L-6-v2")
Model string
// TopK is how many candidates to rerank (default: 100)
TopK int
// Timeout for reranking requests
Timeout time.Duration
// MinScore is the minimum rerank score to include (0-1)
MinScore float64
}
CrossEncoderConfig configures the cross-encoder reranker.
func DefaultCrossEncoderConfig ¶
func DefaultCrossEncoderConfig() *CrossEncoderConfig
DefaultCrossEncoderConfig returns sensible defaults.
type ExactScorer ¶
type ExactScorer interface {
// ScoreCandidates computes exact similarity scores for the given candidates.
//
// Parameters:
// - ctx: Context for cancellation
// - query: Normalized query vector
// - candidates: Candidate IDs to score (may include approximate scores for reference)
//
// Returns:
// - scored: Candidates with exact scores
// - error: Context cancellation or other errors
ScoreCandidates(ctx context.Context, query []float32, candidates []Candidate) ([]ScoredCandidate, error)
}
ExactScorer computes exact similarity scores for candidate vectors.
Implementations:
- CPUExactScorer: CPU-based exact scoring (SIMD-optimized)
- GPUExactScorer: GPU-accelerated exact scoring (when available)
The scorer takes candidate IDs and returns exact scores using the true metric (cosine/dot/euclid), ensuring final ranking accuracy.
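The re-scoring step an ExactScorer performs can be sketched in plain Go. The structs below are hypothetical stand-ins (the package's actual Candidate/ScoredCandidate field sets are unexported or not shown here); the cosine computation is the standard exact metric a CPU implementation would use.

```go
package main

import (
	"fmt"
	"math"
)

// candidate / scoredCandidate are illustrative stand-ins for the
// package's Candidate and ScoredCandidate types (real fields not shown).
type candidate struct {
	ID    string
	Score float64 // approximate score from the candidate generator
}

type scoredCandidate struct {
	ID    string
	Score float64 // exact score
}

// cosine computes exact cosine similarity between two vectors.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// scoreCandidates re-scores each candidate with the true metric,
// skipping IDs whose vectors have disappeared since generation.
func scoreCandidates(query []float32, lookup map[string][]float32, cands []candidate) []scoredCandidate {
	out := make([]scoredCandidate, 0, len(cands))
	for _, c := range cands {
		vec, ok := lookup[c.ID]
		if !ok {
			continue // vector deleted since candidate generation
		}
		out = append(out, scoredCandidate{ID: c.ID, Score: cosine(query, vec)})
	}
	return out
}

func main() {
	lookup := map[string][]float32{"a": {1, 0}, "b": {0, 1}}
	got := scoreCandidates([]float32{1, 0}, lookup, []candidate{{ID: "a"}, {ID: "b"}})
	fmt.Printf("%s=%.2f %s=%.2f\n", got[0].ID, got[0].Score, got[1].ID, got[1].Score)
}
```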
type FileStoreBruteForceCandidateGen ¶
type FileStoreBruteForceCandidateGen struct {
// contains filtered or unexported fields
}
FileStoreBruteForceCandidateGen implements CandidateGenerator using brute-force search directly over the file-backed vector store. This is used for small datasets when vectors are not held in-memory.
func NewFileStoreBruteForceCandidateGen ¶
func NewFileStoreBruteForceCandidateGen(vectorStore *VectorFileStore) *FileStoreBruteForceCandidateGen
NewFileStoreBruteForceCandidateGen creates a brute-force candidate generator over VectorFileStore.
func (*FileStoreBruteForceCandidateGen) SearchCandidates ¶
func (b *FileStoreBruteForceCandidateGen) SearchCandidates(ctx context.Context, query []float32, k int, minSimilarity float64) ([]Candidate, error)
SearchCandidates scans all vectors from the file store and returns top candidates by exact cosine score.
type FulltextBatchEntry ¶
FulltextBatchEntry is one (id, text) pair for IndexBatch.
type FulltextIndex ¶
type FulltextIndex struct {
// contains filtered or unexported fields
}
FulltextIndex provides BM25-based full-text search. It indexes documents and supports keyword search scored with BM25 (a TF-IDF-family ranking function).
func NewFulltextIndex ¶
func NewFulltextIndex() *FulltextIndex
NewFulltextIndex creates a new full-text search index.
func (*FulltextIndex) Clear ¶
func (f *FulltextIndex) Clear()
Clear removes all indexed documents and terms. This is used to free RAM after persisting during bulk builds.
func (*FulltextIndex) Count ¶
func (f *FulltextIndex) Count() int
Count returns the number of indexed documents.
func (*FulltextIndex) GetDocument ¶
func (f *FulltextIndex) GetDocument(id string) (string, bool)
GetDocument retrieves the original text for a document.
func (*FulltextIndex) Index ¶
func (f *FulltextIndex) Index(id string, text string)
Index adds or updates a document in the index.
func (*FulltextIndex) IndexBatch ¶
func (f *FulltextIndex) IndexBatch(entries []FulltextBatchEntry)
IndexBatch adds or updates many documents under one lock and updates avgDocLength once at the end. Use this during bulk build to reduce lock contention and avoid O(N) avg updates.
func (*FulltextIndex) IsDirty ¶
func (f *FulltextIndex) IsDirty() bool
IsDirty reports whether the index has changes not yet persisted to disk.
func (*FulltextIndex) LexicalSeedDocIDs ¶
func (f *FulltextIndex) LexicalSeedDocIDs(maxTerms, docsPerTerm int) []string
LexicalSeedDocIDs returns document IDs selected from high-IDF terms to provide lexical priors for clustering seed selection.
func (*FulltextIndex) Load ¶
func (f *FulltextIndex) Load(path string) error
Load replaces the fulltext index with the one stored at path (msgpack format). If the file does not exist, the index is left unchanged; if decoding fails, the index is left empty. In both cases no error is returned, so the caller can proceed with a full rebuild. Returns an error only for unexpected I/O (e.g. permission denied).
func (*FulltextIndex) PhraseSearch ¶
func (f *FulltextIndex) PhraseSearch(phrase string, limit int) []indexResult
PhraseSearch searches for an exact phrase match. Returns documents containing the exact phrase.
func (*FulltextIndex) Remove ¶
func (f *FulltextIndex) Remove(id string)
Remove removes a document from the index.
func (*FulltextIndex) Save ¶
func (f *FulltextIndex) Save(path string) error
Save writes the fulltext index to path (msgpack format). Dir is created if needed. Copies index data under a short read lock so I/O does not block Search/IndexNode.
func (*FulltextIndex) SaveNoCopy ¶
func (f *FulltextIndex) SaveNoCopy(path string) error
SaveNoCopy writes the fulltext index to path without deep-copying the maps. This avoids doubling memory during large builds, but holds the read lock during I/O. Use during BuildIndexes when search is not serving live traffic.
func (*FulltextIndex) Search ¶
func (f *FulltextIndex) Search(query string, limit int) []indexResult
Search performs BM25 keyword search. Returns results sorted by BM25 score (highest first).
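The BM25 scoring behind Search can be illustrated with the textbook per-term formula. This is a sketch using the common defaults k1=1.2 and b=0.75; the package's exact parameter choices are not shown in these docs.

```go
package main

import (
	"fmt"
	"math"
)

// bm25Term scores one query term against one document using the standard
// BM25 formula. tf is the term frequency in the document, df the number
// of documents containing the term. (Illustrative only; the index's
// actual k1/b values are not documented here.)
func bm25Term(tf, docLen, avgDocLen float64, df, totalDocs int) float64 {
	const k1, b = 1.2, 0.75
	idf := math.Log(1 + (float64(totalDocs)-float64(df)+0.5)/(float64(df)+0.5))
	norm := tf + k1*(1-b+b*docLen/avgDocLen)
	return idf * tf * (k1 + 1) / norm
}

func main() {
	// A rare term (df=1 of 1000 docs) outscores a common one (df=900).
	rare := bm25Term(2, 100, 100, 1, 1000)
	common := bm25Term(2, 100, 100, 900, 1000)
	fmt.Printf("rare=%.3f common=%.3f\n", rare, common)
}
```

This is why IndexBatch updates avgDocLength once at the end of a bulk load: every per-term score depends on it, so recomputing it per document is wasted work.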
type FulltextIndexV2 ¶
type FulltextIndexV2 struct {
// contains filtered or unexported fields
}
FulltextIndexV2 provides a BM25 index optimized for large datasets. It stores compact postings (docNum/tf) and uses bounded prefix expansion + top-k scoring.
func NewFulltextIndexV2 ¶
func NewFulltextIndexV2() *FulltextIndexV2
func (*FulltextIndexV2) Clear ¶
func (f *FulltextIndexV2) Clear()
func (*FulltextIndexV2) Count ¶
func (f *FulltextIndexV2) Count() int
func (*FulltextIndexV2) GetDocument ¶
func (f *FulltextIndexV2) GetDocument(id string) (string, bool)
func (*FulltextIndexV2) Index ¶
func (f *FulltextIndexV2) Index(id string, text string)
func (*FulltextIndexV2) IndexBatch ¶
func (f *FulltextIndexV2) IndexBatch(entries []FulltextBatchEntry)
func (*FulltextIndexV2) IsDirty ¶
func (f *FulltextIndexV2) IsDirty() bool
func (*FulltextIndexV2) LexicalSeedDocIDs ¶
func (f *FulltextIndexV2) LexicalSeedDocIDs(maxTerms, docsPerTerm int) []string
func (*FulltextIndexV2) Load ¶
func (f *FulltextIndexV2) Load(path string) error
func (*FulltextIndexV2) PhraseSearch ¶
func (f *FulltextIndexV2) PhraseSearch(phrase string, limit int) []indexResult
func (*FulltextIndexV2) Remove ¶
func (f *FulltextIndexV2) Remove(id string)
func (*FulltextIndexV2) Save ¶
func (f *FulltextIndexV2) Save(path string) error
func (*FulltextIndexV2) SaveNoCopy ¶
func (f *FulltextIndexV2) SaveNoCopy(path string) error
func (*FulltextIndexV2) Search ¶
func (f *FulltextIndexV2) Search(query string, limit int) []indexResult
type GPUBruteForceCandidateGen ¶
type GPUBruteForceCandidateGen struct {
// contains filtered or unexported fields
}
GPUBruteForceCandidateGen uses gpu.EmbeddingIndex.Search() as an exact candidate generator.
Note: gpu.EmbeddingIndex.Search() does not currently accept a context, so this generator cannot cancel mid-kernel. It checks ctx before invoking the search.
func NewGPUBruteForceCandidateGen ¶
func NewGPUBruteForceCandidateGen(embeddingIndex *gpu.EmbeddingIndex) *GPUBruteForceCandidateGen
func (*GPUBruteForceCandidateGen) SearchCandidates ¶
type GPUExactScorer ¶
type GPUExactScorer struct {
// contains filtered or unexported fields
}
GPUExactScorer implements ExactScorer using GPU-accelerated exact scoring.
Falls back to CPU if GPU is unavailable or unhealthy.
func NewGPUExactScorer ¶
func NewGPUExactScorer(embeddingIndex *gpu.EmbeddingIndex, cpuFallback *CPUExactScorer) *GPUExactScorer
NewGPUExactScorer creates a new GPU-based exact scorer.
func (*GPUExactScorer) ScoreCandidates ¶
func (g *GPUExactScorer) ScoreCandidates(ctx context.Context, query []float32, candidates []Candidate) ([]ScoredCandidate, error)
ScoreCandidates computes exact scores for candidates using GPU when available.
Note: Since EmbeddingIndex.Search() searches all vectors, we use it for candidate scoring by searching with a large k, then filtering to only the candidates we care about. This is not optimal but works with the current EmbeddingIndex API. A future PR will add ScoreSubset() for true subset scoring.
type GPUKMeansCandidateGen ¶
type GPUKMeansCandidateGen struct {
// contains filtered or unexported fields
}
GPUKMeansCandidateGen routes queries to the nearest k-means clusters and then scores only the vectors in those clusters using the GPU embedding index when available.
This is used as a high-throughput fallback when:
- GPU brute-force is enabled but the dataset is outside the configured full-scan range, and
- clustering is available (centroids + cluster membership already built).
It preserves correctness by returning exact cosine scores for the candidate set. When the GPU cannot score (not synced / unhealthy), ScoreSubset falls back to CPU.
func NewGPUKMeansCandidateGen ¶
func NewGPUKMeansCandidateGen(clusterIndex *gpu.ClusterIndex, numClustersToSearch int) *GPUKMeansCandidateGen
func (*GPUKMeansCandidateGen) SearchCandidates ¶
func (*GPUKMeansCandidateGen) SetClusterSelector ¶
func (g *GPUKMeansCandidateGen) SetClusterSelector(fn func(ctx context.Context, query []float32, defaultN int) []int) *GPUKMeansCandidateGen
SetClusterSelector sets an optional custom cluster selector used for routing.
type HNSWCandidateGen ¶
type HNSWCandidateGen struct {
// contains filtered or unexported fields
}
HNSWCandidateGen implements CandidateGenerator using HNSW approximate search.
This is optimal for large datasets (N >= NSmallMax) where ANN provides significant speedup over brute-force.
func NewHNSWCandidateGen ¶
func NewHNSWCandidateGen(hnswIndex *HNSWIndex) *HNSWCandidateGen
NewHNSWCandidateGen creates a new HNSW candidate generator.
func (*HNSWCandidateGen) SearchCandidates ¶
func (h *HNSWCandidateGen) SearchCandidates(ctx context.Context, query []float32, k int, minSimilarity float64) ([]Candidate, error)
SearchCandidates generates candidates using HNSW approximate search.
type HNSWConfig ¶
type HNSWConfig struct {
M int // Max connections per node per layer (default: 16)
EfConstruction int // Candidate list size during construction (default: 200)
EfSearch int // Candidate list size during search (default: 100)
LevelMultiplier float64 // Level multiplier = 1/ln(M)
}
HNSWConfig contains configuration parameters for the HNSW index.
func DefaultHNSWConfig ¶
func DefaultHNSWConfig() HNSWConfig
DefaultHNSWConfig returns sensible defaults for HNSW index.
func HNSWConfigFromEnv ¶
func HNSWConfigFromEnv() HNSWConfig
HNSWConfigFromEnv loads HNSW configuration from environment variables.
Environment Variables:
- NORNICDB_VECTOR_ANN_QUALITY: Quality preset (fast|balanced|accurate|compressed, default: fast)
- NORNICDB_VECTOR_HNSW_M: Max connections per node (default: based on preset)
- NORNICDB_VECTOR_HNSW_EF_CONSTRUCTION: Candidate list size during construction (default: based on preset)
- NORNICDB_VECTOR_HNSW_EF_SEARCH: Candidate list size during search (default: based on preset)
Quality Presets:
- fast: M=16, efConstruction=100, efSearch=50 (faster, lower recall)
- balanced: M=16, efConstruction=200, efSearch=100 (good balance)
- accurate: M=32, efConstruction=400, efSearch=200 (slower, higher recall)
- compressed: parsed at the ANN quality layer; maps to fast HNSW defaults until compressed ANN strategy routing is active
Example:
// Use fast preset
os.Setenv("NORNICDB_VECTOR_ANN_QUALITY", "fast")
config := HNSWConfigFromEnv()
// Override specific parameter
os.Setenv("NORNICDB_VECTOR_ANN_QUALITY", "balanced")
os.Setenv("NORNICDB_VECTOR_HNSW_EF_SEARCH", "150")
config := HNSWConfigFromEnv()
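The preset table above reduces to a simple lookup. A minimal sketch of that mapping (not the package's actual resolver, which also applies per-parameter env overrides):

```go
package main

import "fmt"

// hnswParams mirrors the documented preset table. "compressed" and
// unknown values fall back to the fast defaults, as documented.
func hnswParams(preset string) (m, efConstruction, efSearch int) {
	switch preset {
	case "accurate":
		return 32, 400, 200
	case "balanced":
		return 16, 200, 100
	default: // "fast", "compressed", unknown
		return 16, 100, 50
	}
}

func main() {
	m, efc, efs := hnswParams("balanced")
	fmt.Printf("M=%d efConstruction=%d efSearch=%d\n", m, efc, efs)
}
```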
type HNSWIndex ¶
type HNSWIndex struct {
// contains filtered or unexported fields
}
HNSWIndex provides fast approximate nearest neighbor search using HNSW algorithm.
func LoadHNSWIndex ¶
func LoadHNSWIndex(path string, vectorLookup VectorLookup) (*HNSWIndex, error)
LoadHNSWIndex loads an HNSW index from path (msgpack format) and returns it. For graph-only format (1.1.0), vectorLookup must be non-nil and vectors are resolved by ID at search time (no in-memory vector copy in HNSW). For legacy full format (1.0.0), vectorLookup is ignored. If the file does not exist or decode fails, returns (nil, nil) so the caller can rebuild. Returns an error only for unexpected I/O (e.g. permission denied).
func LoadIVFHNSWCluster ¶
func LoadIVFHNSWCluster(hnswPath string, clusterID int, vectorLookup VectorLookup) (*HNSWIndex, error)
LoadIVFHNSWCluster loads one cluster's HNSW index from hnsw_ivf/cid in lookup mode (no vector copy in HNSW RAM). Returns (nil, nil) if the file is missing or invalid (caller can build). hnswPath is the full path to the single HNSW file (e.g. data/search/dbname/hnsw).
func NewHNSWIndex ¶
func NewHNSWIndex(dimensions int, config HNSWConfig) *HNSWIndex
NewHNSWIndex creates a new HNSW index with the given dimensions and config.
func (*HNSWIndex) Clear ¶
func (h *HNSWIndex) Clear()
Clear removes all vectors from the index and resets it to an empty state. This frees memory by clearing all internal arrays and maps. Use this when you need to completely reset the index (e.g., after deleting a collection).
func (*HNSWIndex) Config ¶
func (h *HNSWIndex) Config() HNSWConfig
Config returns a copy of the index configuration.
func (*HNSWIndex) GetDimensions ¶
GetDimensions returns the vector dimension of the index.
func (*HNSWIndex) Save ¶
Save writes the HNSW index to path (msgpack format) as graph-only: graph structure and IDs only, no vector data. Vectors are always loaded from the vector index (vectors) on load, so the file stays small. Dir is created if needed. Copies index data under a short read lock so I/O does not block Search/Add/Remove.
func (*HNSWIndex) Search ¶
func (h *HNSWIndex) Search(ctx context.Context, query []float32, k int, minSimilarity float64) ([]ANNResult, error)
Search finds the k nearest neighbors to the query vector.
func (*HNSWIndex) SearchWithEf ¶
func (h *HNSWIndex) SearchWithEf(ctx context.Context, query []float32, k int, minSimilarity float64, ef int) ([]ANNResult, error)
SearchWithEf finds the k nearest neighbors using a caller-provided `ef`.
In Qdrant terms, `ef` is the beam size for HNSW search: larger values improve recall and usually increase latency. If `ef <= 0`, this falls back to the index's configured `EfSearch`.
func (*HNSWIndex) SetVectorLookup ¶
func (h *HNSWIndex) SetVectorLookup(lookup VectorLookup)
SetVectorLookup sets an optional lookup so vectors are resolved by ID at search time instead of from the in-memory slice. When set, Add does not store vectors (vecOff = -1); Load can leave vectors empty and use the lookup. Saves one full vector copy in RAM.
func (*HNSWIndex) ShouldRebuild ¶
ShouldRebuild returns true if the index has accumulated too many tombstones and should be rebuilt to free memory. Threshold is 50% deleted vectors.
func (*HNSWIndex) TombstoneRatio ¶
TombstoneRatio returns the ratio of deleted vectors to total vectors. Returns 0.0 if there are no vectors. A high ratio (>0.5) indicates the index should be rebuilt to free memory.
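The rebuild decision described above is a simple ratio check. A sketch of the documented policy (assuming a strict > 0.5 comparison, matching the "high ratio (>0.5)" wording):

```go
package main

import "fmt"

// tombstoneRatio returns deleted/total, or 0 for an empty index,
// mirroring the documented TombstoneRatio semantics.
func tombstoneRatio(deleted, total int) float64 {
	if total == 0 {
		return 0
	}
	return float64(deleted) / float64(total)
}

// shouldRebuild applies the documented 50% threshold. Whether the real
// comparison is strict or inclusive is an assumption here.
func shouldRebuild(deleted, total int) bool {
	return tombstoneRatio(deleted, total) > 0.5
}

func main() {
	fmt.Println(shouldRebuild(40, 100), shouldRebuild(60, 100))
}
```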
func (*HNSWIndex) Update ¶
Update updates an existing vector in the index.
Update policy: Remove + Add pattern
- Removes the old vector and all its connections
- Adds the new vector with fresh connections
- This ensures graph structure is correctly maintained
If the vector doesn't exist, this is equivalent to Add().
Performance: O(M * log(N)), where M is max connections and N is dataset size. For high-churn workloads, consider periodic rebuilds to restore graph quality.
type HNSWQualityPreset ¶
type HNSWQualityPreset string
HNSWQualityPreset defines quality presets for HNSW tuning.
const (
	// QualityFast prioritizes speed over recall.
	// Lower efSearch, lower candidate_multiplier for faster queries.
	QualityFast HNSWQualityPreset = "fast"

	// QualityBalanced provides a good balance between speed and recall.
	// Default preset with reasonable defaults.
	QualityBalanced HNSWQualityPreset = "balanced"

	// QualityAccurate prioritizes recall over speed.
	// Higher efSearch and/or bigger candidate pool for better accuracy.
	QualityAccurate HNSWQualityPreset = "accurate"
)
type IVFHNSWCandidateGen ¶
type IVFHNSWCandidateGen struct {
// contains filtered or unexported fields
}
IVFHNSWCandidateGen implements IVF-HNSW: centroid routing (IVF) into per-cluster HNSW indexes.
This is designed for large CPU-only datasets:
- K-means provides a coarse routing layer (choose nearest clusters)
- Per-cluster HNSW provides fast ANN within each cluster
For GPU-enabled setups, prefer the existing GPU brute-force and GPU k-means paths.
func NewIVFHNSWCandidateGen ¶
func NewIVFHNSWCandidateGen(clusterIndex *gpu.ClusterIndex, getClusterHNSW clusterHNSWLookup, numClustersToSearch int) *IVFHNSWCandidateGen
func (*IVFHNSWCandidateGen) SearchCandidates ¶
func (*IVFHNSWCandidateGen) SetClusterSelector ¶
func (g *IVFHNSWCandidateGen) SetClusterSelector(fn func(ctx context.Context, query []float32, defaultN int) []int) *IVFHNSWCandidateGen
SetClusterSelector sets an optional custom cluster selector used for routing.
type IVFPQBuildStats ¶
type IVFPQBuildStats struct {
VectorCount int
TrainingSampleCount int
ListCount int
AvgListSize float64
MaxListSize int
BytesPerVector float64
BuildDuration time.Duration
}
IVFPQBuildStats captures build observability for acceptance gates.
type IVFPQCandidateGen ¶
type IVFPQCandidateGen struct {
// contains filtered or unexported fields
}
IVFPQCandidateGen implements CandidateGenerator using IVF/PQ compressed ANN.
func NewIVFPQCandidateGen ¶
func NewIVFPQCandidateGen(index *IVFPQIndex, nprobe int) *IVFPQCandidateGen
func (*IVFPQCandidateGen) SearchCandidates ¶
type IVFPQIndex ¶
type IVFPQIndex struct {
// contains filtered or unexported fields
}
IVFPQIndex stores a compressed IVF/PQ ANN structure.
func LoadIVFPQBundle ¶
func LoadIVFPQBundle(basePath string) (*IVFPQIndex, error)
LoadIVFPQBundle loads an IVFPQ multipart snapshot bundle.
func (*IVFPQIndex) Count ¶
func (i *IVFPQIndex) Count() int
func (*IVFPQIndex) Profile ¶
func (i *IVFPQIndex) Profile() IVFPQProfile
type IVFPQProfile ¶
type IVFPQProfile struct {
Dimensions int
IVFLists int
PQSegments int
PQBits int
NProbe int
RerankTopK int
TrainingSampleMax int
KMeansMaxIterations int
}
IVFPQProfile is the concrete runtime profile used to build/query compressed ANN.
type IdentityExactScorer ¶
type IdentityExactScorer struct{}
IdentityExactScorer is used when the candidate generator already returns exact scores.
func (*IdentityExactScorer) ScoreCandidates ¶
func (i *IdentityExactScorer) ScoreCandidates(ctx context.Context, query []float32, candidates []Candidate) ([]ScoredCandidate, error)
type KMeansCandidateGen ¶
type KMeansCandidateGen struct {
// contains filtered or unexported fields
}
KMeansCandidateGen implements CandidateGenerator using k-means cluster routing.
This candidate generator:
- Finds the top numClustersToSearch clusters nearest to the query
- Gets all node IDs from those clusters as candidates
- Returns candidates with approximate scores (centroid similarity)
This is optimal for very large datasets (N > 100K) where cluster routing provides significant speedup. For smaller datasets, HNSW or brute-force may be faster.
func NewKMeansCandidateGen ¶
func NewKMeansCandidateGen(clusterIndex *gpu.ClusterIndex, vectorIndex *VectorIndex, numClustersToSearch int) *KMeansCandidateGen
NewKMeansCandidateGen creates a new k-means candidate generator.
Parameters:
- clusterIndex: The GPU ClusterIndex (must be clustered)
- vectorIndex: The VectorIndex for ID mapping and fallback
- numClustersToSearch: Number of clusters to search (default: 3)
func (*KMeansCandidateGen) SearchCandidates ¶
func (k *KMeansCandidateGen) SearchCandidates(ctx context.Context, query []float32, limit int, minSimilarity float64) ([]Candidate, error)
SearchCandidates generates candidates using k-means cluster routing.
Algorithm:
- Find top numClustersToSearch clusters nearest to query (centroid similarity)
- Get all node IDs from those clusters
- Return candidates with approximate scores (centroid similarity)
If clustering is not available or not clustered, falls back to brute-force.
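The default nearest-centroid routing step can be sketched as a top-N scan over centroid similarities. This is illustrative only (the real routing lives in gpu.ClusterIndex and can be overridden via SetClusterSelector); dot product on normalized vectors stands in for cosine similarity.

```go
package main

import (
	"fmt"
	"sort"
)

// nearestClusters returns the indices of the n centroids most similar
// to the query (dot product; assumes normalized vectors).
func nearestClusters(query []float32, centroids [][]float32, n int) []int {
	type scored struct {
		idx int
		sim float64
	}
	all := make([]scored, len(centroids))
	for i, c := range centroids {
		var dot float64
		for j := range query {
			dot += float64(query[j]) * float64(c[j])
		}
		all[i] = scored{idx: i, sim: dot}
	}
	// Sort clusters by descending centroid similarity.
	sort.Slice(all, func(a, b int) bool { return all[a].sim > all[b].sim })
	if n > len(all) {
		n = len(all)
	}
	out := make([]int, n)
	for i := 0; i < n; i++ {
		out[i] = all[i].idx
	}
	return out
}

func main() {
	centroids := [][]float32{{1, 0}, {0, 1}, {0.7, 0.7}}
	fmt.Println(nearestClusters([]float32{1, 0}, centroids, 2))
}
```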
func (*KMeansCandidateGen) SetClusterSelector ¶
func (k *KMeansCandidateGen) SetClusterSelector(fn func(ctx context.Context, query []float32, defaultN int) []int) *KMeansCandidateGen
SetClusterSelector sets an optional custom cluster selector used for routing. When nil, the generator uses the default nearest-centroid routing.
type KalmanSearchAdapter ¶
type KalmanSearchAdapter struct {
// contains filtered or unexported fields
}
KalmanSearchAdapter wraps a search Service with Kalman filtering.
func NewKalmanSearchAdapter ¶
func NewKalmanSearchAdapter(service *Service, config KalmanSearchConfig) *KalmanSearchAdapter
NewKalmanSearchAdapter creates a new Kalman-enhanced search adapter.
Example:
service := search.NewService(engine, 1024)
adapter := search.NewKalmanSearchAdapter(service, search.DefaultKalmanSearchConfig())
// Search with enhanced ranking
results, _ := adapter.Search(ctx, "semantic search", embedding, &SearchOptions{Limit: 10})
func (*KalmanSearchAdapter) GetDocumentRelevanceTrend ¶
func (ka *KalmanSearchAdapter) GetDocumentRelevanceTrend(docID string) float64
GetDocumentRelevanceTrend returns whether a document is becoming more or less relevant.
Returns:
- positive velocity: document is becoming more relevant (appearing higher)
- negative velocity: document is becoming less relevant (appearing lower)
- zero: stable relevance
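The sign convention can be illustrated with a much simpler velocity estimate than the adapter's actual Kalman filter: an exponentially smoothed delta over a document's score history. The alpha parameter here is purely illustrative.

```go
package main

import "fmt"

// trendVelocity approximates relevance velocity with an exponentially
// smoothed score delta. The adapter uses a proper Kalman filter; only
// the sign convention is shared: positive = becoming more relevant.
func trendVelocity(scores []float64, alpha float64) float64 {
	var v float64
	for i := 1; i < len(scores); i++ {
		delta := scores[i] - scores[i-1]
		v = alpha*delta + (1-alpha)*v
	}
	return v
}

func main() {
	rising := trendVelocity([]float64{0.50, 0.55, 0.61, 0.66}, 0.5)
	falling := trendVelocity([]float64{0.66, 0.61, 0.55, 0.50}, 0.5)
	fmt.Printf("rising=%.3f falling=%.3f\n", rising, falling)
}
```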
func (*KalmanSearchAdapter) GetFallingDocuments ¶
func (ka *KalmanSearchAdapter) GetFallingDocuments(maxVelocity float64) []string
GetFallingDocuments returns documents whose relevance is decreasing.
func (*KalmanSearchAdapter) GetLatencyTrend ¶
func (ka *KalmanSearchAdapter) GetLatencyTrend() float64
GetLatencyTrend returns the current latency velocity (positive = getting slower).
func (*KalmanSearchAdapter) GetPredictedLatency ¶
func (ka *KalmanSearchAdapter) GetPredictedLatency(stepsAhead int) float64
GetPredictedLatency returns the predicted latency for the next query.
func (*KalmanSearchAdapter) GetRisingDocuments ¶
func (ka *KalmanSearchAdapter) GetRisingDocuments(minVelocity float64) []string
GetRisingDocuments returns documents whose relevance is increasing.
func (*KalmanSearchAdapter) GetService ¶
func (ka *KalmanSearchAdapter) GetService() *Service
GetService returns the underlying search service.
func (*KalmanSearchAdapter) GetStats ¶
func (ka *KalmanSearchAdapter) GetStats() SearchAdapterStats
GetStats returns adapter statistics.
func (*KalmanSearchAdapter) Reset ¶
func (ka *KalmanSearchAdapter) Reset()
Reset clears all cached data and filters.
func (*KalmanSearchAdapter) Search ¶
func (ka *KalmanSearchAdapter) Search(ctx context.Context, query string, embedding []float32, opts *SearchOptions) (*SearchResponse, error)
Search performs a Kalman-enhanced search.
The enhancement pipeline:
- Execute underlying search
- Smooth similarity scores with Kalman filter
- Apply ranking stability boost
- Track latency for prediction
- Return enhanced results
type KalmanSearchConfig ¶
type KalmanSearchConfig struct {
// EnableScoreSmoothing enables Kalman filtering of similarity scores
EnableScoreSmoothing bool
// EnableRankingStability applies stability boost to consistent results
EnableRankingStability bool
// EnableLatencyPrediction tracks and predicts query latency
EnableLatencyPrediction bool
// SimilarityConfig for score smoothing
SimilarityConfig filter.Config
// LatencyConfig for latency prediction
LatencyConfig filter.Config
// StabilityBoost is the boost factor for consistently-ranked documents
StabilityBoost float64
// StabilityWindow is how many recent queries to consider for stability
StabilityWindow int
// ScoreHistoryLimit is max history per document
ScoreHistoryLimit int
}
KalmanSearchConfig holds configuration for the Kalman-enhanced search adapter.
func DefaultKalmanSearchConfig ¶
func DefaultKalmanSearchConfig() KalmanSearchConfig
DefaultKalmanSearchConfig returns sensible defaults.
type LLMFunc ¶
LLMFunc is a minimal, dependency-free function signature for calling an LLM. It is intentionally generic so the search package does not depend on Heimdall.
Implementations should be safe for concurrent use.
type LLMReranker ¶
type LLMReranker struct {
// contains filtered or unexported fields
}
LLMReranker performs Stage-2 reranking using an LLM (e.g., Heimdall).
The model is prompted to output JSON ranking information using candidate indices (not IDs) to keep responses short and parsing robust.
func NewLLMReranker ¶
func NewLLMReranker(cfg *LLMRerankerConfig, llm LLMFunc) *LLMReranker
NewLLMReranker creates a new LLM reranker.
func (*LLMReranker) Enabled ¶
func (r *LLMReranker) Enabled() bool
func (*LLMReranker) IsAvailable ¶
func (r *LLMReranker) IsAvailable(ctx context.Context) bool
func (*LLMReranker) Name ¶
func (r *LLMReranker) Name() string
func (*LLMReranker) Rerank ¶
func (r *LLMReranker) Rerank(ctx context.Context, query string, candidates []RerankCandidate) ([]RerankResult, error)
Rerank takes a query and candidates, returns reranked results.
It is fail-open: if the LLM errors or returns malformed output, it returns the original ranking (pass-through).
type LLMRerankerConfig ¶
type LLMRerankerConfig struct {
Enabled bool
// Timeout bounds how long a single LLM rerank call can take.
Timeout time.Duration
// MaxCandidates caps how many candidates are included in one prompt.
// This protects against long prompts on large candidate sets.
MaxCandidates int
// MaxDocChars truncates each candidate's text content to this many characters.
MaxDocChars int
// MaxQueryChars truncates the query string included in the prompt.
MaxQueryChars int
// MinScore filters candidates whose LLM score is below this value.
// A value of 0 means "no filtering".
MinScore float64
}
LLMRerankerConfig controls LLM-based reranking behavior.
This reranker is designed to be "fail-open": on errors or malformed output, it returns the original candidate order (pass-through).
func DefaultLLMRerankerConfig ¶
func DefaultLLMRerankerConfig() *LLMRerankerConfig
DefaultLLMRerankerConfig returns conservative defaults intended for small local models.
type LocalReranker ¶
type LocalReranker struct {
// contains filtered or unexported fields
}
LocalReranker performs Stage-2 reranking using a local GGUF reranker model.
func NewLocalReranker ¶
func NewLocalReranker(scorer RerankScorer, cfg *LocalRerankerConfig) *LocalReranker
NewLocalReranker creates a reranker that uses the given scorer (e.g. *localllm.RerankerModel).
func (*LocalReranker) Enabled ¶
func (r *LocalReranker) Enabled() bool
func (*LocalReranker) IsAvailable ¶
func (r *LocalReranker) IsAvailable(ctx context.Context) bool
func (*LocalReranker) Name ¶
func (r *LocalReranker) Name() string
func (*LocalReranker) Rerank ¶
func (r *LocalReranker) Rerank(ctx context.Context, query string, candidates []RerankCandidate) ([]RerankResult, error)
Rerank scores each candidate with the scorer and returns results sorted by score (desc). Fail-open: on error returns original order.
type LocalRerankerConfig ¶
type LocalRerankerConfig struct {
Enabled bool
Timeout time.Duration
// MaxCandidates caps how many candidates are scored per query.
MaxCandidates int
// MaxDocChars truncates each candidate's content before scoring.
MaxDocChars int
// MinScore filters out candidates below this score (0 = no filter).
MinScore float64
}
LocalRerankerConfig configures local GGUF reranker behavior.
func DefaultLocalRerankerConfig ¶
func DefaultLocalRerankerConfig() *LocalRerankerConfig
DefaultLocalRerankerConfig returns sensible defaults for BGE-style reranking.
type NodeIterator ¶
NodeIterator is an interface for streaming node iteration.
type RerankCandidate ¶
type RerankCandidate struct {
ID string
Content string
Score float64 // Original score (from bi-encoder)
}
RerankCandidate represents a document to be reranked.
type RerankResult ¶
type RerankResult struct {
ID string
Content string
OriginalRank int
NewRank int
BiScore float64 // Original bi-encoder score
CrossScore float64 // Cross-encoder score
FinalScore float64 // Combined or cross-encoder score
}
RerankResult is a reranked document with new score.
type RerankScorer ¶
RerankScorer scores a (query, document) pair for relevance. Implementations are typically *localllm.RerankerModel (loaded from GGUF).
type Reranker ¶
type Reranker interface {
// Name identifies the reranker implementation for observability.
// Examples: "cross_encoder", "heimdall_llm".
Name() string
// Enabled indicates whether reranking is configured/enabled.
Enabled() bool
// IsAvailable is an optional health check. Implementations should keep this
// cheap; Search() should not call it on the hot path.
IsAvailable(ctx context.Context) bool
// Rerank reorders candidates for a query.
Rerank(ctx context.Context, query string, candidates []RerankCandidate) ([]RerankResult, error)
}
Reranker is a Stage-2 reranking component.
Implementations MUST be fail-open: if reranking cannot be performed (service unavailable, parse error, timeout), they should return a pass-through ranking rather than failing the overall search request.
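The fail-open contract means every implementation needs a pass-through path. A sketch of that path, with RerankCandidate/RerankResult reproduced locally (fields as documented above) so the example is self-contained:

```go
package main

import "fmt"

// Local copies of the package's documented types.
type RerankCandidate struct {
	ID      string
	Content string
	Score   float64 // Original score (from bi-encoder)
}

type RerankResult struct {
	ID           string
	Content      string
	OriginalRank int
	NewRank      int
	BiScore      float64
	CrossScore   float64
	FinalScore   float64
}

// passThrough builds the fail-open result a Reranker should return when
// reranking cannot be performed: original order and bi-encoder scores
// are preserved.
func passThrough(candidates []RerankCandidate) []RerankResult {
	out := make([]RerankResult, len(candidates))
	for i, c := range candidates {
		out[i] = RerankResult{
			ID:           c.ID,
			Content:      c.Content,
			OriginalRank: i,
			NewRank:      i, // unchanged: rerank failed or is disabled
			BiScore:      c.Score,
			FinalScore:   c.Score,
		}
	}
	return out
}

func main() {
	res := passThrough([]RerankCandidate{{ID: "a", Score: 0.9}, {ID: "b", Score: 0.7}})
	fmt.Println(res[0].ID, res[0].NewRank, res[1].ID, res[1].NewRank)
}
```

A Rerank implementation would return this from its error branches instead of propagating the error, so the overall search request still succeeds.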
type ScoredCandidate ¶
ScoredCandidate represents a candidate with exact score.
type SearchAdapterStats ¶
type SearchAdapterStats struct {
TotalQueries int64
ScoresSmoothed int64
StabilityApplied int64
LatencyPredictions int64
AverageLatencyMs float64
PredictedLatencyMs float64
}
SearchAdapterStats holds adapter statistics.
type SearchCandidate ¶
SearchCandidate is a lightweight vector-search result: just the ID and score.
This is intended for high-throughput call paths that don’t require node enrichment (e.g. Qdrant-compatible gRPC searches that return IDs+scores).
type SearchMetrics ¶
type SearchMetrics struct {
VectorSearchTimeMs int `json:"vector_search_time_ms"`
BM25SearchTimeMs int `json:"bm25_search_time_ms"`
FusionTimeMs int `json:"fusion_time_ms"`
TotalTimeMs int `json:"total_time_ms"`
VectorCandidates int `json:"vector_candidates"`
BM25Candidates int `json:"bm25_candidates"`
FusedCandidates int `json:"fused_candidates"`
}
SearchMetrics contains timing and statistics.
type SearchOptions ¶
type SearchOptions struct {
// Limit is the maximum number of results to return
Limit int
// MinSimilarity is the minimum similarity threshold for vector search.
// nil = use service default, otherwise use the provided value.
MinSimilarity *float64
// Types filters results by node type (labels)
Types []string
// RRF configuration
RRFK float64 // RRF constant (default: 60)
VectorWeight float64 // Weight for vector results (default: 1.0)
BM25Weight float64 // Weight for BM25 results (default: 1.0)
MinRRFScore float64 // Minimum RRF score threshold (default: 0.01)
// MMR (Maximal Marginal Relevance) diversification
// When enabled, results are re-ranked to balance relevance with diversity
MMREnabled bool // Enable MMR diversification (default: false)
MMRLambda float64 // Balance: 1.0 = pure relevance, 0.0 = pure diversity (default: 0.7)
// Cross-encoder reranking (Stage 2)
// When enabled, top candidates are re-scored using a cross-encoder model
// for higher accuracy at the cost of latency
RerankEnabled bool // Enable cross-encoder reranking (default: false)
RerankTopK int // How many candidates to rerank (default: 100)
RerankMinScore float64 // Minimum cross-encoder score to include (default: 0)
}
SearchOptions configures the search behavior.
func DefaultSearchOptions ¶
func DefaultSearchOptions() *SearchOptions
DefaultSearchOptions returns sensible defaults.
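The RRF fields above (RRFK, VectorWeight, BM25Weight) follow the standard weighted Reciprocal Rank Fusion formula: score(doc) = Σ weight_i / (K + rank_i). The service's exact fusion internals are not shown in these docs; this is the textbook formula with the documented defaults (K=60, both weights 1.0):

```go
package main

import "fmt"

// weightedRRF fuses two 1-based rank lists with weighted Reciprocal
// Rank Fusion. Documents absent from a list simply contribute nothing
// from it.
func weightedRRF(vectorRanks, bm25Ranks map[string]int, k, wVec, wBM25 float64) map[string]float64 {
	scores := make(map[string]float64)
	for id, r := range vectorRanks {
		scores[id] += wVec / (k + float64(r))
	}
	for id, r := range bm25Ranks {
		scores[id] += wBM25 / (k + float64(r))
	}
	return scores
}

func main() {
	vec := map[string]int{"a": 1, "b": 2}
	bm25 := map[string]int{"b": 1, "c": 2}
	s := weightedRRF(vec, bm25, 60, 1.0, 1.0)
	// "b" appears in both lists, so it fuses highest.
	fmt.Printf("a=%.4f b=%.4f c=%.4f\n", s["a"], s["b"], s["c"])
}
```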
func GetAdaptiveRRFConfig ¶
func GetAdaptiveRRFConfig(query string) *SearchOptions
GetAdaptiveRRFConfig returns optimized RRF weights based on query characteristics.
This function analyzes the query and adjusts weights to favor the search method most likely to perform well:
Short queries (1-2 words): favor BM25 keyword matching.
    Example: "python" or "graph database"
    Weights: Vector=0.5, BM25=1.5

Long queries (6+ words): favor vector semantic understanding.
    Example: "How do I implement a distributed consensus algorithm?"
    Weights: Vector=1.5, BM25=0.5

Medium queries (3-5 words): balanced approach.
    Example: "machine learning algorithms"
    Weights: Vector=1.0, BM25=1.0
Why this works:
- Short queries lack context → keywords more reliable
- Long queries have semantic meaning → embeddings capture intent better
Example:
// Automatic adaptation
query1 := "database"
opts1 := search.GetAdaptiveRRFConfig(query1)
fmt.Printf("Short query weights: V=%.1f, B=%.1f\n",
opts1.VectorWeight, opts1.BM25Weight)
// Output: V=0.5, B=1.5 (favors keywords)
query2 := "What are the best practices for scaling graph databases?"
opts2 := search.GetAdaptiveRRFConfig(query2)
fmt.Printf("Long query weights: V=%.1f, B=%.1f\n",
opts2.VectorWeight, opts2.BM25Weight)
// Output: V=1.5, B=0.5 (favors semantics)
Returns SearchOptions with adapted weights. Other options (Limit, MinSimilarity) are set to defaults.
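The word-count heuristic described above can be sketched as follows. This is an illustration of the documented behavior, not the package's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// adaptiveWeights picks vector/BM25 weights from the query's word count,
// mirroring the documented tiers: short favors keywords, long favors
// semantics, medium stays balanced.
func adaptiveWeights(query string) (vector, bm25 float64) {
	n := len(strings.Fields(query))
	switch {
	case n <= 2:
		return 0.5, 1.5 // short: favor BM25 keywords
	case n >= 6:
		return 1.5, 0.5 // long: favor vector semantics
	default:
		return 1.0, 1.0 // medium: balanced
	}
}

func main() {
	v, b := adaptiveWeights("database")
	fmt.Println(v, b)
}
```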
func (*SearchOptions) GetMinSimilarity ¶
func (o *SearchOptions) GetMinSimilarity(fallback float64) float64
GetMinSimilarity returns the MinSimilarity value, or the fallback if nil.
type SearchResponse ¶
type SearchResponse struct {
Status string `json:"status"`
Query string `json:"query"`
Results []SearchResult `json:"results"`
TotalCandidates int `json:"total_candidates"`
Returned int `json:"returned"`
SearchMethod string `json:"search_method"`
FallbackTriggered bool `json:"fallback_triggered"`
Message string `json:"message,omitempty"`
Metrics *SearchMetrics `json:"metrics,omitempty"`
}
SearchResponse is the response from a search operation.
type SearchResult ¶
type SearchResult struct {
ID string `json:"id"`
NodeID storage.NodeID `json:"nodeId"`
Type string `json:"type"`
Labels []string `json:"labels"`
Title string `json:"title,omitempty"`
Description string `json:"description,omitempty"`
ContentPreview string `json:"content_preview,omitempty"`
Properties map[string]any `json:"properties,omitempty"`
// Scoring
Score float64 `json:"score"`
Similarity float64 `json:"similarity,omitempty"`
// RRF metadata (vector_rank/bm25_rank are always emitted so clients see original
// ranks even when Stage-2 reranking is applied; 0 means not in that result set)
RRFScore float64 `json:"rrf_score,omitempty"`
VectorRank int `json:"vector_rank"`
BM25Rank int `json:"bm25_rank"`
}
SearchResult represents a unified search result.
type Service ¶
type Service struct {
// contains filtered or unexported fields
}
Service provides unified hybrid search with automatic index management.
The Service maintains:
- Vector index (HNSW for fast approximate nearest neighbor search)
- Full-text index (BM25 with inverted index)
- Connection to storage engine for node data enrichment
Thread-safe: Multiple goroutines can call Search() concurrently.
Example:
svc := search.NewService(engine)
defer svc.Close()
// Index existing data
if err := svc.BuildIndexes(ctx); err != nil {
log.Fatal(err)
}
// Index new nodes as they're created
node := &storage.Node{...}
if err := svc.IndexNode(node); err != nil {
log.Printf("Failed to index: %v", err)
}
func NewService ¶
func NewServiceWithDimensions ¶
NewServiceWithDimensions creates a search Service with the specified embedding dimensions. Use this when your embedding model produces vectors of a different size than the default 1024.
Example:
// For Apple Intelligence embeddings (512 dimensions)
svc := search.NewServiceWithDimensions(engine, 512)
// For OpenAI text-embedding-3-small (1536 dimensions)
svc := search.NewServiceWithDimensions(engine, 1536)
func NewServiceWithDimensionsAndBM25Engine ¶
func NewServiceWithDimensionsAndBM25Engine(engine storage.Engine, dimensions int, bm25Engine string) *Service
NewServiceWithDimensionsAndBM25Engine creates a search Service with explicit BM25 engine selection. bm25Engine values: "v1", "v2" (invalid values default to "v1").
func (*Service) BM25Engine ¶
BM25Engine returns the configured BM25 engine for this service ("v1" or "v2").
func (*Service) BuildInProgress ¶
BuildInProgress reports whether BuildIndexes is currently running.
func (*Service) BuildIndexes ¶
BuildIndexes builds search indexes from all nodes in the engine. Call this after storage (and WAL recovery) so indexes reflect durable state. When both fulltext and vector index paths are set, tries to load both from disk; if both load with count > 0 (and semver format version matches), the full iteration is skipped. Otherwise iterates over storage and saves both indexes at the end when paths are set.
func (*Service) ClearVectorIndex ¶
func (s *Service) ClearVectorIndex()
ClearVectorIndex removes all embeddings from the vector index. This is used when regenerating all embeddings to reset the index count. Also frees memory from HNSW tombstones which can accumulate over time.
func (*Service) ClusterStats ¶
func (s *Service) ClusterStats() *gpu.ClusterStats
ClusterStats returns k-means clustering statistics.
func (*Service) ClusteringInProgress ¶
ClusteringInProgress returns true if k-means is currently running for this service. Used to debounce timer ticks and avoid starting a second run while one is in progress.
func (*Service) CrossEncoderAvailable ¶
CrossEncoderAvailable returns true if a cross-encoder reranker is configured and available.
func (*Service) CurrentStrategy ¶
CurrentStrategy returns the currently active vector search strategy label. Returns "unknown" until a vector pipeline has been initialized.
func (*Service) EmbeddingCount ¶
EmbeddingCount returns the total number of nodes with embeddings in the vector index.
func (*Service) EnableClustering ¶
EnableClustering enables GPU k-means clustering for accelerated vector search.
Performance Improvements (Real-World Benchmarks):
- 2,000 embeddings: ~14% faster (61ms vs 65ms avg)
- 4,500 embeddings: ~26% faster (35ms vs 47ms avg)
- 10,000+ embeddings: 10-50x faster (scales with dataset size)
The speedup increases with dataset size as the cluster-based search avoids comparing against all vectors.
Parameters:
- gpuManager: GPU manager for acceleration (can be nil for CPU-only)
- numClusters: Number of k-means clusters (0 for auto based on dataset size)
Call this BEFORE BuildIndexes(), then call TriggerClustering() after indexing.
Example:
gpuManager, _ := gpu.NewManager(nil)
svc.EnableClustering(gpuManager, 100) // 100 clusters
svc.BuildIndexes(ctx)
svc.TriggerClustering(ctx) // Run k-means
func (*Service) GetBuildProgress ¶
func (s *Service) GetBuildProgress() BuildProgress
GetBuildProgress returns current search build progress for status APIs.
func (*Service) GetDefaultMinSimilarity ¶
GetDefaultMinSimilarity returns the configured minimum similarity threshold.
func (*Service) GetMinEmbeddingsForClustering ¶
GetMinEmbeddingsForClustering returns the current minimum embeddings threshold.
func (*Service) IndexNode ¶
IndexNode adds a node to all search indexes. All embeddings are stored in ChunkEmbeddings (even single chunk = array of 1).
func (*Service) IsClusteringEnabled ¶
IsClusteringEnabled returns true if GPU clustering is enabled.
func (*Service) IsReady ¶
IsReady reports whether the search indexes are fully built and ready to serve queries.
func (*Service) PersistIndexesToDisk ¶
func (s *Service) PersistIndexesToDisk()
PersistIndexesToDisk writes the current BM25, vector, HNSW, and IVF-HNSW (per-cluster) indexes to disk immediately. Call this on shutdown so the latest in-memory state is saved before exit, same as every other persisted index. Cancels any pending debounced persist so no write runs after shutdown.
func (*Service) RemoveNode ¶
RemoveNode removes a node from all search indexes. Also removes all chunk embeddings (for nodes with multiple chunks).
func (*Service) RerankCandidates ¶
func (s *Service) RerankCandidates(ctx context.Context, query string, candidates []RerankCandidate, opts *SearchOptions) ([]RerankResult, error)
RerankCandidates applies the configured Stage-2 reranker to caller-provided candidates. This is a seam-friendly API for adapters that need rerank semantics without running retrieval.
func (*Service) RerankerAvailable ¶
RerankerAvailable returns true if Stage-2 reranking is configured and available.
func (*Service) Search ¶
func (s *Service) Search(ctx context.Context, query string, embedding []float32, opts *SearchOptions) (*SearchResponse, error)
Search performs hybrid search with automatic fallback.
Search strategy:
- Try RRF hybrid search (vector + BM25) if embedding provided
- Fall back to vector-only if RRF returns no results
- Fall back to BM25-only if vector search fails or no embedding
This ensures you always get results even if one index is empty or fails.
Parameters:
- ctx: Context for cancellation
- query: Text query for BM25 search
- embedding: Vector embedding for similarity search (can be nil)
- opts: Search options (use DefaultSearchOptions() if unsure)
Example:
svc := search.NewService(engine)
// Hybrid search (best results)
query := "graph database memory"
embedding, _ := embedder.Embed(ctx, query)
opts := search.DefaultSearchOptions()
opts.Limit = 10
resp, err := svc.Search(ctx, query, embedding, opts)
if err != nil {
return err
}
fmt.Printf("Found %d results using %s\n",
resp.Returned, resp.SearchMethod)
for i, result := range resp.Results {
fmt.Printf("%d. [RRF: %.4f] %s\n",
i+1, result.RRFScore, result.Title)
fmt.Printf(" Vector rank: #%d, BM25 rank: #%d\n",
result.VectorRank, result.BM25Rank)
}
Returns a SearchResponse with ranked results and metadata about the search method used.
func (*Service) SetCrossEncoder ¶
func (s *Service) SetCrossEncoder(ce *CrossEncoder)
SetCrossEncoder configures the Stage-2 reranker to use the cross-encoder implementation.
Example:
svc := search.NewService(engine)
svc.SetCrossEncoder(search.NewCrossEncoder(&search.CrossEncoderConfig{
Enabled: true,
APIURL: "http://localhost:8081/rerank",
Model: "cross-encoder/ms-marco-MiniLM-L-6-v2",
}))
func (*Service) SetDefaultMinSimilarity ¶
SetDefaultMinSimilarity sets the default minimum cosine similarity threshold for vector search. Apple Intelligence embeddings produce scores in the 0.2-0.8 range, while bge-m3/mxbai produce 0.7-0.99. Default: 0.0 (let RRF ranking handle relevance filtering)
func (*Service) SetFulltextIndexPath ¶
SetFulltextIndexPath sets the path for persisting the BM25 fulltext index. When both fulltext and vector paths are set, BuildIndexes() will try to load both; if both load with count > 0, the full storage iteration is skipped.
func (*Service) SetGPUManager ¶
SetGPUManager enables GPU acceleration for exact brute-force vector search.
This is independent from k-means clustering. When enabled, the vector pipeline may choose GPU brute-force search (exact) for datasets where it outperforms HNSW.
func (*Service) SetHNSWIndexPath ¶
SetHNSWIndexPath sets the path for persisting the HNSW index. When set with persist search indexes, the HNSW index is saved after build/warmup and loaded on startup so the full graph does not need to be rebuilt from vectors.
func (*Service) SetMinEmbeddingsForClustering ¶
SetMinEmbeddingsForClustering sets the minimum number of embeddings required before k-means clustering is triggered. Below this threshold, brute-force search is used as it's faster for small datasets.
This should be called BEFORE TriggerClustering() to take effect.
Parameters:
- threshold: Minimum embeddings (must be > 0, default: 1000)
Tuning Guidelines:
- 1000 (default): Safe for most workloads
- 500-1000: Latency-sensitive applications with moderate data
- 100-500: Testing or small datasets
- 2000+: Very large datasets, delay clustering until more data arrives
Example:
svc := search.NewService(engine)
svc.SetMinEmbeddingsForClustering(500) // Lower threshold for faster clustering
svc.EnableClustering(gpuManager, 100)
svc.BuildIndexes(ctx)
svc.TriggerClustering(ctx) // Will cluster if >= 500 embeddings
func (*Service) SetPersistenceEnabled ¶
SetPersistenceEnabled controls whether index persistence is allowed. When false, all persist paths (debounced writes, build checkpoints, and shutdown flush) are treated as no-ops even if index paths are configured.
func (*Service) SetReranker ¶
SetReranker configures the Stage-2 reranker. Uses a quick read-lock check to avoid write-lock contention when the value hasn't changed. This prevents deadlock when multiple goroutines call getOrCreateSearchService concurrently.
func (*Service) SetVectorIndexPath ¶
SetVectorIndexPath sets the path for persisting the vector index. When both fulltext and vector paths are set, BuildIndexes() will try to load both; if both load with count > 0, the full storage iteration is skipped.
func (*Service) TriggerClustering ¶
TriggerClustering runs k-means clustering on all indexed embeddings. Stops promptly when ctx is cancelled (e.g. process shutdown). Only one run executes at a time per service; if clustering is already in progress, returns nil immediately (debounced).
Trigger Policies:
- After bulk loads: Automatically called after BuildIndexes() completes
- Periodic clustering: Background timer runs clustering at regular intervals
- Manual trigger: Call this after bulk data loading to enable k-means routing
Once clustering completes, the vector search pipeline automatically uses KMeansCandidateGen for candidate generation, providing significant speedup for very large datasets (N > 100K).
Returns nil (not an error) if there are too few embeddings; clustering is skipped silently because brute-force search is faster for small datasets. Returns an error only if clustering is not enabled or fails unexpectedly.
func (*Service) VectorIndexDimensions ¶
VectorIndexDimensions returns the configured dimensions of the vector index.
func (*Service) VectorQueryNodes ¶
func (s *Service) VectorQueryNodes(ctx context.Context, queryEmbedding []float32, spec VectorQuerySpec) ([]VectorQueryHit, error)
VectorQueryNodes executes a Cypher-style vector query.
This method preserves Cypher semantics for per-node embedding selection:
- Prefer NamedEmbeddings[Property] (or "default" when Property is empty)
- If Property is set, next try node.Properties[Property] as a vector array
- Fallback to ChunkEmbeddings[0..N] (best score across chunks)
For performance, cosine-similarity queries are executed against the in-memory vector index (unified pipeline) rather than scanning storage.
func (*Service) VectorSearchCandidates ¶
func (s *Service) VectorSearchCandidates(ctx context.Context, embedding []float32, opts *SearchOptions) ([]SearchCandidate, error)
VectorSearchCandidates performs vector-only search and returns lightweight candidates without enrichment. It is optimized for throughput: it skips BM25, RRF fusion, and storage fetches.
This method uses the unified vector search pipeline (CandidateGen + ExactScore) with automatic strategy selection (brute-force for small N, HNSW for large N).
type VectorFileStore ¶
type VectorFileStore struct {
// contains filtered or unexported fields
}
VectorFileStore is an append-only vector store backed by a file. Only id→offset is kept in RAM; vector data lives on disk. All vectors are stored normalized (one copy per id).
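The append-only, id-to-offset design can be illustrated with an in-memory miniature. This sketches the layout idea only; it is not the package's actual .vec/.meta format:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// miniStore keeps vector bytes in a growing buffer (standing in for the
// .vec file) while only the id→offset map lives in memory, mirroring the
// VectorFileStore design above.
type miniStore struct {
	buf     bytes.Buffer
	idToOff map[string]int64
	dim     int
}

func newMiniStore(dim int) *miniStore {
	return &miniStore{idToOff: map[string]int64{}, dim: dim}
}

// add appends the vector's bytes and records where they start.
func (s *miniStore) add(id string, vec []float32) {
	s.idToOff[id] = int64(s.buf.Len())
	binary.Write(&s.buf, binary.LittleEndian, vec)
}

// get reads the vector back from its recorded offset.
func (s *miniStore) get(id string) ([]float32, bool) {
	off, ok := s.idToOff[id]
	if !ok {
		return nil, false
	}
	vec := make([]float32, s.dim)
	r := bytes.NewReader(s.buf.Bytes()[off:])
	binary.Read(r, binary.LittleEndian, vec)
	return vec, true
}

// remove drops only the map entry; the stale bytes stay until compaction,
// matching the Remove/CompactIfNeeded split documented below.
func (s *miniStore) remove(id string) { delete(s.idToOff, id) }

func main() {
	s := newMiniStore(2)
	s.add("a", []float32{1, 0})
	s.add("b", []float32{0, 1})
	s.remove("a")
	v, ok := s.get("b")
	fmt.Println(ok, v)
}
```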
func NewVectorFileStore ¶
func NewVectorFileStore(vecBasePath string, dimensions int) (*VectorFileStore, error)
NewVectorFileStore creates a new file-backed store and opens the vector file for append. vecBasePath is the path prefix: .vec and .meta will be appended. If the .vec file exists it is opened for append; otherwise it is created with a header.
func (*VectorFileStore) Add ¶
func (v *VectorFileStore) Add(id string, vec []float32) error
Add appends a normalized vector to the store. vec is normalized in place (or into a copy); only one copy is stored.
func (*VectorFileStore) Close ¶
func (v *VectorFileStore) Close() error
Close closes the underlying file. The store must not be used after Close.
func (*VectorFileStore) CompactIfNeeded ¶
func (v *VectorFileStore) CompactIfNeeded() (bool, error)
CompactIfNeeded rewrites .vec with only live records when stale entries accumulate. The rewrite is atomic: write temp file, fsync, rename. Returns true when compaction actually ran.
func (*VectorFileStore) Count ¶
func (v *VectorFileStore) Count() int
Count returns the number of vectors in the store.
func (*VectorFileStore) GetBuildIndexedCount ¶
func (v *VectorFileStore) GetBuildIndexedCount() int64
GetBuildIndexedCount returns the last persisted checkpoint count (0 if none). Used at start of BuildIndexes to skip the first N nodes when resuming.
func (*VectorFileStore) GetDimensions ¶
func (v *VectorFileStore) GetDimensions() int
GetDimensions returns the vector dimension.
func (*VectorFileStore) GetVector ¶
func (v *VectorFileStore) GetVector(id string) ([]float32, bool)
GetVector returns a copy of the stored (normalized) vector for id, or (nil, false) if not found.
func (*VectorFileStore) Has ¶
func (v *VectorFileStore) Has(id string) bool
Has reports whether id is present in the id→offset map.
func (*VectorFileStore) IterateChunked ¶
func (v *VectorFileStore) IterateChunked(chunkSize int, fn func(ids []string, vecs [][]float32) error) error
IterateChunked reads the vector file in chunks and calls fn(ids, vecs) for each chunk. Used to build HNSW without loading all vectors into memory. fn may be called with fewer than chunkSize vectors on the last chunk.
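The chunking behavior can be sketched over an in-memory slice. This shows the callback contract only (at most chunkSize items per call, a short final chunk); the real store streams the chunks from the .vec file:

```go
package main

import "fmt"

// iterateChunked calls fn with at most chunkSize items at a time, so the
// consumer never needs the full set in memory at once.
func iterateChunked(ids []string, vecs [][]float32, chunkSize int,
	fn func(ids []string, vecs [][]float32) error) error {
	for start := 0; start < len(ids); start += chunkSize {
		end := start + chunkSize
		if end > len(ids) {
			end = len(ids) // last chunk may be short
		}
		if err := fn(ids[start:end], vecs[start:end]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ids := []string{"a", "b", "c"}
	vecs := [][]float32{{1}, {2}, {3}}
	iterateChunked(ids, vecs, 2, func(ids []string, _ [][]float32) error {
		fmt.Println(len(ids))
		return nil
	})
}
```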
func (*VectorFileStore) Load ¶
func (v *VectorFileStore) Load() error
Load populates the store from an existing .vec + .meta. The store must be created with NewVectorFileStore(vecBasePath, dimensions); Load then reads the .meta file to populate idToOff. The .vec file is already open from NewVectorFileStore.
func (*VectorFileStore) Remove ¶
func (v *VectorFileStore) Remove(id string) bool
Remove deletes id from the live id→offset map. The old .vec record is left in-place and reclaimed by compaction.
func (*VectorFileStore) Save ¶
func (v *VectorFileStore) Save() error
Save writes the id→offset map to the .meta file so the store can be loaded later. Copies idToOff under a short lock so the (potentially slow) encode doesn't block Add().
func (*VectorFileStore) SetBuildIndexedCount ¶
func (v *VectorFileStore) SetBuildIndexedCount(n int64)
SetBuildIndexedCount sets the last checkpoint count from BuildIndexes (for resume). Call before Save() when persisting after a checkpoint so the next run can skip already-indexed nodes.
func (*VectorFileStore) Sync ¶
func (v *VectorFileStore) Sync() error
Sync flushes the .vec file to disk so progress is visible and durable.
type VectorFileStoreMeta ¶
type VectorFileStoreMeta struct {
Version int `msgpack:"v"`
Dimensions int `msgpack:"dim"`
IDToOffset map[string]int64 `msgpack:"id2off"`
BuildIndexedCount int64 `msgpack:"build_count,omitempty"` // last checkpoint count during BuildIndexes; used for resume
}
VectorFileStoreMeta is persisted to the .meta file (msgpack).
type VectorGetter ¶
VectorGetter is implemented by *VectorIndex and by adapters for VectorLookup (e.g. file-backed store).
type VectorIndex ¶
type VectorIndex struct {
// contains filtered or unexported fields
}
VectorIndex provides exact vector similarity search using cosine similarity.
The index stores normalized vectors for efficient similarity computation and supports concurrent read/write operations. It uses brute-force search which provides exact results but has O(n) time complexity.
Key features:
- Exact cosine similarity computation
- Automatic vector normalization
- Thread-safe concurrent operations
- Context-aware search with cancellation
- Configurable similarity thresholds
Example:
// Create index for 512-dimensional vectors
index := search.NewVectorIndex(512)
// Add vectors
for id, vector := range vectors {
index.Add(id, vector)
}
// Search with minimum similarity threshold
results, _ := index.Search(ctx, queryVector, 5, 0.8)
for _, result := range results {
fmt.Printf("%s: %.3f\n", result.ID, result.Score)
}
Performance:
- Add: O(d) where d = dimensions
- Search: O(n×d) where n = number of vectors
- Memory: O(n×d) for normalized vectors
Thread Safety:
All methods are thread-safe using RWMutex for concurrent access.
func NewVectorIndex ¶
func NewVectorIndex(dimensions int) *VectorIndex
NewVectorIndex creates a new vector similarity index for the given dimensions.
The index is initialized empty and ready to accept vectors of the specified dimensionality. All vectors added to the index must have exactly this number of dimensions.
Parameters:
- dimensions: Number of dimensions for vectors (must be > 0)
Returns:
- VectorIndex ready for use
Example:
// Create index for OpenAI embeddings (1536 dimensions)
index := search.NewVectorIndex(1536)
// Create index for sentence transformers (384 dimensions)
index = search.NewVectorIndex(384)
// Index is ready to accept vectors
index.Add("doc1", embedding1)
func (*VectorIndex) Add ¶
func (v *VectorIndex) Add(id string, vec []float32) error
Add adds or updates a vector in the index with automatic normalization.
The vector is normalized to unit length for efficient cosine similarity computation. If a vector with the same ID already exists, it is replaced.
Parameters:
- id: Unique identifier for the vector
- vec: Vector to add (must match index dimensions)
Returns:
- ErrDimensionMismatch if vector dimensions don't match index
Example:
// Add document embeddings
for docID, embedding := range documentEmbeddings {
err := index.Add(docID, embedding)
if err != nil {
log.Printf("Failed to add %s: %v", docID, err)
}
}
// Update existing vector
newEmbedding := getUpdatedEmbedding("doc1")
index.Add("doc1", newEmbedding) // Replaces previous
Performance:
- Time: O(d) where d is vector dimensions
- Space: O(d) additional storage per vector
- Thread-safe: Uses write lock during update
Normalization:
Vectors are automatically normalized to unit length, so cosine similarity becomes a simple dot product during search.
func (*VectorIndex) Clear ¶
func (v *VectorIndex) Clear()
Clear removes all vectors from the index. This is useful when regenerating all embeddings to reset the index state.
func (*VectorIndex) Count ¶
func (v *VectorIndex) Count() int
Count returns the number of vectors in the index.
func (*VectorIndex) GetDimensions ¶
func (v *VectorIndex) GetDimensions() int
GetDimensions returns the vector dimensions.
func (*VectorIndex) GetVector ¶
func (v *VectorIndex) GetVector(id string) ([]float32, bool)
GetVector returns a copy of the normalized vector for id, or (nil, false) if not found. Used when loading a graph-only HNSW index to populate vectors from the vector index.
func (*VectorIndex) HasVector ¶
func (v *VectorIndex) HasVector(id string) bool
HasVector checks if a vector exists for the given ID.
func (*VectorIndex) Load ¶
func (v *VectorIndex) Load(path string) error
Load replaces the vector index with the one stored at path (msgpack format). If the file does not exist or decode fails, the index is left empty (or unchanged on missing file) and no error is returned, so the caller can proceed with a full rebuild. Returns an error only for unexpected I/O (e.g. permission denied).
func (*VectorIndex) Remove ¶
func (v *VectorIndex) Remove(id string)
Remove removes a vector from the index by its ID.
If the vector doesn't exist, this operation is a no-op.
Parameters:
- id: Identifier of the vector to remove
Example:
// Remove a document that was deleted
index.Remove("doc-123")
// Remove multiple vectors
for _, docID := range deletedDocs {
index.Remove(docID)
}
Performance:
- Time: O(1) hash map deletion
- Thread-safe: Uses write lock during removal
func (*VectorIndex) Save ¶
func (v *VectorIndex) Save(path string) error
Save writes the vector index to path (msgpack format). Dir is created if needed. Copies index data under a short read lock so I/O does not block Search/IndexNode.
func (*VectorIndex) Search ¶
func (v *VectorIndex) Search(ctx context.Context, query []float32, limit int, minSimilarity float64) ([]indexResult, error)
Search finds vectors similar to the query vector using cosine similarity.
The search computes cosine similarity between the query and all indexed vectors, filters by minimum similarity threshold, and returns the top results sorted by similarity score (highest first).
Parameters:
- ctx: Context for cancellation and timeouts
- query: Query vector (must match index dimensions)
- limit: Maximum number of results to return
- minSimilarity: Minimum similarity threshold (0.0 to 1.0)
Returns:
- Slice of indexResult sorted by similarity (descending)
- ErrDimensionMismatch if query dimensions don't match
Example:
// Search for top 10 most similar vectors
results, err := index.Search(ctx, queryVector, 10, 0.0)
if err != nil {
log.Fatal(err)
}
// Search with high similarity threshold
results, err = index.Search(ctx, queryVector, 5, 0.8)
for _, result := range results {
fmt.Printf("ID: %s, Similarity: %.3f\n", result.ID, result.Score)
}
// Search with timeout
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
results, err = index.Search(ctx, queryVector, 10, 0.7)
Performance:
- Time: O(n×d) where n=vectors, d=dimensions
- Space: O(k) where k=number of results
- Cancellation: Respects context cancellation during search
Similarity Range:
- 1.0: Identical vectors (same direction)
- 0.0: Orthogonal vectors (perpendicular)
- -1.0: Opposite vectors (opposite directions)
Thread Safety:
Uses read lock during search, allowing concurrent searches.
type VectorLookup ¶
VectorLookup returns a vector by ID (e.g. from the vector index). Used when loading a graph-only HNSW file.
type VectorQueryHit ¶
VectorQueryHit is a lightweight result row for Cypher-compatible vector queries. ID is the node ID (not a chunk/named vector ID).
type VectorQuerySpec ¶
type VectorQuerySpec struct {
// IndexName is informational only (used for debugging/observability).
IndexName string
// Label optionally filters candidate nodes to those that have this label.
Label string
// Property is the Cypher vector index "property" name.
//
// Resolution semantics:
// 1) Prefer NamedEmbeddings[Property] (or "default" when Property is empty)
// 2) If Property is set, next try node.Properties[Property] as a vector array
// 3) Fallback to ChunkEmbeddings[0..N] (best score across chunks)
Property string
// Similarity is the similarity function name. Supported values:
// "cosine" (default), "dot", "euclidean".
Similarity string
// Limit is the maximum number of results to return (top-K).
Limit int
}
VectorQuerySpec describes how to resolve embeddings for a Cypher-style vector query.
This is intentionally a "core" representation of Cypher vector index metadata: the Cypher layer can remain syntax-compatible, while the search layer owns the implementation details and can choose the best execution strategy.
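The three-step resolution order can be sketched as follows. The `node` type here is a stand-in containing only the fields involved, not the storage.Node definition:

```go
package main

import "fmt"

// node mirrors just the fields used in embedding resolution.
type node struct {
	NamedEmbeddings map[string][]float32
	Properties      map[string]any
	ChunkEmbeddings [][]float32
}

// resolveEmbeddings follows the documented order: named embedding first
// (using "default" when property is empty), then a vector stored under
// Properties[property], then all chunk embeddings.
func resolveEmbeddings(n node, property string) [][]float32 {
	name := property
	if name == "" {
		name = "default"
	}
	if v, ok := n.NamedEmbeddings[name]; ok {
		return [][]float32{v}
	}
	if property != "" {
		if raw, ok := n.Properties[property].([]float32); ok {
			return [][]float32{raw}
		}
	}
	// Caller takes the best score across chunks.
	return n.ChunkEmbeddings
}

func main() {
	n := node{
		Properties:      map[string]any{"embedding": []float32{0.1, 0.2}},
		ChunkEmbeddings: [][]float32{{0.3, 0.4}},
	}
	fmt.Println(len(resolveEmbeddings(n, "embedding")))
}
```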
type VectorSearchPipeline ¶
type VectorSearchPipeline struct {
// contains filtered or unexported fields
}
VectorSearchPipeline implements the unified vector search pipeline.
Pipeline stages:
- CandidateGen: Generate candidates (brute-force or HNSW)
- ExactScore: Re-score candidates exactly (CPU or GPU)
- Filter: Apply minSimilarity threshold
- TopK: Return top-k results
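The four stages above can be sketched end to end. The gen and score functions here are hypothetical stand-ins for CandidateGenerator and ExactScorer:

```go
package main

import (
	"fmt"
	"sort"
)

type candidate struct {
	id    string
	score float64
}

// searchPipeline walks the documented stages: generate candidates,
// re-score them exactly, filter by minSimilarity, and keep the top k.
func searchPipeline(gen func() []string, score func(id string) float64,
	k int, minSim float64) []candidate {
	var out []candidate
	for _, id := range gen() { // Stage 1: CandidateGen
		s := score(id) // Stage 2: ExactScore
		if s >= minSim { // Stage 3: Filter
			out = append(out, candidate{id, s})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].score > out[j].score })
	if len(out) > k { // Stage 4: TopK
		out = out[:k]
	}
	return out
}

func main() {
	gen := func() []string { return []string{"a", "b", "c"} }
	scores := map[string]float64{"a": 0.9, "b": 0.4, "c": 0.8}
	res := searchPipeline(gen, func(id string) float64 { return scores[id] }, 2, 0.5)
	fmt.Println(res) // "b" filtered out by minSimilarity, rest sorted
}
```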
func NewVectorSearchPipeline ¶
func NewVectorSearchPipeline(candidateGen CandidateGenerator, exactScorer ExactScorer) *VectorSearchPipeline
NewVectorSearchPipeline creates a new vector search pipeline.
func (*VectorSearchPipeline) Search ¶
func (p *VectorSearchPipeline) Search(ctx context.Context, query []float32, k int, minSimilarity float64) ([]ScoredCandidate, error)
Search performs vector search using the pipeline.
Parameters:
- ctx: Context for cancellation
- query: Query vector (will be normalized)
- k: Desired number of results
- minSimilarity: Minimum similarity threshold
Returns:
- candidates: Top-k candidates with exact scores
- error: Context cancellation or other errors
Source Files
¶
- ann_profile.go
- ann_quality.go
- bm25_seed_provider.go
- build_settings.go
- convert_gob_to_msgpack.go
- fulltext_index.go
- fulltext_index_v2.go
- fulltext_index_v2_persist.go
- gpu_kmeans_candidate_gen.go
- hnsw_config.go
- hnsw_index.go
- hnsw_metal_stub.go
- hybrid_cluster_routing.go
- ivf_hnsw_candidate_gen.go
- ivfpq_build.go
- ivfpq_candidate_gen.go
- ivfpq_index.go
- ivfpq_persist.go
- ivfpq_types.go
- kalman_adapter.go
- kmeans_candidate_gen.go
- llm_rerank.go
- local_rerank.go
- persist_helpers.go
- rerank.go
- search.go
- vector_file_store.go
- vector_index.go
- vector_pipeline.go
- vector_query_spec.go
- version.go