Documentation ¶
Overview ¶
Package goformersearch provides pure-Go vector similarity search.
It supports brute-force and HNSW (Hierarchical Navigable Small World) algorithms for nearest-neighbour lookups. The library requires no CGO and has zero native dependencies.
Quick start ¶
// Build a brute-force index (exact results).
idx := goformersearch.NewFlatIndex(384)
idx.Add(1, embedding)

// Or an HNSW index (approximate, much faster at scale).
idx := goformersearch.NewHNSWIndex(384)
idx.Add(1, embedding)

// Query the k nearest neighbours.
results := idx.Search(query, 10)
Key types ¶
- Index: interface satisfied by all search backends.
- FlatIndex: exact nearest-neighbour search via exhaustive comparison.
- HNSWIndex: approximate nearest-neighbour search using a navigable small-world graph.
- Result: a search result containing the vector ID and similarity score.
Example (FlatSearch) ¶
package main

import (
	"fmt"
	"math"

	"github.com/MichaelAyles/goformersearch"
)

func norm(v []float32) []float32 {
	var s float64
	for _, x := range v {
		s += float64(x) * float64(x)
	}
	s = math.Sqrt(s)
	out := make([]float32, len(v))
	for i, x := range v {
		out[i] = float32(float64(x) / s)
	}
	return out
}

func main() {
	idx := goformersearch.NewFlatIndex(3)
	idx.Add(1, norm([]float32{1, 0, 0}))
	idx.Add(2, norm([]float32{0, 1, 0}))
	idx.Add(3, norm([]float32{1, 1, 0}))
	results := idx.Search(norm([]float32{1, 0, 0}), 2)
	for _, r := range results {
		fmt.Printf("ID=%d similarity=%.4f\n", r.ID, r.Similarity)
	}
}
Output:
ID=1 similarity=1.0000
ID=3 similarity=0.7071
Example (HnswSearch) ¶
package main

import (
	"fmt"
	"math"

	"github.com/MichaelAyles/goformersearch"
)

func norm(v []float32) []float32 {
	var s float64
	for _, x := range v {
		s += float64(x) * float64(x)
	}
	s = math.Sqrt(s)
	out := make([]float32, len(v))
	for i, x := range v {
		out[i] = float32(float64(x) / s)
	}
	return out
}

func main() {
	idx := goformersearch.NewHNSWIndex(3,
		goformersearch.WithM(4),
		goformersearch.WithEfConstruction(50),
	)
	idx.Add(1, norm([]float32{1, 0, 0}))
	idx.Add(2, norm([]float32{0, 1, 0}))
	idx.Add(3, norm([]float32{1, 1, 0}))
	idx.SetEfSearch(50)
	results := idx.Search(norm([]float32{1, 0, 0}), 2)
	for _, r := range results {
		fmt.Printf("ID=%d similarity=%.4f\n", r.ID, r.Similarity)
	}
}
Output:
ID=1 similarity=1.0000
ID=3 similarity=0.7071
Example (SaveLoad) ¶
package main

import (
	"bytes"
	"fmt"
	"math"

	"github.com/MichaelAyles/goformersearch"
)

func norm(v []float32) []float32 {
	var s float64
	for _, x := range v {
		s += float64(x) * float64(x)
	}
	s = math.Sqrt(s)
	out := make([]float32, len(v))
	for i, x := range v {
		out[i] = float32(float64(x) / s)
	}
	return out
}

func main() {
	idx := goformersearch.NewFlatIndex(3)
	idx.Add(1, norm([]float32{1, 0, 0}))
	idx.Add(2, norm([]float32{0, 1, 0}))
	var buf bytes.Buffer
	_ = goformersearch.Save(&buf, idx)
	loaded, _ := goformersearch.LoadFlat(&buf)
	fmt.Printf("Loaded %d vectors of %d dims\n", loaded.Len(), loaded.Dims())
}
Output: Loaded 2 vectors of 3 dims
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CosineSimilarity ¶
CosineSimilarity returns the cosine similarity between two vectors. For L2-normalised vectors (e.g. goformer output), this equals the dot product.
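The equivalence with the dot product can be checked with a small standalone sketch. The helpers `dot`, `cosine`, and `normalize` below are hypothetical names written for this illustration; they are not the package's implementation, and the `float64` return type is an assumption made for readability.

```go
package main

import (
	"fmt"
	"math"
)

// dot returns the dot product of two equal-length vectors.
func dot(a, b []float32) float64 {
	var s float64
	for i := range a {
		s += float64(a[i]) * float64(b[i])
	}
	return s
}

// cosine returns dot(a, b) divided by the product of the L2 norms.
func cosine(a, b []float32) float64 {
	return dot(a, b) / (math.Sqrt(dot(a, a)) * math.Sqrt(dot(b, b)))
}

// normalize returns a copy of v scaled to unit L2 length.
func normalize(v []float32) []float32 {
	n := math.Sqrt(dot(v, v))
	out := make([]float32, len(v))
	for i, x := range v {
		out[i] = float32(float64(x) / n)
	}
	return out
}

func main() {
	a := normalize([]float32{1, 0, 0})
	b := normalize([]float32{1, 1, 0})
	// Both norms are 1, so the denominator in cosine drops out.
	fmt.Printf("cosine=%.4f dot=%.4f\n", cosine(a, b), dot(a, b))
}
```

When the inputs are already unit length, the division by the norms is redundant work, which is why a plain dot product is sufficient in that case.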
func DotProduct ¶
DotProduct returns the dot product of two vectors.
func L2Distance ¶
L2Distance returns the squared L2 (Euclidean) distance between two vectors.
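As a rough illustration of what "squared" means here, the sketch below sums the squared component differences and skips the final square root. `squaredL2` is a hypothetical stand-in, not the package's code, and its `float64` return type is an assumption. Skipping the square root does not change the ordering of neighbours, and for unit vectors the squared distance equals 2 - 2·(dot product), so ranking by smallest squared distance matches ranking by largest cosine similarity.

```go
package main

import "fmt"

// squaredL2 returns the sum of squared component differences.
// No square root is taken, so the result is the squared Euclidean distance.
func squaredL2(a, b []float32) float64 {
	var s float64
	for i := range a {
		d := float64(a[i]) - float64(b[i])
		s += d * d
	}
	return s
}

func main() {
	a := []float32{1, 0, 0}
	b := []float32{0, 1, 0}
	// a and b are orthogonal unit vectors: dot = 0, so 2 - 2*0 = 2.
	fmt.Println(squaredL2(a, b)) // prints 2
}
```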
Types ¶
type FlatIndex ¶
type FlatIndex struct {
	// contains filtered or unexported fields
}
FlatIndex is a brute-force exact nearest-neighbour index. It computes cosine similarity against every stored vector on each query, so results are exact and each query costs O(n) similarity computations, where n is the number of vectors in the index.
Safe for concurrent Search calls once all Add calls are complete.
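The exhaustive strategy described above can be sketched in a few lines. This is an illustration of the general technique, not FlatIndex's actual code: the `hit` type and the `dot` and `bruteForce` functions are hypothetical, and the inputs are assumed to be unit-length so that the dot product stands in for cosine similarity.

```go
package main

import (
	"fmt"
	"sort"
)

// hit pairs a vector ID with its similarity to the query (hypothetical type).
type hit struct {
	ID         uint64
	Similarity float32
}

// dot returns the dot product; for unit vectors this equals cosine similarity.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// bruteForce scores every stored vector against the query and returns
// at most k hits, highest similarity first.
func bruteForce(ids []uint64, vecs [][]float32, query []float32, k int) []hit {
	hits := make([]hit, 0, len(vecs))
	for i, v := range vecs {
		hits = append(hits, hit{ID: ids[i], Similarity: dot(query, v)})
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Similarity > hits[j].Similarity })
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	ids := []uint64{1, 2, 3}
	vecs := [][]float32{{1, 0, 0}, {0, 1, 0}, {0.6, 0.8, 0}}
	for _, h := range bruteForce(ids, vecs, []float32{1, 0, 0}, 2) {
		fmt.Printf("ID=%d similarity=%.1f\n", h.ID, h.Similarity)
	}
}
```

Sorting all n scores keeps the sketch short; a bounded heap of size k would avoid the full sort when k is much smaller than n.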
func NewFlatIndex ¶
NewFlatIndex creates a brute-force index for vectors of the given dimensionality.
type HNSWIndex ¶
type HNSWIndex struct {
	// contains filtered or unexported fields
}
HNSWIndex is an approximate nearest-neighbour index using the Hierarchical Navigable Small World algorithm (Malkov & Yashunin, 2018).
Safe for concurrent Search calls once all Add calls are complete.
func NewHNSWIndex ¶
func NewHNSWIndex(dims int, opts ...HNSWOption) *HNSWIndex
NewHNSWIndex creates an HNSW index for approximate nearest-neighbour search.
func (*HNSWIndex) Search ¶
Search returns the k nearest neighbours to the query vector, ordered by decreasing similarity (highest first).
func (*HNSWIndex) SetEfSearch ¶
SetEfSearch adjusts the search-time quality/speed tradeoff. Higher values give better recall at the cost of latency.
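One way to choose a value is to measure recall against exact results for a sample of queries, for instance by comparing the IDs returned by an HNSWIndex with those returned by a FlatIndex over the same data. The helper below is a hypothetical sketch of that measurement and is not part of the package; it works on plain ID slices so it stays independent of any index type.

```go
package main

import "fmt"

// recallAtK reports the fraction of exact-result IDs that also appear
// in the approximate result list. An empty exact list counts as recall 1.
func recallAtK(exact, approx []uint64) float64 {
	if len(exact) == 0 {
		return 1
	}
	seen := make(map[uint64]struct{}, len(approx))
	for _, id := range approx {
		seen[id] = struct{}{}
	}
	found := 0
	for _, id := range exact {
		if _, ok := seen[id]; ok {
			found++
		}
	}
	return float64(found) / float64(len(exact))
}

func main() {
	exact := []uint64{1, 3, 7, 9}  // made-up IDs standing in for an exact search
	approx := []uint64{1, 7, 9, 4} // made-up IDs standing in for an approximate search
	fmt.Printf("recall@4 = %.2f\n", recallAtK(exact, approx))
}
```

Averaging this value over many queries at several search-width settings, alongside query latency, gives a concrete picture of the tradeoff for a given dataset.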
type HNSWOption ¶
type HNSWOption func(*hnswConfig)
HNSWOption configures HNSW index parameters.
func WithEfConstruction ¶
func WithEfConstruction(ef int) HNSWOption
WithEfConstruction sets the build-time search width. Default 200. Higher values produce a better graph at the cost of slower insertion.
func WithEfSearch ¶
func WithEfSearch(ef int) HNSWOption
WithEfSearch sets the query-time search width. Default 50. Higher values give better recall at the cost of latency.
func WithM ¶
func WithM(m int) HNSWOption
WithM sets the maximum number of connections per node per layer. Default 16. Higher values improve recall at the cost of memory and build time.
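The option functions above follow Go's functional-options pattern: each option is a function that mutates a private configuration struct. The sketch below shows the pattern in isolation. The `config` struct, its field names, and the lowercase `with...` helpers are hypothetical; only the default values (16, 200, 50) are taken from the documentation above.

```go
package main

import "fmt"

// config holds the tunables; the field names are assumptions for this sketch.
type config struct {
	m              int
	efConstruction int
	efSearch       int
}

// option mutates a config, mirroring the shape of HNSWOption.
type option func(*config)

func withM(m int) option               { return func(c *config) { c.m = m } }
func withEfConstruction(ef int) option { return func(c *config) { c.efConstruction = ef } }
func withEfSearch(ef int) option       { return func(c *config) { c.efSearch = ef } }

// newConfig starts from the documented defaults and applies options in order,
// so a later option overrides an earlier one for the same field.
func newConfig(opts ...option) config {
	c := config{m: 16, efConstruction: 200, efSearch: 50}
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	c := newConfig(withM(4), withEfConstruction(50))
	fmt.Printf("M=%d efConstruction=%d efSearch=%d\n", c.m, c.efConstruction, c.efSearch)
}
```

The pattern lets a constructor gain new parameters without breaking existing callers, since unspecified options simply keep their defaults.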
type Index ¶
type Index interface {
	// Add inserts a vector with the given ID. The vector is copied.
	Add(id uint64, vec []float32)

	// Search returns the k nearest neighbours to the query vector,
	// ordered by decreasing similarity (highest first).
	Search(query []float32, k int) []Result

	// Len returns the number of vectors in the index.
	Len() int

	// Dims returns the dimensionality of the index.
	Dims() int
}
Index is the interface implemented by all index types.
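Because every backend satisfies the same interface, calling code can be written once against Index. The sketch below re-declares `Index` and `Result` locally so that it compiles on its own; they mirror the documented shapes but are not the package's definitions, and the `float32` type of `Similarity` is an assumption. `memIndex` and `topID` are hypothetical names. The example also shows the copy-on-Add behaviour the interface comment describes.

```go
package main

import (
	"fmt"
	"sort"
)

// Result and Index are local stand-ins for the documented types.
type Result struct {
	ID         uint64
	Similarity float32
}

type Index interface {
	Add(id uint64, vec []float32)
	Search(query []float32, k int) []Result
	Len() int
	Dims() int
}

// memIndex is a hypothetical minimal backend used only for this sketch.
type memIndex struct {
	dims int
	ids  []uint64
	vecs [][]float32
}

// Compile-time check that *memIndex satisfies Index.
var _ Index = (*memIndex)(nil)

func (m *memIndex) Add(id uint64, vec []float32) {
	cp := make([]float32, len(vec))
	copy(cp, vec) // copy, so later changes to vec do not affect the index
	m.ids = append(m.ids, id)
	m.vecs = append(m.vecs, cp)
}

// Search scores by dot product and assumes query has the index's dimensionality.
func (m *memIndex) Search(query []float32, k int) []Result {
	out := make([]Result, 0, len(m.vecs))
	for i, v := range m.vecs {
		var s float32
		for j := range v {
			s += v[j] * query[j]
		}
		out = append(out, Result{ID: m.ids[i], Similarity: s})
	}
	sort.Slice(out, func(a, b int) bool { return out[a].Similarity > out[b].Similarity })
	if k < 0 {
		k = 0
	}
	if k < len(out) {
		out = out[:k]
	}
	return out
}

func (m *memIndex) Len() int  { return len(m.ids) }
func (m *memIndex) Dims() int { return m.dims }

// topID works with any backend that satisfies Index.
func topID(idx Index, query []float32) (uint64, bool) {
	res := idx.Search(query, 1)
	if len(res) == 0 {
		return 0, false
	}
	return res[0].ID, true
}

func main() {
	var idx Index = &memIndex{dims: 2}
	v := []float32{1, 0}
	idx.Add(1, v)
	v[0] = 0 // mutating the caller's slice after Add leaves the stored copy intact
	idx.Add(2, []float32{0, 1})
	id, ok := topID(idx, []float32{1, 0})
	fmt.Println(id, ok, idx.Len(), idx.Dims())
}
```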