core

package

Version: v0.1.5 | Published: May 11, 2026 | License: MIT | Imports: 12 | Imported by: 1

Repository

github.com/DotNetAge/govector

Documentation ¶

Overview ¶

Package core provides the core functionality for GoVector, a vector database library. It includes vector indexing, storage, and search capabilities compatible with Qdrant.

Index ¶

func CalculateDistance(metric Distance, a, b []float32) float32
func MatchFilter(payload Payload, filter *Filter) bool
type Collection
- func NewCollection(name string, vectorLen int, metric Distance, store *Storage, useHNSW bool) (*Collection, error)
- func NewCollectionWithParams(name string, vectorLen int, metric Distance, store *Storage, useHNSW bool, ...) (*Collection, error)
- func (c *Collection) Count() int
- func (c *Collection) Delete(points []string, filter *Filter) (int, error)
- func (c *Collection) GetPointsByFilter(filter *Filter) ([]PointStruct, error)
- func (c *Collection) Search(queryVector []float32, filter *Filter, topK int) ([]ScoredPoint, error)
- func (c *Collection) Upsert(points []PointStruct) error
type CollectionMeta
type Condition
type ConditionType
type Distance
type Filter
type FlatIndex
- func NewFlatIndex(metric Distance) *FlatIndex
- func (f *FlatIndex) Count() int
- func (f *FlatIndex) Delete(id string) error
- func (f *FlatIndex) DeleteByFilter(filter *Filter) ([]string, error)
- func (f *FlatIndex) GetIDsByFilter(filter *Filter) []string
- func (f *FlatIndex) GetPointsByFilter(filter *Filter) []PointStruct
- func (f *FlatIndex) Search(query []float32, filter *Filter, topK int) ([]ScoredPoint, error)
- func (f *FlatIndex) Upsert(points []PointStruct) error
type HNSWIndex
- func NewHNSWIndex(metric Distance) *HNSWIndex
- func NewHNSWIndexWithParams(metric Distance, params HNSWParams) *HNSWIndex
- func (h *HNSWIndex) Count() int
- func (h *HNSWIndex) Delete(id string) error
- func (h *HNSWIndex) DeleteByFilter(filter *Filter) ([]string, error)
- func (h *HNSWIndex) GetIDsByFilter(filter *Filter) []string
- func (h *HNSWIndex) GetPointsByFilter(filter *Filter) []PointStruct
- func (h *HNSWIndex) Search(query []float32, filter *Filter, topK int) ([]ScoredPoint, error)
- func (h *HNSWIndex) Upsert(points []PointStruct) error
type HNSWParams
- func DefaultHNSWParams() HNSWParams
type MatchValue
type Payload
type PointStruct
type Quantizer
type RangeValue
type SQ8Quantizer
- func NewSQ8Quantizer() *SQ8Quantizer
- func (q *SQ8Quantizer) Dequantize(data []byte) []float32
- func (q *SQ8Quantizer) GetCompressedSize(dim int) int
- func (q *SQ8Quantizer) Quantize(vector []float32) []byte
type ScoredPoint
type Storage
- func NewStorage(dbPath string) (*Storage, error)
- func NewStorageWithQuantization(dbPath string, useQuant bool, quantizer Quantizer) (*Storage, error)
- func (s *Storage) Close() error
- func (s *Storage) DeletePoints(colName string, ids []string) error
- func (s *Storage) DropCollection(name string) error
- func (s *Storage) EnsureCollection(colName string) error
- func (s *Storage) ListCollectionMetas() ([]CollectionMeta, error)
- func (s *Storage) ListCollections() ([]string, error)
- func (s *Storage) LoadCollection(colName string) (map[string]*PointStruct, error)
- func (s *Storage) LoadCollectionMeta(name string) (*CollectionMeta, error)
- func (s *Storage) SaveCollectionMeta(name string, meta CollectionMeta) error
- func (s *Storage) UpsertPoints(colName string, points []PointStruct) error
type VectorIndex

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func CalculateDistance ¶

func CalculateDistance(metric Distance, a, b []float32) float32

CalculateDistance computes the similarity or distance between two vectors using the specified metric. For Cosine and Dot, higher values indicate greater similarity. For Euclid, lower values indicate greater similarity.
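To make the score directions concrete, here is a minimal sketch of the three metrics. The name distanceSketch and its zero-vector handling are assumptions for illustration; this is not the package's actual implementation, and it assumes both vectors have the same length.

```go
package main

import (
	"fmt"
	"math"
)

// Distance mirrors the string-typed metric shown in the package docs.
type Distance string

const (
	Cosine Distance = "Cosine"
	Euclid Distance = "Euclid"
	Dot    Distance = "Dot"
)

// distanceSketch is a hypothetical stand-in for CalculateDistance.
// It assumes len(a) == len(b).
func distanceSketch(metric Distance, a, b []float32) float32 {
	var dot, normA, normB, sqDiff float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
		sqDiff += (x - y) * (x - y)
	}
	switch metric {
	case Euclid:
		return float32(math.Sqrt(sqDiff)) // lower is more similar
	case Dot:
		return float32(dot) // higher is more similar
	default: // Cosine: higher is more similar
		if normA == 0 || normB == 0 {
			return 0 // assumption: a zero vector scores 0
		}
		return float32(dot / (math.Sqrt(normA) * math.Sqrt(normB)))
	}
}

func main() {
	a, b := []float32{1, 0}, []float32{0, 1}
	fmt.Println(distanceSketch(Cosine, a, b)) // orthogonal vectors
	fmt.Println(distanceSketch(Euclid, a, b))
	fmt.Println(distanceSketch(Dot, a, a))
}
```

Because the direction of "better" differs by metric, callers that rank results need to sort descending for Cosine and Dot and ascending for Euclid.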

func MatchFilter ¶

func MatchFilter(payload Payload, filter *Filter) bool

MatchFilter reports whether a payload matches the filter criteria. It returns true if the filter is nil (no filtering), or if both of the following hold:

- All Must conditions are satisfied
- No MustNot condition is satisfied
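The rules above can be sketched as follows. This is a simplified illustration that only handles exact-match conditions; the types cond and filter and the function matchSketch are hypothetical stand-ins, not the package's own Condition and Filter.

```go
package main

import (
	"fmt"
	"reflect"
)

// Payload mirrors the documented map type.
type Payload map[string]any

// cond is a hypothetical, exact-match-only condition.
type cond struct {
	Key   string
	Value any
}

// filter is a hypothetical stand-in for Filter.
type filter struct {
	Must    []cond
	MustNot []cond
}

// holds reports whether the payload has the key with an equal value.
func holds(p Payload, c cond) bool {
	v, ok := p[c.Key]
	return ok && reflect.DeepEqual(v, c.Value)
}

// matchSketch applies the nil / Must / MustNot rules in order.
func matchSketch(p Payload, f *filter) bool {
	if f == nil {
		return true // no filtering
	}
	for _, c := range f.Must {
		if !holds(p, c) {
			return false
		}
	}
	for _, c := range f.MustNot {
		if holds(p, c) {
			return false
		}
	}
	return true
}

func main() {
	p := Payload{"lang": "go", "draft": false}
	f := &filter{
		Must:    []cond{{Key: "lang", Value: "go"}},
		MustNot: []cond{{Key: "draft", Value: true}},
	}
	fmt.Println(matchSketch(p, f), matchSketch(p, nil))
}
```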

Types ¶

type Collection ¶

type Collection struct {
	Name      string   // Unique name of the collection
	VectorLen int      // Dimension of vectors in this collection
	Metric    Distance // Distance metric used for similarity search
	// contains filtered or unexported fields
}

Collection represents a single logical group of vectors, similar to a table in SQL databases. It provides thread-safe operations for inserting, searching, and deleting vectors. Each collection has a fixed vector dimension and uses a specific distance metric.

func NewCollection ¶

func NewCollection(name string, vectorLen int, metric Distance, store *Storage, useHNSW bool) (*Collection, error)

NewCollection initializes a new vector collection, optionally loading from storage. If useHNSW is true, it uses the optimized HNSW graph search; otherwise it uses flat in-memory search. When storage is provided, existing points are automatically loaded into memory. Returns an error if the collection cannot be created or if loaded points have invalid dimensions.

func NewCollectionWithParams ¶ added in v0.1.3

func NewCollectionWithParams(name string, vectorLen int, metric Distance, store *Storage, useHNSW bool, hnswParams HNSWParams) (*Collection, error)

NewCollectionWithParams initializes a new vector collection with custom HNSW parameters. If useHNSW is true, it uses the optimized HNSW graph search with the provided parameters; otherwise it uses flat in-memory search. When storage is provided, existing points are automatically loaded into memory. Returns an error if the collection cannot be created or if loaded points have invalid dimensions.

func (*Collection) Count ¶

func (c *Collection) Count() int

Count returns the number of points currently stored in the collection.

func (*Collection) Delete ¶

func (c *Collection) Delete(points []string, filter *Filter) (int, error)

Delete removes points either by explicit IDs or by filter match. If the points slice is provided, those specific points are deleted. If a filter is provided, all points matching the filter are deleted. It returns the number of points deleted and any error encountered, and returns an error if neither points nor filter is provided. Storage and the in-memory index are kept consistent.

func (*Collection) GetPointsByFilter ¶ added in v0.1.5

func (c *Collection) GetPointsByFilter(filter *Filter) ([]PointStruct, error)

GetPointsByFilter returns all points that match the given filter, including their full payloads and vectors.

func (*Collection) Search ¶

func (c *Collection) Search(queryVector []float32, filter *Filter, topK int) ([]ScoredPoint, error)

Search performs a similarity search using the underlying VectorIndex. It finds the topK nearest neighbors to the query vector, optionally filtered by payload. Returns an error if the query vector dimension doesn't match the collection.

func (*Collection) Upsert ¶

func (c *Collection) Upsert(points []PointStruct) error

Upsert adds or updates points in the collection. Points are first persisted to disk (if storage is configured), then updated in the memory index. Returns an error if any point has an invalid vector length or if persistence fails. Ensures data consistency between storage and memory index.
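The validate, persist, then update-memory ordering described above could look roughly like this. The names point, persister, failingStore, and upsertSketch are hypothetical; the sketch only illustrates why persisting first keeps the in-memory state unchanged when the write fails.

```go
package main

import (
	"errors"
	"fmt"
)

// point is a trimmed, hypothetical stand-in for PointStruct.
type point struct {
	ID     string
	Vector []float32
}

// persister is a hypothetical abstraction over the storage layer.
type persister interface {
	Save(points []point) error
}

// failingStore always fails, to show that memory stays untouched on error.
type failingStore struct{}

func (failingStore) Save([]point) error { return errors.New("disk full") }

// upsertSketch validates dimensions, persists, then updates the memory map.
func upsertSketch(vectorLen int, st persister, mem map[string]point, pts []point) error {
	for _, p := range pts {
		if len(p.Vector) != vectorLen {
			return fmt.Errorf("point %q: got %d dimensions, want %d", p.ID, len(p.Vector), vectorLen)
		}
	}
	if st != nil {
		if err := st.Save(pts); err != nil {
			return fmt.Errorf("persist: %w", err)
		}
	}
	for _, p := range pts {
		mem[p.ID] = p // only reached after a successful write
	}
	return nil
}

func main() {
	mem := map[string]point{}
	pts := []point{{ID: "a", Vector: []float32{1, 2}}}
	fmt.Println(upsertSketch(2, failingStore{}, mem, pts), len(mem))
	fmt.Println(upsertSketch(2, nil, mem, pts), len(mem))
}
```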

type CollectionMeta ¶ added in v0.1.3

type CollectionMeta struct {
	Name       string     `json:"name"`
	VectorLen  int        `json:"vector_size"`
	Metric     Distance   `json:"distance"`
	UseHNSW    bool       `json:"hnsw"`
	HNSWParams HNSWParams `json:"parameters"`
}

CollectionMeta stores the configuration and metadata for a vector collection. It is used for persisting collection settings and reloading them on server restart.
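The struct tags determine how the metadata looks when encoded with encoding/json. The sketch below copies the documented struct definitions locally and marshals one value; whether the package itself persists metadata as JSON is an assumption here, and encodeMeta is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, for illustration only.
type Distance string

type HNSWParams struct {
	M              int
	EfConstruction int
	EfSearch       int
	K              int
}

type CollectionMeta struct {
	Name       string     `json:"name"`
	VectorLen  int        `json:"vector_size"`
	Metric     Distance   `json:"distance"`
	UseHNSW    bool       `json:"hnsw"`
	HNSWParams HNSWParams `json:"parameters"`
}

// encodeMeta is a hypothetical helper that returns the JSON form,
// or an empty string if encoding fails.
func encodeMeta(m CollectionMeta) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	meta := CollectionMeta{
		Name:       "docs",
		VectorLen:  3,
		Metric:     "Cosine",
		UseHNSW:    true,
		HNSWParams: HNSWParams{M: 16, EfConstruction: 200, EfSearch: 64, K: 10},
	}
	fmt.Println(encodeMeta(meta))
}
```

Note that HNSWParams carries no JSON tags, so its fields are encoded under their Go names.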

type Condition ¶

type Condition struct {
	Key   string        `json:"key"`
	Type  ConditionType `json:"type"`
	Match MatchValue    `json:"match,omitempty"`
	Range *RangeValue   `json:"range,omitempty"`
}

Condition represents a single filter condition on a specific payload key.

type ConditionType ¶ added in v0.1.3

type ConditionType string

ConditionType defines the type of filter condition.

const (
	// MatchTypeExact exact value match
	MatchTypeExact ConditionType = "exact"
	// MatchTypeRange range match (greater than, less than, etc.)
	MatchTypeRange ConditionType = "range"
	// MatchTypePrefix prefix match
	MatchTypePrefix ConditionType = "prefix"
	// MatchTypeContains contains match (for arrays)
	MatchTypeContains ConditionType = "contains"
	// MatchTypeRegex regex match
	MatchTypeRegex ConditionType = "regex"
)
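As a rough illustration of what these condition types could mean at evaluation time, the sketch below handles exact, prefix, contains, and regex (range is omitted). The function evalSketch and its type-handling rules are assumptions, not the package's actual matching logic.

```go
package main

import (
	"fmt"
	"reflect"
	"regexp"
	"strings"
)

// evalSketch is a hypothetical evaluator for four of the condition types.
// field is the payload value; want is the value carried by the condition.
func evalSketch(kind string, field, want any) bool {
	switch kind {
	case "exact":
		return reflect.DeepEqual(field, want)
	case "prefix":
		s, ok1 := field.(string)
		p, ok2 := want.(string)
		return ok1 && ok2 && strings.HasPrefix(s, p)
	case "contains":
		// Assumption: array payloads decode as []any.
		items, ok := field.([]any)
		if !ok {
			return false
		}
		for _, it := range items {
			if reflect.DeepEqual(it, want) {
				return true
			}
		}
		return false
	case "regex":
		s, ok1 := field.(string)
		p, ok2 := want.(string)
		if !ok1 || !ok2 {
			return false
		}
		matched, err := regexp.MatchString(p, s)
		return err == nil && matched // an invalid pattern never matches
	}
	return false
}

func main() {
	fmt.Println(evalSketch("prefix", "golang", "go"))
	fmt.Println(evalSketch("contains", []any{"a", "b"}, "b"))
	fmt.Println(evalSketch("regex", "v0.1.5", `^v\d+\.\d+\.\d+$`))
}
```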

type Distance ¶

type Distance string

Distance represents the metric used for vector comparison and similarity search. Different metrics are suitable for different use cases and vector types.

const (
	// Cosine measures the cosine of the angle between two vectors.
	// It is normalized by vector magnitude, making it suitable for
	// comparing vectors of different scales. Range: [-1, 1] (higher is more similar)
	Cosine Distance = "Cosine"

	// Euclid measures the straight-line distance between two vectors.
	// It is sensitive to vector magnitude. Range: [0, +inf) (lower is more similar)
	Euclid Distance = "Euclid"

	// Dot computes the dot product of two vectors.
	// It measures both magnitude and direction. Range: (-inf, +inf) (higher is more similar)
	Dot Distance = "Dot"
)

type Filter ¶

type Filter struct {
	Must    []Condition `json:"must,omitempty"`
	MustNot []Condition `json:"must_not,omitempty"`
}

Filter defines conditions for querying or deleting points based on their payload. It supports Must (every condition must match) and MustNot (no condition may match) clauses.

type FlatIndex ¶

type FlatIndex struct {
	// contains filtered or unexported fields
}

FlatIndex implements VectorIndex using a brute-force search algorithm. It stores all vectors in memory and performs exhaustive distance calculations for each query. This provides exact results but has O(n) search complexity, making it suitable for small to medium datasets.

func NewFlatIndex ¶

func NewFlatIndex(metric Distance) *FlatIndex

NewFlatIndex creates a new flat memory index with the specified distance metric.

func (*FlatIndex) Count ¶

func (f *FlatIndex) Count() int

Count returns the number of vectors currently stored in the index.

func (*FlatIndex) Delete ¶

func (f *FlatIndex) Delete(id string) error

Delete removes a point from the index by its ID. Returns an error if the point does not exist.

func (*FlatIndex) DeleteByFilter ¶

func (f *FlatIndex) DeleteByFilter(filter *Filter) ([]string, error)

DeleteByFilter removes all points that match the given filter and returns their IDs.

func (*FlatIndex) GetIDsByFilter ¶ added in v0.1.3

func (f *FlatIndex) GetIDsByFilter(filter *Filter) []string

GetIDsByFilter returns all point IDs that match the given filter.

func (*FlatIndex) GetPointsByFilter ¶ added in v0.1.5

func (f *FlatIndex) GetPointsByFilter(filter *Filter) []PointStruct

GetPointsByFilter returns all points that match the given filter, with their full payloads.

func (*FlatIndex) Search ¶

func (f *FlatIndex) Search(query []float32, filter *Filter, topK int) ([]ScoredPoint, error)

Search performs a brute-force search for the nearest neighbors. It calculates the distance between the query vector and all stored vectors, filters by payload if specified, and returns the topK results. Results are sorted by relevance: descending for Cosine/Dot, ascending for Euclidean.
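A brute-force search of this kind can be sketched in a few lines. The names scored, dotScore, and flatSearchSketch are hypothetical, payload filtering is left out, and the dot product stands in for whichever metric the index uses.

```go
package main

import (
	"fmt"
	"sort"
)

// scored pairs a point ID with its score, like a trimmed ScoredPoint.
type scored struct {
	ID    string
	Score float32
}

// dotScore is the example metric; it assumes equal-length vectors.
func dotScore(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// flatSearchSketch scores every stored vector, sorts, and keeps topK.
// higherIsBetter selects descending (Cosine/Dot) or ascending (Euclid) order.
func flatSearchSketch(points map[string][]float32, query []float32, topK int, higherIsBetter bool) []scored {
	results := make([]scored, 0, len(points))
	for id, vec := range points {
		results = append(results, scored{ID: id, Score: dotScore(query, vec)})
	}
	sort.Slice(results, func(i, j int) bool {
		if results[i].Score == results[j].Score {
			return results[i].ID < results[j].ID // tie-break for stable output
		}
		if higherIsBetter {
			return results[i].Score > results[j].Score
		}
		return results[i].Score < results[j].Score
	})
	if topK < 0 {
		topK = 0
	}
	if topK < len(results) {
		results = results[:topK]
	}
	return results
}

func main() {
	points := map[string][]float32{
		"a": {1, 0},
		"b": {0, 1},
		"c": {0.5, 0.5},
	}
	for _, hit := range flatSearchSketch(points, []float32{1, 0}, 2, true) {
		fmt.Println(hit.ID, hit.Score)
	}
}
```

Every query touches every stored vector, which is where the O(n) cost comes from.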

func (*FlatIndex) Upsert ¶

func (f *FlatIndex) Upsert(points []PointStruct) error

Upsert adds or updates points in the index. If a point with the same ID already exists, it will be overwritten.

type HNSWIndex ¶

type HNSWIndex struct {
	// contains filtered or unexported fields
}

HNSWIndex wraps the coder/hnsw graph to provide an approximate nearest neighbor search. It uses Hierarchical Navigable Small World graphs for efficient similarity search with sub-linear complexity, making it suitable for large datasets.

func NewHNSWIndex ¶

func NewHNSWIndex(metric Distance) *HNSWIndex

NewHNSWIndex creates a new HNSW index engine with the specified distance metric. It configures the underlying HNSW graph with appropriate distance functions for Cosine, Euclidean, or Dot product metrics.

func NewHNSWIndexWithParams ¶ added in v0.1.3

func NewHNSWIndexWithParams(metric Distance, params HNSWParams) *HNSWIndex

NewHNSWIndexWithParams creates a new HNSW index engine with custom parameters. It allows fine-tuning of HNSW parameters for specific use cases.

func (*HNSWIndex) Count ¶

func (h *HNSWIndex) Count() int

Count returns the number of vectors currently stored in the index.

func (*HNSWIndex) Delete ¶

func (h *HNSWIndex) Delete(id string) error

Delete removes a point from both the HNSW graph and the local points map. Returns an error if the point does not exist in the graph.

func (*HNSWIndex) DeleteByFilter ¶

func (h *HNSWIndex) DeleteByFilter(filter *Filter) ([]string, error)

DeleteByFilter removes all points that match the given filter from both the HNSW graph and the local points map. Returns the IDs of deleted points.

func (*HNSWIndex) GetIDsByFilter ¶ added in v0.1.3

func (h *HNSWIndex) GetIDsByFilter(filter *Filter) []string

GetIDsByFilter returns all point IDs that match the given filter.

func (*HNSWIndex) GetPointsByFilter ¶ added in v0.1.5

func (h *HNSWIndex) GetPointsByFilter(filter *Filter) []PointStruct

GetPointsByFilter returns all points that match the given filter, with their full payloads.

func (*HNSWIndex) Search ¶

func (h *HNSWIndex) Search(query []float32, filter *Filter, topK int) ([]ScoredPoint, error)

Search performs an approximate nearest neighbor search using the HNSW algorithm. It uses a post-filtering strategy: over-fetches results to account for filtered points, then applies the payload filter and returns the topK matches.
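The post-filtering strategy can be illustrated with a small sketch. The over-fetch factor, the candidate type, and postFilterSketch are assumptions made for the example; the actual factor used by the package is not documented here.

```go
package main

import "fmt"

// candidate is a hypothetical ANN hit; the input slice is assumed
// to be ordered best-first already.
type candidate struct {
	ID    string
	Score float32
	Lang  string
}

// postFilterSketch inspects the best topK*overFetch candidates,
// keeps those passing the filter, and stops once topK are collected.
func postFilterSketch(ranked []candidate, keep func(candidate) bool, topK, overFetch int) []candidate {
	if topK <= 0 || overFetch <= 0 {
		return nil
	}
	limit := topK * overFetch
	if limit > len(ranked) {
		limit = len(ranked)
	}
	out := make([]candidate, 0, topK)
	for _, c := range ranked[:limit] {
		if !keep(c) {
			continue
		}
		out = append(out, c)
		if len(out) == topK {
			break
		}
	}
	return out
}

func main() {
	ranked := []candidate{
		{"p0", 0.9, "go"}, {"p1", 0.8, "rust"}, {"p2", 0.7, "go"},
		{"p3", 0.6, "rust"}, {"p4", 0.5, "go"}, {"p5", 0.4, "zig"},
	}
	onlyGo := func(c candidate) bool { return c.Lang == "go" }
	for _, c := range postFilterSketch(ranked, onlyGo, 2, 2) {
		fmt.Println(c.ID)
	}
}
```

One consequence of post-filtering is that a highly selective filter can return fewer than topK results even when more matches exist deeper in the ranking.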

func (*HNSWIndex) Upsert ¶

func (h *HNSWIndex) Upsert(points []PointStruct) error

Upsert adds or updates points in the HNSW graph. Points are added to both the HNSW graph for search and a local map for payload lookup.

type HNSWParams ¶ added in v0.1.3

type HNSWParams struct {
	// M is the maximum number of connections per node
	// Default: 16
	M int

	// EfConstruction is the size of the dynamic candidate list during construction
	// Default: 200
	EfConstruction int

	// EfSearch is the size of the dynamic candidate list during search
	// Default: 64
	EfSearch int

	// K is the number of nearest neighbors to return
	// Default: 10
	K int
}

HNSWParams contains configurable parameters for the HNSW index. See https://github.com/coder/hnsw for more details on these parameters.

func DefaultHNSWParams ¶ added in v0.1.3

func DefaultHNSWParams() HNSWParams

DefaultHNSWParams returns the default HNSW parameters.

type MatchValue ¶

type MatchValue struct {
	Value any `json:"value"`
}

MatchValue holds the value to match against in a filter condition.

type Payload ¶

type Payload map[string]any

Payload mimics Qdrant's payload structure, storing metadata as a map of string keys to any values. It is used for filtering points based on their associated metadata.

type PointStruct ¶

type PointStruct struct {
	ID      string    `json:"id"`                // UUID or uint64 (using string for now)
	Version uint64    `json:"version"`           // Incremental or timestamp-based version
	Vector  []float32 `json:"vector"`            // The actual embeddings
	Payload Payload   `json:"payload,omitempty"` // Metadata for filtering
}

PointStruct represents a single vector data point, compatible with Qdrant's data model. It contains a unique identifier, the vector embedding, and optional metadata.

type Quantizer ¶ added in v0.1.3

type Quantizer interface {
	// Quantize compresses a float32 vector to a compressed representation
	Quantize(vector []float32) []byte

	// Dequantize decompresses a compressed representation back to float32
	Dequantize(data []byte) []float32

	// GetCompressedSize returns the size in bytes of a quantized vector
	GetCompressedSize(dim int) int
}

type RangeValue ¶ added in v0.1.3

type RangeValue struct {
	GT  any `json:"gt,omitempty"`  // Greater than
	GTE any `json:"gte,omitempty"` // Greater than or equal
	LT  any `json:"lt,omitempty"`  // Less than
	LTE any `json:"lte,omitempty"` // Less than or equal
}

RangeValue holds the bounds for range conditions.
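Because the bounds are typed as any, an evaluator has to normalize numeric types before comparing. The following is a sketch under the assumption that unset (nil) bounds are ignored and non-numeric values never match; rangeSketch, toFloat, and inRangeSketch are hypothetical names.

```go
package main

import "fmt"

// rangeSketch mirrors the documented RangeValue fields.
type rangeSketch struct {
	GT, GTE, LT, LTE any
}

// toFloat converts common numeric types to float64.
func toFloat(v any) (float64, bool) {
	switch n := v.(type) {
	case int:
		return float64(n), true
	case int64:
		return float64(n), true
	case float32:
		return float64(n), true
	case float64:
		return n, true
	}
	return 0, false
}

// inRangeSketch reports whether v satisfies every bound that is set.
func inRangeSketch(v any, r rangeSketch) bool {
	x, ok := toFloat(v)
	if !ok {
		return false
	}
	check := func(bound any, pass func(b float64) bool) bool {
		if bound == nil {
			return true // unset bound
		}
		b, ok := toFloat(bound)
		return ok && pass(b)
	}
	return check(r.GT, func(b float64) bool { return x > b }) &&
		check(r.GTE, func(b float64) bool { return x >= b }) &&
		check(r.LT, func(b float64) bool { return x < b }) &&
		check(r.LTE, func(b float64) bool { return x <= b })
}

func main() {
	r := rangeSketch{GTE: 5, LT: 10}
	fmt.Println(inRangeSketch(5, r), inRangeSketch(10, r), inRangeSketch(7.5, r))
}
```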

type SQ8Quantizer ¶ added in v0.1.3

type SQ8Quantizer struct{}

func NewSQ8Quantizer ¶ added in v0.1.3

func NewSQ8Quantizer() *SQ8Quantizer

NewSQ8Quantizer creates a new SQ8 quantizer.

func (*SQ8Quantizer) Dequantize ¶ added in v0.1.3

func (q *SQ8Quantizer) Dequantize(data []byte) []float32

Dequantize decompresses an 8-bit integer vector back to float32.

func (*SQ8Quantizer) GetCompressedSize ¶ added in v0.1.3

func (q *SQ8Quantizer) GetCompressedSize(dim int) int

GetCompressedSize returns the size in bytes of a quantized vector.

func (*SQ8Quantizer) Quantize ¶ added in v0.1.3

func (q *SQ8Quantizer) Quantize(vector []float32) []byte

Quantize compresses a float32 vector to 8-bit integers.
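For readers unfamiliar with 8-bit scalar quantization, here is one possible scheme. The byte layout (an 8-byte header holding the per-vector minimum and maximum, followed by one code per dimension) is an assumption chosen for the example; the package's real on-disk format may differ, and quantizeSketch and dequantizeSketch are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// headerSize is an assumption: min and max stored as two
// little-endian float32 values ahead of the codes.
const headerSize = 8

// quantizeSketch maps each component linearly onto 0..255.
func quantizeSketch(vec []float32) []byte {
	out := make([]byte, headerSize+len(vec))
	if len(vec) == 0 {
		return out
	}
	lo, hi := vec[0], vec[0]
	for _, v := range vec[1:] {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	binary.LittleEndian.PutUint32(out[0:4], math.Float32bits(lo))
	binary.LittleEndian.PutUint32(out[4:8], math.Float32bits(hi))
	span := hi - lo
	if span > 0 { // a constant vector keeps all codes at 0
		for i, v := range vec {
			out[headerSize+i] = byte(math.Round(float64((v - lo) / span * 255)))
		}
	}
	return out
}

// dequantizeSketch reverses the mapping; the result is approximate.
func dequantizeSketch(data []byte) []float32 {
	if len(data) < headerSize {
		return nil
	}
	lo := math.Float32frombits(binary.LittleEndian.Uint32(data[0:4]))
	hi := math.Float32frombits(binary.LittleEndian.Uint32(data[4:8]))
	out := make([]float32, len(data)-headerSize)
	for i, code := range data[headerSize:] {
		out[i] = lo + float32(code)/255*(hi-lo)
	}
	return out
}

func main() {
	vec := []float32{-1, 0, 1}
	q := quantizeSketch(vec)
	fmt.Println(len(q), dequantizeSketch(q))
}
```

Under this layout a vector shrinks from 4 bytes per dimension to 1 byte per dimension plus the fixed header, at the cost of a small reconstruction error.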

type ScoredPoint ¶

type ScoredPoint struct {
	ID      string  `json:"id"`
	Version uint64  `json:"version"`
	Score   float32 `json:"score"`
	Payload Payload `json:"payload,omitempty"`
}

ScoredPoint is returned by search queries, containing the distance score and point data. The Score field represents the computed similarity/distance based on the collection's metric.

type Storage ¶

type Storage struct {
	// contains filtered or unexported fields
}

Storage handles local persistence using BoltDB (bbolt). It provides durable storage for vector collections and their points.

func NewStorage ¶

func NewStorage(dbPath string) (*Storage, error)

NewStorage initializes a new BoltDB storage engine at the specified path. The database file will be created if it doesn't exist. Returns an error if the database cannot be opened.

func NewStorageWithQuantization ¶ added in v0.1.3

func NewStorageWithQuantization(dbPath string, useQuant bool, quantizer Quantizer) (*Storage, error)

NewStorageWithQuantization initializes a new BoltDB storage engine with optional vector quantization. If useQuant is true, vectors will be compressed using the provided quantizer. Returns an error if the database cannot be opened.

func (*Storage) Close ¶

func (s *Storage) Close() error

Close gracefully closes the database connection. It is important to call this when done to ensure all data is flushed to disk. This method is idempotent and can be called multiple times.
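One common way to get an idempotent Close in Go is sync.Once. The sketch below shows the pattern with a hypothetical onceCloser type; it is not taken from the package source, which may achieve idempotence differently.

```go
package main

import (
	"fmt"
	"sync"
)

// onceCloser is a hypothetical wrapper that runs closeFn at most once
// and returns the same result on every later call.
type onceCloser struct {
	once    sync.Once
	err     error
	closeFn func() error
}

// Close is safe to call repeatedly and from multiple goroutines.
func (c *onceCloser) Close() error {
	c.once.Do(func() { c.err = c.closeFn() })
	return c.err
}

func main() {
	calls := 0
	c := &onceCloser{closeFn: func() error { calls++; return nil }}
	_ = c.Close()
	_ = c.Close()
	fmt.Println(calls) // prints 1
}
```

In application code the usual shape is to call Close in a defer right after the storage is opened successfully.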

func (*Storage) DeletePoints ¶

func (s *Storage) DeletePoints(colName string, ids []string) error

DeletePoints deletes a batch of points from disk by their IDs. If a point ID doesn't exist, it is silently skipped. Returns an error if the collection doesn't exist.

func (*Storage) DropCollection ¶ added in v0.1.4

func (s *Storage) DropCollection(name string) error

DropCollection removes a collection and its metadata from the storage.

func (*Storage) EnsureCollection ¶

func (s *Storage) EnsureCollection(colName string) error

EnsureCollection creates a bucket for the collection if it doesn't already exist. Each collection is stored as a separate bucket in BoltDB.

func (*Storage) ListCollectionMetas ¶ added in v0.1.3

func (s *Storage) ListCollectionMetas() ([]CollectionMeta, error)

ListCollectionMetas returns all collection metadata stored in the database.

func (*Storage) ListCollections ¶ added in v0.1.3

func (s *Storage) ListCollections() ([]string, error)

ListCollections returns all collection names (bucket names) in the storage. It excludes the internal metadata bucket.

func (*Storage) LoadCollection ¶

func (s *Storage) LoadCollection(colName string) (map[string]*PointStruct, error)

LoadCollection loads all points for a collection from disk into memory. Returns a map of point IDs to PointStruct pointers. If quantization is enabled, compressed vectors are decompressed during loading. If the collection doesn't exist, returns an empty map without error.

func (*Storage) LoadCollectionMeta ¶ added in v0.1.3

func (s *Storage) LoadCollectionMeta(name string) (*CollectionMeta, error)

LoadCollectionMeta retrieves collection metadata from the special bucket.

func (*Storage) SaveCollectionMeta ¶ added in v0.1.3

func (s *Storage) SaveCollectionMeta(name string, meta CollectionMeta) error

SaveCollectionMeta persists collection metadata to a special bucket.

func (*Storage) UpsertPoints ¶

func (s *Storage) UpsertPoints(colName string, points []PointStruct) error

UpsertPoints saves or updates a batch of points to disk. Points are serialized as Protocol Buffers and stored with their ID as the key. If quantization is enabled, vectors are compressed before storage. Returns an error if the collection doesn't exist or if serialization fails.

type VectorIndex ¶

type VectorIndex interface {
	// Upsert adds or updates points in the index.
	// If a point with the same ID already exists, it will be overwritten.
	Upsert(points []PointStruct) error

	// Search finds the nearest neighbors to the query vector, optionally applying a filter.
	// Returns up to topK results sorted by relevance.
	Search(query []float32, filter *Filter, topK int) ([]ScoredPoint, error)

	// Delete removes a point from the index by its ID.
	// Returns an error if the point does not exist.
	Delete(id string) error

	// Count returns the number of vectors currently stored in the index.
	Count() int

	// GetIDsByFilter returns all point IDs that match the given filter.
	GetIDsByFilter(filter *Filter) []string

	// GetPointsByFilter returns all points (with full payload) that match the given filter.
	GetPointsByFilter(filter *Filter) []PointStruct

	// DeleteByFilter removes all points that match the given filter and returns their IDs.
	DeleteByFilter(filter *Filter) ([]string, error)
}

VectorIndex defines the standard interface for vector search engines. This allows transparent switching between Flat (brute-force) and HNSW (approximate) strategies.
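The switching described above is ordinary interface polymorphism. The sketch below uses a reduced, hypothetical interface (miniIndex) and two stub implementations to show how a boolean such as useHNSW can select the strategy while callers stay unchanged; the stubs do not implement real search.

```go
package main

import "fmt"

// miniIndex is a reduced, hypothetical version of VectorIndex.
type miniIndex interface {
	Add(id string, vec []float32)
	Count() int
	Kind() string
}

// exactStub stands in for a brute-force index.
type exactStub struct{ vecs map[string][]float32 }

func (e *exactStub) Add(id string, vec []float32) { e.vecs[id] = vec }
func (e *exactStub) Count() int                   { return len(e.vecs) }
func (e *exactStub) Kind() string                 { return "flat" }

// approxStub stands in for a graph-based index; no graph is built here.
type approxStub struct{ vecs map[string][]float32 }

func (a *approxStub) Add(id string, vec []float32) { a.vecs[id] = vec }
func (a *approxStub) Count() int                   { return len(a.vecs) }
func (a *approxStub) Kind() string                 { return "hnsw" }

// newIndexSketch picks an implementation behind the shared interface.
func newIndexSketch(useHNSW bool) miniIndex {
	if useHNSW {
		return &approxStub{vecs: map[string][]float32{}}
	}
	return &exactStub{vecs: map[string][]float32{}}
}

func main() {
	for _, useHNSW := range []bool{false, true} {
		idx := newIndexSketch(useHNSW)
		idx.Add("a", []float32{1, 2, 3})
		fmt.Println(idx.Kind(), idx.Count())
	}
}
```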

Directories ¶

Path	Synopsis
proto
