vectorcodec

package
v0.0.0-...-f8c6edd Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 4, 2026 License: Apache-2.0 Imports: 3 Imported by: 0

Documentation

Overview

Package vectorcodec is the on-disk byte codec for HNSW vector columns, wire-compatible with Java's RealVector.fromBytes / VectorType. It is a leaf package (stdlib only) so it can be shared by the record-layer maintainer and the Cascades values package without an import cycle — the latter needs it to decode a stored VECTOR column for a row-by-row distance expression.

Format: byte 0 = VectorType ordinal, bytes 1.. = big-endian IEEE-754 payload.

0 = HALF   (16-bit, 2 bytes/component)
1 = SINGLE (32-bit, 4 bytes/component)
2 = DOUBLE (64-bit, 8 bytes/component)
3 = RABITQ (quantized — not decodable here; use the quantizer)

Index

Constants

View Source
const (
	TypeHalf   = typeHalf
	TypeSingle = typeSingle
	TypeDouble = typeDouble
)

Type ordinals re-exported for callers that read components directly.

Variables

This section is empty.

Functions

func Deserialize

func Deserialize(data []byte) ([]float64, error)

func HalfToFloat32

func HalfToFloat32(h uint16) float32

HalfToFloat32 converts an IEEE-754 half-precision (16-bit) value to float32. Exported for zero-alloc readers that decode HALF payloads component by component (see Payload).

func Payload

func Payload(data []byte) (typeOrdinal byte, payload []byte, stride int, ok bool)

Deserialize decodes a stored vector's bytes into float64 components. The precision is self-describing (byte 0), so no external type info is needed. RaBitQ-quantized vectors are not decodable here (they require the quantizer) and return an error.

Payload exposes a stored vector's raw IEEE-754 payload for zero-allocation, component-at-a-time reads (e.g. computing a distance without materializing a []float64). It returns the type ordinal, the payload slice (sans the leading type byte), the number of bytes per component, and ok=false when the data is empty or RaBitQ-quantized (which must go through the VectorQuantizer instead).

func Serialize

func Serialize(vec []float64) []byte

Serialize encodes a float64 vector into the on-disk DOUBLE byte format the HNSW vector index reads (Java RealVector.fromBytes, VectorType.DOUBLE).

func SerializeHalf

func SerializeHalf(vec []float64) []byte

SerializeHalf encodes a float64 vector into the HALF on-disk format (byte 0 = VectorType.HALF, then 2 big-endian bytes per component). Values are rounded to nearest-even half precision; magnitudes beyond half range become ±Inf, exactly as a Java HalfRealVector would store them. The SPFresh index (RFC-094) uses this for centroid/sidecar/staging vector fields — a raw fixed-width layout with no tuple escaping.

Types

This section is empty.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL