turboquant

package module

v0.0.2 Published: Apr 1, 2026 License: MIT Imports: 9 Imported by: 0

Repository

github.com/mredencom/turboquant

README ¶

TurboQuant

A Go library implementing the TurboQuant online vector quantization algorithm (arXiv:2504.19874). It compresses float32 vectors to 2/3/4-bit representations using random orthogonal rotation and Lloyd-Max scalar quantization on the Beta distribution — no training data required.

Features

2-bit, 3-bit, and 4-bit quantization
Data-oblivious: works without training data
Concurrent batch quantize/dequantize via goroutines
Compact binary serialization (bit-packed)
Codebook auto-caching (thread-safe)
Deterministic: same seed → same results

Install

go get github.com/mredencom/turboquant@latest

Quick Start

package main

import (
    "fmt"
    "github.com/mredencom/turboquant"
)

func main() {
    // Create a 4-bit quantizer for 128-dimensional vectors
    tq, err := turboquant.NewTurboQuant(128, turboquant.Bit4, 42)
    if err != nil {
        panic(err)
    }

    // Quantize
    vec := make([]float32, 128)
    for i := range vec {
        vec[i] = float32(i) * 0.01
    }
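    qv, err := tq.Quantize(vec)
    if err != nil {
        panic(err)
    }

    // Serialize → Deserialize
    data, err := tq.Serialize(qv)
    if err != nil {
        panic(err)
    }
    qv2, err := tq.Deserialize(data)
    if err != nil {
        panic(err)
    }

    // Dequantize
    restored, err := tq.Dequantize(qv2)
    if err != nil {
        panic(err)
    }

    // Check quality
    sim, err := turboquant.CosineSimilarity(vec, restored)
    if err != nil {
        panic(err)
    }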
    fmt.Printf("Cosine similarity: %.4f\n", sim)
    fmt.Printf("Compression ratio: %.1fx\n", tq.CompressionRatio())
}

API

Function or method	Description
`NewTurboQuant(dimension, bitWidth, seed)`	Create a quantizer instance
`Quantize(vec)`	Quantize a single float32 vector
`Dequantize(qv)`	Reconstruct a float32 vector
`QuantizeBatch(vecs)`	Batch quantize (concurrent)
`DequantizeBatch(qvs)`	Batch dequantize (concurrent)
`Serialize(qv)`	Serialize to compact binary
`Deserialize(data)`	Deserialize from binary
`CompressionRatio()`	Get theoretical compression ratio
`CosineSimilarity(a, b)`	Compute cosine similarity between two vectors

How It Works

1. Compute the L2 norm and normalize the input vector
2. Apply a random orthogonal rotation (generated via QR decomposition)
3. Quantize each rotated coordinate using a Lloyd-Max codebook optimized for the Beta distribution
4. Store the norm (float32) + quantized indices (bit-packed)

Dequantization reverses the process: look up centroids → inverse rotation → scale by norm.

Quantization Pipeline

flowchart LR
    A[Input vector x] --> B[Compute L2 norm]
    B --> C[Normalize to unit sphere]
    C --> D[Apply orthogonal rotation R·x̂]
    D --> E[Per-coordinate Lloyd-Max quantize]
    E --> F[QuantizedVector: norm + indices]

Module Dependencies

graph TD
    API[turboquant.go<br/>Public API] --> CB[codebook.go<br/>Codebook build & cache]
    API --> ROT[rotation.go<br/>Rotation matrix]
    API --> Q[quantize.go<br/>Quantize / Dequantize]
    Q --> CB
    Q --> ROT
    API --> SER[serialize.go<br/>Serialization]
    CB --> MU[math_utils.go<br/>Math utilities]
    API --> MU

Project Structure

turboquant.go      Public API: NewTurboQuant, Quantize, Dequantize, Batch, Serialize
codebook.go        Lloyd-Max codebook builder with cache
rotation.go        Random orthogonal matrix via QR decomposition
quantize.go        Core quantize/dequantize logic
serialize.go       Bit-packed binary serialization
math_utils.go      Beta PDF, cosine similarity, compression ratio
convert.go         Type conversion helpers (float64, int, byte, string → float32)

Testing

go test -v ./...

The suite contains 50 tests, including property-based tests for these correctness properties:

Codebook centroid count = 2^bitWidth
Rotation matrix orthogonality (R^T·R ≈ I)
Rotation reproducibility (same seed → same matrix)
Quantize-dequantize cosine similarity thresholds
Serialization round-trip consistency

Benchmarks

Measured on Apple M4 (darwin/arm64), Go 1.24, pure Go BLAS backend.

Run benchmarks yourself:

go test -bench=BenchmarkQuantize -benchmem -benchtime=1s -run='^$' .
go test -bench=BenchmarkDequantize -benchmem -benchtime=1s -run='^$' .

Single-Vector Quantize

Dimension	2-bit	3-bit	4-bit
128	33.8 µs	32.4 µs	35.4 µs
256	183 µs	161 µs	161 µs
512	709 µs	678 µs	742 µs
1024	3.26 ms	3.36 ms	4.42 ms

Single-Vector Dequantize

Dimension	2-bit	3-bit	4-bit
128	21.6 µs	17.6 µs	18.3 µs
256	71.7 µs	68.4 µs	61.7 µs
512	304 µs	297 µs	263 µs
1024	1.30 ms	1.47 ms	1.23 ms

Batch Quantize (dim=256, 4-bit)

Batch Size	Time	Allocs
100	6.21 ms	907
1,000	40.1 ms	9,034
10,000	387 ms	90,079

Serialize / Deserialize (dim=256)

Operation	2-bit	3-bit	4-bit
Serialize	322 ns	1.03 µs	445 ns
Deserialize	770 ns	1.09 µs	721 ns

Performance Tuning

By default, TurboQuant uses gonum's pure Go BLAS backend — no CGO or system libraries required. For large dimensions (≥ 512), you can link OpenBLAS or Intel MKL for 2–6× faster matrix-vector operations:

import _ "gonum.org/v1/gonum/blas/cgo"  // activate native BLAS

# macOS
brew install openblas
CGO_ENABLED=1 go build ./...

# Linux
sudo apt-get install libopenblas-dev
CGO_ENABLED=1 go build ./...

For dimensions ≤ 256, the pure Go backend is typically fast enough. See docs/BLAS.md for full benchmarks, MKL setup, and detailed guidance.

License

This project is licensed under the MIT License.

Documentation ¶

Overview ¶

Package turboquant implements the TurboQuant online vector quantization algorithm from "TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate" (arXiv:2504.19874) by Google Research.

TurboQuant compresses float32 vectors into compact low-bit representations using a two-step approach: random orthogonal rotation followed by Lloyd-Max scalar quantization on a Beta distribution. The algorithm is data-oblivious, meaning no training data is needed — codebooks are derived analytically from the statistical properties of uniformly distributed unit-sphere vectors.

Algorithm ¶

After rotating a normalized vector by a random orthogonal matrix, each coordinate lies in (-1, 1) and, once rescaled to (0, 1), is approximately distributed as Beta((d-1)/2, (d-1)/2), where d is the vector dimension. A Lloyd-Max optimal scalar quantizer is pre-computed on this distribution, and each rotated coordinate is independently quantized by looking up the nearest centroid. Dequantization reverses the process: centroid lookup, inverse rotation (transpose), and norm rescaling.
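
For reference, the Beta shape parameters follow from a standard result on the marginal density of a single coordinate of a point drawn uniformly from the unit sphere in R^d. The sketch below assumes an exactly uniform point; the symbols f, g, x, and t are introduced here for illustration only.

```latex
% Marginal density of one coordinate x of a uniform point on S^{d-1}:
f(x) = \frac{\Gamma(d/2)}{\sqrt{\pi}\,\Gamma\!\left(\frac{d-1}{2}\right)}
       \left(1 - x^{2}\right)^{\frac{d-3}{2}}, \qquad x \in (-1, 1).

% Substitute x = 2t - 1 with t \in (0, 1), so that 1 - x^{2} = 4t(1 - t):
g(t) \;\propto\; t^{\frac{d-3}{2}} (1 - t)^{\frac{d-3}{2}}
     \;=\; t^{\frac{d-1}{2} - 1} (1 - t)^{\frac{d-1}{2} - 1},

% which is the kernel of a Beta density with equal shape parameters:
t = \frac{x + 1}{2} \;\sim\; \mathrm{Beta}\!\left(\tfrac{d-1}{2}, \tfrac{d-1}{2}\right).
```

As d grows, this density concentrates around x = 0 with standard deviation close to 1/sqrt(d), which is why a small fixed codebook per dimension can cover the rotated coordinates.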

Supported Bit Widths ¶

The SDK supports 2-bit, 3-bit, and 4-bit quantization. Higher bit widths yield better reconstruction quality (higher cosine similarity) at the cost of larger compressed size. For vectors with dimension ≥ 64, typical cosine similarities are ≥ 0.90 (2-bit), ≥ 0.96 (3-bit), and ≥ 0.99 (4-bit).

Usage ¶

The main entry point is NewTurboQuant, which builds (or retrieves from cache) the Lloyd-Max codebook and generates the rotation matrix:

tq, err := turboquant.NewTurboQuant(128, 4, 42) // dim=128, 4-bit, seed=42
if err != nil {
    log.Fatal(err)
}

// Quantize a vector
qv, err := tq.Quantize(vec)

// Serialize to compact binary
data, err := tq.Serialize(qv)

// Deserialize and dequantize
qv2, err := tq.Deserialize(data)
restored, err := tq.Dequantize(qv2)

Batch operations (TurboQuant.QuantizeBatch, TurboQuant.DequantizeBatch) process multiple vectors concurrently using goroutines.

Codebooks are cached globally by (dimension, bitWidth), so creating multiple TurboQuant instances with the same parameters reuses the codebook.

Index ¶

Constants
func BetaPDF(x, alpha, beta float64) float64
func BytesToFloat32s(src []byte) []float32
func CompressionRatio(dimension, bitWidth int) float64
func CosineSimilarity(a, b []float32) (float64, error)
func Float32sToBytes(src []float32) []byte
func Float32sToFloat64s(src []float32) []float64
func Float32sToInts(src []float32) []int
func Float32sToString(src []float32) string
func Float64sToFloat32s(src []float64) []float32
func IntsToFloat32s(src []int) []float32
func ResetCodebookCache()
func SerializeQuantizedVector(qv *QuantizedVector, bitWidth int) ([]byte, error)
func SerializeQuantizedVectorTo(qv *QuantizedVector, bitWidth int, w io.Writer) error
func StringToFloat32s(s string) []float32
func ValidateBitWidth(bitWidth int) error
func ValidateDimension(dimension int) error
type Codebook
- func GetOrBuildCodebook(dimension, bitWidth int) (*Codebook, error)
- func (c *Codebook) FindNearestIndex(value float64) uint8
type CodebookBuilder
- func NewCodebookBuilder() *CodebookBuilder
- func (cb *CodebookBuilder) Build(dimension, bitWidth int) (*Codebook, error)
type Matrix
- func NewRandomOrthogonalMatrix(dimension int, seed int64) (*Matrix, error)
- func (m *Matrix) Apply(vec []float64) []float64
- func (m *Matrix) ApplyInto(vec, dst []float64)
- func (m *Matrix) ApplyTranspose(vec []float64) []float64
- func (m *Matrix) ApplyTransposeInto(vec, dst []float64)
type Option
- func WithConcurrency(n int) Option
- func WithGridPoints(n int) Option
- func WithIterations(n int) Option
type QuantizedVector
- func DeserializeQuantizedVector(data []byte, bitWidth, dimension int) (*QuantizedVector, error)
- func DeserializeQuantizedVectorFrom(r io.Reader, bitWidth, dimension int) (*QuantizedVector, error)
type TurboQuant
- func NewTurboQuant(dimension, bitWidth int, seed int64, opts ...Option) (*TurboQuant, error)
- func (tq *TurboQuant) BitWidth() int
- func (tq *TurboQuant) CompressionRatio() float64
- func (tq *TurboQuant) Concurrency() int
- func (tq *TurboQuant) Dequantize(qv *QuantizedVector) ([]float32, error)
- func (tq *TurboQuant) DequantizeBatch(qvs []*QuantizedVector) ([][]float32, error)
- func (tq *TurboQuant) DequantizeBatchFloat64(qvs []*QuantizedVector) ([][]float64, error)
- func (tq *TurboQuant) DequantizeFloat64(qv *QuantizedVector) ([]float64, error)
- func (tq *TurboQuant) Deserialize(data []byte) (*QuantizedVector, error)
- func (tq *TurboQuant) DeserializeBatchFrom(r io.Reader) ([]*QuantizedVector, error)
- func (tq *TurboQuant) DeserializeFrom(r io.Reader) (*QuantizedVector, error)
- func (tq *TurboQuant) Dimension() int
- func (tq *TurboQuant) Quantize(vec []float32) (*QuantizedVector, error)
- func (tq *TurboQuant) QuantizeBatch(vecs [][]float32) ([]*QuantizedVector, error)
- func (tq *TurboQuant) QuantizeBatchFloat64(vecs [][]float64) ([]*QuantizedVector, error)
- func (tq *TurboQuant) QuantizeFloat64(vec []float64) (*QuantizedVector, error)
- func (tq *TurboQuant) Serialize(qv *QuantizedVector) ([]byte, error)
- func (tq *TurboQuant) SerializeBatchTo(qvs []*QuantizedVector, w io.Writer) error
- func (tq *TurboQuant) SerializeTo(qv *QuantizedVector, w io.Writer) error

Constants ¶

const (
	Bit2 = 2
	Bit3 = 3
	Bit4 = 4
)

BitWidth constants for supported quantization bit widths.

Variables ¶

This section is empty.

Functions ¶

func BetaPDF ¶

func BetaPDF(x, alpha, beta float64) float64

BetaPDF computes the probability density of the Beta(alpha, beta) distribution at x. Uses log-space computation via math.Lgamma to avoid numerical overflow. Returns 0.0 for x outside the open interval (0, 1).
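
The log-space evaluation described above can be sketched with the standard library alone. The function name betaPDF below is a hypothetical stand-in written for this illustration and is not the package's actual source; it only follows the behavior stated in the description.

```go
package main

import (
	"fmt"
	"math"
)

// betaPDF is an illustrative stand-in (not the package's code): it evaluates
// the Beta(alpha, beta) density at x by summing logarithms and exponentiating
// once at the end, which avoids overflow in the Gamma terms.
func betaPDF(x, alpha, beta float64) float64 {
	if x <= 0 || x >= 1 {
		return 0 // outside the open interval (0, 1)
	}
	lgA, _ := math.Lgamma(alpha)
	lgB, _ := math.Lgamma(beta)
	lgAB, _ := math.Lgamma(alpha + beta)
	logPDF := (alpha-1)*math.Log(x) + (beta-1)*math.Log(1-x) + lgAB - lgA - lgB
	return math.Exp(logPDF)
}

func main() {
	// Beta(2, 2) has density 6x(1-x), which equals 1.5 at x = 0.5.
	fmt.Printf("%.4f\n", betaPDF(0.5, 2, 2))
	// For dimension d = 128 the shape parameter is (d-1)/2 = 63.5.
	fmt.Printf("%.4f\n", betaPDF(0.5, 63.5, 63.5))
}
```

Working in log space matters for large dimensions, where the individual Gamma values would exceed the float64 range even though the final density is moderate.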

func BytesToFloat32s ¶

func BytesToFloat32s(src []byte) []float32

BytesToFloat32s converts a []byte slice to []float32. Each byte value (0-255) is stored as a float32.

func CompressionRatio ¶

func CompressionRatio(dimension, bitWidth int) float64

CompressionRatio computes the theoretical compression ratio for a given dimension and bit width.

Formula: (dimension * 32) / (32 + dimension * bitWidth)
Original size: dimension * 32 bits (one float32 per element).
Compressed size: 32 bits (float32 norm) + dimension * bitWidth bits.
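
As a worked instance of the formula, the following standalone sketch recomputes the ratio for a few configurations. The lowercase compressionRatio is a local helper written for this illustration, not the package function.

```go
package main

import "fmt"

// compressionRatio mirrors the documented formula:
// (dimension * 32) / (32 + dimension * bitWidth).
func compressionRatio(dimension, bitWidth int) float64 {
	originalBits := float64(dimension * 32)            // one float32 per element
	compressedBits := float64(32 + dimension*bitWidth) // float32 norm + packed indices
	return originalBits / compressedBits
}

func main() {
	for _, bits := range []int{2, 3, 4} {
		fmt.Printf("dim=128, %d-bit: %.2fx\n", bits, compressionRatio(128, bits))
	}
	// dim=8, 4-bit: 256 / 64 = 4.00, matching the NewTurboQuant example output.
	fmt.Printf("dim=8, 4-bit: %.2fx\n", compressionRatio(8, 4))
}
```

Because the 32-bit norm is a fixed overhead, the ratio approaches 32 / bitWidth as the dimension grows.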

func CosineSimilarity ¶

func CosineSimilarity(a, b []float32) (float64, error)

CosineSimilarity computes the cosine similarity between two float32 vectors. Returns a float64 value in [-1, 1]. Returns an error if the vectors have different dimensions. Returns 0.0 if either vector is a zero vector.

Example ¶

ExampleCosineSimilarity demonstrates computing cosine similarity between an original vector and its quantized-then-dequantized version.

package main

import (
	"fmt"
	"log"

	"github.com/mredencom/turboquant"
)

func main() {
	tq, err := turboquant.NewTurboQuant(8, 4, 42)
	if err != nil {
		log.Fatal(err)
	}

	vec := []float32{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0}
	qv, err := tq.Quantize(vec)
	if err != nil {
		log.Fatal(err)
	}
	restored, err := tq.Dequantize(qv)
	if err != nil {
		log.Fatal(err)
	}

	sim, err := turboquant.CosineSimilarity(vec, restored)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Cosine similarity: %.4f\n", sim)
}

Output:
Cosine similarity: 0.9979

func Float32sToBytes ¶

func Float32sToBytes(src []float32) []byte

Float32sToBytes converts a []float32 slice back to []byte by rounding and clamping to [0, 255].

func Float32sToFloat64s ¶

func Float32sToFloat64s(src []float32) []float64

Float32sToFloat64s converts a []float32 slice to []float64.

func Float32sToInts ¶

func Float32sToInts(src []float32) []int

Float32sToInts converts a []float32 slice back to []int by rounding.

func Float32sToString ¶

func Float32sToString(src []float32) string

Float32sToString converts a []float32 slice back to a string by rounding each value to a byte.

func Float64sToFloat32s ¶

func Float64sToFloat32s(src []float64) []float32

Float64sToFloat32s converts a []float64 slice to []float32.

func IntsToFloat32s ¶

func IntsToFloat32s(src []int) []float32

IntsToFloat32s converts a []int slice to []float32.

func ResetCodebookCache ¶

func ResetCodebookCache()

ResetCodebookCache clears the global codebook cache. Intended for testing.

func SerializeQuantizedVector ¶

func SerializeQuantizedVector(qv *QuantizedVector, bitWidth int) ([]byte, error)

SerializeQuantizedVector serializes a QuantizedVector into a compact binary format.

Format: [4 bytes float32 norm (little-endian)][bit-packed indices]

Packing rules:

2-bit: 4 indices per byte, low bits first
3-bit: bitstream, indices packed continuously across byte boundaries
4-bit: 2 indices per byte, low nibble first

func SerializeQuantizedVectorTo ¶ added in v0.0.2

func SerializeQuantizedVectorTo(qv *QuantizedVector, bitWidth int, w io.Writer) error

SerializeQuantizedVectorTo writes a QuantizedVector directly to an io.Writer using the same binary format as SerializeQuantizedVector.

func StringToFloat32s ¶

func StringToFloat32s(s string) []float32

StringToFloat32s converts a string to []float32 by treating each byte as a float32 value. This is a raw byte-level conversion, not a semantic embedding.

func ValidateBitWidth ¶

func ValidateBitWidth(bitWidth int) error

ValidateBitWidth returns an error if bitWidth is not one of the supported values (2, 3, or 4).

func ValidateDimension ¶

func ValidateDimension(dimension int) error

ValidateDimension returns an error if dimension is less than 2.

Types ¶

type Codebook ¶

type Codebook struct {
	Centroids  []float64 // 2^BitWidth centroids, sorted ascending
	Boundaries []float64 // 2^BitWidth - 1 partition boundaries
	BitWidth   int
}

Codebook contains the centroids and partition boundaries of a Lloyd-Max quantizer.

func GetOrBuildCodebook ¶

func GetOrBuildCodebook(dimension, bitWidth int) (*Codebook, error)

GetOrBuildCodebook returns a cached Codebook for the given parameters, or builds a new one using NewCodebookBuilder().Build and caches it. Thread-safe via sync.Map.
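
A get-or-build cache keyed by (dimension, bitWidth) on top of sync.Map might look like the sketch below. All names here (cacheKey, codebook, build, getOrBuild) are hypothetical, and the build step is a placeholder rather than a real Lloyd-Max run.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type cacheKey struct{ dimension, bitWidth int }

type codebook struct{ centroids []float64 }

var (
	cache  sync.Map // cacheKey -> *codebook
	builds int64    // number of times the (placeholder) build ran
)

// build stands in for the expensive codebook construction.
func build(dimension, bitWidth int) *codebook {
	atomic.AddInt64(&builds, 1)
	return &codebook{centroids: make([]float64, 1<<uint(bitWidth))}
}

// getOrBuild returns the cached codebook for the key, building it on a miss.
func getOrBuild(dimension, bitWidth int) *codebook {
	key := cacheKey{dimension, bitWidth}
	if v, ok := cache.Load(key); ok {
		return v.(*codebook)
	}
	// If two goroutines miss at the same time, both may build, but
	// LoadOrStore guarantees that every caller sees the same stored value.
	v, _ := cache.LoadOrStore(key, build(dimension, bitWidth))
	return v.(*codebook)
}

func main() {
	a := getOrBuild(128, 4)
	b := getOrBuild(128, 4)
	fmt.Println(a == b, atomic.LoadInt64(&builds)) // true 1
}
```

The trade-off in this sketch is that a concurrent first access can duplicate the build work; a sync.Once per key would avoid that at the cost of more bookkeeping.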

func (*Codebook) FindNearestIndex ¶

func (c *Codebook) FindNearestIndex(value float64) uint8

FindNearestIndex finds the index of the nearest centroid for the given value using binary search on the partition boundaries. The boundaries divide the real line into intervals, each mapped to a centroid index. With n boundaries b[0], ..., b[n-1], the interval mapping is: (-inf, b[0]) -> 0, [b[0], b[1]) -> 1, ..., [b[n-1], +inf) -> n
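
The interval mapping amounts to counting how many boundaries are less than or equal to the value. A minimal sketch using sort.Search follows; findIndex and the boundary values are made up for illustration and are not taken from a real codebook.

```go
package main

import (
	"fmt"
	"sort"
)

// findIndex returns the number of boundaries <= value, which is the index of
// the half-open cell [b[i-1], b[i]) that contains the value.
func findIndex(boundaries []float64, value float64) uint8 {
	i := sort.Search(len(boundaries), func(i int) bool {
		return boundaries[i] > value
	})
	return uint8(i)
}

func main() {
	boundaries := []float64{-0.5, 0, 0.5} // 3 boundaries -> 4 cells (2-bit)
	for _, v := range []float64{-0.9, -0.5, 0.2, 0.7} {
		fmt.Printf("%5.2f -> %d\n", v, findIndex(boundaries, v))
	}
}
```

A value equal to a boundary falls into the upper cell, which matches the closed-left intervals in the description.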

type CodebookBuilder ¶

type CodebookBuilder struct {
	// contains filtered or unexported fields
}

CodebookBuilder constructs a Codebook by running Lloyd-Max optimization on the Beta distribution derived from the vector dimension.

func NewCodebookBuilder ¶

func NewCodebookBuilder() *CodebookBuilder

NewCodebookBuilder returns a CodebookBuilder with default parameters (gridPoints=50000, iterations=300).

func (*CodebookBuilder) Build ¶

func (cb *CodebookBuilder) Build(dimension, bitWidth int) (*Codebook, error)

Build constructs a Codebook for the given dimension and bitWidth using Lloyd-Max optimization on the Beta((d-1)/2, (d-1)/2) distribution.

The Beta distribution is defined on (0,1) and mapped to (-1,1) via x_mapped = 2*x - 1. Returns an error if bitWidth is not 2, 3, or 4, or if dimension < 2.

type Matrix ¶

type Matrix struct {
	// contains filtered or unexported fields
}

Matrix represents a dense matrix, internally using gonum/mat.Dense.

func NewRandomOrthogonalMatrix ¶

func NewRandomOrthogonalMatrix(dimension int, seed int64) (*Matrix, error)

NewRandomOrthogonalMatrix generates a random orthogonal matrix. Obtained by QR decomposition of a random Gaussian matrix. Same seed produces the same matrix. Returns an error if dimension < 2.
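
The idea can be sketched without gonum by orthonormalizing a seeded Gaussian matrix. Note the substitution: this example uses modified Gram-Schmidt on the rows rather than a library QR routine. In exact arithmetic Gram-Schmidt produces the same orthonormal factor as QR up to signs, but it is less numerically robust, so treat it as an illustration only. The names randomOrthogonal, dot, and maxOrthoError are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

func dot(a, b []float64) float64 {
	var s float64
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// randomOrthogonal fills an n x n matrix with seeded Gaussian samples and
// orthonormalizes its rows with modified Gram-Schmidt.
func randomOrthogonal(n int, seed int64) [][]float64 {
	rng := rand.New(rand.NewSource(seed))
	m := make([][]float64, n)
	for i := range m {
		m[i] = make([]float64, n)
		for j := range m[i] {
			m[i][j] = rng.NormFloat64()
		}
	}
	for i := range m {
		// Remove the components along rows that are already orthonormal.
		for k := 0; k < i; k++ {
			d := dot(m[i], m[k])
			for j := range m[i] {
				m[i][j] -= d * m[k][j]
			}
		}
		norm := math.Sqrt(dot(m[i], m[i]))
		for j := range m[i] {
			m[i][j] /= norm
		}
	}
	return m
}

// maxOrthoError returns the largest absolute entry of M·M^T - I.
func maxOrthoError(m [][]float64) float64 {
	var worst float64
	for i := range m {
		for j := range m {
			want := 0.0
			if i == j {
				want = 1.0
			}
			if e := math.Abs(dot(m[i], m[j]) - want); e > worst {
				worst = e
			}
		}
	}
	return worst
}

func main() {
	m := randomOrthogonal(16, 42)
	fmt.Println(maxOrthoError(m) < 1e-9)
}
```

Since the matrix is orthogonal, its inverse is its transpose, which is why dequantization only needs a transposed multiply.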

func (*Matrix) Apply ¶

func (m *Matrix) Apply(vec []float64) []float64

Apply multiplies the matrix by a vector: result = M * vec.

func (*Matrix) ApplyInto ¶ added in v0.0.2

func (m *Matrix) ApplyInto(vec, dst []float64)

ApplyInto multiplies the matrix by a vector, writing the result into dst. dst must have length >= m.dim.

func (*Matrix) ApplyTranspose ¶

func (m *Matrix) ApplyTranspose(vec []float64) []float64

ApplyTranspose multiplies the matrix transpose by a vector: result = M^T * vec.

func (*Matrix) ApplyTransposeInto ¶ added in v0.0.2

func (m *Matrix) ApplyTransposeInto(vec, dst []float64)

ApplyTransposeInto multiplies the matrix transpose by a vector, writing the result into dst. dst must have length >= m.dim.

type Option ¶ added in v0.0.2

type Option func(*options)

Option is a functional option for configuring NewTurboQuant.

func WithConcurrency ¶ added in v0.0.2

func WithConcurrency(n int) Option

WithConcurrency sets the maximum number of concurrent goroutines used by QuantizeBatch and DequantizeBatch. Values less than 1, including the default of 0, resolve to runtime.NumCPU().

func WithGridPoints ¶ added in v0.0.2

func WithGridPoints(n int) Option

WithGridPoints sets the number of grid points for numerical integration in the Lloyd-Max codebook builder. Default is 50000.

func WithIterations ¶ added in v0.0.2

func WithIterations(n int) Option

WithIterations sets the number of Lloyd-Max iterations for codebook construction. Default is 300.

type QuantizedVector ¶

type QuantizedVector struct {
	Norm    float32 // L2 norm of the original vector
	Indices []uint8 // Quantization index for each coordinate
}

QuantizedVector represents a quantized vector, containing the original L2 norm and an array of quantization indices.

func DeserializeQuantizedVector ¶

func DeserializeQuantizedVector(data []byte, bitWidth, dimension int) (*QuantizedVector, error)

DeserializeQuantizedVector deserializes a compact binary byte slice back into a QuantizedVector.

Format: [4 bytes float32 norm (little-endian)][bit-packed indices]

Returns a format error if the byte slice length does not match the expected size.

func DeserializeQuantizedVectorFrom ¶ added in v0.0.2

func DeserializeQuantizedVectorFrom(r io.Reader, bitWidth, dimension int) (*QuantizedVector, error)

DeserializeQuantizedVectorFrom reads and deserializes a QuantizedVector from an io.Reader. It uses the same binary format as DeserializeQuantizedVector.

type TurboQuant ¶

type TurboQuant struct {
	// contains filtered or unexported fields
}

TurboQuant is the core entry point of the SDK, encapsulating all quantization functionality.

func NewTurboQuant ¶

func NewTurboQuant(dimension, bitWidth int, seed int64, opts ...Option) (*TurboQuant, error)

NewTurboQuant creates and initializes a quantizer instance.

dimension: vector dimension, must be >= 2
bitWidth: quantization bit width, must be 2, 3, or 4
seed: random seed for rotation matrix generation; the same seed produces the same matrix
opts: optional functional options (WithGridPoints, WithIterations, WithConcurrency)

Example ¶

ExampleNewTurboQuant demonstrates creating a TurboQuant quantizer instance.

package main

import (
	"fmt"
	"log"

	"github.com/mredencom/turboquant"
)

func main() {
	tq, err := turboquant.NewTurboQuant(8, 4, 42) // dim=8, 4-bit, seed=42
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Dimension: %d\n", tq.Dimension())
	fmt.Printf("BitWidth: %d\n", tq.BitWidth())
	fmt.Printf("CompressionRatio: %.2f\n", tq.CompressionRatio())
}

Output:
Dimension: 8
BitWidth: 4
CompressionRatio: 4.00

func (*TurboQuant) BitWidth ¶

func (tq *TurboQuant) BitWidth() int

BitWidth returns the quantization bit width of this quantizer.

func (*TurboQuant) CompressionRatio ¶

func (tq *TurboQuant) CompressionRatio() float64

CompressionRatio returns the theoretical compression ratio for the current configuration.

func (*TurboQuant) Concurrency ¶ added in v0.0.2

func (tq *TurboQuant) Concurrency() int

Concurrency returns the maximum number of concurrent goroutines used by batch operations.

func (*TurboQuant) Dequantize ¶

func (tq *TurboQuant) Dequantize(qv *QuantizedVector) ([]float32, error)

Dequantize reconstructs a float32 vector from a QuantizedVector.

func (*TurboQuant) DequantizeBatch ¶

func (tq *TurboQuant) DequantizeBatch(qvs []*QuantizedVector) ([][]float32, error)

DequantizeBatch dequantizes multiple QuantizedVectors concurrently using a worker pool. Concurrency is controlled by the WithConcurrency option (default: runtime.NumCPU()).

func (*TurboQuant) DequantizeBatchFloat64 ¶ added in v0.0.2

func (tq *TurboQuant) DequantizeBatchFloat64(qvs []*QuantizedVector) ([][]float64, error)

DequantizeBatchFloat64 batch-dequantizes multiple QuantizedVectors, returning float64 vectors. It delegates to DequantizeBatch, then converts each result to float64.

func (*TurboQuant) DequantizeFloat64 ¶ added in v0.0.2

func (tq *TurboQuant) DequantizeFloat64(qv *QuantizedVector) ([]float64, error)

DequantizeFloat64 reconstructs a float64 vector from a QuantizedVector. It delegates to Dequantize, then converts the result to float64 using Float32sToFloat64s.

func (*TurboQuant) Deserialize ¶

func (tq *TurboQuant) Deserialize(data []byte) (*QuantizedVector, error)

Deserialize deserializes a binary byte slice into a QuantizedVector.

func (*TurboQuant) DeserializeBatchFrom ¶ added in v0.0.2

func (tq *TurboQuant) DeserializeBatchFrom(r io.Reader) ([]*QuantizedVector, error)

DeserializeBatchFrom reads multiple QuantizedVectors from an io.Reader. Expects a 4-byte uint32 count header followed by that many serialized vectors.

func (*TurboQuant) DeserializeFrom ¶ added in v0.0.2

func (tq *TurboQuant) DeserializeFrom(r io.Reader) (*QuantizedVector, error)

DeserializeFrom reads and deserializes a QuantizedVector from an io.Reader.

func (*TurboQuant) Dimension ¶

func (tq *TurboQuant) Dimension() int

Dimension returns the vector dimension of this quantizer.

func (*TurboQuant) Quantize ¶

func (tq *TurboQuant) Quantize(vec []float32) (*QuantizedVector, error)

Quantize quantizes a single float32 vector into a QuantizedVector.

Example ¶

ExampleTurboQuant_Quantize demonstrates quantizing a float32 vector.

package main

import (
	"fmt"
	"log"

	"github.com/mredencom/turboquant"
)

func main() {
	tq, err := turboquant.NewTurboQuant(8, 4, 42)
	if err != nil {
		log.Fatal(err)
	}

	vec := []float32{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0}
	qv, err := tq.Quantize(vec)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Norm: %.4f\n", qv.Norm)
	fmt.Printf("Indices length: %d\n", len(qv.Indices))
}

Output:
Norm: 14.2829
Indices length: 8

func (*TurboQuant) QuantizeBatch ¶

func (tq *TurboQuant) QuantizeBatch(vecs [][]float32) ([]*QuantizedVector, error)

QuantizeBatch quantizes multiple vectors concurrently using a worker pool. Concurrency is controlled by the WithConcurrency option (default: runtime.NumCPU()). All vectors must have the same dimension as the TurboQuant instance. If any vector has a mismatched dimension, returns an error indicating the first such index.
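
A bounded worker pool that preserves input order can be sketched as below. This shows the general pattern only; mapConcurrent is a hypothetical helper operating on plain float64 values, and the package's internal scheduling may differ.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// mapConcurrent applies fn to every input using at most `workers` goroutines.
// Each result is written at its input's index, so output order is stable.
func mapConcurrent(inputs []float64, workers int, fn func(float64) float64) []float64 {
	if workers < 1 {
		workers = runtime.NumCPU() // mirrors the documented default
	}
	out := make([]float64, len(inputs))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				out[i] = fn(inputs[i]) // distinct indices, so no data race
			}
		}()
	}
	for i := range inputs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

func main() {
	square := func(x float64) float64 { return x * x }
	fmt.Println(mapConcurrent([]float64{1, 2, 3, 4, 5}, 2, square)) // [1 4 9 16 25]
}
```

Dispatching indices instead of values keeps the results aligned with the inputs without any sorting or locking after the fact.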

func (*TurboQuant) QuantizeBatchFloat64 ¶ added in v0.0.2

func (tq *TurboQuant) QuantizeBatchFloat64(vecs [][]float64) ([]*QuantizedVector, error)

QuantizeBatchFloat64 batch-quantizes multiple float64 vectors with concurrent execution. Each vector is converted to float32 before quantization.

func (*TurboQuant) QuantizeFloat64 ¶ added in v0.0.2

func (tq *TurboQuant) QuantizeFloat64(vec []float64) (*QuantizedVector, error)

QuantizeFloat64 quantizes a single float64 vector into a QuantizedVector. It converts the input to float32 using Float64sToFloat32s, then delegates to Quantize.

func (*TurboQuant) Serialize ¶

func (tq *TurboQuant) Serialize(qv *QuantizedVector) ([]byte, error)

Serialize serializes a QuantizedVector into a compact binary byte slice.

Example ¶

ExampleTurboQuant_Serialize demonstrates the full serialize/deserialize round-trip.

package main

import (
	"fmt"
	"log"

	"github.com/mredencom/turboquant"
)

func main() {
	tq, err := turboquant.NewTurboQuant(8, 4, 42)
	if err != nil {
		log.Fatal(err)
	}

	vec := []float32{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0}
	qv, err := tq.Quantize(vec)
	if err != nil {
		log.Fatal(err)
	}

	// Serialize to compact binary
	data, err := tq.Serialize(qv)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Serialized bytes: %d\n", len(data))

	// Deserialize back
	qv2, err := tq.Deserialize(data)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Deserialized norm: %.4f\n", qv2.Norm)
	fmt.Printf("Indices match: %v\n", indicesEqual(qv.Indices, qv2.Indices))
}

func indicesEqual(a, b []uint8) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

Output:
Serialized bytes: 8
Deserialized norm: 14.2829
Indices match: true

func (*TurboQuant) SerializeBatchTo ¶ added in v0.0.2

func (tq *TurboQuant) SerializeBatchTo(qvs []*QuantizedVector, w io.Writer) error

SerializeBatchTo writes multiple QuantizedVectors sequentially to an io.Writer. Format: 4-byte uint32 count (little-endian) followed by count serialized vectors.

func (*TurboQuant) SerializeTo ¶ added in v0.0.2

func (tq *TurboQuant) SerializeTo(qv *QuantizedVector, w io.Writer) error

SerializeTo writes a QuantizedVector directly to an io.Writer using the compact binary format.

Directories ¶

Path	Synopsis
example
benchmark command
compare command
convert command
kvcache command
