gloom

package module

v0.0.3 Latest Latest Go to latest Published: Feb 15, 2026 License: MIT Imports: 9 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/jcalabro/gloom

Links

Open Source Insights

README ¶

Gloom

A high-performance bloom filter library for Go, implementing cache-line blocked one-hashing (OHBF).

Features

Cache-line optimized: All k bit probes for a key are aligned and fit within a single 64-byte cache line, minimizing memory access latency
One-hashing technique: Uses a single xxh3 call with prime modulo partitions instead of k independent hash functions
Three implementations:
- Filter - Non-thread-safe, fastest for single-threaded workloads, allows for serialization/deserialization
- AtomicFilter - Thread-safe using atomic.Uint64.Or(), best for read-heavy concurrent workloads
- ShardedAtomicFilter - Thread-safe with sharding, best for write-heavy concurrent workloads
Zero allocations: Hot paths (Add/Test) allocate no memory
100% test coverage: Comprehensive test suite
Go 1.23+: Uses modern atomic operations for best performance

Usage

Single-Threaded Usage

Requires the caller to synchronize parallel reads and writes, if any.

package main

import "github.com/jcalabro/gloom"

func main() {
    // Create a filter for 1 million items with 1% false positive rate
    f := gloom.New(1_000_000, 0.01)

    // Add items
    f.Add([]byte("hello"))
    f.AddString("world")

    // Test membership
    if f.Test([]byte("hello")) {
        println("hello might be present")
    }
    if !f.TestString("not-added") {
        println("definitely not present")
    }
}

Thread-Safe Usage

package main

import (
    "sync"
    "github.com/jcalabro/gloom"
)

func main() {
    // Create an atomic filter for concurrent access
    f := gloom.NewAtomic(1_000_000, 0.01)

    var wg sync.WaitGroup
    defer wg.Wait()

    for i := range 8 {
        wg.Go(func() {
            for j := range 100000 {
                f.AddString(fmt.Sprintf("key-%d-%d", i, j))
            }
        })
    }
}

High-Throughput Concurrent Writes

For write-heavy concurrent workloads, use ShardedAtomicFilter which distributes writes across multiple independent shards to reduce contention:

package main

import (
    "sync"
    "github.com/jcalabro/gloom"
)

func main() {
    // Create a sharded filter with auto-tuned shard count (based on GOMAXPROCS)
    f := gloom.NewShardedAtomicDefault(1_000_000, 0.01)

    // Or specify shard count explicitly (must be power of 2)
    // f := gloom.NewShardedAtomic(1_000_000, 0.01, 16)

    var wg sync.WaitGroup
    defer wg.Wait()

    for i := range 32 {
        wg.Go(func() {
            for j := range 100000 {
                f.AddString(fmt.Sprintf("key-%d-%d", i, j))
            }
        })
    }
}

Advanced Configuration

// Create with explicit parameters
// numBlocks: number of 512-bit cache-line blocks
// k: number of hash functions (partitions)
f := gloom.NewWithParams(1000, 7)

// Get filter statistics
fmt.Printf("Capacity: %d bits\n", f.Cap())
fmt.Printf("Hash functions: %d\n", f.K())
fmt.Printf("Items added: %d\n", f.Count())
fmt.Printf("Fill ratio: %.2f%%\n", f.EstimatedFillRatio()*100)
fmt.Printf("Est. FP rate: %.4f%%\n", f.EstimatedFalsePositiveRate()*100)

Design

Cache-Line Blocked One-Hashing

Traditional bloom filters use k independent hash functions, each potentially accessing a different cache line. Gloom instead:

Blocks memory into 512-bit (64-byte) chunks matching CPU cache line size
Uses one xxh3 call per operation - upper 32 bits select the block, lower 32 bits are reused
Partitions each block by k primes - the same hash value mod different primes gives k independent bit positions

┌───────────────────────────────────┐
│          Key: "hello"             │
└─────────────────┬─────────────────┘
                  │
                  ▼
┌───────────────────────────────────┐
│         xxh3.Hash(key)            │
│    h = 0xA1B2C3D4_E5F67890        │
└─────────────────┬─────────────────┘
                  │
    ┌─────────────┴─────────────┐
    │                           │
    ▼                           ▼
┌─────────────────┐   ┌─────────────────┐
│  Block Index    │   │  Intra-Hash     │
│  h >> 32        │   │  uint32(h)      │
│  % numBlocks    │   │  = 0xE5F67890   │
│  = Block 2      │   │                 │
└────────┬────────┘   └────────┬────────┘
         │                     │
         └──────────┬──────────┘
                    ▼
┌───────────────────────────────────────────────────────────────────┐
│                      Bloom Filter Memory                          │
│ ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────────┐ │
│ │ Block 0 │ Block 1 │ Block 2 │ Block 3 │ Block 4 │ ...         │ │
│ │ 512 bits│ 512 bits│ 512 bits│ 512 bits│ 512 bits│             │ │
│ └─────────┴─────────┴────┬────┴─────────┴─────────┴─────────────┘ │
│                          │                                        │
│              ┌───────────┴───────────┐                            │
│              ▼   ONE cache line (64B)▼                            │
│ ┌────────────────────────────────────────────────────────────┐    │
│ │                    Block 2 (512 bits)                      │    │
│ │ ┌──────┬──────┬──────┬──────┬──────┬──────┬──────┐         │    │
│ │ │ 67   │ 71   │ 73   │ 79   │ 83   │ 89   │ 50   │ ← prime │    │
│ │ │ bits │ bits │ bits │ bits │ bits │ bits │ bits │   sizes │    │
│ │ └──┬───┴──┬───┴──┬───┴──┬───┴──┬───┴──┬───┴──┬───┘         │    │
│ │    │      │      │      │      │      │                    │    │
│ │    ▼      ▼      ▼      ▼      ▼      ▼      ▼             │    │
│ │  h%67   h%71   h%73   h%79   h%83   h%89   h%50            │    │
│ │  =23    =45    =12    =67    =34    =78    =40             │    │
│ │    │      │      │      │      │      │      │             │    │
│ │    ▼      ▼      ▼      ▼      ▼      ▼      ▼             │    │
│ │  SET    SET    SET    SET    SET    SET    SET             │    │
│ │ bit 23 bit 112 bit 185 bit 264 bit 347 bit 436 bit 486     │    │
│ └────────────────────────────────────────────────────────────┘    │
└───────────────────────────────────────────────────────────────────┘

References

Space/Time Trade-offs in Hash Coding with Allowable Errors - Original bloom filter paper (Bloom, 1970)
Network Applications of Bloom Filters: A Survey - Comprehensive survey (Broder & Mitzenmacher, 2004)
One-Hashing Bloom Filter - Prime partition technique
Less Hashing, Same Performance - Double hashing theory (Kirsch & Mitzenmacher, 2006)
RocksDB Bloom Filter - Cache-line blocking implementation
xxHash - Extremely fast hash algorithm (used via zeebo/xxh3)

Benchmarks

The times shown below are the mean time per operation as reported by Go's testing.B framework. For latency percentiles, see the Histogram Benchmarks section.

Benchmarks run on AMD Ryzen 9 9950X (32 threads), Go 1.23+, comparing against:

bits-and-blooms/bloom - Popular non-thread-safe implementation
jazware/atomic-bloom - Thread-safe fork of bits-and-blooms using atomics
greatroar/blobloom - Cache-blocked filter (requires pre-hashing)

Sequential Performance (single-threaded)

Operation	Gloom	Gloom Atomic	BitsAndBlooms	AtomicBloom	Blobloom*
Add	19.9 ns	35.7 ns	44.4 ns	57.5 ns	16.9 ns
Test	16.3 ns	17.4 ns	40.7 ns	41.4 ns	5.5 ns

*Blobloom requires pre-hashing input, so times exclude hash computation.

Parallel Performance (GOMAXPROCS=32)

Operation	Gloom Atomic	Gloom Sharded	AtomicBloom
Parallel Add	51.3 ns	11.2 ns	19.2 ns
Parallel Test	0.96 ns	1.00 ns	1.94 ns
Mixed R/W	30.9 ns	7.1 ns	19.6 ns
High Contention	64.6 ns	17.2 ns	43.1 ns

Throughput

Implementation	Items/sec (8 goroutines)
Gloom (non-atomic)	38.3M items/sec
Gloom Atomic	19.4M items/sec
Gloom Sharded	78.6M items/sec

Histogram Benchmarks

Sample output (Add operations on the same hardware as above):

Library	Mean	p50	p99	p9999
Gloom	57 ns	30 ns	60 ns	401 ns
GloomAtomic	76 ns	50 ns	100 ns	501 ns
GloomSharded	75 ns	50 ns	91 ns	471 ns
BitsAndBlooms	103 ns	80 ns	170 ns	682 ns
AtomicBloom	95 ns	70 ns	120 ns	571 ns
Blobloom*	52 ns	30 ns	60 ns	380 ns

Running Tests and Benchmarks

# Using https://github.com/casey/just
just # runs the linter and short tests with the race detector enabled

just test
just t # runs without the race detector

just test-long
just t-long

just bench
just bench-long

Tips

For maximum performance on modern x86-64 CPUs, ensure you're building with GOAMD64=v2 or above. This enables hardware POPCNT instructions (used for fill ratio estimation) without runtime CPU detection overhead. Ensure your CPU supports popcnt first.

License

MIT

Documentation ¶

Overview ¶

Package gloom provides high-performance bloom filter implementations for Go.

A bloom filter is a space-efficient probabilistic data structure that tests whether an element is a member of a set. False positive matches are possible, but false negatives are not – if the filter says an element is not present, it definitely is not. If it says an element might be present, it could be a false positive.

Architecture ¶

Gloom uses two key optimizations for maximum throughput:

Cache-line blocking: Each bloom filter is divided into 512-bit (64-byte) blocks that match the CPU cache line size. All k hash probes for a single key access the same block, meaning only one cache line is touched per operation. This dramatically reduces memory latency compared to traditional bloom filters where k probes might touch k different cache lines.

One-hashing with prime partitions: Instead of computing k independent hash functions, gloom computes a single xxh3 hash and derives k bit positions using modulo operations with distinct prime numbers. This technique, based on the paper "One-Hashing Bloom Filter", provides excellent bit distribution while being much faster than multiple hash computations.

Implementations ¶

Three filter implementations are provided for different use cases:

Filter is the fastest option for single-threaded workloads. It has no synchronization overhead and achieves ~20ns per Add/Test operation.

AtomicFilter provides thread-safety using lock-free atomic operations. Multiple goroutines can safely call Add and Test concurrently. It uses sync/atomic.Uint64.Or (Go 1.23+) for efficient atomic bit-setting.

ShardedAtomicFilter distributes keys across multiple independent shards to reduce contention under heavy parallel writes. The shard count is auto-tuned to GOMAXPROCS by default. Use this when you have many goroutines performing concurrent writes.

Choosing Parameters ¶

Use New, NewAtomic, or NewShardedAtomicDefault with your expected number of items and desired false positive rate:

// Filter for 1 million items with 1% false positive rate
f := gloom.New(1_000_000, 0.01)

The functions automatically calculate optimal filter size and number of hash functions. For advanced use cases, NewWithParams and NewAtomicWithParams allow explicit control over the number of 512-bit blocks and hash functions.

False Positive Rate ¶

The false positive rate depends on:

Filter capacity (number of blocks)
Number of hash functions (k)
Number of items added

When the filter is filled to its intended capacity, it will achieve approximately the target false positive rate. Adding more items than the capacity increases the false positive rate. Use Filter.EstimatedFalsePositiveRate to monitor the current rate.

Memory Usage ¶

Memory usage is determined by the number of 512-bit blocks:

memory_bytes = num_blocks * 64

For a filter sized for n items with false positive rate p:

memory_bits ≈ -n * ln(p) / (ln(2))²

Example: 1 million items at 1% FP rate ≈ 1.2 MB

Thread Safety ¶

Filter is NOT thread-safe. Use external synchronization or choose AtomicFilter or ShardedAtomicFilter for concurrent access.

AtomicFilter and ShardedAtomicFilter are safe for concurrent Add and Test operations.

Performance Tips ¶

Use Filter for single-threaded workloads (fastest)
Use ShardedAtomicFilter for write-heavy concurrent workloads
Use AtomicFilter for read-heavy concurrent workloads
Use string methods (Filter.AddString, Filter.TestString) to avoid allocating when you have string keys
Build with GOAMD64=v2 or higher to enable hardware POPCNT for Filter.EstimatedFillRatio

References ¶

One-Hashing Bloom Filter: https://yangtonghome.github.io/uploads/One_Hashing.pdf
Cache-line blocking (RocksDB): https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter
Less Hashing, Same Performance: https://www.eecs.harvard.edu/~michaelm/postscripts/rsa2008.pdf

Example ¶

This example demonstrates basic bloom filter usage for membership testing.

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a filter for 10,000 items with 1% false positive rate
	f := gloom.New(10_000, 0.01)

	// Add some items
	f.Add([]byte("apple"))
	f.Add([]byte("banana"))
	f.Add([]byte("cherry"))

	// Test membership
	fmt.Println("apple:", f.Test([]byte("apple")))   // true (added)
	fmt.Println("banana:", f.Test([]byte("banana"))) // true (added)
	fmt.Println("grape:", f.Test([]byte("grape")))   // false (not added)

}

Output:
apple: true
banana: true
grape: false

Example (Concurrent) ¶

This example demonstrates using AtomicFilter for concurrent access.

package main

import (
	"fmt"
	"sync"

	"github.com/jcalabro/gloom"
)

func main() {
	// AtomicFilter is safe for concurrent Add and Test
	f := gloom.NewAtomic(100_000, 0.01)

	var wg sync.WaitGroup

	// Spawn multiple writers
	for i := range 4 {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := range 1000 {
				key := fmt.Sprintf("worker-%d-item-%d", id, j)
				f.AddString(key)
			}
		}(i)
	}

	// Spawn multiple readers (can run concurrently with writers)
	for i := range 4 {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := range 1000 {
				key := fmt.Sprintf("worker-%d-item-%d", id, j)
				_ = f.TestString(key)
			}
		}(i)
	}

	wg.Wait()
	fmt.Println("Items added:", f.Count())

}

Output:
Items added: 4000

Example (CustomParameters) ¶

This example demonstrates creating a filter with explicit parameters.

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create with explicit block count and hash functions.
	// 1000 blocks * 512 bits = 512,000 bits total capacity.
	// k=7 hash functions is optimal for ~1% false positive rate.
	f := gloom.NewWithParams(1000, 7)

	f.AddString("custom")
	fmt.Println("Contains 'custom':", f.TestString("custom"))
	fmt.Printf("Blocks: %d, K: %d\n", f.NumBlocks(), f.K())

}

Output:
Contains 'custom': true
Blocks: 1000, K: 7

Example (Sharded) ¶

This example shows ShardedAtomicFilter for high-throughput concurrent writes.

package main

import (
	"fmt"
	"sync"

	"github.com/jcalabro/gloom"
)

func main() {
	// ShardedAtomicFilter distributes writes across shards to reduce contention.
	// Use NewShardedAtomicDefault for auto-tuned shard count (based on GOMAXPROCS).
	f := gloom.NewShardedAtomicDefault(1_000_000, 0.01)

	var wg sync.WaitGroup

	// High-throughput parallel writes
	for i := range 8 {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := range 10_000 {
				f.AddString(fmt.Sprintf("key-%d-%d", id, j))
			}
		}(i)
	}

	wg.Wait()
	fmt.Println("Total items:", f.Count())

}

Output:
Total items: 80000

Example (Statistics) ¶

This example shows how to monitor filter statistics.

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	f := gloom.New(10_000, 0.01)

	// Add some items
	for i := range 5000 {
		f.Add(fmt.Appendf(nil, "item-%d", i))
	}

	fmt.Printf("Capacity: %d bits\n", f.Cap())
	fmt.Printf("Hash functions (k): %d\n", f.K())
	fmt.Printf("Items added: %d\n", f.Count())
	fmt.Printf("Fill ratio: %.1f%%\n", f.EstimatedFillRatio()*100)

}

Output:
Capacity: 96256 bits
Hash functions (k): 7
Items added: 5000
Fill ratio: 30.4%

Example (StringKeys) ¶

This example shows how to use string keys without allocation overhead.

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	f := gloom.New(10_000, 0.01)

	// AddString and TestString avoid allocating when you have string keys
	f.AddString("user:12345")
	f.AddString("user:67890")

	fmt.Println("user:12345 exists:", f.TestString("user:12345"))
	fmt.Println("user:99999 exists:", f.TestString("user:99999"))

}

Output:
user:12345 exists: true
user:99999 exists: false

Index ¶

Constants
Variables
func ComputeOffsets(primes []uint32) []uint32
func EstimateFalsePositiveRate(numBlocks uint64, k uint32, itemsAdded uint64) float64
func GetPrimePartition(k uint32) []uint32
func OptimalParams(expectedItems uint64, fpRate float64) (numBlocks uint64, k uint32, bitsPerItem float64)
type AtomicFilter
- func NewAtomic(expectedItems uint64, fpRate float64) *AtomicFilter
- func NewAtomicWithParams(numBlocks uint64, k uint32) *AtomicFilter
type Filter
type ShardedAtomicFilter
- func NewShardedAtomic(expectedItems uint64, fpRate float64, numShards uint64) *ShardedAtomicFilter
- func NewShardedAtomicDefault(expectedItems uint64, fpRate float64) *ShardedAtomicFilter

Constants ¶

View Source

const (
	// BlockBits is the number of bits per block (cache line size).
	BlockBits = 512
	// BlockWords is the number of uint64s per block.
	BlockWords = BlockBits / 64 // 8

)

Variables ¶

View Source

var (
	// ErrInvalidData is returned when the serialized data is invalid or corrupted.
	ErrInvalidData = errors.New("gloom: invalid serialized data")

	// ErrUnsupportedVersion is returned when the serialization version is not supported.
	ErrUnsupportedVersion = errors.New("gloom: unsupported serialization version")

	// ErrInvalidK is returned when k value in serialized data is not supported.
	ErrInvalidK = errors.New("gloom: invalid k value in serialized data")
)

Functions ¶

func ComputeOffsets ¶

func ComputeOffsets(primes []uint32) []uint32

ComputeOffsets computes the cumulative bit offsets for each partition. offset[i] = sum of primes[0..i-1]

func EstimateFalsePositiveRate ¶

func EstimateFalsePositiveRate(numBlocks uint64, k uint32, itemsAdded uint64) float64

EstimateFalsePositiveRate estimates the false positive rate for given parameters.

For a cache-line blocked bloom filter, items are distributed across blocks following a Poisson distribution (balls-into-bins). Some blocks receive more items than average, increasing their local FP rate. This function computes the expected per-block FP rate over this Poisson distribution:

FP = E[(1 - e^(-k*J/s))^k]  where J ~ Poisson(n/B), s = BlockBits

This gives a more accurate estimate than the standard formula (1 - e^(-kn/m))^k, which assumes uniform bit placement and underestimates the FP rate of blocked filters.

Example ¶

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Estimate false positive rate for given parameters
	numBlocks := uint64(1000)
	k := uint32(7)
	itemsAdded := uint64(50000)

	rate := gloom.EstimateFalsePositiveRate(numBlocks, k, itemsAdded)
	fmt.Printf("Estimated FP rate: %.2f%%\n", rate*100)

}

Output:
Estimated FP rate: 0.86%

func GetPrimePartition ¶

func GetPrimePartition(k uint32) []uint32

GetPrimePartition returns the prime partition for the given k value. Returns nil if k is not supported.

func OptimalParams ¶

func OptimalParams(expectedItems uint64, fpRate float64) (numBlocks uint64, k uint32, bitsPerItem float64)

OptimalParams calculates the optimal bloom filter parameters. Returns the number of blocks, number of hash functions (k), and bits per item.

Example ¶

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Calculate optimal parameters for your use case
	blocks, k, bitsPerItem := gloom.OptimalParams(1_000_000, 0.01)

	fmt.Printf("For 1M items at 1%% FP rate:\n")
	fmt.Printf("  Blocks: %d\n", blocks)
	fmt.Printf("  Hash functions (k): %d\n", k)
	fmt.Printf("  Bits per item: %.1f\n", bitsPerItem)

}

Output:
For 1M items at 1% FP rate:
  Blocks: 18721
  Hash functions (k): 7
  Bits per item: 9.6

Types ¶

type AtomicFilter ¶

type AtomicFilter struct {
	// contains filtered or unexported fields
}

AtomicFilter is a thread-safe bloom filter using atomic operations. It uses the same cache-line blocked one-hashing technique as Filter but with atomic.Uint64 for concurrent access.

func NewAtomic ¶

func NewAtomic(expectedItems uint64, fpRate float64) *AtomicFilter

NewAtomic creates a new thread-safe bloom filter optimized for the expected number of items and desired false positive rate.

Example ¶

package main

import (
	"fmt"
	"sync"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a thread-safe filter for concurrent access.
	f := gloom.NewAtomic(100_000, 0.01)

	// Safe to call from multiple goroutines
	var wg sync.WaitGroup
	wg.Add(2)

	go func() {
		defer wg.Done()
		f.AddString("from-goroutine-1")
	}()

	go func() {
		defer wg.Done()
		f.AddString("from-goroutine-2")
	}()

	wg.Wait()
	fmt.Println("Count:", f.Count())

}

Output:
Count: 2

func NewAtomicWithParams ¶

func NewAtomicWithParams(numBlocks uint64, k uint32) *AtomicFilter

NewAtomicWithParams creates a new thread-safe bloom filter with explicit parameters.

func (*AtomicFilter) Add ¶

func (f *AtomicFilter) Add(data []byte)

Add adds data to the bloom filter atomically.

func (*AtomicFilter) AddString ¶

func (f *AtomicFilter) AddString(s string)

AddString adds a string to the bloom filter atomically without allocating.

func (*AtomicFilter) Cap ¶

func (f *AtomicFilter) Cap() uint64

Cap returns the capacity of the filter in bits.

func (*AtomicFilter) Count ¶

func (f *AtomicFilter) Count() uint64

Count returns the approximate number of items added to the filter.

func (*AtomicFilter) EstimatedFalsePositiveRate ¶

func (f *AtomicFilter) EstimatedFalsePositiveRate() float64

EstimatedFalsePositiveRate estimates the current false positive rate.

func (*AtomicFilter) EstimatedFillRatio ¶

func (f *AtomicFilter) EstimatedFillRatio() float64

EstimatedFillRatio estimates the proportion of bits that are set.

func (*AtomicFilter) K ¶

func (f *AtomicFilter) K() uint32

K returns the number of hash functions (partitions) used.

func (*AtomicFilter) NumBlocks ¶

func (f *AtomicFilter) NumBlocks() uint64

NumBlocks returns the number of 512-bit blocks in the filter.

func (*AtomicFilter) Test ¶

func (f *AtomicFilter) Test(data []byte) bool

Test checks if data might be in the bloom filter. This operation is safe to call concurrently with Add.

func (*AtomicFilter) TestString ¶

func (f *AtomicFilter) TestString(s string) bool

TestString checks if a string might be in the bloom filter.

type Filter ¶

type Filter struct {
	// contains filtered or unexported fields
}

Filter is a non-thread-safe bloom filter using cache-line blocked one-hashing for optimal performance.

The filter divides memory into 512-bit (64-byte) blocks that fit in a single CPU cache line. Each block is partitioned into k segments using distinct prime sizes, enabling the one-hashing technique where a single hash value generates k independent bit positions via modulo operations.

func New ¶

func New(expectedItems uint64, fpRate float64) *Filter

New creates a new bloom filter optimized for the expected number of items and desired false positive rate.

Example ¶

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a filter sized for 1 million items with 1% false positive rate.
	// The filter automatically calculates optimal size and hash functions.
	f := gloom.New(1_000_000, 0.01)

	f.Add([]byte("hello"))
	fmt.Println(f.Test([]byte("hello")))

}

Output:
true

func NewWithParams ¶

func NewWithParams(numBlocks uint64, k uint32) *Filter

NewWithParams creates a new bloom filter with explicit parameters. numBlocks is the number of 512-bit blocks, k is the number of hash functions.

func UnmarshalBinary ¶ added in v0.0.2

func UnmarshalBinary(data []byte) (*Filter, error)

UnmarshalBinary deserializes a bloom filter from a byte slice. Returns an error if the data is invalid or corrupted.

func (*Filter) Add ¶

func (f *Filter) Add(data []byte)

Add adds data to the bloom filter.

func (*Filter) AddString ¶

func (f *Filter) AddString(s string)

AddString adds a string to the bloom filter without allocating.

func (*Filter) Cap ¶

func (f *Filter) Cap() uint64

Cap returns the capacity of the filter in bits.

func (*Filter) Count ¶

func (f *Filter) Count() uint64

Count returns the approximate number of items added to the filter.

func (*Filter) EstimatedFalsePositiveRate ¶

func (f *Filter) EstimatedFalsePositiveRate() float64

EstimatedFalsePositiveRate estimates the current false positive rate based on the number of items added.

func (*Filter) EstimatedFillRatio ¶

func (f *Filter) EstimatedFillRatio() float64

EstimatedFillRatio estimates the proportion of bits that are set.

func (*Filter) K ¶

func (f *Filter) K() uint32

K returns the number of hash functions (partitions) used.

func (*Filter) MarshalBinary ¶ added in v0.0.2

func (f *Filter) MarshalBinary() ([]byte, error)

MarshalBinary serializes the bloom filter to a byte slice. The serialized format is:

Version (1 byte): serialization format version
K (4 bytes): number of hash functions (little-endian uint32)
NumBlocks (8 bytes): number of 512-bit blocks (little-endian uint64)
Count (8 bytes): number of items added (little-endian uint64)
Blocks (numBlocks * 64 bytes): the bit array data (little-endian uint64s)

The primes and offsets are not serialized as they can be derived from k.

func (*Filter) NumBlocks ¶

func (f *Filter) NumBlocks() uint64

NumBlocks returns the number of 512-bit blocks in the filter.

func (*Filter) Test ¶

func (f *Filter) Test(data []byte) bool

Test checks if data might be in the bloom filter. Returns true if the data might be present (with false positive probability), or false if the data is definitely not present.

func (*Filter) TestString ¶

func (f *Filter) TestString(s string) bool

TestString checks if a string might be in the bloom filter without allocating.

type ShardedAtomicFilter ¶

type ShardedAtomicFilter struct {
	// contains filtered or unexported fields
}

ShardedAtomicFilter is a thread-safe bloom filter that distributes writes across multiple shards to reduce contention under parallel workloads. Each shard is an independent AtomicFilter, and keys are consistently routed to shards based on their hash.

func NewShardedAtomic ¶

func NewShardedAtomic(expectedItems uint64, fpRate float64, numShards uint64) *ShardedAtomicFilter

NewShardedAtomic creates a new sharded thread-safe bloom filter. numShards must be a power of 2 (will be rounded up if not). The total capacity is distributed evenly across shards.

Example ¶

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a sharded filter with explicit shard count.
	// Use powers of 2 for optimal performance.
	f := gloom.NewShardedAtomic(100_000, 0.01, 16)

	f.AddString("key1")
	fmt.Println("Shards:", f.NumShards())

}

Output:
Shards: 16

func NewShardedAtomicDefault ¶

func NewShardedAtomicDefault(expectedItems uint64, fpRate float64) *ShardedAtomicFilter

NewShardedAtomicDefault creates a sharded filter with a number of shards automatically tuned to the current GOMAXPROCS value. This provides good parallel performance without over-sharding on smaller machines.

Example ¶

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a sharded filter with auto-tuned shard count.
	// Shard count is set to GOMAXPROCS (minimum 4).
	f := gloom.NewShardedAtomicDefault(100_000, 0.01)

	f.AddString("auto-tuned")
	fmt.Println("Has key:", f.TestString("auto-tuned"))

}

Output:
Has key: true

func (*ShardedAtomicFilter) Add ¶

func (f *ShardedAtomicFilter) Add(data []byte)

Add adds data to the bloom filter.

func (*ShardedAtomicFilter) AddString ¶

func (f *ShardedAtomicFilter) AddString(s string)

AddString adds a string to the bloom filter without allocating.

func (*ShardedAtomicFilter) Cap ¶

func (f *ShardedAtomicFilter) Cap() uint64

Cap returns the total capacity of all shards in bits.

func (*ShardedAtomicFilter) Count ¶

func (f *ShardedAtomicFilter) Count() uint64

Count returns the approximate total number of items added.

func (*ShardedAtomicFilter) EstimatedFalsePositiveRate ¶

func (f *ShardedAtomicFilter) EstimatedFalsePositiveRate() float64

EstimatedFalsePositiveRate estimates the current false positive rate. For sharded filters, this is approximately the average across shards.

func (*ShardedAtomicFilter) EstimatedFillRatio ¶

func (f *ShardedAtomicFilter) EstimatedFillRatio() float64

EstimatedFillRatio estimates the average fill ratio across all shards.

func (*ShardedAtomicFilter) K ¶

func (f *ShardedAtomicFilter) K() uint32

K returns the number of hash functions used per shard.

func (*ShardedAtomicFilter) NumBlocks ¶

func (f *ShardedAtomicFilter) NumBlocks() uint64

NumBlocks returns the total number of blocks across all shards.

func (*ShardedAtomicFilter) NumShards ¶

func (f *ShardedAtomicFilter) NumShards() uint64

NumShards returns the number of shards.

func (*ShardedAtomicFilter) Test ¶

func (f *ShardedAtomicFilter) Test(data []byte) bool

Test checks if data might be in the bloom filter.

func (*ShardedAtomicFilter) TestString ¶

func (f *ShardedAtomicFilter) TestString(s string) bool

TestString checks if a string might be in the bloom filter.

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL