gloom

package module
v0.0.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 15, 2026 License: MIT Imports: 9 Imported by: 0

README

Gloom

Go Reference CI codecov Go Report Card

A high-performance bloom filter library for Go, implementing cache-line blocked one-hashing (OHBF).

Features

  • Cache-line optimized: All k bit probes for a key are aligned and fit within a single 64-byte cache line, minimizing memory access latency
  • One-hashing technique: Uses a single xxh3 call with prime modulo partitions instead of k independent hash functions
  • Three implementations:
    • Filter - Non-thread-safe, fastest for single-threaded workloads, allows for serialization/deserialization
    • AtomicFilter - Thread-safe using atomic.Uint64.Or(), best for read-heavy concurrent workloads
    • ShardedAtomicFilter - Thread-safe with sharding, best for write-heavy concurrent workloads
  • Zero allocations: Hot paths (Add/Test) allocate no memory
  • 100% test coverage: Comprehensive test suite
  • Go 1.23+: Uses modern atomic operations for best performance

Usage

Single-Threaded Usage

Requires the caller to synchronize parallel reads and writes, if any.

package main

import "github.com/jcalabro/gloom"

func main() {
    // Create a filter for 1 million items with 1% false positive rate
    f := gloom.New(1_000_000, 0.01)

    // Add items
    f.Add([]byte("hello"))
    f.AddString("world")

    // Test membership
    if f.Test([]byte("hello")) {
        println("hello might be present")
    }
    if !f.TestString("not-added") {
        println("definitely not present")
    }
}
Thread-Safe Usage
package main

import (
    "sync"
    "github.com/jcalabro/gloom"
)

func main() {
    // Create an atomic filter for concurrent access
    f := gloom.NewAtomic(1_000_000, 0.01)

    var wg sync.WaitGroup
    defer wg.Wait()

    for i := range 8 {
        wg.Go(func() {
            for j := range 100000 {
                f.AddString(fmt.Sprintf("key-%d-%d", i, j))
            }
        })
    }
}
High-Throughput Concurrent Writes

For write-heavy concurrent workloads, use ShardedAtomicFilter which distributes writes across multiple independent shards to reduce contention:

package main

import (
    "sync"
    "github.com/jcalabro/gloom"
)

func main() {
    // Create a sharded filter with auto-tuned shard count (based on GOMAXPROCS)
    f := gloom.NewShardedAtomicDefault(1_000_000, 0.01)

    // Or specify shard count explicitly (must be power of 2)
    // f := gloom.NewShardedAtomic(1_000_000, 0.01, 16)

    var wg sync.WaitGroup
    defer wg.Wait()

    for i := range 32 {
        wg.Go(func() {
            for j := range 100000 {
                f.AddString(fmt.Sprintf("key-%d-%d", i, j))
            }
        })
    }
}
Advanced Configuration
// Create with explicit parameters
// numBlocks: number of 512-bit cache-line blocks
// k: number of hash functions (partitions)
f := gloom.NewWithParams(1000, 7)

// Get filter statistics
fmt.Printf("Capacity: %d bits\n", f.Cap())
fmt.Printf("Hash functions: %d\n", f.K())
fmt.Printf("Items added: %d\n", f.Count())
fmt.Printf("Fill ratio: %.2f%%\n", f.EstimatedFillRatio()*100)
fmt.Printf("Est. FP rate: %.4f%%\n", f.EstimatedFalsePositiveRate()*100)

Design

Cache-Line Blocked One-Hashing

Traditional bloom filters use k independent hash functions, each potentially accessing a different cache line. Gloom instead:

  1. Blocks memory into 512-bit (64-byte) chunks matching CPU cache line size
  2. Uses one xxh3 call per operation - upper 32 bits select the block, lower 32 bits are reused
  3. Partitions each block by k primes - the same hash value mod different primes gives k independent bit positions
┌───────────────────────────────────┐
│          Key: "hello"             │
└─────────────────┬─────────────────┘
                  │
                  ▼
┌───────────────────────────────────┐
│         xxh3.Hash(key)            │
│    h = 0xA1B2C3D4_E5F67890        │
└─────────────────┬─────────────────┘
                  │
    ┌─────────────┴─────────────┐
    │                           │
    ▼                           ▼
┌─────────────────┐   ┌─────────────────┐
│  Block Index    │   │  Intra-Hash     │
│  h >> 32        │   │  uint32(h)      │
│  % numBlocks    │   │  = 0xE5F67890   │
│  = Block 2      │   │                 │
└────────┬────────┘   └────────┬────────┘
         │                     │
         └──────────┬──────────┘
                    ▼
┌───────────────────────────────────────────────────────────────────┐
│                      Bloom Filter Memory                          │
│ ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────────┐ │
│ │ Block 0 │ Block 1 │ Block 2 │ Block 3 │ Block 4 │ ...         │ │
│ │ 512 bits│ 512 bits│ 512 bits│ 512 bits│ 512 bits│             │ │
│ └─────────┴─────────┴────┬────┴─────────┴─────────┴─────────────┘ │
│                          │                                        │
│              ┌───────────┴───────────┐                            │
│              ▼   ONE cache line (64B)▼                            │
│ ┌────────────────────────────────────────────────────────────┐    │
│ │                    Block 2 (512 bits)                      │    │
│ │ ┌──────┬──────┬──────┬──────┬──────┬──────┬──────┐         │    │
│ │ │ 67   │ 71   │ 73   │ 79   │ 83   │ 89   │ 50   │ ← prime │    │
│ │ │ bits │ bits │ bits │ bits │ bits │ bits │ bits │   sizes │    │
│ │ └──┬───┴──┬───┴──┬───┴──┬───┴──┬───┴──┬───┴──┬───┘         │    │
│ │    │      │      │      │      │      │                    │    │
│ │    ▼      ▼      ▼      ▼      ▼      ▼      ▼             │    │
│ │  h%67   h%71   h%73   h%79   h%83   h%89   h%50            │    │
│ │  =23    =45    =12    =67    =34    =78    =40             │    │
│ │    │      │      │      │      │      │      │             │    │
│ │    ▼      ▼      ▼      ▼      ▼      ▼      ▼             │    │
│ │  SET    SET    SET    SET    SET    SET    SET             │    │
│ │ bit 23 bit 112 bit 185 bit 264 bit 347 bit 436 bit 486     │    │
│ └────────────────────────────────────────────────────────────┘    │
└───────────────────────────────────────────────────────────────────┘
References

Benchmarks

The times shown below are the mean time per operation as reported by Go's testing.B framework. For latency percentiles, see the Histogram Benchmarks section.

Benchmarks run on AMD Ryzen 9 9950X (32 threads), Go 1.23+, comparing against:

Sequential Performance (single-threaded)
Operation Gloom Gloom Atomic BitsAndBlooms AtomicBloom Blobloom*
Add 19.9 ns 35.7 ns 44.4 ns 57.5 ns 16.9 ns
Test 16.3 ns 17.4 ns 40.7 ns 41.4 ns 5.5 ns

*Blobloom requires pre-hashing input, so times exclude hash computation.

Parallel Performance (GOMAXPROCS=32)
Operation Gloom Atomic Gloom Sharded AtomicBloom
Parallel Add 51.3 ns 11.2 ns 19.2 ns
Parallel Test 0.96 ns 1.00 ns 1.94 ns
Mixed R/W 30.9 ns 7.1 ns 19.6 ns
High Contention 64.6 ns 17.2 ns 43.1 ns
Throughput
Implementation Items/sec (8 goroutines)
Gloom (non-atomic) 38.3M items/sec
Gloom Atomic 19.4M items/sec
Gloom Sharded 78.6M items/sec
Histogram Benchmarks

Sample output (Add operations on the same hardware as above):

Library Mean p50 p99 p9999
Gloom 57 ns 30 ns 60 ns 401 ns
GloomAtomic 76 ns 50 ns 100 ns 501 ns
GloomSharded 75 ns 50 ns 91 ns 471 ns
BitsAndBlooms 103 ns 80 ns 170 ns 682 ns
AtomicBloom 95 ns 70 ns 120 ns 571 ns
Blobloom* 52 ns 30 ns 60 ns 380 ns
Running Tests and Benchmarks
# Using https://github.com/casey/just
just # runs the linter and short tests with the race detector enabled

just test
just t # runs without the race detector

just test-long
just t-long

just bench
just bench-long
Tips

For maximum performance on modern x86-64 CPUs, ensure you're building with GOAMD64=v2 or above. This enables hardware POPCNT instructions (used for fill ratio estimation) without runtime CPU detection overhead. Ensure your CPU supports popcnt first.

License

MIT

Documentation

Overview

Package gloom provides high-performance bloom filter implementations for Go.

A bloom filter is a space-efficient probabilistic data structure that tests whether an element is a member of a set. False positive matches are possible, but false negatives are not – if the filter says an element is not present, it definitely is not. If it says an element might be present, it could be a false positive.

Architecture

Gloom uses two key optimizations for maximum throughput:

Cache-line blocking: Each bloom filter is divided into 512-bit (64-byte) blocks that match the CPU cache line size. All k hash probes for a single key access the same block, meaning only one cache line is touched per operation. This dramatically reduces memory latency compared to traditional bloom filters where k probes might touch k different cache lines.

One-hashing with prime partitions: Instead of computing k independent hash functions, gloom computes a single xxh3 hash and derives k bit positions using modulo operations with distinct prime numbers. This technique, based on the paper "One-Hashing Bloom Filter", provides excellent bit distribution while being much faster than multiple hash computations.

Implementations

Three filter implementations are provided for different use cases:

Filter is the fastest option for single-threaded workloads. It has no synchronization overhead and achieves ~20ns per Add/Test operation.

AtomicFilter provides thread-safety using lock-free atomic operations. Multiple goroutines can safely call Add and Test concurrently. It uses sync/atomic.Uint64.Or (Go 1.23+) for efficient atomic bit-setting.

ShardedAtomicFilter distributes keys across multiple independent shards to reduce contention under heavy parallel writes. The shard count is auto-tuned to GOMAXPROCS by default. Use this when you have many goroutines performing concurrent writes.

Choosing Parameters

Use New, NewAtomic, or NewShardedAtomicDefault with your expected number of items and desired false positive rate:

// Filter for 1 million items with 1% false positive rate
f := gloom.New(1_000_000, 0.01)

The functions automatically calculate optimal filter size and number of hash functions. For advanced use cases, NewWithParams and NewAtomicWithParams allow explicit control over the number of 512-bit blocks and hash functions.

False Positive Rate

The false positive rate depends on:

  • Filter capacity (number of blocks)
  • Number of hash functions (k)
  • Number of items added

When the filter is filled to its intended capacity, it will achieve approximately the target false positive rate. Adding more items than the capacity increases the false positive rate. Use Filter.EstimatedFalsePositiveRate to monitor the current rate.

Memory Usage

Memory usage is determined by the number of 512-bit blocks:

memory_bytes = num_blocks * 64

For a filter sized for n items with false positive rate p:

memory_bits ≈ -n * ln(p) / (ln(2))²

Example: 1 million items at 1% FP rate ≈ 1.2 MB

Thread Safety

Filter is NOT thread-safe. Use external synchronization or choose AtomicFilter or ShardedAtomicFilter for concurrent access.

AtomicFilter and ShardedAtomicFilter are safe for concurrent Add and Test operations.

Performance Tips

References

Example

This example demonstrates basic bloom filter usage for membership testing.

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a filter for 10,000 items with 1% false positive rate
	f := gloom.New(10_000, 0.01)

	// Add some items
	f.Add([]byte("apple"))
	f.Add([]byte("banana"))
	f.Add([]byte("cherry"))

	// Test membership
	fmt.Println("apple:", f.Test([]byte("apple")))   // true (added)
	fmt.Println("banana:", f.Test([]byte("banana"))) // true (added)
	fmt.Println("grape:", f.Test([]byte("grape")))   // false (not added)

}
Output:
apple: true
banana: true
grape: false
Example (Concurrent)

This example demonstrates using AtomicFilter for concurrent access.

package main

import (
	"fmt"
	"sync"

	"github.com/jcalabro/gloom"
)

func main() {
	// AtomicFilter is safe for concurrent Add and Test
	f := gloom.NewAtomic(100_000, 0.01)

	var wg sync.WaitGroup

	// Spawn multiple writers
	for i := range 4 {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := range 1000 {
				key := fmt.Sprintf("worker-%d-item-%d", id, j)
				f.AddString(key)
			}
		}(i)
	}

	// Spawn multiple readers (can run concurrently with writers)
	for i := range 4 {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := range 1000 {
				key := fmt.Sprintf("worker-%d-item-%d", id, j)
				_ = f.TestString(key)
			}
		}(i)
	}

	wg.Wait()
	fmt.Println("Items added:", f.Count())

}
Output:
Items added: 4000
Example (CustomParameters)

This example demonstrates creating a filter with explicit parameters.

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create with explicit block count and hash functions.
	// 1000 blocks * 512 bits = 512,000 bits total capacity.
	// k=7 hash functions is optimal for ~1% false positive rate.
	f := gloom.NewWithParams(1000, 7)

	f.AddString("custom")
	fmt.Println("Contains 'custom':", f.TestString("custom"))
	fmt.Printf("Blocks: %d, K: %d\n", f.NumBlocks(), f.K())

}
Output:
Contains 'custom': true
Blocks: 1000, K: 7
Example (Sharded)

This example shows ShardedAtomicFilter for high-throughput concurrent writes.

package main

import (
	"fmt"
	"sync"

	"github.com/jcalabro/gloom"
)

func main() {
	// ShardedAtomicFilter distributes writes across shards to reduce contention.
	// Use NewShardedAtomicDefault for auto-tuned shard count (based on GOMAXPROCS).
	f := gloom.NewShardedAtomicDefault(1_000_000, 0.01)

	var wg sync.WaitGroup

	// High-throughput parallel writes
	for i := range 8 {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := range 10_000 {
				f.AddString(fmt.Sprintf("key-%d-%d", id, j))
			}
		}(i)
	}

	wg.Wait()
	fmt.Println("Total items:", f.Count())

}
Output:
Total items: 80000
Example (Statistics)

This example shows how to monitor filter statistics.

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	f := gloom.New(10_000, 0.01)

	// Add some items
	for i := range 5000 {
		f.Add(fmt.Appendf(nil, "item-%d", i))
	}

	fmt.Printf("Capacity: %d bits\n", f.Cap())
	fmt.Printf("Hash functions (k): %d\n", f.K())
	fmt.Printf("Items added: %d\n", f.Count())
	fmt.Printf("Fill ratio: %.1f%%\n", f.EstimatedFillRatio()*100)

}
Output:
Capacity: 96256 bits
Hash functions (k): 7
Items added: 5000
Fill ratio: 30.4%
Example (StringKeys)

This example shows how to use string keys without allocation overhead.

package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	f := gloom.New(10_000, 0.01)

	// AddString and TestString avoid allocating when you have string keys
	f.AddString("user:12345")
	f.AddString("user:67890")

	fmt.Println("user:12345 exists:", f.TestString("user:12345"))
	fmt.Println("user:99999 exists:", f.TestString("user:99999"))

}
Output:
user:12345 exists: true
user:99999 exists: false

Index

Examples

Constants

View Source
const (
	// BlockBits is the number of bits per block (cache line size).
	BlockBits = 512
	// BlockWords is the number of uint64s per block.
	BlockWords = BlockBits / 64 // 8

)

Variables

View Source
var (
	// ErrInvalidData is returned when the serialized data is invalid or corrupted.
	ErrInvalidData = errors.New("gloom: invalid serialized data")

	// ErrUnsupportedVersion is returned when the serialization version is not supported.
	ErrUnsupportedVersion = errors.New("gloom: unsupported serialization version")

	// ErrInvalidK is returned when k value in serialized data is not supported.
	ErrInvalidK = errors.New("gloom: invalid k value in serialized data")
)

Functions

func ComputeOffsets

func ComputeOffsets(primes []uint32) []uint32

ComputeOffsets computes the cumulative bit offsets for each partition. offset[i] = sum of primes[0..i-1]

func EstimateFalsePositiveRate

func EstimateFalsePositiveRate(numBlocks uint64, k uint32, itemsAdded uint64) float64

EstimateFalsePositiveRate estimates the false positive rate for given parameters.

For a cache-line blocked bloom filter, items are distributed across blocks following a Poisson distribution (balls-into-bins). Some blocks receive more items than average, increasing their local FP rate. This function computes the expected per-block FP rate over this Poisson distribution:

FP = E[(1 - e^(-k*J/s))^k]  where J ~ Poisson(n/B), s = BlockBits

This gives a more accurate estimate than the standard formula (1 - e^(-kn/m))^k, which assumes uniform bit placement and underestimates the FP rate of blocked filters.

Example
package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Estimate false positive rate for given parameters
	numBlocks := uint64(1000)
	k := uint32(7)
	itemsAdded := uint64(50000)

	rate := gloom.EstimateFalsePositiveRate(numBlocks, k, itemsAdded)
	fmt.Printf("Estimated FP rate: %.2f%%\n", rate*100)

}
Output:
Estimated FP rate: 0.86%

func GetPrimePartition

func GetPrimePartition(k uint32) []uint32

GetPrimePartition returns the prime partition for the given k value. Returns nil if k is not supported.

func OptimalParams

func OptimalParams(expectedItems uint64, fpRate float64) (numBlocks uint64, k uint32, bitsPerItem float64)

OptimalParams calculates the optimal bloom filter parameters. Returns the number of blocks, number of hash functions (k), and bits per item.

Example
package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Calculate optimal parameters for your use case
	blocks, k, bitsPerItem := gloom.OptimalParams(1_000_000, 0.01)

	fmt.Printf("For 1M items at 1%% FP rate:\n")
	fmt.Printf("  Blocks: %d\n", blocks)
	fmt.Printf("  Hash functions (k): %d\n", k)
	fmt.Printf("  Bits per item: %.1f\n", bitsPerItem)

}
Output:
For 1M items at 1% FP rate:
  Blocks: 18721
  Hash functions (k): 7
  Bits per item: 9.6

Types

type AtomicFilter

type AtomicFilter struct {
	// contains filtered or unexported fields
}

AtomicFilter is a thread-safe bloom filter using atomic operations. It uses the same cache-line blocked one-hashing technique as Filter but with atomic.Uint64 for concurrent access.

func NewAtomic

func NewAtomic(expectedItems uint64, fpRate float64) *AtomicFilter

NewAtomic creates a new thread-safe bloom filter optimized for the expected number of items and desired false positive rate.

Example
package main

import (
	"fmt"
	"sync"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a thread-safe filter for concurrent access.
	f := gloom.NewAtomic(100_000, 0.01)

	// Safe to call from multiple goroutines
	var wg sync.WaitGroup
	wg.Add(2)

	go func() {
		defer wg.Done()
		f.AddString("from-goroutine-1")
	}()

	go func() {
		defer wg.Done()
		f.AddString("from-goroutine-2")
	}()

	wg.Wait()
	fmt.Println("Count:", f.Count())

}
Output:
Count: 2

func NewAtomicWithParams

func NewAtomicWithParams(numBlocks uint64, k uint32) *AtomicFilter

NewAtomicWithParams creates a new thread-safe bloom filter with explicit parameters.

func (*AtomicFilter) Add

func (f *AtomicFilter) Add(data []byte)

Add adds data to the bloom filter atomically.

func (*AtomicFilter) AddString

func (f *AtomicFilter) AddString(s string)

AddString adds a string to the bloom filter atomically without allocating.

func (*AtomicFilter) Cap

func (f *AtomicFilter) Cap() uint64

Cap returns the capacity of the filter in bits.

func (*AtomicFilter) Count

func (f *AtomicFilter) Count() uint64

Count returns the approximate number of items added to the filter.

func (*AtomicFilter) EstimatedFalsePositiveRate

func (f *AtomicFilter) EstimatedFalsePositiveRate() float64

EstimatedFalsePositiveRate estimates the current false positive rate.

func (*AtomicFilter) EstimatedFillRatio

func (f *AtomicFilter) EstimatedFillRatio() float64

EstimatedFillRatio estimates the proportion of bits that are set.

func (*AtomicFilter) K

func (f *AtomicFilter) K() uint32

K returns the number of hash functions (partitions) used.

func (*AtomicFilter) NumBlocks

func (f *AtomicFilter) NumBlocks() uint64

NumBlocks returns the number of 512-bit blocks in the filter.

func (*AtomicFilter) Test

func (f *AtomicFilter) Test(data []byte) bool

Test checks if data might be in the bloom filter. This operation is safe to call concurrently with Add.

func (*AtomicFilter) TestString

func (f *AtomicFilter) TestString(s string) bool

TestString checks if a string might be in the bloom filter.

type Filter

type Filter struct {
	// contains filtered or unexported fields
}

Filter is a non-thread-safe bloom filter using cache-line blocked one-hashing for optimal performance.

The filter divides memory into 512-bit (64-byte) blocks that fit in a single CPU cache line. Each block is partitioned into k segments using distinct prime sizes, enabling the one-hashing technique where a single hash value generates k independent bit positions via modulo operations.

func New

func New(expectedItems uint64, fpRate float64) *Filter

New creates a new bloom filter optimized for the expected number of items and desired false positive rate.

Example
package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a filter sized for 1 million items with 1% false positive rate.
	// The filter automatically calculates optimal size and hash functions.
	f := gloom.New(1_000_000, 0.01)

	f.Add([]byte("hello"))
	fmt.Println(f.Test([]byte("hello")))

}
Output:
true

func NewWithParams

func NewWithParams(numBlocks uint64, k uint32) *Filter

NewWithParams creates a new bloom filter with explicit parameters. numBlocks is the number of 512-bit blocks, k is the number of hash functions.

func UnmarshalBinary added in v0.0.2

func UnmarshalBinary(data []byte) (*Filter, error)

UnmarshalBinary deserializes a bloom filter from a byte slice. Returns an error if the data is invalid or corrupted.

func (*Filter) Add

func (f *Filter) Add(data []byte)

Add adds data to the bloom filter.

func (*Filter) AddString

func (f *Filter) AddString(s string)

AddString adds a string to the bloom filter without allocating.

func (*Filter) Cap

func (f *Filter) Cap() uint64

Cap returns the capacity of the filter in bits.

func (*Filter) Count

func (f *Filter) Count() uint64

Count returns the approximate number of items added to the filter.

func (*Filter) EstimatedFalsePositiveRate

func (f *Filter) EstimatedFalsePositiveRate() float64

EstimatedFalsePositiveRate estimates the current false positive rate based on the number of items added.

func (*Filter) EstimatedFillRatio

func (f *Filter) EstimatedFillRatio() float64

EstimatedFillRatio estimates the proportion of bits that are set.

func (*Filter) K

func (f *Filter) K() uint32

K returns the number of hash functions (partitions) used.

func (*Filter) MarshalBinary added in v0.0.2

func (f *Filter) MarshalBinary() ([]byte, error)

MarshalBinary serializes the bloom filter to a byte slice. The serialized format is:

  • Version (1 byte): serialization format version
  • K (4 bytes): number of hash functions (little-endian uint32)
  • NumBlocks (8 bytes): number of 512-bit blocks (little-endian uint64)
  • Count (8 bytes): number of items added (little-endian uint64)
  • Blocks (numBlocks * 64 bytes): the bit array data (little-endian uint64s)

The primes and offsets are not serialized as they can be derived from k.

func (*Filter) NumBlocks

func (f *Filter) NumBlocks() uint64

NumBlocks returns the number of 512-bit blocks in the filter.

func (*Filter) Test

func (f *Filter) Test(data []byte) bool

Test checks if data might be in the bloom filter. Returns true if the data might be present (with false positive probability), or false if the data is definitely not present.

func (*Filter) TestString

func (f *Filter) TestString(s string) bool

TestString checks if a string might be in the bloom filter without allocating.

type ShardedAtomicFilter

type ShardedAtomicFilter struct {
	// contains filtered or unexported fields
}

ShardedAtomicFilter is a thread-safe bloom filter that distributes writes across multiple shards to reduce contention under parallel workloads. Each shard is an independent AtomicFilter, and keys are consistently routed to shards based on their hash.

func NewShardedAtomic

func NewShardedAtomic(expectedItems uint64, fpRate float64, numShards uint64) *ShardedAtomicFilter

NewShardedAtomic creates a new sharded thread-safe bloom filter. numShards must be a power of 2 (will be rounded up if not). The total capacity is distributed evenly across shards.

Example
package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a sharded filter with explicit shard count.
	// Use powers of 2 for optimal performance.
	f := gloom.NewShardedAtomic(100_000, 0.01, 16)

	f.AddString("key1")
	fmt.Println("Shards:", f.NumShards())

}
Output:
Shards: 16

func NewShardedAtomicDefault

func NewShardedAtomicDefault(expectedItems uint64, fpRate float64) *ShardedAtomicFilter

NewShardedAtomicDefault creates a sharded filter with a number of shards automatically tuned to the current GOMAXPROCS value. This provides good parallel performance without over-sharding on smaller machines.

Example
package main

import (
	"fmt"

	"github.com/jcalabro/gloom"
)

func main() {
	// Create a sharded filter with auto-tuned shard count.
	// Shard count is set to GOMAXPROCS (minimum 4).
	f := gloom.NewShardedAtomicDefault(100_000, 0.01)

	f.AddString("auto-tuned")
	fmt.Println("Has key:", f.TestString("auto-tuned"))

}
Output:
Has key: true

func (*ShardedAtomicFilter) Add

func (f *ShardedAtomicFilter) Add(data []byte)

Add adds data to the bloom filter.

func (*ShardedAtomicFilter) AddString

func (f *ShardedAtomicFilter) AddString(s string)

AddString adds a string to the bloom filter without allocating.

func (*ShardedAtomicFilter) Cap

func (f *ShardedAtomicFilter) Cap() uint64

Cap returns the total capacity of all shards in bits.

func (*ShardedAtomicFilter) Count

func (f *ShardedAtomicFilter) Count() uint64

Count returns the approximate total number of items added.

func (*ShardedAtomicFilter) EstimatedFalsePositiveRate

func (f *ShardedAtomicFilter) EstimatedFalsePositiveRate() float64

EstimatedFalsePositiveRate estimates the current false positive rate. For sharded filters, this is approximately the average across shards.

func (*ShardedAtomicFilter) EstimatedFillRatio

func (f *ShardedAtomicFilter) EstimatedFillRatio() float64

EstimatedFillRatio estimates the average fill ratio across all shards.

func (*ShardedAtomicFilter) K

func (f *ShardedAtomicFilter) K() uint32

K returns the number of hash functions used per shard.

func (*ShardedAtomicFilter) NumBlocks

func (f *ShardedAtomicFilter) NumBlocks() uint64

NumBlocks returns the total number of blocks across all shards.

func (*ShardedAtomicFilter) NumShards

func (f *ShardedAtomicFilter) NumShards() uint64

NumShards returns the number of shards.

func (*ShardedAtomicFilter) Test

func (f *ShardedAtomicFilter) Test(data []byte) bool

Test checks if data might be in the bloom filter.

func (*ShardedAtomicFilter) TestString

func (f *ShardedAtomicFilter) TestString(s string) bool

TestString checks if a string might be in the bloom filter.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL