Documentation
¶
Overview ¶
Package bloom provides a space-efficient, concurrent-safe Bloom filter with zero allocations per operation.
A Bloom filter is a probabilistic data structure that can answer the question "Is this item in the set?" with either a definitive No or a probabilistic Probably. There are no false negatives: if Contains returns false, the item was never added. False positives are possible but their rate is controlled by the filter's size and number of hash functions.
The filter uses atomic.Uint64 for its bitset, making Add and Contains fully safe for concurrent use without any locks. Hashers are recycled via sync.Pool to eliminate per-call heap allocations.
Hash independence is achieved through the Kirsch-Mitzenmacher optimization: a single 64-bit hash is split into two 32-bit halves to simulate k independent hash functions. The default hasher is FNV-1a from the standard library; a custom hash.Hash64 can be injected via WithHashFunc.
Filters implement encoding.BinaryMarshaler and encoding.BinaryUnmarshaler for compact serialization.
Index ¶
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Filter ¶
type Filter struct {
// contains filtered or unexported fields
}
Filter represents a Bloom filter. We use atomic.Uint64 so that Add and Contains are fully concurrent-safe without requiring an external mutex. On modern 64-bit architectures, atomic operations on aligned 64-bit words are essentially free.
func NewFilter ¶
NewFilter creates a Bloom filter directly with a specific bit size (m) and number of hash iterations (k).
func NewFilterFromProbability ¶
NewFilterFromProbability creates a Bloom filter tailored for an expected number of elements (`n`) and a desired false-positive probability (`p`). Example: NewFilterFromProbability(10000, 0.01) creates a filter for 10k items with a 1% chance of a false positive.
Example ¶
// Create a filter for 1000 items with a 1% false positive rate
f := NewFilterFromProbability(1000, 0.01)
// Add an item
f.Add([]byte("my-database-key"))
// Check if it exists
if f.Contains([]byte("my-database-key")) {
fmt.Println("Item is probably in the set.")
}
if !f.Contains([]byte("non-existent-key")) {
fmt.Println("Item is definitively NOT in the set.")
}
Output: Item is probably in the set. Item is definitively NOT in the set.
func (*Filter) Add ¶
Add inserts data into the Bloom filter. It hashes the data once, splits the 64-bit result into two 32-bit halves, and simulates `k` independent hashes via Kirsch-Mitzenmacher. This method is safe for concurrent use.
func (*Filter) Contains ¶
Contains checks if data might be in the set. If *any* of the `k` hash positions are 0, the element was definitively never added. If *all* of the positions are 1, it *might* have been added (or we have a collision). This method is safe for concurrent use.
func (*Filter) MarshalBinary ¶
MarshalBinary implements the encoding.BinaryMarshaler interface. It serializes the filter's parameters and bitset into a portable binary format. The hash function is NOT serialized — upon UnmarshalBinary, the default FNV-1a hasher is used. Use WithHashFunc after unmarshaling if needed.
func (*Filter) UnmarshalBinary ¶
UnmarshalBinary implements the encoding.BinaryUnmarshaler interface. It restores the filter's parameters and bitset from data produced by MarshalBinary. If the receiver was created with a custom hasher (via WithHashFunc), that hasher is preserved. Otherwise, it defaults to FNV-1a.
type HashFunc ¶
HashFunc is a function that returns a new hash.Hash64. By defining an interface for our hasher, we avoid hardcoding a specific algorithm. This gives users the flexibility to swap the default hasher (FNV-1a) with a faster one (like Murmur3 or xxHash) if their use case requires maximum throughput.
type Option ¶
type Option func(*Filter)
Option configures the Filter using the functional options pattern. This pattern allows us to provide sensible defaults while remaining open for extension later, without breaking the API.
func WithHashFunc ¶
WithHashFunc allows providing a custom hash function constructor.