sroar

package module
v0.0.14 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 3, 2026 License: Apache-2.0 Imports: 11 Imported by: 15

README

sroar: Serialized Roaring Bitmaps

sroar is a re-written version of Roaring Bitmaps in Go, with the aim to have equality between in-memory representation and on-disk representation. An sroar.Bitmap does not need to be marshalled or unmarshalled, as the underlying represetation is a byte slice. Therefore, it can be written to disk, brought to memory, or shipped over the network immediately. This is needed in Dgraph, where we need to deal with lots of bitmaps.

sroar only implements array and bitmap containers. It does NOT implement run containers, which is an optimization that RoaringBitmaps has. Despite that, it outperforms RoaringBitmaps as shown in the Benchmarks section.

The code borrows concepts and code from RoaringBitmaps.

Benchmarks

The benchmarks were run:

  • Using real data set as described in RoaringBitmaps.
  • Only on the 64-bit version of roaring bitmaps (roaring64).
  • Only on FastOr, which is the more expensive operation than And or equivalent.
  • On AMD Ryzen Threadripper 2950X 16-Core Processor.
  • Using Go benchmarks serially.

Based on the benchmarks, sroar is:

  • 6.5x faster (-85% p50) for benchmarks >1ms, uses
  • 15x (-93.5% p50) less memory for allocations >1MB.
  • 25x fewer allocations.

The benchmark command and the results are:

$ go test -bench BenchmarkRealDataFastOr --run=XXX --count=5 --benchmem

name CPU                                    old time/op    new time/op    delta
RealDataFastOr/census1881-32                 302ms ± 2%       2ms ± 3%   -99.29%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes-32        76.5ms ± 1%     0.9ms ± 1%   -98.83%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes_srt-32    34.8ms ± 5%     1.0ms ± 2%   -97.07%  (p=0.008 n=5+5)
RealDataFastOr/dimension_033-32             55.0ms ± 3%     2.7ms ± 0%   -95.16%  (p=0.016 n=5+4)
RealDataFastOr/census1881_srt-32            36.8ms ± 3%     2.9ms ± 1%   -92.13%  (p=0.008 n=5+5)
RealDataFastOr/dimension_003-32             50.4ms ± 1%    11.6ms ± 4%   -77.06%  (p=0.008 n=5+5)
RealDataFastOr/dimension_008-32             10.0ms ± 2%     3.7ms ± 2%   -62.69%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85_srt-32       6.13ms ± 3%    2.72ms ± 2%   -55.66%  (p=0.008 n=5+5)
RealDataFastOr/census-income-32             1.70ms ± 3%    1.05ms ± 1%   -38.53%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85-32           2.28ms ± 2%    4.07ms ± 2%   +78.52%  (p=0.008 n=5+5)

RealDataFastOr/uscensus2000-32               556µs ± 2%     791µs ± 1%   +42.17%  (p=0.008 n=5+5)
RealDataFastOr/census-income_srt-32          260µs ± 4%     986µs ± 2%  +279.09%  (p=0.008 n=5+5)

name MEM_BYTES                             old alloc/op   new alloc/op   delta
RealDataFastOr/census1881-32                 585MB ± 0%       1MB ± 0%   -99.75%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes-32        76.3MB ± 0%     0.6MB ± 0%   -99.24%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes_srt-32    22.8MB ± 0%     0.6MB ± 0%   -97.46%  (p=0.008 n=5+5)
RealDataFastOr/census1881_srt-32            15.3MB ± 0%     1.4MB ± 0%   -90.58%  (p=0.008 n=5+5)
RealDataFastOr/dimension_003-32             7.78MB ± 0%    1.44MB ± 0%   -81.49%  (p=0.008 n=5+5)
RealDataFastOr/dimension_033-32             1.10MB ± 0%    1.44MB ± 0%   +30.92%  (p=0.008 n=5+5)

RealDataFastOr/dimension_008-32              537kB ± 0%      97kB ± 0%   -81.94%  (p=0.008 n=5+5)
RealDataFastOr/census-income-32              187kB ± 0%      70kB ± 0%   -62.86%  (p=0.008 n=5+5)
RealDataFastOr/census-income_srt-32         99.1kB ± 0%    69.6kB ± 0%   -29.81%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85_srt-32        375kB ± 0%     292kB ± 0%   -21.95%  (p=0.008 n=5+5)
RealDataFastOr/uscensus2000-32               169kB ± 0%     231kB ± 0%   +36.97%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85-32            169kB ± 0%     292kB ± 0%   +72.93%  (p=0.008 n=5+5)

name MEM_ALLOCS                           old allocs/op  new allocs/op  delta
RealDataFastOr/census1881_srt-32             29.7k ± 0%      0.0k ± 0%   -99.91%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes_srt-32     6.06k ± 0%     0.02k ± 0%   -99.74%  (p=0.008 n=5+5)
RealDataFastOr/dimension_003-32              4.57k ± 0%     0.03k ± 2%   -99.42%  (p=0.008 n=5+5)
RealDataFastOr/dimension_033-32              4.33k ± 0%     0.03k ± 0%   -99.38%  (p=0.000 n=5+4)
RealDataFastOr/uscensus2000-32               1.75k ± 0%     0.06k ± 0%   -96.85%  (p=0.008 n=5+5)
RealDataFastOr/dimension_008-32                704 ± 0%        23 ± 3%   -96.79%  (p=0.008 n=5+5)
RealDataFastOr/census-income-32                271 ± 0%         9 ± 0%   -96.68%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85_srt-32          248 ± 0%        14 ± 0%   -94.35%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85-32             81.0 ± 0%      14.0 ± 0%   -82.72%  (p=0.008 n=5+5)
RealDataFastOr/census-income_srt-32           40.0 ± 0%       9.0 ± 0%   -77.50%  (p=0.008 n=5+5)
RealDataFastOr/census1881-32                 54.5k ± 0%      0.0k ± 0%      ~     (p=0.079 n=4+5)
RealDataFastOr/wikileaks-noquotes-32         39.2k ± 0%      0.0k ± 0%      ~     (p=0.079 n=4+5)

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Bitmap

type Bitmap struct {
	// contains filtered or unexported fields
}

func And

func And(a, b *Bitmap) *Bitmap

func AndNot added in v0.0.5

func AndNot(a, b *Bitmap) *Bitmap

func AndOld added in v0.0.5

func AndOld(a, b *Bitmap) *Bitmap

func CopresenceByMask added in v0.0.14

func CopresenceByMask(bms []*Bitmap, mask uint64) *Bitmap

CopresenceByMask returns a new bitmap containing every value v in the union of bms whose masked value (v & mask) is present in bms[i].Masked(mask) for every i. Original (unmasked) values are preserved. Equivalent in result to the (much more expensive) fold:

out := bms[0].Masked(mask)
for i := 1; i < len(bms); i++ {
    out = out.And(bms[i].Masked(mask))
}
// then map each masked value back to every original v in any bms[i]
// whose v & mask matches.

but processes all inputs in one pass — avoiding intermediate bitmap allocations and walking each input only once. None of the bms are modified.

Mask shape requirement: mask & 0xFFFF == 0xFFFF. Other mask shapes are not supported and will yield incorrect results.

Edge cases:

  • len(bms) == 0: empty result.
  • len(bms) == 1: result is a clone of bms[0] (single-input co-presence is trivially the input).
  • any input empty: empty result.

Algorithm: with mask_high = mask & 0xFFFFFFFFFFFF0000, group each input's containers by K & mask_high. Walk the masked-key entry slices of all N inputs in lockstep (multi-pointer max-key advance) to find groups present on every side. For each such group:

  1. compute pos_i = OR of positions across input i's containers in the group
  2. common_pos = AND_i (pos_i) — positions co-present in every side
  3. for each unique original key K appearing in any input's group, emit OR_i(bms[i][K]) AND common_pos at K (omitting empty results)

Step 2 short-circuits as soon as common_pos becomes empty.

func CopresenceByMaskToBuf added in v0.0.14

func CopresenceByMaskToBuf(bms []*Bitmap, mask uint64, buf []byte) *Bitmap

CopresenceByMaskToBuf is like CopresenceByMask but uses the provided byte slice as the underlying buffer for the result bitmap, avoiding the result's heap allocation when the buffer is large enough. If the buffer is too small the bitmap grows internally (and allocates), so correctness is unaffected — only the allocation-elision is.

Sizing guidance: a safe upper bound is the sum of input byte sizes (sum_i bms[i].LenInBytes()). The result's keys area is bounded by the sum of input keys areas, and each result container is bounded by the sum of its contributing inputs' containers (the per-key OR can't exceed it; the subsequent AND with commonPos can only shrink). In practice results are often much smaller — callers that know their workload's typical density can pass a smaller buffer and accept occasional growth.

Same mask shape requirement as CopresenceByMask: mask & 0xFFFF == 0xFFFF.

func FastAnd

func FastAnd(bitmaps ...*Bitmap) *Bitmap

func FastOr

func FastOr(bitmaps ...*Bitmap) *Bitmap

FastOr would merge given Bitmaps into one Bitmap. This is faster than doing an OR over the bitmaps iteratively.

func FastParOr

func FastParOr(numGo int, bitmaps ...*Bitmap) *Bitmap

FastParOr would group up bitmaps and call FastOr on them concurrently. It would then merge the groups into final Bitmap. This approach is simpler and faster than operating at a container level, because we can't operate on array containers belonging to the same Bitmap concurrently because array containers can expand, leaving no clear boundaries.

If FastParOr is called with numGo=1, it just calls FastOr.

Experiments with numGo=4 shows that FastParOr would be 2x the speed of FastOr, but 4x the memory usage, even under 50% CPU usage. So, use wisely.

func FromBuffer

func FromBuffer(data []byte) *Bitmap

FromBuffer returns a pointer to bitmap corresponding to the given buffer. This bitmap shouldn't be modified because it might corrupt the given buffer.

func FromBufferUnlimited added in v0.0.10

func FromBufferUnlimited(buf []byte) *Bitmap

FromBufferUnlimited returns a pointer to bitmap corresponding to the given buffer. Entire buffer capacity is utlized for future bitmap modifications and expansions.

func FromBufferWithCopy

func FromBufferWithCopy(src []byte) *Bitmap

FromBufferWithCopy creates a copy of the given buffer and returns a bitmap based on the copied buffer. This bitmap is safe for both read and write operations.

func FromSortedList

func FromSortedList(vals []uint64) *Bitmap

func MaskedAnd added in v0.0.14

func MaskedAnd(a, b *Bitmap, mask uint64) *Bitmap

MaskedAnd returns a new bitmap containing the AND of a and b, with mask applied to all keys. Keys that become equal after masking have their containers merged via OR. This is equivalent to Masked(And(a, b), mask) but allocates only one bitmap instead of two. Neither a nor b is modified. The lowest 16 bits of keys are always zeroed.

func MaskedAndToBuf added in v0.0.14

func MaskedAndToBuf(a, b *Bitmap, mask uint64, buf []byte) *Bitmap

MaskedAndToBuf is like MaskedAnd but uses the provided byte slice as the underlying buffer for the result bitmap, avoiding heap allocation when the buffer is large enough. The lowest 16 bits of keys are always zeroed.

func NewBitmap

func NewBitmap() *Bitmap

func NewBitmapToBuf added in v0.0.14

func NewBitmapToBuf(buf []byte) *Bitmap

NewBitmapToBuf creates a new bitmap using the provided byte slice as the underlying data buffer. If the buffer is too small to hold the initial bitmap structure, a heap-allocated bitmap is returned instead. The _ptr field is set to keep a GC reference to buf, since the bitmap operates on an unsafe []uint16 view of the same memory.

func NewBitmapWith

func NewBitmapWith(numKeys int) *Bitmap

func Or

func Or(a, b *Bitmap) *Bitmap

func OrOld added in v0.0.5

func OrOld(a, b *Bitmap) *Bitmap

func Prefill added in v0.0.9

func Prefill(maxX uint64) *Bitmap

Prefill creates bitmap prefilled with elements [0-maxX]

func (*Bitmap) And

func (ra *Bitmap) And(bm *Bitmap) *Bitmap

And performs in-place AND of ra and bm (ra &= bm). ra drives the two-pointer walk: ra containers with no matching bm key must be zeroed out, so every ra container must be visited.

func (*Bitmap) AndConc added in v0.0.9

func (ra *Bitmap) AndConc(bm *Bitmap, maxConcurrency int) *Bitmap

AndConc performs And merge concurrently (ra &= bm). ra drives the walk: every ra container must be visited to zero out those with no matching bm key. Concurrency is calculated based on number of internal containers in ra, so that each goroutine handles at least [minContainersPerRoutine] containers. maxConcurrency limits concurrency calculated internally. If maxConcurrency <= 0, then calculated concurrency is not limited.

E.g.: dst bitmap has 100 containers. Internal concurrency = 100/24 = 4. For: - maxConcurrency = 2, there will be 2 goroutines executed - maxConcurrency = 6, there will be 4 goroutines executed

func (*Bitmap) AndMasked added in v0.0.14

func (ra *Bitmap) AndMasked(b *Bitmap, mask uint64) *Bitmap

AndMasked is equivalent to ra.And(b.Masked(mask)) but avoids allocating an intermediate masked bitmap. b is not modified.

For each key k in ra, AndMasked finds all keys k' in b where k'&mask == k, ORs their containers together, then ANDs the result into ra's container at k. If no key in b maps to k under the mask, ra's container at k is zeroed out. The lowest 16 bits of the mask are always ignored.

func (*Bitmap) AndMaskedConc added in v0.0.14

func (ra *Bitmap) AndMaskedConc(b *Bitmap, mask uint64, maxConcurrency int) *Bitmap

AndMaskedConc is like AndMasked but processes ra's containers concurrently. Concurrency is calculated based on number of internal containers in ra, so that each goroutine handles at least [minContainersPerRoutine] containers. maxConcurrency limits concurrency calculated internally. If maxConcurrency <= 0, then calculated concurrency is not limited.

func (*Bitmap) AndNot

func (ra *Bitmap) AndNot(bm *Bitmap) *Bitmap

AndNot performs in-place AND-NOT of ra and bm (ra &^= bm). Neither bitmap has unmatched containers requiring mandatory action: unmatched ra containers are kept, unmatched bm containers are irrelevant. The smaller bitmap drives for a minor sequential optimisation.

func (*Bitmap) AndNotConc added in v0.0.9

func (ra *Bitmap) AndNotConc(bm *Bitmap, maxConcurrency int) *Bitmap

AndNotConc performs AndNot merge concurrently (ra &^= bm). Neither bitmap has unmatched containers requiring mandatory action, so the smaller bitmap drives to minimise goroutine count. Concurrency is calculated based on number of internal containers in the smaller bitmap, so that each goroutine handles at least [minContainersPerRoutine] containers. maxConcurrency limits concurrency calculated internally. If maxConcurrency <= 0, then calculated concurrency is not limited.

E.g.: smaller bitmap has 100 containers. Internal concurrency = 100/24 = 4. For: - maxConcurrency = 2, there will be 2 goroutines executed - maxConcurrency = 6, there will be 4 goroutines executed

func (*Bitmap) AndNotOld added in v0.0.5

func (ra *Bitmap) AndNotOld(bm *Bitmap)

func (*Bitmap) AndOld added in v0.0.5

func (ra *Bitmap) AndOld(bm *Bitmap)

func (*Bitmap) Cleanup

func (ra *Bitmap) Cleanup()

func (*Bitmap) Clone

func (ra *Bitmap) Clone() *Bitmap

func (*Bitmap) CloneToBuf added in v0.0.9

func (ra *Bitmap) CloneToBuf(buf []byte) *Bitmap

func (*Bitmap) Contains

func (ra *Bitmap) Contains(x uint64) bool

func (*Bitmap) ConvertToBitmapContainers added in v0.0.3

func (ra *Bitmap) ConvertToBitmapContainers()

func (*Bitmap) Debug

func (ra *Bitmap) Debug(x uint64) string

func (*Bitmap) FillUp added in v0.0.9

func (ra *Bitmap) FillUp(maxX uint64)

FillUp fill bitmap with elements (maximum-maxX], where maximum means last element. If bitmap is empty then [0-maxX] elements are added (reusing underlying data slice if big enough to fit all elements). If last element is >= than given maxX nothing is done.

func (*Bitmap) GetCardinality

func (ra *Bitmap) GetCardinality() int

func (*Bitmap) Intersects added in v0.0.14

func (ra *Bitmap) Intersects(bm *Bitmap) bool

Intersects reports whether ra and bm share at least one element. Equivalent to !ra.Clone().And(bm).IsEmpty() but short-circuits at the first common element without allocating a result bitmap.

func (*Bitmap) IntersectsMasked added in v0.0.14

func (ra *Bitmap) IntersectsMasked(bm *Bitmap, mask uint64) bool

IntersectsMasked reports whether ra and bm share at least one element after applying mask to bm's keys. Equivalent to !ra.Clone().And(bm.Masked(mask)).IsEmpty() but short-circuits at the first common element without allocating. Containers in ra whose key has a non-zero dimension (low 16 bits != 0) are not checked — same behaviour as AndMasked.

func (*Bitmap) IsEmpty

func (ra *Bitmap) IsEmpty() bool

func (*Bitmap) LenInBytes added in v0.0.9

func (ra *Bitmap) LenInBytes() int

func (*Bitmap) ManyIterator

func (r *Bitmap) ManyIterator() *ManyItr

TODO: See if this is needed, we should remove this

func (*Bitmap) Masked added in v0.0.14

func (ra *Bitmap) Masked(mask uint64) *Bitmap

Masked applies the given mask to every key and returns a new bitmap. Keys that collapse to the same masked value have their containers merged via container-level OR operations.

func (*Bitmap) MaskedToBuf added in v0.0.14

func (ra *Bitmap) MaskedToBuf(mask uint64, buf []byte) *Bitmap

MaskedToBuf is like Masked but uses the provided byte slice as the underlying buffer for the result bitmap, avoiding heap allocation when the buffer is large enough.

func (*Bitmap) Maximum

func (ra *Bitmap) Maximum() uint64

func (*Bitmap) Minimum

func (ra *Bitmap) Minimum() uint64

func (*Bitmap) NewIterator

func (bm *Bitmap) NewIterator() *Iterator

func (*Bitmap) NewRangeIterators

func (bm *Bitmap) NewRangeIterators(numRanges int) []*Iterator

func (*Bitmap) NumContainers added in v0.0.14

func (ra *Bitmap) NumContainers() int

func (*Bitmap) Or

func (ra *Bitmap) Or(bm *Bitmap) *Bitmap

Or performs in-place OR of ra and bm (ra |= bm). bm drives the two-pointer walk: bm containers with no matching ra key must be added to ra, so every bm container must be visited. Ra containers with no matching bm key are left unchanged — no visit required.

func (*Bitmap) OrConc added in v0.0.9

func (ra *Bitmap) OrConc(bm *Bitmap, maxConcurrency int) *Bitmap

OrConc performs Or merge concurrently (ra |= bm). bm drives the walk: every bm container must be visited to add those with no matching ra key. Concurrency is calculated based on number of internal containers in bm, so that each goroutine handles at least [minContainersPerRoutine] containers. maxConcurrency limits concurrency calculated internally. If maxConcurrency <= 0, then calculated concurrency is not limited.

E.g.: src bitmap has 100 containers. Internal concurrency = 100/24 = 4. For: - maxConcurrency = 2, there will be 2 goroutines executed - maxConcurrency = 6, there will be 4 goroutines executed

func (*Bitmap) OrOld added in v0.0.5

func (dst *Bitmap) OrOld(src *Bitmap)

TODO: Check if we want to use lazyMode

func (*Bitmap) Rank

func (ra *Bitmap) Rank(x uint64) int

func (*Bitmap) Remove

func (ra *Bitmap) Remove(x uint64) bool

func (*Bitmap) RemoveRange

func (ra *Bitmap) RemoveRange(lo, hi uint64)

Remove range removes [lo, hi) from the bitmap.

func (*Bitmap) Reset

func (ra *Bitmap) Reset()

func (*Bitmap) Select

func (ra *Bitmap) Select(x uint64) (uint64, error)

Select returns the element at the xth index. (0-indexed)

func (*Bitmap) Set

func (ra *Bitmap) Set(x uint64) bool

func (*Bitmap) SetMany

func (ra *Bitmap) SetMany(vals []uint64)

TODO: Potentially this can be optimized.

func (*Bitmap) Split

func (bm *Bitmap) Split(externalSize func(start, end uint64) uint64, maxSz uint64) []*Bitmap

Split splits the bitmap based on maxSz and the externalSize function. It splits the bitmap such that size of each split bitmap + external size corresponding to its elements approximately equal to maxSz (it can be greater than maxSz sometimes). The splits are returned in sorted order. externalSize is a function that should return the external size corresponding to elements in range [start, end]. External size is used to calculate the split boundaries.

func (*Bitmap) String

func (ra *Bitmap) String() string

func (*Bitmap) ToArray

func (ra *Bitmap) ToArray() []uint64

func (*Bitmap) ToBuffer

func (ra *Bitmap) ToBuffer() []byte

func (*Bitmap) ToBufferWithCopy

func (ra *Bitmap) ToBufferWithCopy() []byte

func (*Bitmap) ZeroOut added in v0.0.13

func (ra *Bitmap) ZeroOut()

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

func (*Iterator) Next

func (it *Iterator) Next() uint64

type ManyItr

type ManyItr struct {
	// contains filtered or unexported fields
}

func (*ManyItr) NextMany

func (itr *ManyItr) NextMany(buf []uint64) int

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL