datatasketches

v0.2.0
Published: Jun 4, 2026 License: Apache-2.0 Imports: 0 Imported by: 0

README

Apache® DataSketches™ Core Go Library Component

This is the core Go component of the DataSketches library. It contains some of the sketching algorithms and can be accessed directly from user applications.

This project is currently under development. Breaking changes may occur before a stable release.

Note that parallel core library components exist for C++, Java, Python, and Rust, each implementing many of the same sketch algorithms.

Please visit the main DataSketches website for more information.

If you are interested in making contributions to this site, please see our Community page for information on how to contact us.

Major Sketches

| Type | Implementation | Status |
|------|----------------|--------|
| Cardinality | CpcSketch | |
| | HllSketch | |
| | ThetaSketch | |
| | TupleSketch | |
| Quantiles | CormodeDoublesSketch | |
| | CormodeItemsSketch | |
| | KllDoublesSketch | |
| | KllFloatsSketch | ⚠️ |
| | KllSketch | |
| | ReqFloatsSketch | 🚧 |
| | TDigestDouble | |
| Frequencies | FreqLongsSketch | |
| | FreqItemsSketch | |
| | CountMinSketch | |
| Sampling | ReservoirLongsSketch | ✅* |
| | ReservoirItemsSketch | ✅* |
| | VarOptItemsSketch | ✅* |
| Membership | BloomFilter | |
| Density | DensitySketch | |

Specialty Sketches

| Type | Interface Name | Status |
|------|----------------|--------|
| Cardinality/FM85 | UniqueCountMap | |
| Cardinality/Tuple | FdtSketch | |
| | ArrayOfDoublesSketch | |
| | DoubleSketch | |
| | IntegerSketch | |
| | ArrayOfStringsSketch | ⚠️ |
| | EngagementTest3 | |

✅ = Released in v0.1.0

✅* = Released in v0.1.0, but partially implemented and unstable (API may change)

❌ = Not yet implemented

⚠️ = Implemented but not officially released

🚧 = In progress

Build & Runtime Dependencies

This code requires Go 1.24.

Compilation and Test

Tests can be run using the go test command:

go test ./...

A Dockerfile is also provided with the environment needed to build and test the project:

./build/Dockerfile
./build/run-docker-test.sh

Documentation

Overview

Package datatasketches is the parent package for all sketch families and common code areas.

The Sketching Core Library provides a range of stochastic streaming algorithms that are particularly useful when integrating this technology into systems that must deal with massive data. The library is designed to be easy to use, highly performant, and memory efficient.
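
To make the idea of a bounded-memory, randomized streaming summary concrete, here is a minimal sketch of classic reservoir sampling (Algorithm R) using only the Go standard library. It is an illustration of the general technique, not this library's implementation or API; the names Reservoir, NewReservoir, Offer, and Items are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Reservoir keeps a uniform random sample of at most k items from a stream
// of unknown length. Hypothetical illustration type, not part of the library.
type Reservoir struct {
	k     int
	seen  int
	items []int
	rng   *rand.Rand
}

// NewReservoir creates a reservoir of capacity k with a fixed seed so that
// runs are reproducible.
func NewReservoir(k int, seed int64) *Reservoir {
	return &Reservoir{
		k:     k,
		items: make([]int, 0, k),
		rng:   rand.New(rand.NewSource(seed)),
	}
}

// Offer presents one stream item to the reservoir.
func (r *Reservoir) Offer(x int) {
	r.seen++
	if len(r.items) < r.k {
		// The first k items are always retained.
		r.items = append(r.items, x)
		return
	}
	// Afterwards, the new item replaces a random slot with probability k/seen.
	if j := r.rng.Intn(r.seen); j < r.k {
		r.items[j] = x
	}
}

// Items returns the current sample.
func (r *Reservoir) Items() []int { return r.items }

func main() {
	r := NewReservoir(5, 42)
	for i := 0; i < 1000000; i++ {
		r.Offer(i)
	}
	// Memory use stays proportional to k regardless of stream length.
	fmt.Println(len(r.Items()), "items retained from", r.seen, "seen")
}
```

The point of the example is the trade-off the overview describes: the summary never grows beyond k items, at the cost of returning a sample rather than the full data.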

Directories

Path Synopsis
Package filters provides probabilistic membership data structures for efficient set membership testing with controlled false positive rates.
Package frequencies is dedicated to streaming algorithms that enable estimation of the frequency of occurrence of items in a weighted multiset stream of items.
Package hll is dedicated to streaming algorithms that enable estimation of the cardinality of a stream of items.
binomialproportionsbounds
Package binomialproportionsbounds computes an approximation to the Clopper-Pearson confidence interval for a binomial proportion.
Package kll is an implementation of a very compact quantiles sketch with lazy compaction scheme and nearly optimal accuracy per retained quantile.
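
As a rough illustration of what "probabilistic membership with controlled false positive rates" means for the filters package, the following is a minimal Bloom filter written against the Go standard library only. It does not use or mirror this module's API; Bloom, NewBloom, Add, and MightContain are hypothetical names, and the double-hashing scheme over FNV-1a is an assumption chosen for brevity.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a toy Bloom filter with m bits and k hash probes per item.
// Hypothetical illustration type, not part of the library.
type Bloom struct {
	bits []bool
	k    int
}

// NewBloom allocates a filter with m bits and k probes per item.
func NewBloom(m, k int) *Bloom {
	return &Bloom{bits: make([]bool, m), k: k}
}

// positions derives k bit indexes from the two halves of one 64-bit
// FNV-1a hash (double hashing: h1 + i*h2 mod m).
func (b *Bloom) positions(s string) []int {
	h := fnv.New64a()
	h.Write([]byte(s))
	sum := h.Sum64()
	h1 := uint64(uint32(sum))
	h2 := uint64(uint32(sum>>32) | 1) // force odd so the stride is never zero
	out := make([]int, b.k)
	for i := range out {
		out[i] = int((h1 + uint64(i)*h2) % uint64(len(b.bits)))
	}
	return out
}

// Add records an item in the filter.
func (b *Bloom) Add(s string) {
	for _, p := range b.positions(s) {
		b.bits[p] = true
	}
}

// MightContain reports false only if the item was definitely never added.
// A true result may be a false positive.
func (b *Bloom) MightContain(s string) bool {
	for _, p := range b.positions(s) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloom(1024, 3)
	f.Add("apple")
	fmt.Println(f.MightContain("apple"))  // always true: no false negatives
	fmt.Println(f.MightContain("banana")) // usually false, but may be a false positive
}
```

The false positive rate depends on the number of bits, the number of probes, and how many items have been added, which is the sense in which it is controlled.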
