## README ¶

### Count-Min-Log

**TL;DR:** Count-Min-Log Sketch for improved Average Relative Error on low frequency events

Count-Min Sketch is a widely adopted algorithm for approximate event counting in large scale processing. However, the original version of the Count-Min-Sketch (CMS) suffers of some deficiences, especially if one is interested in the low-frequency items, such as in text- mining related tasks. Several variants of CMS have been proposed to compensate for the high relative error for low-frequency events, but the proposed solutions tend to correct the errors instead of preventing them. In this paper, we propose the Count-Min-Log sketch, which uses logarithm-based, approximate counters instead of linear counters to improve the average relative error of CMS at constant memory footprint.

This version implements the 16 bit register version. Will add back the 8-bit version soon.

#### Example Usage

```
import cml
...
sk, err := cml.NewDefaultSketch()
sk.IncreaseCount([]byte("scott pilgrim"))
...
sk.Frequency([]byte("scott pilgrim")) // ==> 1
```

## Documentation ¶

### Overview ¶

Package cml implments the Count-Min-Log datastructure.

It is based on Count-Min-Log sketch: Approximately counting with approximate counters - Guillaume Pitel & Geoffroy Fouquier

http://iswag-symposium.org/2015/pdfs/shortpaper1.pdf

TL;DR: Count-Min-Log Sketch for improved Average Relative Error on low frequency events

Count-Min Sketch is a widely adopted algorithm for approximate event counting in large scale processing. However, the original version of the Count-Min-Sketch (CMS) suffers of some deficiences, especially if one is interested in the low-frequency items, such as in text- mining related tasks. Several variants of CMS have been proposed to compensate for the high relative error for low-frequency events, but the proposed solutions tend to correct the errors instead of preventing them. In this paper, we propose the Count-Min-Log sketch, which uses logarithm-based, approximate counters instead of linear counters to improve the average relative error of CMS at constant memory footprint.

### Index ¶

### Constants ¶

This section is empty.

### Variables ¶

This section is empty.

### Functions ¶

This section is empty.

### Types ¶

#### type Sketch ¶

```
type Sketch struct {
// contains filtered or unexported fields
}
```

Sketch is a Count-Min-Log Sketch 16-bit registers

#### func NewForCapacity16 ¶

NewForCapacity16 returns a new Count-Min-Log Sketch with 16-bit registers optimized for a given max capacity and expected error rate

#### func NewSketchForEpsilonDelta ¶

NewSketchForEpsilonDelta for a given error rate epsiolen with a probability of delta

#### func (*Sketch) BulkUpdate ¶

BulkUpdate increases the count of `s` by one, return true if added and the current count of `s`