stream

v0.2.0
Published: Jan 4, 2024 License: MIT Imports: 0 Imported by: 0

README

Stream

Stream is a Go library for online statistical algorithms. Provided statistics can be computed globally over an entire stream, or over a rolling window.

Installation

Use go get:

go get github.com/alexander-yu/stream

Example Usage

In-depth examples are provided in the examples directory, but a small taste is provided below:

// tracks the autocorrelation over a
// rolling window of size 15 and lag of 5
autocorr, err := joint.NewAutocorr(5, 15)
// handle err

// all metrics in the joint package must be passed
// through joint.Init in order to consume values
err = joint.Init(autocorr)
// handle err

// tracks the global median using a pair of heaps
median, err := quantile.NewGlobalHeapMedian()
// handle err

for i := 0.; i < 100; i++ {
    err = autocorr.Push(i)
    // handle err

    err = median.Push(i)
    // handle err
}

autocorrVal, err := autocorr.Value()
// handle err

medianVal, err := median.Value()
// handle err

fmt.Printf("%s: %f\n", autocorr.String(), autocorrVal)
fmt.Printf("%s: %f\n", median.String(), medianVal)

Statistics

For time/space complexity details on the algorithms listed below, see here.

Quantile
Quantile

Quantile keeps track of the quantiles of a stream. It can calculate quantiles globally or over a rolling window. You can also configure which implementation to use as the underlying data structure, as well as which interpolation method to use when a quantile falls between two elements. For now, skip lists and order statistic trees (in particular, modified forms of AVL trees and red-black trees) are supported.
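
To make the interpolation choice concrete, here is a minimal sketch that computes a quantile from an already sorted slice. It does not use this library: quantileSorted and the linear/midpoint constants are hypothetical names, and the rank convention (fractional position q*(n-1)) is an assumption, one of several in common use.

```go
package main

import (
	"fmt"
	"math"
)

// interpolation selects how to resolve a quantile that falls between two
// adjacent elements. These names are illustrative only.
type interpolation int

const (
	linear interpolation = iota
	midpoint
)

// quantileSorted returns the q-quantile (0 <= q <= 1) of a sorted,
// non-empty slice, using q*(n-1) as the fractional rank (an assumption).
func quantileSorted(sorted []float64, q float64, method interpolation) float64 {
	pos := q * float64(len(sorted)-1)
	lo := int(math.Floor(pos))
	hi := int(math.Ceil(pos))
	if lo == hi {
		return sorted[lo] // the quantile lands exactly on an element
	}
	if method == midpoint {
		return (sorted[lo] + sorted[hi]) / 2
	}
	// Linear interpolation weights the neighbours by the fractional rank.
	frac := pos - float64(lo)
	return sorted[lo] + frac*(sorted[hi]-sorted[lo])
}

func main() {
	data := []float64{1, 2, 3, 4}
	fmt.Println(quantileSorted(data, 0.5, midpoint))
	fmt.Println(quantileSorted(data, 0.25, linear))
}
```

With the midpoint method the result is always the average of the two neighbouring elements, regardless of how close the fractional rank is to either one.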

Median

Median keeps track of the median of a stream; it is a convenience wrapper over Quantile that sets the quantile to 0.5 and the interpolation method to the midpoint method.

IQR

IQR keeps track of the interquartile range of a stream; it is a convenience wrapper over Quantile that retrieves the 1st and 3rd quartiles and sets the interpolation method to the midpoint method.

HeapMedian

HeapMedian keeps track of the median of a stream with a pair of heaps. In particular, it uses a max-heap and a min-heap to keep track of elements below and above the median, respectively. HeapMedian can calculate the global median of a stream, or over a rolling window.
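
The two-heap idea described above can be sketched with the standard library's container/heap. This is an illustration of the general technique for the global case only, not the library's implementation; floatHeap, twoHeapMedian and newTwoHeapMedian are hypothetical names.

```go
package main

import (
	"container/heap"
	"fmt"
)

// floatHeap is a binary heap of float64 values; less decides the ordering,
// so the same type serves as both a min-heap and a max-heap.
type floatHeap struct {
	vals []float64
	less func(a, b float64) bool
}

func (h *floatHeap) Len() int           { return len(h.vals) }
func (h *floatHeap) Less(i, j int) bool { return h.less(h.vals[i], h.vals[j]) }
func (h *floatHeap) Swap(i, j int)      { h.vals[i], h.vals[j] = h.vals[j], h.vals[i] }
func (h *floatHeap) Push(x interface{}) { h.vals = append(h.vals, x.(float64)) }
func (h *floatHeap) Pop() interface{} {
	n := len(h.vals)
	v := h.vals[n-1]
	h.vals = h.vals[:n-1]
	return v
}

// twoHeapMedian tracks the global median of all pushed values.
type twoHeapMedian struct {
	low  *floatHeap // max-heap: elements at or below the median
	high *floatHeap // min-heap: elements at or above the median
}

func newTwoHeapMedian() *twoHeapMedian {
	return &twoHeapMedian{
		low:  &floatHeap{less: func(a, b float64) bool { return a > b }},
		high: &floatHeap{less: func(a, b float64) bool { return a < b }},
	}
}

// Push inserts x and rebalances so that low holds either the same number
// of elements as high, or exactly one more.
func (m *twoHeapMedian) Push(x float64) {
	if m.low.Len() == 0 || x <= m.low.vals[0] {
		heap.Push(m.low, x)
	} else {
		heap.Push(m.high, x)
	}
	if m.low.Len() > m.high.Len()+1 {
		heap.Push(m.high, heap.Pop(m.low))
	} else if m.high.Len() > m.low.Len() {
		heap.Push(m.low, heap.Pop(m.high))
	}
}

// Value returns the median, or false if nothing has been pushed yet.
func (m *twoHeapMedian) Value() (float64, bool) {
	if m.low.Len() == 0 {
		return 0, false
	}
	if m.low.Len() > m.high.Len() {
		return m.low.vals[0], true
	}
	return (m.low.vals[0] + m.high.vals[0]) / 2, true
}

func main() {
	m := newTwoHeapMedian()
	for _, x := range []float64{5, 1, 4, 2, 3} {
		m.Push(x)
	}
	median, _ := m.Value()
	fmt.Println(median)
}
```

Each push costs O(log n) and reading the median is O(1), since the answer always sits at the top of one or both heaps. A rolling-window variant additionally needs a way to remove arbitrary elements, which this sketch does not attempt.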

Min/Max
Min

Min keeps track of the minimum of a stream; it can track either the global minimum, or over a rolling window.

Max

Max keeps track of the maximum of a stream; it can track either the global maximum, or over a rolling window.
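
One common way to track a rolling maximum in amortized constant time per push is a monotonic deque. The sketch below illustrates that technique under the assumption of a positive window size; it is not taken from the library, and rollingMax and newRollingMax are hypothetical names. A rolling minimum follows by flipping the comparison.

```go
package main

import "fmt"

// rollingMax tracks the maximum of the last `window` pushed values using a
// monotonic deque: values are kept in decreasing order, so the front is
// always the current maximum.
type rollingMax struct {
	window int
	count  int       // total number of values pushed so far
	idx    []int     // push index of each deque entry
	vals   []float64 // value of each deque entry, in decreasing order
}

// newRollingMax assumes window > 0.
func newRollingMax(window int) *rollingMax {
	return &rollingMax{window: window}
}

func (r *rollingMax) Push(x float64) {
	// Drop smaller-or-equal values from the back; they can never be the
	// maximum again while x is in the window.
	for len(r.vals) > 0 && r.vals[len(r.vals)-1] <= x {
		r.vals = r.vals[:len(r.vals)-1]
		r.idx = r.idx[:len(r.idx)-1]
	}
	r.vals = append(r.vals, x)
	r.idx = append(r.idx, r.count)
	r.count++
	// Evict the front entry once it has slid out of the window.
	if r.idx[0] <= r.count-1-r.window {
		r.vals = r.vals[1:]
		r.idx = r.idx[1:]
	}
}

// Value returns the current maximum, or false if nothing has been pushed.
func (r *rollingMax) Value() (float64, bool) {
	if len(r.vals) == 0 {
		return 0, false
	}
	return r.vals[0], true
}

func main() {
	r := newRollingMax(3)
	for _, x := range []float64{1, 3, 2, 0, 1} {
		r.Push(x)
		cur, _ := r.Value()
		fmt.Println(cur)
	}
}
```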

Moment-Based Statistics
Mean

Mean keeps track of the mean of a stream; it can track either the global mean, or over a rolling window.
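
As an illustration of the rolling-window case, the following sketch keeps a ring buffer of the most recent values together with their running sum. It is a simplified stand-in rather than the library's code: rollingMean and newRollingMean are hypothetical names, the window is assumed to be positive, and subtracting evicted values from a running sum can accumulate floating-point error over very long streams.

```go
package main

import "fmt"

// rollingMean tracks the mean of the last len(buf) pushed values.
type rollingMean struct {
	buf  []float64 // ring buffer holding the most recent values
	next int       // slot that the next value overwrites
	n    int       // number of values currently stored
	sum  float64   // sum of the stored values
}

// newRollingMean assumes window > 0.
func newRollingMean(window int) *rollingMean {
	return &rollingMean{buf: make([]float64, window)}
}

func (r *rollingMean) Push(x float64) {
	if r.n == len(r.buf) {
		r.sum -= r.buf[r.next] // the buffer is full: evict the oldest value
	} else {
		r.n++
	}
	r.buf[r.next] = x
	r.sum += x
	r.next = (r.next + 1) % len(r.buf)
}

// Value returns the current mean, or false if nothing has been pushed.
func (r *rollingMean) Value() (float64, bool) {
	if r.n == 0 {
		return 0, false
	}
	return r.sum / float64(r.n), true
}

func main() {
	r := newRollingMean(3)
	for _, x := range []float64{1, 2, 3, 4} {
		r.Push(x)
	}
	mean, _ := r.Value()
	fmt.Println(mean) // mean of the last three values: 2, 3, 4
}
```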

EWMA

EWMA keeps track of the global exponentially weighted moving average.
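
A minimal sketch of an exponentially weighted moving average follows. The update rule used here, value = decay*x + (1-decay)*value with the first observation as the seed, is an assumption: the text above does not state which convention the library follows, and the ewma type is a hypothetical name.

```go
package main

import "fmt"

// ewma tracks an exponentially weighted moving average.
type ewma struct {
	decay  float64 // weight given to the newest value, in (0, 1]
	value  float64
	seeded bool
}

func (e *ewma) Push(x float64) {
	if !e.seeded {
		// Seed with the first observation (one possible convention).
		e.value = x
		e.seeded = true
		return
	}
	e.value = e.decay*x + (1-e.decay)*e.value
}

// Value returns the current average, or false if nothing has been pushed.
func (e *ewma) Value() (float64, bool) {
	return e.value, e.seeded
}

func main() {
	e := &ewma{decay: 0.5}
	for _, x := range []float64{2, 4, 5} {
		e.Push(x)
	}
	avg, _ := e.Value()
	fmt.Println(avg)
}
```

Because each update needs only the previous average, the state is constant-size, which is presumably why only a global variant is offered.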

Moment

Moment keeps track of the k-th sample central moment; it can track either the global moment, or over a rolling window.

EWMMoment

EWMMoment keeps track of the global k-th exponentially weighted moving sample central moment. This uses the exponentially weighted moving average as its center of mass, and uses the same exponential weights for its power terms.

Std

Std keeps track of the sample standard deviation of a stream; it can track either the global standard deviation, or over a rolling window. To track the sample variance instead, you should use Moment, i.e.

variance := New(2, window)

EWMStd

EWMStd keeps track of the global exponentially weighted moving standard deviation. To track the exponentially weighted moving variance instead, you should use EWMMoment, i.e.

variance := NewEWMMoment(2, decay)

Skewness

Skewness keeps track of the sample skewness of a stream (in particular, the adjusted Fisher-Pearson standardized moment coefficient); it can track either the global skewness, or over a rolling window.
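
For reference, the adjusted Fisher-Pearson coefficient is G1 = sqrt(n(n-1))/(n-2) * m3 / m2^(3/2), where m2 and m3 are the biased (divide-by-n) central moments. The sketch below evaluates it in two passes over a slice, which makes the formula easy to check but is not how an online implementation would proceed; adjustedSkewness is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// adjustedSkewness returns the adjusted Fisher-Pearson standardized moment
// coefficient of xs. It needs at least 3 values and non-zero variance;
// otherwise it reports false.
func adjustedSkewness(xs []float64) (float64, bool) {
	if len(xs) < 3 {
		return 0, false
	}
	n := float64(len(xs))

	// First pass: the mean.
	var mean float64
	for _, x := range xs {
		mean += x
	}
	mean /= n

	// Second pass: biased second and third central moments.
	var m2, m3 float64
	for _, x := range xs {
		d := x - mean
		m2 += d * d
		m3 += d * d * d
	}
	m2 /= n
	m3 /= n
	if m2 == 0 {
		return 0, false
	}

	g1 := m3 / math.Pow(m2, 1.5)
	return math.Sqrt(n*(n-1)) / (n - 2) * g1, true
}

func main() {
	skew, _ := adjustedSkewness([]float64{1, 1, 1, 5})
	fmt.Printf("%.4f\n", skew)
}
```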

Kurtosis

Kurtosis keeps track of the sample kurtosis of a stream (in particular, the sample excess kurtosis); it can track either the global kurtosis, or over a rolling window.

Core (Univariate)

Core is the struct powering all of the statistics in the stream/moment subpackage; it keeps track of a pre-configured set of centralized k-th power sums of a stream in an efficient, numerically stable way; it can track either the global sums, or over a rolling window.
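
To illustrate what a numerically stable centralized power sum looks like, here is a sketch of Welford's algorithm for the k = 2 case (the sum of squared differences from the mean) over a global stream. It is offered as background on the general technique, not as a description of how Core is implemented; centralSums is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// centralSums maintains the running mean and the sum of squared
// differences from the mean without storing the stream.
type centralSums struct {
	n    int
	mean float64
	m2   float64 // sum of (x - mean)^2 over all values seen so far
}

// Push applies Welford's update, which avoids the cancellation that
// affects the naive sum(x^2) - n*mean^2 formula.
func (s *centralSums) Push(x float64) {
	s.n++
	delta := x - s.mean // deviation from the old mean
	s.mean += delta / float64(s.n)
	s.m2 += delta * (x - s.mean) // uses the updated mean
}

// Variance returns the sample variance, or false with fewer than 2 values.
func (s *centralSums) Variance() (float64, bool) {
	if s.n < 2 {
		return 0, false
	}
	return s.m2 / float64(s.n-1), true
}

func main() {
	s := &centralSums{}
	for _, x := range []float64{1, 2, 3, 4} {
		s.Push(x)
	}
	variance, _ := s.Variance()
	fmt.Println(s.mean, s.m2)
	fmt.Println(variance, math.Sqrt(variance))
}
```

Statistics such as the variance and standard deviation then follow from the tracked sums by a final division or square root, which is the relationship between Core and the metrics built on it.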

To configure which sums to track, you'll need to instantiate a CoreConfig struct and provide it to NewCore:

config := &moment.CoreConfig{
    Sums: moment.SumsConfig{
        2: true, // tracks the sum of squared differences
        3: true, // tracks the sum of cubed differences
    },
    Window: stream.IntPtr(0),     // tracks global sums
    Decay:  stream.FloatPtr(0.3), // tracks exponentially weighted sums with a decay factor of 0.3
}
core, err := moment.NewCore(config)
// handle err

See the godoc entry for more details on Core's methods.

Joint Distribution Statistics
Cov

Cov keeps track of the sample covariance of a stream; it can track either the global covariance, or over a rolling window.
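
The global case can be sketched with a single-pass co-moment update, the two-variable analogue of Welford's algorithm. This illustrates the general technique only and makes no claim about the library's internals; onlineCov is a hypothetical name.

```go
package main

import "fmt"

// onlineCov accumulates the sample covariance of paired observations in a
// single pass, without storing the observations.
type onlineCov struct {
	n        int
	meanX    float64
	meanY    float64
	comoment float64 // running sum of (x - meanX) * (y - meanY)
}

func (c *onlineCov) Push(x, y float64) {
	c.n++
	dx := x - c.meanX // deviation from the old mean of x
	c.meanX += dx / float64(c.n)
	c.meanY += (y - c.meanY) / float64(c.n)
	c.comoment += dx * (y - c.meanY) // uses the updated mean of y
}

// Value returns the sample covariance, or false with fewer than two pairs.
func (c *onlineCov) Value() (float64, bool) {
	if c.n < 2 {
		return 0, false
	}
	return c.comoment / float64(c.n-1), true
}

func main() {
	c := &onlineCov{}
	xs := []float64{1, 2, 3, 4}
	ys := []float64{2, 4, 6, 8}
	for i := range xs {
		c.Push(xs[i], ys[i])
	}
	cov, _ := c.Value()
	fmt.Println(cov)
}
```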

EWMCov

EWMCov keeps track of the global exponentially weighted sample covariance of a stream. This uses the exponentially weighted moving average as its center of mass, and uses the same exponential weights for its power terms.

Corr

Corr keeps track of the sample correlation of a stream (in particular, the sample Pearson correlation coefficient); it can track either the global correlation, or over a rolling window.

EWMCorr

EWMCorr keeps track of the global sample exponentially weighted correlation of a stream (in particular, the exponentially weighted sample Pearson correlation coefficient). This uses the exponentially weighted moving average as its center of mass, and uses the same exponential weights for its power terms.

Autocorr

Autocorr keeps track of the sample autocorrelation of a stream for a given lag; it can track either the global autocorrelation, or over a rolling window.

Autocov

Autocov keeps track of the sample autocovariance of a stream for a given lag; it can track either the global autocovariance, or over a rolling window.

Core (Multivariate)

Core is the struct powering all of the statistics in the stream/joint subpackage; it keeps track of a pre-configured set of joint centralized power sums of a stream in an efficient, numerically stable way; it can track either the global sums, or over a rolling window.

To configure which sums to track, you'll need to instantiate a CoreConfig struct and provide it to NewCore:

config := &joint.CoreConfig{
    Sums: joint.SumsConfig{
        {1, 1}, // tracks the joint sum of differences
        {2, 0}, // tracks the sum of squared differences of variable 1
    },
    Vars:   stream.IntPtr(2),     // declares that there are 2 variables to track (optional if Sums is set)
    Window: stream.IntPtr(0),     // tracks global sums
    Decay:  stream.FloatPtr(0.3), // tracks exponentially weighted sums with a decay factor of 0.3
}
core, err := joint.NewCore(config)
// handle err

See the godoc entry for more details on Core's methods.

Aggregate Statistics
SimpleAggregateMetric

SimpleAggregateMetric is a convenience wrapper that stores multiple univariate metrics and will push a value to all metrics simultaneously; instead of returning a single scalar, it returns a map of metrics to their corresponding values.
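
The fan-out behaviour described above can be sketched as follows. The simpleMetric interface is redeclared locally with the same method set as the SimpleMetric interface documented further down, so that the example is self-contained; the aggregate, sum and count types are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// simpleMetric mirrors the method set of the documented SimpleMetric
// interface; it is redeclared here so the sketch compiles on its own.
type simpleMetric interface {
	Push(float64) error
	String() string
	Clear()
	Value() (float64, error)
}

// sum and count are toy metrics used only for this illustration.
type sum struct{ total float64 }

func (s *sum) Push(x float64) error    { s.total += x; return nil }
func (s *sum) String() string          { return "sum" }
func (s *sum) Clear()                  { s.total = 0 }
func (s *sum) Value() (float64, error) { return s.total, nil }

type count struct{ n int }

func (c *count) Push(float64) error      { c.n++; return nil }
func (c *count) String() string          { return "count" }
func (c *count) Clear()                  { c.n = 0 }
func (c *count) Value() (float64, error) { return float64(c.n), nil }

// aggregate pushes every value to all wrapped metrics and reports their
// values keyed by each metric's String() representation.
type aggregate struct{ metrics []simpleMetric }

func (a *aggregate) Push(x float64) error {
	for _, m := range a.metrics {
		if err := m.Push(x); err != nil {
			return fmt.Errorf("pushing to %s: %w", m.String(), err)
		}
	}
	return nil
}

func (a *aggregate) Values() (map[string]interface{}, error) {
	out := make(map[string]interface{}, len(a.metrics))
	for _, m := range a.metrics {
		v, err := m.Value()
		if err != nil {
			return nil, fmt.Errorf("reading %s: %w", m.String(), err)
		}
		out[m.String()] = v
	}
	return out, nil
}

func main() {
	agg := &aggregate{metrics: []simpleMetric{&sum{}, &count{}}}
	for _, x := range []float64{1, 2, 3} {
		if err := agg.Push(x); err != nil {
			fmt.Println("push failed:", err)
			return
		}
	}
	vals, _ := agg.Values()
	fmt.Println(vals["sum"], vals["count"])
}
```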

SimpleJointAggregateMetric

SimpleJointAggregateMetric is a convenience wrapper that stores multiple multivariate metrics and will push a value to all metrics simultaneously; instead of returning a single scalar, it returns a map of metrics to their corresponding values.

Documentation

Overview

Package stream provides a library of data structures/algorithms for calculating online statistics from a stream of data.

Functions

func FloatPtr

func FloatPtr(v float64) *float64

FloatPtr returns a pointer to a float64 holding v.

func IntPtr

func IntPtr(v int) *int

IntPtr returns a pointer to an int.
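
One plausible motivation for these helpers, stated here as an assumption rather than something the documentation spells out, is that pointer-typed config fields can distinguish "not set" (nil) from an explicit zero, and Go does not allow taking the address of a literal directly. The sketch below redeclares a local intPtr with the same behaviour; the config struct and windowOrDefault are hypothetical.

```go
package main

import "fmt"

// intPtr mirrors the documented IntPtr helper so that this sketch is
// self-contained.
func intPtr(v int) *int { return &v }

// config is a hypothetical options struct. A nil Window means "not set",
// which is distinct from an explicit 0.
type config struct {
	Window *int
}

// windowOrDefault is a hypothetical helper that resolves the optional
// field, falling back to def when the field was left unset.
func windowOrDefault(c config, def int) int {
	if c.Window == nil {
		return def
	}
	return *c.Window
}

func main() {
	fmt.Println(windowOrDefault(config{}, 10))                  // unset: falls back to the default
	fmt.Println(windowOrDefault(config{Window: intPtr(0)}, 10)) // an explicit 0 is preserved
}
```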

Types

type AggregateMetric

type AggregateMetric interface {
	Push(float64) error
	Values() (map[string]interface{}, error)
	Clear()
}

AggregateMetric is the interface for a metric that tracks multiple univariate single-value metrics simultaneously. Values() returns a map of metrics to their corresponding values at that given time. The keys are the string representations of the metrics, obtained by calling each metric's String() method.

type JointAggregateMetric

type JointAggregateMetric interface {
	Push(...float64) error
	Values() (map[string]interface{}, error)
	Clear()
}

JointAggregateMetric is the interface for a metric that tracks multiple multivariate single-value metrics simultaneously. Values() returns a map of metrics to their corresponding values at that given time. The keys are the string representations of the metrics, obtained by calling each metric's String() method.

type JointMetric

type JointMetric interface {
	Push(...float64) error
	String() string
	Clear()
}

JointMetric is the interface for a metric that tracks joint statistics from a stream. There is no Value method for this interface, allowing implementations to roll custom value methods.

type Metric

type Metric interface {
	Push(float64) error
	String() string
	Clear()
}

Metric is the interface for a metric that consumes from a stream. Metric is the standard interface for most metrics; in particular for those that consume single numeric values at a time. There is no Value method for this interface, allowing implementations to roll custom value methods.

type SimpleJointMetric

type SimpleJointMetric interface {
	JointMetric
	Value() (float64, error)
}

SimpleJointMetric is the interface for a JointMetric that returns a singular value.

type SimpleMetric

type SimpleMetric interface {
	Metric
	Value() (float64, error)
}

SimpleMetric is the interface for a Metric that returns a singular value.

Directories

Path          Synopsis
aggregate     Package aggregate is a helper library for keeping track of multiple metrics at a time.
examples
joint         Package joint provides a library of data structures/algorithms for calculating online joint distribution statistics from a stream of data.
minmax        Package minmax provides a library of data structures/algorithms for calculating the online minimum or maximum from a stream of data.
moment        Package moment provides a library of data structures/algorithms for calculating online moment-based statistics from a stream of data.
quantile      Package quantile provides a library of data structures/algorithms for calculating online quantiles from a stream of data.
  heap        Package heap provides the implementation for heaps.
  order       Package order contains the interfaces for various implementations of order statistics-based data structures.
  ost         Package ost provides the interfaces for order statistic trees, which are binary trees with the ability to perform log(n) time searches for elements in the tree with a specified rank, as well as log(n) time retrieval for the rank of a specified element in the tree.
  ost/avl     Package avl provides the implementation for an AVL tree, which satisfies the ost package interfaces, as well as the order package interfaces.
  ost/rb      Package rb provides the implementation for a red black tree, which satisfies the ost package interfaces, as well as the order package interfaces.
  skiplist    Package skiplist provides the implementation for skiplists.
sample        Package sample provides a library of data structures/algorithms for sampling from a stream of data.
util
  math        Package math is a helper library for mathematical functions.
  test        Package test is a helper library for implementing tests for the stream package.
