stats

package module
Version: v0.5.0 (not the latest version of this module)
Published: Jan 16, 2019 License: MIT Imports: 5 Imported by: 1,093

README

Stats

A well-tested and comprehensive Golang statistics library package with no dependencies.

If you have any suggestions, problems, or bug reports, please create an issue and I'll do my best to accommodate you. In addition, simply starring the repo would show your support for the project and be very much appreciated!

Installation

go get github.com/montanaflynn/stats

Example Usage

All the functions can be seen in examples/main.go, but here's a little taste:

// start with some source data to use
data := []float64{1.0, 2.1, 3.2, 4.823, 4.1, 5.8}

// you could also use different types like this
// data := stats.LoadRawData([]int{1, 2, 3, 4, 5})
// data := stats.LoadRawData([]interface{}{1.1, "2", 3})
// etc...

median, _ := stats.Median(data)
fmt.Println(median) // 3.65

roundedMedian, _ := stats.Round(median, 0)
fmt.Println(roundedMedian) // 4

Documentation

The entire API documentation is available on GoDoc.org

You can view docs offline with the following commands:

# Command line
godoc ./
godoc ./ Median
godoc ./ Float64Data

# Local website
godoc -http=:4444
open http://localhost:4444/pkg/github.com/montanaflynn/stats/

The exported API is as follows:

var (
    EmptyInputErr = statsErr{"Input must not be empty."}
    NaNErr        = statsErr{"Not a number."}
    NegativeErr   = statsErr{"Must not contain negative values."}
    ZeroErr       = statsErr{"Must not contain zero values."}
    BoundsErr     = statsErr{"Input is outside of range."}
    SizeErr       = statsErr{"Must be the same length."}
    InfValue      = statsErr{"Value is infinite."}
    YCoordErr     = statsErr{"Y Value must be greater than zero."}
)

type Float64Data []float64

func LoadRawData(raw interface{}) (f Float64Data) {}
func AutoCorrelation(data Float64Data, lags int) (float64, error) {}
func ChebyshevDistance(dataPointX, dataPointY []float64) (distance float64, err error) {}
func Correlation(data1, data2 Float64Data) (float64, error) {}
func Covariance(data1, data2 Float64Data) (float64, error) {}
func CovariancePopulation(data1, data2 Float64Data) (float64, error) {}
func CumulativeSum(input Float64Data) ([]float64, error) {}
func EuclideanDistance(dataPointX, dataPointY []float64) (distance float64, err error) {}
func GeometricMean(input Float64Data) (float64, error) {}
func HarmonicMean(input Float64Data) (float64, error) {}
func InterQuartileRange(input Float64Data) (float64, error) {}
func ManhattanDistance(dataPointX, dataPointY []float64) (distance float64, err error) {}
func Max(input Float64Data) (max float64, err error) {}
func Mean(input Float64Data) (float64, error) {}
func Median(input Float64Data) (median float64, err error) {}
func MedianAbsoluteDeviation(input Float64Data) (mad float64, err error) {}
func MedianAbsoluteDeviationPopulation(input Float64Data) (mad float64, err error) {}
func Midhinge(input Float64Data) (float64, error) {}
func Min(input Float64Data) (min float64, err error) {}
func MinkowskiDistance(dataPointX, dataPointY []float64, lambda float64) (distance float64, err error) {}
func Mode(input Float64Data) (mode []float64, err error) {}
func Pearson(data1, data2 Float64Data) (float64, error) {}
func Percentile(input Float64Data, percent float64) (percentile float64, err error) {}
func PercentileNearestRank(input Float64Data, percent float64) (percentile float64, err error) {}
func PopulationVariance(input Float64Data) (pvar float64, err error) {}
func Round(input float64, places int) (rounded float64, err error) {}
func Sample(input Float64Data, takenum int, replacement bool) ([]float64, error) {}
func SampleVariance(input Float64Data) (svar float64, err error) {}
func Sigmoid(input Float64Data) ([]float64, error) {}
func SoftMax(input Float64Data) ([]float64, error) {}
func StandardDeviation(input Float64Data) (sdev float64, err error) {}
func StandardDeviationPopulation(input Float64Data) (sdev float64, err error) {}
func StandardDeviationSample(input Float64Data) (sdev float64, err error) {}
func StdDevP(input Float64Data) (sdev float64, err error) {}
func StdDevS(input Float64Data) (sdev float64, err error) {}
func Sum(input Float64Data) (sum float64, err error) {}
func Trimean(input Float64Data) (float64, error) {}
func VarP(input Float64Data) (sdev float64, err error) {}
func VarS(input Float64Data) (sdev float64, err error) {}
func Variance(input Float64Data) (sdev float64, err error) {}

type Coordinate struct {
    X, Y float64
}

type Series []Coordinate

func ExponentialRegression(s Series) (regressions Series, err error) {}
func LinearRegression(s Series) (regressions Series, err error) {}
func LogarithmicRegression(s Series) (regressions Series, err error) {}

type Outliers struct {
    Mild    Float64Data
    Extreme Float64Data
}

type Quartiles struct {
    Q1 float64
    Q2 float64
    Q3 float64
}

func Quartile(input Float64Data) (Quartiles, error) {}
func QuartileOutliers(input Float64Data) (Outliers, error) {}

Contributing

Pull requests are always welcome, no matter how big or small. I've included a Makefile that has a lot of helper targets for common actions such as linting, testing, code coverage reporting, and more.

  1. Fork the repo and clone your fork
  2. Create new branch (git checkout -b some-thing)
  3. Make the desired changes
  4. Ensure tests pass (go test -cover or make test)
  5. Commit changes (git commit -am 'Did something')
  6. Push branch (git push origin some-thing)
  7. Submit pull request

To make things as seamless as possible please also consider the following steps:

  • Update examples/main.go with a simple example of the new feature
  • Update README.md documentation section with any new exported API
  • Keep 100% code coverage (you can check with make coverage)
  • Squash commits into single units of work with git rebase -i new-feature

MIT License

Copyright (c) 2014-2019 Montana Flynn http://anonfunction.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Documentation

Variables

var (
	EmptyInputErr = statsErr{"Input must not be empty."}
	NaNErr        = statsErr{"Not a number."}
	NegativeErr   = statsErr{"Must not contain negative values."}
	ZeroErr       = statsErr{"Must not contain zero values."}
	BoundsErr     = statsErr{"Input is outside of range."}
	SizeErr       = statsErr{"Must be the same length."}
	InfValue      = statsErr{"Value is infinite."}
	YCoordErr     = statsErr{"Y Value must be greater than zero."}
)

These are the package-wide error values. All error identification should use these values.

var EmptyInput = EmptyInputErr

EmptyInput is the legacy name for EmptyInputErr, kept because the original name did not end with Err.

Functions

func AutoCorrelation

func AutoCorrelation(data Float64Data, lags int) (float64, error)

Autocorrelation is the correlation of a signal with a delayed copy of itself as a function of delay

Example
s1 := []float64{1, 2, 3, 4, 5}
a, _ := AutoCorrelation(s1, 1)
fmt.Println(a)
Output:

0.4
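
The definition above can be illustrated with a small standalone sketch. It uses the textbook lag-k estimator (sum of products of deviations k steps apart, divided by the sum of squared deviations) and is not taken from the package's source; the name autoCorr is hypothetical.

```go
package main

import "fmt"

// autoCorr returns the lag-k sample autocorrelation of xs using the
// textbook estimator. It returns 0 when the value is undefined
// (empty input, lag out of range, or constant data).
// Hypothetical helper, not part of the stats package.
func autoCorr(xs []float64, lag int) float64 {
	n := len(xs)
	if n == 0 || lag < 0 || lag >= n {
		return 0
	}
	mean := 0.0
	for _, x := range xs {
		mean += x
	}
	mean /= float64(n)

	var num, den float64
	for i, x := range xs {
		d := x - mean
		den += d * d
		if i+lag < n {
			// Pair each value with the value lag steps later.
			num += d * (xs[i+lag] - mean)
		}
	}
	if den == 0 {
		return 0
	}
	return num / den
}

func main() {
	fmt.Println(autoCorr([]float64{1, 2, 3, 4, 5}, 1))
}
```

For the series 1 through 5 at lag 1 the numerator is 4 and the denominator is 10, which agrees with the 0.4 shown in the example output.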

func ChebyshevDistance

func ChebyshevDistance(dataPointX, dataPointY []float64) (distance float64, err error)

ChebyshevDistance computes the Chebyshev distance between two data sets

Example
d1 := []float64{2, 3, 4, 5, 6, 7, 8}
d2 := []float64{8, 7, 6, 5, 4, 3, 2}
cd, _ := ChebyshevDistance(d1, d2)
fmt.Println(cd)
Output:

6

func Correlation

func Correlation(data1, data2 Float64Data) (float64, error)

Correlation describes the degree of relationship between two sets of data

Example
s1 := []float64{1, 2, 3, 4, 5}
s2 := []float64{1, 2, 3, 5, 6}
a, _ := Correlation(s1, s2)
fmt.Println(a)
Output:

0.9912407071619302
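
As a rough illustration of what a correlation coefficient measures, the sketch below computes the Pearson product-moment coefficient directly from deviations about the mean. It is a standalone approximation under that assumption, not the package's implementation, and the function name pearson is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// pearson returns the Pearson correlation coefficient of xs and ys.
// Hypothetical helper, not part of the stats package.
func pearson(xs, ys []float64) (float64, error) {
	n := len(xs)
	if n == 0 || n != len(ys) {
		return 0, errors.New("inputs must be non-empty and of equal length")
	}
	var meanX, meanY float64
	for i := range xs {
		meanX += xs[i]
		meanY += ys[i]
	}
	meanX /= float64(n)
	meanY /= float64(n)

	// Accumulate the co-deviation and the two squared deviations.
	var sxy, sxx, syy float64
	for i := range xs {
		dx, dy := xs[i]-meanX, ys[i]-meanY
		sxy += dx * dy
		sxx += dx * dx
		syy += dy * dy
	}
	if sxx == 0 || syy == 0 {
		return 0, errors.New("correlation is undefined for constant input")
	}
	return sxy / math.Sqrt(sxx*syy), nil
}

func main() {
	r, err := pearson([]float64{1, 2, 3, 4, 5}, []float64{1, 2, 3, 5, 6})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%.4f\n", r)
}
```

With the two series from the example above, this works out to 13 / sqrt(10 * 17.2), or roughly 0.9912.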

func Covariance

func Covariance(data1, data2 Float64Data) (float64, error)

Covariance is a measure of how much two sets of data change together

func CovariancePopulation

func CovariancePopulation(data1, data2 Float64Data) (float64, error)

CovariancePopulation computes covariance for entire population between two variables.

func CumulativeSum

func CumulativeSum(input Float64Data) ([]float64, error)

CumulativeSum calculates the cumulative sum of the input slice
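
A cumulative sum is a running total: element i of the result is the sum of the first i+1 inputs. A minimal standalone sketch follows (the helper name cumulativeSum is hypothetical and is not the package function):

```go
package main

import "fmt"

// cumulativeSum returns a new slice whose i-th element is the sum of
// xs[0] through xs[i]. The input slice is not modified.
// Hypothetical helper, not part of the stats package.
func cumulativeSum(xs []float64) []float64 {
	out := make([]float64, len(xs))
	running := 0.0
	for i, x := range xs {
		running += x
		out[i] = running
	}
	return out
}

func main() {
	fmt.Println(cumulativeSum([]float64{1, 2, 3, 4})) // [1 3 6 10]
}
```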

func EuclideanDistance

func EuclideanDistance(dataPointX, dataPointY []float64) (distance float64, err error)

EuclideanDistance computes the Euclidean distance between two data sets

func GeometricMean

func GeometricMean(input Float64Data) (float64, error)

GeometricMean gets the geometric mean for a slice of numbers

func HarmonicMean

func HarmonicMean(input Float64Data) (float64, error)

HarmonicMean gets the harmonic mean for a slice of numbers
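
The geometric and harmonic means are less familiar than the arithmetic mean, so a short sketch may help. It assumes strictly positive input (the package's error list includes NegativeErr and ZeroErr, which suggests similar restrictions, but that mapping is an assumption). The helper names are hypothetical and the code is not the package's implementation.

```go
package main

import (
	"fmt"
	"math"
)

// geometricMean returns the n-th root of the product of xs, computed
// as exp(mean(log x)) to avoid overflow in the product.
// Assumes every value is strictly positive; returns NaN for empty input.
func geometricMean(xs []float64) float64 {
	if len(xs) == 0 {
		return math.NaN()
	}
	sumLog := 0.0
	for _, x := range xs {
		sumLog += math.Log(x)
	}
	return math.Exp(sumLog / float64(len(xs)))
}

// harmonicMean returns n divided by the sum of reciprocals of xs.
// Assumes every value is strictly positive; returns NaN for empty input.
func harmonicMean(xs []float64) float64 {
	if len(xs) == 0 {
		return math.NaN()
	}
	sumInv := 0.0
	for _, x := range xs {
		sumInv += 1 / x
	}
	return float64(len(xs)) / sumInv
}

func main() {
	fmt.Printf("%.4f\n", geometricMean([]float64{1, 4, 16})) // about 4
	fmt.Printf("%.4f\n", harmonicMean([]float64{1, 2, 4}))   // about 1.7143
}
```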

func InterQuartileRange

func InterQuartileRange(input Float64Data) (float64, error)

InterQuartileRange finds the range between Q1 and Q3

func ManhattanDistance

func ManhattanDistance(dataPointX, dataPointY []float64) (distance float64, err error)

ManhattanDistance computes the Manhattan distance between two data sets

func Max

func Max(input Float64Data) (max float64, err error)

Max finds the highest number in a slice

Example
d := []float64{1.1, 2.3, 3.2, 4.0, 4.01, 5.09}
a, _ := Max(d)
fmt.Println(a)
Output:

5.09

func Mean

func Mean(input Float64Data) (float64, error)

Mean gets the average of a slice of numbers

func Median

func Median(input Float64Data) (median float64, err error)

Median gets the median number in a slice of numbers
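
For readers who want to see the calculation behind the README's 3.65 result, here is a minimal standalone sketch of a median: sort a copy, then take the middle value or the average of the two middle values. The name median is hypothetical and this is not the package's source.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the middle value of xs, or the average of the two
// middle values when len(xs) is even. It sorts a copy, so the caller's
// slice keeps its order. Returns NaN for empty input.
// Hypothetical helper, not part of the stats package.
func median(xs []float64) float64 {
	n := len(xs)
	if n == 0 {
		return math.NaN()
	}
	sorted := append([]float64(nil), xs...)
	sort.Float64s(sorted)
	mid := n / 2
	if n%2 == 1 {
		return sorted[mid]
	}
	return (sorted[mid-1] + sorted[mid]) / 2
}

func main() {
	data := []float64{1.0, 2.1, 3.2, 4.823, 4.1, 5.8}
	fmt.Printf("%.2f\n", median(data)) // 3.65
}
```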

func MedianAbsoluteDeviation

func MedianAbsoluteDeviation(input Float64Data) (mad float64, err error)

MedianAbsoluteDeviation finds the median of the absolute deviations from the dataset median

func MedianAbsoluteDeviationPopulation

func MedianAbsoluteDeviationPopulation(input Float64Data) (mad float64, err error)

MedianAbsoluteDeviationPopulation finds the median of the absolute deviations from the population median

func Midhinge

func Midhinge(input Float64Data) (float64, error)

Midhinge finds the average of the first and third quartiles

func Min

func Min(input Float64Data) (min float64, err error)

Min finds the lowest number in a set of data

Example
d := LoadRawData([]interface{}{1.1, "2", 3.0, 4, "5"})
a, _ := Min(d)
fmt.Println(a)
Output:

1.1

func MinkowskiDistance

func MinkowskiDistance(dataPointX, dataPointY []float64, lambda float64) (distance float64, err error)

MinkowskiDistance computes the Minkowski distance between two data sets

Arguments:

dataPointX: First set of data points
dataPointY: Second set of data points. Length of both data
            sets must be equal.
lambda:     Order of the distance, also known as p. With
            lambda = 1 the returned distance is the Manhattan
            (city block) distance; with lambda = 2 it is the
            Euclidean distance. As lambda approaches infinity,
            the distance approaches the Chebyshev distance.

Return:

Distance or error
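
The relationship between lambda and the other distance functions can be checked with a small standalone sketch. It implements the usual Minkowski formula, (sum of |x_i - y_i|^p)^(1/p), and a separate Chebyshev helper for comparison. Both function names are hypothetical, and the code is an illustration rather than the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

var errLength = errors.New("inputs must be non-empty and of equal length")

// minkowski returns the order-p Minkowski distance between xs and ys.
// Hypothetical helper, not part of the stats package.
func minkowski(xs, ys []float64, p float64) (float64, error) {
	if len(xs) == 0 || len(xs) != len(ys) {
		return 0, errLength
	}
	if p < 1 {
		return 0, errors.New("p must be at least 1")
	}
	sum := 0.0
	for i := range xs {
		sum += math.Pow(math.Abs(xs[i]-ys[i]), p)
	}
	return math.Pow(sum, 1/p), nil
}

// chebyshev returns the largest absolute coordinate difference,
// which is the limit of minkowski as p grows.
func chebyshev(xs, ys []float64) (float64, error) {
	if len(xs) == 0 || len(xs) != len(ys) {
		return 0, errLength
	}
	max := 0.0
	for i := range xs {
		max = math.Max(max, math.Abs(xs[i]-ys[i]))
	}
	return max, nil
}

func main() {
	x := []float64{2, 3, 4, 5, 6, 7, 8}
	y := []float64{8, 7, 6, 5, 4, 3, 2}
	for _, p := range []float64{1, 2, 50} {
		d, _ := minkowski(x, y, p)
		fmt.Printf("p=%v distance=%.3f\n", p, d)
	}
	c, _ := chebyshev(x, y)
	fmt.Printf("chebyshev=%.3f\n", c)
}
```

With these inputs, p = 1 gives the Manhattan distance of 24, p = 2 gives the Euclidean distance sqrt(112), and p = 50 is already close to the Chebyshev distance of 6.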

func Mode

func Mode(input Float64Data) (mode []float64, err error)

Mode gets the mode [most frequent value(s)] of a slice of float64s

func Pearson

func Pearson(data1, data2 Float64Data) (float64, error)

Pearson calculates the Pearson product-moment correlation coefficient between two variables

func Percentile

func Percentile(input Float64Data, percent float64) (percentile float64, err error)

Percentile finds the relative standing in a slice of floats

func PercentileNearestRank

func PercentileNearestRank(input Float64Data, percent float64) (percentile float64, err error)

PercentileNearestRank finds the relative standing in a slice of floats using the Nearest Rank method
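
The nearest-rank method picks an actual element of the sorted data instead of interpolating: the rank is ceil(P/100 * N) and the result is the value at that 1-based position. A minimal sketch under that textbook definition follows; the helper name nearestRank is hypothetical and edge-case handling in the package may differ.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// nearestRank returns the p-th percentile (0 < p <= 100) of xs using
// the nearest-rank method. Hypothetical helper, not the package API.
func nearestRank(xs []float64, p float64) (float64, error) {
	n := len(xs)
	if n == 0 {
		return 0, errors.New("input must not be empty")
	}
	if p <= 0 || p > 100 {
		return 0, errors.New("percent must be in (0, 100]")
	}
	sorted := append([]float64(nil), xs...)
	sort.Float64s(sorted)
	// Multiply before dividing so that exact cases such as
	// 40% of 5 items stay exactly 2 rather than slightly above it.
	rank := int(math.Ceil(p * float64(n) / 100))
	return sorted[rank-1], nil
}

func main() {
	data := []float64{15, 20, 35, 40, 50}
	for _, p := range []float64{30, 40, 50, 100} {
		v, _ := nearestRank(data, p)
		fmt.Printf("P%v = %v\n", p, v)
	}
}
```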

func PopulationVariance

func PopulationVariance(input Float64Data) (pvar float64, err error)

PopulationVariance finds the amount of variance within a population

func Round

func Round(input float64, places int) (rounded float64, err error)

Round rounds a float to a specific decimal place or precision

func Sample

func Sample(input Float64Data, takenum int, replacement bool) ([]float64, error)

Sample returns a sample from the input, with or without replacement

func SampleVariance

func SampleVariance(input Float64Data) (svar float64, err error)

SampleVariance finds the amount of variance within a sample
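
The population and sample variants differ only in the divisor: the population variance divides the sum of squared deviations by n, while the sample variance divides by n - 1 (Bessel's correction). The sketch below shows the two side by side; the function name variance and its boolean flag are hypothetical and not the package API.

```go
package main

import (
	"fmt"
	"math"
)

// variance returns the variance of xs. When sample is true it divides
// by n-1, otherwise by n. It returns NaN when the result is undefined.
// Hypothetical helper, not part of the stats package.
func variance(xs []float64, sample bool) float64 {
	n := len(xs)
	if n == 0 || (sample && n < 2) {
		return math.NaN()
	}
	mean := 0.0
	for _, x := range xs {
		mean += x
	}
	mean /= float64(n)

	ss := 0.0 // sum of squared deviations from the mean
	for _, x := range xs {
		d := x - mean
		ss += d * d
	}
	if sample {
		return ss / float64(n-1)
	}
	return ss / float64(n)
}

func main() {
	data := []float64{1, 2, 3, 4, 5}
	fmt.Println(variance(data, false)) // 2
	fmt.Println(variance(data, true))  // 2.5
}
```

Taking the square root of either value gives the corresponding population or sample standard deviation.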

func Sigmoid added in v0.5.0

func Sigmoid(input Float64Data) ([]float64, error)

Sigmoid returns the input values mapped into the range of 0 to 1 along the sigmoid or s-shaped curve, commonly used as an activation function when training neural networks in machine learning.

Example
s, _ := Sigmoid([]float64{3.0, 1.0, 0.2})
fmt.Println(s)
Output:

[0.9525741268224334 0.7310585786300049 0.549833997312478]
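
The example output is consistent with the logistic function 1 / (1 + e^(-x)), which maps any real input into the open interval (0, 1). A standalone sketch of that formula follows; the name sigmoid is hypothetical and the code is not copied from the package.

```go
package main

import (
	"fmt"
	"math"
)

// sigmoid applies the logistic function 1/(1+exp(-x)) to each element
// of xs and returns the results in a new slice.
// Hypothetical helper, not part of the stats package.
func sigmoid(xs []float64) []float64 {
	out := make([]float64, len(xs))
	for i, x := range xs {
		out[i] = 1 / (1 + math.Exp(-x))
	}
	return out
}

func main() {
	for _, v := range sigmoid([]float64{3.0, 1.0, 0.2}) {
		fmt.Printf("%.4f\n", v)
	}
}
```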

func SoftMax added in v0.5.0

func SoftMax(input Float64Data) ([]float64, error)

SoftMax returns the input values in the range of 0 to 1 with sum of all the probabilities being equal to one. It is commonly used in machine learning neural networks.

Example
sm, _ := SoftMax([]float64{3.0, 1.0, 0.2})
fmt.Println(sm)
Output:

[0.8360188027814407 0.11314284146556013 0.05083835575299916]
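
Softmax exponentiates each input and divides by the sum of the exponentials, so the outputs are positive and add up to one. The sketch below subtracts the maximum input before exponentiating, a common numerical-stability choice made here for illustration; whether the package does the same is not stated in this documentation. The name softmax is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// softmax returns exp(x_i) / sum_j exp(x_j) for each element of xs.
// Subtracting the maximum first leaves the result unchanged
// mathematically but keeps exp from overflowing on large inputs.
// Hypothetical helper, not part of the stats package.
func softmax(xs []float64) []float64 {
	out := make([]float64, len(xs))
	if len(xs) == 0 {
		return out
	}
	max := xs[0]
	for _, x := range xs[1:] {
		max = math.Max(max, x)
	}
	sum := 0.0
	for i, x := range xs {
		out[i] = math.Exp(x - max)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

func main() {
	for _, v := range softmax([]float64{3.0, 1.0, 0.2}) {
		fmt.Printf("%.4f\n", v)
	}
}
```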

func StandardDeviation

func StandardDeviation(input Float64Data) (sdev float64, err error)

StandardDeviation finds the amount of variation in the dataset

func StandardDeviationPopulation

func StandardDeviationPopulation(input Float64Data) (sdev float64, err error)

StandardDeviationPopulation finds the amount of variation from the population

func StandardDeviationSample

func StandardDeviationSample(input Float64Data) (sdev float64, err error)

StandardDeviationSample finds the amount of variation from a sample

func StdDevP

func StdDevP(input Float64Data) (sdev float64, err error)

StdDevP is a shortcut to StandardDeviationPopulation

func StdDevS

func StdDevS(input Float64Data) (sdev float64, err error)

StdDevS is a shortcut to StandardDeviationSample

func Sum

func Sum(input Float64Data) (sum float64, err error)

Sum adds all the numbers of a slice together

Example
d := []float64{1.1, 2.2, 3.3}
a, _ := Sum(d)
fmt.Println(a)
Output:

6.6

func Trimean

func Trimean(input Float64Data) (float64, error)

Trimean finds the average of the median and the midhinge
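
Since the midhinge is the average of Q1 and Q3, the trimean described above is equivalent to (Q1 + 2*Q2 + Q3) / 4. The sketch below works from already-computed quartiles to show the arithmetic; the type and function names are hypothetical, and how the quartiles themselves are derived is left out because quartile conventions vary.

```go
package main

import "fmt"

// quartiles mirrors the shape of the documented Quartiles struct,
// but is a local, hypothetical type for this sketch.
type quartiles struct {
	Q1, Q2, Q3 float64
}

// midhinge is the average of the first and third quartiles.
func midhinge(q quartiles) float64 {
	return (q.Q1 + q.Q3) / 2
}

// trimean is the average of the median (Q2) and the midhinge,
// which equals (Q1 + 2*Q2 + Q3) / 4.
func trimean(q quartiles) float64 {
	return (q.Q2 + midhinge(q)) / 2
}

func main() {
	q := quartiles{Q1: 2, Q2: 4, Q3: 8}
	fmt.Println(midhinge(q)) // 5
	fmt.Println(trimean(q))  // 4.5
}
```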

func VarP

func VarP(input Float64Data) (sdev float64, err error)

VarP is a shortcut to PopulationVariance

func VarS

func VarS(input Float64Data) (sdev float64, err error)

VarS is a shortcut to SampleVariance

func Variance

func Variance(input Float64Data) (sdev float64, err error)

Variance finds the amount of variation in the dataset

Types

type Coordinate

type Coordinate struct {
	X, Y float64
}

Coordinate holds the data in a series

func ExpReg

func ExpReg(s []Coordinate) (regressions []Coordinate, err error)

ExpReg is a shortcut to ExponentialRegression

func LinReg

func LinReg(s []Coordinate) (regressions []Coordinate, err error)

LinReg is a shortcut to LinearRegression

func LogReg

func LogReg(s []Coordinate) (regressions []Coordinate, err error)

LogReg is a shortcut to LogarithmicRegression

type Float64Data

type Float64Data []float64

Float64Data is a named type for []float64 with helper methods

func LoadRawData

func LoadRawData(raw interface{}) (f Float64Data)

LoadRawData parses and converts a slice of mixed data types to floats

func (Float64Data) AutoCorrelation

func (f Float64Data) AutoCorrelation(lags int) (float64, error)

Autocorrelation is the correlation of a signal with a delayed copy of itself as a function of delay

func (Float64Data) Correlation

func (f Float64Data) Correlation(d Float64Data) (float64, error)

Correlation describes the degree of relationship between two sets of data

func (Float64Data) Covariance

func (f Float64Data) Covariance(d Float64Data) (float64, error)

Covariance is a measure of how much two sets of data change together

func (Float64Data) CovariancePopulation

func (f Float64Data) CovariancePopulation(d Float64Data) (float64, error)

CovariancePopulation computes covariance for entire population between two variables.

func (Float64Data) CumulativeSum

func (f Float64Data) CumulativeSum() ([]float64, error)

CumulativeSum returns the cumulative sum of the data

func (Float64Data) GeometricMean

func (f Float64Data) GeometricMean() (float64, error)

GeometricMean returns the geometric mean of the data

func (Float64Data) Get

func (f Float64Data) Get(i int) float64

Get returns the item at index i in the slice

func (Float64Data) HarmonicMean

func (f Float64Data) HarmonicMean() (float64, error)

HarmonicMean returns the harmonic mean of the data

func (Float64Data) InterQuartileRange

func (f Float64Data) InterQuartileRange() (float64, error)

InterQuartileRange finds the range between Q1 and Q3

func (Float64Data) Len

func (f Float64Data) Len() int

Len returns length of slice

func (Float64Data) Less

func (f Float64Data) Less(i, j int) bool

Less reports whether the number at index i is less than the number at index j

func (Float64Data) Max

func (f Float64Data) Max() (float64, error)

Max returns the maximum number in the data

func (Float64Data) Mean

func (f Float64Data) Mean() (float64, error)

Mean returns the mean of the data

func (Float64Data) Median

func (f Float64Data) Median() (float64, error)

Median returns the median of the data

func (Float64Data) MedianAbsoluteDeviation

func (f Float64Data) MedianAbsoluteDeviation() (float64, error)

MedianAbsoluteDeviation finds the median of the absolute deviations from the dataset median

func (Float64Data) MedianAbsoluteDeviationPopulation

func (f Float64Data) MedianAbsoluteDeviationPopulation() (float64, error)

MedianAbsoluteDeviationPopulation finds the median of the absolute deviations from the population median

func (Float64Data) Midhinge

func (f Float64Data) Midhinge(d Float64Data) (float64, error)

Midhinge finds the average of the first and third quartiles

func (Float64Data) Min

func (f Float64Data) Min() (float64, error)

Min returns the minimum number in the data

func (Float64Data) Mode

func (f Float64Data) Mode() ([]float64, error)

Mode returns the mode of the data

func (Float64Data) Pearson

func (f Float64Data) Pearson(d Float64Data) (float64, error)

Pearson calculates the Pearson product-moment correlation coefficient between two variables.

func (Float64Data) Percentile

func (f Float64Data) Percentile(p float64) (float64, error)

Percentile finds the relative standing in a slice of floats

func (Float64Data) PercentileNearestRank

func (f Float64Data) PercentileNearestRank(p float64) (float64, error)

PercentileNearestRank finds the relative standing using the Nearest Rank method

func (Float64Data) PopulationVariance

func (f Float64Data) PopulationVariance() (float64, error)

PopulationVariance finds the amount of variance within a population

func (Float64Data) Quartile

func (f Float64Data) Quartile(d Float64Data) (Quartiles, error)

Quartile returns the three quartile points from a slice of data

func (Float64Data) QuartileOutliers

func (f Float64Data) QuartileOutliers() (Outliers, error)

QuartileOutliers finds the mild and extreme outliers

func (Float64Data) Sample

func (f Float64Data) Sample(n int, r bool) ([]float64, error)

Sample returns a sample from the input, with or without replacement

func (Float64Data) SampleVariance

func (f Float64Data) SampleVariance() (float64, error)

SampleVariance finds the amount of variance within a sample

func (Float64Data) StandardDeviation

func (f Float64Data) StandardDeviation() (float64, error)

StandardDeviation finds the amount of variation in the dataset

func (Float64Data) StandardDeviationPopulation

func (f Float64Data) StandardDeviationPopulation() (float64, error)

StandardDeviationPopulation finds the amount of variation from the population

func (Float64Data) StandardDeviationSample

func (f Float64Data) StandardDeviationSample() (float64, error)

StandardDeviationSample finds the amount of variation from a sample

func (Float64Data) Sum

func (f Float64Data) Sum() (float64, error)

Sum returns the total of all the numbers in the data

func (Float64Data) Swap

func (f Float64Data) Swap(i, j int)

Swap exchanges the numbers at indexes i and j in the slice
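
Len, Less, and Swap are exactly the three methods required by the standard library's sort.Interface, which suggests a Float64Data value can be handed to sort.Sort directly. The sketch below demonstrates the pattern with a local stand-in type, since the method bodies of Float64Data are not shown here; floats and sortedCopy are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// floats is a local stand-in for a named []float64 type that
// implements sort.Interface through Len, Less, and Swap.
type floats []float64

func (f floats) Len() int           { return len(f) }
func (f floats) Less(i, j int) bool { return f[i] < f[j] }
func (f floats) Swap(i, j int)      { f[i], f[j] = f[j], f[i] }

// sortedCopy returns an ascending copy of xs, leaving xs untouched.
func sortedCopy(xs []float64) []float64 {
	c := make(floats, len(xs))
	copy(c, xs)
	sort.Sort(c) // accepted because floats satisfies sort.Interface
	return c
}

func main() {
	fmt.Println(sortedCopy([]float64{3, 1, 2})) // [1 2 3]
}
```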

func (Float64Data) Trimean

func (f Float64Data) Trimean(d Float64Data) (float64, error)

Trimean finds the average of the median and the midhinge

func (Float64Data) Variance

func (f Float64Data) Variance() (float64, error)

Variance finds the amount of variation in the dataset

type Outliers

type Outliers struct {
	Mild    Float64Data
	Extreme Float64Data
}

Outliers holds mild and extreme outliers found in data

func QuartileOutliers

func QuartileOutliers(input Float64Data) (Outliers, error)

QuartileOutliers finds the mild and extreme outliers

type Quartiles

type Quartiles struct {
	Q1 float64
	Q2 float64
	Q3 float64
}

Quartiles holds the three quartile points

func Quartile

func Quartile(input Float64Data) (Quartiles, error)

Quartile returns the three quartile points from a slice of data

type Series

type Series []Coordinate

Series is a container for a series of data

func ExponentialRegression

func ExponentialRegression(s Series) (regressions Series, err error)

ExponentialRegression returns an exponential regression on data series

func LinearRegression

func LinearRegression(s Series) (regressions Series, err error)

LinearRegression finds the least squares linear regression on data series

Example
data := []Coordinate{
	{1, 2.3},
	{2, 3.3},
	{3, 3.7},
}

r, _ := LinearRegression(data)
fmt.Println(r)
Output:

[{1 2.400000000000001} {2 3.1} {3 3.7999999999999994}]
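
The output above lists the fitted Y value for each input X. To show where those values come from, the sketch below computes an ordinary least-squares line by hand: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The point type and linearFit function are hypothetical names, and this is an illustration rather than the package's code.

```go
package main

import (
	"errors"
	"fmt"
)

// point is a local stand-in for the documented Coordinate struct.
type point struct {
	X, Y float64
}

// linearFit returns the slope and intercept of the least-squares line
// through pts. Hypothetical helper, not part of the stats package.
func linearFit(pts []point) (slope, intercept float64, err error) {
	if len(pts) < 2 {
		return 0, 0, errors.New("need at least two points")
	}
	n := float64(len(pts))
	var meanX, meanY float64
	for _, p := range pts {
		meanX += p.X
		meanY += p.Y
	}
	meanX /= n
	meanY /= n

	var sxy, sxx float64
	for _, p := range pts {
		dx := p.X - meanX
		sxy += dx * (p.Y - meanY)
		sxx += dx * dx
	}
	if sxx == 0 {
		return 0, 0, errors.New("all X values are equal")
	}
	slope = sxy / sxx
	intercept = meanY - slope*meanX
	return slope, intercept, nil
}

func main() {
	pts := []point{{1, 2.3}, {2, 3.3}, {3, 3.7}}
	slope, intercept, err := linearFit(pts)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range pts {
		fmt.Printf("x=%.0f fitted y=%.2f\n", p.X, slope*p.X+intercept)
	}
}
```

For the three points in the example, the slope is 0.7 and the intercept is 1.7, giving fitted values of about 2.4, 3.1, and 3.8.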

func LogarithmicRegression

func LogarithmicRegression(s Series) (regressions Series, err error)

LogarithmicRegression returns a logarithmic regression on data series
