regression

v0.0.0-...-099cc9f
Published: Oct 27, 2022 License: MIT Imports: 4 Imported by: 0

README

regression

Multivariable Linear Regression in Go (golang)

This is a fork of github.com/sajari/regression.

Differences are:

  • no output code: if you want to dump anything to stdout, it's your job,
  • better performance & lower memory usage,
  • ability to retrain with new datapoints.

installation

$ go get github.com/cocoonspace/regression

Supports Go 1.18+

example usage

Import the package, create a regression and add data to it. You can use as many variables as you like; the example below uses 3 variables for each observation.

package main

import (
	"fmt"

	"github.com/cocoonspace/regression"
)

func main() {
	r := &regression.Regression{}
	r.Train(
		regression.DataPoint{Observed:11.2, Variables:[]float64{587000, 16.5, 6.2}},
		regression.DataPoint{Observed:13.4, Variables:[]float64{643000, 20.5, 6.4}},
		regression.DataPoint{Observed:40.7, Variables:[]float64{635000, 26.3, 9.3}},
		regression.DataPoint{Observed:5.3, Variables:[]float64{692000, 16.5, 5.3}},
		regression.DataPoint{Observed:24.8, Variables:[]float64{1248000, 19.2, 7.3}},
		regression.DataPoint{Observed:12.7, Variables:[]float64{643000, 16.5, 5.9}},
		regression.DataPoint{Observed:20.9, Variables:[]float64{1964000, 20.2, 6.4}},
		regression.DataPoint{Observed:35.7, Variables:[]float64{1531000, 21.3, 7.6}},
		regression.DataPoint{Observed:8.7, Variables:[]float64{713000, 17.2, 4.9}},
		regression.DataPoint{Observed:9.6, Variables:[]float64{749000, 14.3, 6.4}},
		regression.DataPoint{Observed:14.5, Variables:[]float64{7895000, 18.1, 6}},
		regression.DataPoint{Observed:26.9, Variables:[]float64{762000, 23.1, 7.4}},
		regression.DataPoint{Observed:15.7, Variables:[]float64{2793000, 19.1, 5.8}},
		regression.DataPoint{Observed:36.2, Variables:[]float64{741000, 24.7, 8.6}},
		regression.DataPoint{Observed:18.1, Variables:[]float64{625000, 18.6, 6.5}},
		regression.DataPoint{Observed:28.9, Variables:[]float64{854000, 24.9, 8.3}},
		regression.DataPoint{Observed:14.9, Variables:[]float64{716000, 17.9, 6.7}},
		regression.DataPoint{Observed:25.8, Variables:[]float64{921000, 22.4, 8.6}},
		regression.DataPoint{Observed:21.7, Variables:[]float64{595000, 20.2, 8.4}},
		regression.DataPoint{Observed:25.7, Variables:[]float64{3353000, 16.9, 6.7}},
	)
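	// Run returns an error if the model cannot be trained.
	if err := r.Run(); err != nil {
		fmt.Println("regression failed:", err)
		return
	}
	fmt.Printf("R2: %v\n", r.R2)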
}

Note: You can also add data points one by one.

Once the regression has run, you can inspect R2, the observed and predicted variances, and the per-point residuals. You can also access the coefficients directly to use elsewhere, e.g.

// Get the coefficient for the "Inhabitants" variable 0:
c := r.Coeff(0)

You can also use the model to predict new data points:

prediction, err := r.Predict([]float64{587000, 16.5, 6.2})
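
To make the prediction step concrete, here is a minimal, library-independent sketch of how a fitted linear model is evaluated. It assumes the coefficient layout documented for GetCoeffs (index 0 is the offset, followed by one coefficient per variable); the helper name predict and the sample coefficients are hypothetical and not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// predict evaluates a fitted linear model at vars.
// Assumption: coeffs[0] is the offset and coeffs[i+1] multiplies vars[i].
func predict(coeffs, vars []float64) (float64, error) {
	if len(coeffs) != len(vars)+1 {
		return 0, errors.New("coefficient and variable counts do not match")
	}
	p := coeffs[0]
	for i, v := range vars {
		p += coeffs[i+1] * v
	}
	return p, nil
}

func main() {
	// Hypothetical coefficients: offset 1, then one weight per variable.
	coeffs := []float64{1, 2, 3}
	p, err := predict(coeffs, []float64{10, 100})
	if err != nil {
		fmt.Println("predict failed:", err)
		return
	}
	fmt.Println(p) // 1 + 2*10 + 3*100 = 321
}
```

The length check mirrors why Predict returns an error alongside the value: a prediction only makes sense when the input has the shape the model was trained on.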

Feature crosses are supported, so your model can capture fixed non-linear relationships:

r.Train(
	regression.DataPoint{Observed: 11.2, Variables: []float64{587000, 16.5, 6.2}},
)
// Add a new feature which is the first variable (index 0) to the power of 2.
r.AddCross(regression.PowCross(0, 2))
r.Run()

Documentation

Constants

This section is empty.

Variables

var (
	// ErrNotEnoughData signals that there weren't enough data points to train the model.
	ErrNotEnoughData = errors.New("not enough data points")
	// ErrTooManyVars signals that there are too many variables for the number of observations being made.
	ErrTooManyVars = errors.New("not enough observations to support this many variables")
	// ErrRegressionRun signals that the Run method has not been run yet.
	ErrRegressionRun = errors.New("regression has not run yet")
)
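
Sentinel errors like these are usually matched with errors.Is, which keeps working if a caller wraps them. The sketch below uses local stand-ins for the package's error values so that it runs without the module; the checkData function and its thresholds are hypothetical and may not match the checks Run performs.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins mirroring the package's sentinel errors.
var (
	errNotEnoughData = errors.New("not enough data points")
	errTooManyVars   = errors.New("not enough observations to support this many variables")
)

// checkData is a hypothetical pre-flight check. The thresholds are
// assumptions made for illustration only.
func checkData(observations, variables int) error {
	if observations == 0 {
		return errNotEnoughData
	}
	if observations < variables+1 {
		return errTooManyVars
	}
	return nil
}

func main() {
	// Wrap the sentinel to show that errors.Is still matches it.
	err := fmt.Errorf("training failed: %w", checkData(2, 3))
	switch {
	case errors.Is(err, errNotEnoughData):
		fmt.Println("add more data points")
	case errors.Is(err, errTooManyVars):
		fmt.Println("add observations or drop variables")
	}
}
```

With the real package, the same switch would compare against regression.ErrNotEnoughData and regression.ErrTooManyVars.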

Functions

func MultiplierCross

func MultiplierCross(vars ...int) featureCross

Feature cross based on the multiplication of multiple inputs.

func PowCross

func PowCross(i int, power float64) featureCross

Feature cross based on computing the power of an input.
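
As a rough illustration of what the two crosses compute, the sketch below derives the extra features by hand for a single row of variables. The helpers powCross and multiplierCross are hypothetical, standalone functions; they are not the package's featureCross implementation, and how the package stores the results (presumably in DataPoint.Crosses) is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// powCross returns vars[i] raised to power.
func powCross(vars []float64, i int, power float64) float64 {
	return math.Pow(vars[i], power)
}

// multiplierCross returns the product of the variables at the given indexes.
func multiplierCross(vars []float64, idx ...int) float64 {
	product := 1.0
	for _, i := range idx {
		product *= vars[i]
	}
	return product
}

func main() {
	vars := []float64{3, 4, 5}
	// Append the derived features after the original variables.
	extended := append([]float64{}, vars...)
	extended = append(extended, powCross(vars, 0, 2), multiplierCross(vars, 1, 2))
	fmt.Println(extended) // [3 4 5 9 20]
}
```

The model stays linear in its coefficients; the crosses only add columns, which is why the non-linear relationship has to be fixed in advance.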

Types

type DataPoint

type DataPoint struct {
	Observed  float64
	Variables []float64
	Crosses   []float64
	Predicted float64
	Error     float64
}

func MakeDataPoints

func MakeDataPoints(a [][]float64, obsIndex int) []DataPoint

MakeDataPoints makes a `[]DataPoint` from a `[][]float64`. The expected format for the input is a row-major [][]float64. That is to say, the outer slice holds the rows and each inner slice holds the columns of one row. Furthermore, it is expected that all the inner slices are of the same length. The obsIndex parameter indicates which column should be used as the observed value.
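
The row-major layout can be illustrated with a small standalone conversion. This is a sketch under stated assumptions, not the package's code: the dataPoint type is a local stand-in, and it assumes that the column at obsIndex becomes Observed while the remaining columns, in order, become Variables.

```go
package main

import "fmt"

// dataPoint is a local stand-in for the package's DataPoint type.
type dataPoint struct {
	Observed  float64
	Variables []float64
}

// makeDataPoints converts row-major data into data points.
// Assumption: column obsIndex is the observed value, every other
// column is a variable. Rows too short to contain obsIndex are skipped.
func makeDataPoints(rows [][]float64, obsIndex int) []dataPoint {
	points := make([]dataPoint, 0, len(rows))
	for _, row := range rows {
		if obsIndex < 0 || obsIndex >= len(row) {
			continue
		}
		vars := make([]float64, 0, len(row)-1)
		for j, v := range row {
			if j != obsIndex {
				vars = append(vars, v)
			}
		}
		points = append(points, dataPoint{Observed: row[obsIndex], Variables: vars})
	}
	return points
}

func main() {
	rows := [][]float64{
		{11.2, 587000, 16.5},
		{13.4, 643000, 20.5},
	}
	points := makeDataPoints(rows, 0)
	fmt.Println(points[0].Observed, points[0].Variables) // 11.2 [587000 16.5]
}
```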

type DataPoints

type DataPoints []DataPoint

DataPoints is a slice of DataPoint. This type allows for easier construction of training data points.

type Regression

type Regression struct {
	Data []DataPoint

	R2                float64
	VarianceObserved  float64
	VariancePredicted float64

	Ready bool
	// contains filtered or unexported fields
}

Regression is the exposed data structure for interacting with the API.
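
One common way to relate the R2, VarianceObserved and VariancePredicted fields is R2 = VariancePredicted / VarianceObserved, which holds for an ordinary least squares fit that includes an offset. Whether the package computes its fields exactly this way is an assumption; the sketch below only shows the arithmetic on hand-computed values, with a hypothetical variance helper.

```go
package main

import "fmt"

// variance returns the population variance of xs.
func variance(xs []float64) float64 {
	mean := 0.0
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	sum := 0.0
	for _, x := range xs {
		sum += (x - mean) * (x - mean)
	}
	return sum / float64(len(xs))
}

func main() {
	// Observed values at x = 0, 1, 2, 3 and the least squares line
	// y = 1.3 + 0.8x evaluated at the same points.
	observed := []float64{1, 3, 2, 4}
	predicted := []float64{1.3, 2.1, 2.9, 3.7}

	r2 := variance(predicted) / variance(observed)
	fmt.Printf("%.2f\n", r2) // 0.64
}
```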

func (*Regression) AddCross

func (r *Regression) AddCross(cross featureCross)

AddCross registers a feature cross to be applied to the data points.

func (*Regression) Coeff

func (r *Regression) Coeff(i int) float64

Coeff returns the calculated coefficient for variable i.

func (*Regression) GetCoeffs

func (r *Regression) GetCoeffs() []float64

GetCoeffs returns the calculated coefficients. The element at index 0 is the offset.

func (*Regression) Predict

func (r *Regression) Predict(vars []float64) (float64, error)

Predict updates the "Predicted" value for the input features.

func (*Regression) Run

func (r *Regression) Run() error

Run determines if there is enough data present to run the regression and whether or not the training has already been completed. Once the above checks have passed, feature crosses are applied, if any, and the model is trained using QR decomposition.
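
For readers unfamiliar with the technique, the following standalone sketch solves a least squares problem through a QR factorisation. It uses modified Gram-Schmidt followed by back substitution and is not the package's implementation, which may use a different QR algorithm. The function name leastSquaresQR and the convention of a leading column of ones for the offset are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
)

func dot(u, v []float64) float64 {
	s := 0.0
	for i := range u {
		s += u[i] * v[i]
	}
	return s
}

// leastSquaresQR minimises ||A x - b|| using a thin QR factorisation
// computed with modified Gram-Schmidt. A is row-major with m rows and
// n columns, m >= n, and must have full column rank.
func leastSquaresQR(a [][]float64, b []float64) []float64 {
	m, n := len(a), len(a[0])
	// q[j] starts as column j of A and ends as column j of Q.
	q := make([][]float64, n)
	r := make([][]float64, n)
	for j := 0; j < n; j++ {
		q[j] = make([]float64, m)
		r[j] = make([]float64, n)
		for i := 0; i < m; i++ {
			q[j][i] = a[i][j]
		}
	}
	for j := 0; j < n; j++ {
		r[j][j] = math.Sqrt(dot(q[j], q[j]))
		for i := range q[j] {
			q[j][i] /= r[j][j]
		}
		// Remove the q[j] component from the remaining columns.
		for k := j + 1; k < n; k++ {
			r[j][k] = dot(q[j], q[k])
			for i := range q[k] {
				q[k][i] -= r[j][k] * q[j][i]
			}
		}
	}
	// Solve R x = Q^T b by back substitution.
	x := make([]float64, n)
	for j := n - 1; j >= 0; j-- {
		s := dot(q[j], b)
		for k := j + 1; k < n; k++ {
			s -= r[j][k] * x[k]
		}
		x[j] = s / r[j][j]
	}
	return x
}

func main() {
	// Each row is {1, x}; the leading 1 yields the offset coefficient.
	a := [][]float64{{1, 0}, {1, 1}, {1, 2}, {1, 3}}
	b := []float64{1, 3, 5, 7} // y = 1 + 2x
	fmt.Printf("%.3f\n", leastSquaresQR(a, b)) // [1.000 2.000]
}
```

Solving through QR avoids forming the normal equations, which tends to be more numerically stable when the variables differ widely in scale, as in the usage example.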

func (*Regression) Train

func (r *Regression) Train(d ...DataPoint)

Train the regression with some data points.
