regression

Version: v1.0.1
Published: Nov 19, 2019 License: MIT Imports: 6 Imported by: 73

README

regression

Multivariable Linear Regression in Go (golang)

installation

$ go get github.com/sajari/regression

Supports Go 1.8+

example usage

Import the package, create a regression, and add data to it. You can use as many variables as you like; in the example below there are 3 variables for each observation.

package main

import (
	"fmt"

	"github.com/sajari/regression"
)

func main() {
	r := new(regression.Regression)
	r.SetObserved("Murders per annum per 1,000,000 inhabitants")
	r.SetVar(0, "Inhabitants")
	r.SetVar(1, "Percent with incomes below $5000")
	r.SetVar(2, "Percent unemployed")
	r.Train(
		regression.DataPoint(11.2, []float64{587000, 16.5, 6.2}),
		regression.DataPoint(13.4, []float64{643000, 20.5, 6.4}),
		regression.DataPoint(40.7, []float64{635000, 26.3, 9.3}),
		regression.DataPoint(5.3, []float64{692000, 16.5, 5.3}),
		regression.DataPoint(24.8, []float64{1248000, 19.2, 7.3}),
		regression.DataPoint(12.7, []float64{643000, 16.5, 5.9}),
		regression.DataPoint(20.9, []float64{1964000, 20.2, 6.4}),
		regression.DataPoint(35.7, []float64{1531000, 21.3, 7.6}),
		regression.DataPoint(8.7, []float64{713000, 17.2, 4.9}),
		regression.DataPoint(9.6, []float64{749000, 14.3, 6.4}),
		regression.DataPoint(14.5, []float64{7895000, 18.1, 6}),
		regression.DataPoint(26.9, []float64{762000, 23.1, 7.4}),
		regression.DataPoint(15.7, []float64{2793000, 19.1, 5.8}),
		regression.DataPoint(36.2, []float64{741000, 24.7, 8.6}),
		regression.DataPoint(18.1, []float64{625000, 18.6, 6.5}),
		regression.DataPoint(28.9, []float64{854000, 24.9, 8.3}),
		regression.DataPoint(14.9, []float64{716000, 17.9, 6.7}),
		regression.DataPoint(25.8, []float64{921000, 22.4, 8.6}),
		regression.DataPoint(21.7, []float64{595000, 20.2, 8.4}),
		regression.DataPoint(25.7, []float64{3353000, 16.9, 6.7}),
	)
	r.Run()

	fmt.Printf("Regression formula:\n%v\n", r.Formula)
	fmt.Printf("Regression:\n%s\n", r)
}

Note: You can also add data points one by one.

Once the regression has been run, you can print the data and inspect the R^2, variance, residuals, etc. You can also access the coefficients directly to use elsewhere, e.g.

// Get the coefficient for the "Inhabitants" variable 0:
c := r.Coeff(0)

You can also use the model to predict new data points:

prediction, err := r.Predict([]float64{587000, 16.5, 6.2})
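
To make the prediction step concrete, here is a minimal standalone sketch of how a fitted linear model is typically evaluated. It does not use the package; the function name predict and the coefficient layout (offset at index 0, as described for GetCoeffs) are assumptions for illustration, and the coefficients are made up rather than fitted.

```go
package main

import (
	"errors"
	"fmt"
)

// predict evaluates a linear model. coeffs[0] is treated as the offset
// (intercept) and coeffs[1:] as the per-variable weights. This layout is an
// assumption modeled on the GetCoeffs description, not the package's code.
func predict(coeffs, vars []float64) (float64, error) {
	if len(coeffs) != len(vars)+1 {
		return 0, errors.New("need one coefficient per variable plus an offset")
	}
	p := coeffs[0]
	for i, v := range vars {
		p += coeffs[i+1] * v
	}
	return p, nil
}

func main() {
	// Hypothetical coefficients, not the result of fitting the data above.
	coeffs := []float64{1.5, 2.0, -0.5}
	p, err := predict(coeffs, []float64{3, 4})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p) // 1.5 + 2.0*3 - 0.5*4 = 5.5
}
```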

Feature crosses are supported, so your model can capture fixed non-linear relationships:


r.Train(
  regression.DataPoint(11.2, []float64{587000, 16.5, 6.2}),
)
// Add a new feature which is the first variable (index 0) to the power of 2.
r.AddCross(regression.PowCross(0, 2))
r.Run()
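
As a rough illustration of what a feature cross does, the sketch below derives an extra input column from existing ones. It is standalone and does not call the package; powCross and multiplierCross are hypothetical helpers that only mirror the idea behind PowCross and MultiplierCross, and how the package applies crosses internally is not shown here.

```go
package main

import (
	"fmt"
	"math"
)

// powCross returns a copy of vars with vars[i] raised to power appended
// as a new feature.
func powCross(vars []float64, i int, power float64) []float64 {
	out := append([]float64{}, vars...) // copy so the input is not modified
	return append(out, math.Pow(vars[i], power))
}

// multiplierCross returns a copy of vars with the product of the selected
// inputs appended as a new feature.
func multiplierCross(vars []float64, idx ...int) []float64 {
	product := 1.0
	for _, i := range idx {
		product *= vars[i]
	}
	out := append([]float64{}, vars...)
	return append(out, product)
}

func main() {
	vars := []float64{3, 4, 5}
	fmt.Println(powCross(vars, 0, 2))        // [3 4 5 9]
	fmt.Println(multiplierCross(vars, 1, 2)) // [3 4 5 20]
}
```

The model stays linear in its coefficients; only the inputs gain a non-linear column, which is why the relationship it can capture is fixed in advance.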

Documentation

Constants

This section is empty.

Variables

var (
	// ErrNotEnoughData signals that there weren't enough data points to train the model.
	ErrNotEnoughData = errors.New("not enough data points")
	// ErrTooManyVars signals that there are too many variables for the number of observations being made.
	ErrTooManyVars = errors.New("not enough observations to to support this many variables")
	// ErrRegressionRun signals that the Run method has already been called on the trained dataset.
	ErrRegressionRun = errors.New("regression has already been run")
)

Functions

func DataPoint

func DataPoint(obs float64, vars []float64) *dataPoint

DataPoint creates a well-formed *dataPoint used for training.

func MakeDataPoints

func MakeDataPoints(a [][]float64, obsIndex int) []*dataPoint

MakeDataPoints makes a `[]*dataPoint` from a `[][]float64`. The expected format for the input is a row-major [][]float64. That is to say, the outer slice holds the rows, and each inner slice holds the columns of one row. Furthermore, all the column slices are expected to be of the same length. The obsIndex parameter indicates which column should be used as the observed value; the remaining columns are used as variables.
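
The row-major layout can be illustrated with a small standalone sketch. It assumes obsIndex selects the observed column and that the other columns become the variables in their original order; point and splitRows are hypothetical names standing in for the package's unexported dataPoint type and for MakeDataPoints.

```go
package main

import "fmt"

// point is a hypothetical stand-in for the package's unexported dataPoint.
type point struct {
	observed float64
	vars     []float64
}

// splitRows turns row-major data into points. Column obsIndex of each row is
// taken as the observed value and the remaining columns, in order, as the
// variables. Rows are assumed to be non-empty and of equal length.
func splitRows(rows [][]float64, obsIndex int) []point {
	points := make([]point, 0, len(rows))
	for _, row := range rows {
		vars := make([]float64, 0, len(row)-1)
		vars = append(vars, row[:obsIndex]...)
		vars = append(vars, row[obsIndex+1:]...)
		points = append(points, point{observed: row[obsIndex], vars: vars})
	}
	return points
}

func main() {
	// Each row: observed value first, then three variables.
	rows := [][]float64{
		{11.2, 587000, 16.5, 6.2},
		{13.4, 643000, 20.5, 6.4},
	}
	for _, p := range splitRows(rows, 0) {
		fmt.Println(p.observed, p.vars)
	}
}
```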

func MultiplierCross

func MultiplierCross(vars ...int) featureCross

Feature cross based on the multiplication of multiple inputs.

func PowCross

func PowCross(i int, power float64) featureCross

Feature cross based on computing the power of an input.

Types

type DataPoints

type DataPoints []*dataPoint

DataPoints is a slice of *dataPoint. This type allows for easier construction of training data points.

type Regression

type Regression struct {
	R2                float64
	Varianceobserved  float64
	VariancePredicted float64

	Formula string
	// contains filtered or unexported fields
}

Regression is the exposed data structure for interacting with the API.
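
For readers unfamiliar with the R2 field, the sketch below computes the coefficient of determination using the standard definition 1 - SSres/SStot. It is standalone, rSquared is a hypothetical helper, and whether the package computes R2 in exactly this way is an assumption, not something stated in this documentation.

```go
package main

import "fmt"

// rSquared computes 1 - SSres/SStot for observed values and the model's
// predictions. Both slices are assumed to have the same non-zero length and
// the observed values are assumed not to be all identical.
func rSquared(observed, predicted []float64) float64 {
	mean := 0.0
	for _, y := range observed {
		mean += y
	}
	mean /= float64(len(observed))

	var ssRes, ssTot float64
	for i, y := range observed {
		ssRes += (y - predicted[i]) * (y - predicted[i]) // unexplained variation
		ssTot += (y - mean) * (y - mean)                 // total variation
	}
	return 1 - ssRes/ssTot
}

func main() {
	observed := []float64{1, 2, 3, 4}
	// A perfect fit gives 1.
	fmt.Println(rSquared(observed, observed))
	// Always predicting the mean gives 0.
	fmt.Println(rSquared(observed, []float64{2.5, 2.5, 2.5, 2.5}))
}
```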

func (*Regression) AddCross

func (r *Regression) AddCross(cross featureCross)

AddCross registers a feature cross to be applied to the data points.

func (*Regression) Coeff

func (r *Regression) Coeff(i int) float64

Coeff returns the calculated coefficient for variable i.

func (*Regression) GetCoeffs added in v1.0.1

func (r *Regression) GetCoeffs() []float64

GetCoeffs returns the calculated coefficients. The element at index 0 is the offset.

func (*Regression) GetObserved

func (r *Regression) GetObserved() string

GetObserved gets the name of the observed value.

func (*Regression) GetVar

func (r *Regression) GetVar(i int) string

GetVar gets the name of variable i.

func (*Regression) Predict

func (r *Regression) Predict(vars []float64) (float64, error)

Predict returns the predicted value for the given input features.

func (*Regression) Run

func (r *Regression) Run() error

Run checks that there is enough data present to run the regression and that the training has not already been completed. Once these checks have passed, feature crosses (if any) are applied and the model is trained using QR decomposition.

func (*Regression) SetObserved

func (r *Regression) SetObserved(name string)

SetObserved sets the name of the observed value.

func (*Regression) SetVar

func (r *Regression) SetVar(i int, name string)

SetVar sets the name of variable i.

func (*Regression) String

func (r *Regression) String() string

String satisfies the fmt.Stringer interface to display a regression as a string.

func (*Regression) Train

func (r *Regression) Train(d ...*dataPoint)

Train the regression with some data points.
