# regression

Version: v1.0.1
Published: Nov 19, 2019 License: MIT

## README

### regression

Multivariable Linear Regression in Go (golang)

#### installation

``````
$ go get github.com/sajari/regression
``````

Supports Go 1.8+

#### example usage

Import the package, create a regression, and add data to it. You can use as many variables as you like; the example below uses 3 variables for each observation.

``````
package main

import (
	"fmt"

	"github.com/sajari/regression"
)

func main() {
	r := new(regression.Regression)
	r.SetObserved("Murders per annum per 1,000,000 inhabitants")
	r.SetVar(0, "Inhabitants")
	r.SetVar(1, "Percent with incomes below $5000")
	r.SetVar(2, "Percent unemployed")
	r.Train(
		regression.DataPoint(11.2, []float64{587000, 16.5, 6.2}),
		regression.DataPoint(13.4, []float64{643000, 20.5, 6.4}),
		regression.DataPoint(40.7, []float64{635000, 26.3, 9.3}),
		regression.DataPoint(5.3, []float64{692000, 16.5, 5.3}),
		regression.DataPoint(24.8, []float64{1248000, 19.2, 7.3}),
		regression.DataPoint(12.7, []float64{643000, 16.5, 5.9}),
		regression.DataPoint(20.9, []float64{1964000, 20.2, 6.4}),
		regression.DataPoint(35.7, []float64{1531000, 21.3, 7.6}),
		regression.DataPoint(8.7, []float64{713000, 17.2, 4.9}),
		regression.DataPoint(9.6, []float64{749000, 14.3, 6.4}),
		regression.DataPoint(14.5, []float64{7895000, 18.1, 6}),
		regression.DataPoint(26.9, []float64{762000, 23.1, 7.4}),
		regression.DataPoint(15.7, []float64{2793000, 19.1, 5.8}),
		regression.DataPoint(36.2, []float64{741000, 24.7, 8.6}),
		regression.DataPoint(18.1, []float64{625000, 18.6, 6.5}),
		regression.DataPoint(28.9, []float64{854000, 24.9, 8.3}),
		regression.DataPoint(14.9, []float64{716000, 17.9, 6.7}),
		regression.DataPoint(25.8, []float64{921000, 22.4, 8.6}),
		regression.DataPoint(21.7, []float64{595000, 20.2, 8.4}),
		regression.DataPoint(25.7, []float64{3353000, 16.9, 6.7}),
	)
	r.Run()

	fmt.Printf("Regression formula:\n%v\n", r.Formula)
	fmt.Printf("Regression:\n%s\n", r)
}
``````

Note: You can also add data points one by one.

Once the regression has run, you can print the data and inspect the R^2, variance, residuals, etc. You can also access the coefficients directly for use elsewhere, e.g.

``````
// Get the coefficient for the "Inhabitants" variable 0:
c := r.Coeff(0)
``````

You can also use the model to predict new data points:

``````
prediction, err := r.Predict([]float64{587000, 16.5, 6.2})
``````

Feature crosses are supported, so your model can capture fixed non-linear relationships:

``````
r.Train(
	regression.DataPoint(11.2, []float64{587000, 16.5, 6.2}),
)
// Add a new feature which is the first variable (index 0) to the power of 2.
r.AddCross(regression.PowCross(0, 2))
r.Run()
``````

## Documentation

### Constants

This section is empty.

### Variables

```
var (
	// ErrNotEnoughData signals that there weren't enough datapoint to train the model.
	ErrNotEnoughData = errors.New("not enough data points")
	// ErrTooManyVars signals that there are too many variables for the number of observations being made.
	ErrTooManyVars = errors.New("not enough observations to to support this many variables")
	// ErrRegressionRun signals that the Run method has already been called on the trained dataset.
	ErrRegressionRun = errors.New("regression has already been run")
)
```

### Functions

#### func DataPoint

`func DataPoint(obs float64, vars []float64) *dataPoint`

DataPoint creates a well-formed `*dataPoint` used for training.

#### func MakeDataPoints

`func MakeDataPoints(a [][]float64, obsIndex int) []*dataPoint`

MakeDataPoints makes a `[]*dataPoint` from a `[][]float64`. The expected format for the input is a row-major `[][]float64`: the outer slice holds the rows, and each inner slice holds the columns of one row. All rows are expected to have the same length. The obsIndex parameter indicates which column should be used as the observed value.
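To make the row-major layout concrete, here is a minimal standard-library sketch of splitting each row into an observed value and its variables. The `point` type and `splitRows` function are hypothetical stand-ins for illustration and are not part of the package. The sketch assumes that every column other than the chosen one becomes a variable, and that rows are non-empty and of equal length.

```go
package main

import "fmt"

// point is a hypothetical stand-in for the package's unexported dataPoint type.
type point struct {
	Observed  float64
	Variables []float64
}

// splitRows converts row-major data into points. Column obsIndex of each row
// is taken as the observed value; the remaining columns are kept, in order,
// as the variables. Rows are assumed to be non-empty and of equal length.
func splitRows(rows [][]float64, obsIndex int) []point {
	points := make([]point, 0, len(rows))
	for _, row := range rows {
		vars := make([]float64, 0, len(row)-1)
		vars = append(vars, row[:obsIndex]...)
		vars = append(vars, row[obsIndex+1:]...)
		points = append(points, point{Observed: row[obsIndex], Variables: vars})
	}
	return points
}

func main() {
	// Two rows from the README example, with the observed value in the last column.
	rows := [][]float64{
		{587000, 16.5, 6.2, 11.2},
		{643000, 20.5, 6.4, 13.4},
	}
	for _, p := range splitRows(rows, 3) {
		fmt.Println(p.Observed, p.Variables)
	}
}
```

With the package itself, the equivalent call would presumably be `regression.MakeDataPoints(rows, 3)`, whose result can be passed to `Train`.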

#### func MultiplierCross

`func MultiplierCross(vars ...int) featureCross`

Feature cross based on the multiplication of multiple inputs.
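As an illustration of the idea, the sketch below derives one extra feature as the product of selected inputs and appends it to the feature vector. `multiplyCross` is a hypothetical helper written for this example, and the package's internal representation of a cross may differ.

```go
package main

import "fmt"

// multiplyCross returns a copy of vars with one extra feature appended:
// the product of the inputs at the given indices. It is a hypothetical
// helper and does not reproduce the package's implementation.
func multiplyCross(vars []float64, indices ...int) []float64 {
	product := 1.0
	for _, i := range indices {
		product *= vars[i]
	}
	out := make([]float64, len(vars), len(vars)+1)
	copy(out, vars)
	return append(out, product)
}

func main() {
	// Cross inputs 0 and 2: the new feature is 2 * 4 = 8.
	fmt.Println(multiplyCross([]float64{2, 3, 4}, 0, 2)) // [2 3 4 8]
}
```

A product feature of this kind lets a linear model capture an interaction between two inputs while staying linear in its coefficients.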

#### func PowCross

`func PowCross(i int, power float64) featureCross`

Feature cross based on computing the power of an input.
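Conceptually, a power cross adds a feature equal to one input raised to a fixed power. The following standard-library sketch shows that transformation. `powCross` is a hypothetical helper, not the package's own code.

```go
package main

import (
	"fmt"
	"math"
)

// powCross returns a copy of vars with vars[i] raised to the given power
// appended as an extra feature. It is a hypothetical helper for illustration.
func powCross(vars []float64, i int, power float64) []float64 {
	out := make([]float64, len(vars), len(vars)+1)
	copy(out, vars)
	return append(out, math.Pow(vars[i], power))
}

func main() {
	// Square the first variable: the new feature is 3^2 = 9.
	fmt.Println(powCross([]float64{3, 16.5, 6.2}, 0, 2)) // [3 16.5 6.2 9]
}
```

Fitting a coefficient to the squared feature is what allows the otherwise linear model to follow a quadratic trend in that input.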

### Types

#### type DataPoints

`type DataPoints []*dataPoint`

DataPoints is a slice of `*dataPoint`. This type allows for easier construction of training data points.

#### type Regression

```
type Regression struct {
	R2                float64
	Varianceobserved  float64
	VariancePredicted float64

	Formula string
	// contains filtered or unexported fields
}
```

Regression is the exposed data structure for interacting with the API.
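For readers unfamiliar with the `R2` field, the sketch below computes the coefficient of determination using the common textbook definition, 1 - SS_res/SS_tot. This is an assumption made for illustration: the package may compute `R2` differently, for example from its variance fields. `rSquared` is a hypothetical helper.

```go
package main

import "fmt"

// rSquared returns 1 - SS_res/SS_tot for paired observed and predicted
// values. This is the textbook definition, not necessarily the formula the
// package uses. Both slices are assumed to be non-empty and of equal length.
func rSquared(observed, predicted []float64) float64 {
	mean := 0.0
	for _, y := range observed {
		mean += y
	}
	mean /= float64(len(observed))

	var ssRes, ssTot float64
	for i, y := range observed {
		ssRes += (y - predicted[i]) * (y - predicted[i]) // residual sum of squares
		ssTot += (y - mean) * (y - mean)                 // total sum of squares
	}
	return 1 - ssRes/ssTot
}

func main() {
	observed := []float64{1, 2, 3, 4}
	predicted := []float64{1.1, 1.9, 3.2, 3.8}
	fmt.Printf("R^2 = %.3f\n", rSquared(observed, predicted))
}
```

A value near 1 means the predictions explain most of the variation in the observed values, and a value near 0 means they do little better than predicting the mean.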

#### func (*Regression) AddCross

`func (r *Regression) AddCross(cross featureCross)`

AddCross registers a feature cross to be applied to the data points.

#### func (*Regression) Coeff

`func (r *Regression) Coeff(i int) float64`

Coeff returns the calculated coefficient for variable i.

#### func (*Regression) GetCoeffs (added in v1.0.1)

`func (r *Regression) GetCoeffs() []float64`

GetCoeffs returns the calculated coefficients. The element at index 0 is the offset.
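Because the offset sits at index 0, a coefficient slice laid out this way can be evaluated by hand as offset + sum of coefficient times variable. The sketch below shows that arithmetic. `predict` is a hypothetical helper and the coefficient values are made up for the example, not taken from a real fit.

```go
package main

import (
	"errors"
	"fmt"
)

// predict evaluates coeffs[0] + sum(coeffs[j+1] * vars[j]), treating the
// element at index 0 as the offset. It is a hypothetical helper.
func predict(coeffs, vars []float64) (float64, error) {
	if len(coeffs) != len(vars)+1 {
		return 0, errors.New("expected one coefficient per variable plus an offset")
	}
	y := coeffs[0]
	for j, v := range vars {
		y += coeffs[j+1] * v
	}
	return y, nil
}

func main() {
	// Made-up coefficients: offset 1.5, then one coefficient per variable.
	coeffs := []float64{1.5, 2, -0.5}
	y, err := predict(coeffs, []float64{3, 4})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(y) // 1.5 + 2*3 - 0.5*4 = 5.5
}
```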

#### func (*Regression) GetObserved

`func (r *Regression) GetObserved() string`

GetObserved gets the name of the observed value.

#### func (*Regression) GetVar

`func (r *Regression) GetVar(i int) string`

GetVar gets the name of variable i.

#### func (*Regression) Predict

`func (r *Regression) Predict(vars []float64) (float64, error)`

Predict returns the predicted value for the given input features.

#### func (*Regression) Run

`func (r *Regression) Run() error`

Run checks that there is enough data present to run the regression and that training has not already been completed. Once these checks have passed, any feature crosses are applied and the model is trained using QR decomposition.
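To show what training by QR decomposition involves, here is a self-contained sketch of least squares via a thin QR factorization. It uses modified Gram-Schmidt and back substitution, which is one of several ways to compute a QR factorization and is an assumption of this example; the package's own implementation is not reproduced here. `leastSquaresQR` is a hypothetical function name.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// leastSquaresQR solves min ||A*x - y|| using a thin QR factorization built
// with modified Gram-Schmidt, then back substitution on R*x = Q^T*y.
// a is row-major with n rows and m columns (n >= m) and must be rectangular.
func leastSquaresQR(a [][]float64, y []float64) ([]float64, error) {
	n := len(a)
	if n == 0 || len(y) != n {
		return nil, errors.New("mismatched or empty input")
	}
	m := len(a[0])
	if n < m {
		return nil, errors.New("fewer observations than unknowns")
	}
	// q starts as the columns of A and is orthonormalized in place.
	q := make([][]float64, m)
	r := make([][]float64, m)
	for j := 0; j < m; j++ {
		q[j] = make([]float64, n)
		r[j] = make([]float64, m)
		for i := 0; i < n; i++ {
			q[j][i] = a[i][j]
		}
	}
	for j := 0; j < m; j++ {
		norm := 0.0
		for _, v := range q[j] {
			norm += v * v
		}
		norm = math.Sqrt(norm)
		if norm < 1e-12 { // crude absolute threshold, adequate for a sketch
			return nil, errors.New("columns are linearly dependent")
		}
		r[j][j] = norm
		for i := range q[j] {
			q[j][i] /= norm
		}
		// Remove the component along q[j] from every later column.
		for k := j + 1; k < m; k++ {
			dot := 0.0
			for i := range q[j] {
				dot += q[j][i] * q[k][i]
			}
			r[j][k] = dot
			for i := range q[k] {
				q[k][i] -= dot * q[j][i]
			}
		}
	}
	// Back substitution on the upper-triangular system R*x = Q^T*y.
	x := make([]float64, m)
	for j := m - 1; j >= 0; j-- {
		s := 0.0
		for i := range y {
			s += q[j][i] * y[i]
		}
		for k := j + 1; k < m; k++ {
			s -= r[j][k] * x[k]
		}
		x[j] = s / r[j][j]
	}
	return x, nil
}

func main() {
	// Design matrix with a leading column of ones for the offset; y = 1 + 2x.
	a := [][]float64{{1, 0}, {1, 1}, {1, 2}, {1, 3}}
	y := []float64{1, 3, 5, 7}
	coeffs, err := leastSquaresQR(a, y)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("offset=%.4f slope=%.4f\n", coeffs[0], coeffs[1])
}
```

Solving through QR avoids forming the normal equations explicitly, which generally behaves better numerically when the variables differ widely in scale, as they do in the README example.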

#### func (*Regression) SetObserved

`func (r *Regression) SetObserved(name string)`

SetObserved sets the name of the observed value.

#### func (*Regression) SetVar

`func (r *Regression) SetVar(i int, name string)`

SetVar sets the name of variable i.

#### func (*Regression) String

`func (r *Regression) String() string`

String satisfies the `fmt.Stringer` interface to display a regression as a string.

#### func (*Regression) Train

`func (r *Regression) Train(d ...*dataPoint)`

Train trains the regression with the given data points.