Back to godoc.org

Package neighbors

v0.0.0-...-a35b0cb
Latest Go to latest

The latest major version is .

Published: Feb 10, 2020 | License: MIT | Module: github.com/RobinRCM/sklearn

Overview

Package neighbors implements the k-nearest neighbors algorithm. it contains NearestCentroid, KNeighborsClassifier and KNeighborsRegressor

Index

Examples

func EuclideanDistance

func EuclideanDistance(a, b mat.Vector) float64

EuclideanDistance is a Distancer

Example

Code:

a, b := mat.NewVecDense(4, []float64{0, 0, 0, 0}), mat.NewVecDense(4, []float64{1, 1, 0, 1})
fmt.Printf("%.8f\n", EuclideanDistance(a, b))
1.73205081

func MinkowskiDistanceP

func MinkowskiDistanceP(a, b mat.Vector, p float64) float64

MinkowskiDistanceP ...

func NewKNeighborsRegressor

func NewKNeighborsRegressor(K int, Weights interface{}) base.Predicter

NewKNeighborsRegressor returns an initialized *KNeighborsRegressor Weights may be "uniform", "distance" or func(dist []float64) []float64

type Distance

type Distance func(a, b mat.Vector) float64

Distance has Distance(Vector,Vector)float64

func MinkowskiDistance

func MinkowskiDistance(p float64) Distance

MinkowskiDistance ...

Example

Code:

a, b := mat.NewVecDense(4, []float64{0, 0, 0, 0}), mat.NewVecDense(4, []float64{1, 1, 0, 1})
fmt.Printf("%.8f\n", MinkowskiDistance(1)(a, b))
fmt.Printf("%.8f\n", MinkowskiDistance(2)(a, b))
fmt.Printf("%.8f\n", MinkowskiDistance(math.Inf(1))(a, b))
3.00000000
1.73205081
1.00000000

type InnerNode

type InnerNode struct {
	Node
	// contains filtered or unexported fields
}

InnerNode ...

func (*InnerNode) IsLeaf

func (*InnerNode) IsLeaf() bool

IsLeaf ...

type KDTree

type KDTree struct {
	Data        *mat.Dense
	LeafSize    int
	Maxes, Mins []float64
	Tree        Node
}

KDTree for quick nearest-neighbor lookup This class provides an index into a set of k-dimensional points which can be used to rapidly look up the nearest neighbors of any point. Parameters Data : (N,K) mat.Dense The data points to be indexed. This array is not copied, and

so modifying this data will result in bogus results.

Leafsize : int, optional The number of points at which the algorithm switches over to

brute-force.  Has to be positive.

Notes The algorithm used is described in Maneewongvatana and Mount 1999. The general idea is that the kd-tree is a binary tree, each of whose nodes represents an axis-aligned hyperrectangle. Each node specifies an axis and splits the set of points based on whether their coordinate along that axis is greater than or less than a particular value. During construction, the axis and splitting point are chosen by the "sliding midpoint" rule, which ensures that the cells do not all become long and thin. The tree can be queried for the r closest neighbors of any given point (optionally returning only those within some maximum distance of the point). It can also be queried, with a substantial gain in efficiency, for the r approximate closest neighbors. For large dimensions (20 is already large) do not expect this to run significantly faster than brute force. High-dimensional nearest-neighbor queries are a substantial open problem in computer science. The tree also supports all-neighbors queries, both with arrays of points and with other kd-trees. These do use a reasonably efficient algorithm, but the kd-tree is not necessarily the best data structure for this sort of calculation.

Example

Code:

// v https://github.com/scipy/scipy/blob/v1.1.0/scipy/spatial/kdtree.py
X := mat.NewDense(30, 2, nil)
for i := 0; i < 5; i++ {
	for j := 0; j < 6; j++ {
		X.Set(i*6+j, 0, float64(i))
		X.Set(i*6+j, 1, float64(j+2))
	}
}
//fmt.Println(mat.Formatted(X))
tree := NewKDTree(X, 1)
pts := mat.NewDense(2, 2, []float64{0, 0, 2.1, 2.9})
distances, indices := tree.Query(pts, 1, 1e-15, 2, math.Inf(1))
fmt.Printf("%.6f\n", mat.Formatted(distances.T()))
fmt.Println(mat.Formatted(indices.T()))
[2.000000  0.141421]
[ 0  13]

func NewKDTree

func NewKDTree(data mat.Matrix, LeafSize int) *KDTree

NewKDTree ...

func (*KDTree) Query

func (tr *KDTree) Query(X mat.Matrix, k int, eps, p, distanceUpperBound float64) (dd, ii *mat.Dense)

Query ... Query the kd-tree for nearest neighbors Parameters ---------- x : array_like, last dimension self.m

An array of points to query.

k : int, optional

The number of nearest neighbors to return.

eps : nonnegative float, optional

Return approximate nearest neighbors; the kth returned value
is guaranteed to be no further than (1+eps) times the
distance to the real kth nearest neighbor.

p : float, 1<=p<=infinity, optional

Which Minkowski p-norm to use.
1 is the sum-of-absolute-values "Manhattan" distance
2 is the usual Euclidean distance
infinity is the maximum-coordinate-difference distance

distance_upper_bound : nonnegative float, optional

Return only neighbors within this distance. This is used to prune
tree searches, so if you are doing a series of nearest-neighbor
queries, it may help to supply the distance to the nearest neighbor
of the most recent point.

Returns ------- d : float or array of floats

The distances to the nearest neighbors.
If x has shape tuple+(self.m,), then d has shape tuple if
k is one, or tuple+(k,) if k is larger than one. Missing
neighbors (e.g. when k > n or distance_upper_bound is
given) are indicated with infinite distances.  If k is None,
then d is an object array of shape tuple, containing lists
of distances. In either case the hits are sorted by distance
(nearest first).

i : integer or array of integers

The locations of the neighbors in self.data. i is the same
shape as d.

type KNeighborsClassifier

type KNeighborsClassifier struct {
	base.Predicter
	NearestNeighbors
	K        int
	Weight   string
	Scale    bool
	Distance Distance
	// Runtime members
	Xscaled, Y *mat.Dense
	Classes    [][]float64
	// contains filtered or unexported fields
}

KNeighborsClassifier is a Regression based on k-nearest neighbors. The target is predicted by local interpolation of the targets associated of the nearest neighbors in the training set.

Example

Code:

X := mat.NewDense(4, 1, []float64{0, 1, 2, 3})
Y := mat.NewDense(4, 1, []float64{0, 0, 1, 1})
neigh := NewKNeighborsClassifier(3, "uniform")
neigh.Fit(X, Y)
Xtest := mat.NewDense(1, 1, []float64{1.1})
Ypred := mat.NewDense(1, 1, nil)
neigh.Predict(Xtest, Ypred)
fmt.Println(mat.Formatted(Ypred))
Xtest.Set(0, 0, 0.9)

Yprob := mat.NewDense(1, 2, nil)
neigh.PredictProba(Xtest, Yprob)
fmt.Printf("%.8f\n", mat.Formatted(Yprob))
[0]
[0.66666667  0.33333333]

func NewKNeighborsClassifier

func NewKNeighborsClassifier(K int, Weights string) *KNeighborsClassifier

NewKNeighborsClassifier returns an initialized *KNeighborsClassifier

func (*KNeighborsClassifier) Fit

func (m *KNeighborsClassifier) Fit(Xmatrix, Ymatrix mat.Matrix) base.Fiter

Fit ...

func (*KNeighborsClassifier) GetNOutputs

func (m *KNeighborsClassifier) GetNOutputs() int

GetNOutputs returns output columns number for Y to pass to predict

func (*KNeighborsClassifier) Predict

func (m *KNeighborsClassifier) Predict(X mat.Matrix, Ymutable mat.Mutable) *mat.Dense

Predict for KNeighborsClassifier

func (*KNeighborsClassifier) PredictProba

func (m *KNeighborsClassifier) PredictProba(X, Y *mat.Dense) *KNeighborsClassifier

PredictProba for KNeighborsClassifier

func (*KNeighborsClassifier) Score

func (m *KNeighborsClassifier) Score(X, Y mat.Matrix) float64

Score for KNeighborsClassifier

type KNeighborsRegressor

type KNeighborsRegressor struct {
	NearestNeighbors
	K int
	// Weights may be "uniform", "distance" or func(dstWeights, srcDists []float64)
	Weights  interface{}
	Scale    bool
	Distance Distance
	// Runtime members
	Xscaled, Y *mat.Dense
}

KNeighborsRegressor is a Regression based on k-nearest neighbors. The target is predicted by local interpolation of the targets associated of the nearest neighbors in the training set.

Example

ExampleKNeighborsRegressor from http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#sklearn.neighbors.KNeighborsRegressor

Code:

X := mat.NewDense(4, 1, []float64{0, 1, 2, 3})
Y := mat.NewDense(4, 1, []float64{0, 0, 1, 1})
neigh := NewKNeighborsRegressor(2, "distance")
neigh.Fit(X, Y)
Xtest := mat.NewDense(1, 1, []float64{1.5})
Ypred := mat.NewDense(1, 1, nil)
neigh.Predict(Xtest, Ypred)
fmt.Println(mat.Formatted(Ypred))
[0.5]

func (*KNeighborsRegressor) Fit

func (m *KNeighborsRegressor) Fit(Xmatrix, Ymatrix mat.Matrix) base.Fiter

Fit ...

func (*KNeighborsRegressor) GetNOutputs

func (m *KNeighborsRegressor) GetNOutputs() int

GetNOutputs return Y width

func (*KNeighborsRegressor) IsClassifier

func (*KNeighborsRegressor) IsClassifier() bool

IsClassifier returns false for KNeighborsRegressor

func (*KNeighborsRegressor) Predict

func (m *KNeighborsRegressor) Predict(X mat.Matrix, Ymutable mat.Mutable) *mat.Dense

Predict ...

func (*KNeighborsRegressor) PredicterClone

func (m *KNeighborsRegressor) PredicterClone() base.Predicter

PredicterClone return a (possibly unfitted) copy of predicter

func (*KNeighborsRegressor) Score

func (m *KNeighborsRegressor) Score(X, Y mat.Matrix) float64

Score for KNeighborsRegressor

type LeafNode

type LeafNode struct {
	Node
	// contains filtered or unexported fields
}

LeafNode ...

func (*LeafNode) IsLeaf

func (*LeafNode) IsLeaf() bool

IsLeaf ...

type NearestCentroid

type NearestCentroid struct {
	base.Predicter
	Metric          string
	ShrinkThreshold float64
	// runtime filled members
	NearestNeighbors
	Classes    [][]float64
	ClassCount [][]int
	Centroids  *mat.Dense
}

NearestCentroid is a Regression based on k-nearest neighbors. The target is predicted by local interpolation of the targets associated of the nearest neighbors in the training set.

Example

Code:

// Example adapted from from http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestCentroid.html#sklearn.neighbors.NearestCentroid
X := mat.NewDense(6, 2, []float64{-1, -1, -2, -1, -3, -2, 1, 1, 2, 1, 3, 2})
Y := mat.NewDense(6, 1, []float64{1, 1, 1, 2, 2, 2})
clf := NewNearestCentroid("euclidean", 0.)
clf.Fit(X, Y)
Xtest := mat.NewDense(1, 2, []float64{-0.8, -1})

Ypred := mat.NewDense(1, 1, nil)
clf.Predict(Xtest, Ypred)
fmt.Println(mat.Formatted(Ypred))
[1]

func NewNearestCentroid

func NewNearestCentroid(metric string, shrinkThreshold float64) *NearestCentroid

NewNearestCentroid ... if Metric is "manhattan", centroids are computed using median else mean

func (*NearestCentroid) Fit

func (m *NearestCentroid) Fit(Xmatrix, Ymatrix mat.Matrix) base.Fiter

Fit ...

func (*NearestCentroid) Predict

func (m *NearestCentroid) Predict(X mat.Matrix, Ymutable mat.Mutable) *mat.Dense

Predict for NearestCentroid

func (*NearestCentroid) PredictProba

func (m *NearestCentroid) PredictProba(X mat.Matrix, Y *mat.Dense) *NearestCentroid

PredictProba for NearestCentroid

func (*NearestCentroid) Score

func (m *NearestCentroid) Score(X, Y mat.Matrix) float64

Score for NearestCentroid

type NearestNeighbors

type NearestNeighbors struct {
	Algorithm string
	Metric    string
	P         float64
	NJobs     int
	LeafSize  int
	// Runtime filled members
	Distance func(a, b mat.Vector) float64
	X, Y     *mat.Dense
	Tree     *KDTree
}

NearestNeighbors is the unsupervised alog implementing search of k nearest neighbors Algorithm is one of 'auto', 'ball_tree', 'kd_tree', 'brute' defaults to "auto"

Metric = 'cityblock', 'cosine', 'euclidean', 'l1', 'l2', 'manhattan' defaults to euclidean (= minkowski with P=2) P is power for 'minkowski' NJobs: number of concurrent jobs. NJobs<0 means runtime.NumCPU() default to -1

Example

Code:

// test from http://scikit-learn.org/stable/modules/neighbors.html
X := mat.NewDense(6, 2, []float64{-1, -1, -2, -1, -3, -2, 1, 1, 2, 1, 3, 2})
nbrs := NewNearestNeighbors()
nbrs.Fit(X, mat.Matrix(nil))
distances, indices := nbrs.KNeighbors(X, 2)
fmt.Printf("indices:\n%v\n", mat.Formatted(indices))
fmt.Printf("distances:\n%v\n", mat.Formatted(distances))
	indices:
⎡0  1⎤
⎢1  0⎥
⎢2  1⎥
⎢3  4⎥
⎢4  3⎥
⎣5  4⎦
distances:
⎡                 0                   1⎤
⎢                 0                   1⎥
⎢                 0  1.4142135623730951⎥
⎢                 0                   1⎥
⎢                 0                   1⎥
⎣                 0  1.4142135623730951⎦

func NewNearestNeighbors

func NewNearestNeighbors() *NearestNeighbors

NewNearestNeighbors returns an *NearestNeighbors

func (*NearestNeighbors) Fit

func (m *NearestNeighbors) Fit(X, Y mat.Matrix)

Fit for NearestNeighbors. Y is unused

func (*NearestNeighbors) KNeighbors

func (m *NearestNeighbors) KNeighbors(X mat.Matrix, NNeighbors int) (distances, indices *mat.Dense)

KNeighbors returns distances and indices of first NNeighbors

func (*NearestNeighbors) KNeighborsGraph

func (m *NearestNeighbors) KNeighborsGraph(X *mat.Dense, NNeighbors int, mode string, includeSelf bool) (graph *mat.Dense)

KNeighborsGraph Computes the (weighted) graph of k-Neighbors for points in X mode : {‘connectivity’, ‘distance’}, optional

Type of returned matrix: ‘connectivity’ will return the connectivity matrix with ones and zeros, in ‘distance’ the edges are Euclidean distance between points.

Returns: A : shape = [n_samples, n_samples_fit]

n_samples_fit is the number of samples in the fitted data A[i, j] is assigned the weight of edge that connects i to j.
Example

Code:

X := mat.NewDense(3, 1, []float64{0, 3, 1})
neigh := NewNearestNeighbors()
neigh.Fit(X, mat.Matrix(nil))
A := neigh.KNeighborsGraph(X, 2, "connectivity", true)
fmt.Println(mat.Formatted(A))
⎡1  0  1⎤
⎢0  1  1⎥
⎣1  0  1⎦

func (*NearestNeighbors) RadiusNeighbors

func (m *NearestNeighbors) RadiusNeighbors(X *mat.Dense, radius float64) (distances [][]float64, indices [][]int)

RadiusNeighbors Finds the neighbors within a given radius of a point or points. Return the indices and distances of each point from the dataset lying in a ball with size ``radius`` around the points of the query array. Points lying on the boundary are included in the results. The result points are *not* necessarily sorted by distance to their query point. Parameters ---------- X : array-like, (n_samples, n_features), optional

The query point or points.
If not provided, neighbors of each indexed point are returned.
In this case, the query point is not considered its own neighbor.

radius : float

Limiting distance of neighbors to return.
(default is the value passed to the constructor).
Example

Code:

// adapted example from http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.RadiusNeighborsRegressor.html#sklearn.neighbors.RadiusNeighborsRegressor
samples := mat.NewDense(3, 3, []float64{0, 0, 0, 0, .5, 0, 1, 1, .5})
neigh := NewNearestNeighbors()
neigh.Fit(samples, mat.Matrix(nil))
distances, indices := neigh.RadiusNeighbors(mat.NewDense(1, 3, []float64{1, 1, 1}), 1.6)
fmt.Println(distances[0])
fmt.Println(indices[0])
// unlike the python example, distances are sorted by increasing distances
[0.5 1.5]
[2 1]

type Node

type Node interface {
	IsLeaf() bool
}

Node ...

type Rectangle

type Rectangle struct{ Maxes, Mins []float64 }

Rectangle Hyperrectangle class. Represents a Cartesian product of intervals.

func NewRectangle

func NewRectangle(Maxes, Mins []float64) *Rectangle

NewRectangle ...

func (*Rectangle) MaxDistancePoint

func (r *Rectangle) MaxDistancePoint(x []float64, p float64) float64

MaxDistancePoint return the maximum distance between input and points in the hyperrectangle. Parameters x : array_like Input array. p : float, optional Input.

func (*Rectangle) MaxDistanceRectangle

func (r *Rectangle) MaxDistanceRectangle(other *Rectangle, p float64) float64

MaxDistanceRectangle compute the maximum distance between points in the two hyperrectangles. Parameters other : hyperrectangle Input. p : float, optional Input.

func (*Rectangle) MinDistancePoint

func (r *Rectangle) MinDistancePoint(x []float64, p float64) float64

MinDistancePoint Return the minimum distance between input and points in the hyperrectangle. Parameters x : array_like Input. p : float, optional Input.

func (*Rectangle) MinDistanceRectangle

func (r *Rectangle) MinDistanceRectangle(other *Rectangle, p float64) float64

MinDistanceRectangle compute the minimum distance between points in the two hyperrectangles. Parameters other : hyperrectangle Input. p : float Input.

func (*Rectangle) Split

func (r *Rectangle) Split(d int, split float64) (less, greater *Rectangle)

Split Produce two hyperrectangles by splitting. In general, if you need to compute maximum and minimum distances to the children, it can be done more efficiently by updating the maximum and minimum distances to the parent. Parameters ---------- d : int

Axis to split hyperrectangle along.

split : float

Position along axis `d` to split at.

func (*Rectangle) String

func (r *Rectangle) String() string

String ...

func (*Rectangle) Volume

func (r *Rectangle) Volume() float64

Volume ...

Package Files

Documentation was rendered with GOOS=linux and GOARCH=amd64.

Jump to identifier

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to identifier