models

package
v1.0.0 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 3, 2026 License: BSD-2-Clause Imports: 7 Imported by: 0

Documentation

Overview

Package models provides outlier detection models. ABOD: Angle-Based Outlier Detection. Reference: Kriegel, H.P. and Zimek, A., 2008, August. Angle-based outlier detection in high-dimensional data. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 444-452).

Package models provides outlier detection algorithms. It is inspired by and based on the design of PyOD (Python Outlier Detection).

Package models provides outlier detection models. COF: Connectivity-Based Outlier Factor. Reference: Tang, J., Chen, Z., Fu, A.W.C. and Cheung, D.W., 2002. Enhancing effectiveness of outlier detections for low density patterns. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 535-548). Springer, Berlin, Heidelberg.

Package models provides outlier detection models. COPOD: Copula-Based Outlier Detection. Reference: Li, Z., Zhao, Y., Botta, N., Ionescu, C. and Hu, X., 2020. COPOD: copula-based outlier detection. In 2020 IEEE International Conference on Data Mining (ICDM) (pp. 1118-1123). IEEE.

Package models provides outlier detection models. LODA: Lightweight on-line detector of anomalies. Reference: Pevny, T., 2016. Loda: Lightweight on-line detector of anomalies. Machine Learning, 102(2), pp.275-304.

Package models provides outlier detection models. MAD: Median Absolute Deviation for univariate outlier detection. Reference: Iglewicz, B. and Hoaglin, D.C., 1993. How to detect and handle outliers (Vol. 16). ASQ Press.

Package models provides outlier detection models. SOD: Subspace Outlier Detection. Reference: Kriegel, H.P., Kröger, P., Schubert, E. and Zimek, A., 2009. Outlier detection in axis-parallel subspaces of high dimensional data. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 831-838). Springer, Berlin, Heidelberg.

Package models provides outlier detection models. SOS: Stochastic Outlier Selection. Reference: Janssens, J.H.M., Huszar, F., Postma, E.O. and van den Herik, H.J., 2012. Stochastic outlier selection. Tilburg centre for Creative Computing, techreport 2012(1).

Constants

This section is empty.

Variables

var ErrInvalidContamination = errors.New("contamination must be in (0, 0.5]")

ErrInvalidContamination is returned when contamination is outside the valid range (0, 0.5].

var ErrInvalidData = errors.New("invalid input data")

ErrInvalidData is returned when input data is invalid

var ErrNotFitted = errors.New("detector has not been fitted")

ErrNotFitted is returned when trying to use a detector before fitting

var ErrNotUnivariate = errors.New("MAD is only for univariate data")

ErrNotUnivariate is returned when MAD is used with multivariate data.

Functions

func GetMatrixShape

func GetMatrixShape(X Matrix) (nSamples, nFeatures int)

GetMatrixShape returns the dimensions of a matrix

func Percentile

func Percentile(data Vector, p float64) float64

Percentile calculates the p-th percentile of the data
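
The interpolation rule behind Percentile is not documented on this page. As a minimal sketch, assuming p is given on a 0-100 scale and that values between ranks are linearly interpolated, a stand-in could look like the following. The name percentileSketch and the empty-input behavior are hypothetical, not part of this package.

```go
package main

import "sort"

// Vector mirrors the package's Vector type for this standalone sketch.
type Vector []float64

// percentileSketch is a hypothetical stand-in for Percentile. It assumes p is
// on a 0-100 scale and linearly interpolates between the two closest ranks.
func percentileSketch(data Vector, p float64) float64 {
	if len(data) == 0 {
		return 0 // assumption: empty input yields 0
	}
	sorted := append(Vector(nil), data...) // copy so the caller's slice is untouched
	sort.Float64s(sorted)
	if p <= 0 {
		return sorted[0]
	}
	if p >= 100 {
		return sorted[len(sorted)-1]
	}
	rank := p / 100 * float64(len(sorted)-1) // fractional index into sorted
	lo := int(rank)
	frac := rank - float64(lo)
	if lo+1 >= len(sorted) {
		return sorted[lo]
	}
	return sorted[lo] + frac*(sorted[lo+1]-sorted[lo])
}
```

Under these assumptions, the 50th percentile of {1, 2, 3, 4} falls halfway between the two middle values.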

func ValidateMatrix

func ValidateMatrix(X Matrix) error

ValidateMatrix checks if the input matrix is valid
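
What counts as a valid matrix is not spelled out here. The sketch below shows one plausible reading, assuming a matrix must be non-empty, rectangular, and free of NaN and infinite values. The names shapeSketch, validateSketch, and errInvalidData are hypothetical local stand-ins for GetMatrixShape, ValidateMatrix, and ErrInvalidData.

```go
package main

import (
	"errors"
	"math"
)

// Matrix mirrors the package's Matrix type for this standalone sketch.
type Matrix [][]float64

// errInvalidData plays the role of the package's ErrInvalidData.
var errInvalidData = errors.New("invalid input data")

// shapeSketch returns (rows, columns), taking the column count from the
// first row. An empty matrix has shape (0, 0).
func shapeSketch(X Matrix) (nSamples, nFeatures int) {
	if len(X) == 0 {
		return 0, 0
	}
	return len(X), len(X[0])
}

// validateSketch rejects empty matrices, ragged rows, and non-finite values.
// These rules are assumptions; the real ValidateMatrix may check more or less.
func validateSketch(X Matrix) error {
	nSamples, nFeatures := shapeSketch(X)
	if nSamples == 0 || nFeatures == 0 {
		return errInvalidData
	}
	for _, row := range X {
		if len(row) != nFeatures {
			return errInvalidData
		}
		for _, v := range row {
			if math.IsNaN(v) || math.IsInf(v, 0) {
				return errInvalidData
			}
		}
	}
	return nil
}
```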

Types

type ABOD

type ABOD struct {
	BaseDetector
	// contains filtered or unexported fields
}

ABOD implements Angle-Based Outlier Detection. For an observation, the variance of its weighted cosine scores to all neighbors is used as the outlying score. Lower variance indicates outliers.
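
The score described above can be sketched for a single point as follows. This is an illustration under stated assumptions, not the package's implementation: each pair of neighbors (a, b) contributes the dot product of (a-p) and (b-p) divided by the product of their squared lengths, and the score is the population variance of those values. The name angleVarianceSketch is hypothetical, and all rows are assumed to have the same length as p.

```go
package main

// Matrix mirrors the package's Matrix type for this standalone sketch.
type Matrix [][]float64

// angleVarianceSketch returns the variance of the distance-weighted cosines
// between p and every pair of neighbors. A low value suggests that p sees
// its neighbors within a narrow range of directions, as an outlier would.
func angleVarianceSketch(p []float64, neighbors Matrix) float64 {
	var vals []float64
	for i := 0; i < len(neighbors); i++ {
		for j := i + 1; j < len(neighbors); j++ {
			var dot, na, nb float64
			for d := range p {
				a := neighbors[i][d] - p[d]
				b := neighbors[j][d] - p[d]
				dot += a * b
				na += a * a
				nb += b * b
			}
			if na == 0 || nb == 0 {
				continue // neighbor coincides with p; the angle is undefined
			}
			vals = append(vals, dot/(na*nb))
		}
	}
	if len(vals) == 0 {
		return 0 // assumption: no usable pairs yields 0
	}
	var mean float64
	for _, v := range vals {
		mean += v
	}
	mean /= float64(len(vals))
	var variance float64
	for _, v := range vals {
		variance += (v - mean) * (v - mean)
	}
	return variance / float64(len(vals))
}
```

A point at the center of a ring of neighbors gets a much larger variance than a point far outside the ring, which matches the statement that lower variance indicates outliers.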

func NewABOD

func NewABOD(opts *ABODOptions) *ABOD

NewABOD creates a new ABOD detector with the given options.

func (*ABOD) DecisionFunction

func (a *ABOD) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction computes the anomaly score for new samples.

func (*ABOD) Fit

func (a *ABOD) Fit(X Matrix) error

Fit trains the ABOD detector on the given data.

func (*ABOD) Predict

func (a *ABOD) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the given samples.

func (*ABOD) PredictProba

func (a *ABOD) PredictProba(X Matrix) (Matrix, error)

PredictProba returns the probability of each sample being an outlier.

type ABODOptions

type ABODOptions struct {
	// Contamination is the proportion of outliers in the data set (default: 0.1)
	Contamination float64
	// NNeighbors is the number of neighbors for fast ABOD (default: 5)
	NNeighbors int
	// Method specifies the computation method: "fast" or "default" (default: "fast")
	Method string
}

ABODOptions contains options for the ABOD detector.

func DefaultABODOptions

func DefaultABODOptions() *ABODOptions

DefaultABODOptions returns default options for ABOD.

type BaseDetector

type BaseDetector struct {
	// Contamination is the proportion of outliers in the data set
	Contamination float64

	// DecisionScores_ are the outlier scores of the training data
	DecisionScores_ Vector

	// Threshold_ is the threshold for generating binary labels
	Threshold_ float64

	// Labels_ are the binary labels of the training data
	Labels_ []int

	// Fitted indicates whether the detector has been fitted
	Fitted bool
	// contains filtered or unexported fields
}

BaseDetector provides common functionality for all detectors

func NewBaseDetector

func NewBaseDetector(contamination float64) (*BaseDetector, error)

NewBaseDetector creates a new BaseDetector with the given contamination
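
Because NewBaseDetector returns an error, a caller presumably needs to handle an out-of-range contamination. The sketch below mimics that check with local stand-ins, assuming the valid range is (0, 0.5] as the message of ErrInvalidContamination states. The names baseSketch, newBaseSketch, and errInvalidContamination are hypothetical.

```go
package main

import "errors"

// errInvalidContamination plays the role of the package's
// ErrInvalidContamination in this standalone sketch.
var errInvalidContamination = errors.New("contamination must be in (0, 0.5]")

// baseSketch holds only the field needed for the illustration.
type baseSketch struct {
	Contamination float64
}

// newBaseSketch accepts contamination in (0, 0.5] and rejects everything
// else, including NaN, with the sentinel error.
func newBaseSketch(contamination float64) (*baseSketch, error) {
	if !(contamination > 0 && contamination <= 0.5) {
		return nil, errInvalidContamination
	}
	return &baseSketch{Contamination: contamination}, nil
}
```

With a sentinel error like this, callers can compare against it using errors.Is, which also works if the error is wrapped on its way up.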

func (*BaseDetector) GetDecisionScores

func (b *BaseDetector) GetDecisionScores() Vector

GetDecisionScores returns the decision scores of the training data

func (*BaseDetector) GetLabels

func (b *BaseDetector) GetLabels() []int

GetLabels returns the binary labels of the training data

func (*BaseDetector) GetThreshold

func (b *BaseDetector) GetThreshold() float64

GetThreshold returns the threshold for outlier detection

func (*BaseDetector) IsFitted

func (b *BaseDetector) IsFitted() bool

IsFitted returns whether the detector has been fitted

func (*BaseDetector) PredictFromScores

func (b *BaseDetector) PredictFromScores(scores Vector) []int

PredictFromScores generates binary predictions from anomaly scores

func (*BaseDetector) PredictProbaFromScores

func (b *BaseDetector) PredictProbaFromScores(scores Vector, method string) (Matrix, error)

PredictProbaFromScores calculates probability estimates from scores
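
The accepted values of method are not listed on this page. As one illustration, a linear conversion could min-max scale each score against the range of the training scores and clip the result to [0, 1]. The sketch assumes that behavior and the [P(inlier), P(outlier)] column order stated in the Detector interface; probaLinearSketch is a hypothetical name.

```go
package main

// Vector and Matrix mirror the package's types for this standalone sketch.
type Vector []float64
type Matrix [][]float64

// probaLinearSketch scales scores by the min and max of trainScores and
// returns one row per score in the form [P(inlier), P(outlier)].
func probaLinearSketch(trainScores, scores Vector) Matrix {
	var lo, hi float64
	if len(trainScores) > 0 {
		lo, hi = trainScores[0], trainScores[0]
	}
	for _, s := range trainScores {
		if s < lo {
			lo = s
		}
		if s > hi {
			hi = s
		}
	}
	out := make(Matrix, len(scores))
	for i, s := range scores {
		p := 0.0
		if hi > lo { // a constant training range leaves every probability at 0
			p = (s - lo) / (hi - lo)
		}
		if p < 0 {
			p = 0
		}
		if p > 1 {
			p = 1
		}
		out[i] = []float64{1 - p, p}
	}
	return out
}
```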

func (*BaseDetector) ProcessDecisionScores

func (b *BaseDetector) ProcessDecisionScores()

ProcessDecisionScores calculates the threshold and labels based on decision scores
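
How the threshold is derived is not stated here. A common convention, which the sketch below assumes, is to place the threshold at the 100*(1-Contamination) percentile of the training scores and to label a sample 1 when its score is strictly greater. The names thresholdSketch and labelsSketch are hypothetical.

```go
package main

import "sort"

// Vector mirrors the package's Vector type for this standalone sketch.
type Vector []float64

// thresholdSketch returns the score at the (1-contamination) quantile of
// scores, linearly interpolating between ranks.
func thresholdSketch(scores Vector, contamination float64) float64 {
	if len(scores) == 0 {
		return 0 // assumption: no scores yields a zero threshold
	}
	sorted := append(Vector(nil), scores...)
	sort.Float64s(sorted)
	rank := (1 - contamination) * float64(len(sorted)-1)
	lo := int(rank)
	if lo+1 >= len(sorted) {
		return sorted[len(sorted)-1]
	}
	return sorted[lo] + (rank-float64(lo))*(sorted[lo+1]-sorted[lo])
}

// labelsSketch marks scores strictly above the threshold as outliers (1)
// and everything else as inliers (0).
func labelsSketch(scores Vector, threshold float64) []int {
	labels := make([]int, len(scores))
	for i, s := range scores {
		if s > threshold {
			labels[i] = 1
		}
	}
	return labels
}
```

With ten training scores and a contamination of 0.1, this convention flags only the single most extreme score.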

type COF

type COF struct {
	BaseDetector
	// contains filtered or unexported fields
}

COF implements the Connectivity-Based Outlier Factor algorithm. COF scores a data point by the ratio of its average chaining distance to the mean of the average chaining distances of its k nearest neighbors.

func NewCOF

func NewCOF(opts *COFOptions) *COF

NewCOF creates a new COF detector with the given options.

func (*COF) DecisionFunction

func (c *COF) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction computes the anomaly score for new samples.

func (*COF) Fit

func (c *COF) Fit(X Matrix) error

Fit trains the COF detector on the given data.

func (*COF) Predict

func (c *COF) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the given samples.

func (*COF) PredictProba

func (c *COF) PredictProba(X Matrix) (Matrix, error)

PredictProba returns the probability of each sample being an outlier.

type COFOptions

type COFOptions struct {
	// Contamination is the proportion of outliers in the data set (default: 0.1)
	Contamination float64
	// NNeighbors is the number of neighbors to use (default: 20)
	NNeighbors int
	// Method specifies the computation method: "fast" or "memory" (default: "fast")
	Method string
}

COFOptions contains options for the COF detector.

func DefaultCOFOptions

func DefaultCOFOptions() *COFOptions

DefaultCOFOptions returns default options for COF.

type COPOD

type COPOD struct {
	BaseDetector
	// contains filtered or unexported fields
}

COPOD implements Copula-Based Outlier Detection. COPOD is a parameter-free, highly interpretable outlier detection algorithm based on empirical copula models.

func NewCOPOD

func NewCOPOD(opts *COPODOptions) *COPOD

NewCOPOD creates a new COPOD detector with the given options.

func (*COPOD) DecisionFunction

func (c *COPOD) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction computes the anomaly score for new samples.

func (*COPOD) Fit

func (c *COPOD) Fit(X Matrix) error

Fit trains the COPOD detector on the given data.

func (*COPOD) Predict

func (c *COPOD) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the given samples.

func (*COPOD) PredictProba

func (c *COPOD) PredictProba(X Matrix) (Matrix, error)

PredictProba returns the probability of each sample being an outlier.

type COPODOptions

type COPODOptions struct {
	// Contamination is the proportion of outliers in the data set (default: 0.1)
	Contamination float64
}

COPODOptions contains options for the COPOD detector.

func DefaultCOPODOptions

func DefaultCOPODOptions() *COPODOptions

DefaultCOPODOptions returns default options for COPOD.

type Detector

type Detector interface {
	// Fit trains the detector on the input data
	// X is a matrix of shape (n_samples, n_features)
	// y is optional and ignored in unsupervised methods
	Fit(X Matrix, y Vector) error

	// Predict returns binary labels for the input data
	// 0 for inliers, 1 for outliers
	Predict(X Matrix) ([]int, error)

	// DecisionFunction returns raw anomaly scores
	// Higher scores indicate more abnormal samples
	DecisionFunction(X Matrix) (Vector, error)

	// PredictProba returns probability estimates
	// Returns a matrix of shape (n_samples, 2) with [P(inlier), P(outlier)]
	PredictProba(X Matrix, method string) (Matrix, error)

	// GetThreshold returns the threshold for outlier detection
	GetThreshold() float64

	// GetLabels returns the binary labels of the training data
	GetLabels() []int

	// GetDecisionScores returns the decision scores of the training data
	GetDecisionScores() Vector

	// IsFitted returns whether the detector has been fitted
	IsFitted() bool
}

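Detector is the common interface for the package's outlier detection models. Going by the signatures listed on this page, ECOD, HBOS, IForest, KNN, LOF, and PCA match it; ABOD, COF, COPOD, LODA, MAD, SOD, and SOS declare Fit(X Matrix) and PredictProba(X Matrix), so they do not satisfy it as documented.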
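
To show how code can be written against an interface of this shape, the sketch below defines a trimmed local copy with three of the methods and a toy detector that satisfies it. Everything here is hypothetical: scorer, centroidDetector, and flagOutliers are not part of the package, and the toy model (distance to the training mean, thresholded at a nearest-rank quantile) is chosen only because it is short.

```go
package main

import (
	"errors"
	"math"
	"sort"
)

// Matrix and Vector mirror the package's types for this standalone sketch.
type Matrix [][]float64
type Vector []float64

// scorer is a trimmed local copy of part of the Detector interface.
type scorer interface {
	Fit(X Matrix, y Vector) error
	DecisionFunction(X Matrix) (Vector, error)
	Predict(X Matrix) ([]int, error)
}

// centroidDetector is a toy model: the score of a sample is its Euclidean
// distance to the mean of the training data.
type centroidDetector struct {
	contamination float64
	mean          []float64
	threshold     float64
	fitted        bool
}

func (c *centroidDetector) Fit(X Matrix, _ Vector) error {
	if len(X) == 0 || len(X[0]) == 0 {
		return errors.New("invalid input data")
	}
	for _, row := range X {
		if len(row) != len(X[0]) {
			return errors.New("invalid input data")
		}
	}
	c.mean = make([]float64, len(X[0]))
	for _, row := range X {
		for j, v := range row {
			c.mean[j] += v / float64(len(X))
		}
	}
	c.fitted = true
	scores, _ := c.DecisionFunction(X) // cannot fail: rows were checked above
	sorted := append(Vector(nil), scores...)
	sort.Float64s(sorted)
	// Assumption: threshold at the (1-contamination) quantile, rounded down.
	c.threshold = sorted[int((1-c.contamination)*float64(len(sorted)-1))]
	return nil
}

func (c *centroidDetector) DecisionFunction(X Matrix) (Vector, error) {
	if !c.fitted {
		return nil, errors.New("detector has not been fitted")
	}
	scores := make(Vector, len(X))
	for i, row := range X {
		if len(row) != len(c.mean) {
			return nil, errors.New("invalid input data")
		}
		var sum float64
		for j, v := range row {
			sum += (v - c.mean[j]) * (v - c.mean[j])
		}
		scores[i] = math.Sqrt(sum)
	}
	return scores, nil
}

func (c *centroidDetector) Predict(X Matrix) ([]int, error) {
	scores, err := c.DecisionFunction(X)
	if err != nil {
		return nil, err
	}
	labels := make([]int, len(scores))
	for i, s := range scores {
		if s > c.threshold {
			labels[i] = 1 // 1 marks an outlier, 0 an inlier
		}
	}
	return labels, nil
}

// flagOutliers works with any value that satisfies scorer.
func flagOutliers(d scorer, train, test Matrix) ([]int, error) {
	if err := d.Fit(train, nil); err != nil {
		return nil, err
	}
	return d.Predict(test)
}
```

Writing helpers such as flagOutliers against the interface rather than a concrete type lets the detector be swapped without touching the calling code.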

type ECOD

type ECOD struct {
	*BaseDetector

	// Training data (stored for prediction)
	XTrain_ Matrix
}

ECOD implements Empirical Cumulative Distribution based Outlier Detection. ECOD is a parameter-free, highly interpretable outlier detection algorithm based on empirical cumulative distribution functions.

func NewECOD

func NewECOD(opts *ECODOptions) *ECOD

NewECOD creates a new ECOD detector

func (*ECOD) DecisionFunction

func (e *ECOD) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction returns the anomaly scores for the input samples

func (*ECOD) Fit

func (e *ECOD) Fit(X Matrix, y Vector) error

Fit trains the ECOD detector on the input data

func (*ECOD) Predict

func (e *ECOD) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the input data

func (*ECOD) PredictProba

func (e *ECOD) PredictProba(X Matrix, method string) (Matrix, error)

PredictProba returns probability estimates for the input data

type ECODOptions

type ECODOptions struct {
	Contamination float64 // Contamination rate (default: 0.1)
}

ECODOptions holds configuration options for ECOD

func DefaultECODOptions

func DefaultECODOptions() *ECODOptions

DefaultECODOptions returns default options for ECOD

type HBOS

type HBOS struct {
	*BaseDetector

	// NBins is the number of bins for the histogram
	// Can be a fixed number or "auto" for automatic selection
	NBins int

	// Alpha is the regularizer for preventing overflow
	Alpha float64

	// Tol is the tolerance for samples falling outside bins
	Tol float64
	// contains filtered or unexported fields
}

HBOS implements the Histogram-based Outlier Score algorithm. It assumes feature independence and calculates the degree of outlyingness by building histograms. Reference: Goldstein, M. and Dengel, A., 2012. Histogram-based outlier score (hbos): A fast unsupervised anomaly detection algorithm.

func NewHBOS

func NewHBOS(opts *HBOSOptions) *HBOS

NewHBOS creates a new HBOS detector

func (*HBOS) DecisionFunction

func (h *HBOS) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction returns the anomaly scores for the input samples

func (*HBOS) Fit

func (h *HBOS) Fit(X Matrix, y Vector) error

Fit trains the HBOS detector on the input data

func (*HBOS) Predict

func (h *HBOS) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the input data

func (*HBOS) PredictProba

func (h *HBOS) PredictProba(X Matrix, method string) (Matrix, error)

PredictProba returns probability estimates for the input data

type HBOSOptions

type HBOSOptions struct {
	NBins         int     // Number of bins (default: 10)
	Alpha         float64 // Regularizer (default: 0.1)
	Tol           float64 // Tolerance (default: 0.5)
	Contamination float64 // Contamination rate (default: 0.1)
}

HBOSOptions holds configuration options for HBOS

func DefaultHBOSOptions

func DefaultHBOSOptions() *HBOSOptions

DefaultHBOSOptions returns default options for HBOS

type IForest

type IForest struct {
	*BaseDetector

	// NEstimators is the number of base estimators (trees)
	NEstimators int

	// MaxSamples is the number of samples to draw for each tree
	// If <= 1.0, it's treated as a fraction of the total samples
	MaxSamples int

	// MaxFeatures is the number of features for each tree
	MaxFeatures int

	// Bootstrap indicates whether to use bootstrap sampling
	Bootstrap bool

	// RandomState is the random seed
	RandomState int64
	// contains filtered or unexported fields
}

IForest implements the Isolation Forest algorithm. It isolates observations by randomly selecting a feature and then randomly selecting a split value between the minimum and maximum values of that feature.
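
The description does not say how isolation depth becomes an anomaly score. In the original Isolation Forest formulation (Liu, Ting and Zhou, 2008), the mean path length over all trees is normalized by c(n), the average path length of an unsuccessful binary search tree lookup over n samples, and mapped through 2^(-depth/c(n)). The sketch below implements that formula only; whether this package scores the same way is an assumption, and the function names are hypothetical.

```go
package main

import "math"

// eulerGamma approximates the Euler-Mascheroni constant used to estimate
// harmonic numbers.
const eulerGamma = 0.5772156649

// avgPathLength returns c(n) = 2*H(n-1) - 2*(n-1)/n, with H(i) estimated
// as ln(i) + eulerGamma. The small cases follow the usual convention.
func avgPathLength(n int) float64 {
	if n <= 1 {
		return 0
	}
	if n == 2 {
		return 1
	}
	h := math.Log(float64(n-1)) + eulerGamma
	return 2*h - 2*float64(n-1)/float64(n)
}

// isolationScore maps a mean path length to a score in (0, 1]. Values near 1
// mean the sample was isolated after few splits; values near 0.5 mean its
// depth was about average.
func isolationScore(meanDepth float64, n int) float64 {
	c := avgPathLength(n)
	if c == 0 {
		return 0 // assumption: the score is undefined for n <= 1, so return 0
	}
	return math.Pow(2, -meanDepth/c)
}
```

With the default of 256 samples per tree, a sample whose mean depth equals c(256) scores exactly 0.5, and shallower samples score higher.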

func NewIForest

func NewIForest(opts *IForestOptions) *IForest

NewIForest creates a new Isolation Forest detector

func (*IForest) DecisionFunction

func (f *IForest) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction returns the anomaly scores for the input samples

func (*IForest) Fit

func (f *IForest) Fit(X Matrix, y Vector) error

Fit trains the Isolation Forest on the input data

func (*IForest) Predict

func (f *IForest) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the input data

func (*IForest) PredictProba

func (f *IForest) PredictProba(X Matrix, method string) (Matrix, error)

PredictProba returns probability estimates for the input data

type IForestOptions

type IForestOptions struct {
	NEstimators   int     // Number of trees (default: 100)
	MaxSamples    int     // Samples per tree (default: 256)
	MaxFeatures   int     // Features per tree (default: all)
	Bootstrap     bool    // Use bootstrap sampling (default: false)
	RandomState   int64   // Random seed (default: 0 = random)
	Contamination float64 // Contamination rate (default: 0.1)
}

IForestOptions holds configuration options for IForest

func DefaultIForestOptions

func DefaultIForestOptions() *IForestOptions

DefaultIForestOptions returns default options for IForest

type KNN

type KNN struct {
	*BaseDetector

	// NNeighbors is the number of neighbors to use
	NNeighbors int

	// Method defines how to calculate the outlier score:
	// "largest": use the distance to the kth neighbor
	// "mean": use the average of all k neighbors distances
	// "median": use the median of the distances to k neighbors
	Method string

	// Metric defines the distance metric: "euclidean", "manhattan", "minkowski"
	Metric string

	// P is the parameter for Minkowski distance
	P float64

	// Training data
	X_ Matrix
}

KNN implements k-Nearest Neighbors based outlier detection. For an observation, its distance to its kth nearest neighbor (or, depending on Method, the mean or median distance to its k nearest neighbors) is used as the outlying score.
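
A minimal sketch of this scoring rule, assuming Euclidean distance and a brute-force search, is shown below for the "largest" and "mean" methods. The name knnScoreSketch is hypothetical, all rows are assumed to have the same length, and query points are scored against the training set without excluding exact matches, which the real detector may handle differently for training data.

```go
package main

import (
	"math"
	"sort"
)

// Matrix and Vector mirror the package's types for this standalone sketch.
type Matrix [][]float64
type Vector []float64

// euclidean returns the straight-line distance between two equal-length rows.
func euclidean(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// knnScoreSketch scores each query row by its distance to the k nearest
// training rows: the kth distance for "largest" (and any unknown method),
// or the average of the k distances for "mean".
func knnScoreSketch(train, query Matrix, k int, method string) Vector {
	scores := make(Vector, len(query))
	if k < 1 || len(train) == 0 {
		return scores
	}
	if k > len(train) {
		k = len(train)
	}
	for i, q := range query {
		dists := make([]float64, len(train))
		for j, t := range train {
			dists[j] = euclidean(q, t)
		}
		sort.Float64s(dists)
		nearest := dists[:k]
		switch method {
		case "mean":
			var sum float64
			for _, d := range nearest {
				sum += d
			}
			scores[i] = sum / float64(k)
		default:
			scores[i] = nearest[k-1]
		}
	}
	return scores
}
```

Sorting every distance costs O(n log n) per query; a partial selection or a spatial index would be the usual choice for larger data sets.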

func NewKNN

func NewKNN(opts *KNNOptions) *KNN

NewKNN creates a new KNN detector

func (*KNN) DecisionFunction

func (k *KNN) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction returns the anomaly scores for the input samples

func (*KNN) Fit

func (k *KNN) Fit(X Matrix, y Vector) error

Fit trains the KNN detector on the input data

func (*KNN) Predict

func (k *KNN) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the input data

func (*KNN) PredictProba

func (k *KNN) PredictProba(X Matrix, method string) (Matrix, error)

PredictProba returns probability estimates for the input data

type KNNOptions

type KNNOptions struct {
	NNeighbors    int     // Number of neighbors (default: 5)
	Method        string  // "largest", "mean", or "median" (default: "largest")
	Metric        string  // Distance metric (default: "euclidean")
	P             float64 // Minkowski p parameter (default: 2)
	Contamination float64 // Contamination rate (default: 0.1)
}

KNNOptions holds configuration options for KNN

func DefaultKNNOptions

func DefaultKNNOptions() *KNNOptions

DefaultKNNOptions returns default options for KNN

type LODA

type LODA struct {
	BaseDetector
	// contains filtered or unexported fields
}

LODA implements the Lightweight On-line Detector of Anomalies. LODA is an ensemble method that combines sparse random projections with one-dimensional histograms.

func NewLODA

func NewLODA(opts *LODAOptions) *LODA

NewLODA creates a new LODA detector with the given options.

func (*LODA) DecisionFunction

func (l *LODA) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction computes the anomaly score for new samples.

func (*LODA) Fit

func (l *LODA) Fit(X Matrix) error

Fit trains the LODA detector on the given data.

func (*LODA) Predict

func (l *LODA) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the given samples.

func (*LODA) PredictProba

func (l *LODA) PredictProba(X Matrix) (Matrix, error)

PredictProba returns the probability of each sample being an outlier.

type LODAOptions

type LODAOptions struct {
	// Contamination is the proportion of outliers in the data set (default: 0.1)
	Contamination float64
	// NBins is the number of histogram bins (default: 10). Set AutoBins for automatic selection.
	NBins int
	// AutoBins determines whether to use automatic bin selection (default: false)
	AutoBins bool
	// NRandomCuts is the number of random cuts/projections (default: 100)
	NRandomCuts int
	// RandomState for reproducibility (default: nil for random)
	RandomState *rand.Rand
}

LODAOptions contains options for the LODA detector.

func DefaultLODAOptions

func DefaultLODAOptions() *LODAOptions

DefaultLODAOptions returns default options for LODA.

type LOF

type LOF struct {
	*BaseDetector

	// NNeighbors is the number of neighbors to use
	NNeighbors int

	// Metric defines the distance metric: "euclidean", "manhattan", "minkowski"
	Metric string

	// P is the parameter for Minkowski distance
	P float64

	// Training data
	X_ Matrix
	// contains filtered or unexported fields
}

LOF implements the Local Outlier Factor algorithm. It measures how much the local density of a given sample deviates from the local densities of its neighbors.

func NewLOF

func NewLOF(opts *LOFOptions) *LOF

NewLOF creates a new LOF detector

func (*LOF) DecisionFunction

func (l *LOF) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction returns the anomaly scores for the input samples

func (*LOF) Fit

func (l *LOF) Fit(X Matrix, y Vector) error

Fit trains the LOF detector on the input data

func (*LOF) Predict

func (l *LOF) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the input data

func (*LOF) PredictProba

func (l *LOF) PredictProba(X Matrix, method string) (Matrix, error)

PredictProba returns probability estimates for the input data

type LOFOptions

type LOFOptions struct {
	NNeighbors    int     // Number of neighbors (default: 20)
	Metric        string  // Distance metric (default: "euclidean")
	P             float64 // Minkowski p parameter (default: 2)
	Contamination float64 // Contamination rate (default: 0.1)
}

LOFOptions holds configuration options for LOF

func DefaultLOFOptions

func DefaultLOFOptions() *LOFOptions

DefaultLOFOptions returns default options for LOF

type MAD

type MAD struct {
	BaseDetector
	// contains filtered or unexported fields
}

MAD implements Median Absolute Deviation for univariate outlier detection. Each data point is scored by its distance from the median, scaled by the median absolute deviation (the modified z-score).
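
The modified z-score from the cited reference (Iglewicz and Hoaglin) is 0.6745 * |x - median| / MAD, where MAD is the median of the absolute deviations from the median. The sketch below computes it for a univariate sample. The function names and the handling of a zero MAD are assumptions, not taken from this package.

```go
package main

import (
	"math"
	"sort"
)

// Vector mirrors the package's Vector type for this standalone sketch.
type Vector []float64

// median returns the middle value of v, averaging the two middle values
// when the length is even. An empty input yields 0.
func median(v Vector) float64 {
	s := append(Vector(nil), v...)
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return 0
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// modifiedZScores returns 0.6745 * |x_i - median| / MAD for every value.
func modifiedZScores(x Vector) Vector {
	med := median(x)
	dev := make(Vector, len(x))
	for i, v := range x {
		dev[i] = math.Abs(v - med)
	}
	mad := median(dev)
	scores := make(Vector, len(x))
	if mad == 0 {
		return scores // assumption: report zero scores when MAD is 0
	}
	for i := range x {
		scores[i] = 0.6745 * dev[i] / mad
	}
	return scores
}
```

For the sample {1, 2, 3, 4, 100}, the median is 3 and the MAD is 1, so only the value 100 exceeds the default Threshold of 3.5 listed in MADOptions.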

func NewMAD

func NewMAD(opts *MADOptions) *MAD

NewMAD creates a new MAD detector with the given options.

func (*MAD) DecisionFunction

func (m *MAD) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction computes the anomaly score for new samples.

func (*MAD) Fit

func (m *MAD) Fit(X Matrix) error

Fit trains the MAD detector on the given data.

func (*MAD) Predict

func (m *MAD) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the given samples.

func (*MAD) PredictProba

func (m *MAD) PredictProba(X Matrix) (Matrix, error)

PredictProba returns the probability of each sample being an outlier.

type MADOptions

type MADOptions struct {
	// Threshold is the modified z-score threshold (default: 3.5)
	Threshold float64
	// Contamination is the proportion of outliers in the data set (default: 0.1)
	Contamination float64
}

MADOptions contains options for the MAD detector.

func DefaultMADOptions

func DefaultMADOptions() *MADOptions

DefaultMADOptions returns default options for MAD.

type Matrix

type Matrix [][]float64

Matrix represents a 2D slice of float64 values (n_samples x n_features)

func TransposeMatrix

func TransposeMatrix(X Matrix) Matrix

TransposeMatrix transposes a matrix
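
For a slice-of-slices matrix, transposition turns each column into a row. The sketch below shows the idea, assuming a rectangular input; transposeSketch is a hypothetical name, and the result for an empty matrix is an assumption.

```go
package main

// Matrix mirrors the package's Matrix type for this standalone sketch.
type Matrix [][]float64

// transposeSketch returns a new matrix whose row j holds column j of X.
// X is assumed to be rectangular; the input is not modified.
func transposeSketch(X Matrix) Matrix {
	if len(X) == 0 {
		return Matrix{} // assumption: empty input yields an empty matrix
	}
	out := make(Matrix, len(X[0]))
	for j := range out {
		out[j] = make([]float64, len(X))
		for i := range X {
			out[j][i] = X[i][j]
		}
	}
	return out
}
```

Transposing an (n_samples x n_features) matrix gives per-feature rows, which is convenient for detectors that process one feature at a time.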

type PCA

type PCA struct {
	*BaseDetector

	// NComponents is the number of principal components to keep
	NComponents int

	// NSelectedComponents is the number of components used for scoring
	// If 0, uses all components
	NSelectedComponents int

	// Weighted indicates whether to weight components by explained variance
	Weighted bool

	// Standardization indicates whether to standardize data
	Standardization bool
	// contains filtered or unexported fields
}

PCA implements Principal Component Analysis based outlier detection. Outlier scores are computed as the sum of weighted distances from samples to the principal component hyperplanes.

func NewPCA

func NewPCA(opts *PCAOptions) *PCA

NewPCA creates a new PCA detector

func (*PCA) DecisionFunction

func (p *PCA) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction returns the anomaly scores for the input samples

func (*PCA) Fit

func (p *PCA) Fit(X Matrix, y Vector) error

Fit trains the PCA detector on the input data

func (*PCA) Predict

func (p *PCA) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the input data

func (*PCA) PredictProba

func (p *PCA) PredictProba(X Matrix, method string) (Matrix, error)

PredictProba returns probability estimates for the input data

type PCAOptions

type PCAOptions struct {
	NComponents         int     // Number of components (default: 0 = all)
	NSelectedComponents int     // Components for scoring (default: 0 = all)
	Weighted            bool    // Weight by variance (default: true)
	Standardization     bool    // Standardize data (default: true)
	Contamination       float64 // Contamination rate (default: 0.1)
}

PCAOptions holds configuration options for PCA

func DefaultPCAOptions

func DefaultPCAOptions() *PCAOptions

DefaultPCAOptions returns default options for PCA

type SOD

type SOD struct {
	BaseDetector
	// contains filtered or unexported fields
}

SOD implements Subspace Outlier Detection. SOD explores the axis-parallel subspace spanned by the data object's neighbors and determines how much the object deviates from the neighbors.

func NewSOD

func NewSOD(opts *SODOptions) *SOD

NewSOD creates a new SOD detector with the given options.

func (*SOD) DecisionFunction

func (s *SOD) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction computes the anomaly score for new samples.

func (*SOD) Fit

func (s *SOD) Fit(X Matrix) error

Fit trains the SOD detector on the given data.

func (*SOD) Predict

func (s *SOD) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the given samples.

func (*SOD) PredictProba

func (s *SOD) PredictProba(X Matrix) (Matrix, error)

PredictProba returns the probability of each sample being an outlier.

type SODOptions

type SODOptions struct {
	// Contamination is the proportion of outliers in the data set (default: 0.1)
	Contamination float64
	// NNeighbors is the number of neighbors for kNN (default: 20)
	NNeighbors int
	// RefSet is the number of shared nearest neighbors for reference set (default: 10)
	RefSet int
	// Alpha is the lower limit for selecting subspace (default: 0.8)
	Alpha float64
}

SODOptions contains options for the SOD detector.

func DefaultSODOptions

func DefaultSODOptions() *SODOptions

DefaultSODOptions returns default options for SOD.

type SOS

type SOS struct {
	BaseDetector
	// contains filtered or unexported fields
}

SOS implements Stochastic Outlier Selection. SOS uses the concept of affinity to quantify the relationship between data points. A point is an outlier when all other points have insufficient affinity with it.

func NewSOS

func NewSOS(opts *SOSOptions) *SOS

NewSOS creates a new SOS detector with the given options.

func (*SOS) DecisionFunction

func (s *SOS) DecisionFunction(X Matrix) (Vector, error)

DecisionFunction computes the anomaly score for new samples.

func (*SOS) Fit

func (s *SOS) Fit(X Matrix) error

Fit trains the SOS detector on the given data.

func (*SOS) Predict

func (s *SOS) Predict(X Matrix) ([]int, error)

Predict returns binary labels for the given samples.

func (*SOS) PredictProba

func (s *SOS) PredictProba(X Matrix) (Matrix, error)

PredictProba returns the probability of each sample being an outlier.

type SOSOptions

type SOSOptions struct {
	// Contamination is the proportion of outliers in the data set (default: 0.1)
	Contamination float64
	// Perplexity is a smooth measure of effective number of neighbors (default: 4.5)
	Perplexity float64
	// Eps is the tolerance threshold for binary search (default: 1e-5)
	Eps float64
}

SOSOptions contains options for the SOS detector.

func DefaultSOSOptions

func DefaultSOSOptions() *SOSOptions

DefaultSOSOptions returns default options for SOS.

type Vector

type Vector []float64

Vector represents a 1D slice of float64 values

func GetColumn

func GetColumn(X Matrix, col int) Vector

GetColumn extracts a column from a matrix

func InvertOrder

func InvertOrder(scores Vector) Vector

InvertOrder inverts the order of scores, so that the smallest becomes the largest. This is useful for combining detectors with different score orderings.
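
The exact transformation is not documented here. One way to reverse the ordering while keeping scores non-negative is to subtract each score from the maximum, which the sketch below assumes. The name invertOrderSketch is hypothetical, and the real InvertOrder may negate the scores instead.

```go
package main

// Vector mirrors the package's Vector type for this standalone sketch.
type Vector []float64

// invertOrderSketch returns max(scores) - s for every score s, so the
// largest input maps to 0 and the smallest maps to the largest output.
func invertOrderSketch(scores Vector) Vector {
	out := make(Vector, len(scores))
	if len(scores) == 0 {
		return out
	}
	hi := scores[0]
	for _, s := range scores {
		if s > hi {
			hi = s
		}
	}
	for i, s := range scores {
		out[i] = hi - s
	}
	return out
}
```

This matters for detectors such as ABOD, where lower raw values indicate outliers, if their output is to be combined with detectors where higher values do.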
