Documentation ¶
Overview ¶
Package rpca implements anomaly detection using Robust Principal Component Analysis (http://techblog.netflix.com/2015/02/rad-outlier-detection-on-big-data.html). It is a port of the RPCA implementation provided by Netflix as part of their Surus project (https://github.com/Netflix/Surus), and draws on pieces of Netflix's RAD implementations written in R, C++, Java, and JavaScript.
Index ¶
- Constants
- func AutoDiff(active bool) func(*rpcaConfig) error
- func ForceDiff(active bool) func(*rpcaConfig) error
- func Frequency(freq int) func(*rpcaConfig) error
- func LPenalty(penalty float64) func(*rpcaConfig) error
- func SPenalty(penalty float64) func(*rpcaConfig) error
- func Scale(active bool) func(*rpcaConfig) error
- func Verbose(active bool) func(*rpcaConfig) error
- type Anomalies
Constants ¶
const MAX_ITERS int = 1000
The maximum number of iterations before we give up trying to converge.
Functions ¶
func AutoDiff ¶
AutoDiff controls whether the given time series is checked for a significant global trend that should be removed before anomaly detection. Trend detection is done with the Augmented Dickey-Fuller test. Note that auto-differencing changes the nature of the detected anomalies. If the time series is not detrended, a lasting mean shift (for example, a large, sustained increase) causes a run of consecutive points after the shift to be identified as anomalous. If the time series is detrended, only the single point that marks the beginning of the shift is identified as anomalous.
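The effect of differencing on a mean shift can be seen with a small sketch. The diff helper below is a hypothetical first-order differencing function written for illustration; it is an assumption about what detrending amounts to, not the package's own code.

```go
package main

import "fmt"

// diff returns the first-order differences of xs: out[i] = xs[i+1] - xs[i].
// Hypothetical helper for illustration; the package's detrending step may differ.
func diff(xs []float64) []float64 {
	if len(xs) < 2 {
		return nil
	}
	out := make([]float64, len(xs)-1)
	for i := range out {
		out[i] = xs[i+1] - xs[i]
	}
	return out
}

func main() {
	// A flat series with a sustained jump from 10 to 50 at index 4.
	series := []float64{10, 10, 10, 10, 50, 50, 50, 50}

	// In the raw series, every point from index 4 onward sits far from the
	// earlier level. After differencing, only the jump itself stands out.
	fmt.Println(diff(series)) // prints [0 0 0 40 0 0 0]
}
```

In the differenced series the sustained shift collapses to a single spike, which is why only the first point of the shift would be flagged.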
func ForceDiff ¶
ForceDiff, if true, skips the Augmented Dickey-Fuller test and always differences the given time series.
func Frequency ¶
Frequency informs the algorithm of the major frequency of the time series to use for analysis. For example, if you have 56 points of daily measurements, the major frequency is likely 7, which would capture the weekly trend. Note that due to the nature of the algorithm, the length of the provided time series must be divisible by the frequency.
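Because the series length must be divisible by the frequency, callers may need to trim their data first. The sketch below shows one way to do that by dropping the oldest points. trimToFrequency is a hypothetical helper, not part of this package, and dropping from the front rather than the back is an assumption about what the caller wants.

```go
package main

import (
	"errors"
	"fmt"
)

// trimToFrequency drops the oldest points so that the length of the returned
// slice is a multiple of freq. Hypothetical helper, not part of package rpca.
func trimToFrequency(series []float64, freq int) ([]float64, error) {
	if freq <= 0 {
		return nil, errors.New("frequency must be positive")
	}
	extra := len(series) % freq
	return series[extra:], nil
}

func main() {
	series := make([]float64, 60) // for example, 60 daily measurements
	trimmed, err := trimToFrequency(series, 7)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(trimmed)) // prints 56
}
```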
func LPenalty ¶
LPenalty sets a scalar for the amount of thresholding to use when determining the low rank approximation of the given time series. The default values are chosen to correspond to the smart thresholding values described in Zhou's Stable Principal Component Pursuit.
func SPenalty ¶
SPenalty sets a scalar for the amount of thresholding to use when determining the separation between noise and sparse outliers. The default values are chosen to correspond to the smart thresholding values described in Zhou's Stable Principal Component Pursuit.
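To give a feel for what a thresholding penalty does, here is a minimal sketch of the soft-thresholding (shrinkage) operator that RPCA implementations commonly use to separate sparse outliers from noise. Whether this package applies exactly this operator, and how the penalty is scaled internally, are assumptions. softThreshold is a hypothetical name.

```go
package main

import "fmt"

// softThreshold shrinks x toward zero by penalty and returns zero when
// |x| <= penalty. Illustrative only; not taken from package rpca.
func softThreshold(x, penalty float64) float64 {
	switch {
	case x > penalty:
		return x - penalty
	case x < -penalty:
		return x + penalty
	default:
		return 0
	}
}

func main() {
	// Small residuals are treated as noise and zeroed out;
	// large residuals survive (shrunk) as candidate outliers.
	residuals := []float64{0.3, -0.2, 4.0, -5.5}
	for _, r := range residuals {
		fmt.Println(softThreshold(r, 1.0))
	}
}
```

Under this sketch, a larger penalty zeroes out more points, so fewer are reported as anomalous.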
Types ¶
type Anomalies ¶
type Anomalies struct {
	// Positions is a slice of booleans indicating which values in the
	// provided time series were anomalous.
	Positions []bool

	// Values is a slice of floats indicating exactly how anomalous each point in
	// the provided time series was. Points that were not anomalous have a value
	// of zero. Points that were anomalously low have negative values, while
	// points that were anomalously high have positive values.
	Values []float64

	// Part of the RPCA process requires normalizing the given time series by
	// subtracting the mean and dividing by the standard deviation (Z scoring)
	// before detecting anomalies. The anomalousness of each point is computed in
	// this Z-scored space before being transformed back into the domain of the
	// given time series. Sometimes, it's useful to have the normalized values,
	// for example, when comparing anomalies across time series.
	NormedValues []float64
}
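The Z scoring mentioned in the NormedValues comment can be sketched as follows. zScore is a hypothetical helper, and the use of the population standard deviation (dividing by n rather than n-1) is an assumption; the package may compute it differently.

```go
package main

import (
	"fmt"
	"math"
)

// zScore returns (x - mean) / std for each x, together with the mean and the
// population standard deviation used. Hypothetical helper for illustration.
func zScore(xs []float64) (normed []float64, mean, std float64) {
	if len(xs) == 0 {
		return nil, 0, 0
	}
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	for _, x := range xs {
		std += (x - mean) * (x - mean)
	}
	std = math.Sqrt(std / float64(len(xs)))

	normed = make([]float64, len(xs))
	if std == 0 {
		return normed, mean, std // constant series: all scores are zero
	}
	for i, x := range xs {
		normed[i] = (x - mean) / std
	}
	return normed, mean, std
}

func main() {
	normed, mean, std := zScore([]float64{2, 4, 4, 4, 5, 5, 7, 9})
	fmt.Println(mean, std) // prints 5 2

	// Multiplying by std and adding the mean maps a normalized value
	// back into the domain of the original series.
	fmt.Println(normed[7]*std + mean) // prints 9
}
```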
func FindAnomalies ¶
FindAnomalies is the primary entry point of this package. It takes a slice of floats and any number of options. Options are passed as function calls because this package uses the functional options pattern to make the API easier to use (more on functional options here: http://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis). All options have default values; to override a default, pass the corresponding option like so:
anoms := rpca.FindAnomalies(series, rpca.Frequency(7), rpca.AutoDiff(true))
The interface is designed to match that of Netflix's anomaly detection R package.
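For readers unfamiliar with functional options, the following self-contained sketch shows how the pattern works in general. The config type, the lowercase option constructors, newConfig, and the default values are all hypothetical stand-ins chosen for illustration; only the option shape, a function taking a pointer to a config and returning an error, mirrors the signatures listed in the index.

```go
package main

import (
	"errors"
	"fmt"
)

// config is a hypothetical stand-in for the package's unexported rpcaConfig.
type config struct {
	frequency int
	autoDiff  bool
}

// option mirrors the shape of the package's options: func(*rpcaConfig) error.
type option func(*config) error

// frequency returns an option that overrides the default frequency.
func frequency(freq int) option {
	return func(c *config) error {
		if freq <= 0 {
			return errors.New("frequency must be positive")
		}
		c.frequency = freq
		return nil
	}
}

// autoDiff returns an option that toggles automatic differencing.
func autoDiff(active bool) option {
	return func(c *config) error {
		c.autoDiff = active
		return nil
	}
}

// newConfig starts from defaults and applies each option in order,
// stopping at the first option that reports an error.
func newConfig(opts ...option) (config, error) {
	c := config{frequency: 7, autoDiff: true} // illustrative defaults only
	for _, opt := range opts {
		if err := opt(&c); err != nil {
			return config{}, err
		}
	}
	return c, nil
}

func main() {
	c, err := newConfig(frequency(24), autoDiff(false))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", c) // prints {frequency:24 autoDiff:false}
}
```

The benefit of this design is that callers only mention the settings they want to change, and new options can be added later without breaking existing call sites.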