goMidas
A Go implementation of the MIDAS C++ code by Siddharth Bhatia.
Installation
You can install and use the package by cloning this repository into your project folder:
git clone https://github.com/ritesh99rakesh/goMidas.git
Features
- Finds Anomalies in Dynamic/Time-Evolving Graphs
- Detects Microcluster Anomalies (suddenly arriving groups of
suspiciously similar edges, e.g. a DoS attack)
- Theoretical Guarantees on False Positive Probability
- Constant Memory (independent of graph size)
- Constant Update Time (real-time anomaly detection to minimize harm)
- Up to 48% more accurate and 644 times faster than state-of-the-art
approaches
For more details, please read the paper: MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams.
Siddharth Bhatia, Bryan Hooi, Minji Yoon, Kijung Shin, Christos Faloutsos. AAAI 2020.
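To make the constant-memory and constant-update-time claims concrete, here is a minimal Go sketch of the scoring idea described in the paper. It is not the goMidas implementation: the names `countMin` and `scoreEdges` and the hash constants are hypothetical, chosen only for illustration. Two fixed-size count-min sketches track edge counts for the current time tick and for the whole stream, and each edge is scored by how far its current count deviates from its historical average.

```go
package main

import "fmt"

// countMin is a fixed-size count-min sketch: rows x buckets counters,
// so memory does not grow with the number of nodes or edges.
type countMin struct {
	rows, buckets int
	counts        [][]float64
}

func newCountMin(rows, buckets int) *countMin {
	c := &countMin{rows: rows, buckets: buckets, counts: make([][]float64, rows)}
	for i := range c.counts {
		c.counts[i] = make([]float64, buckets)
	}
	return c
}

// bucket maps an edge to a column in the given row. The mixing constants
// are illustrative assumptions, not the hash used by goMidas.
func (c *countMin) bucket(row, src, dst int) int {
	h := uint64(src)*2654435761 + uint64(dst)*40503 + uint64(row)*97
	h ^= h >> 15
	return int(h % uint64(c.buckets))
}

// add increments one counter per row: constant work per edge.
func (c *countMin) add(src, dst int) {
	for r := 0; r < c.rows; r++ {
		c.counts[r][c.bucket(r, src, dst)]++
	}
}

// estimate returns the minimum counter across rows for the edge.
func (c *countMin) estimate(src, dst int) float64 {
	m := c.counts[0][c.bucket(0, src, dst)]
	for r := 1; r < c.rows; r++ {
		if v := c.counts[r][c.bucket(r, src, dst)]; v < m {
			m = v
		}
	}
	return m
}

// clear zeroes all counters.
func (c *countMin) clear() {
	for _, row := range c.counts {
		for j := range row {
			row[j] = 0
		}
	}
}

// scoreEdges scores each edge as (a - s/t)^2 * t^2 / (s * (t - 1)), where a is
// the edge count in the current tick, s the count over the whole stream, and
// t the current tick. Timestamps are assumed to start at 1 and be non-decreasing.
func scoreEdges(src, dst, times []int, rows, buckets int) []float64 {
	cur, total := newCountMin(rows, buckets), newCountMin(rows, buckets)
	scores := make([]float64, len(src))
	curT := 1
	for i := range src {
		if times[i] > curT { // new time tick: reset the per-tick counts
			cur.clear()
			curT = times[i]
		}
		cur.add(src[i], dst[i])
		total.add(src[i], dst[i])
		a, s, t := cur.estimate(src[i], dst[i]), total.estimate(src[i], dst[i]), float64(curT)
		if curT > 1 { // no history to compare against during the first tick
			d := a - s/t
			scores[i] = d * d * t * t / (s * (t - 1))
		}
	}
	return scores
}

func main() {
	src := []int{2, 2, 3, 3, 5, 5, 7, 11, 1, 2}
	dst := []int{3, 3, 4, 4, 9, 9, 73, 74, 75, 76}
	times := []int{1, 1, 2, 2, 2, 2, 2, 2, 2, 2}
	fmt.Println(scoreEdges(src, dst, times, 4, 769))
}
```

The two sketches hold `2 * rows * buckets` counters regardless of how many edges arrive, and each edge touches only `rows` counters per sketch.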
Use Cases
- Intrusion Detection
- Fake Ratings
- Financial Fraud
Usage
There are two ways to use goMidas:
- Use the Midas and MidasR functions in the package in this repository:
Example:
package main

import (
	"fmt"

	goMidas "github.com/ritesh99rakesh/goMidas"
)

func main() {
	src := []int{2, 2, 3, 3, 5, 5, 7, 11, 1, 2}
	dst := []int{3, 3, 4, 4, 9, 9, 73, 74, 75, 76}
	times := []int{1, 1, 2, 2, 2, 2, 2, 2, 2, 2}
	numRows := 4
	numBuckets := 769
	// get anomaly scores
	anomalyScore := goMidas.Midas(src, dst, times, numRows, numBuckets)
	fmt.Println(anomalyScore)
}
- Use the main.go provided in the example folder in this repository to run the MIDAS algorithm from the command line.
The file main.go only requires you to specify the data.csv file containing src, dst and timestamps.
You just have to run:
go run main.go -input <input-file> -<other-optional-arguments>
Complete details for the available arguments:
  -alpha float
        Alpha: Temporal Decay Factor. Default is 0.6 (default 0.6)
  -buckets int
        Number of buckets. Default is 769 (default 769)
  -input string
        Input File. (Required)
  -norelations
        To run Midas instead of Midas-R.
  -output string
        Output File. Default is scores.txt (default "scores.txt")
  -rows int
        Number of rows/hash functions. Default is 2 (default 2)
  -undirected
        If graph is undirected.
For more details, including how to compute the AUC of the anomaly scores, refer to the example
folder in this repository.
Datasets
- DARPA: Original Format, MIDAS format
- TwitterWorldCup2014
- TwitterSecurity
MIDAS in other Languages
- C++ by Siddharth Bhatia
- Python by Ritesh Kumar
- Rust by Scott Steele
- Ruby by Andrew Kane
Online Articles
- KDnuggets: Introducing MIDAS: A New Baseline for Anomaly Detection in Graphs
- Towards Data Science: Controlling Fake News using Graphs and Statistics
- Towards Data Science: Anomaly detection in dynamic graphs using MIDAS
- Towards AI: Anomaly Detection with MIDAS
Citation
If you use this code for your research, please consider citing our
paper.
@article{bhatia2019midas,
title={MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams},
author={Bhatia, Siddharth and Hooi, Bryan and Yoon, Minji and Shin, Kijung and Faloutsos, Christos},
journal={arXiv preprint arXiv:1911.04464},
year={2019}
}
Issues
If you find any problem with the code, please raise an issue.