v0.0.0-...-e540a08

This package is not in the latest version of its module.

Published: Feb 12, 2021 License: BSD-3-Clause Imports: 17 Imported by: 12



Gryffin (beta)

Gryffin is a large scale web security scanning platform. It is not yet another scanner. It was written to solve two specific problems with existing scanners: coverage and scale.

Better coverage translates to fewer false negatives. Inherent scalability translates to the capability of scanning, and supporting, a large elastic application infrastructure. Simply put: the ability to scan 1,000 applications today and 100,000 applications tomorrow through straightforward horizontal scaling.


Coverage

Coverage has two dimensions: one during crawling and the other during fuzzing. In the crawl phase, coverage means finding as much of the application footprint as possible. In the scan phase, or while fuzzing, it means testing each part of the application in depth against the applied set of vulnerabilities.

Crawl Coverage

Today, a large number of web applications are template-driven, meaning the same code or path generates millions of URLs. A security scanner needs only one representative of the millions of URLs generated by the same code or path, and Gryffin's crawler does just that.

Page Deduplication

At the heart of Gryffin is a deduplication engine that compares a new page with already seen pages. If the HTML structure of the new page is similar to those already seen, it is classified as a duplicate and not crawled further.

DOM Rendering and Navigation

A large number of applications today are rich applications. They are heavily driven by client-side JavaScript. In order to discover links and code paths in such applications, Gryffin's crawler uses PhantomJS for DOM rendering and navigation.

Scan Coverage

As Gryffin is a scanning platform, not a scanner, it does not have its own fuzzer modules, even for fuzzing common web vulnerabilities like XSS and SQL Injection.

It's not wise to reinvent the wheel where you do not have to. At production scale at Yahoo, Gryffin uses open-source and custom fuzzers. Some of these custom fuzzers might be open sourced in the future, and might or might not become part of the Gryffin repository.

For demonstration purposes, Gryffin comes integrated with sqlmap and arachni. It does not endorse them or any other scanner in particular.

The philosophy is to improve scan coverage by being able to fuzz for just what you need.


Scale

While Gryffin is available as a standalone package, it is primarily built for scale.

Gryffin is built on the publisher-subscriber model. Each component is either a publisher, or a subscriber, or both. This allows Gryffin to scale horizontally by simply adding more subscriber or publisher nodes.

Operating Gryffin

Prerequisites:

  1. Go, go1.13 or later
  2. PhantomJS, v2
  3. Sqlmap (for fuzzing SQLi)
  4. Arachni (for fuzzing XSS and other web vulnerabilities)
  5. NSQ,
    • running lookupd on ports 4160 and 4161
    • running nsqd on ports 4150 and 4151
    • with --max-msg-size=5000000
  6. Kibana and Elasticsearch, for dashboarding
go get -u



TODO

  1. Mobile browser user agent
  2. Preconfigured Docker images
  3. Redis for sharing states across machines
  4. Instructions to run Gryffin (distributed or standalone)
  5. Documentation for html-distance
  6. Implement a JSON-serializable cookie jar
  7. Identify duplicate URL patterns based on simhash results

Talks and Slides



Code licensed under the BSD-style license. See LICENSE file for terms.



Package gryffin is an application scanning infrastructure.







func GenRandomID

func GenRandomID() string

GenRandomID generates a random ID.

func SetLogWriter

func SetLogWriter(w io.Writer)

SetLogWriter sets the log writer.

func SetMemoryStore

func SetMemoryStore(m *GryffinStore)

SetMemoryStore sets the package internal global variable for the memory store.


type Fingerprint

type Fingerprint struct {
	Origin             uint64 // origin
	URL                uint64 // origin + path
	Request            uint64 // method, url, body
	RequestFull        uint64 // request + header
	ResponseSimilarity uint64
}

Fingerprint contains all the different types of hashes for the Scan (request and response).

type Fuzzer

type Fuzzer interface {
	Fuzz(*Scan) (int, error)
}

Fuzzer runs the fuzzing.

type GryffinStore

type GryffinStore struct {
	Oracles map[string]*distance.Oracle
	Hashes  map[string]bool
	Hits    map[string]int
	Mu      sync.RWMutex
	// contains filtered or unexported fields
}

GryffinStore includes data and handles for Gryffin message processing.

func NewGryffinStore

func NewGryffinStore() *GryffinStore

func NewSharedGryffinStore

func NewSharedGryffinStore() *GryffinStore

func (*GryffinStore) GetRcvChan

func (s *GryffinStore) GetRcvChan() chan []byte

func (*GryffinStore) GetSndChan

func (s *GryffinStore) GetSndChan() chan []byte

func (*GryffinStore) Hit

func (s *GryffinStore) Hit(prefix string) bool

func (*GryffinStore) See

func (s *GryffinStore) See(prefix string, kind string, v uint64)

func (*GryffinStore) Seen

func (s *GryffinStore) Seen(prefix string, kind string, v uint64, r uint8) bool

type HTTPDoer

type HTTPDoer interface {
	Do(*http.Request) (*http.Response, error)
}

HTTPDoer is an interface implemented by http.Client.

type Job

type Job struct {
	ID             string
	DomainsAllowed []string // Domains that we would crawl
}

Job stores the job id and config (if any).

type LogMessage

type LogMessage struct {
	Service string
	Msg     string
	Method  string
	Url     string
	JobID   string
}

LogMessage contains the data fields to be marshalled as JSON for forwarding to the log processor.

type PublishMessage

type PublishMessage struct {
	F string // function, i.e. See or Seen
	T string // type (kind), i.e. oracle or hash
	K string // key
	V string // value
}

PublishMessage is the data in the messages handled by Gryffin.

type Renderer

type Renderer interface {
	GetRequestBody() <-chan *Scan
	GetLinks() <-chan *Scan
}

Renderer is an interface for implementing an HTML DOM renderer and obtaining the response body and links. Since DOM construction is very likely to be asynchronous, it returns channels on which to receive the response and links.

type Scan

type Scan struct {
	// ID is a random ID to identify this particular scan.
	// if ID is empty, this scan should not be performed (but record for rate limiting).
	ID           string
	Job          *Job
	Request      *http.Request
	RequestBody  string
	Response     *http.Response
	ResponseBody string
	Cookies      []*http.Cookie
	Fingerprint  Fingerprint
	HitCount     int
}

A Scan consists of the job, target, request and response.

func NewScan

func NewScan(method, url, post string) *Scan

NewScan creates a scan.

func NewScanFromJson

func NewScanFromJson(b []byte) *Scan

NewScanFromJson creates a Scan from the passed JSON blob.

func (*Scan) CrawlAsync

func (s *Scan) CrawlAsync(r Renderer)

CrawlAsync runs the crawl asynchronously.

func (*Scan) Error

func (s *Scan) Error(service string, err error)

Error logs the error for the given service.

func (*Scan) Fuzz

func (s *Scan) Fuzz(fuzzer Fuzzer) (int, error)

Fuzz runs the vulnerability fuzzer, returning the issue count.

func (*Scan) IsDuplicatedPage

func (s *Scan) IsDuplicatedPage() bool

IsDuplicatedPage checks if we should proceed based on the Response.

func (*Scan) IsScanAllowed

func (s *Scan) IsScanAllowed() bool

IsScanAllowed checks if the request URL is allowed per Job.DomainsAllowed.

func (*Scan) Json

func (s *Scan) Json() []byte

Json serializes Scan as JSON.

func (*Scan) Log

func (s *Scan) Log(v interface{})

Log encodes the given argument as JSON and writes it to the log writer.

func (*Scan) Logf

func (s *Scan) Logf(format string, a ...interface{})

Logf logs using the given format string.

func (*Scan) Logm

func (s *Scan) Logm(service, msg string)

Logm sends a LogMessage to Log processor.

func (*Scan) Logmf

func (s *Scan) Logmf(service, format string, a ...interface{})

Logmf logs the message for the given service.

func (*Scan) MergeRequest

func (s *Scan) MergeRequest(req *http.Request)

MergeRequest merges the request field in the scan with the existing one.

func (*Scan) Poke

func (s *Scan) Poke(client HTTPDoer) (err error)

Poke checks if the target is up.

func (*Scan) RateLimit

func (s *Scan) RateLimit() int

RateLimit checks whether we are under the allowed rate for crawling the site. It returns the delay to wait before checking ReadyToCrawl again.

func (*Scan) ReadResponseBody

func (s *Scan) ReadResponseBody()

ReadResponseBody reads Response.Body and stores it in ResponseBody. It also reconstructs the io.ReadCloser stream so the body can be read again.

func (*Scan) ShouldCrawl

func (s *Scan) ShouldCrawl() bool

ShouldCrawl checks if the links should be queued for next crawl.

func (*Scan) Spawn

func (s *Scan) Spawn() *Scan

Spawn spawns a new scan object with a different ID.

func (*Scan) UpdateFingerprint

func (s *Scan) UpdateFingerprint()

UpdateFingerprint updates the fingerprint field.

type SerializableRequest

type SerializableRequest struct {
	Cancel string
}

SerializableRequest is a Scan extended with a serializable request field.

type SerializableResponse

type SerializableResponse struct {
	Request *SerializableRequest
}

SerializableResponse is a Scan extended with serializable response field.

type SerializableScan

type SerializableScan struct {
	Request  *SerializableRequest
	Response *SerializableResponse
}

SerializableScan is a Scan extended with serializable request and response fields.


Path Synopsis
Package data provides an interface for common data store operations.
Package distance is a go library for computing the proximity of the HTML pages.
