README

ARCHIVED

Gryffin (beta)

Gryffin is a large scale web security scanning platform. It is not yet another scanner. It was written to solve two specific problems with existing scanners: coverage and scale.

Better coverage translates to fewer false negatives. Inherent scalability translates to the capability of scanning and supporting a large, elastic application infrastructure. Simply put, Gryffin can go from scanning 1,000 applications today to 100,000 applications tomorrow by straightforward horizontal scaling.

Coverage

Coverage has two dimensions - one during crawling and the other during fuzzing. In the crawl phase, coverage means discovering as much of the application footprint as possible. In the scan phase, while fuzzing, it means testing each part of the application in depth for the applied set of vulnerabilities.

Crawl Coverage

Today a large number of web applications are template-driven, meaning the same code or path generates millions of URLs. A security scanner needs only one representative URL out of the millions generated by the same code or path. Gryffin's crawler does just that.

Page Deduplication

At the heart of Gryffin is a deduplication engine that compares a new page with already seen pages. If the HTML structure of the new page is similar to those already seen, it is classified as a duplicate and not crawled further.
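
The comparison is based on a fingerprint of the page's HTML structure (see the html-distance package in this repository and the simhash-related TODO below). The following is a minimal, self-contained sketch of the idea rather than Gryffin's actual implementation: it folds the sequence of element tag names into a 64-bit simhash and treats pages whose fingerprints are within a small Hamming distance as duplicates. The feature choice and threshold are illustrative.

    package main

    import (
    	"fmt"
    	"hash/fnv"
    	"math/bits"
    	"strings"

    	"golang.org/x/net/html"
    )

    // structuralSimhash computes a 64-bit simhash over the sequence of element
    // tag names - a rough stand-in for an HTML structural fingerprint.
    func structuralSimhash(doc string) uint64 {
    	var vector [64]int
    	z := html.NewTokenizer(strings.NewReader(doc))
    	for {
    		tt := z.Next()
    		if tt == html.ErrorToken {
    			break // end of document (or parse error)
    		}
    		if tt != html.StartTagToken && tt != html.SelfClosingTagToken {
    			continue
    		}
    		name, _ := z.TagName()
    		h := fnv.New64a()
    		h.Write(name)
    		f := h.Sum64()
    		// Accumulate each feature hash bit-wise into the simhash vector.
    		for i := 0; i < 64; i++ {
    			if f&(1<<uint(i)) != 0 {
    				vector[i]++
    			} else {
    				vector[i]--
    			}
    		}
    	}
    	var fp uint64
    	for i := 0; i < 64; i++ {
    		if vector[i] > 0 {
    			fp |= 1 << uint(i)
    		}
    	}
    	return fp
    }

    // isDuplicate treats two pages as the same "template" when their
    // fingerprints differ in only a few bits (the threshold is illustrative).
    func isDuplicate(a, b uint64) bool {
    	return bits.OnesCount64(a^b) <= 2
    }

    func main() {
    	page1 := `<html><body><ul><li><a href="/item/1">one</a></li></ul></body></html>`
    	page2 := `<html><body><ul><li><a href="/item/2">two</a></li></ul></body></html>`
    	fmt.Println(isDuplicate(structuralSimhash(page1), structuralSimhash(page2))) // true
    }

Pages generated by the same template share the same tag structure even when their text and URLs differ, so only the first representative needs to be crawled further.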

DOM Rendering and Navigation

A large number of applications today are rich applications. They are heavily driven by client-side JavaScript. In order to discover links and code paths in such applications, Gryffin's crawler uses PhantomJS for DOM rendering and navigation.
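
A minimal sketch of driving a crawl through a DOM-capable renderer, loosely modeled on the standalone command under cmd/. The PhantomJSRenderer type and its Timeout field are assumptions for illustration; the Renderer interface, CrawlAsync, GetLinks, IsScanAllowed and ShouldCrawl are part of the package API documented below.

    package main

    import (
    	"github.com/yahoo/gryffin"
    	"github.com/yahoo/gryffin/renderer"
    )

    func main() {
    	scan := gryffin.NewScan("GET", "http://example.com/", "")

    	// Assumed PhantomJS-backed implementation of gryffin.Renderer;
    	// the type and field names here are illustrative.
    	r := &renderer.PhantomJSRenderer{Timeout: 10}

    	// Render the DOM and walk it asynchronously.
    	scan.CrawlAsync(r)

    	// Links discovered during DOM navigation arrive on a channel.
    	for link := range r.GetLinks() {
    		if link.IsScanAllowed() && link.ShouldCrawl() {
    			// Queue `link` for the next crawl iteration here.
    		}
    	}
    }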

Scan Coverage

As Gryffin is a scanning platform, not a scanner, it does not have its own fuzzer modules, even for fuzzing common web vulnerabilities like XSS and SQL Injection.

It is not wise to reinvent the wheel where you do not have to. At production scale at Yahoo, Gryffin uses open source and custom fuzzers. Some of these custom fuzzers may be open sourced in the future, and may or may not be part of the Gryffin repository.

For demonstration purposes, Gryffin comes integrated with sqlmap and arachni. It does not endorse them or any other scanner in particular.

The philosophy is to improve scan coverage by being able to fuzz for just what you need.
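
As a sketch of how a fuzzer plugs in: anything satisfying the Fuzzer interface (documented below) can be passed to Scan.Fuzz. The sqlmap.Fuzzer and arachni.Fuzzer type names below are assumptions based on the fuzzer/ directory layout; check those packages for the exact API.

    package main

    import (
    	"log"

    	"github.com/yahoo/gryffin"
    	"github.com/yahoo/gryffin/fuzzer/arachni"
    	"github.com/yahoo/gryffin/fuzzer/sqlmap"
    )

    func main() {
    	s := gryffin.NewScan("GET", "http://example.com/?id=1", "")

    	// Each wrapper shells out to the external tool and reports issues back.
    	for _, f := range []gryffin.Fuzzer{&sqlmap.Fuzzer{}, &arachni.Fuzzer{}} {
    		count, err := s.Fuzz(f) // returns the number of issues found
    		if err != nil {
    			log.Printf("fuzzer failed: %v", err)
    			continue
    		}
    		log.Printf("found %d issue(s)", count)
    	}
    }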

Scale

While Gryffin is available as a standalone package, it's primarily built for scale.

Gryffin is built on the publisher-subscriber model. Each component is either a publisher, or a subscriber, or both. This allows Gryffin to scale horizontally by simply adding more subscriber or publisher nodes.
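
For illustration, a publisher node and a subscriber node could exchange serialized scans over NSQ roughly as follows. This sketch uses the github.com/nsqio/go-nsq client with made-up topic and channel names; it is not Gryffin's actual internal wiring.

    package main

    import (
    	"log"

    	nsq "github.com/nsqio/go-nsq"
    	"github.com/yahoo/gryffin"
    )

    func main() {
    	cfg := nsq.NewConfig()

    	// Publisher side: serialize a scan and hand it to nsqd.
    	producer, err := nsq.NewProducer("127.0.0.1:4150", cfg)
    	if err != nil {
    		log.Fatal(err)
    	}
    	scan := gryffin.NewScan("GET", "http://example.com/", "")
    	if err := producer.Publish("seed", scan.Json()); err != nil {
    		log.Fatal(err)
    	}

    	// Subscriber side: another node picks the scan up and processes it.
    	consumer, err := nsq.NewConsumer("seed", "crawl", cfg)
    	if err != nil {
    		log.Fatal(err)
    	}
    	consumer.AddHandler(nsq.HandlerFunc(func(m *nsq.Message) error {
    		s := gryffin.NewScanFromJson(m.Body)
    		s.Logm("crawl", "received scan") // placeholder for real crawl work
    		return nil
    	}))
    	if err := consumer.ConnectToNSQLookupd("127.0.0.1:4161"); err != nil {
    		log.Fatal(err)
    	}
    	select {} // block forever; a real worker would handle shutdown
    }

Adding capacity is then a matter of starting more subscriber processes on the same topic.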

Operating Gryffin

Pre-requisites
  1. Go - go1.13 or later
  2. PhantomJS, v2
  3. Sqlmap (for fuzzing SQLi)
  4. Arachni (for fuzzing XSS and web vulnerabilities)
  5. NSQ:
    • nsqlookupd running on ports 4160 and 4161
    • nsqd running on ports 4150 and 4151
    • nsqd started with --max-msg-size=5000000
  6. Kibana and Elasticsearch, for dashboarding
Installation
go get -u github.com/yahoo/gryffin/...
Run

(WIP)

TODO

  1. Mobile browser user agent
  2. Preconfigured Docker images
  3. Redis for sharing state across machines
  4. Instructions to run Gryffin (distributed or standalone)
  5. Documentation for html-distance
  6. Implement a JSON-serializable cookie jar
  7. Identify duplicate URL patterns based on simhash results

Talks and Slides

Credits

License

Code licensed under the BSD-style license. See LICENSE file for terms.

Documentation

Overview

Package gryffin is an application scanning infrastructure.


Functions

func GenRandomID

    func GenRandomID() string

GenRandomID generates a random ID.

func SetLogWriter

    func SetLogWriter(w io.Writer)

SetLogWriter sets the log writer.

func SetMemoryStore

    func SetMemoryStore(m *GryffinStore)

SetMemoryStore sets the package internal global variable for the memory store.

Types

type Fingerprint

    type Fingerprint struct {
    	Origin             uint64 // origin
    	URL                uint64 // origin + path
    	Request            uint64 // method, url, body
    	RequestFull        uint64 // request + header
    	ResponseSimilarity uint64
    }

Fingerprint contains all the different types of hash for the Scan (Request & Response).

type Fuzzer

    type Fuzzer interface {
    	Fuzz(*Scan) (int, error)
    }

Fuzzer runs the fuzzing.

type GryffinStore

    type GryffinStore struct {
    	Oracles map[string]*distance.Oracle
    	Hashes  map[string]bool
    	Hits    map[string]int
    	Mu      sync.RWMutex
    	// contains filtered or unexported fields
    }

GryffinStore includes data and handles for Gryffin message processing.

func NewGryffinStore

    func NewGryffinStore() *GryffinStore

func NewSharedGryffinStore

    func NewSharedGryffinStore() *GryffinStore

func (*GryffinStore) GetRcvChan

    func (s *GryffinStore) GetRcvChan() chan []byte

func (*GryffinStore) GetSndChan

    func (s *GryffinStore) GetSndChan() chan []byte

func (*GryffinStore) Hit

    func (s *GryffinStore) Hit(prefix string) bool

func (*GryffinStore) See

    func (s *GryffinStore) See(prefix string, kind string, v uint64)

func (*GryffinStore) Seen

    func (s *GryffinStore) Seen(prefix string, kind string, v uint64, r uint8) bool

type HTTPDoer

    type HTTPDoer interface {
    	Do(*http.Request) (*http.Response, error)
    }

HTTPDoer is an interface implemented by http.Client.
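
Since *http.Client satisfies HTTPDoer, a health check before crawling can look like the following sketch (the URL and timeout are arbitrary):

    package main

    import (
    	"net/http"
    	"time"

    	"github.com/yahoo/gryffin"
    )

    func main() {
    	// *http.Client implements Do(*http.Request) (*http.Response, error),
    	// so it can be passed to Poke directly.
    	client := &http.Client{Timeout: 10 * time.Second}

    	scan := gryffin.NewScan("GET", "http://example.com/", "")
    	if err := scan.Poke(client); err != nil {
    		scan.Error("poke", err) // target is not reachable; skip this scan
    		return
    	}
    	scan.Logm("poke", "target is up")
    }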

type Job

    type Job struct {
    	ID             string
    	DomainsAllowed []string // Domains that we would crawl
    }

Job stores the job ID and config (if any).

type LogMessage

    type LogMessage struct {
    	Service string
    	Msg     string
    	Method  string
    	Url     string
    	JobID   string
    }

LogMessage contains the data fields to be marshalled as JSON for forwarding to the log processor.

type PublishMessage

    type PublishMessage struct {
    	F string // function, i.e. See or Seen
    	T string // type (kind), i.e. oracle or hash
    	K string // key
    	V string // value
    }

PublishMessage is the data in the messages handled by Gryffin.

type Renderer

    type Renderer interface {
    	Do(*Scan)
    	GetRequestBody() <-chan *Scan
    	GetLinks() <-chan *Scan
    }

Renderer is an interface for implementing an HTML DOM renderer and obtaining the response body and links. Since DOM construction is very likely to be asynchronous, channels are returned for receiving the responses and links.

type Scan

    type Scan struct {
    	// ID is a random ID to identify this particular scan.
    	// If ID is empty, this scan should not be performed (but is recorded for rate limiting).
    	ID           string
    	Job          *Job
    	Request      *http.Request
    	RequestBody  string
    	Response     *http.Response
    	ResponseBody string
    	Cookies      []*http.Cookie
    	Fingerprint  Fingerprint
    	HitCount     int
    }

A Scan consists of the job, target, request and response.

func NewScan

    func NewScan(method, url, post string) *Scan

NewScan creates a scan.

func NewScanFromJson

    func NewScanFromJson(b []byte) *Scan

NewScanFromJson creates a Scan from the passed JSON blob.

func (*Scan) CrawlAsync

    func (s *Scan) CrawlAsync(r Renderer)

CrawlAsync runs the crawling asynchronously.

func (*Scan) Error

    func (s *Scan) Error(service string, err error)

Error logs the error for the given service. TODO: LogFmt(fmt string), LogI(interface).

func (*Scan) Fuzz

    func (s *Scan) Fuzz(fuzzer Fuzzer) (int, error)

Fuzz runs the vulnerability fuzzer and returns the issue count.

func (*Scan) IsDuplicatedPage

    func (s *Scan) IsDuplicatedPage() bool

IsDuplicatedPage checks, based on the Response, whether the page has already been seen and whether we should proceed.

func (*Scan) IsScanAllowed

    func (s *Scan) IsScanAllowed() bool

IsScanAllowed checks if the request URL is allowed per Job.DomainsAllowed.

func (*Scan) Json

    func (s *Scan) Json() []byte

Json serializes the Scan as JSON.

func (*Scan) Log

    func (s *Scan) Log(v interface{})

Log encodes the given argument as JSON and writes it to the log writer.

func (*Scan) Logf

    func (s *Scan) Logf(format string, a ...interface{})

Logf logs using the given format string.

func (*Scan) Logm

    func (s *Scan) Logm(service, msg string)

Logm sends a LogMessage to the log processor.

func (*Scan) Logmf

    func (s *Scan) Logmf(service, format string, a ...interface{})

Logmf logs the formatted message for the given service.

func (*Scan) MergeRequest

    func (s *Scan) MergeRequest(req *http.Request)

MergeRequest merges the given request into the scan's existing request field.

func (*Scan) Poke

    func (s *Scan) Poke(client HTTPDoer) (err error)

Poke checks if the target is up.

func (*Scan) RateLimit

    func (s *Scan) RateLimit() int

RateLimit checks whether we are under the allowed rate for crawling the site. It returns a delay time to wait before checking again.

func (*Scan) ReadResponseBody

    func (s *Scan) ReadResponseBody()

ReadResponseBody reads Response.Body into ResponseBody. It also reconstructs the io.ReadCloser stream so the body can be read again.

func (*Scan) ShouldCrawl

    func (s *Scan) ShouldCrawl() bool

ShouldCrawl checks if the link should be queued for the next crawl.

func (*Scan) Spawn

    func (s *Scan) Spawn() *Scan

Spawn spawns a new scan object with a different ID.

func (*Scan) UpdateFingerprint

    func (s *Scan) UpdateFingerprint()

UpdateFingerprint updates the fingerprint field.

type SerializableRequest

    type SerializableRequest struct {
    	*http.Request
    	Cancel string
    }

SerializableRequest wraps http.Request with a serializable Cancel field.

type SerializableResponse

    type SerializableResponse struct {
    	*http.Response
    	Request *SerializableRequest
    }

SerializableResponse wraps http.Response with a serializable Request field.

type SerializableScan

    type SerializableScan struct {
    	*Scan
    	Request  *SerializableRequest
    	Response *SerializableResponse
    }

SerializableScan is a Scan extended with serializable request and response fields.

Directories

Path             Synopsis
cmd
data             Package data provides an interface for common data store operations.
fuzzer
html-distance    Package distance is a go library for computing the proximity of the HTML pages.