phishdetect

Version: v1.13.0 (latest) · Published: Nov 23, 2020 · License: AGPL-3.0 · Imports: 39 · Imported by: 1

Repository

github.com/phishdetect/phishdetect

README ¶

PhishDetect

NOTE: This project is experimental. It is not to be used yet, particularly with at-risk users.

PhishDetect is a library and a platform to detect potential phishing pages. It attempts to do so by identifying suspicious and malicious properties both in the domain name and URL provided and in the HTML content of the page opened.

PhishDetect can take HTML strings as input, but it can also be given just a URL. The URL is then opened in a dedicated Docker container that automatically instruments a Google Chrome browser, whose behavior is monitored while it navigates to the suspicious link.

Building

Install Docker Community Edition for Windows, Mac or Linux.

Particularly when using this with PhishDetect Node, you should look into installing Docker in Rootless Mode. You can find more information about this in the Node's documentation.

Download the Docker image from Docker Hub using:

$ docker pull phishdetect/phishdetect

You will also need to install Yara and its library. In order to do so, please follow the instructions provided by the official Yara Project documentation.

Now you can download the PhishDetect library:

$ go get -u github.com/phishdetect/phishdetect

For ease of versioning, you should consider using Go 1.11+ Modules in your own project.

Using PhishDetect as a library

Analyzing a link statically

You can then use it to analyze a URL or a domain like so:

package main

import (
    "fmt"
    "github.com/phishdetect/phishdetect"
)

func main() {
    // Instantiate an Analysis. The second argument is
    // an HTML string.
    a := phishdetect.NewAnalysis("example.com", "")
    // Perform the analysis of the URL/domain.
    a.AnalyzeURL()
    // Retrieve the name of the impersonated brand.
    brand := a.Brands.GetBrand()

    // If the domain is recognized as safelisted, this
    // will show as true, otherwise as false.
    fmt.Println(a.Safelisted)
    // This is a total numeric value that is the sum of
    // all the score values of the warnings that were
    // matched during the analysis.
    fmt.Println(a.Score)
    // Print the brand. It will be an empty string if
    // no brand was identified.
    fmt.Println(brand)

    // Print all the matched warnings from the analysis.
    for _, warning := range a.Warnings {
        fmt.Println(warning.Description)
    }
}

Analyzing a link dynamically

If you want to analyze a URL by launching the dockerized Google Chrome:

package main

import (
    "fmt"
    "github.com/phishdetect/phishdetect"
)

func main() {
    url := "example.com"
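    // Instantiate a new Browser.
    // The first argument is the URL to analyze.
    // The second argument is the path to the file where to save the screenshot.
    // The third argument is a boolean value to enable or disable routing through Tor.
    // The fourth argument is a boolean value to enable or disable logging of browser events.
    // The fifth argument is the name of the Docker image to use; it is left empty here.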
    b := phishdetect.NewBrowser(url, "/path/to/screen.png", false, false, "")
    // Run the browser.
    b.Run()

    // Now we analyze the results.
    a := phishdetect.NewAnalysis(url, b.HTML)
    a.AnalyzeURL()
    // Analyze the HTML string.
    a.AnalyzeHTML()
    brand := a.Brands.GetBrand()

    // In addition to the results explained in the previous example, we have
    // some additional information provided by the browser execution.
    // FinalURL will show the last visited URL by the browser. This might differ
    // from the original URL if the browser was redirected.
    fmt.Println(b.FinalURL)

    // Visits contains a list of all the URLs visited by the browser.
    // Normally 302 redirects or JavaScript redirects should appear (although in
    // the latter case, some might not appear if the page took too long to load).
    for _, visit := range b.Visits {
        fmt.Println(visit)
    }

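    // In addition to the URL analysis warnings, a.Warnings now also holds any
    // matched HTML analysis warnings. Printing brand also keeps the example
    // compiling, since Go rejects unused variables.
    fmt.Println(brand)
    for _, warning := range a.Warnings {
        fmt.Println(warning.Description)
    }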
}

For more information, please refer to the Godoc.

Adding new Brands to the existing list

PhishDetect comes pre-compiled with a fixed set of brands. You might want to load custom ones from external sources. You can easily do so when creating a new Analysis.

package main

import (
    "github.com/phishdetect/phishdetect"
    "github.com/phishdetect/phishdetect/brand"
)

func main() {
    // We create a new Brand. AddBrand expects a *brand.Brand, so we take
    // the address of the struct.
    myBrand := &brand.Brand{
        Name:       "MyBrand",
        Original:   []string{"MyBrand", "MyBrandProduct"},
        Safelist:   []string{"mybrand.com", "mybrand.net", "mybrand.org"},
        Suspicious: []string{"mybland.com", "mybrend.com", "mgbrand.com"},
    }

    // We instantiate a new analysis.
    a := phishdetect.NewAnalysis("example.com", "")
    // We access the list of brands from the current analysis and add a new one.
    a.Brands.AddBrand(myBrand)
    // Finally, we analyze the domain.
    a.AnalyzeURL()
}

Adding Yara rules to the HTML classifier

If you want to scan the visited page's HTML with Yara rules of your own, initialize PhishDetect's scanner by calling phishdetect.InitializeYara() with the path (as a string) to either a Yara rule file or a folder containing Yara rule files (with .yar or .yara extensions).

For example:

err := phishdetect.InitializeYara(rulesPath)
if err != nil {
    log.Error("I failed to initialize the Yara scanner: ", err.Error())
}

This needs to be done only once (perhaps in your program's init() function). All subsequent analyses will use the same initialized scanner.

Using PhishDetect CLI

Firstly, make sure you have Go 1.11+ installed. We require Go 1.11 or later versions because of the native support for Go Modules, which we use to manage dependencies. If it isn't available for your operating system of choice, we recommend trying gvm.

Now you can either install PhishDetect's command-line interface by simply launching:

$ go get github.com/phishdetect/phishdetect/cli

Or build the binary from the source code. To do so, clone the Git repository:

$ git clone https://github.com/phishdetect/phishdetect.git

Move to the directory you just cloned and download the dependencies:

$ make deps

In order to build binaries for GNU/Linux:

$ make

Once the compilation is completed, you will find the command-line interface in the build/ folder.

Launch phishdetect-cli -h to view the help message:

Usage of phishdetect-cli:
      --api-version string    Specify which Docker API version to use (default "1.37")
      --brands string         Specify a folder containing YAML files with Brand specifications
      --container string      Specify a name for a docker image to use (default "phishdetect/phishdetect")
      --debug                 Enable debug logging
      --html string           Specify a path to save the HTML from the visited page
      --safebrowsing string   Specify a file path containing your Google SafeBrowsing API key
      --screen string         Specify the file path to store the screenshot
      --tor                   Route connection through the Tor network
      --url-only              Only perform URL analysis
      --yara string           Specify a path to a file or folder contaning Yara rules

Specify a URL and the preferred options and wait for the results to appear:

$ build/linux/phishdetect-cli --screen /tmp/screen.png --tor http://[REDACTED].com/Login
INFO[0000] Analyzing URL http://[REDACTED].com/Login
INFO[0000] Using User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.89 Safar$
INFO[0000] Using debug port: 9564
INFO[0000] Enabled route through the Tor network
INFO[0000] Started container with ID e43f6df4ab0fb8e29453df3ebaede0fe6a4bcbafa4fabaaa1da95573a28552ff
INFO[0000] Attempting to connect to debug port...
INFO[0001] Connection to debug port established!
INFO[0013] Saved screenshot at /tmp/screen.png
INFO[0013] Killed container with ID e43f6df4ab0fb8e29453df3ebaede0fe6a4bcbafa4fabaaa1da95573a28552ff
INFO[0013] Starting to analyze HTML...
INFO[0013] Matched password-input
INFO[0013] Matched suspicious-title
INFO[0014] Starting to analyze the URL...
INFO[0014] Matched suspicious-hostname
INFO[0014] Matched no-tls
INFO[0014] Visits:
INFO[0014]      - http://[REDACTED].com/Login
INFO[0014]      - http://[REDACTED].com/Login/
INFO[0014] Final URL: http://[REDACTED].com/Login/
INFO[0014] Safelisted: false
INFO[0014] Final score: 90
INFO[0014] Brand: tutanota
INFO[0014] Warnings:
INFO[0014]      - The page contains a password input         name=password-input score=10
INFO[0014]      - The page has a suspicious title            name=suspicious-title score=30
INFO[0014]      - The domain contains suspicious words       name=suspicious-hostname score=30
INFO[0014]      - The website is not using a secure transport layer (HTTPS)  name=no-tls score=20

License

PhishDetect is released under the GNU Affero General Public License 3.0. Copyright is held by Claudio Guarnieri.

Documentation ¶

Index ¶

Constants
Variables
func GetSHA256Hash(text string) string
func InitializeYara(yaraRulesPath string) error
func NormalizeURL(url string) string
func SliceContains(slice []string, item string) bool
func TextContains(text, pattern string) bool
type Analysis
- func NewAnalysis(url, html string) *Analysis
- func (a *Analysis) AnalyzeDomain() error
- func (a *Analysis) AnalyzeHTML() error
- func (a *Analysis) AnalyzePage(resources []Resource) error
- func (a *Analysis) AnalyzeURL() error
type Brands
- func NewBrands() *Brands
- func (b *Brands) AddBrand(brand *brand.Brand)
- func (b *Brands) GetBrand() string
- func (b *Brands) IsDomainSafelisted(domain, brandName string) bool
- func (b *Brands) IsLinkDangerous(link, brandName string) bool
type Browser
- func NewBrowser(url string, screenshotPath string, useTor bool, logEvents bool, ...) *Browser
- func (b *Browser) Run() error
type Check
- func GetDomainChecks() []Check
- func GetHTMLChecks() []Check
- func GetURLChecks() []Check
type CheckFunction
type Dialog
type Download
type Link
- func NewLink(urlString string) (*Link, error)
type LogCodec
- func (c *LogCodec) ReadResponse(resp *rpcc.Response) error
- func (c *LogCodec) WriteRequest(req *rpcc.Request) error
type Page
- func NewPage(html string, resources []Resource) (*Page, error)
- func (p *Page) GetEntities(entityType string) []soup.Root
- func (p *Page) GetInputs(inputType string) []soup.Root
- func (p *Page) GetTitle() string
type Resource
type Warning

Constants ¶

const BrowserEventWaitTime time.Duration = 15

BrowserEventWaitTime is the number of seconds we wait while attempting to fetch events from DevTools before giving up.

const BrowserTimeout time.Duration = 1

BrowserTimeout is the number of minutes we wait before declaring that the connection to our debugged browser or to the URL has failed.

const BrowserWaitTime time.Duration = 5

BrowserWaitTime is the number of seconds we wait before fetching navigation results.

Variables ¶

var SafeBrowsingKey string

SafeBrowsingKey contains the API key to use Google SafeBrowsing API.

var YaraRules *yara.Rules

YaraRules will contain compiled Yara rules provided by InitializeYara.

Functions ¶

func GetSHA256Hash ¶ added in v1.8.0

func GetSHA256Hash(text string) string

GetSHA256Hash retrieves a SHA256 hash of a string.
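As an illustration of what a helper like this typically does, the sketch below hashes a string with the standard library. The function name sha256Hex is hypothetical, and returning the digest as lowercase hexadecimal is an assumption; the documentation above only says that a SHA256 hash of the string is returned.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the SHA-256 digest of text as a lowercase hex string.
// Hypothetical helper: the hex encoding is an assumption about the output
// format, not something the package documentation states.
func sha256Hex(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(sha256Hex("abc"))
}
```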

func InitializeYara ¶

func InitializeYara(yaraRulesPath string) error

InitializeYara will load any rule files found at the specified path and compile them into a Rules object.

func NormalizeURL ¶

func NormalizeURL(url string) string

NormalizeURL fixes a URL that is e.g. missing a scheme, etc.
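To make the "missing a scheme" case concrete, here is a minimal sketch of one possible normalization step. It is not the library's implementation: the helper name addDefaultScheme is hypothetical, and the choice of http:// as the default scheme is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// addDefaultScheme trims surrounding whitespace and prepends "http://"
// when rawURL does not already start with "http://" or "https://".
// Hypothetical helper; the default scheme is an assumption.
func addDefaultScheme(rawURL string) string {
	rawURL = strings.TrimSpace(rawURL)
	lower := strings.ToLower(rawURL)
	if rawURL == "" ||
		strings.HasPrefix(lower, "http://") ||
		strings.HasPrefix(lower, "https://") {
		return rawURL
	}
	return "http://" + rawURL
}

func main() {
	fmt.Println(addDefaultScheme("example.com"))
	fmt.Println(addDefaultScheme("https://example.com/login"))
}
```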

func SliceContains ¶

func SliceContains(slice []string, item string) bool

SliceContains checks whether a string is contained in a slice of strings.

func TextContains ¶

func TextContains(text, pattern string) bool

TextContains will determine if a substring is present in a string. It is case-insensitive.
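The two helpers above, SliceContains and TextContains, can be pictured with the following sketch. The lowercase function names are hypothetical stand-ins, and treating SliceContains as an exact, case-sensitive match is an assumption; only TextContains is documented as case-insensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// sliceContains reports whether item is present in slice.
// Assumption: the comparison is exact and case-sensitive.
func sliceContains(slice []string, item string) bool {
	for _, entry := range slice {
		if entry == item {
			return true
		}
	}
	return false
}

// textContains reports whether pattern occurs in text, ignoring case,
// by lowercasing both strings before the substring search.
func textContains(text, pattern string) bool {
	return strings.Contains(strings.ToLower(text), strings.ToLower(pattern))
}

func main() {
	domains := []string{"mybrand.com", "mybrand.net"}
	fmt.Println(sliceContains(domains, "mybrand.net"))
	fmt.Println(textContains("Sign in to MyBrand", "mybrand"))
}
```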

Types ¶

type Analysis ¶

type Analysis struct {
	URL        string    `json:"url"`
	FinalURL   string    `json:"final_url"`
	HTML       string    `json:"html"`
	Warnings   []Warning `json:"warnings"`
	Score      int       `json:"score"`
	Safelisted bool      `json:"safelisted"`
	Dangerous  bool      `json:"dangerous"`
	Brands     *Brands   `json:"brands"`
}

Analysis contains information on the outcome of the URL and/or HTML analysis.

func NewAnalysis ¶

func NewAnalysis(url, html string) *Analysis

NewAnalysis instantiates a new Analysis struct.

func (*Analysis) AnalyzeDomain ¶

func (a *Analysis) AnalyzeDomain() error

AnalyzeDomain performs all the available checks to be run on a URL or domain.

func (*Analysis) AnalyzeHTML ¶

func (a *Analysis) AnalyzeHTML() error

AnalyzeHTML performs all the available checks to be run on an HTML string.

func (*Analysis) AnalyzePage ¶ added in v1.11.0

func (a *Analysis) AnalyzePage(resources []Resource) error

AnalyzePage performs all the available checks to be run on an HTML string as well as the provided list of resources (e.g. downloaded scripts).

func (*Analysis) AnalyzeURL ¶

func (a *Analysis) AnalyzeURL() error

AnalyzeURL performs all the available checks to be run on a URL or domain.

type Brands ¶

type Brands struct {
	Top  *brand.Brand
	List []*brand.Brand
}

Brands defines the attribute of our list of supported brands.

func NewBrands ¶

func NewBrands() *Brands

NewBrands instantiates a new Brands struct.

func (*Brands) AddBrand ¶

func (b *Brands) AddBrand(brand *brand.Brand)

AddBrand adds a new brand to the list.

func (*Brands) GetBrand ¶

func (b *Brands) GetBrand() string

GetBrand determines which among the marked brands is most likely the one impersonated by the page.

func (*Brands) IsDomainSafelisted ¶

func (b *Brands) IsDomainSafelisted(domain, brandName string) bool

IsDomainSafelisted checks if the specified domain is in any of the safelists of the supported brands.

func (*Brands) IsLinkDangerous ¶ added in v1.9.1

func (b *Brands) IsLinkDangerous(link, brandName string) bool

IsLinkDangerous checks if the specified link matches a brand's dangerous regexp.

type Browser ¶

type Browser struct {
	URL            string     `json:"url"`
	FinalURL       string     `json:"final_url"`
	Visits         []string   `json:"visits"`
	Resources      []Resource `json:"resources"`
	Downloads      []Download `json:"downloads"`
	Dialogs        []Dialog   `json:"dialogs"`
	HTML           string     `json:"html"`
	ScreenshotPath string     `json:"screenshot_path"`
	ScreenshotData string     `json:"screenshot_data"`
	UseTor         bool       `json:"use_tor"`
	DebugPort      int        `json:"debug_port"`
	DebugURL       string     `json:"debug_url"`
	LogEvents      bool       `json:"log_events"`
	UserAgent      string     `json:"user_agent"`
	ImageName      string     `json:"image_name"`
	ContainerID    string     `json:"container_id"`
}

Browser is a struct containing details about a browser navigation to a URL.

func NewBrowser ¶

func NewBrowser(url string, screenshotPath string, useTor bool, logEvents bool, imageName string) *Browser

NewBrowser instantiates a new Browser struct.

func (*Browser) Run ¶

func (b *Browser) Run() error

Run launches our browser and navigates to the specified URL.

type Check ¶

type Check struct {
	Call        CheckFunction
	Score       int
	Name        string
	Description string
}

Check defines the general properties of a CheckFunction.

func GetDomainChecks ¶

func GetDomainChecks() []Check

GetDomainChecks returns a list of only the checks that work for domain names.

func GetHTMLChecks ¶

func GetHTMLChecks() []Check

GetHTMLChecks returns a list of all the available HTML checks.

func GetURLChecks ¶

func GetURLChecks() []Check

GetURLChecks returns a list of all the available URL checks.

type CheckFunction ¶

type CheckFunction func(*Link, *Page, *Brands) bool

CheckFunction defines the functions used to implement URL or HTML checks.
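The sketch below shows how a list of checks with this function shape could be turned into warnings and a total score, in line with the README's description of the score as the sum of the matched warnings' scores. It uses small local stand-ins for Link, Page, Brands, Check, and Warning rather than the library's types, and the two example checks are hypothetical; only their names and scores are borrowed from the CLI output shown earlier.

```go
package main

import (
	"fmt"
	"strings"
)

// Minimal local stand-ins for the documented types.
type link struct{ Scheme, Domain string }
type page struct{ HTML string }
type brands struct{}

type checkFunction func(*link, *page, *brands) bool

type check struct {
	Call        checkFunction
	Score       int
	Name        string
	Description string
}

type warning struct {
	Score       int
	Name        string
	Description string
}

// runChecks calls every check and, for each one that matches, records a
// warning and adds the check's score to the running total.
func runChecks(checks []check, l *link, p *page, b *brands) ([]warning, int) {
	var warnings []warning
	total := 0
	for _, c := range checks {
		if c.Call(l, p, b) {
			warnings = append(warnings, warning{Score: c.Score, Name: c.Name, Description: c.Description})
			total += c.Score
		}
	}
	return warnings, total
}

// exampleChecks returns two hypothetical checks with deliberately crude logic.
func exampleChecks() []check {
	return []check{
		{
			Call:        func(l *link, _ *page, _ *brands) bool { return l.Scheme == "http" },
			Score:       20,
			Name:        "no-tls",
			Description: "The website is not using a secure transport layer (HTTPS)",
		},
		{
			Call: func(_ *link, p *page, _ *brands) bool {
				return strings.Contains(strings.ToLower(p.HTML), `type="password"`)
			},
			Score:       10,
			Name:        "password-input",
			Description: "The page contains a password input",
		},
	}
}

func main() {
	l := &link{Scheme: "http", Domain: "example.com"}
	p := &page{HTML: `<form><input type="password"></form>`}
	warnings, total := runChecks(exampleChecks(), l, p, &brands{})
	for _, w := range warnings {
		fmt.Println(w.Name, w.Score)
	}
	fmt.Println("score:", total)
}
```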

type Dialog ¶ added in v1.13.0

type Dialog struct {
	URL     string `json:"url"`
	Type    string `json:"type"`
	Message string `json:"message"`
}

Dialog contains details of JavaScript dialogs opened.

type Download ¶ added in v1.13.0

type Download struct {
	URL      string `json:"url"`
	FileName string `json:"file_name"`
}

Download contains details of files which were offered for download at the link.

type Link ¶

type Link struct {
	URL        string
	Scheme     string
	Domain     string
	Port       string
	TopDomain  string
	Path       string
	RawQuery   string
	Parameters map[string]string
}

Link defines details of a parsed URL.

func NewLink ¶

func NewLink(urlString string) (*Link, error)

NewLink instantiates a Link struct.
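For a sense of how most of the Link fields relate to a raw URL, here is a minimal sketch built on the standard net/url package. It is not the library's NewLink: the parsedLink type and parseLink function are hypothetical, TopDomain is left out because deriving it needs a public-suffix lookup, and keeping only the first value of a repeated query parameter is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// parsedLink mirrors the documented Link fields, except TopDomain.
type parsedLink struct {
	URL        string
	Scheme     string
	Domain     string
	Port       string
	Path       string
	RawQuery   string
	Parameters map[string]string
}

// parseLink splits rawURL into its components using net/url.
func parseLink(rawURL string) (*parsedLink, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	params := make(map[string]string)
	for key, values := range u.Query() {
		if len(values) > 0 {
			params[key] = values[0] // keep only the first value per key
		}
	}
	return &parsedLink{
		URL:        rawURL,
		Scheme:     u.Scheme,
		Domain:     u.Hostname(),
		Port:       u.Port(),
		Path:       u.Path,
		RawQuery:   u.RawQuery,
		Parameters: params,
	}, nil
}

func main() {
	l, err := parseLink("https://login.example.com:8443/a/b?x=1&y=2")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(l.Scheme, l.Domain, l.Port, l.Path, l.RawQuery, l.Parameters["x"])
}
```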

type LogCodec ¶ added in v1.13.0

type LogCodec struct {
	// contains filtered or unexported fields
}

LogCodec captures the output from writing RPC requests and reading responses on the connection. It implements rpcc.Codec via WriteRequest and ReadResponse.

Adapted from: https://pkg.go.dev/github.com/mafredri/cdp#example-package-Logging

func (*LogCodec) ReadResponse ¶ added in v1.13.0

func (c *LogCodec) ReadResponse(resp *rpcc.Response) error

ReadResponse unmarshals from the connection into v whilst echoing what is read into a buffer for logging.

func (*LogCodec) WriteRequest ¶ added in v1.13.0

func (c *LogCodec) WriteRequest(req *rpcc.Request) error

WriteRequest marshals v into a buffer, writes its contents onto the connection and logs it.

type Page ¶

type Page struct {
	HTML      string
	Soup      soup.Root
	Text      string
	Resources []Resource
}

Page contains information on the HTML page.

func NewPage ¶

func NewPage(html string, resources []Resource) (*Page, error)

NewPage instantiates a new Page struct.

func (*Page) GetEntities ¶

func (p *Page) GetEntities(entityType string) []soup.Root

GetEntities returns any HTML entity of the specified type.

func (*Page) GetInputs ¶

func (p *Page) GetInputs(inputType string) []soup.Root

GetInputs returns any form input.

func (*Page) GetTitle ¶

func (p *Page) GetTitle() string

GetTitle returns the content of the <title> tag from the HTML page.

type Resource ¶ added in v1.8.0

type Resource struct {
	Status  int    `json:"status"`
	URL     string `json:"url"`
	Type    string `json:"type"`
	SHA256  string `json:"sha256"`
	Content string `json:"content"`
}

Resource contains details of a resource that was fetched.

type Warning ¶ added in v1.8.3

type Warning struct {
	Score       int    `json:"score"`
	Name        string `json:"name"`
	Description string `json:"description"`
}

Warning is a conversion of Check containing only results.
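The json struct tags on Warning determine the key names you get when serializing results. The sketch below uses a local copy of the struct with the same tags to show the resulting shape; the warning type and encodeWarning helper are hypothetical, and the sample values are taken from the CLI output in the README.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// warning is a local copy of the documented Warning struct and its tags.
type warning struct {
	Score       int    `json:"score"`
	Name        string `json:"name"`
	Description string `json:"description"`
}

// encodeWarning serializes a warning to a compact JSON string.
func encodeWarning(w warning) (string, error) {
	data, err := json.Marshal(w)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	out, err := encodeWarning(warning{
		Score:       10,
		Name:        "password-input",
		Description: "The page contains a password input",
	})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(out)
}
```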

Directories ¶

Path	Synopsis
brand
cli
