crawl

package
v0.0.3
Warning

This package is not in the latest version of its module.

Published: Feb 25, 2021 License: Apache-2.0 Imports: 17 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Crawler

type Crawler struct {
	// contains filtered or unexported fields
}

Crawler implements the Tendermint p2p network crawler.

func NewCrawler

func NewCrawler(logger zerolog.Logger, cfg config.Config, db *gorm.DB) (*Crawler, error)

func (*Crawler) CrawlNode

func (c *Crawler) CrawlNode(p Peer)

CrawlNode performs the main crawling functionality for a Tendermint node. It takes a Peer and attempts to ping that node's P2P address, which is derived from the peer's RPC address and the default P2P port of 26656. If the P2P address cannot be reached, the node is deleted from the database if it exists there. Otherwise, we attempt to get additional metadata about the node and its set of peers via its RPC address. Every peer that does not already exist in the node pool is added to it.
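As a rough illustration of the address derivation described above, the sketch below keeps the host of an RPC address and swaps in the default P2P port 26656, then checks reachability with a plain TCP dial. The helper names p2pAddress and reachable are hypothetical and not part of this package; the accepted RPC address formats (with or without a URL scheme) and the use of a raw TCP dial as the "ping" are assumptions.

```go
package main

import (
	"fmt"
	"net"
	"strings"
	"time"
)

// defaultP2PPort is the default Tendermint P2P port named in the docs.
const defaultP2PPort = "26656"

// p2pAddress (hypothetical helper) derives a P2P address from an RPC
// address by keeping the host and replacing the port with 26656.
// Assumption: the RPC address looks like "host:port", optionally
// prefixed with a scheme such as "http://" or "tcp://".
func p2pAddress(rpcAddr string) (string, error) {
	if i := strings.Index(rpcAddr, "://"); i >= 0 {
		rpcAddr = rpcAddr[i+3:]
	}
	host, _, err := net.SplitHostPort(rpcAddr)
	if err != nil {
		return "", fmt.Errorf("invalid RPC address %q: %w", rpcAddr, err)
	}
	return net.JoinHostPort(host, defaultP2PPort), nil
}

// reachable (hypothetical helper) reports whether a TCP connection to
// addr can be opened within the timeout.
func reachable(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	addr, err := p2pAddress("http://10.0.0.1:26657")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("P2P address:", addr)
}
```

A caller could then use reachable(addr, 2*time.Second) to decide whether to delete the node or continue fetching its metadata.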

func (*Crawler) GetGeolocation

func (c *Crawler) GetGeolocation(addr string) (models.Location, error)

GetGeolocation returns a Location record containing geolocation information for a given node. It first checks whether the location already exists in the cache. If it does not, a Node record is queried by the provided address. If that record does not exist either, we query the ipstack API and write the result to the cache. An error is returned if the database query fails.
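The lookup order described above (cache, then database, then external API) can be sketched as follows. This is a minimal sketch under stated assumptions: the Location fields, the geoResolver type, and the injected fromDB and fromAPI functions are all hypothetical stand-ins, not the package's models.Location, GORM queries, or ipstack client.

```go
package main

import "fmt"

// Location is a hypothetical stand-in for models.Location.
type Location struct {
	Country string
	City    string
}

// geoResolver (hypothetical) wires together the three lookup tiers.
type geoResolver struct {
	cache   map[string]Location
	fromDB  func(addr string) (loc Location, found bool, err error)
	fromAPI func(addr string) (Location, error)
}

// locate checks the cache, then the database, then the external API.
// Only API results are written to the cache, mirroring the description.
func (g *geoResolver) locate(addr string) (Location, error) {
	if loc, ok := g.cache[addr]; ok {
		return loc, nil
	}
	loc, found, err := g.fromDB(addr)
	if err != nil {
		return Location{}, err // a failed database query is an error
	}
	if found {
		return loc, nil
	}
	loc, err = g.fromAPI(addr)
	if err != nil {
		return Location{}, err
	}
	g.cache[addr] = loc
	return loc, nil
}

func main() {
	g := &geoResolver{
		cache:  map[string]Location{},
		fromDB: func(string) (Location, bool, error) { return Location{}, false, nil },
		fromAPI: func(string) (Location, error) {
			return Location{Country: "DE", City: "Berlin"}, nil
		},
	}
	loc, err := g.locate("10.0.0.1")
	fmt.Println(loc, err)
}
```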

func (*Crawler) RecheckNodes

func (c *Crawler) RecheckNodes()

RecheckNodes starts a blocking process in which, every recheckInterval, the crawler checks for stale nodes that need to be rechecked. Each stale node is added back into the node pool to be re-crawled and updated (or removed).
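A periodic, blocking loop of this kind is commonly written in Go with a time.Ticker and a stop channel. The sketch below shows that pattern only; recheckLoop, staleNodes, and requeue are hypothetical names, and how the crawler actually determines staleness or signals shutdown is not documented here.

```go
package main

import (
	"fmt"
	"time"
)

// recheckLoop (hypothetical) blocks until stop is closed. On every tick
// it asks staleNodes for addresses and hands each one to requeue.
func recheckLoop(
	interval time.Duration,
	stop <-chan struct{},
	staleNodes func() []string,
	requeue func(addr string),
) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ticker.C:
			for _, addr := range staleNodes() {
				requeue(addr)
			}
		case <-stop:
			return
		}
	}
}

func main() {
	stop := make(chan struct{})
	requeued := make(chan string, 1)

	go recheckLoop(10*time.Millisecond, stop,
		func() []string { return []string{"10.0.0.1:26657"} },
		func(addr string) {
			// Non-blocking send so the loop never stalls on a full channel.
			select {
			case requeued <- addr:
			default:
			}
		})

	fmt.Println("requeued:", <-requeued)
	close(stop)
}
```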

func (*Crawler) Start

func (c *Crawler) Start()

Start starts a blocking process in which a random node is selected from the node pool and crawled. After each successful crawl, the node is persisted or updated and its peers are added to the node pool if they do not already exist. This continues until all nodes in the pool are exhausted. Once the pool is empty and crawlInterval seconds have passed since the last complete crawl, a random set of nodes from the DB is added to reseed the pool.
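The drain-the-pool part of this loop can be sketched as below. It is a simplified illustration, not the crawler's implementation: crawlAll and its crawl callback are hypothetical, persistence and the timed reseed from the DB are omitted, and the seen set (which prevents re-crawling a node within one pass) is an assumption added to keep the example finite.

```go
package main

import "fmt"

// crawlAll (hypothetical) drains a pool of addresses. crawl returns the
// peers of a node and whether the crawl succeeded.
func crawlAll(seeds []string, crawl func(addr string) (peers []string, ok bool)) []string {
	pool := make(map[string]struct{})
	seen := make(map[string]struct{}) // assumption: avoid re-crawling in one pass
	for _, s := range seeds {
		pool[s] = struct{}{}
	}

	var crawled []string
	for len(pool) > 0 {
		// Pick an arbitrary node via map iteration and remove it.
		var addr string
		for addr = range pool {
			break
		}
		delete(pool, addr)
		seen[addr] = struct{}{}

		peers, ok := crawl(addr)
		if !ok {
			continue // unreachable node: nothing to record
		}
		crawled = append(crawled, addr)

		// Add peers that are neither queued nor already visited.
		for _, p := range peers {
			_, queued := pool[p]
			_, visited := seen[p]
			if !queued && !visited {
				pool[p] = struct{}{}
			}
		}
	}
	return crawled
}

func main() {
	graph := map[string][]string{"a": {"b", "c"}, "b": {"a"}, "c": nil}
	crawled := crawlAll([]string{"a"}, func(addr string) ([]string, bool) {
		peers, ok := graph[addr]
		return peers, ok
	})
	fmt.Println("crawled", len(crawled), "nodes")
}
```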

func (*Crawler) Stop

func (c *Crawler) Stop()

Stop signals to the crawler that it should halt and exit all spawned goroutines.

type NodePool

type NodePool struct {
	// contains filtered or unexported fields
}

NodePool implements an abstraction over a pool of nodes to crawl. It also holds a collection of nodes used to reseed the pool when it is empty. Once the reseed list has reached capacity, adding another node replaces a random existing one.

func NewNodePool

func NewNodePool(reseedCap uint) *NodePool

func (*NodePool) AddNode

func (np *NodePool) AddNode(p Peer)

AddNode adds a node to the node pool by adding it to the internal node list. In addition, we attempt to add it to the internal reseed node list. If the reseed list is full, the node replaces a random entry in it; otherwise, it is appended directly.
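The capped reseed list with random replacement can be sketched like this. The nodePool type below is a hypothetical simplification keyed by address strings; the real NodePool's internal fields are unexported and may differ, and this sketch does not deduplicate entries in the reseed list or guard against concurrent use.

```go
package main

import (
	"fmt"
	"math/rand"
)

// nodePool is a hypothetical, simplified model of the documented pool.
type nodePool struct {
	nodes     map[string]struct{} // nodes waiting to be crawled
	reseed    []string            // bounded list used to reseed the pool
	reseedCap int
}

func newNodePool(reseedCap int) *nodePool {
	return &nodePool{nodes: make(map[string]struct{}), reseedCap: reseedCap}
}

// addNode adds addr to the pool and to the reseed list. When the reseed
// list is at capacity, addr overwrites a randomly chosen entry.
func (np *nodePool) addNode(addr string) {
	np.nodes[addr] = struct{}{}

	if len(np.reseed) < np.reseedCap {
		np.reseed = append(np.reseed, addr)
		return
	}
	if np.reseedCap == 0 {
		return // nothing to replace when no reseed slots exist
	}
	np.reseed[rand.Intn(len(np.reseed))] = addr
}

func main() {
	np := newNodePool(2)
	for _, addr := range []string{"a", "b", "c", "d", "e"} {
		np.addNode(addr)
	}
	fmt.Println("pool size:", len(np.nodes), "reseed size:", len(np.reseed))
}
```

Random replacement keeps the reseed list bounded while still letting newly discovered nodes displace older ones over time.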

func (*NodePool) DeleteNode

func (np *NodePool) DeleteNode(p Peer)

DeleteNode removes a node from the node pool if it exists.

func (*NodePool) HasNode

func (np *NodePool) HasNode(p Peer) bool

HasNode returns true if a node exists in the node pool and false otherwise.

func (*NodePool) RandomNode

func (np *NodePool) RandomNode() (Peer, bool)

RandomNode returns a node from the pool, selected by relying on Go's map iteration order. That order is unspecified, so the choice is arbitrary rather than uniformly random.
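Selecting an element through map iteration usually means returning the first key a range loop yields. The sketch below shows that idiom; randomNode is a hypothetical helper over a plain map, and the reading of the boolean result as "a node was found" is an assumption, since the documentation does not describe it.

```go
package main

import "fmt"

// randomNode (hypothetical) returns the first key produced by ranging
// over the map. Go does not specify map iteration order, so the result
// is arbitrary; it is not guaranteed to be uniformly distributed.
// Assumption: the boolean is false only when the pool is empty.
func randomNode(nodes map[string]struct{}) (string, bool) {
	for addr := range nodes {
		return addr, true
	}
	return "", false
}

func main() {
	pool := map[string]struct{}{
		"10.0.0.1:26657": {},
		"10.0.0.2:26657": {},
	}
	addr, ok := randomNode(pool)
	fmt.Println(addr, ok)
}
```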

func (*NodePool) Reseed

func (np *NodePool) Reseed()

Reseed seeds the node pool with all the nodes found in the internal reseed list.

func (*NodePool) Seed

func (np *NodePool) Seed(seeds []string)

Seed seeds the node pool with a given set of nodes. Each seed is split on the ';' delimiter to obtain the RPC address and, if provided, the network.
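The seed format described above ("rpcAddr;network", with the network optional) can be parsed as in the following sketch. The peer type mirrors the exported Peer fields, but parseSeed is a hypothetical helper, the trimming of surrounding whitespace is an added assumption, and the address and network values are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// peer mirrors the fields of the documented Peer type.
type peer struct {
	RPCAddr string
	Network string
}

// parseSeed (hypothetical) splits a seed on the first ';' into an RPC
// address and an optional network. Whitespace trimming is an assumption.
func parseSeed(seed string) peer {
	parts := strings.SplitN(seed, ";", 2)
	p := peer{RPCAddr: strings.TrimSpace(parts[0])}
	if len(parts) == 2 {
		p.Network = strings.TrimSpace(parts[1])
	}
	return p
}

func main() {
	fmt.Printf("%+v\n", parseSeed("10.0.0.1:26657;testnet-1"))
	fmt.Printf("%+v\n", parseSeed("10.0.0.2:26657"))
}
```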

func (*NodePool) Size

func (np *NodePool) Size() int

Size returns the size of the pool.

type Peer

type Peer struct {
	RPCAddr string
	Network string
}

Peer defines a node structure that exists in the NodePool. Every Peer should have an RPC address defined, but a network is not strictly required.

func (Peer) String

func (p Peer) String() string
