crawler

package
v0.0.0-...-e97be17 Latest
Warning: This package is not in the latest version of its module.
Published: Oct 21, 2022 License: MIT Imports: 14 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	GraphAPI GraphAPI

	IndexAPI IndexAPI

	PrivateNetworkDetector crawler_pipeline.PrivateNetworkDetector

	URLGetter crawler_pipeline.URLGetter

	// An API for detecting the partition assignments for this service.
	PartitionDetector partition.Detector

	// A clock instance for generating time-related events. If not specified,
	// the default wall-clock will be used instead.
	Clock clock.Clock

	// The number of concurrent workers used for retrieving links.
	FetchWorkers int

	// The time between subsequent crawler passes.
	UpdateInterval time.Duration

	// The minimum amount of time before re-indexing an already-crawled link.
	ReIndexThreshold time.Duration

	// The logger to use. If not defined an output-discarding logger will
	// be used instead.
	Logger *logrus.Entry
}

Config encapsulates the settings for configuring the crawler service.
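The field comments suggest that a Config needs a positive worker count and sensible durations before a service can be built from it. A minimal sketch of such a check follows. It mirrors only the stdlib-typed fields of Config, and the `validate` helper and its rules are assumptions made for illustration; the checks actually performed by NewService are not documented on this page.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Config mirrors only the stdlib-typed fields of the documented Config.
// The interface-typed fields are left out to keep the sketch self-contained.
type Config struct {
	FetchWorkers     int
	UpdateInterval   time.Duration
	ReIndexThreshold time.Duration
}

// validate is a hypothetical helper. The rules below are assumptions, not the
// package's documented behavior.
func (cfg Config) validate() error {
	if cfg.FetchWorkers <= 0 {
		return errors.New("config: FetchWorkers must be positive")
	}
	if cfg.UpdateInterval <= 0 {
		return errors.New("config: UpdateInterval must be positive")
	}
	if cfg.ReIndexThreshold <= 0 {
		return errors.New("config: ReIndexThreshold must be positive")
	}
	return nil
}

func main() {
	cfg := Config{
		FetchWorkers:     4,
		UpdateInterval:   5 * time.Minute,
		ReIndexThreshold: 7 * 24 * time.Hour,
	}
	fmt.Println("populated config:", cfg.validate())
	fmt.Println("zero-value config:", Config{}.validate())
}
```

Returning the first failing field keeps the sketch short; a real constructor might prefer to collect and report every problem at once.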

type GraphAPI

type GraphAPI interface {
	UpsertLink(link *graph.Link) error
	UpsertEdge(edge *graph.Edge) error
	RemoveStaleEdges(fromID uuid.UUID, updatedBefore time.Time) error
	Links(fromID, toID uuid.UUID, retrievedBefore time.Time) (graph.LinkIterator, error)
}

GraphAPI defines a set of API methods for accessing the graph.
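To illustrate how the four methods might fit together, here is a rough in-memory sketch. `UUID`, `Link`, `Edge` and `memGraph` are hypothetical stand-ins for `uuid.UUID`, `graph.Link`, `graph.Edge` and a real graph store, and `Links` returns a slice instead of a `graph.LinkIterator`. The semantics assumed here (an ID range of `[fromID, toID)` and cutoffs compared with "before") are inferred from the parameter names, not confirmed by this page.

```go
package main

import (
	"bytes"
	"fmt"
	"time"
)

// UUID, Link and Edge are simplified stand-ins for uuid.UUID, graph.Link and
// graph.Edge. Their fields are assumptions made for this sketch.
type UUID [16]byte

type Link struct {
	ID          UUID
	URL         string
	RetrievedAt time.Time
}

type Edge struct {
	Src, Dst  UUID
	UpdatedAt time.Time
}

// memGraph is a hypothetical in-memory store. It is not safe for concurrent use.
type memGraph struct {
	links map[UUID]Link
	edges map[[2]UUID]Edge
}

func newMemGraph() *memGraph {
	return &memGraph{links: map[UUID]Link{}, edges: map[[2]UUID]Edge{}}
}

func (g *memGraph) UpsertLink(link *Link) error {
	g.links[link.ID] = *link
	return nil
}

func (g *memGraph) UpsertEdge(edge *Edge) error {
	g.edges[[2]UUID{edge.Src, edge.Dst}] = *edge
	return nil
}

// RemoveStaleEdges drops edges that start at fromID and were last updated
// before the cutoff (assumed semantics).
func (g *memGraph) RemoveStaleEdges(fromID UUID, updatedBefore time.Time) error {
	for key, e := range g.edges {
		if e.Src == fromID && e.UpdatedAt.Before(updatedBefore) {
			delete(g.edges, key)
		}
	}
	return nil
}

// Links returns links with fromID <= ID < toID that were retrieved before the
// cutoff (assumed semantics).
func (g *memGraph) Links(fromID, toID UUID, retrievedBefore time.Time) ([]Link, error) {
	var out []Link
	for id, l := range g.links {
		inRange := bytes.Compare(id[:], fromID[:]) >= 0 && bytes.Compare(id[:], toID[:]) < 0
		if inRange && l.RetrievedAt.Before(retrievedBefore) {
			out = append(out, l)
		}
	}
	return out, nil
}

// demo loads two links and two edges, then reports how many links are due for
// a crawl and how many edges survive the stale-edge cleanup.
func demo() (due, edges int) {
	base := time.Date(2022, time.October, 21, 0, 0, 0, 0, time.UTC)
	cutoff := base.Add(24 * time.Hour)
	a, b := UUID{1}, UUID{2}

	g := newMemGraph()
	g.UpsertLink(&Link{ID: a, URL: "https://example.com/a", RetrievedAt: base})
	g.UpsertLink(&Link{ID: b, URL: "https://example.com/b", RetrievedAt: base.Add(48 * time.Hour)})
	g.UpsertEdge(&Edge{Src: a, Dst: b, UpdatedAt: base})
	g.UpsertEdge(&Edge{Src: a, Dst: a, UpdatedAt: base.Add(48 * time.Hour)})

	links, _ := g.Links(UUID{}, UUID{0xff}, cutoff)
	g.RemoveStaleEdges(a, cutoff)
	return len(links), len(g.edges)
}

func main() {
	due, edges := demo()
	fmt.Println("links due:", due, "edges kept:", edges)
}
```

A crawler would plausibly derive the `retrievedBefore` cutoff from the current time minus Config.ReIndexThreshold, so that recently crawled links are skipped; that wiring is a guess rather than something this page states.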

type IndexAPI

type IndexAPI interface {
	Index(doc *indexer.Document) error
}

IndexAPI defines a set of API methods for indexing crawled documents.
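Because IndexAPI has a single method, a hand-written fake is a lightweight alternative to a generated mock in tests. The sketch below shows one possible shape. `Document` stands in for `indexer.Document`, whose fields are not shown on this page, so its fields, the `fakeIndex` type and the empty-URL check are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Document stands in for indexer.Document; the fields are assumptions.
type Document struct {
	URL     string
	Title   string
	Content string
}

// IndexAPI is redeclared locally with the stand-in Document type.
type IndexAPI interface {
	Index(doc *Document) error
}

// fakeIndex is a hypothetical test double that keeps the latest document
// seen for each URL.
type fakeIndex struct {
	mu   sync.Mutex
	docs map[string]Document
}

// Compile-time check that fakeIndex satisfies the local interface.
var _ IndexAPI = (*fakeIndex)(nil)

func (f *fakeIndex) Index(doc *Document) error {
	if doc == nil || doc.URL == "" {
		return errors.New("index: document has no URL")
	}
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.docs == nil {
		f.docs = make(map[string]Document)
	}
	f.docs[doc.URL] = *doc // re-indexing the same URL overwrites the old entry
	return nil
}

func main() {
	idx := &fakeIndex{}
	err := idx.Index(&Document{URL: "https://example.com", Title: "Example"})
	fmt.Println("indexed:", len(idx.docs), "err:", err)
}
```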

type Service

type Service struct {
	// contains filtered or unexported fields
}

func NewService

func NewService(cfg Config) (*Service, error)

func (*Service) Name

func (svc *Service) Name() string

func (*Service) Run

func (svc *Service) Run(ctx context.Context) error
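Run takes a context, and Config.UpdateInterval is described as the time between crawler passes, which hints at a periodic loop that stops when the context is cancelled. The sketch below shows that general pattern only. `runLoop`, `countPasses` and the choice to return nil on cancellation are assumptions; the actual body of Run is not shown on this page.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runLoop is a hypothetical stand-in for a Run-style method. It calls pass
// once per interval until ctx is cancelled or pass returns an error.
// interval must be positive, otherwise time.NewTicker panics.
func runLoop(ctx context.Context, interval time.Duration, pass func(context.Context) error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return nil // assumption: cancellation is a clean shutdown
		case <-ticker.C:
			// select picks randomly when both channels are ready, so
			// re-check the context before starting another pass.
			if ctx.Err() != nil {
				return nil
			}
			if err := pass(ctx); err != nil {
				return err
			}
		}
	}
}

// countPasses drives runLoop with a short interval and cancels the context
// from inside the n-th pass. It returns how many passes actually ran.
func countPasses(n int) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	done := 0
	err := runLoop(ctx, time.Millisecond, func(context.Context) error {
		done++
		if done == n {
			cancel()
		}
		return nil
	})
	return done, err
}

func main() {
	passes, err := countPasses(3)
	fmt.Println("passes:", passes, "err:", err)
}
```

In a real program the context would more likely come from signal handling or a parent service group than from the pass callback itself.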

Directories

Path Synopsis
Package mocks is a generated GoMock package.
