Documentation
Types
type Config
type Config struct {
	PrivateNetworkDetector PrivateNetworkDetector
	URLGetter              URLGetter
	Graph                  MiniGraph
	Indexer                MiniIndexer
	NumOfFetchWorkers      int
}
Config serves as a configuration object for the crawler.
type Crawler
type Crawler struct {
	// contains filtered or unexported fields
}
Crawler executes a web crawler pipeline.
type MiniGraph
type MiniGraph interface {
	// UpsertLink creates a new or updates an existing link.
	UpsertLink(link *graph.Link) error
	// UpsertEdge creates a new or updates an existing edge.
	UpsertEdge(edge *graph.Edge) error
	// RemoveStaleEdges removes any edge that originates from a specific link ID
	// and was updated before the specified [updatedBefore] time.
	RemoveStaleEdges(fromID uuid.UUID, updatedBefore time.Time) error
}
MiniGraph should be implemented by objects that can upsert links and edges into a link graph instance, i.e. graph updater objects.
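The following is a minimal in-memory sketch of the three MiniGraph operations, intended only to illustrate the upsert and stale-edge semantics described above. It is not the package's implementation. The types ID, Link and Edge (and their fields) are hypothetical stand-ins for uuid.UUID, graph.Link and graph.Edge, whose definitions are not shown on this page, so memGraph mirrors the method set but does not satisfy the real interface as written.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// ID, Link and Edge are hypothetical stand-ins for uuid.UUID, graph.Link and
// graph.Edge. Their fields are assumptions made for this sketch.
type ID string

type Link struct {
	ID  ID
	URL string
}

type Edge struct {
	Src, Dst  ID
	UpdatedAt time.Time
}

type edgeKey struct{ src, dst ID }

// memGraph keeps links and edges in maps guarded by a mutex, so that
// several goroutines can call it concurrently.
type memGraph struct {
	mu    sync.Mutex
	links map[ID]Link
	edges map[edgeKey]Edge
}

func newMemGraph() *memGraph {
	return &memGraph{links: map[ID]Link{}, edges: map[edgeKey]Edge{}}
}

// UpsertLink stores the link, replacing any existing link with the same ID.
func (g *memGraph) UpsertLink(link *Link) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.links[link.ID] = *link
	return nil
}

// UpsertEdge stores the edge, replacing any existing edge that has the same
// source and destination.
func (g *memGraph) UpsertEdge(edge *Edge) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.edges[edgeKey{edge.Src, edge.Dst}] = *edge
	return nil
}

// RemoveStaleEdges deletes edges that originate from fromID and were last
// updated strictly before updatedBefore.
func (g *memGraph) RemoveStaleEdges(fromID ID, updatedBefore time.Time) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	for k, e := range g.edges {
		if k.src == fromID && e.UpdatedAt.Before(updatedBefore) {
			delete(g.edges, k) // deleting while ranging over a map is safe in Go
		}
	}
	return nil
}

// remainingAfterRecrawl simulates re-crawling page "a": the a->b edge is seen
// again and refreshed, a->c is not, so only a->c is removed as stale.
func remainingAfterRecrawl() []string {
	g := newMemGraph()
	old := time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC)
	crawlStart := old.Add(24 * time.Hour)

	for _, id := range []ID{"a", "b", "c"} {
		g.UpsertLink(&Link{ID: id, URL: "https://example.com/" + string(id)})
	}
	g.UpsertEdge(&Edge{Src: "a", Dst: "b", UpdatedAt: old})
	g.UpsertEdge(&Edge{Src: "a", Dst: "c", UpdatedAt: old})
	g.UpsertEdge(&Edge{Src: "b", Dst: "c", UpdatedAt: old}) // other origin, left alone
	g.UpsertEdge(&Edge{Src: "a", Dst: "b", UpdatedAt: crawlStart.Add(time.Minute)})
	g.RemoveStaleEdges("a", crawlStart)

	var out []string
	for k := range g.edges {
		out = append(out, string(k.src)+"->"+string(k.dst))
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(remainingAfterRecrawl()) // prints [a->b b->c]
}
```

Keying edges by their (source, destination) pair is one way to make UpsertEdge idempotent: re-discovering an edge only refreshes its timestamp, which is what lets RemoveStaleEdges tell refreshed edges from vanished ones.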
type MiniIndexer
type MiniIndexer interface {
	// Index adds a new document or updates an existing index entry
	// in case of an existing document.
	Index(doc *index.Document) error
}
MiniIndexer should be implemented by objects that can index documents discovered by the crawler component, i.e. text indexer objects.
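As an illustration of the "add or update" behaviour that Index describes, here is a small in-memory sketch. The Document type and its fields are hypothetical stand-ins for index.Document, which is not defined on this page, and keying entries by LinkID is an assumption of this example rather than something the source states.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Document is a hypothetical stand-in for index.Document.
// The field set is an assumption made for this sketch.
type Document struct {
	LinkID  string
	URL     string
	Title   string
	Content string
}

// memIndexer stores documents in a map keyed by LinkID.
type memIndexer struct {
	mu   sync.RWMutex
	docs map[string]Document
}

func newMemIndexer() *memIndexer {
	return &memIndexer{docs: map[string]Document{}}
}

// Index inserts the document, or overwrites the existing entry when a
// document with the same LinkID has already been indexed.
func (ix *memIndexer) Index(doc *Document) error {
	if doc.LinkID == "" {
		return errors.New("index: document has no link ID")
	}
	ix.mu.Lock()
	defer ix.mu.Unlock()
	ix.docs[doc.LinkID] = *doc
	return nil
}

// indexTwice indexes the same page twice and reports how many entries exist
// afterwards and which title survived.
func indexTwice() (count int, title string, err error) {
	ix := newMemIndexer()
	if err = ix.Index(&Document{LinkID: "1", URL: "https://example.com", Title: "Old title"}); err != nil {
		return 0, "", err
	}
	if err = ix.Index(&Document{LinkID: "1", URL: "https://example.com", Title: "New title"}); err != nil {
		return 0, "", err
	}
	ix.mu.RLock()
	defer ix.mu.RUnlock()
	return len(ix.docs), ix.docs["1"].Title, nil
}

func main() {
	count, title, err := indexTwice()
	fmt.Println(count, title, err) // prints 1 New title <nil>
}
```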
type PrivateNetworkDetector
PrivateNetworkDetector should be implemented by objects that can detect whether a host resolves to a private network address.
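The interface declaration is not reproduced on this page, so the sketch below only illustrates the idea of checking whether a host resolves to a private address. The method name IsPrivate, its signature, and the helper staticLookup are hypothetical. The standard library calls used are net.LookupIP, net.ParseIP and the net.IP methods IsPrivate (Go 1.17 or later), IsLoopback, IsLinkLocalUnicast and IsUnspecified.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// lookupFunc has the same signature as net.LookupIP, so a fake resolver can
// be swapped in without touching the network.
type lookupFunc func(host string) ([]net.IP, error)

// detector is a hypothetical implementation; the real interface's method set
// is not shown in this documentation.
type detector struct{ lookup lookupFunc }

// newDetector returns a detector backed by the system resolver.
func newDetector() *detector { return &detector{lookup: net.LookupIP} }

// IsPrivate reports whether any address the host resolves to is private,
// loopback, link-local or unspecified. Checking every address (not just the
// first) guards against hosts that mix public and internal records.
func (d *detector) IsPrivate(host string) (bool, error) {
	ips, err := d.lookup(host)
	if err != nil {
		return false, fmt.Errorf("resolve %q: %w", host, err)
	}
	for _, ip := range ips {
		if ip.IsPrivate() || ip.IsLoopback() || ip.IsLinkLocalUnicast() || ip.IsUnspecified() {
			return true, nil
		}
	}
	return false, nil
}

// staticLookup is a hypothetical test helper: it returns a resolver that
// answers every query with the given literal addresses.
func staticLookup(addrs ...string) lookupFunc {
	return func(string) ([]net.IP, error) {
		ips := make([]net.IP, 0, len(addrs))
		for _, a := range addrs {
			if ip := net.ParseIP(a); ip != nil {
				ips = append(ips, ip)
			}
		}
		if len(ips) == 0 {
			return nil, errors.New("no such host")
		}
		return ips, nil
	}
}

func main() {
	cases := map[string][]string{
		"public.example": {"8.8.8.8"},
		"intern.example": {"192.168.0.10"},
		"mixed.example":  {"8.8.8.8", "10.0.0.5"},
	}
	for _, host := range []string{"public.example", "intern.example", "mixed.example"} {
		d := &detector{lookup: staticLookup(cases[host]...)}
		private, err := d.IsPrivate(host)
		fmt.Printf("%s private=%v err=%v\n", host, private, err)
	}
}
```

A crawler would typically run a check like this before fetching a URL, so that links pointing at internal services are skipped rather than requested.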