Documentation
Overview
Package httpsyet provides the configuration and execution for crawling a list of sites for links that can be updated to HTTPS.
Index
Constants
This section is empty.
Variables
This section is empty.
Functions
This section is empty.
Types
type Crawler
type Crawler struct {
Sites []string // At least one URL.
Out io.Writer // Required. Writes one detected site per line.
Log *log.Logger // Required. Errors are reported here.
Depth int // Optional. Limit depth. Set to >= 1.
Parallel int // Optional. Set how many sites to crawl in parallel.
Delay time.Duration // Optional. Set delay between crawls.
Get func(string) (*http.Response, error) // Optional. Defaults to http.Get.
Verbose bool // Optional. If set, status updates are written to logger.
}
Crawler is the configuration for Run; it is validated in Run().
type Site
Site represents what travels: a URL, which may have a Parent URL, and a Depth.
type Traffic
type Traffic struct {
Travel chan site // to be processed
*sync.WaitGroup // monitor SiteEnter & SiteLeave
}
Traffic is what goes around inside a circular site pipe network, e.g. a crawling Crawler. It is composed of Travel, a channel for those who travel in the traffic, and an embedded *sync.WaitGroup to keep track of congestion.