# worker-pool
This demo application shows how to implement a pool of workers that executes tasks concurrently.
## Overview
The application reads a list of domains (one domain per line) from standard input and reports the average response time and the average downloaded size of the index pages across all input domains.

Since the list of domains can be very large (or streamed) and its length is not known in advance, the processing has to be controllable: fetching the domains serially is too slow, while fetching them all at once does not scale. The application therefore uses a generic worker pool to bound concurrency, as sketched below.
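The actual pool implementation lives in the repository source; as a rough illustration, a bounded generic pool can be built from a task channel and a fixed number of goroutines. The `Run` signature below is an assumption for this sketch, not the repository's API:

```go
package pool

import "sync"

// Run starts n workers that consume tasks from in, apply work to each
// task, and send the results to the returned channel. The channel is
// closed once all tasks have been processed.
func Run[T, R any](n int, in <-chan T, work func(T) R) <-chan R {
	out := make(chan R)
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			for task := range in {
				out <- work(task)
			}
		}()
	}
	// Close the output channel once every worker has drained the input.
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}
```

Because tasks arrive over a channel, the input can be streamed (for example, line by line from standard input with a `bufio.Scanner`) without ever holding the whole domain list in memory.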
## Command line flags
```
-t int
      HTTP timeout, in seconds (default 10)
-w int
      Number of workers (default 10)
```
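This help output suggests the flags are defined with the standard `flag` package. A hypothetical sketch of the wiring (the variable names and the seconds interpretation of `-t` are assumptions):

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"time"
)

func main() {
	timeout := flag.Int("t", 10, "HTTP timeout")
	workers := flag.Int("w", 10, "Number of workers")
	flag.Parse()

	// The timeout presumably bounds each index-page download via a
	// shared HTTP client (assumed; see the repository source).
	client := &http.Client{Timeout: time.Duration(*timeout) * time.Second}
	_ = client

	fmt.Printf("processing started with %d workers\n", *workers)
}
```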
## Run unit tests
```
go test -race ./...
```
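The `-race` flag matters here because the pool is inherently concurrent. A minimal test against the hypothetical `Run` sketched above could look like this; the repository's real tests will differ:

```go
package pool

import (
	"sort"
	"testing"
)

// TestRun checks that every task is processed exactly once,
// regardless of the order in which workers finish.
func TestRun(t *testing.T) {
	in := make(chan int)
	go func() {
		defer close(in)
		for i := 1; i <= 100; i++ {
			in <- i
		}
	}()

	var got []int
	for r := range Run[int, int](4, in, func(n int) int { return n * 2 }) {
		got = append(got, r)
	}

	sort.Ints(got)
	if len(got) != 100 || got[0] != 2 || got[99] != 200 {
		t.Fatalf("unexpected results: len %d", len(got))
	}
}
```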
## Build the binary
```
go build
```
## Run manual tests
```
$ ./domain-crawler -t 10 -w 30 < top111.txt
processing started with 30 workers
success: https://github.com, size 309937, duration 280.099034ms
success: https://google.com, size 15075, duration 439.817687ms
...
downloaded 95 files, average 203507 bytes, 1.158814315s
```
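Each `success` line reports the size and elapsed time of one index-page download; the final line averages those measurements over the successful downloads. A fetch task producing such measurements might look like the following sketch (the `result` type and `fetch` helper are illustrative, not the repository's actual code):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// result holds the measurements printed for each domain.
type result struct {
	url      string
	size     int64
	duration time.Duration
	err      error
}

// fetch downloads one index page and records its size and elapsed time.
func fetch(client *http.Client, url string) result {
	start := time.Now()
	resp, err := client.Get(url)
	if err != nil {
		return result{url: url, err: err}
	}
	defer resp.Body.Close()

	// Count the body bytes without buffering them in memory.
	n, err := io.Copy(io.Discard, resp.Body)
	return result{url: url, size: n, duration: time.Since(start), err: err}
}

func main() {
	client := &http.Client{Timeout: 10 * time.Second}
	r := fetch(client, "https://example.com")
	if r.err != nil {
		fmt.Println("error:", r.url, r.err)
		return
	}
	fmt.Printf("success: %s, size %d, duration %s\n", r.url, r.size, r.duration)
}
```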