recrawl
A simple web crawler written in Go
☛ Installation
Requires Go and Git.
go install github.com/root4loot/recrawl@latest
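If your Go bin directory (typically $HOME/go/bin) is on your PATH, you can confirm the install by printing the version:
recrawl -version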
Usage
                                  _
 _ __ ___  ___ _ __ __ ___      _| |
| '__/ _ \/ __| '__/ _` \ \ /\ / / |
| | |  __/ (__| | | (_| |\ V  V /| |
|_|  \___|\___|_|  \__,_| \_/\_/ |_|
A simple web crawler by @danielantonsen
Arguments:
-domain     [string]  Domain to crawl
-whitelist  [string]  Domains to be whitelisted (comma separated)
-outfile    [string]  File to write results to
-useragent  [string]  Set user-agent
-threads    [int]     Number of threads (Default: 10)
-depth      [int]     Maximum depth to crawl (Default: 999)
-delay      [int]     Seconds to wait between requests to matching domains
-delay2     [int]     Maximum random delay, in seconds, added before each request (Default: 0)
-json                 Output results as JSON (used with -outfile)
-silent               Suppress console output
-async                Enable asynchronous network requests (Default: ON)
-version              Display version
-help                 Display help
Example
➜ recrawl -domain hackerone.com -threads 12 -delay 2
https://www.hackerone.com
https://www.hackerone.com/attack-resistance-assessment
https://www.hackerone.com/product/attack-surface-management
https://www.hackerone.com/6th-annual-hacker-powered-security-report
https://www.hackerone.com/events/rsa-conference-2023
https://www.hackerone.com/security-at-beyond
...
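The flags above compose. For instance, a run that restricts crawling to two whitelisted hosts and writes JSON results to a file might look like this (the host list and file name are illustrative):
➜ recrawl -domain hackerone.com -whitelist www.hackerone.com,example.com -outfile results.json -json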
As a library
package main

import (
	"fmt"

	crawl "github.com/root4loot/recrawl/pkg"
)

func main() {
	// Crawl target and whitelisted hosts (comma separated).
	domain := "hackerone.com"
	whitelist := "www.hackerone.com,example.com"

	// Tuning options mirroring the CLI flags above.
	threads := 10
	depth := 3
	delay := 0  // seconds between requests to matching domains
	delay2 := 0 // maximum random delay, in seconds, before each request
	useragent := ""
	async := true
	silent := true

	options := crawl.Options{
		Domain:    &domain,
		Whitelist: &whitelist,
		Threads:   &threads,
		Depth:     &depth,
		Delay:     &delay,
		Delay2:    &delay2,
		UserAgent: &useragent,
		Async:     &async,
		Silent:    &silent,
	}

	// Run the crawl and print the collected results.
	results := crawl.Go(options)
	fmt.Println(results)
}
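To run this outside the recrawl repository, save the snippet as main.go and use a standard Go module setup; the directory and module path here are placeholders:
mkdir crawl-demo && cd crawl-demo
go mod init example.com/crawl-demo
go get github.com/root4loot/recrawl@latest
go run main.go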