# ProxyAtlas

ProxyAtlas is a high-performance Go CLI toolkit for collecting and validating free proxies with a strict-first pipeline and adaptive fallback when strict results are empty.

## What It Is

ProxyAtlas has two binaries:

- `proxyharvest`: collects proxies from curated public sources, normalizes/dedupes them, scores sources, and runs fast prefilter checks.
- `proxycheck`: runs strict health/anonymity/stability validation, produces ranked diagnostics, and can auto-run adaptive fallback.

## Architecture

- Ingestion: `raw_text` and `json_api` source adapters.
- Normalization: canonical `scheme://host:port` with protocol validation.
- Source reliability: health tracking, fail-threshold gating, cooldown skips.
- Prefilter: fast multi-target checks (`>=1` pass by default).
- Strict check: multi-target + latency + anonymity + stability retries.
- Adaptive fallback: triggered when strict healthy count is zero (or forced with `--mode adaptive`).
- Output: JSONL + TXT + report JSON.
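
The normalization and dedupe steps above can be sketched as follows. This is a minimal illustration, not ProxyAtlas's actual code: the function names `normalizeProxy` and `dedupe`, the `defaultScheme` parameter, and the choice to lowercase hosts are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
	"strings"
)

// allowedSchemes mirrors the protocols accepted by --protocols in the Quick Start.
var allowedSchemes = map[string]bool{"http": true, "https": true, "socks4": true, "socks5": true}

// normalizeProxy (hypothetical helper) turns a raw entry such as
// "1.2.3.4:8080" or "SOCKS5://Host:1080" into canonical scheme://host:port.
// defaultScheme is applied when the raw entry carries no scheme.
func normalizeProxy(raw, defaultScheme string) (string, error) {
	s := strings.TrimSpace(raw)
	if !strings.Contains(s, "://") {
		s = defaultScheme + "://" + s
	}
	u, err := url.Parse(s)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", raw, err)
	}
	scheme := strings.ToLower(u.Scheme)
	if !allowedSchemes[scheme] {
		return "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	host := strings.ToLower(u.Hostname())
	port, err := strconv.Atoi(u.Port())
	if err != nil || host == "" || port < 1 || port > 65535 {
		return "", fmt.Errorf("invalid host or port in %q", raw)
	}
	return scheme + "://" + net.JoinHostPort(host, strconv.Itoa(port)), nil
}

// dedupe (hypothetical helper) normalizes every entry, drops invalid ones,
// and keeps the first occurrence of each canonical URL.
func dedupe(raws []string, defaultScheme string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, r := range raws {
		p, err := normalizeProxy(r, defaultScheme)
		if err != nil {
			continue
		}
		if _, dup := seen[p]; dup {
			continue
		}
		seen[p] = struct{}{}
		out = append(out, p)
	}
	return out
}

func main() {
	raws := []string{
		"192.0.2.10:8080",
		"HTTP://192.0.2.10:8080", // duplicate after normalization
		"socks5://Example.com:1080",
		"ftp://192.0.2.10:21", // rejected: unsupported scheme
		"192.0.2.10:99999",    // rejected: port out of range
	}
	for _, p := range dedupe(raws, "http") {
		fmt.Println(p)
	}
}
```

Canonicalizing before deduplication means the same proxy reported by several sources in different spellings collapses to one entry.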

## Install

```bash
go build -o bin/proxyharvest ./cmd/proxyharvest
go build -o bin/proxycheck ./cmd/proxycheck
```

## Quick Start

### PowerShell

```powershell
.\bin\proxyharvest.exe `
  --protocols http,https,socks4,socks5 `
  --max-collect 50000 `
  --max-per-protocol 15000 `
  --fetch-workers 64 `
  --source-timeout 12s `
  --source-fail-threshold 3 `
  --source-cooldown 12h `
  --prefilter-profile fast `
  --prefilter-min-pass 1 `
  --prefilter-timeout 1800ms `
  --prefilter-workers 200 `
  --sources-file configs/sources.json `
  --targets-file configs/targets.json `
  --out-jsonl data/harvest/latest.jsonl `
  --out-txt data/harvest/latest.txt

.\bin\proxycheck.exe `
  --mode strict `
  --in-jsonl data/harvest/latest.jsonl `
  --in-txt data/harvest/latest.txt `
  --workers 300 `
  --max-eval 2000 `
  --connect-timeout 1800ms `
  --request-timeout 4500ms `
  --min-pass 2 `
  --max-latency 4500ms `
  --stability-retries 2 `
  --stability-gap 2s `
  --adaptive-min-pass 1 `
  --adaptive-max-latency 12s `
  --adaptive-no-stability `
  --targets-profile resilient `
  --targets-file configs/targets.json `
  --out-jsonl data/check/latest.jsonl `
  --out-txt data/check/healthy.txt `
  --out-report data/check/report.json `
  --adaptive-out-jsonl data/check/latest_adaptive.jsonl `
  --adaptive-out-txt data/check/healthy_adaptive.txt `
  --adaptive-out-report data/check/report_adaptive.json
```

### Linux/macOS

```bash
./bin/proxyharvest \
  --protocols http,https,socks4,socks5 \
  --max-collect 50000 \
  --max-per-protocol 15000 \
  --fetch-workers 64 \
  --source-timeout 12s \
  --source-fail-threshold 3 \
  --source-cooldown 12h \
  --prefilter-profile fast \
  --prefilter-min-pass 1 \
  --prefilter-timeout 1800ms \
  --prefilter-workers 200 \
  --sources-file configs/sources.json \
  --targets-file configs/targets.json \
  --out-jsonl data/harvest/latest.jsonl \
  --out-txt data/harvest/latest.txt

./bin/proxycheck \
  --mode strict \
  --in-jsonl data/harvest/latest.jsonl \
  --in-txt data/harvest/latest.txt \
  --workers 300 \
  --max-eval 2000 \
  --connect-timeout 1800ms \
  --request-timeout 4500ms \
  --min-pass 2 \
  --max-latency 4500ms \
  --stability-retries 2 \
  --stability-gap 2s \
  --adaptive-min-pass 1 \
  --adaptive-max-latency 12s \
  --adaptive-no-stability \
  --targets-profile resilient \
  --targets-file configs/targets.json \
  --out-jsonl data/check/latest.jsonl \
  --out-txt data/check/healthy.txt \
  --out-report data/check/report.json \
  --adaptive-out-jsonl data/check/latest_adaptive.jsonl \
  --adaptive-out-txt data/check/healthy_adaptive.txt \
  --adaptive-out-report data/check/report_adaptive.json
```

## Strict vs Adaptive

- Strict mode validates with stronger requirements (default trust-first profile).
- If the strict healthy count is zero, ProxyAtlas automatically runs the adaptive fallback. If strict finds at least one healthy proxy, the fallback is skipped.
- Adaptive outputs are written to separate files to avoid mixing confidence levels.

## Tuning

- More speed: lower `--max-eval`, reduce `--stability-retries`, reduce timeouts.
- More confidence: raise `--min-pass`, keep stability retries, lower `--max-latency`.
- Better source quality: keep `--source-fail-threshold` low and `--source-cooldown` high.

## Output Schema

### Harvest JSONL (`data/harvest/latest.jsonl`)

`proxy_url`, `scheme`, `host`, `port`, `sources[]`, `source_hits`, `source_score`, `prefilter_ok`, `prefilter_pass_count`, `prefilter_checks_total`, `prefilter_reason`, timestamps.

### Check JSONL (`data/check/latest.jsonl`)

`proxy_url`, `mode`, `attempt_rounds`, `pass_count`, `checks_total`, `avg_latency_ms`, `p95_latency_ms`, `success_targets[]`, `failed_targets[]`, `exit_ip`, `local_ip`, `anonymous`, `header_leaks[]`, `stable`, `score`, `tier`, `status`, `status_reason`, `rejection_stage`.

## Troubleshooting

| Symptom | Meaning | Action |
| --- | --- | --- |
| `healthy_count = 0` in strict report | Free proxies are mostly dead/unstable under strict rules | Check the adaptive report and tune strict thresholds |
| `status_reason = insufficient_pass` dominates | Proxies fail too many targets to reach the required pass count | Refresh sources more often; lower the strict threshold for fallback |
| `status_reason = target_errors` dominates | Connectivity problems or timeouts when reaching targets | Increase request/connect timeouts and verify the network path |
| Many sources skipped by cooldown | Source health gating is working | Wait for the cooldown to expire or adjust `--source-cooldown` |

## Benchmarks

Observed in local runs (Windows, Go 1.25):

- Harvest: 15k+ lines in roughly 20-60s, depending on source/API responsiveness.
- Strict check: 200 proxies typically complete in roughly 15-30s with bounded workers.

## Limitations Of Free Proxies

- Public free proxies are volatile and frequently dead.
- Regional routing and target rate limits can heavily affect pass rates.
- Zero strict healthy proxies is a normal outcome in many windows.

## Ethical / Legal Notice

Use this project only for authorized testing and lawful automation. You are responsible for compliance with local law, provider terms, and target system policies.

## Release Process

See `docs/release-checklist.md`.

Release artifacts are published as `proxyatlas-<os>-<arch>.tar.gz` with matching `proxyatlas-<os>-<arch>.sha256` files, plus an aggregate `checksums.txt`.

Validate downloaded release files before use (`sha256sum -c checksums.txt`).

## License

MIT