Documentation
Index ¶
- Variables
- func BrowserEval(expression string) (string, error)
- func BrowserFetch(targetURL string) (string, error)
- func CleanupStaleProxies()
- func FetchAllSources(sources []Source)
- func GetDocFromURL(url string, customHeaders ...http.Header) (string, error)
- func GetURLThroughProxyWithRetry(u string, timeout time.Duration, proxyAddr string, retry int, ...) ([]byte, error)
- func Init()
- func ReleaseBrowser()
- func StartLightpanda() error
- func StopLightpanda()
- func TestFetchSource(src Source) ([]proxyinabox.Proxy, error)
- func UpdateSourceAvailableCounts(proxies []proxyinabox.Proxy)
- func ValidateProxy(p proxyinabox.Proxy) (country string, delay int64, err error)
- func Verify()
- type BrowserSession
- type Source
- type SourceStatus
Constants ¶
This section is empty.
Variables ¶
var ValidateJobs chan proxyinabox.Proxy
Functions ¶
func BrowserEval ¶
func BrowserEval(expression string) (string, error)
BrowserEval evaluates a JavaScript expression on the page of the current session.
func BrowserFetch ¶
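func BrowserFetch(targetURL string) (string, error)
BrowserFetch starts a temporary lightpanda instance, navigates to the URL, waits for JavaScript rendering, and returns the HTML. It first launches the browser through a random proxy from the proxy pool (lightpanda --http_proxy); if navigation fails, it destroys the session and retries over a direct connection.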
func CleanupStaleProxies ¶ added in v0.8.0
func CleanupStaleProxies()
func FetchAllSources ¶
func FetchAllSources(sources []Source)
FetchAllSources starts a goroutine per source to continuously fetch proxies
func GetDocFromURL ¶
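func GetDocFromURL(url string, customHeaders ...http.Header) (string, error)
GetDocFromURL fetches a URL body as a string, optionally through a random proxy. It first tries a random proxy from the proxy pool; if the proxied fetch fails, it falls back to a direct connection and retries, to maximize the reachability of the source site.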
func GetURLThroughProxyWithRetry ¶
func GetURLThroughProxyWithRetry(u string, timeout time.Duration, proxyAddr string, retry int, customHeaders ...http.Header) ([]byte, error)
GetURLThroughProxyWithRetry fetches a URL through the given proxy with retry logic
func ReleaseBrowser ¶ added in v0.4.1
func ReleaseBrowser()
ReleaseBrowser stops the current lightpanda instance and releases all resources.
func StartLightpanda ¶ added in v0.9.14
func StartLightpanda() error
StartLightpanda is a compatibility pre-start interface for the test-source subcommand.
func StopLightpanda ¶ added in v0.9.14
func StopLightpanda()
StopLightpanda is a compatibility stop interface for the test-source subcommand and for signal handling.
func TestFetchSource ¶
func TestFetchSource(src Source) ([]proxyinabox.Proxy, error)
TestFetchSource performs a single fetch for testing purposes (does not send to ValidateJobs)
func UpdateSourceAvailableCounts ¶ added in v0.8.0
func UpdateSourceAvailableCounts(proxies []proxyinabox.Proxy)
UpdateSourceAvailableCounts updates each source's available-proxy count from a snapshot of the proxy pool.
func ValidateProxy ¶ added in v0.8.0
func ValidateProxy(p proxyinabox.Proxy) (country string, delay int64, err error)
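ValidateProxy verifies that a proxy is usable by requesting the Cloudflare trace endpoint through it, and returns the validation result. It does not depend on the DB or cache and performs network validation only; it is intended for the test-source command.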
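Cloudflare's trace endpoint (served under `/cdn-cgi/trace`) returns plain-text `key=value` lines, including a `loc` field with a country code. The sketch below shows one way the country could be extracted from such a body. How ValidateProxy actually parses the response is not documented here, so `parseTraceCountry` and the sample body are assumptions for illustration.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// parseTraceCountry (hypothetical) extracts the "loc" value from a
// key=value trace body, one pair per line.
func parseTraceCountry(body string) (string, error) {
	scanner := bufio.NewScanner(strings.NewReader(body))
	for scanner.Scan() {
		key, value, found := strings.Cut(scanner.Text(), "=")
		if found && key == "loc" && value != "" {
			return value, nil
		}
	}
	if err := scanner.Err(); err != nil {
		return "", err
	}
	return "", errors.New("trace body has no loc field")
}

func main() {
	// Illustrative sample body; real responses contain more fields.
	sample := "ip=203.0.113.7\nts=1700000000.000\nloc=DE\ntls=TLSv1.3\n"
	country, err := parseTraceCountry(sample)
	fmt.Println(country, err)
}
```

The delay return value could plausibly be measured by wrapping the request in `time.Now()` and `time.Since()`, though the unit used by ValidateProxy is not stated above.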
Types ¶
type BrowserSession ¶ added in v0.4.1
type BrowserSession struct {
// contains filtered or unexported fields
}
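BrowserSession manages the full lifecycle of a single browser fetch (lightpanda process + CDP connection). Each runScript call creates its own session and destroys it after use, to avoid resource leaks.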
type Source ¶
type Source struct {
    Name          string            `yaml:"name"`
    Type          string            `yaml:"type"` // text, json, script
    URL           string            `yaml:"url"`
    Protocol      string            `yaml:"protocol"`
    Headers       map[string]string `yaml:"headers"`
    Interval      string            `yaml:"interval"`
    IPField       string            `yaml:"ip_field"`
    PortField     string            `yaml:"port_field"`
    ProtocolField string            `yaml:"protocol_field"`
    Script        string            `yaml:"script"`
}
Source represents a YAML-driven proxy source configuration
func LoadSources ¶
LoadSources reads all .yaml files from the given directory and returns parsed sources
type SourceStatus ¶ added in v0.2.0
type SourceStatus struct {
    Name       string    `json:"name"`
    Type       string    `json:"type"`
    LastFetch  time.Time `json:"last_fetch"`
    ProxyCount int       `json:"proxy_count"`
    Error      string    `json:"error"`
    Interval   string    `json:"interval"`
    // AvailableCount is the number of proxies from this source that are currently validated and available in the proxy pool (counted live from the cache).
    AvailableCount int `json:"available_count"`
}
SourceStatus records the most recent fetch status of each proxy source, for display on the dashboard.
func GetSourceStatuses ¶ added in v0.2.0
func GetSourceStatuses() []SourceStatus
GetSourceStatuses returns a snapshot copy of all source statuses (thread-safe).
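A thread-safe snapshot copy of this kind is commonly built with a read-write mutex and a value copy of each entry. The sketch below illustrates that idea with a simplified, hypothetical `status` type and `statusStore`; the package's internal storage is not shown in this documentation and may be organized differently.

```go
package main

import (
	"fmt"
	"sync"
)

// status is a simplified, hypothetical stand-in for SourceStatus.
type status struct {
	Name       string
	ProxyCount int
}

// statusStore (hypothetical) guards per-source statuses with a RWMutex.
type statusStore struct {
	mu       sync.RWMutex
	statuses map[string]*status
}

// set records the latest proxy count for a source.
func (s *statusStore) set(name string, count int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.statuses == nil {
		s.statuses = make(map[string]*status)
	}
	s.statuses[name] = &status{Name: name, ProxyCount: count}
}

// snapshot returns value copies, so callers can read or modify the result
// without holding the lock and without affecting the store.
func (s *statusStore) snapshot() []status {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]status, 0, len(s.statuses))
	for _, st := range s.statuses {
		out = append(out, *st)
	}
	return out
}

func main() {
	store := &statusStore{}
	store.set("source-a", 12)

	snap := store.snapshot()
	snap[0].ProxyCount = 0 // Mutating the copy leaves the store untouched.

	fmt.Println(store.snapshot()[0].ProxyCount)
}
```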