Documentation ¶
Overview ¶
Package inspect crawls websites and detects broken links, security issues, form problems, accessibility violations, and performance concerns.
It is designed as a standalone library imported by hawk — it has no CLI, no LLM dependency, and no TUI. Hawk wires inspect into its own commands.
Usage:
    report, err := inspect.Scan(ctx, "https://example.com", inspect.Standard)
    if err != nil {
        log.Fatal(err)
    }
    for _, f := range report.Findings {
        fmt.Printf("[%s] %s: %s\n", f.Severity, f.URL, f.Message)
    }
For high-throughput or repeated scans, use the reusable Scanner:
    scanner := inspect.NewScanner(inspect.Standard)
    r1, _ := scanner.Scan(ctx, "https://site-a.com")
    r2, _ := scanner.Scan(ctx, "https://site-b.com")
Index ¶
- func ClearCustomChecks()
- func RegisterCheck(c Checker)
- func RegisterRule(rule RuleCheck)
- type AXNode
- type AxeNode
- type AxeViolation
- type BrowserEngine
- type BrowserOpts
- type Checker
- type FileConfig
- type Finding
- type NetworkEntry
- type Option
- func LoadConfig(dir string) ([]Option, error)
- func WithAcceptedStatusCodes(codes ...int) Option
- func WithAuth(header, value string) Option
- func WithBlockPrivateIPs() Option
- func WithBrowser(engine BrowserEngine) Option
- func WithChecks(checks ...string) Option
- func WithConcurrency(n int) Option
- func WithCookieJar(jar http.CookieJar) Option
- func WithDepth(n int) Option
- func WithExclude(patterns ...string) Option
- func WithFailOn(sev Severity) Option
- func WithFollowRedirects(max int) Option
- func WithLogger(l *slog.Logger) Option
- func WithPageTimeout(d time.Duration) Option
- func WithRateLimit(reqPerSec int) Option
- func WithRespectRobots(enabled bool) Option
- func WithTimeout(d time.Duration) Option
- func WithUserAgent(ua string) Option
- type Page
- type PageData
- type PageLink
- type Report
- type RuleCheck
- type Scanner
- type Severity
- type Stats
- type Viewport
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ClearCustomChecks ¶
func ClearCustomChecks()
ClearCustomChecks removes all registered custom checks and rules. Useful in tests.
func RegisterCheck ¶
func RegisterCheck(c Checker)
RegisterCheck registers a custom check that will run alongside built-in checks. Call this before Scan() to include custom logic.
func RegisterRule ¶
func RegisterRule(rule RuleCheck)
RegisterRule registers a declarative rule-based check. Rules are simpler than full Checker implementations — just pattern matching.
Types ¶
type AXNode ¶ added in v0.2.0
type AXNode struct {
Role string
Name string
Description string
Value string
Properties map[string]string
Children []AXNode
Ignored bool
}
AXNode represents a node in the computed accessibility tree.
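Because Children nests AXNode values, most analyses over the tree are recursive. The following is a minimal sketch, not part of the package: it copies the struct locally so it compiles on its own, and `unnamedByRole` is a hypothetical helper that collects nodes of a given role with an empty accessible name.

```go
package main

import "fmt"

// AXNode is a local copy of the documented struct.
type AXNode struct {
	Role        string
	Name        string
	Description string
	Value       string
	Properties  map[string]string
	Children    []AXNode
	Ignored     bool
}

// unnamedByRole (hypothetical helper) walks the tree depth-first and returns
// every node that has the given role, is not ignored, and has an empty Name.
// Children of ignored nodes are still visited.
func unnamedByRole(nodes []AXNode, role string) []AXNode {
	var out []AXNode
	for _, n := range nodes {
		if !n.Ignored && n.Role == role && n.Name == "" {
			out = append(out, n)
		}
		out = append(out, unnamedByRole(n.Children, role)...)
	}
	return out
}

func main() {
	tree := []AXNode{{
		Role: "main",
		Children: []AXNode{
			{Role: "button", Name: "Submit"},
			{Role: "button"},                // unnamed: reported
			{Role: "button", Ignored: true}, // ignored: skipped
		},
	}}
	fmt.Println(len(unnamedByRole(tree, "button"))) // prints 1
}
```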
type AxeViolation ¶ added in v0.2.0
type AxeViolation struct {
ID string // rule ID (e.g., "color-contrast")
Impact string // "critical", "serious", "moderate", "minor"
Description string
Help string
HelpURL string
Nodes []AxeNode
}
AxeViolation represents an axe-core accessibility violation.
type BrowserEngine ¶ added in v0.2.0
type BrowserEngine interface {
RenderPage(ctx context.Context, url string, opts BrowserOpts) (*PageData, error)
Close() error
}
BrowserEngine is the interface for optional browser-based page analysis. The core inspect package never imports rod — consumers provide an implementation via the inspect/browser sub-module.
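Since the engine is an interface, a test can substitute an in-memory fake for a real browser. The sketch below is an illustration under stated assumptions: `BrowserOpts` and `PageData` are trimmed local stand-ins for the documented structs, and `fakeEngine` and `renderOnce` are hypothetical names that do not exist in the package.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Trimmed local stand-ins for the documented types (only the fields used here).
type BrowserOpts struct {
	WaitFor   string
	InjectAxe bool
}

type PageData struct {
	FinalURL     string
	RenderedHTML string
}

// BrowserEngine repeats the interface shown above.
type BrowserEngine interface {
	RenderPage(ctx context.Context, url string, opts BrowserOpts) (*PageData, error)
	Close() error
}

// fakeEngine (hypothetical) serves canned HTML keyed by URL.
type fakeEngine struct {
	pages  map[string]string
	closed bool
}

func (f *fakeEngine) RenderPage(ctx context.Context, url string, opts BrowserOpts) (*PageData, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // respect cancellation, as a real renderer should
	}
	if f.closed {
		return nil, errors.New("engine closed")
	}
	html, ok := f.pages[url]
	if !ok {
		return nil, fmt.Errorf("no canned page for %s", url)
	}
	return &PageData{FinalURL: url, RenderedHTML: html}, nil
}

func (f *fakeEngine) Close() error {
	f.closed = true
	return nil
}

// Compile-time check that fakeEngine satisfies the interface.
var _ BrowserEngine = (*fakeEngine)(nil)

// renderOnce (hypothetical) renders one URL with default options.
func renderOnce(e BrowserEngine, url string) (string, error) {
	data, err := e.RenderPage(context.Background(), url, BrowserOpts{})
	if err != nil {
		return "", err
	}
	return data.RenderedHTML, nil
}

func main() {
	engine := &fakeEngine{pages: map[string]string{
		"https://example.com": "<h1>Hello</h1>",
	}}
	defer engine.Close()

	html, err := renderOnce(engine, "https://example.com")
	fmt.Println(html, err)
}
```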
type BrowserOpts ¶ added in v0.2.0
type BrowserOpts struct {
Viewport Viewport
WaitFor string // CSS selector to wait for before analysis
Timeout time.Duration
InjectAxe bool // inject axe-core and return accessibility violations
Screenshot bool // capture full-page screenshot
UserAgent string
}
BrowserOpts configures a single browser page render.
type Checker ¶
Checker is the public interface for custom checks. Implement this to add your own audit logic and register it with RegisterCheck.
type FileConfig ¶
type FileConfig struct {
Depth int `json:"depth"`
Checks []string `json:"checks"`
Exclude []string `json:"exclude"`
FailOn string `json:"fail_on"`
Concurrency int `json:"concurrency"`
RateLimit int `json:"rate_limit"`
Timeout string `json:"timeout"`
PageTimeout string `json:"page_timeout"`
UserAgent string `json:"user_agent"`
AuthHeader string `json:"auth_header"`
AuthValue string `json:"auth_value"`
AcceptedStatusCodes []int `json:"accepted_status_codes"`
}
FileConfig represents the contents of an .inspect.toml configuration file.
type Finding ¶
type Finding struct {
Check string `json:"check"`
Severity Severity `json:"severity"`
URL string `json:"url"`
Element string `json:"element,omitempty"`
Message string `json:"message"`
Fix string `json:"fix,omitempty"`
Evidence string `json:"evidence,omitempty"`
}
Finding represents a single issue detected during a scan.
type NetworkEntry ¶ added in v0.2.0
type NetworkEntry struct {
URL string
Method string
Status int
MimeType string
Size int64
Duration time.Duration
Failed bool
FailReason string
}
NetworkEntry represents a network request made during page load.
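A network log is usually consumed as an aggregate. The sketch below copies the struct locally and adds a hypothetical `summarize` helper. Treating any status of 400 or above as a failure is an assumption of this example, not documented package behavior.

```go
package main

import (
	"fmt"
	"time"
)

// NetworkEntry is a local copy of the documented struct.
type NetworkEntry struct {
	URL        string
	Method     string
	Status     int
	MimeType   string
	Size       int64
	Duration   time.Duration
	Failed     bool
	FailReason string
}

// netSummary (hypothetical) aggregates one page's network log.
type netSummary struct {
	Requests int
	Failed   int
	Bytes    int64
	Slowest  NetworkEntry
}

// summarize counts requests, failures, and transferred bytes, and keeps the
// slowest entry. Assumption: a status >= 400 counts as failed even when the
// Failed flag is not set.
func summarize(log []NetworkEntry) netSummary {
	var s netSummary
	for _, e := range log {
		s.Requests++
		s.Bytes += e.Size
		if e.Failed || e.Status >= 400 {
			s.Failed++
		}
		if e.Duration > s.Slowest.Duration {
			s.Slowest = e
		}
	}
	return s
}

func main() {
	log := []NetworkEntry{
		{URL: "/app.js", Status: 200, Size: 4096, Duration: 120 * time.Millisecond},
		{URL: "/logo.png", Status: 404, Size: 0, Duration: 30 * time.Millisecond},
		{URL: "/api", Failed: true, FailReason: "timeout", Duration: 5 * time.Second},
	}
	s := summarize(log)
	fmt.Printf("%d requests, %d failed, %d bytes, slowest %s\n",
		s.Requests, s.Failed, s.Bytes, s.Slowest.URL)
}
```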
type Option ¶
type Option interface {
// contains filtered or unexported methods
}
Option configures a scan operation.
var CI Option = optFunc(func(c *config) {
	c.depth = 5
	c.checks = []string{"links", "security", "forms", "a11y", "perf", "seo"}
	c.concurrency = 10
	c.failOn = SeverityHigh
})
CI configures a scan for continuous integration: standard checks, fail on high, JSON output.
var Deep Option = optFunc(func(c *config) {
	c.depth = 0
	c.checks = []string{"links", "security", "forms", "a11y", "perf", "seo"}
	c.concurrency = 20
})
Deep performs an exhaustive crawl with no depth limit.
var Quick Option = optFunc(func(c *config) {
	c.depth = 2
	c.checks = []string{"links"}
	c.concurrency = 5
})
Quick performs a shallow crawl checking only broken links.
Security limits the scan to security-related checks.
var Standard Option = optFunc(func(c *config) {
	c.depth = 5
	c.checks = []string{"links", "security", "forms", "a11y", "perf", "seo"}
	c.concurrency = 10
})
Standard performs a balanced crawl with all checks enabled.
func LoadConfig ¶
func LoadConfig(dir string) ([]Option, error)
LoadConfig reads .inspect.toml or .inspect.yaml from the given directory (searching upward to parent directories). Returns nil options and nil error if no config file is found. Returns an error only on malformed files.
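The upward search can be pictured as a loop that checks a directory and then moves to its parent until it reaches the filesystem root. This is a sketch of that general technique, not the package's code: `findUp` and `onDisk` are hypothetical helpers, and trying `.inspect.toml` before `.inspect.yaml` is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findUp (hypothetical) looks for any of names in dir, then in each parent
// directory, stopping once the path can no longer be shortened. The exists
// callback is injected so the walk can be exercised without touching the disk.
func findUp(dir string, names []string, exists func(string) bool) (string, bool) {
	dir = filepath.Clean(dir)
	for {
		for _, name := range names {
			p := filepath.Join(dir, name)
			if exists(p) {
				return p, true
			}
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			return "", false // reached the top without a match
		}
		dir = parent
	}
}

// onDisk reports whether p is an existing regular file or other non-directory.
func onDisk(p string) bool {
	info, err := os.Stat(p)
	return err == nil && !info.IsDir()
}

func main() {
	wd, err := os.Getwd()
	if err != nil {
		fmt.Println("getwd:", err)
		return
	}
	names := []string{".inspect.toml", ".inspect.yaml"}
	if p, ok := findUp(wd, names, onDisk); ok {
		fmt.Println("found", p)
	} else {
		fmt.Println("no config file found; a caller would fall back to defaults")
	}
}
```

The "not found" branch corresponds to the documented case where LoadConfig returns nil options and a nil error.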
func WithAcceptedStatusCodes ¶
func WithAcceptedStatusCodes(codes ...int) Option
WithAcceptedStatusCodes sets the HTTP status codes considered acceptable by the link checker. By default (when no codes are specified), status codes 200-399 are accepted.
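The default described above amounts to a range test, with an explicit list taking over once codes are supplied. The sketch below assumes that an explicit list replaces the default range rather than extending it. `isAccepted` is a hypothetical helper, not a package function.

```go
package main

import "fmt"

// isAccepted (hypothetical) reports whether status passes the link check.
// With no explicit codes it applies the documented default of 200-399.
// Assumption: explicit codes replace the default instead of adding to it.
func isAccepted(status int, codes ...int) bool {
	if len(codes) == 0 {
		return status >= 200 && status <= 399
	}
	for _, c := range codes {
		if c == status {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isAccepted(301))           // true: inside the default range
	fmt.Println(isAccepted(404))           // false: outside the default range
	fmt.Println(isAccepted(403, 200, 403)) // true: explicitly listed
	fmt.Println(isAccepted(301, 200, 403)) // false: not in the explicit list
}
```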
func WithBlockPrivateIPs ¶ added in v0.5.0
func WithBlockPrivateIPs() Option
WithBlockPrivateIPs enables SSRF protection that blocks requests to private IP ranges. By default, private IPs are allowed since inspect is typically used to scan your own servers.
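A guard of this kind typically classifies the destination address before connecting. The sketch below uses only the standard library's `net.IP` predicates. Which ranges the package actually blocks is not documented here, so the set chosen (private, loopback, link-local, unspecified) is an assumption, and `isBlockedIP` and `isBlockedLiteral` are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
)

// isBlockedIP (hypothetical) reports whether ip falls in a range that an SSRF
// guard would commonly refuse. The exact set is an assumption of this sketch.
func isBlockedIP(ip net.IP) bool {
	return ip.IsPrivate() || // 10/8, 172.16/12, 192.168/16, fc00::/7
		ip.IsLoopback() || // 127/8, ::1
		ip.IsLinkLocalUnicast() || // 169.254/16, fe80::/10
		ip.IsLinkLocalMulticast() ||
		ip.IsUnspecified() // 0.0.0.0, ::
}

// isBlockedLiteral handles IP literals only. A hostname returns false here;
// a real guard would resolve it and check every resulting address.
func isBlockedLiteral(host string) bool {
	ip := net.ParseIP(host)
	if ip == nil {
		return false
	}
	return isBlockedIP(ip)
}

func main() {
	for _, h := range []string{"10.0.0.5", "169.254.169.254", "::1", "8.8.8.8"} {
		fmt.Printf("%-16s blocked=%v\n", h, isBlockedLiteral(h))
	}
}
```

Checking the literal alone is not sufficient in practice: a public hostname can resolve to a private address, so the check belongs at connection time as well.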
func WithBrowser ¶ added in v0.2.0
func WithBrowser(engine BrowserEngine) Option
WithBrowser sets a BrowserEngine for browser-rendered page analysis. When set, the scanner uses the engine to render pages with full JavaScript execution instead of relying solely on raw HTTP fetches for HTML analysis. The browser sub-module (github.com/GrayCodeAI/inspect/browser) provides a rod-based implementation.
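func WithChecks ¶
func WithChecks(checks ...string) Option
func WithConcurrency ¶
func WithConcurrency(n int) Option
func WithCookieJar ¶
func WithCookieJar(jar http.CookieJar) Option
func WithExclude ¶
func WithExclude(patterns ...string) Option
func WithFailOn ¶
func WithFailOn(sev Severity) Option
func WithFollowRedirects ¶
func WithFollowRedirects(max int) Option
func WithLogger ¶
func WithLogger(l *slog.Logger) Option
func WithPageTimeout ¶
func WithPageTimeout(d time.Duration) Option
func WithRateLimit ¶
func WithRateLimit(reqPerSec int) Option
func WithRespectRobots ¶
func WithRespectRobots(enabled bool) Option
func WithTimeout ¶
func WithTimeout(d time.Duration) Option
func WithUserAgent ¶
func WithUserAgent(ua string) Option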
type Page ¶
type Page struct {
URL string
StatusCode int
Headers map[string]string
Body []byte
Links []PageLink
Depth int
ParentURL string
}
Page is the public representation of a crawled page, exposed to custom checks.
type PageData ¶ added in v0.2.0
type PageData struct {
FinalURL string
Title string
RenderedHTML string
AccessTree []AXNode
AxeViolations []AxeViolation
ConsoleErrors []string
NetworkLog []NetworkEntry
Screenshot []byte
LoadTime time.Duration
}
PageData holds results from browser-rendered page analysis.
type Report ¶
type Report struct {
Target string `json:"target"`
Findings []Finding `json:"findings"`
Stats Stats `json:"stats"`
CrawledURLs int `json:"crawled_urls"`
Duration time.Duration `json:"duration"`
FailOn Severity `json:"fail_on"`
}
Report is the complete result of a scan operation.
func Scan ¶
Scan crawls the target URL and runs all configured checks. This is the primary entry point for one-off scans.
func (*Report) Failed ¶
Failed returns true if any finding meets or exceeds the configured fail threshold.
func (*Report) MaxSeverity ¶
MaxSeverity returns the highest severity found in the report.
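Both methods reduce to a comparison over the findings' severities. The sketch below shows one way that logic could look, using trimmed local types. Only `SeverityHigh` appears in this documentation, so the other constant names and their ordering are assumptions, and `maxSeverity` and `failed` are hypothetical free functions rather than the package's methods.

```go
package main

import "fmt"

type Severity int

// Only SeverityHigh is named in the documentation; the remaining names and
// the low-to-high ordering are assumptions made for this sketch.
const (
	SeverityInfo Severity = iota
	SeverityLow
	SeverityMedium
	SeverityHigh
	SeverityCritical
)

// Trimmed stand-ins for Finding and Report.
type Finding struct {
	Check    string
	Severity Severity
}

type Report struct {
	Findings []Finding
	FailOn   Severity
}

// maxSeverity returns the highest severity among the findings, or
// SeverityInfo when there are none.
func maxSeverity(r *Report) Severity {
	highest := SeverityInfo
	for _, f := range r.Findings {
		if f.Severity > highest {
			highest = f.Severity
		}
	}
	return highest
}

// failed reports whether any finding meets or exceeds the FailOn threshold.
func failed(r *Report) bool {
	for _, f := range r.Findings {
		if f.Severity >= r.FailOn {
			return true
		}
	}
	return false
}

func main() {
	r := &Report{
		FailOn: SeverityHigh,
		Findings: []Finding{
			{Check: "links", Severity: SeverityLow},
			{Check: "security", Severity: SeverityMedium},
		},
	}
	fmt.Println(maxSeverity(r) == SeverityMedium, failed(r)) // true false
}
```

In a CI job, the result of Failed would typically decide the process exit code.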
type RuleCheck ¶
type RuleCheck struct {
RuleName string
RuleSeverity Severity
Description string
// Match conditions (any match triggers a finding)
HeaderMatch map[string]string // header name → regex pattern (match = issue)
HeaderMissing []string // headers that MUST be present
BodyMatch []string // regex patterns in body (match = issue)
BodyMissing []string // regex patterns that MUST be present
URLMatch string // only apply to URLs matching this regex
StatusCodes []int // only apply to these status codes (empty = all)
FixSuggestion string
}
RuleCheck defines a declarative check via patterns — no Go code required. This is the equivalent of Nuclei templates but for site auditing.
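To make the field semantics concrete, the sketch below evaluates a subset of these conditions against a page. It is an illustration, not the package's evaluator: `rule` and `page` are trimmed local stand-ins, `evaluate` is a hypothetical function, and case-insensitive header lookup is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// page is a trimmed stand-in for the documented Page type.
type page struct {
	URL        string
	StatusCode int
	Headers    map[string]string
	Body       []byte
}

// rule carries a subset of the RuleCheck fields.
type rule struct {
	RuleName      string
	HeaderMissing []string
	BodyMatch     []string
	URLMatch      string
	StatusCodes   []int
}

// hasHeader looks a header up case-insensitively (an assumption).
func hasHeader(h map[string]string, name string) bool {
	for k := range h {
		if strings.EqualFold(k, name) {
			return true
		}
	}
	return false
}

// applies checks the scoping fields: URLMatch and StatusCodes (empty = all).
func applies(r rule, p page) (bool, error) {
	if r.URLMatch != "" {
		ok, err := regexp.MatchString(r.URLMatch, p.URL)
		if err != nil || !ok {
			return false, err
		}
	}
	if len(r.StatusCodes) == 0 {
		return true, nil
	}
	for _, c := range r.StatusCodes {
		if c == p.StatusCode {
			return true, nil
		}
	}
	return false, nil
}

// evaluate returns one message per triggered condition.
func evaluate(r rule, p page) ([]string, error) {
	ok, err := applies(r, p)
	if err != nil || !ok {
		return nil, err
	}
	var issues []string
	for _, name := range r.HeaderMissing {
		if !hasHeader(p.Headers, name) {
			issues = append(issues, "missing header: "+name)
		}
	}
	for _, pat := range r.BodyMatch {
		matched, err := regexp.Match(pat, p.Body)
		if err != nil {
			return nil, fmt.Errorf("rule %q: %w", r.RuleName, err)
		}
		if matched {
			issues = append(issues, "body matches: "+pat)
		}
	}
	return issues, nil
}

func main() {
	r := rule{
		RuleName:      "hsts-and-debug",
		HeaderMissing: []string{"Strict-Transport-Security"},
		BodyMatch:     []string{`(?i)debug\s*=\s*true`},
		URLMatch:      `^https://`,
	}
	p := page{
		URL:        "https://example.com/",
		StatusCode: 200,
		Headers:    map[string]string{"content-type": "text/html"},
		Body:       []byte("<script>var DEBUG = true</script>"),
	}
	issues, err := evaluate(r, p)
	fmt.Println(issues, err)
}
```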
type Scanner ¶
type Scanner struct {
// contains filtered or unexported fields
}
Scanner is a reusable site auditor. Create one with NewScanner and call Scan multiple times. It is safe for concurrent use.
func NewScanner ¶
NewScanner creates a configured Scanner. Apply presets and options:
s := inspect.NewScanner(inspect.Standard, inspect.WithDepth(3))
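Presets and options compose through the functional-options pattern that the preset declarations use (`optFunc` over an unexported `config`). The sketch below rebuilds that pattern locally. The method name `apply`, the `build` helper, and the rule that later options override earlier ones are assumptions, since the interface's methods are unexported.

```go
package main

import "fmt"

// config is a trimmed stand-in for the package's unexported config.
type config struct {
	depth       int
	checks      []string
	concurrency int
}

// Option mirrors the documented interface; the method name is an assumption.
type Option interface {
	apply(*config)
}

// optFunc adapts a plain function to Option, as in the preset declarations.
type optFunc func(*config)

func (f optFunc) apply(c *config) { f(c) }

// Standard repeats the documented preset values.
var Standard Option = optFunc(func(c *config) {
	c.depth = 5
	c.checks = []string{"links", "security", "forms", "a11y", "perf", "seo"}
	c.concurrency = 10
})

// WithDepth overrides only the crawl depth.
func WithDepth(n int) Option {
	return optFunc(func(c *config) { c.depth = n })
}

// build (hypothetical) applies options in order, so later ones win.
func build(opts ...Option) config {
	var c config
	for _, o := range opts {
		o.apply(&c)
	}
	return c
}

func main() {
	c := build(Standard, WithDepth(3))
	fmt.Println(c.depth, c.concurrency, len(c.checks)) // prints 3 10 6
}
```

Under this ordering assumption, placing the preset first and the overrides after it gives the intended result; reversing them would let the preset overwrite the depth.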
type Severity ¶
type Severity int
Severity represents the impact level of a finding.
func ParseSeverity ¶
ParseSeverity converts a string to a Severity.
type Stats ¶
type Stats struct {
PagesScanned int `json:"pages_scanned"`
FindingsTotal int `json:"findings_total"`
BySeverity map[Severity]int `json:"by_severity"`
ByCheck map[string]int `json:"by_check"`
DurationPerCheck map[string]time.Duration `json:"duration_per_check"`
}
Stats provides scan metrics, broken down by severity and check type.
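The two breakdown maps can be derived from a findings slice in one pass. The sketch below shows that tally with trimmed local types. `tally` is a hypothetical helper, and the numeric severity values are placeholders.

```go
package main

import "fmt"

type Severity int

// Trimmed stand-ins for the documented Finding and Stats types.
type Finding struct {
	Check    string
	Severity Severity
}

type Stats struct {
	FindingsTotal int
	BySeverity    map[Severity]int
	ByCheck       map[string]int
}

// tally (hypothetical) counts findings overall, per severity, and per check.
func tally(findings []Finding) Stats {
	s := Stats{
		BySeverity: make(map[Severity]int),
		ByCheck:    make(map[string]int),
	}
	for _, f := range findings {
		s.FindingsTotal++
		s.BySeverity[f.Severity]++
		s.ByCheck[f.Check]++
	}
	return s
}

func main() {
	s := tally([]Finding{
		{Check: "links", Severity: 1},
		{Check: "links", Severity: 3},
		{Check: "security", Severity: 3},
	})
	fmt.Println(s.FindingsTotal, s.ByCheck["links"], s.BySeverity[3]) // prints 3 2 2
}
```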
Directories ¶
| Path | Synopsis |
|---|---|
| cmd | |
| cmd/inspect-ci (command) | Command inspect-ci is a lightweight CLI entrypoint for CI/CD pipelines and the GitHub Action. |
| internal | |
| internal/check | Package check implements the check registry and individual site audit checks. |
| internal/crawler | Package crawler implements a concurrent website crawler with rate limiting, depth control, URL deduplication, and robots.txt compliance. |
| internal/report | Package report provides formatting utilities for scan results. |