gosurfer

Version: v0.5.3
Published: Apr 10, 2026
License: MIT

README


AI-powered browser automation and e2e testing in pure Go. Combines the AI agent of Browser Use with the testing ergonomics of Playwright — semantic locators, auto-retrying assertions, network mocking, device emulation, and auth state persistence. All via the Chrome DevTools Protocol.

No Python. No Node.js. One static binary.

// AI agent that completes tasks autonomously
agent, _ := gosurfer.NewAgent(gosurfer.AgentConfig{
    Task:    "Find the price of a mass produced mass driver on Alibaba",
    LLM:     gosurfer.NewOpenAI(os.Getenv("OPENAI_API_KEY"), "gpt-4o"),
    Stealth: true,
})
result, _ := agent.Run(ctx)
fmt.Println(result.Output)

// Or use it for e2e testing with Playwright-style locators and assertions
browser, _ := gosurfer.NewBrowser(gosurfer.BrowserConfig{Headless: true})
page, _ := browser.NewPage()
page.Navigate("https://example.com")

// Semantic locators — resilient to DOM changes
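// Locators return (element, error), so they cannot be chained directly
email, _ := page.GetByLabel("Email")
email.Type("user@test.com")
signIn, _ := page.GetByRole("button", gosurfer.Name("Sign In"))
signIn.Click()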

// Auto-retrying assertions — no flaky tests
expect := gosurfer.Expect(page)
expect.ToHaveURL("/dashboard")
expect.Locator("h1").ToHaveText("Welcome back")

Why gosurfer?

                        gosurfer                 Browser Use           Playwright
---------------------   ---------------------   -------------------   -----------------------
Language                Go                       Python                Node.js / Python / Java
Binary size             4 MB (UPX)               ~100 MB runtime       ~200 MB runtime
Docker image            ~945 MB                  ~2-3 GB               ~1.5-2 GB
Idle memory             ~530 MB                  ~800+ MB              ~700+ MB
Peak memory             ~1.1 GB                  ~2+ GB                ~1.5+ GB
LLM agent               Yes                      Yes                   No (separate layer)
Semantic locators       Yes (GetByRole, etc.)    No                    Yes
Auto-retry assertions   Yes (Expect API)         No                    Yes
Network mocking         Yes (MockJSON, etc.)     No                    Yes
Device emulation        Yes (7 presets)          No                    Yes
Auth state persist      Yes                      No                    Yes
CAPTCHA solving         Yes                      Yes (cloud)           No
Stealth mode            Yes (12 vectors)         Yes (cloud + local)   No
TOTP 2FA                Yes                      Yes                   No
Dependencies            1 (rod)                  ~50+ packages         ~30+ packages
Startup time            ~665 ms (container)      ~3-5 s                ~1-2 s

Memory Profile (Docker container benchmark)

Measured with go run ./examples/benchmark/ inside an Alpine container:

Stage                                  Go Heap     Go Sys     Chrome      Total
-----------------------------------   --------   --------   --------   --------
Baseline (before browser)                0.4 MB      6.4 MB      0.0 MB      6.4 MB
After browser launch                     4.2 MB     15.5 MB    517.1 MB    532.6 MB
After navigation (HN)                    2.4 MB     15.5 MB    570.4 MB    585.9 MB
After DOM extraction                     2.5 MB     15.7 MB    577.6 MB    593.3 MB
After heavy page (Wikipedia)             6.1 MB     15.7 MB    874.5 MB    890.2 MB
After full screenshot                   13.6 MB     40.0 MB   1078.8 MB   1118.8 MB
After GC                                 0.6 MB     40.0 MB    929.9 MB    969.8 MB

In this run the Go heap stays between 0.4 and 13.6 MB (at most 40 MB reserved from the OS). Chrome dominates, as it does in every browser automation tool.

Installation

go get github.com/dwoolworth/gosurfer@latest

Requires Chrome or Chromium. On first run, rod auto-downloads a compatible Chromium if none is found.

Features

AI Agent (Browser Use equivalent)

The agent takes a natural language task, launches a browser, and autonomously figures out how to complete it:

agent, err := gosurfer.NewAgent(gosurfer.AgentConfig{
    Task:      "Search for 'Go programming' and summarize the top 3 results",
    LLM:       gosurfer.NewAnthropic(apiKey, "claude-sonnet-4-20250514"),
    Headless:  true,
    Stealth:   true,
    UseVision: true,  // include screenshots in LLM context
    MaxSteps:  20,
    Verbose:   true,
    OnStep: func(info gosurfer.StepInfo) {
        fmt.Printf("[Step %d] %s -> %s\n", info.Step, info.Action, info.Result)
    },
})

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel()

result, err := agent.Run(ctx)
fmt.Printf("Success: %v\nOutput: %s\nSteps: %d\nTokens: %d\n",
    result.Success, result.Output, result.Steps, result.TotalTokens.TotalTokens)

21 Built-in Agent Actions

Action          Description
-------------   -----------
navigate        Go to a URL
click           Click element by index OR (x,y) viewport coordinates
type            Type text into inputs (with {{secret}} placeholder support)
scroll          Scroll page or specific element up/down
search          Web search via Google, DuckDuckGo, or Bing
go_back         Browser history back
wait            Pause 1-10 seconds
screenshot      Capture viewport
extract         Extract page content with a query
send_keys       Keyboard events (Enter, Escape, Tab)
select_option   Choose from dropdowns
switch_tab      Switch between browser tabs
close_tab       Close a tab
new_tab         Open URL in new tab
upload_file     Upload file to input
get_cookies     Retrieve all cookies for current page
set_cookie      Set a cookie (name, value, domain)
get_storage     Read localStorage values
set_storage     Write localStorage values
drag            Drag element to another element or coordinates
done            Signal task completion with result

LLM Providers

// Pick one provider. Each constructor is assigned to the LLMProvider
// interface that AgentConfig.LLM expects.
var llm gosurfer.LLMProvider

// OpenAI
llm = gosurfer.NewOpenAI("sk-...", "gpt-4o")

// Anthropic
llm = gosurfer.NewAnthropic("sk-ant-...", "claude-sonnet-4-20250514")

// Ollama (local)
llm = gosurfer.NewOllama("llama3.1")

// Any OpenAI-compatible API (vLLM, Together, etc.)
llm = gosurfer.NewOpenAICompatible("https://api.together.xyz/v1", "key", "model")

Semantic Locators (Playwright-style)

Find elements by their accessible role, text, label, or test ID — resilient to DOM changes:

// By ARIA role + accessible name
btn, _ := page.GetByRole("button", gosurfer.Name("Sign In"))
link, _ := page.GetByRole("link", gosurfer.Name("About"))

// By visible text
el, _ := page.GetByText("Welcome back")
el, _ = page.GetByText("Welcome", gosurfer.Exact()) // exact match

// By form label (<label for="..."> or aria-label)
input, _ := page.GetByLabel("Email Address")

// By placeholder
search, _ := page.GetByPlaceholder("Search...")

// By data-testid
form, _ := page.GetByTestID("login-form")

// By alt text
img, _ := page.GetByAltText("Company Logo")

Auto-Retrying Assertions (Expect API)

Playwright-inspired assertions that retry until they pass or time out (default 5s), which helps avoid flaky e2e tests:

expect := gosurfer.Expect(page)

// Page-level assertions
expect.ToHaveTitle("Dashboard")
expect.ToHaveURL("https://example.com/home")
expect.ToHaveTitleContaining("Dash")
expect.ToHaveURLContaining("/home")

// Element assertions (auto-retry)
expect.Locator("#status").ToBeVisible()
expect.Locator("#status").ToHaveText("Ready")
expect.Locator("#status").ToContainText("Read")
expect.Locator("#search").ToHaveValue("query")
expect.Locator("#search").ToHaveAttribute("placeholder", "Search...")
expect.Locator("#btn").ToBeEnabled()
expect.Locator("button[disabled]").ToBeDisabled()
expect.Locator("#modal").ToBeHidden()
expect.Locator("input[type=checkbox]").ToBeChecked()
expect.Locator("li.item").ToHaveCount(5)

// Negation
expect.Locator("#modal").Not().ToBeVisible()

// Custom timeout
expect = gosurfer.Expect(page, gosurfer.WithTimeout(10*time.Second))

Auth State Save/Restore

Save login state (cookies + localStorage) to a JSON file and restore it across sessions — skip login in every test:

// After logging in:
page.SaveStorageState("auth.json")

// In subsequent tests:
page, _ := browser.NewPage()
page.Navigate("https://example.com")
page.LoadStorageState("auth.json")
page.Reload() // now authenticated

// Or capture/restore programmatically:
state, _ := page.GetStorageState()
page2.RestoreStorageState(state)

DOM Extraction for LLMs

The key innovation from Browser Use, implemented in Go. DOMState() extracts the page into an indexed format that LLMs can reason about:

state, _ := page.DOMState()
fmt.Println(state.Tree)

Output:

[0]<a href="https://news.ycombinator.com" />
  [1]<img />
    [2]<a href="news">Hacker News</a>
  [3]<a href="newest">new</a>
  [4]<a href="front">past</a>
  [5]<input type="text" name="q" placeholder="Search..." />
  [6]<button type="submit">Search</button>
1.
  [7]<a href="https://example.com">First Story Title</a>
    (example.com)
  [8]<a href="vote?id=123">upvote</a>

Interactive elements get [index] tags. The LLM says {"action":"click","params":{"index":7}} and gosurfer clicks it. Non-interactive text provides context. Shadow DOM is pierced with |SHADOW| markers, iframes with |IFRAME|.

The DOMState struct also includes:

  • Element metadata (tag, attributes, bounding box, CSS selector)
  • Tab list (ID, URL, title for all open tabs)
  • Scroll position, page height, viewport height
  • Optional JPEG screenshot

Stealth Mode (Anti-Detection)

12 evasion vectors ported from puppeteer-extra-plugin-stealth:

browser, _ := gosurfer.NewBrowser(gosurfer.BrowserConfig{
    Headless: true,
    Stealth:  true,  // enables all evasions
})

What it patches:

  1. navigator.webdriver removed
  2. window.chrome runtime emulated
  3. chrome.loadTimes / chrome.csi added
  4. navigator.plugins populated (3 realistic plugins)
  5. navigator.languages set to [en-US, en]
  6. Permissions API fixed (notification quirk)
  7. Window outer dimensions matched to inner
  8. navigator.hardwareConcurrency set to 4
  9. navigator.deviceMemory set to 8GB
  10. WebGL vendor/renderer spoofed (Intel Iris)
  11. Media devices enumerated
  12. Function.prototype.toString patched to return [native code]

Plus Chrome launch flags: --disable-blink-features=AutomationControlled

CAPTCHA Detection and Solving

Detects reCAPTCHA v2/v3, hCaptcha, and Cloudflare Turnstile automatically:

// Detect
info, _ := page.DetectCAPTCHA()
// info.Type: "recaptcha_v2", "recaptcha_v3", "hcaptcha", "turnstile"
// info.SiteKey: extracted from page

// Pick one solver backend
var solver gosurfer.CAPTCHASolver

// 2Captcha
solver = gosurfer.NewTwoCaptchaSolver("your-2captcha-api-key")

// Or CapSolver
solver = gosurfer.NewCapSolver("your-capsolver-api-key")

// Or custom callback
solver = &gosurfer.ManualCAPTCHASolver{
    SolveFunc: func(ctx context.Context, info gosurfer.CAPTCHAInfo) (string, error) {
        // Your custom solving logic produces the token
        var token string
        return token, nil
    },
}

// Solve with the chosen backend
page.SolveCAPTCHA(ctx, solver)

In the agent, CAPTCHAs are solved automatically:

agent, _ := gosurfer.NewAgent(gosurfer.AgentConfig{
    Task:          "Login to example.com",
    LLM:           llm,
    CAPTCHASolver: gosurfer.NewTwoCaptchaSolver(apiKey),
})

TOTP 2FA Auto-Generation

Secret keys ending in _totp automatically generate fresh TOTP codes:

agent, _ := gosurfer.NewAgent(gosurfer.AgentConfig{
    Task: "Login to my account",
    LLM:  llm,
    Secrets: map[string]string{
        "username":  "admin",
        "password":  "s3cret",
        "mfa_totp":  "JBSWY3DPEHPK3PXP",  // base32 TOTP secret
    },
})
// When the agent types {{mfa_totp}}, a fresh 6-digit code is generated

Or use directly:

code, _ := gosurfer.GenerateTOTP("JBSWY3DPEHPK3PXP")
// "482913" (changes every 30 seconds)

Tab Management

The agent automatically detects new tabs and can switch between them:

// Tabs are listed in DOMState
state, _ := page.DOMState()
for _, tab := range state.Tabs {
    fmt.Printf("[%s] %s - %s\n", tab.ID, tab.Title, tab.URL)
}

// Agent actions: switch_tab, close_tab, new_tab
// LLM sees tab list and can navigate between them

New tabs opened by target="_blank" links are auto-detected and switched to.

Network Interception and Mocking

Mock API responses without hitting real servers — test your frontend against any backend scenario:

interceptor := page.Intercept()

// Mock a JSON API endpoint
interceptor.MockJSON(`*/api/users*`, 200, map[string]any{
    "users": []map[string]any{{"id": 1, "name": "Alice"}},
})

// Mock with custom status and headers
interceptor.MockText(`*/api/health*`, 503, `{"status":"down"}`,
    "Content-Type", "application/json")

// Full control: inspect request, return custom response
interceptor.OnRequest(`*/api/data*`, func(req *gosurfer.InterceptedRequest) {
    if req.Method() == "POST" {
        req.RespondJSON(201, map[string]any{"created": true})
    } else {
        req.Continue() // let GET requests through
    }
})

// Modify real responses (fetch then alter)
interceptor.OnRequest(`*/api/config*`, func(req *gosurfer.InterceptedRequest) {
    _ = req.LoadResponse()              // fetch real response
    body := req.ResponseBody()          // read it
    req.SetResponseBody(body + "extra") // modify it
})

// Block unwanted requests
interceptor.BlockPatterns(`*.ads.*`, `*tracker*`, `*analytics*`)

interceptor.Start()
defer interceptor.Stop()
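
The URL patterns above use * as a wildcard. The exact matching rules are not spelled out here, so the sketch below only shows one plausible reading, assuming * matches any run of characters and everything else is literal. globToRegexp is a hypothetical helper, not a gosurfer function.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a pattern where "*" matches any run of characters
// into an anchored regular expression. All other characters match literally.
// This is an assumed interpretation of the patterns, for illustration only.
func globToRegexp(pattern string) *regexp.Regexp {
	parts := strings.Split(pattern, "*")
	for i, p := range parts {
		parts[i] = regexp.QuoteMeta(p)
	}
	return regexp.MustCompile("^" + strings.Join(parts, ".*") + "$")
}

func main() {
	re := globToRegexp("*/api/users*")
	fmt.Println(re.MatchString("https://example.com/api/users?page=2"))
	fmt.Println(re.MatchString("https://example.com/api/orders"))
}
```

Under this reading, a pattern without a leading * would have to match from the very start of the URL, which is why the examples above begin with a wildcard.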

Device and Environment Emulation

Emulate mobile devices, geolocation, timezones, network conditions, and more:

// One-liner device emulation with presets
page.EmulateDevice(gosurfer.DeviceIPhoneX)
page.EmulateDevice(gosurfer.DevicePixel7)
page.EmulateDevice(gosurfer.DeviceIPadPro)
page.EmulateDevice(gosurfer.DeviceDesktop1080p)

// Or configure individually
page.SetViewport(1440, 900, 2.0, false)
page.SetUserAgent("Custom/Agent")
page.SetGeolocation(37.7749, -122.4194, 100)  // San Francisco
page.SetTimezone("Asia/Tokyo")
page.SetLocale("ja_JP")

// Network conditions
page.SetOffline(true)                           // simulate offline
page.SetNetworkConditions(150, 1.6*1024*1024, 750*1024) // 3G throttle

// Media features
page.SetColorScheme(gosurfer.ColorSchemeDark)
page.SetReducedMotion(gosurfer.ReducedMotionReduce)
page.SetTouchEnabled(true)

// Permissions
browser.GrantPermissions("https://example.com", "geolocation", "notifications")
browser.ResetPermissions()

Dialog Handling

JavaScript alert(), confirm(), and prompt() dialogs are auto-dismissed by the agent. For manual control:

// Auto-dismiss all dialogs
cancel := page.AutoDismissDialogs()
defer cancel()

// Or handle manually
wait, handle := page.HandleDialog()
go func() {
    dialog := wait()
    fmt.Println(dialog.Type, dialog.Message)
    handle(true, "") // accept
}()

Cookies and Storage

Full cookie and localStorage/sessionStorage management:

// Cookies
cookies, _ := page.GetCookies()
cookie, _ := page.GetCookie("session_id")
page.SetCookie("token", "abc123", ".example.com", "/")
page.DeleteCookies("token")
page.ClearCookies()

// localStorage
page.LocalStorageSet("key", "value")
val, _ := page.LocalStorageGet("key")
page.LocalStorageDelete("key")
page.LocalStorageClear()

// sessionStorage
page.SessionStorageSet("key", "value")
val, _ = page.SessionStorageGet("key")

Drag and Drop

// Element-to-element drag
source, _ := page.Element("#draggable")
target, _ := page.Element("#droppable")
source.DragTo(target)

// Element to coordinates
source.DragToCoordinates(300, 400)

// Coordinate-based drag
page.DragDrop(100, 200, 300, 400)

HAR Recording

Record network traffic in HAR 1.2 format for debugging or analysis:

recorder, _ := page.StartHAR()
page.Navigate("https://example.com")
// ... interact with page ...

data, _ := recorder.Export() // HAR 1.2 JSON bytes
fmt.Printf("Captured %d requests\n", recorder.Entries())
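
Since the export is plain HAR 1.2 JSON, it can be inspected with encoding/json alone. The following sketch decodes only the handful of fields it needs (log.entries[].request and response.status, as defined by the HAR format) and lists failed requests. failedRequests and the sample data are hypothetical; in practice the bytes would come from recorder.Export().

```go
package main

import (
	"encoding/json"
	"fmt"
)

// har declares only the HAR 1.2 fields this sketch reads; all other fields
// in the document are ignored by the decoder.
type har struct {
	Log struct {
		Version string `json:"version"`
		Entries []struct {
			Request struct {
				Method string `json:"method"`
				URL    string `json:"url"`
			} `json:"request"`
			Response struct {
				Status int `json:"status"`
			} `json:"response"`
		} `json:"entries"`
	} `json:"log"`
}

// failedRequests returns "METHOD URL -> status" for every entry whose
// response status is 400 or higher.
func failedRequests(data []byte) ([]string, error) {
	var h har
	if err := json.Unmarshal(data, &h); err != nil {
		return nil, fmt.Errorf("decode HAR: %w", err)
	}
	var failed []string
	for _, e := range h.Log.Entries {
		if e.Response.Status >= 400 {
			failed = append(failed, fmt.Sprintf("%s %s -> %d",
				e.Request.Method, e.Request.URL, e.Response.Status))
		}
	}
	return failed, nil
}

func main() {
	// Hand-written sample standing in for exported HAR bytes.
	sample := []byte(`{"log":{"version":"1.2","entries":[
		{"request":{"method":"GET","url":"https://example.com/"},"response":{"status":200}},
		{"request":{"method":"POST","url":"https://example.com/api"},"response":{"status":503}}
	]}}`)
	failed, err := failedRequests(sample)
	fmt.Println(failed, err)
}
```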

Browser Context Isolation

incognito, _ := browser.Incognito()
defer incognito.Close()
page, _ := incognito.NewPage() // isolated cookies, storage

CLI

gosurfer includes an interactive command-line tool for browser automation:

# Install
go install github.com/dwoolworth/gosurfer/cmd/gosurfer@latest

# Single command
gosurfer open https://example.com
gosurfer screenshot page.png

# Interactive REPL
gosurfer
gosurfer> open https://news.ycombinator.com
gosurfer> state
gosurfer> click "a.storylink"
gosurfer> screenshot hn.png
gosurfer> cookies
gosurfer> har traffic.har
gosurfer> close

Commands: open, click, type, screenshot, pdf, state, eval, cookies, cookie, storage, har, text, html, back, forward, reload, tabs, close.

Set GOSURFER_HEADLESS=false to see the browser window, GOSURFER_STEALTH=true for anti-detection mode.

MCP Server

gosurfer includes an MCP (Model Context Protocol) server that exposes browser automation as tools for AI agents. Agents connect over HTTP+SSE and browse the web without managing browsers.

# Install
go install github.com/dwoolworth/gosurfer/cmd/gosurfer-mcp@latest

# Run
BRAVE_API_KEY=your-key gosurfer-mcp
# Listening on http://localhost:8080/mcp

Tools

Tool          Browser?   Description
-----------   --------   -----------
search        No         Web search via Brave Search API; fast and cheap, no Chrome needed
browse        Yes        Navigate to URL, return focused content (boilerplate stripped, markdown headings, token-efficient)
browse_full   Yes        Navigate to URL, return complete DOM state with all interactive elements indexed
screenshot    Yes        Navigate to URL, capture PNG (viewport or full page)
interact      Yes        Navigate to URL, execute an action sequence (click, type, scroll, wait), return final state
extract       Yes        Navigate to URL, evaluate JavaScript, return structured data
pdf           Yes        Navigate to URL, generate PDF

Stateless Design

Each tool call creates a fresh Chrome tab, does its work, and closes the tab. No session state between calls — any instance can handle any request, making it trivially scalable behind a load balancer.

Configuration

Environment Variable   Default    Purpose
--------------------   --------   -------
MCP_PORT               8080       HTTP listen port
BRAVE_API_KEY          required   Brave Search API key (for search tool)
GOSURFER_PROXY         none       HTTP/SOCKS proxy for all browser traffic
GOSURFER_PROFILE       none       Chrome profile directory (persists login state)
GOSURFER_HUMAN         true       HumanMode: system Chrome + new headless + stealth
GOSURFER_HEADLESS      true       Set false to show browser window

Example: Connect from Claude Desktop

Add to your MCP configuration:

{
  "mcpServers": {
    "gosurfer": {
      "url": "http://localhost:8080/mcp"
    }
  }
}

Example: interact Tool

Fill a form and submit it in a single call:

{
  "name": "interact",
  "arguments": {
    "url": "https://example.com/login",
    "actions": "[{\"action\":\"type\",\"selector\":\"#email\",\"text\":\"user@test.com\"},{\"action\":\"type\",\"selector\":\"#password\",\"text\":\"secret\"},{\"action\":\"click\",\"selector\":\"#submit\"},{\"action\":\"wait\",\"seconds\":2}]"
  }
}

Returns the final page state after all actions execute — the agent sees the result without needing multiple round trips.

Containerized Deployment

FROM golang:1.23-alpine AS builder
RUN apk add --no-cache git upx
RUN go install github.com/dwoolworth/gosurfer/cmd/gosurfer-mcp@latest \
    && cp /go/bin/gosurfer-mcp /gosurfer-mcp \
    && upx --best --lzma /gosurfer-mcp

FROM alpine:3.20
RUN apk add --no-cache chromium nss freetype harfbuzz ca-certificates ttf-freefont
RUN adduser -D -u 1000 surfer
COPY --from=builder /gosurfer-mcp /home/surfer/gosurfer-mcp
USER surfer
WORKDIR /home/surfer
ENV CHROME_BIN=/usr/bin/chromium-browser GOSURFER_HUMAN=true MCP_PORT=8080
EXPOSE 8080
ENTRYPOINT ["./gosurfer-mcp"]

Run multiple instances behind a load balancer for horizontal scaling. Each instance manages its own Chrome process and can handle concurrent requests.

Docker

FROM golang:1.23-alpine AS builder
RUN apk add --no-cache git upx
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /app/server . \
    && upx --best --lzma /app/server

FROM alpine:3.20
RUN apk add --no-cache chromium nss freetype harfbuzz ca-certificates ttf-freefont
ENV CHROME_BIN=/usr/bin/chromium-browser
RUN adduser -D app
USER app
COPY --from=builder /app/server .
CMD ["./server"]

For Kubernetes, set resource limits based on the benchmark:

resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1.5Gi"  # Chrome can spike during heavy pages
    cpu: "1000m"

Architecture

gosurfer
├── browser.go      Browser lifecycle, launch, stealth flags
├── page.go         Page navigation, interaction, dialogs, popups
├── element.go      Element handles, click/type/select, shadow DOM, iframes
├── dom.go          DOM extraction + LLM serialization (the key innovation)
├── agent.go        AI agent loop with CAPTCHA auto-solve, loop detection
├── action.go       21 agent actions + custom action registry
├── llm.go          OpenAI, Anthropic, Ollama providers (raw net/http)
├── stealth.go      12-vector anti-detection (JS injection + Chrome flags)
├── captcha.go      Detection + solving (2Captcha, CapSolver, manual)
├── totp.go         RFC 6238 TOTP + secrets management
├── network.go      Request interception, blocking, and API mocking
├── storage.go      Cookie + localStorage/sessionStorage management
├── drag.go         Drag and drop operations
├── har.go          HAR 1.2 network traffic recording
├── locator.go      Semantic locators (GetByRole, GetByText, GetByLabel, etc.)
├── expect.go       Auto-retrying Playwright-style assertions
├── auth.go         Storage state save/restore for auth persistence
├── emulation.go    Device, viewport, geolocation, timezone, network emulation
├── prompt.go       Agent system prompt generation
├── cmd/gosurfer/   CLI entry point
└── cmd/gosurfer-mcp/  MCP server for AI agents (HTTP+SSE)

How the Agent Works

Each step:

  1. Extract DOM state (+ optional screenshot)
  2. Check for CAPTCHAs, auto-solve if solver configured
  3. Build LLM prompt: system instructions + action history + current DOM
  4. Call LLM, parse JSON response: {"thought":"...","action":"click","params":{"index":5}}
  5. Execute the action via CDP
  6. Detect new tabs, check for loops
  7. Repeat until done action or max steps

The agent includes:

  • Context summarization: For long tasks, older steps are automatically summarized by the LLM into a running narrative, so the agent retains awareness of earlier actions, extracted data, and progress even beyond the 5-step recent history window. Enabled by default; disable with DisableSummary: true
  • Loop detection: Watches for repeating action patterns and nudges the LLM to try different approaches
  • Auto tab switching: Detects target="_blank" clicks and follows to the new tab
  • Message compaction: Keeps the last 5 steps verbatim, with LLM-generated summaries of older steps injected into the system prompt
  • Secret replacement: {{placeholder}} in typed text is replaced with actual values (TOTP codes generated fresh)
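
As an illustration of the secret replacement described in the last bullet, the sketch below substitutes {{key}} placeholders and routes keys ending in _totp through a code generator. It is a simplified assumption of the behaviour, not gosurfer's implementation: replaceSecrets and the totp callback are hypothetical, and leaving unknown placeholders untouched is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var placeholder = regexp.MustCompile(`\{\{(\w+)\}\}`)

// replaceSecrets substitutes {{key}} placeholders. Keys ending in "_totp"
// are passed through totp (a hypothetical code generator) instead of being
// inserted verbatim. Unknown keys are left untouched.
func replaceSecrets(text string, secrets map[string]string, totp func(secret string) string) string {
	return placeholder.ReplaceAllStringFunc(text, func(m string) string {
		key := placeholder.FindStringSubmatch(m)[1]
		val, ok := secrets[key]
		if !ok {
			return m
		}
		if strings.HasSuffix(key, "_totp") {
			return totp(val)
		}
		return val
	})
}

func main() {
	secrets := map[string]string{
		"username": "admin",
		"mfa_totp": "JBSWY3DPEHPK3PXP",
	}
	// Fixed code so the example is deterministic; a real generator would
	// derive the code from the secret and the current time.
	fakeTOTP := func(string) string { return "123456" }
	fmt.Println(replaceSecrets("user={{username}} code={{mfa_totp}}", secrets, fakeTOTP))
}
```

Doing the substitution at typing time means the raw secret never has to appear in the prompt the LLM sees; the model only works with the placeholder name.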

Built on Rod

gosurfer wraps go-rod/rod, a Go library for the Chrome DevTools Protocol. Rod provides:

  • Auto-waiting before interactions
  • Chrome lifecycle management
  • Network hijacking via CDP Fetch domain
  • Iframe and shadow DOM traversal

gosurfer adds the AI agent layer, DOM serialization, stealth, CAPTCHA solving, and TOTP on top.

Examples

AI Search Agent

export OPENAI_API_KEY=sk-...
go run ./examples/search/ "What is the population of Tokyo?"

Direct Scraping

go run ./examples/scrape/

Memory Benchmark

# Local
go run ./examples/benchmark/

# In Docker (realistic Kubernetes numbers)
docker build -f examples/benchmark/Dockerfile -t gosurfer-bench .
docker run --rm gosurfer-bench

Test Coverage

76%+ statement coverage across 14 test files (6,100+ lines of tests). Integration tests use a shared headless browser with an httptest server:

go test -timeout 180s ./...

License

MIT License. Concept and design by Derrick Woolworth.

Documentation

Overview

Package gosurfer provides AI-powered browser automation in pure Go.

It wraps headless Chrome via the Chrome DevTools Protocol (CDP) and provides an intelligent agent that can autonomously browse the web, similar to Python's Browser Use library but optimized for Go.

Key features:

  • Pure Go, no Node.js or Python dependency
  • LLM-driven autonomous browsing (OpenAI, Anthropic, Ollama)
  • Smart DOM extraction and serialization for LLM consumption
  • Auto-waiting, network interception, screenshot/PDF
  • Small static binary (about 4 MB with UPX); Chromium accounts for most of the ~945 MB container image

Constants

This section is empty.

Variables

var (
	DeviceIPhoneX = Device{
		Name:      "iPhone X",
		UserAgent: "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1",
		Width:     375,
		Height:    812,
		Scale:     3.0,
		Mobile:    true,
		Touch:     true,
	}
	DeviceIPhone14Pro = Device{
		Name:      "iPhone 14 Pro",
		UserAgent: "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1",
		Width:     393,
		Height:    852,
		Scale:     3.0,
		Mobile:    true,
		Touch:     true,
	}
	DevicePixel7 = Device{
		Name:      "Pixel 7",
		UserAgent: "Mozilla/5.0 (Linux; Android 14; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36",
		Width:     412,
		Height:    915,
		Scale:     2.625,
		Mobile:    true,
		Touch:     true,
	}
	DeviceIPadPro = Device{
		Name:      "iPad Pro 12.9",
		UserAgent: "Mozilla/5.0 (iPad; CPU OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1",
		Width:     1024,
		Height:    1366,
		Scale:     2.0,
		Mobile:    true,
		Touch:     true,
	}
	DeviceGalaxyS23 = Device{
		Name:      "Galaxy S23",
		UserAgent: "Mozilla/5.0 (Linux; Android 14; SM-S911B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36",
		Width:     360,
		Height:    780,
		Scale:     3.0,
		Mobile:    true,
		Touch:     true,
	}
	DeviceDesktop1080p = Device{
		Name:      "Desktop 1080p",
		UserAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
		Width:     1920,
		Height:    1080,
		Scale:     1.0,
		Mobile:    false,
		Touch:     false,
	}
	DeviceDesktop4K = Device{
		Name:      "Desktop 4K",
		UserAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
		Width:     3840,
		Height:    2160,
		Scale:     2.0,
		Mobile:    false,
		Touch:     false,
	}
)

Pre-configured device profiles.

Functions

func ApplyStealth

func ApplyStealth(p *Page) error

ApplyStealth injects anti-detection scripts into a page. Must be called before navigating to a URL. The scripts run before any page JavaScript, patching common bot-detection vectors.

func GenerateTOTP

func GenerateTOTP(secret string) (string, error)

GenerateTOTP generates a 6-digit TOTP code from a base32-encoded secret. Implements RFC 6238 with a 30-second time step and HMAC-SHA1.

func GenerateTOTPAt

func GenerateTOTPAt(secret string, t time.Time) (string, error)

GenerateTOTPAt generates a TOTP code for a specific time.

func HumanDelay added in v0.2.8

func HumanDelay(base, jitter time.Duration)

HumanDelay pauses for a random duration that mimics human reaction time. Base is the minimum delay; jitter adds randomness up to that additional amount.

gosurfer.HumanDelay(200*time.Millisecond, 300*time.Millisecond) // 200-500ms

Types

type ActionContext

type ActionContext struct {
	Page    *Page
	State   *DOMState
	Browser *Browser
	Agent   *Agent
}

ActionContext provides the browser state to action handlers.

type ActionDef

type ActionDef struct {
	Name        string     `json:"name"`
	Description string     `json:"description"`
	Params      []ParamDef `json:"params"`
	Run         func(ctx context.Context, ac ActionContext, params map[string]interface{}) (string, error)
}

ActionDef defines a browser action that the agent can execute.

type ActionRegistry

type ActionRegistry struct {
	// contains filtered or unexported fields
}

ActionRegistry manages available actions.

func DefaultActions

func DefaultActions() *ActionRegistry

DefaultActions returns the built-in action set.

func NewActionRegistry

func NewActionRegistry() *ActionRegistry

NewActionRegistry creates an empty action registry.

func (*ActionRegistry) Actions

func (r *ActionRegistry) Actions() []*ActionDef

Actions returns all registered actions in order.

func (*ActionRegistry) Get

func (r *ActionRegistry) Get(name string) (*ActionDef, bool)

Get returns an action by name.

func (*ActionRegistry) Register

func (r *ActionRegistry) Register(action *ActionDef)

Register adds an action to the registry.
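
A registry that returns actions "in order" needs to track insertion order alongside the name lookup. The sketch below shows one way to do that; it is a simplified stand-in, not the unexported internals of ActionRegistry. The lower-case types are hypothetical, and replacing an existing name in place is an assumption made for this sketch.

```go
package main

import "fmt"

// actionDef is a trimmed, hypothetical counterpart of ActionDef.
type actionDef struct {
	Name        string
	Description string
	Run         func(params map[string]interface{}) (string, error)
}

// registry keeps a name index plus the order in which names were first added.
type registry struct {
	order  []string
	byName map[string]*actionDef
}

func newRegistry() *registry {
	return &registry{byName: map[string]*actionDef{}}
}

// register adds an action. Re-registering a name replaces the definition
// but keeps its original position (an assumption for this sketch).
func (r *registry) register(a *actionDef) {
	if _, exists := r.byName[a.Name]; !exists {
		r.order = append(r.order, a.Name)
	}
	r.byName[a.Name] = a
}

func (r *registry) get(name string) (*actionDef, bool) {
	a, ok := r.byName[name]
	return a, ok
}

// actions returns all registered actions in registration order.
func (r *registry) actions() []*actionDef {
	out := make([]*actionDef, 0, len(r.order))
	for _, n := range r.order {
		out = append(out, r.byName[n])
	}
	return out
}

func main() {
	r := newRegistry()
	r.register(&actionDef{Name: "navigate", Description: "Go to a URL"})
	r.register(&actionDef{
		Name:        "highlight", // hypothetical custom action
		Description: "Outline an element",
		Run: func(params map[string]interface{}) (string, error) {
			return fmt.Sprintf("highlighted %v", params["index"]), nil
		},
	})
	for _, a := range r.actions() {
		fmt.Println(a.Name, "-", a.Description)
	}
}
```

A stable order matters because the action list is presumably rendered into the system prompt, and a prompt that changes between runs is harder to debug and cache.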

type Agent

type Agent struct {
	// contains filtered or unexported fields
}

Agent is an LLM-driven autonomous browser that completes tasks.

func NewAgent

func NewAgent(config AgentConfig) (*Agent, error)

NewAgent creates a new browsing agent.

func (*Agent) Run

func (a *Agent) Run(ctx context.Context) (*AgentResult, error)

Run executes the agent's task and returns the result.

type AgentConfig

type AgentConfig struct {
	// Task is the natural language description of what to accomplish.
	Task string

	// LLM is the language model provider for decision-making.
	LLM LLMProvider

	// Browser is an existing browser instance. If nil, a new headless one is created.
	Browser *Browser

	// MaxSteps is the maximum number of agent steps (default 50).
	MaxSteps int

	// MaxFailures is the max consecutive failures before stopping (default 5).
	MaxFailures int

	// UseVision includes screenshots in LLM context when true.
	UseVision bool

	// Headless controls browser visibility if a new browser is created.
	Headless bool

	// OnStep is called after each step with progress info.
	OnStep func(StepInfo)

	// Verbose enables detailed logging.
	Verbose bool

	// MaxTokens sets the max tokens per LLM call (default 4096).
	MaxTokens int

	// Temperature sets the LLM sampling temperature (default 0.0).
	Temperature float64

	// CAPTCHASolver is an optional CAPTCHA solving backend.
	// When set, the agent automatically detects and solves CAPTCHAs.
	CAPTCHASolver CAPTCHASolver

	// Secrets stores sensitive data (credentials, TOTP secrets).
	// Keys ending in "_totp" auto-generate TOTP codes on access.
	// Use {{key_name}} placeholders in typed text for auto-replacement.
	Secrets map[string]string

	// Stealth enables anti-detection mode on the browser.
	Stealth bool

	// DisableSummary disables LLM-powered context summarization for long tasks.
	// When false (default), the agent summarizes older steps to maintain awareness
	// of earlier context beyond the 5-step recent history window.
	DisableSummary bool

	// SummaryLLM is an optional cheaper/faster LLM used for context summarization.
	// If nil, the main LLM is used. Useful for pairing an expensive reasoning model
	// (e.g. Opus, GPT-4) with a cheaper summarizer (e.g. Haiku, GPT-4.1-mini).
	SummaryLLM LLMProvider

	// FocusContent strips boilerplate (nav, footer, cookie banners, social links,
	// terms/privacy links) from DOM extraction, reducing token usage by 30-60%.
	// Content is focused on <main>, <article>, [role="main"] regions.
	FocusContent bool
}

AgentConfig configures the AI browsing agent.

type AgentResult

type AgentResult struct {
	// Success indicates whether the task was completed.
	Success bool

	// Output is the final answer or result text.
	Output string

	// Steps is the total number of steps taken.
	Steps int

	// History contains details of each step.
	History []StepInfo

	// TotalTokens is the cumulative token usage.
	TotalTokens TokenUsage
}

AgentResult is the final result of an agent run.

type AnthropicProvider

type AnthropicProvider struct {
	// contains filtered or unexported fields
}

AnthropicProvider implements LLMProvider for the Anthropic Messages API.

func NewAnthropic

func NewAnthropic(apiKey, model string) *AnthropicProvider

NewAnthropic creates an Anthropic provider.

func (*AnthropicProvider) ChatCompletion

func (p *AnthropicProvider) ChatCompletion(ctx context.Context, messages []ChatMessage, opts ...ChatOption) (*ChatResponse, error)

func (*AnthropicProvider) Name

func (p *AnthropicProvider) Name() string

type BoundingBox

type BoundingBox struct {
	X      float64 `json:"x"`
	Y      float64 `json:"y"`
	Width  float64 `json:"width"`
	Height float64 `json:"height"`
}

BoundingBox represents an element's position and size.

type Browser

type Browser struct {
	// contains filtered or unexported fields
}

Browser wraps a Chrome/Chromium instance.

func ConnectBrowser

func ConnectBrowser(wsURL string, cfg ...BrowserConfig) (*Browser, error)

ConnectBrowser connects to an existing Chrome instance via a CDP WebSocket URL.

func NewBrowser

func NewBrowser(cfg ...BrowserConfig) (*Browser, error)

NewBrowser launches a new Chrome/Chromium instance.

func (*Browser) Close

func (b *Browser) Close() error

Close shuts down the browser.

func (*Browser) GrantPermissions added in v0.2.0

func (b *Browser) GrantPermissions(origin string, permissions ...string) error

GrantPermissions grants browser permissions for the given origin. Common permissions: "geolocation", "notifications", "camera", "microphone", "clipboard-read", "clipboard-write".

func (*Browser) HandleAuth

func (b *Browser) HandleAuth(username, password string) func() error

HandleAuth sets up HTTP Basic authentication handling.

func (*Browser) Incognito

func (b *Browser) Incognito() (*Browser, error)

Incognito creates an isolated browser context with separate cookies/storage.

func (*Browser) NewPage

func (b *Browser) NewPage() (*Page, error)

NewPage creates a new browser tab.

func (*Browser) PageByURL

func (b *Browser) PageByURL(urlRegex string) (*Page, error)

PageByURL finds an open page whose URL matches the regex pattern.

func (*Browser) Pages

func (b *Browser) Pages() ([]*Page, error)

Pages returns all open pages/tabs.

func (*Browser) ResetPermissions added in v0.2.0

func (b *Browser) ResetPermissions() error

ResetPermissions resets all permission overrides.

func (*Browser) Rod

func (b *Browser) Rod() *rod.Browser

Rod returns the underlying rod.Browser for advanced usage.

func (*Browser) WaitDownload

func (b *Browser) WaitDownload() func() []byte

WaitDownload sets up a download handler that returns file bytes when the next download completes. Call before triggering the download.

type BrowserConfig

type BrowserConfig struct {
	// Headless runs the browser without a visible window.
	Headless bool

	// ExecPath is the path to Chrome/Chromium. If empty, rod auto-detects.
	ExecPath string

	// UserDataDir persists browser state (cookies, localStorage, etc.).
	UserDataDir string

	// Proxy sets an HTTP/SOCKS proxy (e.g., "socks5://127.0.0.1:1080").
	Proxy string

	// WindowWidth sets the browser window width. Default: 1280.
	WindowWidth int

	// WindowHeight sets the browser window height. Default: 720.
	WindowHeight int

	// NoSandbox disables the Chromium sandbox (required in Docker).
	NoSandbox bool

	// Stealth enables anti-detection mode (patches navigator.webdriver,
	// spoofs plugins/WebGL, sets realistic user agent, etc.).
	Stealth bool

	// HumanMode enables maximum anti-detection: system Chrome, new headless mode,
	// stealth patches, and human-like behavior (random delays, mouse movement).
	// Automatically sets Stealth=true and uses --headless=new.
	HumanMode bool

	// AllowedDomains restricts navigation to these domains (glob patterns).
	AllowedDomains []string

	// BlockedDomains prevents navigation to these domains (glob patterns).
	BlockedDomains []string

	// ChallengeWaitTimeout controls how long Navigate() waits for an
	// auto-solvable bot-protection challenge (e.g., Cloudflare's "Just a
	// moment..." JS challenge) to clear. If 0, the default of 15s is used.
	// Set to -1 to disable auto-waiting entirely.
	ChallengeWaitTimeout time.Duration
}

BrowserConfig configures browser launch options.
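The ChallengeWaitTimeout rules described in the struct comment (0 selects the 15s default, -1 disables waiting) can be sketched as a small resolver. This is only an illustration: effectiveChallengeWait is a hypothetical helper and not part of gosurfer, and treating every negative value like -1 is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// defaultChallengeWait mirrors the documented default of 15s.
const defaultChallengeWait = 15 * time.Second

// effectiveChallengeWait is a hypothetical helper (not a gosurfer API).
// It returns the duration to wait and whether auto-waiting is enabled.
// Assumption: any negative value disables waiting, like the documented -1.
func effectiveChallengeWait(cfg time.Duration) (time.Duration, bool) {
	switch {
	case cfg < 0:
		return 0, false
	case cfg == 0:
		return defaultChallengeWait, true
	default:
		return cfg, true
	}
}

func main() {
	for _, d := range []time.Duration{0, -1, 30 * time.Second} {
		wait, enabled := effectiveChallengeWait(d)
		fmt.Printf("configured=%v wait=%v enabled=%v\n", d, wait, enabled)
	}
}
```

Note that -1 as a time.Duration is one negative nanosecond, so a sign check rather than a comparison against a named constant is the simplest way to detect it.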

type CAPTCHAInfo

type CAPTCHAInfo struct {
	Type    CAPTCHAType
	SiteKey string
	PageURL string
}

CAPTCHAInfo describes a detected CAPTCHA on a page.

type CAPTCHASolver

type CAPTCHASolver interface {
	// Solve sends the CAPTCHA to a solving service and returns the token.
	Solve(ctx context.Context, info CAPTCHAInfo) (string, error)
	// Name returns the solver name for logging.
	Name() string
}

CAPTCHASolver is the interface for CAPTCHA solving backends.

type CAPTCHAType

type CAPTCHAType string

CAPTCHAType identifies the CAPTCHA provider.

const (
	CAPTCHAReCaptchaV2 CAPTCHAType = "recaptcha_v2"
	CAPTCHAReCaptchaV3 CAPTCHAType = "recaptcha_v3"
	CAPTCHAHCaptcha    CAPTCHAType = "hcaptcha"
	CAPTCHATurnstile   CAPTCHAType = "turnstile"
)

type CapSolver

type CapSolver struct {
	APIKey  string
	BaseURL string // default: https://api.capsolver.com
	// contains filtered or unexported fields
}

CapSolver implements CAPTCHASolver using the capsolver.com API.

func NewCapSolver

func NewCapSolver(apiKey string) *CapSolver

NewCapSolver creates a CapSolver solver.

func (*CapSolver) Name

func (s *CapSolver) Name() string

func (*CapSolver) Solve

func (s *CapSolver) Solve(ctx context.Context, info CAPTCHAInfo) (string, error)

type ChallengeType added in v0.5.0

type ChallengeType string

ChallengeType identifies a specific bot-protection challenge.

const (
	// ChallengeNone means the page is not currently showing a challenge.
	ChallengeNone ChallengeType = ""
	// ChallengeCloudflareUAM is Cloudflare "Under Attack Mode" — the classic
	// "Just a moment..." JavaScript challenge that resolves in ~5-15 seconds.
	ChallengeCloudflareUAM ChallengeType = "cloudflare_uam"
	// ChallengeCloudflareTurnstile is Cloudflare's newer interactive challenge.
	// These usually require user interaction and cannot be auto-solved.
	ChallengeCloudflareTurnstile ChallengeType = "cloudflare_turnstile"
	// ChallengeDataDome is captcha-delivery.com's fingerprint-based blocker.
	// Usually not auto-solvable; detection exists so callers can fail fast.
	ChallengeDataDome ChallengeType = "datadome"
)

func (ChallengeType) IsAutoSolvable added in v0.5.0

func (c ChallengeType) IsAutoSolvable() bool

IsAutoSolvable reports whether a challenge type can be solved by waiting. Cloudflare UAM's JS challenge resolves itself; Turnstile and DataDome require user interaction or specialized bypass tooling.
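A minimal sketch of how a caller might branch on the challenge type. The type and constants are copied from the listing above so the program compiles on its own; isAutoSolvable and nextStep are hypothetical stand-ins written from the description, not the library's implementation.

```go
package main

import "fmt"

// ChallengeType and its constants are local copies of the declarations above.
// In real code, use gosurfer.ChallengeType instead.
type ChallengeType string

const (
	ChallengeNone                ChallengeType = ""
	ChallengeCloudflareUAM       ChallengeType = "cloudflare_uam"
	ChallengeCloudflareTurnstile ChallengeType = "cloudflare_turnstile"
	ChallengeDataDome            ChallengeType = "datadome"
)

// isAutoSolvable is a hypothetical stand-in for ChallengeType.IsAutoSolvable.
// Per the description, only the Cloudflare UAM JS challenge clears by waiting.
func isAutoSolvable(c ChallengeType) bool {
	return c == ChallengeCloudflareUAM
}

// nextStep maps a detected challenge to an action a caller might take.
func nextStep(c ChallengeType) string {
	switch {
	case c == ChallengeNone:
		return "proceed"
	case isAutoSolvable(c):
		return "wait"
	default:
		return "fail fast"
	}
}

func main() {
	all := []ChallengeType{
		ChallengeNone,
		ChallengeCloudflareUAM,
		ChallengeCloudflareTurnstile,
		ChallengeDataDome,
	}
	for _, c := range all {
		fmt.Printf("%q -> %s\n", c, nextStep(c))
	}
}
```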

type ChatMessage

type ChatMessage struct {
	Role    string        `json:"role"` // "system", "user", "assistant"
	Content []ContentPart `json:"-"`
}

ChatMessage represents a message in the conversation.

func ImageMessage

func ImageMessage(role, text string, imageData []byte, mimeType string) ChatMessage

ImageMessage creates a message with text and an image.
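As a rough sketch of what a text-plus-image message could look like, the program below builds one from local copies of ChatMessage and ContentPart. The part order (text first, then image) and the use of standard base64 for ImageB64 are assumptions; imageMessage is a hypothetical stand-in, not the library function.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// ContentPart and ChatMessage are local copies of the types in this listing.
type ContentPart struct {
	Type     string `json:"type"` // "text" or "image"
	Text     string `json:"text,omitempty"`
	ImageB64 string `json:"image_b64,omitempty"`
	MimeType string `json:"mime_type,omitempty"`
}

type ChatMessage struct {
	Role    string        `json:"role"`
	Content []ContentPart `json:"-"`
}

// imageMessage is a hypothetical stand-in for gosurfer.ImageMessage.
// Assumption: the image bytes are stored as standard base64 text.
func imageMessage(role, text string, imageData []byte, mimeType string) ChatMessage {
	return ChatMessage{
		Role: role,
		Content: []ContentPart{
			{Type: "text", Text: text},
			{
				Type:     "image",
				ImageB64: base64.StdEncoding.EncodeToString(imageData),
				MimeType: mimeType,
			},
		},
	}
}

func main() {
	msg := imageMessage("user", "What is on this page?", []byte("abc"), "image/png")
	for _, part := range msg.Content {
		fmt.Printf("%s text=%q b64=%q mime=%q\n", part.Type, part.Text, part.ImageB64, part.MimeType)
	}
}
```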

func TextMessage

func TextMessage(role, text string) ChatMessage

TextMessage creates a simple text message.

type ChatOption

type ChatOption func(*chatConfig)

ChatOption configures a chat completion request.

func WithJSONMode

func WithJSONMode() ChatOption

WithJSONMode requests JSON output from the model.

func WithMaxTokens

func WithMaxTokens(n int) ChatOption

WithMaxTokens sets the maximum response tokens.

func WithTemperature

func WithTemperature(t float64) ChatOption

WithTemperature sets the sampling temperature.

type ChatResponse

type ChatResponse struct {
	Content string     `json:"content"`
	Usage   TokenUsage `json:"usage"`
}

ChatResponse is the LLM's response.

type ColorScheme added in v0.2.0

type ColorScheme string

ColorScheme represents a CSS color scheme preference.

const (
	ColorSchemeLight        ColorScheme = "light"
	ColorSchemeDark         ColorScheme = "dark"
	ColorSchemeNoPreference ColorScheme = ""
)

type ContentPart

type ContentPart struct {
	Type     string `json:"type"` // "text" or "image"
	Text     string `json:"text,omitempty"`
	ImageB64 string `json:"image_b64,omitempty"`
	MimeType string `json:"mime_type,omitempty"`
}

ContentPart is a piece of a message (text or image).

type Cookie

type Cookie struct {
	Name     string  `json:"name"`
	Value    string  `json:"value"`
	Domain   string  `json:"domain"`
	Path     string  `json:"path"`
	Expires  float64 `json:"expires,omitempty"`
	HTTPOnly bool    `json:"httpOnly,omitempty"`
	Secure   bool    `json:"secure,omitempty"`
	SameSite string  `json:"sameSite,omitempty"`
}

Cookie represents a browser cookie.
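The JSON tags on Cookie decide which fields appear when a cookie is serialized: fields marked omitempty are dropped when they hold their zero value. The sketch below uses a local copy of the struct and a hypothetical cookieJSON helper to show this.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Cookie is a local copy of the struct above so the sketch runs standalone.
type Cookie struct {
	Name     string  `json:"name"`
	Value    string  `json:"value"`
	Domain   string  `json:"domain"`
	Path     string  `json:"path"`
	Expires  float64 `json:"expires,omitempty"`
	HTTPOnly bool    `json:"httpOnly,omitempty"`
	Secure   bool    `json:"secure,omitempty"`
	SameSite string  `json:"sameSite,omitempty"`
}

// cookieJSON is a hypothetical helper that serializes one cookie.
func cookieJSON(c Cookie) (string, error) {
	b, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Expires and SameSite are left at their zero values, so they are omitted.
	s, err := cookieJSON(Cookie{
		Name:     "session",
		Value:    "abc",
		Domain:   "example.com",
		Path:     "/",
		HTTPOnly: true,
		Secure:   true,
	})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(s)
}
```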

type DOMElement

type DOMElement struct {
	Index        int               `json:"index"`
	Tag          string            `json:"tag"`
	Text         string            `json:"text"`
	Attributes   map[string]string `json:"attributes"`
	Rect         BoundingBox       `json:"rect"`
	IsEditable   bool              `json:"is_editable"`
	IsScrollable bool              `json:"is_scrollable"`
	Depth        int               `json:"depth"`
	CSSSelector  string            `json:"css_selector"`
}

DOMElement represents an interactive element extracted from the page.

type DOMService

type DOMService struct {
	// contains filtered or unexported fields
}

DOMService handles DOM extraction and serialization.

func (*DOMService) GetFocusedState added in v0.2.3

func (d *DOMService) GetFocusedState() (*DOMState, error)

GetFocusedState extracts the DOM state with boilerplate stripped. Removes navigation, footers, cookie banners, ad containers, and low-value links (terms, privacy, copyright, same-page anchors, social media). Focuses on <main>, <article>, [role="main"] content regions.

func (*DOMService) GetState

func (d *DOMService) GetState() (*DOMState, error)

GetState extracts the current DOM state, serialized for LLM consumption.

type DOMState

type DOMState struct {
	// URL is the current page URL.
	URL string `json:"url"`

	// Title is the page title.
	Title string `json:"title"`

	// Tree is the serialized DOM in indexed-element format for LLM consumption.
	// Interactive elements are tagged with [index] prefixes.
	Tree string `json:"tree"`

	// Elements maps element indices to their metadata for action execution.
	Elements map[int]*DOMElement `json:"elements"`

	// Tabs lists all open browser tabs.
	Tabs []TabInfo `json:"tabs,omitempty"`

	// Screenshot is an optional JPEG screenshot of the current viewport.
	Screenshot []byte `json:"-"`

	// ScrollPosition indicates current scroll percentage (0-100).
	ScrollPosition float64 `json:"scroll_position"`

	// PageHeight is the total page height in pixels.
	PageHeight float64 `json:"page_height"`

	// ViewportHeight is the visible viewport height in pixels.
	ViewportHeight float64 `json:"viewport_height"`
}

DOMState represents the current page state, optimized for LLM consumption.

type Device added in v0.2.0

type Device struct {
	Name      string
	UserAgent string
	Width     int
	Height    int
	Scale     float64
	Mobile    bool
	Touch     bool
}

Device represents a pre-configured device profile for emulation.

type Dialog

type Dialog struct {
	Type          string // "alert", "confirm", "prompt", "beforeunload"
	Message       string
	DefaultPrompt string
	// contains filtered or unexported fields
}

Dialog represents a JavaScript dialog (alert, confirm, prompt, beforeunload).

type Element

type Element struct {
	Index int // Index in DOM serialization (set by DOMState)
	// contains filtered or unexported fields
}

Element wraps a DOM element with interaction methods.

func (*Element) Attribute

func (e *Element) Attribute(name string) (string, error)

Attribute returns the value of an HTML attribute.

func (*Element) BBox

func (e *Element) BBox() (*BoundingBox, error)

BBox returns the element's bounding box in viewport coordinates.

func (*Element) Clear

func (e *Element) Clear() error

Clear clears the element's value by selecting all text and deleting it.

func (*Element) ClearAndType

func (e *Element) ClearAndType(text string) error

ClearAndType clears the element then types new text.

func (*Element) Click

func (e *Element) Click() error

Click clicks the element. It waits for the element to be visible and stable.

func (*Element) DoubleClick

func (e *Element) DoubleClick() error

DoubleClick double-clicks the element.

func (*Element) DragTo

func (e *Element) DragTo(target *Element) error

DragTo drags this element to a target element.

func (*Element) DragToCoordinates

func (e *Element) DragToCoordinates(x, y float64) error

DragToCoordinates drags this element to specific viewport coordinates.

func (*Element) Focus

func (e *Element) Focus() error

Focus sets focus on the element.

func (*Element) Frame

func (e *Element) Frame() (*Page, error)

Frame returns a Page representing the content of this iframe element. Panics if the element is not an iframe.

func (*Element) HTML

func (e *Element) HTML() (string, error)

HTML returns the element's outer HTML.

func (*Element) Hover

func (e *Element) Hover() error

Hover moves the mouse over the element.

func (*Element) HumanClick added in v0.2.8

func (e *Element) HumanClick() error

HumanClick clicks the element with random offset from center.

func (*Element) HumanType added in v0.2.8

func (e *Element) HumanType(text string) error

HumanType types text with random inter-keystroke delays.

func (*Element) Rod

func (e *Element) Rod() *rod.Element

Rod returns the underlying rod.Element for advanced usage.

func (*Element) Screenshot

func (e *Element) Screenshot() ([]byte, error)

Screenshot captures a screenshot of just this element.

func (*Element) ScrollIntoView

func (e *Element) ScrollIntoView() error

ScrollIntoView scrolls the element into the viewport.

func (*Element) SelectOption

func (e *Element) SelectOption(texts ...string) error

SelectOption selects a dropdown option by its visible text.

func (*Element) SelectOptionByValue

func (e *Element) SelectOptionByValue(values ...string) error

SelectOptionByValue selects a dropdown option by its value attribute.

func (*Element) ShadowRoot

func (e *Element) ShadowRoot() (*Element, error)

ShadowRoot returns the shadow root of this element for querying shadow DOM.

func (*Element) Text

func (e *Element) Text() (string, error)

Text returns the visible text content of the element.

func (*Element) Type

func (e *Element) Type(text string) error

Type types text into the element. The method takes no clear flag; to replace existing text, use ClearAndType.

func (*Element) UploadFile

func (e *Element) UploadFile(paths ...string) error

UploadFile sets files on a file input element.

func (*Element) Visible

func (e *Element) Visible() (bool, error)

Visible returns whether the element is visible.

func (*Element) WaitStable

func (e *Element) WaitStable() error

WaitStable waits until the element's position stops changing.

func (*Element) WaitVisible

func (e *Element) WaitVisible() error

WaitVisible waits until the element becomes visible.

type ExpectOption added in v0.1.1

type ExpectOption func(*PageExpect)

ExpectOption configures assertion behavior.

func WithTimeout added in v0.1.1

func WithTimeout(d time.Duration) ExpectOption

WithTimeout sets the retry timeout for assertions.
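Auto-retrying assertions generally work by re-running a check until it passes or a deadline expires. The loop below is a generic sketch of that idea, not gosurfer's implementation: retryUntil, attemptsUntilSuccess, and the polling interval are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errNotReady = errors.New("condition not met yet")

// retryUntil re-runs check every interval until it returns nil or the
// timeout elapses. The last failure is wrapped in the returned error.
func retryUntil(timeout, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("assertion failed after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// attemptsUntilSuccess simulates a condition that becomes true on the
// given attempt and reports how many attempts were made.
func attemptsUntilSuccess(succeedOn int) (int, error) {
	attempts := 0
	err := retryUntil(time.Second, time.Millisecond, func() error {
		attempts++
		if attempts < succeedOn {
			return errNotReady
		}
		return nil
	})
	return attempts, err
}

func main() {
	n, err := attemptsUntilSuccess(3)
	fmt.Println("attempts:", n, "err:", err)

	// A check that never passes fails once the timeout is exceeded.
	err = retryUntil(5*time.Millisecond, time.Millisecond, func() error {
		return errNotReady
	})
	fmt.Println("timed out:", err != nil)
}
```

A longer timeout trades slower failure reporting for tolerance of slow pages, which is the knob WithTimeout exposes.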

type HARRecorder

type HARRecorder struct {
	// contains filtered or unexported fields
}

HARRecorder captures all network traffic on a page in HAR 1.2 format. Start recording before navigation to capture everything.

func (*HARRecorder) Entries

func (rec *HARRecorder) Entries() int

Entries returns the number of recorded requests.

func (*HARRecorder) Export

func (rec *HARRecorder) Export() ([]byte, error)

Export returns the recorded traffic as a HAR 1.2 JSON byte slice.

type InterceptedRequest

type InterceptedRequest struct {
	// contains filtered or unexported fields
}

InterceptedRequest wraps a hijacked network request.

func (*InterceptedRequest) Abort

func (r *InterceptedRequest) Abort()

Abort blocks the request.

func (*InterceptedRequest) Body

func (r *InterceptedRequest) Body() string

Body returns the request body.

func (*InterceptedRequest) Continue

func (r *InterceptedRequest) Continue()

Continue allows the request to proceed normally.

func (*InterceptedRequest) Header

func (r *InterceptedRequest) Header(key string) string

Header returns a single request header value.

func (*InterceptedRequest) LoadResponse added in v0.2.0

func (r *InterceptedRequest) LoadResponse() error

LoadResponse fetches the real response from the server. Call this before reading or modifying the response.

func (*InterceptedRequest) Method

func (r *InterceptedRequest) Method() string

Method returns the HTTP method.

func (*InterceptedRequest) Respond

func (r *InterceptedRequest) Respond(status int, body string, headerPairs ...string)

Respond returns a custom response without hitting the real server.

req.Respond(200, "OK", "Content-Type", "text/plain")

func (*InterceptedRequest) RespondJSON added in v0.2.0

func (r *InterceptedRequest) RespondJSON(status int, data interface{})

RespondJSON returns a JSON response without hitting the real server. The data parameter is marshaled to JSON automatically.

req.RespondJSON(200, map[string]any{"status": "ok", "count": 42})

func (*InterceptedRequest) ResponseBody added in v0.2.0

func (r *InterceptedRequest) ResponseBody() string

ResponseBody returns the response body (after LoadResponse).

func (*InterceptedRequest) ResponseStatus added in v0.2.0

func (r *InterceptedRequest) ResponseStatus() int

ResponseStatus returns the response status code (after LoadResponse).

func (*InterceptedRequest) SetResponseBody added in v0.2.0

func (r *InterceptedRequest) SetResponseBody(body string)

SetResponseBody replaces the response body (after LoadResponse).

func (*InterceptedRequest) SetResponseHeader added in v0.2.0

func (r *InterceptedRequest) SetResponseHeader(pairs ...string)

SetResponseHeader sets one or more response headers, passed as key/value pairs (after LoadResponse).

func (*InterceptedRequest) URL

func (r *InterceptedRequest) URL() string

URL returns the request URL.

type LLMProvider

type LLMProvider interface {
	// ChatCompletion sends messages to the LLM and returns a response.
	ChatCompletion(ctx context.Context, messages []ChatMessage, opts ...ChatOption) (*ChatResponse, error)
	// Name returns the provider/model name.
	Name() string
}

LLMProvider defines the interface for language model backends.

type LocatorExpect added in v0.1.1

type LocatorExpect struct {
	// contains filtered or unexported fields
}

LocatorExpect provides auto-retrying element-level assertions.

func (*LocatorExpect) Not added in v0.1.1

func (le *LocatorExpect) Not() *LocatorExpect

Not returns a negated LocatorExpect where all assertions are inverted.

expect.Locator("#modal").Not().ToBeVisible() // asserts element is NOT visible

func (*LocatorExpect) ToBeChecked added in v0.1.1

func (le *LocatorExpect) ToBeChecked() error

ToBeChecked asserts the element (checkbox/radio) is checked.

func (*LocatorExpect) ToBeDisabled added in v0.1.1

func (le *LocatorExpect) ToBeDisabled() error

ToBeDisabled asserts the element has the disabled attribute.

func (*LocatorExpect) ToBeEnabled added in v0.1.1

func (le *LocatorExpect) ToBeEnabled() error

ToBeEnabled asserts the element is not disabled.

func (*LocatorExpect) ToBeHidden added in v0.1.1

func (le *LocatorExpect) ToBeHidden() error

ToBeHidden asserts the element is hidden or not in the DOM.

func (*LocatorExpect) ToBeVisible added in v0.1.1

func (le *LocatorExpect) ToBeVisible() error

ToBeVisible asserts the element is visible on the page.

func (*LocatorExpect) ToContainText added in v0.1.1

func (le *LocatorExpect) ToContainText(substring string) error

ToContainText asserts the element's text content contains the substring.

func (*LocatorExpect) ToHaveAttribute added in v0.1.1

func (le *LocatorExpect) ToHaveAttribute(name, value string) error

ToHaveAttribute asserts the element has an attribute with the given value.

func (*LocatorExpect) ToHaveCount added in v0.1.1

func (le *LocatorExpect) ToHaveCount(expected int) error

ToHaveCount asserts the number of elements matching the selector.

func (*LocatorExpect) ToHaveText added in v0.1.1

func (le *LocatorExpect) ToHaveText(expected string) error

ToHaveText asserts the element's text content equals the expected string.

func (*LocatorExpect) ToHaveValue added in v0.1.1

func (le *LocatorExpect) ToHaveValue(expected string) error

ToHaveValue asserts the element's value equals the expected string.

type LocatorOption added in v0.1.1

type LocatorOption func(*locatorConfig)

LocatorOption configures semantic locator behavior.

func Exact added in v0.1.1

func Exact() LocatorOption

Exact requires an exact text match instead of substring.

func Name added in v0.1.1

func Name(name string) LocatorOption

Name filters elements by their accessible name (for GetByRole).

type ManualCAPTCHASolver

type ManualCAPTCHASolver struct {
	SolveFunc func(ctx context.Context, info CAPTCHAInfo) (string, error)
}

ManualCAPTCHASolver calls a user-provided function to solve CAPTCHAs. Useful for custom solving services or human-in-the-loop flows.
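A human-in-the-loop flow can be sketched by letting SolveFunc wait on a channel that an operator feeds with tokens. The types below are local copies so the program runs standalone; the Solve body (plain delegation to SolveFunc), newQueueSolver, and solveOnce are assumptions made for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Local copies of the gosurfer types listed in this reference.
type CAPTCHAType string

const CAPTCHATurnstile CAPTCHAType = "turnstile"

type CAPTCHAInfo struct {
	Type    CAPTCHAType
	SiteKey string
	PageURL string
}

type ManualCAPTCHASolver struct {
	SolveFunc func(ctx context.Context, info CAPTCHAInfo) (string, error)
}

// Solve delegates to SolveFunc. Assumed behavior, not the library source.
func (s *ManualCAPTCHASolver) Solve(ctx context.Context, info CAPTCHAInfo) (string, error) {
	if s.SolveFunc == nil {
		return "", errors.New("no SolveFunc configured")
	}
	return s.SolveFunc(ctx, info)
}

// newQueueSolver (hypothetical) returns a solver that blocks until a token
// arrives on the channel or the context is cancelled.
func newQueueSolver(tokens <-chan string) *ManualCAPTCHASolver {
	return &ManualCAPTCHASolver{
		SolveFunc: func(ctx context.Context, info CAPTCHAInfo) (string, error) {
			select {
			case tok := <-tokens:
				return tok, nil
			case <-ctx.Done():
				return "", ctx.Err()
			}
		},
	}
}

// solveOnce runs a single solve with a one-second deadline.
func solveOnce(tokens <-chan string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	info := CAPTCHAInfo{Type: CAPTCHATurnstile, SiteKey: "demo-key", PageURL: "https://example.com"}
	return newQueueSolver(tokens).Solve(ctx, info)
}

func main() {
	tokens := make(chan string, 1)
	tokens <- "token-from-operator"
	tok, err := solveOnce(tokens)
	fmt.Println(tok, err)
}
```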

func (*ManualCAPTCHASolver) Name

func (s *ManualCAPTCHASolver) Name() string

func (*ManualCAPTCHASolver) Solve

func (s *ManualCAPTCHASolver) Solve(ctx context.Context, info CAPTCHAInfo) (string, error)

type NetworkInterceptor

type NetworkInterceptor struct {
	// contains filtered or unexported fields
}

NetworkInterceptor manages request interception for a page.

func (*NetworkInterceptor) BlockPatterns

func (ni *NetworkInterceptor) BlockPatterns(patterns ...string) *NetworkInterceptor

BlockPatterns blocks all requests matching the given URL glob patterns. Useful for blocking ads, trackers, and large resources. Example: "*analytics*", "*.ads.*", "*tracker*"

func (*NetworkInterceptor) MockJSON added in v0.2.0

func (ni *NetworkInterceptor) MockJSON(pattern string, status int, data interface{}) *NetworkInterceptor

MockJSON intercepts requests matching the pattern and returns a JSON response. This is the simplest way to mock an API endpoint:

interceptor.MockJSON(`/api/users`, 200, map[string]any{
    "users": []map[string]any{{"id": 1, "name": "Alice"}},
})

func (*NetworkInterceptor) MockText added in v0.2.0

func (ni *NetworkInterceptor) MockText(pattern string, status int, body string, headerPairs ...string) *NetworkInterceptor

MockText intercepts requests matching the pattern and returns a text response.

func (*NetworkInterceptor) OnRequest

func (ni *NetworkInterceptor) OnRequest(pattern string, handler func(req *InterceptedRequest)) *NetworkInterceptor

OnRequest adds a route that intercepts matching requests. The pattern uses CDP URL glob syntax: * matches any characters. Example: "*api/users*" matches any URL containing "api/users".
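To get a feel for which URLs a glob pattern selects, the sketch below approximates the matching by translating the pattern into a regular expression. It is not how gosurfer or Chrome evaluates patterns: matchURL and globToRegexp are hypothetical, treating ? as a single-character wildcard is an assumption borrowed from the CDP pattern syntax, and backslash escapes are not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a URL glob into an anchored regular expression.
// '*' matches any run of characters and '?' matches exactly one character.
func globToRegexp(pattern string) *regexp.Regexp {
	var b strings.Builder
	b.WriteString("(?s)^")
	for _, r := range pattern {
		switch r {
		case '*':
			b.WriteString(".*")
		case '?':
			b.WriteString(".")
		default:
			// Escape everything else so it matches literally.
			b.WriteString(regexp.QuoteMeta(string(r)))
		}
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// matchURL reports whether url matches the glob pattern.
func matchURL(pattern, url string) bool {
	return globToRegexp(pattern).MatchString(url)
}

func main() {
	urls := []string{
		"https://example.com/api/users/42",
		"https://example.com/api/orders",
		"https://cdn.ads.example.com/banner.js",
	}
	for _, p := range []string{"*api/users*", "*.ads.*"} {
		for _, u := range urls {
			fmt.Printf("%-14s %-40s %v\n", p, u, matchURL(p, u))
		}
	}
}
```

Because the expression is anchored at both ends, a pattern without a leading and trailing * has to match the whole URL.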

func (*NetworkInterceptor) Start

func (ni *NetworkInterceptor) Start()

Start begins intercepting requests. Call Stop() when done.

func (*NetworkInterceptor) Stop

func (ni *NetworkInterceptor) Stop() error

Stop stops intercepting requests.

type NetworkRoute

type NetworkRoute struct {
	// Pattern matches request URLs. OnRequest documents its patterns as CDP URL
	// glob syntax, where * matches any characters.
	Pattern string

	// Handler processes matching requests.
	Handler func(req *InterceptedRequest)
}

NetworkRoute defines a rule for intercepting network requests.

type OpenAIProvider

type OpenAIProvider struct {
	// contains filtered or unexported fields
}

OpenAIProvider implements LLMProvider for OpenAI-compatible APIs.

func NewOllama

func NewOllama(model string) *OpenAIProvider

NewOllama creates an Ollama provider (OpenAI-compatible local inference).

func NewOpenAI

func NewOpenAI(apiKey, model string) *OpenAIProvider

NewOpenAI creates an OpenAI provider.

func NewOpenAICompatible

func NewOpenAICompatible(baseURL, apiKey, model string) *OpenAIProvider

NewOpenAICompatible creates a provider for OpenAI-compatible APIs (e.g., Ollama, vLLM).

func (*OpenAIProvider) ChatCompletion

func (p *OpenAIProvider) ChatCompletion(ctx context.Context, messages []ChatMessage, opts ...ChatOption) (*ChatResponse, error)

func (*OpenAIProvider) Name

func (p *OpenAIProvider) Name() string

type Page

type Page struct {
	// contains filtered or unexported fields
}

Page wraps a browser tab with high-level interaction methods.

func (*Page) AutoDismissDialogs

func (p *Page) AutoDismissDialogs() func()

AutoDismissDialogs sets up automatic handling of JS dialogs. Alerts and confirms are accepted; prompts are dismissed. Returns a cancel function to stop auto-handling.

func (*Page) Back

func (p *Page) Back() error

Back navigates backward in history.

func (*Page) ClearCookies

func (p *Page) ClearCookies() error

ClearCookies deletes all cookies.

func (*Page) ClearGeolocation added in v0.2.0

func (p *Page) ClearGeolocation() error

ClearGeolocation removes the geolocation override.

func (*Page) Click

func (p *Page) Click(selector string) error

Click finds an element by selector and clicks it.

func (*Page) Close

func (p *Page) Close() error

Close closes this page/tab.

func (*Page) DOMState

func (p *Page) DOMState() (*DOMState, error)

DOMState extracts the current page state optimized for LLM consumption. This is the key method for AI agent integration.

func (*Page) DOMStateWithScreenshot

func (p *Page) DOMStateWithScreenshot() (*DOMState, error)

DOMStateWithScreenshot extracts DOM state and captures a screenshot.

func (*Page) DeleteCookies

func (p *Page) DeleteCookies(name string) error

DeleteCookies deletes cookies matching the given name.

func (*Page) DetectCAPTCHA

func (p *Page) DetectCAPTCHA() (*CAPTCHAInfo, error)

DetectCAPTCHA inspects the current page for known CAPTCHA providers. Returns nil if no CAPTCHA is found.

func (*Page) DetectChallenge added in v0.5.0

func (p *Page) DetectChallenge() (ChallengeType, error)

DetectChallenge returns the bot-protection challenge currently shown on the page, or ChallengeNone if the page looks like real content.

func (*Page) DragDrop

func (p *Page) DragDrop(fromX, fromY, toX, toY float64) error

DragDrop performs a drag from one coordinate to another on the page.

func (*Page) Element

func (p *Page) Element(selector string) (*Element, error)

Element finds a single element by CSS selector (waits for it to appear).

func (*Page) ElementByXPath

func (p *Page) ElementByXPath(xpath string) (*Element, error)

ElementByXPath finds an element by XPath expression.

func (*Page) Elements

func (p *Page) Elements(selector string) ([]*Element, error)

Elements finds all matching elements by CSS selector.

func (*Page) EmulateDevice added in v0.2.0

func (p *Page) EmulateDevice(d Device) error

EmulateDevice applies a device profile to the page (viewport, user agent, touch).

page.EmulateDevice(gosurfer.DeviceIPhoneX)

func (*Page) Eval

func (p *Page) Eval(js string, args ...interface{}) (interface{}, error)

Eval evaluates JavaScript in the page context and returns the result.

func (*Page) FocusedDOMState added in v0.2.3

func (p *Page) FocusedDOMState() (*DOMState, error)

FocusedDOMState extracts a pruned DOM state with boilerplate stripped. Removes nav, footer, cookie banners, ad containers, social links, and low-value links (terms, privacy, copyright, same-page anchors). Focuses on <main>, <article>, [role="main"] content regions. Typically produces 30-60% fewer tokens than DOMState.

func (*Page) Forward

func (p *Page) Forward() error

Forward navigates forward in history.

func (*Page) Frames

func (p *Page) Frames() ([]*Page, error)

Frames returns all iframes on the page as Page instances.

func (*Page) FullScreenshot

func (p *Page) FullScreenshot() ([]byte, error)

FullScreenshot captures the entire scrollable page, not just the visible viewport, as PNG bytes.

func (*Page) GetAllByRole added in v0.1.1

func (p *Page) GetAllByRole(role string, opts ...LocatorOption) ([]*Element, error)

GetAllByRole finds all elements matching the given ARIA role.

func (*Page) GetAllByText added in v0.1.1

func (p *Page) GetAllByText(text string, opts ...LocatorOption) ([]*Element, error)

GetAllByText finds all elements whose text content matches.

func (*Page) GetByAltText added in v0.1.1

func (p *Page) GetByAltText(text string, opts ...LocatorOption) (*Element, error)

GetByAltText finds the first element with a matching alt attribute.

func (*Page) GetByLabel added in v0.1.1

func (p *Page) GetByLabel(text string, opts ...LocatorOption) (*Element, error)

GetByLabel finds the first form element associated with a label matching the text. Checks <label for="...">, nested inputs inside <label>, and aria-label attributes.

func (*Page) GetByPlaceholder added in v0.1.1

func (p *Page) GetByPlaceholder(text string, opts ...LocatorOption) (*Element, error)

GetByPlaceholder finds the first element with a matching placeholder attribute.

func (*Page) GetByRole added in v0.1.1

func (p *Page) GetByRole(role string, opts ...LocatorOption) (*Element, error)

GetByRole finds the first element matching the given ARIA role. Supports both explicit role attributes and implicit roles from HTML semantics. Use Name("Submit") to filter by accessible name.

func (*Page) GetByTestID added in v0.1.1

func (p *Page) GetByTestID(id string) (*Element, error)

GetByTestID finds the first element with a matching data-testid attribute.

func (*Page) GetByText added in v0.1.1

func (p *Page) GetByText(text string, opts ...LocatorOption) (*Element, error)

GetByText finds the first element whose text content matches. By default uses case-insensitive substring matching; use Exact() for exact match.

func (*Page) GetCookie

func (p *Page) GetCookie(name string) (string, error)

GetCookie returns the value of the named cookie, or an empty string if not found.

func (*Page) GetCookies

func (p *Page) GetCookies() ([]Cookie, error)

GetCookies returns all cookies for the current page.

func (*Page) GetStorageState added in v0.1.1

func (p *Page) GetStorageState() (*StorageState, error)

GetStorageState captures the page's cookies and localStorage into a StorageState.

func (*Page) HTML

func (p *Page) HTML() (string, error)

HTML returns the full page HTML.

func (*Page) HandleDialog

func (p *Page) HandleDialog() (wait func() *Dialog, handle func(accept bool, promptText string) error)

HandleDialog returns two functions: wait blocks until the next JS dialog opens, and handle accepts/dismisses it. Use for fine-grained dialog control.

func (*Page) HandleFileDialog

func (p *Page) HandleFileDialog() (func(paths []string) error, error)

HandleFileDialog prepares to handle a native file chooser dialog. Call before the action that triggers the file dialog, then invoke the returned function with the file paths to select.

func (*Page) HumanClick added in v0.2.8

func (p *Page) HumanClick(selector string) error

HumanClick clicks an element with a random offset from center and a small delay, mimicking natural mouse behavior instead of clicking dead center instantly.

func (*Page) HumanMoveMouse added in v0.2.8

func (p *Page) HumanMoveMouse(toX, toY float64) error

HumanMoveMouse moves the mouse along a curved path to the target coordinates, simulating natural hand movement using a Bezier-like curve.
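One way to produce such a curved path is a quadratic Bezier curve, where a control point off the straight line bends the trajectory. The sketch below is an illustration of that technique only: the point type, bezierPath, and the choice of a quadratic curve are assumptions, and the library's actual curve may differ.

```go
package main

import "fmt"

// point is a hypothetical 2D viewport coordinate.
type point struct{ X, Y float64 }

// bezierPath samples a quadratic Bezier curve from p0 to p2, bent toward
// ctrl, and returns steps+1 points including both endpoints.
func bezierPath(p0, ctrl, p2 point, steps int) []point {
	if steps < 1 {
		steps = 1
	}
	path := make([]point, 0, steps+1)
	for i := 0; i <= steps; i++ {
		t := float64(i) / float64(steps)
		u := 1 - t
		// B(t) = u^2*p0 + 2*u*t*ctrl + t^2*p2
		path = append(path, point{
			X: u*u*p0.X + 2*u*t*ctrl.X + t*t*p2.X,
			Y: u*u*p0.Y + 2*u*t*ctrl.Y + t*t*p2.Y,
		})
	}
	return path
}

func main() {
	// The control point sits above the straight line, so the path arcs.
	path := bezierPath(point{0, 0}, point{50, 100}, point{100, 0}, 10)
	for _, p := range path {
		fmt.Printf("(%.1f, %.1f)\n", p.X, p.Y)
	}
}
```

Randomizing the control point per move would make each path slightly different, which is the usual reason to prefer a curve over linear interpolation.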

func (*Page) HumanScroll added in v0.2.8

func (p *Page) HumanScroll(deltaY float64) error

HumanScroll scrolls the page with natural-feeling incremental movements.

func (*Page) HumanType added in v0.2.8

func (p *Page) HumanType(selector string, text string) error

HumanType types text character by character with random delays between keystrokes, mimicking natural typing speed (~50-150ms per character).
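The timing described above can be sketched as one random delay per keystroke, drawn from the 50-150ms range. keystrokeDelays is a hypothetical helper, and the uniform distribution is an assumption; the documentation only gives the approximate range.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// keystrokeDelays returns one delay per character of text, each uniformly
// distributed in [50ms, 150ms]. The seed makes the sequence reproducible.
func keystrokeDelays(text string, seed int64) []time.Duration {
	rng := rand.New(rand.NewSource(seed))
	runes := []rune(text)
	delays := make([]time.Duration, len(runes))
	for i := range runes {
		// rng.Intn(101) yields 0..100, so the result is 50..150 ms.
		delays[i] = time.Duration(50+rng.Intn(101)) * time.Millisecond
	}
	return delays
}

func main() {
	text := "hello"
	var total time.Duration
	for i, d := range keystrokeDelays(text, 1) {
		total += d
		fmt.Printf("%c after %v\n", []rune(text)[i], d)
	}
	fmt.Println("total typing time:", total)
}
```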

func (*Page) Intercept

func (p *Page) Intercept() *NetworkInterceptor

Intercept sets up network request interception on the page.

func (*Page) IsIframe

func (p *Page) IsIframe() bool

IsIframe returns whether this page represents an iframe.

func (*Page) KeyPress

func (p *Page) KeyPress(key input.Key) error

KeyPress sends a keyboard event (e.g., input.Enter, input.Escape, input.Tab).

func (*Page) LoadStorageState added in v0.1.1

func (p *Page) LoadStorageState(path string) error

LoadStorageState restores cookies and localStorage from a JSON file. The page should be navigated to the relevant origin before calling this.

page.Navigate("https://example.com")
page.LoadStorageState("auth.json")

func (*Page) LocalStorageAll

func (p *Page) LocalStorageAll() (map[string]string, error)

LocalStorageAll returns all localStorage key-value pairs.

func (*Page) LocalStorageClear

func (p *Page) LocalStorageClear() error

LocalStorageClear clears all localStorage.

func (*Page) LocalStorageDelete

func (p *Page) LocalStorageDelete(key string) error

LocalStorageDelete removes a key from localStorage.

func (*Page) LocalStorageGet

func (p *Page) LocalStorageGet(key string) (string, error)

LocalStorageGet retrieves a value from localStorage.

func (*Page) LocalStorageSet

func (p *Page) LocalStorageSet(key, value string) error

LocalStorageSet stores a value in localStorage.

func (*Page) Navigate

func (p *Page) Navigate(url string) error

Navigate loads a URL and waits for the page to be ready. If the page is served a bot-protection challenge that can be auto-solved (e.g., Cloudflare's "Just a moment..." JavaScript challenge), Navigate will poll until the challenge clears or the configured timeout elapses.

Navigate does NOT return an error when the page lands on a non-auto-solvable challenge (Turnstile, DataDome) — the page did load, it just loaded a challenge. Callers who want to fail fast in that case should call DetectChallenge() after Navigate and check the return value. Only an auto-solvable challenge that fails to clear within the timeout produces an error.

func (*Page) PDF

func (p *Page) PDF() ([]byte, error)

PDF generates a PDF of the current page.

func (*Page) Reload

func (p *Page) Reload() error

Reload refreshes the current page.

func (*Page) RestoreStorageState added in v0.1.1

func (p *Page) RestoreStorageState(state *StorageState) error

RestoreStorageState applies a StorageState to the current page.

func (*Page) Rod

func (p *Page) Rod() *rod.Page

Rod returns the underlying rod.Page for advanced usage.

func (*Page) SaveStorageState added in v0.1.1

func (p *Page) SaveStorageState(path string) error

SaveStorageState serializes the page's cookies and localStorage to a JSON file. Use LoadStorageState to restore the state in a future session.

// After logging in:
page.SaveStorageState("auth.json")

func (*Page) Screenshot

func (p *Page) Screenshot() ([]byte, error)

Screenshot captures the visible viewport as PNG bytes.

func (*Page) ScreenshotJPEG

func (p *Page) ScreenshotJPEG(quality int) ([]byte, error)

ScreenshotJPEG captures the viewport as JPEG bytes with the given quality (0-100).

func (*Page) Scroll

func (p *Page) Scroll(dx, dy float64) error

Scroll scrolls the page by the given number of pixels. Positive dy scrolls down, negative scrolls up. Positive dx scrolls right, negative scrolls left.

func (*Page) ScrollToBottom

func (p *Page) ScrollToBottom() error

ScrollToBottom scrolls to the bottom of the page.

func (*Page) ScrollToTop

func (p *Page) ScrollToTop() error

ScrollToTop scrolls to the top of the page.

func (*Page) SessionStorageGet

func (p *Page) SessionStorageGet(key string) (string, error)

SessionStorageGet retrieves a value from sessionStorage.

func (*Page) SessionStorageSet

func (p *Page) SessionStorageSet(key, value string) error

SessionStorageSet stores a value in sessionStorage.

func (*Page) SetColorScheme added in v0.2.0

func (p *Page) SetColorScheme(scheme ColorScheme) error

SetColorScheme emulates a CSS prefers-color-scheme media feature.

page.SetColorScheme(gosurfer.ColorSchemeDark)

func (*Page) SetCookie

func (p *Page) SetCookie(name, value, domain, path string) error

SetCookie sets a single cookie on the current page.

func (*Page) SetCookies

func (p *Page) SetCookies(cookies []Cookie) error

SetCookies sets multiple cookies at once.

func (*Page) SetGeolocation added in v0.2.0

func (p *Page) SetGeolocation(latitude, longitude, accuracy float64) error

SetGeolocation sets the geographic location for the page. Pass accuracy in meters (e.g., 100.0 for ~100m accuracy).

page.SetGeolocation(37.7749, -122.4194, 100) // San Francisco

func (*Page) SetLocale added in v0.2.0

func (p *Page) SetLocale(locale string) error

SetLocale overrides the browser locale for Intl APIs (number formatting, date formatting, etc.). Use ICU locale format like "en_US", "fr_FR", "ja_JP". Note: this affects Intl.DateTimeFormat, Intl.NumberFormat, etc. but not navigator.language.

func (*Page) SetNetworkConditions added in v0.2.0

func (p *Page) SetNetworkConditions(latencyMs float64, downloadBytesPerSec, uploadBytesPerSec float64) error

SetNetworkConditions emulates network throttling. Latency is in milliseconds, throughput values are in bytes/second.

page.SetNetworkConditions(150, 1.6*1024*1024, 750*1024) // Regular 3G
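Because the throughput arguments are in bytes per second, link speeds quoted in bits per second need to be divided by 8 first. The helpers below are hypothetical (they are not part of gosurfer) and assume decimal units, where 1 Mbps is 1,000,000 bits per second.

```go
package main

import "fmt"

// mbpsToBytesPerSec converts megabits per second to bytes per second.
// Hypothetical helper; assumes decimal units (1 Mbps = 1,000,000 bits/s).
func mbpsToBytesPerSec(mbps float64) float64 {
	return mbps * 1_000_000 / 8
}

// kbpsToBytesPerSec converts kilobits per second to bytes per second.
// Hypothetical helper; assumes decimal units (1 kbps = 1,000 bits/s).
func kbpsToBytesPerSec(kbps float64) float64 {
	return kbps * 1_000 / 8
}

func main() {
	down := mbpsToBytesPerSec(8)   // an 8 Mbps downlink
	up := kbpsToBytesPerSec(750)   // a 750 kbps uplink
	fmt.Printf("download=%.0f B/s upload=%.0f B/s\n", down, up)
}
```

The resulting values can be passed as the downloadBytesPerSec and uploadBytesPerSec arguments.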

func (*Page) SetOffline added in v0.2.0

func (p *Page) SetOffline(offline bool) error

SetOffline simulates offline/online network conditions.

page.SetOffline(true)  // disconnect
page.SetOffline(false) // reconnect

func (*Page) SetReducedMotion added in v0.2.0

func (p *Page) SetReducedMotion(motion ReducedMotion) error

SetReducedMotion emulates a CSS prefers-reduced-motion media feature.

func (*Page) SetTimezone added in v0.2.0

func (p *Page) SetTimezone(timezoneID string) error

SetTimezone overrides the browser timezone. Use IANA timezone identifiers like "America/New_York", "Europe/London", "Asia/Tokyo".

func (*Page) SetTouchEnabled added in v0.2.0

func (p *Page) SetTouchEnabled(enabled bool) error

SetTouchEnabled enables or disables touch event emulation.

func (*Page) SetUserAgent added in v0.2.0

func (p *Page) SetUserAgent(userAgent string) error

SetUserAgent overrides the browser user agent string.

func (*Page) SetViewport added in v0.2.0

func (p *Page) SetViewport(width, height int, scaleFactor float64, mobile bool) error

SetViewport overrides the page viewport dimensions and device scale factor.

page.SetViewport(375, 812, 3.0, true) // iPhone X

func (*Page) SolveCAPTCHA

func (p *Page) SolveCAPTCHA(ctx context.Context, solver CAPTCHASolver) error

SolveCAPTCHA detects a CAPTCHA on the page and solves it using the provided solver. It injects the solution token into the page automatically.

func (*Page) StartHAR

func (p *Page) StartHAR() *HARRecorder

StartHAR begins recording network traffic on the page. Call StopHAR() when done, then Export() to get the HAR data.

func (*Page) TargetID

func (p *Page) TargetID() string

TargetID returns the CDP target ID for this page (used for tab tracking). Returns the last 4 chars as a short ID, matching Browser Use convention.

func (*Page) Text

func (p *Page) Text(selector string) (string, error)

Text returns the text content of an element matched by selector.

func (*Page) Title

func (p *Page) Title() (string, error)

Title returns the page title.

func (*Page) Type

func (p *Page) Type(selector, text string) error

Type finds an element by selector and types text into it.

func (*Page) URL

func (p *Page) URL() string

URL returns the current page URL.

func (*Page) WaitForChallenge added in v0.5.0

func (p *Page) WaitForChallenge(timeout time.Duration) (ChallengeType, time.Duration, error)

WaitForChallenge polls the page until any auto-solvable challenge has cleared or the timeout elapses. It returns:

  • the challenge type that was initially detected (ChallengeNone if no challenge was present)
  • the time spent waiting
  • an error ONLY if an auto-solvable challenge failed to clear in time

Non-auto-solvable challenges (Turnstile, DataDome) are returned without an error — the page did load, it just loaded a challenge. The caller can inspect the returned ChallengeType to decide what to do.

A timeout of 0 disables waiting (no-op). Callers should pick a value appropriate to the challenge — Cloudflare UAM typically resolves in 5-15 seconds, rarely longer.
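The return contract above can be modelled with a small polling loop. This is a sketch of the documented semantics only, not gosurfer's code: the challenge type, its constants, the detect callback, and the polling interval are stand-ins, and treating a zero timeout as "report what was detected and return immediately" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// challenge is a stand-in for a challenge classification (assumption).
type challenge int

const (
	challengeNone         challenge = iota
	challengeAutoSolvable           // clears on its own, e.g. a JavaScript interstitial
	challengeManual                 // needs outside help, e.g. a widget-style challenge
)

var errNotCleared = errors.New("auto-solvable challenge did not clear before the timeout")

// waitForClear polls detect until an auto-solvable challenge disappears.
// It returns the initially detected kind, the time spent waiting, and an
// error only when an auto-solvable challenge outlives the timeout.
func waitForClear(detect func() challenge, timeout, interval time.Duration) (challenge, time.Duration, error) {
	start := time.Now()
	initial := detect()
	// Nothing to wait for: no challenge, a challenge that cannot clear by
	// itself, or waiting disabled with a zero timeout.
	if initial != challengeAutoSolvable || timeout <= 0 {
		return initial, time.Since(start), nil
	}
	deadline := start.Add(timeout)
	for time.Now().Before(deadline) {
		time.Sleep(interval)
		if detect() != challengeAutoSolvable {
			return initial, time.Since(start), nil
		}
	}
	return initial, time.Since(start), errNotCleared
}

func main() {
	polls := 0
	detect := func() challenge {
		polls++
		if polls > 3 {
			return challengeNone // the interstitial has cleared
		}
		return challengeAutoSolvable
	}
	kind, waited, err := waitForClear(detect, 15*time.Second, 10*time.Millisecond)
	fmt.Println(kind == challengeAutoSolvable, waited > 0, err)
}
```

A caller that wants to fail fast on a non-auto-solvable challenge would check the returned kind even when the error is nil.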

func (*Page) WaitIdle

func (p *Page) WaitIdle(timeout time.Duration) error

WaitIdle waits until the page has no pending network requests.

func (*Page) WaitLoad

func (p *Page) WaitLoad() error

WaitLoad waits for the page load event.

func (*Page) WaitPopup

func (p *Page) WaitPopup() func() (*Page, error)

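WaitPopup returns a function that blocks until this page opens a new page/tab (e.g., window.open, target="_blank") and then returns it. Call WaitPopup before the action that triggers the popup, then call the returned function to obtain the new page.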

func (*Page) WaitSelector

func (p *Page) WaitSelector(selector string) (*Element, error)

WaitSelector waits for an element matching the selector to appear.

func (*Page) WaitStable

func (p *Page) WaitStable(interval time.Duration) error

WaitStable waits until the page DOM stops changing.

type PageExpect added in v0.1.1

type PageExpect struct {
	// contains filtered or unexported fields
}

PageExpect provides auto-retrying page-level assertions.

func Expect added in v0.1.1

func Expect(page *Page, opts ...ExpectOption) *PageExpect

Expect creates auto-retrying assertions for a page. Assertions retry until they pass or the timeout expires (default 5s).

expect := gosurfer.Expect(page)
err := expect.ToHaveTitle("Dashboard")
err = expect.Locator("#btn").ToBeVisible()
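The retry-until-timeout behaviour can be pictured as a loop around a check function. The sketch below uses only the standard library; retryUntil, errNotReady, and the polling interval are illustrative assumptions, not gosurfer's internals.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a placeholder failure used by the demo below.
var errNotReady = errors.New("not ready")

// retryUntil calls check repeatedly until it returns nil or the timeout
// would be exceeded, and then reports the last failure.
func retryUntil(timeout, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		// Stop if another sleep would take us past the deadline.
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("assertion still failing after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	polls := 0
	err := retryUntil(5*time.Second, 100*time.Millisecond, func() error {
		polls++
		if polls < 3 {
			return errNotReady // e.g. the title has not updated yet
		}
		return nil
	})
	fmt.Println("polls:", polls, "err:", err)
}
```

Wrapping the last failure with %w keeps the underlying cause available to errors.Is and errors.As.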

func (*PageExpect) Locator added in v0.1.1

func (e *PageExpect) Locator(selector string) *LocatorExpect

Locator returns a LocatorExpect for assertions on a specific element.

func (*PageExpect) ToHaveTitle added in v0.1.1

func (e *PageExpect) ToHaveTitle(expected string) error

ToHaveTitle asserts the page title equals the expected string.

func (*PageExpect) ToHaveTitleContaining added in v0.1.1

func (e *PageExpect) ToHaveTitleContaining(substring string) error

ToHaveTitleContaining asserts the page title contains the substring.

func (*PageExpect) ToHaveURL added in v0.1.1

func (e *PageExpect) ToHaveURL(expected string) error

ToHaveURL asserts the page URL equals the expected string.

func (*PageExpect) ToHaveURLContaining added in v0.1.1

func (e *PageExpect) ToHaveURLContaining(substring string) error

ToHaveURLContaining asserts the page URL contains the substring.

type ParamDef

type ParamDef struct {
	Name        string `json:"name"`
	Type        string `json:"type"` // "string", "int", "float", "bool"
	Description string `json:"description"`
	Required    bool   `json:"required"`
}

ParamDef describes a parameter for an action.

type ReducedMotion added in v0.2.0

type ReducedMotion string

ReducedMotion represents a CSS reduced-motion preference.

const (
	ReducedMotionReduce       ReducedMotion = "reduce"
	ReducedMotionNoPreference ReducedMotion = ""
)

type Secrets

type Secrets struct {
	// contains filtered or unexported fields
}

Secrets manages sensitive data (credentials, TOTP secrets) for the agent. Keys ending in "_totp" are automatically treated as TOTP secrets and generate fresh codes on each access.

func NewSecrets

func NewSecrets(data map[string]string) *Secrets

NewSecrets creates a Secrets store from a key-value map.

func (*Secrets) Get

func (s *Secrets) Get(key string) (string, error)

Get retrieves a secret value. For keys ending in "_totp", a fresh TOTP code is generated from the stored secret.
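For readers unfamiliar with how a TOTP code is derived from a stored secret, here is a minimal RFC 6238 sketch using only the standard library. It assumes HMAC-SHA1, a 30-second step, 6 digits, and a raw byte key; the encoding gosurfer expects for stored secrets is not stated here, and totpAt is a hypothetical helper, not the library's API.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"time"
)

// totpAt computes a 6-digit RFC 6238 code (HMAC-SHA1, 30-second step)
// for a raw key at the given Unix time in seconds.
func totpAt(key []byte, unixSeconds int64) string {
	// The moving factor is the number of 30-second steps since the epoch.
	counter := uint64(unixSeconds / 30)
	var msg [8]byte
	binary.BigEndian.PutUint64(msg[:], counter)

	mac := hmac.New(sha1.New, key)
	mac.Write(msg[:])
	sum := mac.Sum(nil)

	// Dynamic truncation (RFC 4226, section 5.3): the low nibble of the
	// last byte selects a 4-byte window; the top bit is masked off.
	offset := sum[len(sum)-1] & 0x0f
	code := binary.BigEndian.Uint32(sum[offset:offset+4]) & 0x7fffffff
	return fmt.Sprintf("%06d", code%1_000_000)
}

func main() {
	key := []byte("12345678901234567890") // RFC 6238 test key, not a real secret
	fmt.Println(totpAt(key, 59))          // RFC test time; the RFC's 8-digit value is 94287082
	fmt.Println(totpAt(key, time.Now().Unix()))
}
```

Because the code depends on the current 30-second window, a value fetched once should not be cached across steps, which is consistent with generating a fresh code on each access.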

func (*Secrets) Has

func (s *Secrets) Has(key string) bool

Has returns whether a key exists.

func (*Secrets) Keys

func (s *Secrets) Keys() []string

Keys returns all secret key names.

func (*Secrets) ReplaceInText

func (s *Secrets) ReplaceInText(text string) string

ReplaceInText replaces {{secret_name}} placeholders in text with actual secret values (generating TOTP codes for _totp keys).
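The placeholder substitution can be sketched with a regular expression. The helper below is an illustration under stated assumptions, not the library's code: replacePlaceholders and its lookup callback are hypothetical, secret names are assumed to be word characters, and leaving unknown placeholders untouched is a design choice made for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches {{name}} where name consists of word characters.
var placeholder = regexp.MustCompile(`\{\{(\w+)\}\}`)

// replacePlaceholders swaps each {{name}} for the value returned by lookup.
// Placeholders with no matching secret are left as they are (assumption).
func replacePlaceholders(text string, lookup func(key string) (string, bool)) string {
	return placeholder.ReplaceAllStringFunc(text, func(match string) string {
		key := match[2 : len(match)-2] // strip the surrounding braces
		if value, ok := lookup(key); ok {
			return value
		}
		return match
	})
}

func main() {
	secrets := map[string]string{"username": "alice", "password": "s3cret"}
	lookup := func(key string) (string, bool) {
		v, ok := secrets[key]
		return v, ok
	}
	fmt.Println(replacePlaceholders("login {{username}} / {{password}} / {{missing}}", lookup))
}
```

Routing every lookup through a callback is what would let keys ending in "_totp" return a freshly generated code instead of a stored string.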

type StepInfo

type StepInfo struct {
	Step     int
	Thought  string
	Action   string
	Params   map[string]interface{}
	Result   string
	Error    error
	Duration time.Duration
	URL      string
}

StepInfo provides information about a completed agent step.

type StorageState added in v0.1.1

type StorageState struct {
	Cookies      []Cookie          `json:"cookies"`
	LocalStorage map[string]string `json:"localStorage"`
	Origin       string            `json:"origin"`
}

StorageState captures the full browser storage for a page: cookies and localStorage. It can be serialized to JSON for reuse across sessions (e.g., preserving login state).
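The JSON shape implied by the struct tags above can be illustrated with a local mirror of the type. In this sketch the storageState and cookie types are stand-ins: the cookie fields in particular are assumptions, since the real Cookie type is defined elsewhere, and encodeState and decodeState are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cookie is a simplified stand-in for the package's Cookie type (assumption).
type cookie struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

// storageState mirrors the documented JSON tags of StorageState.
type storageState struct {
	Cookies      []cookie          `json:"cookies"`
	LocalStorage map[string]string `json:"localStorage"`
	Origin       string            `json:"origin"`
}

// encodeState serializes a state to indented JSON.
func encodeState(s storageState) ([]byte, error) {
	return json.MarshalIndent(s, "", "  ")
}

// decodeState parses JSON produced by encodeState.
func decodeState(data []byte) (storageState, error) {
	var s storageState
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	state := storageState{
		Cookies:      []cookie{{Name: "session", Value: "abc123"}},
		LocalStorage: map[string]string{"theme": "dark"},
		Origin:       "https://example.com",
	}
	data, err := encodeState(state)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))

	restored, err := decodeState(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(restored.Origin, restored.LocalStorage["theme"])
}
```

A file written this way contains session cookies, so it deserves the same care as any other credential.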

type TabInfo

type TabInfo struct {
	ID    string `json:"id"`
	URL   string `json:"url"`
	Title string `json:"title"`
}

TabInfo describes an open browser tab.

type TokenUsage

type TokenUsage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

TokenUsage tracks token consumption.

type TwoCaptchaSolver

type TwoCaptchaSolver struct {
	APIKey  string
	BaseURL string // default: https://2captcha.com
	// contains filtered or unexported fields
}

TwoCaptchaSolver implements CAPTCHASolver using the 2captcha.com API.

func NewTwoCaptchaSolver

func NewTwoCaptchaSolver(apiKey string) *TwoCaptchaSolver

NewTwoCaptchaSolver creates a 2Captcha solver.

func (*TwoCaptchaSolver) Name

func (s *TwoCaptchaSolver) Name() string

func (*TwoCaptchaSolver) Solve

func (s *TwoCaptchaSolver) Solve(ctx context.Context, info CAPTCHAInfo) (string, error)

Directories

Path Synopsis
cmd
gosurfer command
gosurfer CLI - persistent browser automation from the command line.
gosurfer-mcp command
Structured request logging with credential-safe URL sanitization.
examples
benchmark command
Benchmark: measures memory utilization of gosurfer during browser automation.
scrape command
Example: Direct browser automation (no AI) using gosurfer.
search command
Example: AI-powered web search using gosurfer.
