browser

package

Version: v0.1.6 (latest) | Published: Mar 4, 2026 | License: Apache-2.0 | Imports: 10 | Imported by: 0

Repository

github.com/stackgenhq/genie

Documentation ¶

Overview ¶

Package browser implements a chromedp-backed browser automation provider. It exposes small, composable trpc-agent tools that an AI agent can invoke to navigate pages, interact with elements, extract content, and take screenshots. Without this package the agent has no way to observe or manipulate live web pages.

Index ¶

func AllTools(b *Browser) []tool.CallableTool
func NewClickTool(b *Browser) tool.CallableTool
func NewEvalJSTool(b *Browser) tool.CallableTool
func NewNavigateTool(b *Browser) tool.CallableTool
func NewReadHTMLTool(b *Browser) tool.CallableTool
func NewReadTextTool(b *Browser) tool.CallableTool
func NewScreenshotTool(b *Browser) tool.CallableTool
func NewTypeTool(b *Browser) tool.CallableTool
func NewWaitTool(b *Browser) tool.CallableTool
type Browser
- func New(ctx context.Context, opts ...Option) (*Browser, error)
- func (b *Browser) Close()
- func (b *Browser) GetTools() []tool.Tool
- func (b *Browser) NewTab(parent context.Context) (context.Context, context.CancelFunc, error)
type ClickRequest
type ClickResponse
type Config
type EvalJSRequest
type EvalJSResponse
type NavigateRequest
type NavigateResponse
type Option
- func WithBlockedDomains(domains []string) Option
- func WithHeadless(v bool) Option
- func WithTimeout(d time.Duration) Option
- func WithViewport(width, height int) Option
type ReadHTMLRequest
type ReadHTMLResponse
type ReadTextRequest
type ReadTextResponse
type ScreenshotRequest
type ScreenshotResponse
type TypeRequest
type TypeResponse
type WaitRequest
type WaitResponse

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func AllTools ¶

func AllTools(b *Browser) []tool.CallableTool

AllTools returns every browser tool wired to the given Browser instance. This is a convenience function for registering all tools at once.

func NewClickTool ¶

func NewClickTool(b *Browser) tool.CallableTool

NewClickTool creates the browser_click tool. It waits for the element to become visible and then clicks it. Without this tool the agent cannot interact with buttons, links, or other clickable elements.

func NewEvalJSTool ¶

func NewEvalJSTool(b *Browser) tool.CallableTool

NewEvalJSTool creates the browser_eval_js tool. It evaluates an arbitrary JavaScript expression in the page context. This is the escape hatch for any interaction that the other tools cannot cover.

func NewNavigateTool ¶

func NewNavigateTool(b *Browser) tool.CallableTool

NewNavigateTool creates the browser_navigate tool. It opens the requested URL in the shared browser tab. Without this tool the agent has no way to load a web page.

func NewReadHTMLTool ¶

func NewReadHTMLTool(b *Browser) tool.CallableTool

NewReadHTMLTool creates the browser_read_html tool. It returns the outer HTML of an element, useful when the agent needs structural information. Without this tool the agent can only see text, not the underlying markup.

func NewReadTextTool ¶

func NewReadTextTool(b *Browser) tool.CallableTool

NewReadTextTool creates the browser_read_text tool. It extracts the visible text content of an element. Without this tool the agent cannot read page content as plain text.

func NewScreenshotTool ¶

func NewScreenshotTool(b *Browser) tool.CallableTool

NewScreenshotTool creates the browser_screenshot tool. It captures a PNG screenshot of the viewport or a specific element and returns it as base64. Without this tool the agent has no visual feedback of the page state.

func NewTypeTool ¶

func NewTypeTool(b *Browser) tool.CallableTool

NewTypeTool creates the browser_type tool. It focuses the element and types the given text. Without this tool the agent cannot fill out forms.

func NewWaitTool ¶

func NewWaitTool(b *Browser) tool.CallableTool

NewWaitTool creates the browser_wait tool. It lets the agent pause execution until a specific condition is met: a fixed duration has elapsed, a selector has become visible, or the network is idle.

Types ¶

type Browser ¶

type Browser struct {
	// contains filtered or unexported fields
}

Browser manages a shared chromedp browser session. All tools operate on the same browser tab so that navigation state is preserved across calls. Without this struct every tool call would launch a new browser, losing cookies, logins, and page context.

func New ¶

func New(ctx context.Context, opts ...Option) (*Browser, error)

New allocates a new Chrome browser process (headless by default) and returns a Browser that tools can share. Callers MUST call Close when finished to avoid leaking Chrome processes.

func (*Browser) Close ¶

func (b *Browser) Close()

Close tears down the browser process and releases all resources. It is safe to call multiple times.
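
The "safe to call multiple times" guarantee is commonly implemented with sync.Once. The following is a minimal sketch of that idea, not the package's actual implementation; the session type and its teardown field are hypothetical stand-ins for a type that owns a browser process.

```go
package main

import (
	"fmt"
	"sync"
)

// session is a hypothetical stand-in for a type that owns an expensive
// resource, such as a browser process.
type session struct {
	once     sync.Once
	teardown func() // releases the underlying resource
}

// Close runs teardown at most once, so repeated calls are harmless.
func (s *session) Close() {
	s.once.Do(s.teardown)
}

func main() {
	calls := 0
	s := &session{teardown: func() { calls++ }}

	s.Close()
	s.Close() // second call is a no-op

	fmt.Println("teardown calls:", calls) // prints "teardown calls: 1"
}
```

Because a repeated Close is harmless, callers can pair New with a deferred Close and still close explicitly on an early exit path.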

func (*Browser) GetTools ¶

func (b *Browser) GetTools() []tool.Tool

GetTools satisfies the tools.ToolProviders interface so a Browser instance can be passed directly to tools.NewRegistry. Without this method, browser tool construction would have to be inlined in the registry.

func (*Browser) NewTab ¶

func (b *Browser) NewTab(parent context.Context) (context.Context, context.CancelFunc, error)

NewTab creates a new isolated browser context (tab). The caller is responsible for cancelling the returned context to close the tab. The tab will also be closed if the underlying browser context is cancelled (for example, via Close).

Note: The parent argument is currently ignored when the tab is created, which ensures that the tab always belongs to this Browser instance. To tie the tab to an existing context's lifecycle, wrap the returned context with context.WithCancel or context.WithTimeout and cancel it when your parent context ends; chromedp's context structure does not support linking the two directly.

type ClickRequest ¶

type ClickRequest struct {
	Selector string `json:"selector" jsonschema:"description=CSS selector of the element to click,required"`
}

ClickRequest is the input for the browser_click tool.

type ClickResponse ¶

type ClickResponse struct {
	Status string `json:"status"`
}

ClickResponse is the output for the browser_click tool.

type Config ¶

type Config struct {
	BlockedDomains []string `yaml:"blocked_domains,omitempty" toml:"blocked_domains,omitempty"`
}

Config holds configuration for the browser tool provider. BlockedDomains prevents the agent from navigating to specific domains (e.g. internal admin panels, payment processors). Matching is suffix-based so "example.com" also blocks "sub.example.com".

type EvalJSRequest ¶

type EvalJSRequest struct {
	Expression string `json:"expression" jsonschema:"description=JavaScript expression to evaluate in the page context,required"`
}

EvalJSRequest is the input for the browser_eval_js tool.

type EvalJSResponse ¶

type EvalJSResponse struct {
	Result string `json:"result"`
}

EvalJSResponse is the output for the browser_eval_js tool.

type NavigateRequest ¶

type NavigateRequest struct {
	URL string `json:"url" jsonschema:"description=The URL to navigate to,required"`
}

NavigateRequest is the input for the browser_navigate tool.

type NavigateResponse ¶

type NavigateResponse struct {
	Status string `json:"status"`
	URL    string `json:"url"`
}

NavigateResponse is the output for the browser_navigate tool.
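
The json tags above define the wire format that an agent uses when calling browser_navigate. The sketch below shows that round trip with local copies of the two types (jsonschema tags omitted) so that it compiles on its own. The helpers encodeArgs and decodeResult are hypothetical, and the status value "ok" is invented for illustration; the documentation does not list the actual status strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, so the sketch is self-contained.
type NavigateRequest struct {
	URL string `json:"url"`
}

type NavigateResponse struct {
	Status string `json:"status"`
	URL    string `json:"url"`
}

// encodeArgs renders the JSON arguments for a browser_navigate call.
func encodeArgs(url string) (string, error) {
	b, err := json.Marshal(NavigateRequest{URL: url})
	return string(b), err
}

// decodeResult parses a tool result into a NavigateResponse.
func decodeResult(raw string) (NavigateResponse, error) {
	var resp NavigateResponse
	err := json.Unmarshal([]byte(raw), &resp)
	return resp, err
}

func main() {
	args, err := encodeArgs("https://example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(args) // prints {"url":"https://example.com"}

	// The status value here is made up; only the field names are documented.
	resp, err := decodeResult(`{"status":"ok","url":"https://example.com/"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Status, resp.URL)
}
```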

type Option ¶

type Option func(*browserOpts)

Option configures a Browser instance.

func WithBlockedDomains ¶

func WithBlockedDomains(domains []string) Option

WithBlockedDomains sets domains that the browser is not allowed to navigate to. Matching is suffix-based: "example.com" blocks both "example.com" and "sub.example.com". This is a safety measure to prevent the agent from accessing sensitive internal services.
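
Suffix-based matching can be sketched as follows. This is an illustration, not the package's code: isBlocked is a hypothetical helper, and matching on a label boundary (so that "example.com" does not block "notexample.com") is an assumption. The documentation only states that "example.com" blocks both "example.com" and "sub.example.com"; a plain strings.HasSuffix check would satisfy that too, while also blocking "notexample.com".

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isBlocked reports whether rawURL's host equals a blocked domain or is a
// subdomain of one. Unparseable URLs are treated as blocked, which is a
// fail-closed choice made for this sketch.
func isBlocked(rawURL string, blocked []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return true
	}
	host := strings.ToLower(u.Hostname())
	for _, d := range blocked {
		d = strings.ToLower(strings.TrimPrefix(d, "."))
		if d == "" {
			continue
		}
		// Exact match, or a suffix match that starts at a dot.
		if host == d || strings.HasSuffix(host, "."+d) {
			return true
		}
	}
	return false
}

func main() {
	blocked := []string{"example.com"}
	for _, u := range []string{
		"https://example.com/login",
		"https://sub.example.com/",
		"https://notexample.com/",
	} {
		fmt.Println(u, isBlocked(u, blocked))
	}
}
```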

func WithHeadless ¶

func WithHeadless(v bool) Option

WithHeadless controls whether the browser runs without a visible window. It defaults to true. Setting this to false is useful during local debugging.

func WithTimeout ¶

func WithTimeout(d time.Duration) Option

WithTimeout overrides the default per-action timeout of 30 seconds.

func WithViewport ¶

func WithViewport(width, height int) Option

WithViewport sets the browser window size.

type ReadHTMLRequest ¶

type ReadHTMLRequest struct {
	Selector string `json:"selector" jsonschema:"description=CSS selector of the element whose outer HTML to read,required"`
}

ReadHTMLRequest is the input for the browser_read_html tool.

type ReadHTMLResponse ¶

type ReadHTMLResponse struct {
	HTML string `json:"html"`
}

ReadHTMLResponse is the output for the browser_read_html tool.

type ReadTextRequest ¶

type ReadTextRequest struct {
	Selector string `json:"selector" jsonschema:"description=CSS selector of the element whose visible text to read,required"`
}

ReadTextRequest is the input for the browser_read_text tool.

type ReadTextResponse ¶

type ReadTextResponse struct {
	Text string `json:"text"`
}

ReadTextResponse is the output for the browser_read_text tool.

type ScreenshotRequest ¶

type ScreenshotRequest struct {
	Selector string `` /* 146-byte string literal not displayed */
}

ScreenshotRequest is the input for the browser_screenshot tool.

type ScreenshotResponse ¶

type ScreenshotResponse struct {
	ImageBase64 string `json:"image_base64"`
}

ScreenshotResponse is the output for the browser_screenshot tool.
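
A consumer of ScreenshotResponse typically decodes ImageBase64 back into PNG bytes before saving or inspecting the image. The sketch below does that with the standard library. It assumes standard (padded) base64 encoding, which the documentation does not specify; decodeScreenshot and samplePNG are hypothetical helpers, and samplePNG exists only so that the example runs without a browser.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"image"
	"image/png"
)

// decodeScreenshot decodes a base64 string and checks that the result
// parses as a PNG, returning the raw bytes and the image dimensions.
func decodeScreenshot(imageBase64 string) ([]byte, image.Config, error) {
	raw, err := base64.StdEncoding.DecodeString(imageBase64)
	if err != nil {
		return nil, image.Config{}, fmt.Errorf("decode base64: %w", err)
	}
	cfg, err := png.DecodeConfig(bytes.NewReader(raw))
	if err != nil {
		return nil, image.Config{}, fmt.Errorf("not a PNG: %w", err)
	}
	return raw, cfg, nil
}

// samplePNG builds a blank w-by-h PNG in memory and base64-encodes it.
func samplePNG(w, h int) string {
	var buf bytes.Buffer
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	if err := png.Encode(&buf, img); err != nil {
		panic(err)
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes())
}

func main() {
	raw, cfg, err := decodeScreenshot(samplePNG(4, 3))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes, %dx%d\n", len(raw), cfg.Width, cfg.Height)
}
```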

type TypeRequest ¶

type TypeRequest struct {
	Selector string `json:"selector" jsonschema:"description=CSS selector of the input element,required"`
	Text     string `json:"text" jsonschema:"description=Text to type into the element,required"`
}

TypeRequest is the input for the browser_type tool.

type TypeResponse ¶

type TypeResponse struct {
	Status string `json:"status"`
}

TypeResponse is the output for the browser_type tool.

type WaitRequest ¶

type WaitRequest struct {
	Selector    string `json:"selector,omitempty" jsonschema:"description=CSS selector to wait for visibility."`
	Duration    string `json:"duration,omitempty" jsonschema:"description=Duration to wait (e.g. '2s', '500ms')."`
	NetworkIdle bool   `json:"network_idle,omitempty" jsonschema:"description=If true, wait for network (HTML+images+CSS) to be idle."`
}

WaitRequest is the input for the browser_wait tool.

type WaitResponse ¶

type WaitResponse struct {
	Status string `json:"status"`
}

WaitResponse is the output for the browser_wait tool.
