Documentation
Index
- Variables
- type Agent
- func (a *Agent) Close() error
- func (a *Agent) CloseTab(tabID string) error
- func (a *Agent) GetTitle() string
- func (a *Agent) GetURL() string
- func (a *Agent) IsStarted() bool
- func (a *Agent) ListTabs() []TabInfo
- func (a *Agent) Navigate(ctx context.Context, url string) error
- func (a *Agent) NewTab(ctx context.Context, url string) (string, error)
- func (a *Agent) Run(ctx context.Context, task string) (*Result, error)
- func (a *Agent) Start(ctx context.Context) error
- func (a *Agent) SwitchTab(tabID string) error
- func (a *Agent) WithContext(ctx context.Context) *ContextualAgent
- type Config
- type ContextualAgent
- type Preset
- type Result
- type Step
- type TabInfo
- type Viewport
Constants
This section is empty.
Variables
var (
	// ErrMissingAPIKey is returned when Config.APIKey is not set.
	ErrMissingAPIKey = errors.New("bua: API key is required")
	// ErrNotStarted is returned when Run is called before Start.
	ErrNotStarted = errors.New("bua: agent not started, call Start() first")
	// ErrAlreadyStarted is returned when Start is called twice.
	ErrAlreadyStarted = errors.New("bua: agent already started")
	// ErrMaxStepsReached is returned when the agent exceeds MaxSteps.
	ErrMaxStepsReached = errors.New("bua: maximum steps reached without completing task")
	// ErrBrowserClosed is returned when the browser is unexpectedly closed.
	ErrBrowserClosed = errors.New("bua: browser was closed")
	// ErrElementNotFound is returned when an element index is invalid.
	ErrElementNotFound = errors.New("bua: element not found")
	// ErrElementNotVisible is returned when an element is not visible.
	ErrElementNotVisible = errors.New("bua: element is not visible")
	ErrNavigationFailed = errors.New("bua: navigation failed")
	// ErrTimeout is returned when an operation times out.
	ErrTimeout = errors.New("bua: operation timed out")
	// ErrHumanTakeoverTimeout is returned when human intervention times out.
	ErrHumanTakeoverTimeout = errors.New("bua: human takeover timed out")
)
Common errors returned by the bua package.
Functions
This section is empty.
Types
type Agent
type Agent struct {
// contains filtered or unexported fields
}
Agent is the main entry point for browser automation driven by an LLM.
func (*Agent) Navigate
func (a *Agent) Navigate(ctx context.Context, url string) error
Navigate opens a URL in the browser. This is a convenience method for direct navigation without a task.
func (*Agent) Run
func (a *Agent) Run(ctx context.Context, task string) (*Result, error)
Run executes a task described in natural language. Returns a Result containing the outcome and execution details.
func (*Agent) WithContext
func (a *Agent) WithContext(ctx context.Context) *ContextualAgent
WithContext returns a helper for chaining operations with context.
type Config
type Config struct {
// APIKey is the Gemini API key (required).
APIKey string
// Model is the Gemini model to use. Default: "gemini-2.5-flash".
Model string
// Headless runs the browser without a visible window. Default: false.
Headless bool
// Debug enables verbose logging. Default: false.
Debug bool
// ProfileName specifies a named browser profile for session persistence.
// Empty string uses a temporary profile that is deleted on close.
ProfileName string
// ProfileDir is the directory to store browser profiles.
// Default: ~/.bua/profiles
ProfileDir string
// Viewport sets the browser viewport dimensions.
// Default: 1280x720
Viewport *Viewport
// MaxSteps is the maximum number of agent steps before giving up.
// Default: 100
MaxSteps int
// Preset configures token/quality tradeoffs.
// Default: PresetBalanced
Preset Preset
// MaxTokens is the maximum token budget for context.
// Set automatically based on Preset if not specified.
MaxTokens int
// MaxElements is the maximum number of elements to include in state.
// Set automatically based on Preset if not specified.
MaxElements int
// ScreenshotMaxWidth is the maximum width for screenshots.
// Set automatically based on Preset if not specified.
ScreenshotMaxWidth int
// ScreenshotQuality is the JPEG quality (1-100) for screenshots.
// Set automatically based on Preset if not specified.
ScreenshotQuality int
// TextOnly disables screenshots entirely for minimum token usage.
// Set automatically based on Preset if not specified.
TextOnly bool
// ShowAnnotations displays element indices on the page during execution.
// Useful for debugging. Default: false.
ShowAnnotations bool
// ShowHighlight highlights elements before actions.
// Default: true.
ShowHighlight bool
// HighlightDurationMs is how long to show action highlights, in milliseconds.
// Default: 300 (300ms).
HighlightDurationMs int
// ScreenshotDir is the directory to save screenshots.
// Default: system temp directory.
ScreenshotDir string
}
Config holds agent configuration.
type ContextualAgent
type ContextualAgent struct {
// contains filtered or unexported fields
}
ContextualAgent wraps Agent with a context for convenience methods.
func (*ContextualAgent) Navigate
func (ca *ContextualAgent) Navigate(url string) error
Navigate opens a URL using the stored context.
type Preset
type Preset string
Preset defines token/quality tradeoffs for different use cases.
const (
	// PresetFast uses text-only mode for lowest token usage.
	PresetFast Preset = "fast"
	// PresetEfficient uses low quality screenshots.
	PresetEfficient Preset = "efficient"
	// PresetBalanced is the default with good balance of quality and cost.
	PresetBalanced Preset = "balanced"
	// PresetQuality uses higher quality screenshots.
	PresetQuality Preset = "quality"
	// PresetMax uses maximum quality for complex pages.
	PresetMax Preset = "max"
)
type Result
type Result struct {
// Success indicates whether the task completed successfully.
Success bool
// Data contains the extracted data or task output.
// The type depends on what the agent was asked to do.
Data any
// Error contains the error message if Success is false.
Error string
// Steps contains the sequence of actions taken during execution.
Steps []Step
// Duration is the total execution time.
Duration time.Duration
// TokensUsed is the approximate number of tokens consumed.
TokensUsed int
// ScreenshotPaths contains paths to saved screenshots.
ScreenshotPaths []string
}
Result represents the outcome of a task execution.
type Step
type Step struct {
// Number is the step index (1-based).
Number int
// Action is the tool that was called (e.g., "click", "type_text").
Action string
// Target describes what the action was performed on.
Target string
// Thinking contains the agent's reasoning for this step.
Thinking string
// Evaluation is the agent's assessment of the previous action.
Evaluation string
// NextGoal describes what the agent planned to do.
NextGoal string
// Memory contains what the agent chose to remember.
Memory string
// URL is the page URL at this step.
URL string
// Title is the page title at this step.
Title string
// ScreenshotPath is the path to the screenshot for this step.
ScreenshotPath string
// Duration is how long this step took.
Duration time.Duration
// Error contains any error that occurred during this step.
Error string
}
Step represents a single action in the execution sequence.
Directories
| Path | Synopsis |
|---|---|
| examples | |
| examples/01_quick_start (command) | |
| examples/02_annotations (command) | |
| tests | |
| tests/e2e (command) | |