Documentation
Index ¶
- func BuildIDMap(nodes []*Node) map[int]int64
- func Count(nodes []*Node) int
- func DumpRaw(w io.Writer, nodes []*accessibility.Node)
- func EnrichWithOCR(ctx context.Context, nodes []*Node, engine *ocr.Engine)
- func Extract(ctx context.Context) ([]*accessibility.Node, error)
- func Format(nodes []*Node) string
- func FormatWithOptions(nodes []*Node, opts FormatOptions) string
- type FormatOptions
- type Node
- func BuildAndFilter(rawNodes []*accessibility.Node) []*Node
Functions ¶
func BuildIDMap ¶
func BuildIDMap(nodes []*Node) map[int]int64
BuildIDMap traverses the tree with the same counter logic as Format and returns a map from display ID (1, 2, ...) to the node's backend DOM node ID (Node.BackendID). Callers use this map to translate user-facing element numbers into CDP-level identifiers for actions such as click and type.
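The mapping can be illustrated with a minimal sketch. It assumes that display IDs are assigned in depth-first pre-order and that a top-level root document node is flattened without consuming an ID; the local `node` type and the role name `RootWebArea` are stand-ins for illustration, not the package's actual code.

```go
package main

import "fmt"

// node is a stand-in for the package's Node type, reduced to the
// fields this sketch needs.
type node struct {
	Role      string
	BackendID int64
	Children  []*node
}

// buildIDMap walks the tree depth-first and assigns display IDs 1, 2, ...
// in visit order. Assumption: top-level root document nodes
// ("RootWebArea" here) are flattened, so they consume no display ID.
func buildIDMap(nodes []*node) map[int]int64 {
	ids := make(map[int]int64)
	counter := 0
	var walk func(ns []*node, top bool)
	walk = func(ns []*node, top bool) {
		for _, n := range ns {
			if !(top && n.Role == "RootWebArea") {
				counter++
				ids[counter] = n.BackendID
			}
			walk(n.Children, false)
		}
	}
	walk(nodes, true)
	return ids
}

func main() {
	tree := []*node{{
		Role: "RootWebArea", BackendID: 1,
		Children: []*node{
			{Role: "heading", BackendID: 10},
			{Role: "navigation", BackendID: 20, Children: []*node{
				{Role: "link", BackendID: 21},
			}},
		},
	}}
	ids := buildIDMap(tree)
	fmt.Println(len(ids), ids[1], ids[2], ids[3]) // 3 10 20 21
}
```

Because the counter must advance exactly as the renderer's does, any node the renderer skips has to be skipped here as well; otherwise display IDs and backend IDs drift apart.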
func Count ¶
func Count(nodes []*Node) int
Count returns the total number of displayable nodes in the tree. Root document nodes are flattened and therefore excluded from the count.
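As a rough sketch of this counting rule, assuming root document nodes appear only at the top level and carry the role `RootWebArea` (both assumptions, as is the local `node` type):

```go
package main

import "fmt"

// node is a stand-in for the package's Node type.
type node struct {
	Role     string
	Children []*node
}

// isRootDocument reports whether n is a root document wrapper.
// Assumption: such nodes carry the role "RootWebArea".
func isRootDocument(n *node) bool { return n.Role == "RootWebArea" }

// countAll counts every node in the slice plus all nodes below them.
func countAll(nodes []*node) int {
	total := 0
	for _, n := range nodes {
		total += 1 + countAll(n.Children)
	}
	return total
}

// count returns the number of displayable nodes: top-level root document
// nodes are flattened, so only their descendants are counted.
func count(nodes []*node) int {
	total := 0
	for _, n := range nodes {
		if isRootDocument(n) {
			total += countAll(n.Children)
		} else {
			total += 1 + countAll(n.Children)
		}
	}
	return total
}

func main() {
	tree := []*node{{Role: "RootWebArea", Children: []*node{
		{Role: "heading"},
		{Role: "navigation", Children: []*node{{Role: "link"}, {Role: "link"}}},
	}}}
	fmt.Println(count(tree)) // 4
}
```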
func DumpRaw ¶
func DumpRaw(w io.Writer, nodes []*accessibility.Node)
DumpRaw writes the raw AX node list to w for debugging purposes.
func EnrichWithOCR ¶
func EnrichWithOCR(ctx context.Context, nodes []*Node, engine *ocr.Engine)
EnrichWithOCR traverses the filtered tree and applies OCR to image and interactive elements that have no accessible name. This fills the gap where the AX tree provides no text for image buttons and links.
Patterns that trigger OCR (to minimize overhead):
- Any image/img node with no name (images are visual; missing alt = OCR)
- link/button nodes with no name and no text children (likely icon-only)
The OCR engine should be pre-initialized and reused across calls.
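The two trigger patterns above can be expressed as a small predicate. This is a sketch under assumptions: the local `node` type stands in for Node, and "no text children" is interpreted here as "no descendant with a non-empty name", which may be broader or narrower than the package's actual check.

```go
package main

import "fmt"

// node is a stand-in for the package's Node type.
type node struct {
	Role     string
	Name     string
	Children []*node
}

// hasNamedDescendant reports whether any descendant carries a non-empty name.
// Assumption: this approximates "has text children".
func hasNamedDescendant(n *node) bool {
	for _, c := range n.Children {
		if c.Name != "" || hasNamedDescendant(c) {
			return true
		}
	}
	return false
}

// needsOCR applies the two trigger patterns: unnamed images always qualify,
// unnamed links and buttons qualify only when nothing below them has text.
func needsOCR(n *node) bool {
	if n.Name != "" {
		return false
	}
	switch n.Role {
	case "image", "img":
		return true
	case "link", "button":
		return !hasNamedDescendant(n)
	}
	return false
}

func main() {
	iconButton := &node{Role: "button", Children: []*node{{Role: "image"}}}
	labelled := &node{Role: "button", Children: []*node{{Role: "StaticText", Name: "Submit"}}}
	fmt.Println(needsOCR(iconButton), needsOCR(labelled), needsOCR(&node{Role: "img"}))
	// true false true
}
```

Keeping the predicate this narrow is what limits overhead: nodes that already have a name, or whose children supply one, never reach the OCR engine.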
func Extract ¶
func Extract(ctx context.Context) ([]*accessibility.Node, error)
Extract fetches the full accessibility tree from the current page via CDP. This is a single CDP call (Accessibility.getFullAXTree) that typically completes in < 50ms, returning a flat list of AX nodes.
func Format ¶
func Format(nodes []*Node) string
Format renders the filtered tree as structured text suitable for LLM consumption. Output format:
    1: heading "Page Title"
    2: button "Submit" disabled
    3: navigation "Main Nav"
      4: link "Home"
      5: link "About"
    6: textbox "Search" focused
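A renderer for this line shape (`<id>: <role> "<name>" <states>`) might look like the following sketch. The local `node` type, the two-space indent per nesting level, and the use of `%q` for quoting names are assumptions for illustration rather than the package's confirmed behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a stand-in for the package's Node type.
type node struct {
	Role     string
	Name     string
	States   []string
	Children []*node
}

// format renders one line per node in depth-first order, numbering nodes
// with a single running counter. Assumption: children are indented by two
// spaces per nesting level.
func format(nodes []*node) string {
	var b strings.Builder
	id := 0
	var walk func(ns []*node, depth int)
	walk = func(ns []*node, depth int) {
		for _, n := range ns {
			id++
			b.WriteString(strings.Repeat("  ", depth))
			fmt.Fprintf(&b, "%d: %s", id, n.Role)
			if n.Name != "" {
				fmt.Fprintf(&b, " %q", n.Name)
			}
			for _, s := range n.States {
				b.WriteString(" " + s)
			}
			b.WriteByte('\n')
			walk(n.Children, depth+1)
		}
	}
	walk(nodes, 0)
	return b.String()
}

func main() {
	tree := []*node{
		{Role: "heading", Name: "Page Title"},
		{Role: "button", Name: "Submit", States: []string{"disabled"}},
		{Role: "navigation", Name: "Main Nav", Children: []*node{
			{Role: "link", Name: "Home"},
			{Role: "link", Name: "About"},
		}},
		{Role: "textbox", Name: "Search", States: []string{"focused"}},
	}
	fmt.Print(format(tree))
}
```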
func FormatWithOptions ¶
func FormatWithOptions(nodes []*Node, opts FormatOptions) string
FormatWithOptions renders the filtered tree with additional formatting controls.
Types ¶
type FormatOptions ¶
type FormatOptions struct {
	InteractiveOnly bool // only show interactive elements
	Compact         bool // compact mode: omit structural wrappers without names
	MaxDepth        int  // 0 = unlimited
	Cursor          bool // annotate the focused element with [cursor]
}
FormatOptions controls the formatting behavior.
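To make the field comments concrete, here is a hedged sketch of how such options could gate rendering. Everything below is illustrative: the local types, the role sets in `interactiveRoles` and `structuralRoles`, and the convention that top-level nodes sit at depth 1 are assumptions, not the package's implementation.

```go
package main

import "fmt"

// node is a stand-in for the package's Node type.
type node struct {
	Role   string
	Name   string
	States []string
}

// formatOptions mirrors the documented FormatOptions fields.
type formatOptions struct {
	InteractiveOnly bool
	Compact         bool
	MaxDepth        int
	Cursor          bool
}

// Assumed role sets, for illustration only.
var interactiveRoles = map[string]bool{"button": true, "link": true, "textbox": true}
var structuralRoles = map[string]bool{"generic": true, "group": true, "list": true}

// visible reports whether a node at the given depth (top level = 1,
// an assumption) would be rendered under opts.
func visible(n *node, depth int, opts formatOptions) bool {
	if opts.MaxDepth != 0 && depth > opts.MaxDepth {
		return false // MaxDepth 0 means unlimited
	}
	if opts.InteractiveOnly && !interactiveRoles[n.Role] {
		return false
	}
	if opts.Compact && n.Name == "" && structuralRoles[n.Role] {
		return false // unnamed structural wrapper
	}
	return true
}

// cursorSuffix returns " [cursor]" for a focused node when opts.Cursor is set.
func cursorSuffix(n *node, opts formatOptions) string {
	if !opts.Cursor {
		return ""
	}
	for _, s := range n.States {
		if s == "focused" {
			return " [cursor]"
		}
	}
	return ""
}

func main() {
	search := &node{Role: "textbox", Name: "Search", States: []string{"focused"}}
	wrapper := &node{Role: "generic"}
	opts := formatOptions{Compact: true, MaxDepth: 2, Cursor: true}
	fmt.Println(visible(search, 1, opts), visible(wrapper, 1, opts), visible(search, 3, opts))
	// true false false
	fmt.Printf("%s %q%s\n", search.Role, search.Name, cursorSuffix(search, opts))
	// textbox "Search" [cursor]
}
```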
type Node ¶
type Node struct {
	Role      string   // semantic role: button, link, heading, textbox, etc.
	Name      string   // accessible name (label)
	Value     string   // current value (e.g. text in an input field)
	States    []string // active states: focused, checked, unchecked, disabled, etc.
	Level     int      // heading level (1-6), 0 if not applicable
	BackendID int64    // backendDOMNodeId, for future CDP click/type operations
	Children  []*Node
}
Node is a simplified, filtered accessibility tree node. It carries only the information needed for LLM consumption.
func BuildAndFilter ¶
func BuildAndFilter(rawNodes []*accessibility.Node) []*Node
BuildAndFilter converts a flat CDP AX node list into a filtered, hierarchical tree. This is the core of Layer 2: deterministic local processing that consumes zero LLM tokens.
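The flat-to-tree conversion can be sketched as follows. The `rawNode` type is a simplified, hypothetical stand-in for the CDP AX node (which links parents to children by ID), and the filtering rule shown, dropping ignored nodes and hoisting their children, is an assumption about what "filtered" means here.

```go
package main

import "fmt"

// rawNode is a simplified stand-in for a flat CDP AX node.
type rawNode struct {
	ID       string
	ChildIDs []string
	Ignored  bool
	Role     string
	Name     string
}

// node is a stand-in for the package's Node type.
type node struct {
	Role     string
	Name     string
	Children []*node
}

// buildAndFilter indexes the flat list by ID, then rebuilds the hierarchy
// from every node that no other node lists as a child. Assumption: ignored
// nodes are removed and their children promoted to the nearest kept ancestor.
func buildAndFilter(raw []*rawNode) []*node {
	byID := make(map[string]*rawNode, len(raw))
	isChild := make(map[string]bool)
	for _, r := range raw {
		byID[r.ID] = r
		for _, c := range r.ChildIDs {
			isChild[c] = true
		}
	}
	var build func(r *rawNode) []*node
	build = func(r *rawNode) []*node {
		var kids []*node
		for _, id := range r.ChildIDs {
			if c, ok := byID[id]; ok {
				kids = append(kids, build(c)...)
			}
		}
		if r.Ignored {
			return kids // hoist children past the dropped node
		}
		return []*node{{Role: r.Role, Name: r.Name, Children: kids}}
	}
	var roots []*node
	for _, r := range raw {
		if !isChild[r.ID] {
			roots = append(roots, build(r)...)
		}
	}
	return roots
}

func main() {
	raw := []*rawNode{
		{ID: "1", Role: "RootWebArea", ChildIDs: []string{"2"}},
		{ID: "2", Role: "generic", Ignored: true, ChildIDs: []string{"3", "4"}},
		{ID: "3", Role: "button", Name: "OK"},
		{ID: "4", Role: "link", Name: "Home"},
	}
	tree := buildAndFilter(raw)
	fmt.Println(len(tree), len(tree[0].Children), tree[0].Children[0].Role) // 1 2 button
}
```

The whole pass is a map build plus one traversal, so it stays deterministic and local, in keeping with the description above.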