crawler

package
v0.0.0-beta.30
Warning

This package is not in the latest version of its module.
Published: Mar 23, 2026 License: MIT Imports: 12 Imported by: 0

Documentation

Overview

Package crawler discovers and fetches pages from documentation websites. It supports sitemap.xml discovery (MkDocs, Docusaurus, Sphinx) and falls back to BFS link-following within the same origin.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Options

type Options struct {
	MaxPages    int  // 0 = unlimited
	MaxDepth    int  // 0 = unlimited BFS depth
	Concurrency int  // parallel fetchers (default 4)
	SkipSitemap bool // force BFS even if sitemap.xml exists
}

Options controls crawl behaviour.

type Page

type Page = loader.RawDocument

Page is a fetched documentation page.

func Crawl

func Crawl(ctx context.Context, rootURL string, opts Options) ([]*Page, error)

Crawl fetches all pages reachable from rootURL within the same origin. It tries sitemap.xml first; if absent it falls back to BFS link following.
