crawler

package
v0.0.0-beta.30
Warning

This package is not in the latest version of its module.
Published: Mar 23, 2026 License: MIT Imports: 12 Imported by: 0

Documentation

Overview

Package crawler discovers and fetches pages from documentation websites. It supports sitemap.xml discovery (MkDocs, Docusaurus, Sphinx) and falls back to BFS link-following within the same origin.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Options

type Options struct {
	MaxPages    int  // 0 = unlimited
	MaxDepth    int  // 0 = unlimited BFS depth
	Concurrency int  // parallel fetchers (default 4)
	SkipSitemap bool // force BFS even if sitemap.xml exists
}

Options controls crawl behaviour.

type Page

type Page = loader.RawDocument

Page is a fetched documentation page.

func Crawl

func Crawl(ctx context.Context, rootURL string, opts Options) ([]*Page, error)

Crawl fetches all pages reachable from rootURL within the same origin. It tries sitemap.xml first; if absent it falls back to BFS link following.
