asset

package
v0.3.2 Latest
This package is not in the latest version of its module.
Published: Jun 15, 2026 License: MIT Imports: 12 Imported by: 0

Documentation

Constants

This section is empty.

Variables

var ErrTooLarge = errors.New("asset over size cap")

ErrTooLarge reports that an asset exceeds the size cap and was skipped without being saved. It is deliberately a skip, not a download failure: the caller leaves the asset out of the mirror rather than writing a truncated fragment of it, so a 500 MB installer or video never bloats the archive with a corrupt quarter of itself.

Functions

func RewriteCSS

func RewriteCSS(css []byte, base *url.URL, sink RefSink) []byte

RewriteCSS rewrites every url(...) and @import in a stylesheet so its references point at local files. base is the stylesheet's own URL (so relative references resolve correctly); sink maps each absolute URL to its local path. data: URLs and unparseable references are left untouched.

func RewriteHTML

func RewriteHTML(root *html.Node, base *url.URL, sink RefSink)

RewriteHTML walks the parsed document and rewrites every resource and link reference through sink, resolving relative URLs against base. It mutates the tree in place; the caller renders it afterwards. References kage cannot handle (data:, mailto:, fragment-only, …) are left untouched.

Types

type Downloader

type Downloader struct {
	Client    *http.Client
	UserAgent string
	MaxBytes  int64 // per-asset cap; 0 = unlimited
	Retries   int   // extra attempts for a transient failure (0 = try once)
}

Downloader fetches asset bytes over plain HTTP. It is separate from the Chrome pool: assets are public bytes that rarely need a real browser, so a fast HTTP client keeps the crawl cheap. Failures are returned to the caller, which logs them and moves on; a missing asset degrades a page but never aborts a clone.

func NewDownloader

func NewDownloader(userAgent string, timeout time.Duration, maxBytes int64) *Downloader

NewDownloader builds a Downloader with a sane client, the given timeout, and the given user agent and per-asset size cap.

func (*Downloader) Get

func (d *Downloader) Get(ctx context.Context, u *url.URL, referer string) (*Result, error)

Get fetches u, sending referer as the Referer header. It reads at most MaxBytes and reports whether the body is CSS (so the caller can rewrite it). A transient failure (a 403/429/5xx or a network blip) is retried with a short backoff up to Retries times.

type RefSink

type RefSink func(u *url.URL, kind urlx.Kind) string

RefSink registers a resolved asset/page URL with the cloner and returns the string to write back into the markup or CSS — a relative local path for things kage saves, or the absolute URL for anything it leaves on the live web.

type Result

type Result struct {
	Body        []byte
	ContentType string
	IsCSS       bool
}

Result is a downloaded asset.

type StatusError added in v0.1.2

type StatusError struct {
	Code int
}

StatusError reports a non-2xx HTTP response. It carries the code so callers can render a clear message ("HTTP 403 Forbidden") and decide whether a retry is worthwhile, without the URL baked in (the caller already has it).

func (*StatusError) Error added in v0.1.2

func (e *StatusError) Error() string
