cloner

package
v0.0.0-...-9c3f283
Warning

This package is not in the latest version of its module.

Published: Feb 27, 2026 | License: GPL-3.0 | Imports: 16 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func RewriteAssetURLs

func RewriteAssetURLs(htmlContent string, assets []Asset) string

RewriteAssetURLs replaces original asset URLs in HTML with local /assets/ paths. Assets that were not successfully downloaded keep their original URLs.
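
The behavior described above can be illustrated with a minimal sketch. It is not the package's implementation: `rewriteAssetURLsSketch` is a hypothetical stand-in, the `Asset` struct is redeclared locally so the example compiles on its own, and matching only quoted attribute values is an assumption made to keep the sketch short.

```go
package main

import (
	"fmt"
	"strings"
)

// Asset mirrors the documented struct; it is redeclared here only so the
// sketch is self-contained.
type Asset struct {
	OriginalURL string
	AbsoluteURL string
	LocalPath   string // filename under assets/
	Tag         string
	Attr        string
	Downloaded  bool
}

// rewriteAssetURLsSketch is a hypothetical illustration of the documented
// contract: downloaded assets point at /assets/<LocalPath>, everything else
// keeps its original URL.
func rewriteAssetURLsSketch(htmlContent string, assets []Asset) string {
	for _, a := range assets {
		if !a.Downloaded || a.OriginalURL == "" {
			continue // failed downloads keep their original URL
		}
		// Assumption: only quoted attribute values are replaced, so a URL
		// that is a prefix of another URL is not rewritten by accident.
		for _, q := range []string{`"`, `'`} {
			htmlContent = strings.ReplaceAll(htmlContent,
				q+a.OriginalURL+q, q+"/assets/"+a.LocalPath+q)
		}
	}
	return htmlContent
}

func main() {
	page := `<img src="img/logo.png"><script src="https://cdn.example.com/app.js"></script>`
	assets := []Asset{
		{OriginalURL: "img/logo.png", LocalPath: "logo.png", Tag: "img", Attr: "src", Downloaded: true},
		{OriginalURL: "https://cdn.example.com/app.js", LocalPath: "app.js", Tag: "script", Attr: "src", Downloaded: false},
	}
	fmt.Println(rewriteAssetURLsSketch(page, assets))
	// Prints: <img src="/assets/logo.png"><script src="https://cdn.example.com/app.js"></script>
}
```

The script tag is left untouched because its asset has Downloaded set to false, so the cloned page still loads it from the original host.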

func RewriteCSSURLs

func RewriteCSSURLs(css string, assets []CSSAsset) string

RewriteCSSURLs replaces original url() and @import references in CSS with local /assets/ paths. Only assets that were successfully downloaded are rewritten.
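
A rough sketch of how such a rewrite could work, using the OriginalRef and RawURL fields of CSSAsset documented below. The function `rewriteCSSURLsSketch` is hypothetical and the struct is redeclared locally. Substituting RawURL inside the matched token, rather than rebuilding the token, is an assumption chosen so that the original quoting style is preserved.

```go
package main

import (
	"fmt"
	"strings"
)

// CSSAsset mirrors the documented struct; redeclared so the sketch compiles
// on its own.
type CSSAsset struct {
	OriginalRef string // the full matched token, e.g. `url("foo.png")`
	RawURL      string // the URL string within the token
	AbsoluteURL string
	LocalPath   string
	Downloaded  bool
}

// rewriteCSSURLsSketch is a hypothetical illustration, not the package's
// code. It swaps RawURL for a local path inside each matched token.
func rewriteCSSURLsSketch(css string, assets []CSSAsset) string {
	for _, a := range assets {
		if !a.Downloaded || a.OriginalRef == "" || a.RawURL == "" {
			continue // only successfully downloaded assets are rewritten
		}
		// Build the replacement token first, then swap it into the CSS.
		localRef := strings.Replace(a.OriginalRef, a.RawURL, "/assets/"+a.LocalPath, 1)
		css = strings.ReplaceAll(css, a.OriginalRef, localRef)
	}
	return css
}

func main() {
	css := `@import "theme.css"; body { background: url("bg.png"); }`
	assets := []CSSAsset{
		{OriginalRef: `@import "theme.css"`, RawURL: "theme.css", LocalPath: "theme.css", Downloaded: false},
		{OriginalRef: `url("bg.png")`, RawURL: "bg.png", LocalPath: "bg.png", Downloaded: true},
	}
	fmt.Println(rewriteCSSURLsSketch(css, assets))
	// Prints: @import "theme.css"; body { background: url("/assets/bg.png"); }
}
```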

Types

type Asset

type Asset struct {
	OriginalURL string
	AbsoluteURL string
	LocalPath   string // filename under assets/
	Tag         string
	Attr        string
	Downloaded  bool
}

Asset represents a discovered external resource in the cloned page.

func ExtractAssetURLs

func ExtractAssetURLs(htmlContent string, base *url.URL) []Asset

ExtractAssetURLs parses HTML and returns all external assets referenced in tags, inline <style> blocks, and style="" attributes.
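
The base parameter and the separate OriginalURL and AbsoluteURL fields suggest that each reference found in the page is resolved against the page URL. The documentation does not spell this out, so the following is a sketch under that assumption; `resolveAgainst` is a hypothetical helper built on the standard net/url package.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveAgainst is a hypothetical helper showing how a reference taken from
// an HTML attribute could be turned into an absolute URL.
func resolveAgainst(base, ref string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	r, err := url.Parse(ref)
	if err != nil {
		return "", err
	}
	// ResolveReference applies the RFC 3986 rules for relative references.
	return b.ResolveReference(r).String(), nil
}

func main() {
	base := "https://example.com/blog/post.html"
	refs := []string{
		"img/logo.png",               // relative to the page's directory
		"/css/site.css",              // relative to the host root
		"//cdn.example.com/app.js",   // scheme-relative
		"https://other.example/x.js", // already absolute
	}
	for _, ref := range refs {
		abs, err := resolveAgainst(base, ref)
		if err != nil {
			fmt.Println("skip:", ref, err)
			continue
		}
		fmt.Printf("%s -> %s\n", ref, abs)
	}
}
```

With this base, img/logo.png resolves to https://example.com/blog/img/logo.png, and the scheme-relative reference inherits https from the page.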

type CSSAsset

type CSSAsset struct {
	OriginalRef string // the full matched token, e.g. `url("foo.png")`
	RawURL      string // the URL string within the token
	AbsoluteURL string
	LocalPath   string
	Downloaded  bool
}

CSSAsset represents a resource discovered inside a CSS file.

func ExtractCSSURLs

func ExtractCSSURLs(css string, cssBaseURL *url.URL) []CSSAsset

ExtractCSSURLs finds all url() and @import references in CSS text, resolving relative URLs against cssBaseURL (the URL of the CSS file itself).
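
One way to approximate this extraction is a single regular expression covering url() tokens and quoted @import targets. This is an illustrative sketch, not the package's implementation: `extractCSSURLsSketch`, `cssRefPattern`, and `mustParse` are hypothetical names, and skipping data: URIs is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// CSSAsset mirrors the documented struct; redeclared so the sketch compiles
// on its own.
type CSSAsset struct {
	OriginalRef string
	RawURL      string
	AbsoluteURL string
	LocalPath   string
	Downloaded  bool
}

// cssRefPattern matches url(...) with optional quotes (group 1) or a quoted
// @import target (group 2). A real CSS tokenizer would be more robust.
var cssRefPattern = regexp.MustCompile(
	`(?i)url\(\s*["']?([^"')\s]+)["']?\s*\)|@import\s+["']([^"']+)["']`)

// extractCSSURLsSketch is a hypothetical illustration of the documented
// behavior: find references and resolve them against the stylesheet's URL.
func extractCSSURLsSketch(css string, cssBaseURL *url.URL) []CSSAsset {
	var out []CSSAsset
	for _, m := range cssRefPattern.FindAllStringSubmatch(css, -1) {
		raw := m[1]
		if raw == "" {
			raw = m[2]
		}
		// Assumption: inline data: URIs are not downloadable assets.
		if raw == "" || strings.HasPrefix(strings.ToLower(raw), "data:") {
			continue
		}
		ref, err := url.Parse(raw)
		if err != nil {
			continue
		}
		out = append(out, CSSAsset{
			OriginalRef: m[0],
			RawURL:      raw,
			AbsoluteURL: cssBaseURL.ResolveReference(ref).String(),
		})
	}
	return out
}

// mustParse is a small helper for the example; it panics on a malformed URL.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

func main() {
	base := mustParse("https://example.com/static/css/site.css")
	css := `@import "theme.css"; body { background: url('../img/bg.png'); }`
	for _, a := range extractCSSURLsSketch(css, base) {
		fmt.Printf("%s -> %s\n", a.OriginalRef, a.AbsoluteURL)
	}
}
```

Resolving against the stylesheet's own URL matters here: ../img/bg.png becomes https://example.com/static/img/bg.png, which would differ if the page URL were used as the base.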

type Cloner

type Cloner struct {
	// contains filtered or unexported fields
}

Cloner fetches and stores a clone of a target web page and its assets.

func New

func New(targetURL, cloneDir, userAgent string, insecure, verbose bool) *Cloner

New constructs a Cloner that fetches targetURL and writes the clone to cloneDir.

func (*Cloner) Clone

func (c *Cloner) Clone(ctx context.Context) error

Clone fetches the target page, downloads all assets, rewrites URLs, and writes index.html to the clone directory.
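
Because Clone accepts a context.Context, a caller can bound the whole operation with a deadline. The sketch below assumes Clone returns promptly once the context is cancelled, which the documentation does not state. The `pageCloner` interface, `slowCloner` stand-in, and `cloneWithTimeout` helper are all hypothetical, introduced so the example runs without importing the package.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pageCloner captures only the documented Clone method.
type pageCloner interface {
	Clone(ctx context.Context) error
}

// slowCloner is a hypothetical stand-in that finishes after delay unless the
// context is cancelled first.
type slowCloner struct{ delay time.Duration }

func (s slowCloner) Clone(ctx context.Context) error {
	select {
	case <-time.After(s.delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// cloneWithTimeout runs Clone with an upper bound on total duration.
func cloneWithTimeout(c pageCloner, limit time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel() // release the timer even when Clone returns early
	return c.Clone(ctx)
}

func main() {
	err := cloneWithTimeout(slowCloner{delay: time.Second}, 50*time.Millisecond)
	fmt.Println(errors.Is(err, context.DeadlineExceeded)) // prints "true"
}
```

With a real *Cloner, the value returned by New would take the place of slowCloner in the call to cloneWithTimeout.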
