html

package
v0.159.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 10, 2026 License: AGPL-3.0 Imports: 12 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

var (
	ErrNotFound = errors.New("not found")
	ErrParseURL = errors.New("could not parse URL")
)

Functions

func DiscoverFeedURL added in v0.159.0

func DiscoverFeedURL(sourceURL *url.URL, content []byte) (string, error)

DiscoverFeedURL attempts to find a feed URL within an HTML page.

There are a couple of "canonical" places where the feed URL may be located. First, as per the RSS spec, look for a link element with rel="alternate" and type="application/rss+xml". Second, check for a link element whose URL ends with feed, rss or atom, which would indicate a feed URL.

func FindAllHTMLNodes

func FindAllHTMLNodes(n *html.Node, tag string) []*html.Node

FindAllHTMLNodes returns all nodes matching the tag within n.

func FindHTMLNode

func FindHTMLNode(n *html.Node, tag string) *html.Node

FindHTMLNode does a depth-first search for the first node matching the tag.
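As a minimal sketch of the traversal behind FindHTMLNode and FindAllHTMLNodes, the example below uses a small stand-in node type with FirstChild and NextSibling pointers instead of the real *html.Node. The node type, findNode and findAllNodes are hypothetical, and whether the real functions consider n itself or descend into matching nodes is an assumption.

```go
package main

import "fmt"

// node is a hypothetical stand-in for an HTML tree node, reduced to the
// fields the traversal needs.
type node struct {
	Tag         string
	FirstChild  *node
	NextSibling *node
}

// findNode returns the first node with the given tag in depth-first,
// document order, or nil when there is none.
func findNode(n *node, tag string) *node {
	if n == nil {
		return nil
	}
	if n.Tag == tag {
		return n
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if found := findNode(c, tag); found != nil {
			return found
		}
	}
	return nil
}

// findAllNodes returns every node with the given tag, in document order.
// It also descends into matching nodes (an assumption of this sketch).
func findAllNodes(n *node, tag string) []*node {
	if n == nil {
		return nil
	}
	var out []*node
	if n.Tag == tag {
		out = append(out, n)
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		out = append(out, findAllNodes(c, tag)...)
	}
	return out
}

func main() {
	// html -> body -> (p, div -> p)
	inner := &node{Tag: "p"}
	div := &node{Tag: "div", FirstChild: inner}
	first := &node{Tag: "p", NextSibling: div}
	body := &node{Tag: "body", FirstChild: first}
	root := &node{Tag: "html", FirstChild: body}

	fmt.Println(findNode(root, "p") == first)
	fmt.Println(len(findAllNodes(root, "p")))
}
```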

func FindMainImage

func FindMainImage(page []byte, rawURL string) (string, error)

FindMainImage tries to find a "main" image for the page, using the readability parser.

func IsHTML

func IsHTML(s string) bool

func IsHTMLElement

func IsHTMLElement(str, tag string) bool

IsHTMLElement returns a boolean indicating whether the given string is the given HTML element.

func SanitizeHTMLString added in v0.83.0

func SanitizeHTMLString(rawStr string) (string, error)

SanitizeHTMLString will parse and re-render the given string containing HTML. In doing so, the HTML is hopefully sanitized and reformatted to be well-formed HTML.

func ToPlainText added in v0.155.0

func ToPlainText(s string) string

ToPlainText converts an HTML-encoded string to plain text.

Types

type Favicon

type Favicon struct {
	// contains filtered or unexported fields
}

Favicon is a favicon link found in <head>.

func FindFavicon

func FindFavicon(
	page []byte,
	pageURL string,
) ([]byte, string, Favicon, error)

FindFavicon tries each candidate in order and returns the first one that responds with a 2xx status and a non-empty body.
