htmlutil

package
v0.8.3
Warning

This package is not in the latest version of its module.

Published: Dec 12, 2025 | License: Apache-2.0 | Imports: 4 | Imported by: 0

Documentation

Overview

Package htmlutil provides HTML processing utilities for social media scraping.

Constants

This section is empty.

Variables

This section is empty.

Functions

func ContactLinks

func ContactLinks(htmlContent, baseURL string) []string

ContactLinks extracts contact/about page URLs from HTML content. These pages often contain additional social media links.
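The package source is not shown here, so the following is only a standalone sketch of how contact/about link extraction could work with the standard library. The name `contactLinksSketch`, the regex-based href matching, and the "contact"/"about" path check are all assumptions made for illustration; the real `ContactLinks` may behave differently.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// hrefPattern matches double-quoted href values. A regex is a simplification;
// a production implementation would more likely use an HTML parser.
var hrefPattern = regexp.MustCompile(`(?i)href="([^"]+)"`)

// contactLinksSketch is a hypothetical stand-in, NOT the package's ContactLinks.
// It keeps links whose path mentions "contact" or "about" and resolves them
// against baseURL so relative links become absolute.
func contactLinksSketch(htmlContent, baseURL string) []string {
	base, err := url.Parse(baseURL)
	if err != nil {
		return nil
	}
	seen := make(map[string]bool)
	var links []string
	for _, m := range hrefPattern.FindAllStringSubmatch(htmlContent, -1) {
		ref, err := url.Parse(m[1])
		if err != nil {
			continue
		}
		path := strings.ToLower(ref.Path)
		if !strings.Contains(path, "contact") && !strings.Contains(path, "about") {
			continue
		}
		abs := base.ResolveReference(ref).String()
		if !seen[abs] {
			seen[abs] = true
			links = append(links, abs)
		}
	}
	return links
}

func main() {
	page := `<a href="/about">About</a> <a href="/blog">Blog</a> <a href="contact.html">Contact</a>`
	for _, link := range contactLinksSketch(page, "https://example.org/home/") {
		fmt.Println(link)
	}
}
```

Resolving against the base URL is the reason a function like this would need a `baseURL` parameter at all: pages commonly link to `/about` rather than to a full URL.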

func Description

func Description(htmlContent string) string

Description extracts the meta description from HTML content.
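As a rough illustration of what meta description extraction involves, here is a standalone sketch. `descriptionSketch` is a hypothetical name, and the assumption that `name` precedes `content` with double-quoted values is a simplification of this example, not documented behavior of `Description`.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// metaDescPattern assumes the attribute order name, then content, with
// double-quoted values. Real pages vary; this is only a sketch.
var metaDescPattern = regexp.MustCompile(`(?is)<meta\s+name="description"\s+content="([^"]*)"`)

// descriptionSketch is a hypothetical stand-in, NOT the package's Description.
// It returns the unescaped, trimmed content attribute, or "" if none is found.
func descriptionSketch(htmlContent string) string {
	m := metaDescPattern.FindStringSubmatch(htmlContent)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(html.UnescapeString(m[1]))
}

func main() {
	page := `<head><meta name="description" content="Tools &amp; tips"></head>`
	fmt.Println(descriptionSketch(page))
}
```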

func EmailAddresses

func EmailAddresses(htmlContent string) []string

EmailAddresses extracts email addresses from HTML content. It filters out common false positives such as noreply@ and example@ addresses.
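The filtering idea can be sketched as follows. The pattern, the blocked local parts, and the name `emailAddressesSketch` are assumptions for this example; the package's actual filter list is not documented beyond "noreply@, example@, etc."

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// emailPattern is a deliberately loose approximation of an email address.
var emailPattern = regexp.MustCompile(`[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`)

// blockedLocalParts is an assumed list; the real package may filter more.
var blockedLocalParts = map[string]bool{"noreply": true, "example": true}

// emailAddressesSketch is a hypothetical stand-in, NOT the package's
// EmailAddresses. It returns unique, lowercased addresses in order of first
// appearance, skipping blocked local parts.
func emailAddressesSketch(htmlContent string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, addr := range emailPattern.FindAllString(htmlContent, -1) {
		addr = strings.ToLower(addr)
		local := addr[:strings.Index(addr, "@")]
		if blockedLocalParts[local] || seen[addr] {
			continue
		}
		seen[addr] = true
		out = append(out, addr)
	}
	return out
}

func main() {
	page := `Write to <a href="mailto:team@acme.io">team@acme.io</a> or noreply@acme.io`
	fmt.Println(emailAddressesSketch(page))
}
```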

func ExtractEmailFromURL

func ExtractEmailFromURL(urlStr string) (string, bool)

ExtractEmailFromURL extracts an email address from URLs like "https://user@domain.com" or "http://email@example.com". It returns the email address and true if one is found, or an empty string and false otherwise.
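URLs of this shape parse with the "user" part in the userinfo component, which suggests one possible approach. The sketch below uses `net/url` and a hypothetical name, `emailFromURLSketch`; the extra checks (no password, no path) are assumptions of this example rather than documented behavior.

```go
package main

import (
	"fmt"
	"net/url"
)

// emailFromURLSketch is a hypothetical stand-in, NOT the package's
// ExtractEmailFromURL. It treats an http(s) URL of the form
// scheme://user@host as a mislabeled email address.
func emailFromURLSketch(urlStr string) (string, bool) {
	u, err := url.Parse(urlStr)
	if err != nil || u.User == nil {
		return "", false
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return "", false
	}
	// A password suggests real credentials rather than an email address.
	if _, hasPassword := u.User.Password(); hasPassword {
		return "", false
	}
	user, host := u.User.Username(), u.Hostname()
	if user == "" || host == "" || (u.Path != "" && u.Path != "/") {
		return "", false
	}
	return user + "@" + host, true
}

func main() {
	email, ok := emailFromURLSketch("https://user@domain.com")
	fmt.Println(email, ok) // prints "user@domain.com true"
}
```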

func ExtractRedirectURL added in v0.7.9

func ExtractRedirectURL(htmlContent string) string

ExtractRedirectURL checks HTML content for meta refresh or JavaScript redirects. It returns the redirect URL if one is found, or an empty string otherwise.
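To make the two redirect styles concrete, here is a standalone sketch that checks for a meta refresh first and then a simple `location` assignment. The patterns and the name `redirectURLSketch` are assumptions for illustration; the package may recognize more or fewer forms.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// metaRefreshPattern matches <meta http-equiv="refresh" content="N; url=...">.
	metaRefreshPattern = regexp.MustCompile(`(?i)<meta[^>]*http-equiv=["']?refresh["']?[^>]*content=["']\s*\d+\s*;\s*url=([^"'>\s]+)`)
	// jsRedirectPattern matches assignments such as window.location.href = "...".
	jsRedirectPattern = regexp.MustCompile(`(?i)(?:window\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']`)
)

// redirectURLSketch is a hypothetical stand-in, NOT the package's
// ExtractRedirectURL. It returns the first redirect target found, or "".
func redirectURLSketch(htmlContent string) string {
	if m := metaRefreshPattern.FindStringSubmatch(htmlContent); m != nil {
		return m[1]
	}
	if m := jsRedirectPattern.FindStringSubmatch(htmlContent); m != nil {
		return m[1]
	}
	return ""
}

func main() {
	fmt.Println(redirectURLSketch(`<meta http-equiv="refresh" content="0; url=https://example.org/new">`))
	fmt.Println(redirectURLSketch(`<script>window.location.href = "/moved";</script>`))
}
```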

func IsEmailURL

func IsEmailURL(urlStr string) bool

IsEmailURL returns true if the URL is a mailto: link or an email address with http(s):// prefix.
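The two cases named above can be sketched with `net/url`. `isEmailURLSketch` is a hypothetical name, and the requirement that the http(s) form has no path is an assumption of this example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isEmailURLSketch is a hypothetical stand-in, NOT the package's IsEmailURL.
// It reports true for mailto: links and for http(s) URLs that are really an
// email address behind a scheme prefix (user@host with no path).
func isEmailURLSketch(urlStr string) bool {
	u, err := url.Parse(strings.TrimSpace(urlStr))
	if err != nil {
		return false
	}
	switch u.Scheme {
	case "mailto":
		return true
	case "http", "https":
		return u.User != nil && u.Hostname() != "" && strings.Trim(u.Path, "/") == ""
	}
	return false
}

func main() {
	for _, s := range []string{
		"mailto:hi@example.org",
		"http://email@example.com",
		"https://example.com/about",
	} {
		fmt.Println(s, isEmailURLSketch(s))
	}
}
```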

func SocialLinks

func SocialLinks(htmlContent string) []string

SocialLinks extracts social media URLs from HTML content.
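Which sites count as social media is not specified here, so the sketch below uses a small, assumed host list purely for illustration. `socialLinksSketch` and `socialHosts` are hypothetical names and do not reflect the package's actual matching rules.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// socialHosts is an assumed, illustrative list of social media hosts.
var socialHosts = map[string]bool{
	"github.com":   true,
	"linkedin.com": true,
	"twitter.com":  true,
	"x.com":        true,
}

// absHrefPattern matches double-quoted absolute http(s) href values.
var absHrefPattern = regexp.MustCompile(`(?i)href="(https?://[^"]+)"`)

// socialLinksSketch is a hypothetical stand-in, NOT the package's SocialLinks.
// It returns unique links whose host (ignoring a "www." prefix) is listed
// in socialHosts, in order of first appearance.
func socialLinksSketch(htmlContent string) []string {
	seen := make(map[string]bool)
	var links []string
	for _, m := range absHrefPattern.FindAllStringSubmatch(htmlContent, -1) {
		u, err := url.Parse(m[1])
		if err != nil {
			continue
		}
		host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
		if socialHosts[host] && !seen[m[1]] {
			seen[m[1]] = true
			links = append(links, m[1])
		}
	}
	return links
}

func main() {
	page := `<a href="https://github.com/octo">GH</a>
<a href="https://www.linkedin.com/in/someone">LI</a>
<a href="https://example.org/blog">Blog</a>`
	for _, link := range socialLinksSketch(page) {
		fmt.Println(link)
	}
}
```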

func Title

func Title(htmlContent string) string

Title extracts the title from HTML content.
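A minimal sketch of title extraction, assuming a single `<title>` element and regex matching rather than a full HTML parse. `titleSketch` is a hypothetical name; whether the real `Title` unescapes entities or collapses whitespace is not stated in this documentation.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// titlePattern captures the text between <title> and </title>, across lines.
var titlePattern = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)

// titleSketch is a hypothetical stand-in, NOT the package's Title.
// It unescapes HTML entities and collapses runs of whitespace.
func titleSketch(htmlContent string) string {
	m := titlePattern.FindStringSubmatch(htmlContent)
	if m == nil {
		return ""
	}
	return strings.Join(strings.Fields(html.UnescapeString(m[1])), " ")
}

func main() {
	page := "<html><head><title>\n  Ada &amp; Co \n</title></head></html>"
	fmt.Println(titleSketch(page)) // prints "Ada & Co"
}
```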

func ToMarkdown

func ToMarkdown(htmlContent string) string

ToMarkdown converts HTML content to markdown format.
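HTML-to-markdown conversion covers many elements; the sketch below handles only `h1`, `strong`/`b`, `a`, and `p` to show the general shape of such a conversion. It is regex-based and uses the hypothetical name `toMarkdownSketch`; the package's `ToMarkdown` most likely supports more elements and may produce different output.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

var (
	h1Re     = regexp.MustCompile(`(?is)<h1[^>]*>(.*?)</h1>`)
	strongRe = regexp.MustCompile(`(?is)<(?:strong|b)>(.*?)</(?:strong|b)>`)
	linkRe   = regexp.MustCompile(`(?is)<a\s+href="([^"]*)"[^>]*>(.*?)</a>`)
	paraRe   = regexp.MustCompile(`(?is)<p(?:\s[^>]*)?>(.*?)</p>`)
	tagRe    = regexp.MustCompile(`<[^>]+>`)
	blankRe  = regexp.MustCompile(`\n{3,}`)
)

// toMarkdownSketch is a hypothetical stand-in, NOT the package's ToMarkdown.
// It rewrites a handful of elements, strips any remaining tags, and
// normalizes blank lines.
func toMarkdownSketch(htmlContent string) string {
	s := h1Re.ReplaceAllString(htmlContent, "# $1\n\n")
	s = strongRe.ReplaceAllString(s, "**$1**")
	s = linkRe.ReplaceAllString(s, "[$2]($1)")
	s = paraRe.ReplaceAllString(s, "$1\n\n")
	s = tagRe.ReplaceAllString(s, "")
	s = blankRe.ReplaceAllString(s, "\n\n")
	return strings.TrimSpace(html.UnescapeString(s))
}

func main() {
	page := `<h1>Hello</h1><p>See <a href="https://example.org">the <b>docs</b></a>.</p>`
	fmt.Println(toMarkdownSketch(page))
}
```

Inline elements are rewritten before block elements so that emphasis nested inside a link or paragraph survives the later passes.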

Types

This section is empty.
