Documentation
Index ¶
- Constants
- func BaseURL(raw string) string
- func ContentTypeByExt(ext string) string
- func ContentTypeFromPath(path string) string
- func ExtensionByContentType(contentType string) string
- func GetHTML(ctx context.Context, url string, opts ...any) ([]byte, error)
- func GetImage(ctx context.Context, url string, opts ...any) ([]byte, string, error)
- func GetImageExtFromHeader(resp *http.Response) (string, error)
- func GetLinksInHead(ctx context.Context, url string, opts ...any) ([][]html.Attribute, error)
- func GetSchemeFromURL(rawURL string) string
- func GetTitle(ctx context.Context, url string, opts ...any) (string, error)
- func MinifyCSS(input string) string
- func NewClient(opts ...ClientOption) *http.Client
- type ClientOption
- type GetOptions
Constants ¶
const (
	LinuxUserAgent = "Mozilla/5.0 (X11; Linux x86_64)" +
		" AppleWebKit/537.36 (KHTML, like Gecko)" +
		" Chrome/146.0.0.0 Safari/537.36"

	// Limit for the size of the image to avoid excessive memory usage.
	LimitImageDefault int64 = 5 << 20 // 5 MB

	// Limit for the size of the HTML to avoid excessive memory usage.
	LimitHTMLDefault int64 = 512 << 10 // 512 KB

	// Limit for the size of the HTML for reading the TITLE to avoid excessive memory usage.
	LimitHTMLTitleDefault int64 = 1024 << 10 // 1024 KB
)
Variables ¶
This section is empty.
Functions ¶
func BaseURL ¶
BaseURL returns the base URL (scheme + host) from a full URL string. If parsing fails, it returns an empty string.
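The behaviour described above can be approximated with the standard library's net/url package. The following is only a sketch: baseURL is a hypothetical stand-in, not the package's implementation, and treating a URL without a scheme or host as a failure is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// baseURL is a hypothetical stand-in for BaseURL. It keeps only the
// scheme and host of raw and returns "" when raw cannot be parsed.
// Assumption: a URL lacking a scheme or host also yields "".
func baseURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return ""
	}
	return u.Scheme + "://" + u.Host
}

func main() {
	// Path, query and fragment are dropped; the port is part of the host.
	fmt.Println(baseURL("https://example.com:8080/a/b?x=1#frag"))
	// A URL with a missing scheme fails to parse.
	fmt.Printf("%q\n", baseURL("://bad"))
}
```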
func ContentTypeByExt ¶ added in v0.18.0
ContentTypeByExt returns the MIME content type for the given file extension. The extension may be provided with or without a leading dot. Unknown extensions fall back to "application/octet-stream".
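A lookup of this kind could be built on mime.TypeByExtension from the standard library. The sketch below uses a hypothetical helper, contentTypeByExt, to illustrate the optional leading dot and the fallback value; whether the package itself relies on the mime package is not stated in this documentation.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// fallbackType is returned for extensions with no known MIME type.
const fallbackType = "application/octet-stream"

// contentTypeByExt is a hypothetical stand-in for ContentTypeByExt.
// It accepts "png" as well as ".png".
func contentTypeByExt(ext string) string {
	if ext == "" {
		return fallbackType
	}
	if !strings.HasPrefix(ext, ".") {
		ext = "." + ext // mime.TypeByExtension expects a leading dot
	}
	if ct := mime.TypeByExtension(ext); ct != "" {
		return ct
	}
	return fallbackType
}

func main() {
	fmt.Println(contentTypeByExt("png"))
	fmt.Println(contentTypeByExt(".png"))
	fmt.Println(contentTypeByExt("definitely-unknown-ext"))
}
```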
func ContentTypeFromPath ¶ added in v0.18.0
ContentTypeFromPath returns the MIME content type based on the file's path.
func ExtensionByContentType ¶ added in v0.18.0
ExtensionByContentType returns the most common file extension for a given Content-Type. It strips any parameters (e.g. "; charset=utf-8") and returns the extension without the leading dot. A content type with no associated extension yields an empty string.
Examples:
"image/jpeg" → "jpg"
"image/jpeg; charset=utf-8" → "jpg"
"text/html; charset=utf-8" → "html"
"image/vnd.microsoft.icon" → "ico"
"application/octet-stream" → ""
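The mapping above can be reproduced with mime.ParseMediaType, which separates the media type from its parameters. This is a minimal sketch under stated assumptions: extensionByContentType and its lookup table are hypothetical, and the table holds only the entries listed in the examples.

```go
package main

import (
	"fmt"
	"mime"
)

// extByType is an illustrative table limited to the documented examples.
var extByType = map[string]string{
	"image/jpeg":               "jpg",
	"text/html":                "html",
	"image/vnd.microsoft.icon": "ico",
}

// extensionByContentType is a hypothetical stand-in for
// ExtensionByContentType. Parameters such as "; charset=utf-8" are
// discarded before the lookup; unknown types yield "".
func extensionByContentType(contentType string) string {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ""
	}
	return extByType[mediaType]
}

func main() {
	for _, ct := range []string{
		"image/jpeg",
		"image/jpeg; charset=utf-8",
		"text/html; charset=utf-8",
		"image/vnd.microsoft.icon",
		"application/octet-stream",
	} {
		fmt.Printf("%q -> %q\n", ct, extensionByContentType(ct))
	}
}
```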
func GetHTML ¶
GetHTML performs an HTTP GET request using net/http and returns the response body as a safe-to-use byte slice.
The function accepts optional parameters (opts) to configure the request. Currently supported option types are:
- GetOptions : general options for retrieval (e.g., Limit, AcceptLanguage). Only one GetOptions value takes effect; if several are passed, the last one overwrites the earlier ones.
- ClientOption : client-specific options (e.g., ClientOptionWithTimeout()). Multiple ClientOption values can be passed, and all of them are applied.
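One plausible way to handle a mixed opts ...any list is a type switch, sketched below. The names getOptions, clientOption, withTimeout and splitOpts are hypothetical mirrors of the documented types, and the package's real dispatch logic may differ. The sketch shows the two documented rules: the last GetOptions value wins, and every ClientOption is applied.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// getOptions mirrors two documented GetOptions fields (hypothetical copy).
type getOptions struct {
	Limit          int64
	AcceptLanguage string
}

// clientOption mirrors ClientOption: a function that modifies an http.Client.
type clientOption func(*http.Client)

// withTimeout is a hypothetical counterpart of ClientOptionWithTimeout.
func withTimeout(d time.Duration) clientOption {
	return func(c *http.Client) { c.Timeout = d }
}

// splitOpts sorts the variadic arguments by type. Values of any other
// type are silently ignored in this sketch (an assumption).
func splitOpts(opts ...any) (getOptions, *http.Client) {
	var g getOptions
	client := &http.Client{}
	for _, o := range opts {
		switch v := o.(type) {
		case getOptions:
			g = v // a later value replaces an earlier one
		case clientOption:
			v(client) // every client option is applied
		}
	}
	return g, client
}

func main() {
	g, c := splitOpts(
		getOptions{Limit: 1024},
		withTimeout(5*time.Second),
		getOptions{Limit: 2048, AcceptLanguage: "en-US,en;q=0.9"},
	)
	fmt.Println(g.Limit, g.AcceptLanguage, c.Timeout)
}
```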
func GetImage ¶
GetImage downloads an image from the given URL using net/http and returns a safe-to-use copy of the image bytes along with its format.
The function accepts optional parameters (opts) to configure the request. Currently supported option types are:
- GetOptions : general options for image retrieval (e.g., Limit, AcceptLanguage). Only one GetOptions value takes effect; if several are passed, the last one overwrites the earlier ones.
- ClientOption : client-specific options (e.g., ClientOptionWithTimeout()). Multiple ClientOption values can be passed, and all of them are applied.
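A size limit such as LimitImageDefault is commonly enforced with io.LimitReader. The sketch below reads one byte past the limit so that an oversized body can be distinguished from one that fits exactly. Whether the package truncates or returns an error in that case is not documented here, so the error path is an assumption, and readCapped, readCappedString and errTooLarge are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errTooLarge is a hypothetical error for bodies that exceed the limit.
var errTooLarge = errors.New("response body exceeds limit")

// readCapped reads at most limit bytes from r and fails if more data
// is available.
func readCapped(r io.Reader, limit int64) ([]byte, error) {
	// Allow one extra byte so len(data) > limit signals an overflow.
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, errTooLarge
	}
	return data, nil
}

// readCappedString is a convenience wrapper used for the demonstration.
func readCappedString(s string, limit int64) ([]byte, error) {
	return readCapped(strings.NewReader(s), limit)
}

func main() {
	b, err := readCappedString("hello", 5)
	fmt.Println(string(b), err)
	_, err = readCappedString("hello!", 5)
	fmt.Println(err)
}
```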
func GetImageExtFromHeader ¶
GetImageExtFromHeader returns the image format based on the Content-Type header of the given response.
func GetLinksInHead ¶
GetLinksInHead downloads the HTML from the given URL and extracts all <link> tags that are inside the <head> section. The parsing stops as soon as </head> or <body> is encountered, so the rest of the HTML is not read.
The function accepts optional parameters (opts) to configure the request. Currently supported option types are:
- GetOptions : general options for retrieval (e.g., Limit, AcceptLanguage). Only one GetOptions value takes effect; if several are passed, the last one overwrites the earlier ones.
- ClientOption : client-specific options (e.g., ClientOptionWithTimeout()). Multiple ClientOption instances can be passed and are all applied.
func GetSchemeFromURL ¶ added in v0.18.0
GetSchemeFromURL parses a URL and returns the scheme part of the URL.
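With net/url the scheme is available directly on the parsed URL. In the sketch below, schemeFromURL is a hypothetical stand-in, and returning an empty string on a parse failure is an assumption, since the documentation does not describe the failure case.

```go
package main

import (
	"fmt"
	"net/url"
)

// schemeFromURL is a hypothetical stand-in for GetSchemeFromURL.
// Assumption: a URL that cannot be parsed yields "".
func schemeFromURL(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	return u.Scheme // url.Parse lower-cases the scheme
}

func main() {
	fmt.Printf("%q\n", schemeFromURL("HTTPS://Example.com/x"))
	fmt.Printf("%q\n", schemeFromURL("/relative/path"))
}
```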
func MinifyCSS ¶ added in v0.18.0
MinifyCSS minifies the given CSS string and returns the minified version. If an error occurs during minification, it logs the error and returns the original input.
func NewClient ¶
func NewClient(opts ...ClientOption) *http.Client
NewClient creates a new http.Client and applies each of the given ClientOption functions to customize it. Any number of options, including none, may be passed.
Types ¶
type ClientOption ¶
ClientOption defines a function type that modifies an http.Client.
func ClientOptionWithCookieJar ¶
func ClientOptionWithCookieJar(opt *cookiejar.Options) ClientOption
ClientOptionWithCookieJar returns a ClientOption that sets a custom cookie jar for the client. The jar is configured with the cookiejar.Options passed in opt.
func ClientOptionWithDefaultCookieJar ¶
func ClientOptionWithDefaultCookieJar() ClientOption
ClientOptionWithDefaultCookieJar returns a ClientOption that sets a default (empty) cookie jar for the client.
func ClientOptionWithTimeout ¶
func ClientOptionWithTimeout(d time.Duration) ClientOption
ClientOptionWithTimeout returns a ClientOption that sets the client's timeout to d, the duration after which a request times out.
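NewClient and the ClientOption constructors follow the functional options pattern. The sketch below re-creates that pattern with hypothetical lower-case names (clientOption, newClient, withTimeout, withDefaultCookieJar) to show how such options compose. It is not the package's source, and silently skipping the jar on error is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"time"
)

// clientOption mirrors ClientOption: a function that modifies an http.Client.
type clientOption func(*http.Client)

// newClient builds a client and applies each option in the order given.
func newClient(opts ...clientOption) *http.Client {
	c := &http.Client{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// withTimeout sets the overall request timeout.
func withTimeout(d time.Duration) clientOption {
	return func(c *http.Client) { c.Timeout = d }
}

// withDefaultCookieJar attaches an empty in-memory cookie jar.
func withDefaultCookieJar() clientOption {
	return func(c *http.Client) {
		if jar, err := cookiejar.New(nil); err == nil {
			c.Jar = jar
		}
	}
}

func main() {
	c := newClient(withTimeout(10*time.Second), withDefaultCookieJar())
	fmt.Println(c.Timeout, c.Jar != nil)
}
```

Because each option is an ordinary function value, callers can combine them freely, and new options can be added without changing the constructor's signature.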
type GetOptions ¶
type GetOptions struct {
	// Limit specifies the maximum number of bytes to read from the response.
	Limit int64
	// AcceptLanguage sets the value for the "Accept-Language" HTTP header.
	// Example: "en-US,en;q=0.9"
	AcceptLanguage string
	// IgnoreStatusCode, when true, ignores HTTP 4xx/5xx status codes and
	// returns the response body anyway.
	IgnoreStatusCode bool
}
GetOptions defines optional parameters for the Get functions (GetHTML, GetImage, GetLinksInHead, GetTitle).
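To illustrate how the three fields might influence a request, the sketch below applies a hypothetical getOptions copy to a plain net/http GET against a local httptest server. The fetch and demo helpers, the 512 KB fallback (taken from LimitHTMLDefault) and the choice to truncate the body at Limit are assumptions made for this example. The package's actual behaviour may differ.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// getOptions is a hypothetical copy of the documented GetOptions fields.
type getOptions struct {
	Limit            int64
	AcceptLanguage   string
	IgnoreStatusCode bool
}

// fetch performs a GET request and applies the options to it.
func fetch(ctx context.Context, url string, o getOptions) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if o.AcceptLanguage != "" {
		req.Header.Set("Accept-Language", o.AcceptLanguage)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 400 && !o.IgnoreStatusCode {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	limit := o.Limit
	if limit <= 0 {
		limit = 512 << 10 // assumed fallback, same value as LimitHTMLDefault
	}
	// The body is truncated at limit bytes in this sketch.
	return io.ReadAll(io.LimitReader(resp.Body, limit))
}

// demo runs fetch against a local server that always answers 404 and
// echoes the Accept-Language header it received.
func demo(o getOptions) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNotFound)
		fmt.Fprint(w, r.Header.Get("Accept-Language"))
	}))
	defer srv.Close()
	body, err := fetch(context.Background(), srv.URL, o)
	return string(body), err
}

func main() {
	body, err := demo(getOptions{Limit: 5, AcceptLanguage: "en-US,en;q=0.9", IgnoreStatusCode: true})
	fmt.Printf("%q %v\n", body, err)
	_, err = demo(getOptions{})
	fmt.Println(err)
}
```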