htmltotext

package

v0.7.0 (latest) | Published: Feb 19, 2026 | License: MIT | Imports: 4 | Imported by: 0

Repository

github.com/WaylonWalker/markata-go

Documentation

Overview

Package htmltotext converts HTML content to plain text, preserving block-level structure and rendering hyperlinks as footnote-style references.

This package provides HTML-to-plain-text conversion that:

  - Decodes all HTML entities to their Unicode equivalents
  - Converts hyperlinks to footnote-style references (Lynx/Pandoc convention)
  - Strips all HTML tags while preserving meaningful whitespace
  - Preserves block-level structure (paragraphs, headings, lists)
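The entity-decoding and tag-stripping behavior listed above can be approximated with the standard library alone. The sketch below is not the package's implementation: `stripTags` and its regular expressions are hypothetical names chosen for illustration, and a regex-based approach only handles simple, well-formed markup. It shows one plausible ordering: turn block boundaries into blank lines, drop the remaining tags, then decode entities last so that an escaped `&lt;b&gt;` survives as literal text.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

var (
	// blockClose matches closing tags of a few common block-level elements.
	blockClose = regexp.MustCompile(`(?i)</(p|div|h[1-6]|li|ul|ol|blockquote)\s*>`)
	// lineBreak matches <br>, <br/>, and <br />.
	lineBreak = regexp.MustCompile(`(?i)<br\s*/?>`)
	// anyTag matches any remaining tag.
	anyTag = regexp.MustCompile(`<[^>]*>`)
	// blankRuns matches three or more consecutive newlines.
	blankRuns = regexp.MustCompile(`\n{3,}`)
)

// stripTags is a hypothetical helper, not part of htmltotext.
// It removes tags, keeps paragraph breaks, and decodes entities.
func stripTags(src string) string {
	s := lineBreak.ReplaceAllString(src, "\n")
	s = blockClose.ReplaceAllString(s, "\n\n") // block end becomes a blank line
	s = anyTag.ReplaceAllString(s, "")         // drop every other tag
	s = html.UnescapeString(s)                 // decode entities after stripping
	s = blankRuns.ReplaceAllString(s, "\n\n")  // collapse stacked block breaks
	return strings.TrimSpace(s)
}

func main() {
	fmt.Printf("%q\n", stripTags("<h1>Title</h1><p>Caf&eacute; &amp; bar</p>"))
	// prints "Title\n\nCafé & bar"
}
```

A production converter would more likely walk a parsed HTML tree than rely on regular expressions; the sketch only illustrates the order of operations.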

Link Formatting

Links are converted to footnote-style references following the Lynx/Pandoc convention. Each unique URL gets a sequential reference number:

Input:  <a href="https://go.dev">Go</a> is great. See <a href="https://go.dev/doc">docs</a>.
Output: Go [1] is great. See docs [2].

        References:
        [1]: https://go.dev
        [2]: https://go.dev/doc

When the link text matches the URL (bare links), no footnote is added:

Input:  Visit <a href="https://go.dev">https://go.dev</a>
Output: Visit https://go.dev

Duplicate URLs reuse the same reference number.
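The three rules above (sequential numbering, bare links left inline, duplicates sharing a number) can be sketched as follows. This is an illustration under stated assumptions, not the package's code: `footnoteLinks` and the `anchor` pattern are hypothetical, the pattern only recognizes anchors whose first attribute is a double-quoted `href`, and the layout of the references block is inferred from the examples in this documentation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// anchor matches a simple <a href="...">text</a> element.
// Assumption: href is the first attribute and is double-quoted.
var anchor = regexp.MustCompile(`(?is)<a\s+href="([^"]*)"[^>]*>(.*?)</a>`)

// footnoteLinks is a hypothetical helper, not part of htmltotext.
// It rewrites anchors as "text [n]" and appends a references block.
func footnoteLinks(src string) string {
	index := map[string]int{} // URL -> reference number
	var urls []string         // URLs in first-seen order

	body := anchor.ReplaceAllStringFunc(src, func(m string) string {
		parts := anchor.FindStringSubmatch(m)
		url, text := parts[1], strings.TrimSpace(parts[2])
		if text == url {
			return url // bare link: no footnote
		}
		n, seen := index[url]
		if !seen {
			urls = append(urls, url)
			n = len(urls)
			index[url] = n
		}
		return fmt.Sprintf("%s [%d]", text, n)
	})

	if len(urls) == 0 {
		return body
	}
	var b strings.Builder
	b.WriteString(body)
	b.WriteString("\n\nReferences:")
	for i, u := range urls {
		fmt.Fprintf(&b, "\n[%d]: %s", i+1, u)
	}
	return b.String()
}

func main() {
	fmt.Println(footnoteLinks(`<a href="https://go.dev">Go</a> and <a href="https://go.dev">Go again</a>`))
}
```

Keeping a map for lookup alongside a slice for ordering is what lets a repeated URL reuse its number while the references list stays in first-seen order.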

Usage

text := htmltotext.Convert("<p>Hello &amp; <a href=\"https://go.dev\">Go</a></p>")
// Returns: "Hello & Go [1]\n\nReferences:\n[1]: https://go.dev"

Index

func Convert(htmlContent string) string

Functions

func Convert

func Convert(htmlContent string) string

Convert transforms HTML content into plain text with footnote-style link references. It decodes HTML entities, strips tags while preserving block structure, and appends a references section for any hyperlinks found.

Links where the visible text matches the URL are rendered inline without a footnote reference. Duplicate URLs share the same reference number.
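The behaviors documented for Convert lend themselves to table-driven checks. The sketch below collects the two input/output pairs that this documentation states verbatim and runs them against any function with Convert's signature. The names `docCase`, `docCases`, and `checkDocumented` are hypothetical, and `main` uses an identity function as a stand-in so the example runs without the module; in real use you would pass `htmltotext.Convert` instead.

```go
package main

import "fmt"

// docCase pairs an input with the output stated in the package documentation.
type docCase struct {
	name, in, want string
}

// docCases holds only the examples whose exact output the documentation gives.
var docCases = []docCase{
	{
		name: "usage example",
		in:   `<p>Hello &amp; <a href="https://go.dev">Go</a></p>`,
		want: "Hello & Go [1]\n\nReferences:\n[1]: https://go.dev",
	},
	{
		name: "bare link",
		in:   `Visit <a href="https://go.dev">https://go.dev</a>`,
		want: "Visit https://go.dev",
	},
}

// checkDocumented runs convert against docCases and returns one
// message per mismatch. An empty result means every case passed.
func checkDocumented(convert func(string) string) []string {
	var failures []string
	for _, c := range docCases {
		if got := convert(c.in); got != c.want {
			failures = append(failures, fmt.Sprintf("%s: got %q, want %q", c.name, got, c.want))
		}
	}
	return failures
}

func main() {
	// identity stands in for htmltotext.Convert; it fails every case by design.
	identity := func(s string) string { return s }
	for _, f := range checkDocumented(identity) {
		fmt.Println(f)
	}
}
```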
