html2text

package module
v0.0.0-...-e894509 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 6, 2025 License: MIT Imports: 11 Imported by: 0

README

html2text

This is a custom fork of jaytaylor/html2text, tailored for use in feed2imap-go.

You are free to use it, but I make heavy use of force-pushes, so be warned.

Changes incorporated in this branch:

  • go.mod support (PR #42)
  • citation style links (PR #41)
  • Support for tablewriter v1.0.0 (PR #68)
  • Removed cmd/html2text
  • Renamed to Necoro/html2text to avoid problems downstream

Documentation

Overview

Example
inputHTML := `
<html>
	<head>
		<title>My Mega Service</title>
		<link rel="stylesheet" href="main.css">
		<style type="text/css">body { color: #fff; }</style>
	</head>

	<body>
		<div class="logo">
			<a href="http://jaytaylor.com/"><img src="/logo-image.jpg" alt="Mega Service"/></a>
		</div>

		<h1>Welcome to your new account on my service!</h1>

		<p>
			Here is some more information:

			<ul>
				<li>Link 1: <a href="https://example.com">Example.com</a></li>
				<li>Link 2: <a href="https://example2.com">Example2.com</a></li>
				<li>Something else</li>
			</ul>
		</p>

		<table>
			<thead>
				<tr><th>Header 1</th><th>Header 2</th></tr>
			</thead>
			<tfoot>
				<tr><td>Footer 1</td><td>Footer 2</td></tr>
			</tfoot>
			<tbody>
				<tr><td>Row 1 Col 1</td><td>Row 1 Col 2</td></tr>
				<tr><td>Row 2 Col 1</td><td>Row 2 Col 2</td></tr>
			</tbody>
		</table>
	</body>
</html>`

text, err := FromString(inputHTML, Options{PrettyTables: true})
if err != nil {
	panic(err)
}
fmt.Println(text)
Output:

Mega Service ( http://jaytaylor.com/ )

******************************************
Welcome to your new account on my service!
******************************************

Here is some more information:

* Link 1: Example.com ( https://example.com )
* Link 2: Example2.com ( https://example2.com )
* Something else

+-------------+-------------+
|  HEADER 1   |  HEADER 2   |
+-------------+-------------+
| Row 1 Col 1 | Row 1 Col 2 |
| Row 2 Col 1 | Row 2 Col 2 |
+-------------+-------------+
|  FOOTER 1   |  FOOTER 2   |
+-------------+-------------+
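The output above shows two plain-text conventions: the h1 is framed by asterisk lines as long as the heading, and each link is rendered as its text followed by the target in spaced parentheses. The following is only a sketch of those two conventions using the standard library. It is not the package's implementation, and `frameHeading` and `inlineLink` are hypothetical helper names introduced here for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// frameHeading is a hypothetical helper: it surrounds a heading with
// asterisk lines of the same length, as the <h1> appears in the output above.
func frameHeading(title string) string {
	bar := strings.Repeat("*", utf8.RuneCountInString(title))
	return bar + "\n" + title + "\n" + bar
}

// inlineLink is a hypothetical helper: it renders the link text followed by
// the target in spaced parentheses, as the <a> elements appear above.
func inlineLink(text, url string) string {
	return text + " ( " + url + " )"
}

func main() {
	fmt.Println(frameHeading("Welcome to your new account on my service!"))
	fmt.Println("* Link 1: " + inlineLink("Example.com", "https://example.com"))
}
```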


Constants

This section is empty.

Variables

This section is empty.

Functions

func FromHTMLNode

func FromHTMLNode(doc *html.Node, o ...Options) (string, error)

FromHTMLNode renders text output from a pre-parsed HTML document.

func FromReader

func FromReader(reader io.Reader, options ...Options) (string, error)

FromReader renders text output after parsing HTML for the specified io.Reader.

func FromString

func FromString(input string, options ...Options) (string, error)

FromString parses HTML from the input string, then renders the text form.
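All three functions declare their options as a variadic parameter (`...Options`), so callers may omit the argument entirely. The sketch below illustrates that calling pattern with local stand-ins only: `render` and the one-field `Options` type are hypothetical, and resolving the variadic slice by taking its first element is an assumption about how such a parameter is typically handled, not a statement about this package's internals.

```go
package main

import "fmt"

// Options is a local stand-in carrying one field of the documented struct.
type Options struct {
	OmitLinks bool
}

// render is a hypothetical stand-in for FromString. It only demonstrates
// how a variadic options parameter can be resolved to a single value.
func render(input string, options ...Options) string {
	var o Options // zero value: every toggle off
	if len(options) > 0 {
		o = options[0] // assumption: the first element wins
	}
	return fmt.Sprintf("input=%q omitLinks=%t", input, o.OmitLinks)
}

func main() {
	fmt.Println(render("<p>hi</p>"))                           // no options passed
	fmt.Println(render("<p>hi</p>", Options{OmitLinks: true})) // explicit options
}
```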

Types

type Border

type Border struct {
	Left, Right, Bottom, Top bool
}

Border controls tablewriter borders. It uses simple bools instead of tablewriter's `State`.

type BorderStyle

type BorderStyle struct {
	ColumnSeparator string
	RowSeparator    string
	CenterSeparator string
}

func (BorderStyle) BottomLeft

func (b BorderStyle) BottomLeft() string

func (BorderStyle) BottomMid

func (b BorderStyle) BottomMid() string

func (BorderStyle) BottomRight

func (b BorderStyle) BottomRight() string

func (BorderStyle) Center

func (b BorderStyle) Center() string

func (BorderStyle) Column

func (b BorderStyle) Column() string

func (BorderStyle) HeaderLeft

func (b BorderStyle) HeaderLeft() string

func (BorderStyle) HeaderMid

func (b BorderStyle) HeaderMid() string

func (BorderStyle) HeaderRight

func (b BorderStyle) HeaderRight() string

func (BorderStyle) MidLeft

func (b BorderStyle) MidLeft() string

func (BorderStyle) MidRight

func (b BorderStyle) MidRight() string

func (BorderStyle) Name

func (b BorderStyle) Name() string

func (BorderStyle) Row

func (b BorderStyle) Row() string

func (BorderStyle) TopLeft

func (b BorderStyle) TopLeft() string

func (BorderStyle) TopMid

func (b BorderStyle) TopMid() string

func (BorderStyle) TopRight

func (b BorderStyle) TopRight() string
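The methods above are undocumented, but their names suggest that BorderStyle acts as a symbol set from which a table renderer asks for each junction and line character. The sketch below shows how three separator strings could back such a method set. It uses local stand-ins only (`symbols`, `borderStyle` and `topRule` are hypothetical names), and the mapping of the corner to `CenterSeparator` is an assumption, not taken from the package source.

```go
package main

import (
	"fmt"
	"strings"
)

// symbols is a local stand-in listing a few of the methods BorderStyle exposes.
type symbols interface {
	Center() string
	Row() string
	Column() string
	TopLeft() string
}

// borderStyle mirrors the three fields of the documented BorderStyle struct.
type borderStyle struct {
	ColumnSeparator string
	RowSeparator    string
	CenterSeparator string
}

func (b borderStyle) Center() string { return b.CenterSeparator }
func (b borderStyle) Row() string    { return b.RowSeparator }
func (b borderStyle) Column() string { return b.ColumnSeparator }

// Assumption: corners reuse CenterSeparator like every other junction.
func (b borderStyle) TopLeft() string { return b.CenterSeparator }

// topRule draws a top border for the given column widths.
func topRule(s symbols, widths ...int) string {
	var out strings.Builder
	out.WriteString(s.TopLeft())
	for _, w := range widths {
		out.WriteString(strings.Repeat(s.Row(), w))
		out.WriteString(s.Center())
	}
	return out.String()
}

func main() {
	style := borderStyle{ColumnSeparator: "|", RowSeparator: "-", CenterSeparator: "+"}
	fmt.Println(topRule(style, 13, 13))
}
```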

type Options

type Options struct {
	PrettyTables        bool                 // Turns on pretty ASCII rendering for table elements.
	PrettyTablesOptions *PrettyTablesOptions // Configures pretty ASCII rendering for table elements.
	OmitLinks           bool                 // Turns on omitting links
	TextOnly            bool                 // Returns only plain text
	CitationStyleLinks  bool                 // Uses citation style links like [1]
}

Options provide toggles and overrides to control specific rendering behaviors.
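The `CitationStyleLinks` toggle is described only as producing links "like [1]". As a rough illustration of the general idea, the sketch below numbers each link in the running text and collects the targets for a reference list. The `citations` type, its methods, and the exact layout of markers and footer are hypothetical. The format this package actually emits is not shown in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// citations collects link targets and hands out 1-based reference numbers.
type citations struct {
	urls []string
}

// cite records the target and returns the link text followed by its marker.
func (c *citations) cite(text, url string) string {
	c.urls = append(c.urls, url)
	return fmt.Sprintf("%s [%d]", text, len(c.urls))
}

// footer lists the collected targets, one numbered entry per line.
func (c *citations) footer() string {
	var b strings.Builder
	for i, u := range c.urls {
		fmt.Fprintf(&b, "[%d] %s\n", i+1, u)
	}
	return b.String()
}

func main() {
	c := &citations{}
	fmt.Println("* Link 1: " + c.cite("Example.com", "https://example.com"))
	fmt.Println("* Link 2: " + c.cite("Example2.com", "https://example2.com"))
	fmt.Println()
	fmt.Print(c.footer())
}
```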

type PrettyTablesOptions

type PrettyTablesOptions struct {
	AutoFormatHeader bool
	AutoWrapText     bool
	// Deprecated. Tablewriter always assumes this to be `true`
	ReflowDuringAutoWrap bool
	ColWidth             int
	ColumnSeparator      string
	RowSeparator         string
	CenterSeparator      string
	HeaderAlignment      tw.Align
	FooterAlignment      tw.Align
	Alignment            tw.Align
	ColumnAlignment      tw.Alignment
	// Deprecated. Tablewriter always assumes this to be `\n`
	NewLine        string
	HeaderLine     bool
	RowLine        bool
	AutoMergeCells bool
	Borders        Border
	// Configuration allows direct manipulation of the `Table` using everything [tablewriter] offers.
	// When set, all other settings in this struct are ignored.
	Configuration func(table *tablewriter.Table)
}

PrettyTablesOptions overrides tablewriter behaviors.

func NewPrettyTablesOptions

func NewPrettyTablesOptions() *PrettyTablesOptions

NewPrettyTablesOptions creates PrettyTablesOptions with default settings.
