README

url2epub

Create ePub files from URLs

Overview

The root directory provides a Go library that creates ePub files from URLs, with one limitation: it currently only supports articles that have an AMP version.

The rmapi/ directory provides a Go library that implements the reMarkable API, so that the generated ePub files can be sent directly to a reMarkable paper tablet.

The tgbot/ directory provides a Go library that implements part of the Telegram Bot API, so that all of this can be done from a Telegram message.

The appengine/ directory provides the App Engine implementation of the Telegram bot that does all of this.

License

BSD 3-Clause.

Documentation

Overview

Package url2epub fetches http(s) URLs and creates ePub files from them.

Functions

func DrainAndClose

func DrainAndClose(r io.ReadCloser) error

DrainAndClose drains and closes r.

func Epub

func Epub(args EpubArgs) (id string, err error)

Epub creates an Epub 3.0 file from given content.

Types

type EpubArgs

type EpubArgs struct {
	// The destination to write the epub content to.
	Dest io.Writer

	// The title of the epub.
	Title string

	// The node pointing to the html tag.
	Node *html.Node

	// Images map:
	// key: image local filename
	// value: image content
	Images map[string]io.Reader
}

EpubArgs defines the args used by Epub function.

type GetHTMLArgs

type GetHTMLArgs struct {
	// The HTTP GET URL, required.
	URL string

	// The User-Agent header to use, optional.
	UserAgent string
}

GetHTMLArgs defines the arguments used by the GetHTML function.

type Node

type Node html.Node

Node is a typedef of html.Node with helper functions attached.

func FromNode

func FromNode(n *html.Node) *Node

FromNode casts *html.Node into *Node.

func GetHTML

func GetHTML(ctx context.Context, args GetHTMLArgs) (*Node, *url.URL, error)

GetHTML does an HTTP GET request for HTML content.

It's different from standard http.Get in the following ways:

- If redirects happen during the request, the returned URL is the URL of the last (final) request.

- Instead of returning *http.Response, it returns the parsed *html.Node, with Type being ElementNode and DataAtom being Html (instead of the root node, which is usually a DoctypeNode).

- The client used by GetHTML does not have a timeout set. The caller is expected to set a deadline on the ctx passed in.

func (Node) AsNode

func (n Node) AsNode() html.Node

AsNode casts n back to *html.Node.

func (*Node) FindFirstAtomNode

func (n *Node) FindFirstAtomNode(a atom.Atom) *Node

FindFirstAtomNode returns n itself or the first node among its descendants with Type == html.ElementNode and DataAtom == a, using depth-first search.

If neither n nor any of its descendants matches, nil is returned.
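To make the search order concrete, here is a small sketch of a depth-first lookup over a simplified tree. The node type, findFirst and sampleTree are hypothetical stand-ins that match on a tag string instead of an atom.Atom, so this illustrates the traversal only and is not the library's code.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for an HTML element node.
type node struct {
	tag      string
	id       string
	children []*node
}

// String describes a node for printing.
func (n *node) String() string {
	return fmt.Sprintf("<%s id=%q>", n.tag, n.id)
}

// findFirst returns n itself or the first descendant whose tag matches,
// visiting nodes depth first in document order. It returns nil if nothing matches.
func findFirst(n *node, tag string) *node {
	if n == nil {
		return nil
	}
	if n.tag == tag {
		return n
	}
	for _, child := range n.children {
		if found := findFirst(child, tag); found != nil {
			return found
		}
	}
	return nil
}

// sampleTree builds a document where a nested <p> precedes a shallower <p>.
func sampleTree() *node {
	return &node{tag: "html", id: "root", children: []*node{
		{tag: "head", id: "head", children: []*node{
			{tag: "title", id: "title"},
		}},
		{tag: "body", id: "body", children: []*node{
			{tag: "div", id: "wrapper", children: []*node{
				{tag: "p", id: "nested"},
			}},
			{tag: "p", id: "sibling"},
		}},
	}}
}

func main() {
	root := sampleTree()
	if found := findFirst(root, "p"); found != nil {
		fmt.Println(found)
	}
	fmt.Println(findFirst(root, "table") == nil)
}
```

Because the search is depth first, the paragraph nested inside the div is found before its shallower sibling, which a breadth-first search would have returned instead.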

func (Node) ForEachChild

func (n Node) ForEachChild(f func(child *Node) bool)

ForEachChild calls f on each of n's children.

If f returns false, ForEachChild stops the iteration.

func (*Node) GetAMPurl

func (n *Node) GetAMPurl() string

GetAMPurl returns the amp URL of the document, if any.

func (*Node) GetLang

func (n *Node) GetLang() string

GetLang returns the lang attribute of the html node, if any.

func (*Node) GetTitle

func (n *Node) GetTitle() string

GetTitle returns the title of the document, if any.

Note that if og:title exists in the meta header, it's preferred over title.

func (*Node) IsAMP

func (n *Node) IsAMP() bool

IsAMP returns true if n is the root of an AMP HTML document.

func (*Node) Readable

func (n *Node) Readable(ctx context.Context, args ReadableArgs) (*html.Node, map[string]io.Reader, error)

Readable strips node n into a readable one, with all images downloaded and replaced.

type ReadableArgs

type ReadableArgs struct {
	// Base URL of the document, used in case the image URLs are relative.
	BaseURL *url.URL

	// User-Agent to be used to download images.
	UserAgent string

	// Directory prefix for downloaded images.
	ImagesDir string

	// If Grayscale is set to true,
	// all images will be grayscaled and encoded as jpegs.
	//
	// If any error happened while trying to grayscale the image,
	// it will be logged via Logger.
	Grayscale bool
	Logger    logger.Logger
}

ReadableArgs defines the args used by Readable function.

Directories

Path            Synopsis
grayscale       Package grayscale provides a function to grayscale an image.
internal/set    Package set provides a simple set implementation.
logger          Package logger provides a simple log interface that you can wrap whatever logging library you use into.
rmapi           Package rmapi implements the reMarkable API, as described in https://github.com/splitbrain/ReMarkableAPI/wiki.
tgbot           Package tgbot provides some simple wrapping around the Telegram Bot API.
ziputil         Package ziputil provides some utility functions for zip archive handling.
appengine       (separate module)
debug           (separate module)