lolhtml

package module
v0.2.2
Published: Oct 27, 2020 License: BSD-3-Clause Imports: 6 Imported by: 0

README

go-lolhtml

Go bindings for the Rust library cloudflare/lol-html, the Low Output Latency streaming HTML rewriter/parser with CSS-selector based API, talking via cgo.

Status: All capabilities provided by the C API are implemented, except for custom user data in handlers. Test coverage is partial. The code is at an early stage, so the API is subject to change. If you have ideas on how the API could be better structured, feel free to open a PR or an issue.

Installation

For Linux/macOS/Windows x86_64 platforms, installation is as simple as a single go get:

$ go get github.com/coolspring8/go-lolhtml

There is no need to install Rust: lol-html is prebuilt into static libraries that ship in the /build folder, so cgo can link against them directly.

For other platforms, you'll have to compile it yourself.

Getting Started

Now let's initialize a project and create main.go:

package main

import (
	"bytes"
	"github.com/coolspring8/go-lolhtml"
	"io"
	"log"
	"os"
)

func main() {
	chunk := []byte("Hello, <span>World</span>!")
	r := bytes.NewReader(chunk)
	w, err := lolhtml.NewWriter(
		os.Stdout,
		&lolhtml.Handlers{
			ElementContentHandler: []lolhtml.ElementContentHandler{
				{
					Selector: "span",
					ElementHandler: func(e *lolhtml.Element) lolhtml.RewriterDirective {
						err := e.SetInnerContentAsText("LOL-HTML")
						if err != nil {
							log.Fatal(err)
						}
						return lolhtml.Continue
					},
				},
			},
		},
	)
	if err != nil {
		log.Fatal(err)
	}
	defer w.Free()

	_, err = io.Copy(w, r)
	if err != nil {
		log.Fatal(err)
	}
    
	err = w.End()
	if err != nil {
		log.Fatal(err)
	}
	// Output: Hello, <span>LOL-HTML</span>!
}

The above program takes the chunk Hello, <span>World</span>! as input, rewrites the text inside every span tag to "LOL-HTML", and prints the result to standard output.

The result is Hello, <span>LOL-HTML</span>!

For more examples, explore the /examples directory.

Documentation

Available at pkg.go.dev. (WIP)

License

BSD 3-Clause "New" or "Revised" License

Disclaimer

This is an unofficial binding.

Cloudflare is a registered trademark of Cloudflare, Inc. Cloudflare names used in this project are for identification purposes only. The project is not associated in any way with Cloudflare Inc.

Documentation

Overview

Package lolhtml is a binding for the Rust crate lol_html: https://github.com/cloudflare/lol-html

Variables

var ErrCannotGetErrorMessage = errors.New("cannot get error message from underlying lol-html lib")

Functions

func RewriteString added in v0.2.2

func RewriteString(s string, h *Handlers, config ...Config) (string, error)

RewriteString rewrites the given string with the provided Handlers and Config.

Example
output, err := lolhtml.RewriteString(
	`<div><a href="http://example.com"></a></div>`,
	&lolhtml.Handlers{
		ElementContentHandler: []lolhtml.ElementContentHandler{
			{
				Selector: "a[href]",
				ElementHandler: func(e *lolhtml.Element) lolhtml.RewriterDirective {
					href, err := e.AttributeValue("href")
					if err != nil {
						log.Fatal(err)
					}
					href = strings.ReplaceAll(href, "http:", "https:")

					err = e.SetAttribute("href", href)
					if err != nil {
						log.Fatal(err)
					}

					return lolhtml.Continue
				},
			},
		},
	},
)
if err != nil {
	log.Fatal(err)
}

fmt.Println(output)
Output:

<div><a href="https://example.com"></a></div>

Types

type Attribute

type Attribute C.lol_html_attribute_t

func (*Attribute) Name added in v0.2.1

func (a *Attribute) Name() string

func (*Attribute) Value added in v0.2.1

func (a *Attribute) Value() string

type AttributeIterator

type AttributeIterator C.lol_html_attributes_iterator_t

AttributeIterator cannot be iterated with the "range" syntax. Call AttributeIterator.Next() repeatedly instead.

func (*AttributeIterator) Free

func (ai *AttributeIterator) Free()

func (*AttributeIterator) Next

func (ai *AttributeIterator) Next() *Attribute
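
Since "range" is not available, iteration comes down to calling Next() in a loop. The sketch below shows that loop shape only. The Attribute and AttributeIterator types in it are local stand-ins that mirror the documented method signatures so the code compiles without cgo, and the assumption that Next() returns nil once the attributes are exhausted is not stated on this page. The helper collectAttributes is hypothetical.

```go
package main

import "fmt"

// Attribute is a local stand-in for lolhtml.Attribute (not the cgo-backed type).
type Attribute struct{ name, value string }

func (a *Attribute) Name() string  { return a.name }
func (a *Attribute) Value() string { return a.value }

// AttributeIterator is a local stand-in for lolhtml.AttributeIterator.
type AttributeIterator struct {
	attrs []Attribute
	pos   int
}

// Next returns the next attribute, or nil when none are left.
// ASSUMPTION: the real iterator signals exhaustion the same way.
func (ai *AttributeIterator) Next() *Attribute {
	if ai.pos >= len(ai.attrs) {
		return nil
	}
	a := &ai.attrs[ai.pos]
	ai.pos++
	return a
}

// Free is a no-op in this stand-in; the documented method presumably
// releases the underlying iterator.
func (ai *AttributeIterator) Free() {}

// collectAttributes (hypothetical helper) drains the iterator into a
// name -> value map and frees the iterator when done.
func collectAttributes(ai *AttributeIterator) map[string]string {
	defer ai.Free()
	out := make(map[string]string)
	for a := ai.Next(); a != nil; a = ai.Next() {
		out[a.Name()] = a.Value()
	}
	return out
}

func main() {
	ai := &AttributeIterator{attrs: []Attribute{
		{"href", "http://example.com"},
		{"rel", "nofollow"},
	}}
	fmt.Println(collectAttributes(ai))
}
```

With the real package, the iterator would come from Element.AttributeIterator() inside an ElementHandler, and the same for loop would apply if the nil-at-end assumption holds.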

type Comment

type Comment C.lol_html_comment_t

func (*Comment) InsertAfterAsHtml

func (c *Comment) InsertAfterAsHtml(content string) error

func (*Comment) InsertAfterAsText added in v0.2.2

func (c *Comment) InsertAfterAsText(content string) error

func (*Comment) InsertBeforeAsHtml

func (c *Comment) InsertBeforeAsHtml(content string) error

func (*Comment) InsertBeforeAsText added in v0.2.2

func (c *Comment) InsertBeforeAsText(content string) error

func (*Comment) IsRemoved

func (c *Comment) IsRemoved() bool

func (*Comment) Remove

func (c *Comment) Remove()

func (*Comment) ReplaceAsHtml

func (c *Comment) ReplaceAsHtml(content string) error

func (*Comment) ReplaceAsText added in v0.2.2

func (c *Comment) ReplaceAsText(content string) error

func (*Comment) SetText

func (c *Comment) SetText(text string) error

func (*Comment) Text added in v0.2.1

func (c *Comment) Text() string

type CommentHandlerFunc added in v0.2.1

type CommentHandlerFunc func(*Comment) RewriterDirective

type Config

type Config struct {
	// defaults to "utf-8".
	Encoding string
	// defaults to PreallocatedParsingBufferSize: 1024, MaxAllowedMemoryUsage: 1<<63 - 1.
	Memory *MemorySettings
	// defaults to func([]byte) {}. In other words, totally discard output.
	Sink OutputSink
	// defaults to true. If true, bail out for security reasons when ambiguous.
	Strict bool
}

Config defines settings for the rewriter.
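
The field comments above list defaults, and both NewWriter and RewriteString accept Config as an optional variadic argument. The following is a minimal sketch of how such defaults could be resolved. It is not taken from the library source: the types are local copies of the documented declarations, and defaultConfig and resolveConfig are hypothetical helpers. It mainly illustrates one consequence of Go's zero values: a bool cannot be "unset", so a caller who passes a Config without setting Strict hands over Strict == false.

```go
package main

import "fmt"

// Local copies of the documented declarations, so the sketch compiles without cgo.
type MemorySettings struct {
	PreallocatedParsingBufferSize int
	MaxAllowedMemoryUsage         int
}

type OutputSink func([]byte)

type Config struct {
	Encoding string
	Memory   *MemorySettings
	Sink     OutputSink
	Strict   bool
}

// defaultConfig (hypothetical) returns the defaults listed in the field comments.
func defaultConfig() Config {
	return Config{
		Encoding: "utf-8",
		Memory: &MemorySettings{
			PreallocatedParsingBufferSize: 1024,
			MaxAllowedMemoryUsage:         1<<63 - 1, // assumes a 64-bit int
		},
		Sink:   func([]byte) {}, // discards all output
		Strict: true,
	}
}

// resolveConfig (hypothetical) returns the defaults when no Config is passed.
// Otherwise it takes the first Config and fills in fields left at their zero value.
func resolveConfig(config ...Config) Config {
	c := defaultConfig()
	if len(config) == 0 {
		return c
	}
	user := config[0]
	if user.Encoding != "" {
		c.Encoding = user.Encoding
	}
	if user.Memory != nil {
		c.Memory = user.Memory
	}
	if user.Sink != nil {
		c.Sink = user.Sink
	}
	// false and "not set" are indistinguishable for a bool, so Strict is taken as given.
	c.Strict = user.Strict
	return c
}

func main() {
	fmt.Println(resolveConfig().Strict)                          // true
	fmt.Println(resolveConfig(Config{Encoding: "utf-8"}).Strict) // false
}
```

Whether the real package fills in unset fields this way is not documented here, so when passing a Config it is safest to set every field explicitly, Strict in particular.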

type Doctype

type Doctype C.lol_html_doctype_t

func (*Doctype) Name added in v0.2.1

func (d *Doctype) Name() string

func (*Doctype) PublicId added in v0.2.1

func (d *Doctype) PublicId() string

func (*Doctype) SystemId added in v0.2.1

func (d *Doctype) SystemId() string

type DoctypeHandlerFunc added in v0.2.1

type DoctypeHandlerFunc func(*Doctype) RewriterDirective

type DocumentContentHandler added in v0.2.1

type DocumentContentHandler struct {
	DoctypeHandler     DoctypeHandlerFunc
	CommentHandler     CommentHandlerFunc
	TextChunkHandler   TextChunkHandlerFunc
	DocumentEndHandler DocumentEndHandlerFunc
}

type DocumentEnd added in v0.2.1

type DocumentEnd C.lol_html_doc_end_t

func (*DocumentEnd) AppendAsHtml added in v0.2.1

func (d *DocumentEnd) AppendAsHtml(content string) error

func (*DocumentEnd) AppendAsText added in v0.2.2

func (d *DocumentEnd) AppendAsText(content string) error

type DocumentEndHandlerFunc added in v0.2.1

type DocumentEndHandlerFunc func(*DocumentEnd) RewriterDirective

type Element

type Element C.lol_html_element_t

func (*Element) AttributeIterator added in v0.2.1

func (e *Element) AttributeIterator() *AttributeIterator

func (*Element) AttributeValue added in v0.2.1

func (e *Element) AttributeValue(name string) (string, error)

func (*Element) HasAttribute

func (e *Element) HasAttribute(name string) (bool, error)

func (*Element) InsertAfterEndTagAsHtml

func (e *Element) InsertAfterEndTagAsHtml(content string) error

func (*Element) InsertAfterEndTagAsText added in v0.2.2

func (e *Element) InsertAfterEndTagAsText(content string) error

func (*Element) InsertAfterStartTagAsHtml

func (e *Element) InsertAfterStartTagAsHtml(content string) error

func (*Element) InsertAfterStartTagAsText added in v0.2.2

func (e *Element) InsertAfterStartTagAsText(content string) error

func (*Element) InsertBeforeEndTagAsHtml

func (e *Element) InsertBeforeEndTagAsHtml(content string) error

func (*Element) InsertBeforeEndTagAsText added in v0.2.2

func (e *Element) InsertBeforeEndTagAsText(content string) error

func (*Element) InsertBeforeStartTagAsHtml

func (e *Element) InsertBeforeStartTagAsHtml(content string) error

func (*Element) InsertBeforeStartTagAsText added in v0.2.2

func (e *Element) InsertBeforeStartTagAsText(content string) error

func (*Element) IsRemoved

func (e *Element) IsRemoved() bool

func (*Element) NamespaceUri added in v0.2.1

func (e *Element) NamespaceUri() string

func (*Element) Remove

func (e *Element) Remove()

func (*Element) RemoveAndKeepContent

func (e *Element) RemoveAndKeepContent()

func (*Element) RemoveAttribute

func (e *Element) RemoveAttribute(name string) error

func (*Element) ReplaceAsHtml

func (e *Element) ReplaceAsHtml(content string) error

func (*Element) ReplaceAsText added in v0.2.2

func (e *Element) ReplaceAsText(content string) error

func (*Element) SetAttribute

func (e *Element) SetAttribute(name string, value string) error

func (*Element) SetInnerContentAsHtml

func (e *Element) SetInnerContentAsHtml(content string) error

func (*Element) SetInnerContentAsText added in v0.2.2

func (e *Element) SetInnerContentAsText(content string) error

func (*Element) SetTagName

func (e *Element) SetTagName(name string) error

func (*Element) TagName added in v0.2.1

func (e *Element) TagName() string

type ElementContentHandler added in v0.2.1

type ElementContentHandler struct {
	Selector         string
	ElementHandler   ElementHandlerFunc
	CommentHandler   CommentHandlerFunc
	TextChunkHandler TextChunkHandlerFunc
}

type ElementHandlerFunc added in v0.2.1

type ElementHandlerFunc func(*Element) RewriterDirective

type Handlers added in v0.2.1

type Handlers struct {
	DocumentContentHandler []DocumentContentHandler
	ElementContentHandler  []ElementContentHandler
}

type MemorySettings

type MemorySettings struct {
	PreallocatedParsingBufferSize int
	MaxAllowedMemoryUsage         int
}

type OutputSink

type OutputSink func([]byte)

OutputSink takes each chunked output as a byte slice.
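
Because the sink is invoked once per output chunk, a common need is to reassemble the chunks into one buffer. The sketch below declares OutputSink locally with the same signature as documented, so it runs without cgo. collectInto and drain are hypothetical helper names. It copies each chunk on arrival, on the assumption (not confirmed by this page) that the slice passed to the sink should not be retained after the call returns.

```go
package main

import (
	"bytes"
	"fmt"
)

// OutputSink has the same signature as the documented lolhtml.OutputSink.
type OutputSink func([]byte)

// collectInto (hypothetical helper) returns a sink that appends every chunk to buf.
// bytes.Buffer.Write copies the bytes, so the chunk is not retained.
func collectInto(buf *bytes.Buffer) OutputSink {
	return func(chunk []byte) {
		buf.Write(chunk)
	}
}

// drain (hypothetical helper) pushes the given chunks through a collecting
// sink and returns the reassembled output.
func drain(chunks ...string) string {
	var buf bytes.Buffer
	sink := collectInto(&buf)
	for _, c := range chunks {
		sink([]byte(c))
	}
	return buf.String()
}

func main() {
	// Simulates a rewriter emitting its output in three chunks.
	fmt.Println(drain("Hello, ", "<span>LOL-HTML</span>", "!"))
	// prints: Hello, <span>LOL-HTML</span>!
}
```

In real use, the value returned by collectInto would be assigned to Config.Sink, keeping in mind that the field comment in Config says the default sink discards output.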

type RewriterDirective

type RewriterDirective int

RewriterDirective should be returned by callback handlers to tell the rewriter whether to continue or stop parsing.

const (
	// Let the normal parsing process continue.
	Continue RewriterDirective = iota

	// Stop the rewriter immediately. Content currently buffered is discarded, and an error is returned.
	Stop
)

type TextChunk

type TextChunk C.lol_html_text_chunk_t

func (*TextChunk) Content added in v0.2.1

func (t *TextChunk) Content() string

func (*TextChunk) InsertAfterAsHtml

func (t *TextChunk) InsertAfterAsHtml(content string) error

func (*TextChunk) InsertAfterAsText added in v0.2.2

func (t *TextChunk) InsertAfterAsText(content string) error

func (*TextChunk) InsertBeforeAsHtml

func (t *TextChunk) InsertBeforeAsHtml(content string) error

func (*TextChunk) InsertBeforeAsText added in v0.2.2

func (t *TextChunk) InsertBeforeAsText(content string) error

func (*TextChunk) IsLastInTextNode

func (t *TextChunk) IsLastInTextNode() bool

func (*TextChunk) IsRemoved

func (t *TextChunk) IsRemoved() bool

func (*TextChunk) Remove

func (t *TextChunk) Remove()

func (*TextChunk) ReplaceAsHtml

func (t *TextChunk) ReplaceAsHtml(content string) error

func (*TextChunk) ReplaceAsText added in v0.2.2

func (t *TextChunk) ReplaceAsText(content string) error

type TextChunkHandlerFunc added in v0.2.1

type TextChunkHandlerFunc func(*TextChunk) RewriterDirective

type Writer added in v0.2.1

type Writer struct {
	// contains filtered or unexported fields
}

func NewWriter added in v0.2.1

func NewWriter(w io.Writer, handlers *Handlers, config ...Config) (*Writer, error)

NewWriter returns a new Writer with Handlers and Config configured, writing to w.

Example
chunk := []byte("Hello, <span>World</span>!")
r := bytes.NewReader(chunk)
w, err := lolhtml.NewWriter(
	os.Stdout,
	&lolhtml.Handlers{
		ElementContentHandler: []lolhtml.ElementContentHandler{
			{
				Selector: "span",
				ElementHandler: func(e *lolhtml.Element) lolhtml.RewriterDirective {
					err := e.SetInnerContentAsText("LOL-HTML")
					if err != nil {
						log.Fatal(err)
					}
					return lolhtml.Continue
				},
			},
		},
	},
)
if err != nil {
	log.Fatal(err)
}
defer w.Free()

_, err = io.Copy(w, r)
if err != nil {
	log.Fatal(err)
}

err = w.End()
if err != nil {
	log.Fatal(err)
}
Output:

Hello, <span>LOL-HTML</span>!

func (*Writer) End added in v0.2.1

func (w *Writer) End() error

func (*Writer) Free added in v0.2.1

func (w *Writer) Free()

func (Writer) Write added in v0.2.1

func (w Writer) Write(p []byte) (n int, err error)

func (Writer) WriteString added in v0.2.1

func (w Writer) WriteString(s string) (n int, err error)

Directories

examples/defer-scripts (command)
