textproc

Version: v2.1.0
Published: Aug 20, 2020 | License: MIT | Imports: 6 | Imported by: 0

README

Text processing

For example: convert line endings to LF, remove trailing white space, sort paragraphs.

Go library and command.

https://pkg.go.dev/github.com/MihaiB/textproc/v2

go install ./textproc
`go env GOPATH`/bin/textproc -help

Documentation

Overview

Package textproc provides text processing.

Variables

var ErrInvalidUTF8 = errors.New("Invalid UTF-8")

ErrInvalidUTF8 is the error returned when the input is not valid UTF-8.

Functions

func ConvertLineTerminatorsToLF

func ConvertLineTerminatorsToLF(in <-chan rune) <-chan rune

ConvertLineTerminatorsToLF converts "\r" and "\r\n" to "\n".
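
The conversion can be sketched with the standard library alone. The names toLF and run below are hypothetical, and this is only one way to get the documented behavior, not the package's own code:

```go
package main

import (
	"fmt"
	"strings"
)

// toLF is a hypothetical stand-in: it rewrites "\r\n" and lone "\r"
// as "\n" while streaming runes.
func toLF(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		prevCR := false
		for r := range in {
			if r == '\n' && prevCR {
				// Second half of "\r\n": the "\r" already produced "\n".
				prevCR = false
				continue
			}
			prevCR = r == '\r'
			if prevCR {
				r = '\n'
			}
			out <- r
		}
	}()
	return out
}

// run feeds s through toLF and collects the result.
func run(s string) string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var b strings.Builder
	for r := range toLF(in) {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", run("a\r\nb\rc\n")) // prints "a\nb\nc\n"
}
```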

func EmitLFLineContent (added in v2.1.0)

func EmitLFLineContent(in <-chan rune) <-chan []rune

EmitLFLineContent emits the content of each line (excluding the line terminator "\n") as a token.
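
A minimal sketch of such a line tokenizer, using only the standard library. The names emitLines and tokens are hypothetical. Emitting a final line that lacks a terminator only when it is non-empty is an assumption; the documentation does not say how that case is handled:

```go
package main

import "fmt"

// emitLines is a hypothetical stand-in: it sends the content of each
// "\n"-terminated line as one token.
func emitLines(in <-chan rune) <-chan []rune {
	out := make(chan []rune)
	go func() {
		defer close(out)
		var line []rune
		for r := range in {
			if r == '\n' {
				out <- line
				line = nil
				continue
			}
			line = append(line, r)
		}
		// Assumption: an unterminated last line is emitted if non-empty.
		if len(line) > 0 {
			out <- line
		}
	}()
	return out
}

// tokens feeds s through emitLines and returns the tokens as strings.
func tokens(s string) []string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var got []string
	for tok := range emitLines(in) {
		got = append(got, string(tok))
	}
	return got
}

func main() {
	fmt.Printf("%q\n", tokens("a\n\nbc")) // prints ["a" "" "bc"]
}
```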

func EmitLFParagraphContent (added in v2.1.0)

func EmitLFParagraphContent(in <-chan rune) <-chan []rune

EmitLFParagraphContent emits the content of each paragraph (excluding the line terminator of the paragraph's last line) as a token.

A paragraph consists of adjacent non-empty lines. Lines are terminated by "\n".
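
The paragraph rule above can be sketched as follows. The names emitParagraphs and tokens are hypothetical, and treating an unterminated last line as part of its paragraph is an assumption:

```go
package main

import "fmt"

// emitParagraphs is a hypothetical stand-in: it groups adjacent non-empty
// lines and sends each group, joined by "\n", as one token.
func emitParagraphs(in <-chan rune) <-chan []rune {
	out := make(chan []rune)
	go func() {
		defer close(out)
		var par, line []rune
		endLine := func() {
			if len(line) == 0 {
				// An empty line closes the current paragraph, if any.
				if len(par) > 0 {
					out <- par
					par = nil
				}
				return
			}
			if len(par) > 0 {
				par = append(par, '\n')
			}
			par = append(par, line...)
			line = nil
		}
		for r := range in {
			if r == '\n' {
				endLine()
				continue
			}
			line = append(line, r)
		}
		endLine() // assumption: keep an unterminated last line
		if len(par) > 0 {
			out <- par
		}
	}()
	return out
}

// tokens feeds s through emitParagraphs and returns the tokens as strings.
func tokens(s string) []string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var got []string
	for tok := range emitParagraphs(in) {
		got = append(got, string(tok))
	}
	return got
}

func main() {
	fmt.Printf("%q\n", tokens("a\nb\n\n\nc\n")) // prints ["a\nb" "c"]
}
```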

func EnsureFinalLFIfNonEmpty

func EnsureFinalLFIfNonEmpty(in <-chan rune) <-chan rune

EnsureFinalLFIfNonEmpty ensures non-empty content ends with "\n".
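
As a rough illustration, the rule needs only the last rune seen. The names ensureFinalLF and run are hypothetical, and this is a sketch rather than the package's implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// ensureFinalLF is a hypothetical stand-in: it passes runes through and
// appends "\n" if the input was non-empty and did not end with "\n".
func ensureFinalLF(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		last := '\n' // so that empty input produces empty output
		for r := range in {
			out <- r
			last = r
		}
		if last != '\n' {
			out <- '\n'
		}
	}()
	return out
}

// run feeds s through ensureFinalLF and collects the result.
func run(s string) string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var b strings.Builder
	for r := range ensureFinalLF(in) {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q %q %q\n", run(""), run("a"), run("a\n")) // prints "" "a\n" "a\n"
}
```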

func Read

func Read(r io.Reader) (<-chan rune, <-chan error)

Read returns two channels. All runes read from r as UTF-8 are sent on the rune channel, which is then closed. After that, the error from r is sent on the error channel, which is then closed.
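
The ordering of sends and closes can be sketched with bufio. The names read, readString and errInvalidUTF8 are hypothetical. Reporting a clean end of input as a nil error is an assumption, as is stopping at the first invalid byte:

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
	"unicode/utf8"
)

// errInvalidUTF8 mirrors the role of ErrInvalidUTF8 in this sketch.
var errInvalidUTF8 = errors.New("invalid UTF-8")

// read is a hypothetical stand-in: runes first, then close the rune
// channel, then one error value, then close the error channel.
func read(r io.Reader) (<-chan rune, <-chan error) {
	runes := make(chan rune)
	errs := make(chan error, 1)
	go func() {
		defer close(errs)
		br := bufio.NewReader(r)
		for {
			c, size, err := br.ReadRune()
			if err == nil && c == utf8.RuneError && size == 1 {
				err = errInvalidUTF8 // ReadRune flags a bad byte this way
			}
			if err != nil {
				close(runes)
				if err == io.EOF {
					err = nil // assumption: clean end of input is not an error
				}
				errs <- err
				return
			}
			runes <- c
		}
	}()
	return runes, errs
}

// readString drains both channels for the input s.
func readString(s string) (string, error) {
	runes, errs := read(strings.NewReader(s))
	var b strings.Builder
	for c := range runes {
		b.WriteRune(c)
	}
	return b.String(), <-errs
}

func main() {
	fmt.Println(readString("héllo"))
	fmt.Println(readString("a\xffb"))
}
```

The rune channel has to be drained before the error is received, since the error is only sent after the rune channel is closed.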

func SortLFLinesI

func SortLFLinesI(in <-chan rune) <-chan rune

SortLFLinesI reads the content of all lines excluding the line terminator "\n", sorts that content in case-insensitive order and adds "\n" after each item.
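
A sketch of the described behavior, assuming that "case-insensitive" can be approximated by comparing lower-cased strings; the package's exact comparison is not documented here. The names sortLinesI and run are hypothetical:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortLinesI is a hypothetical stand-in: it collects all lines, sorts them
// ignoring case, and writes each one followed by "\n".
func sortLinesI(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		var lines []string
		var b strings.Builder
		for r := range in {
			if r == '\n' {
				lines = append(lines, b.String())
				b.Reset()
				continue
			}
			b.WriteRune(r)
		}
		if b.Len() > 0 {
			lines = append(lines, b.String()) // unterminated last line
		}
		// Assumption: lower-casing is a good enough case-insensitive key.
		sort.SliceStable(lines, func(i, j int) bool {
			return strings.ToLower(lines[i]) < strings.ToLower(lines[j])
		})
		for _, l := range lines {
			for _, r := range l {
				out <- r
			}
			out <- '\n'
		}
	}()
	return out
}

// run feeds s through sortLinesI and collects the result.
func run(s string) string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var b strings.Builder
	for r := range sortLinesI(in) {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", run("banana\nApple\ncherry")) // prints "Apple\nbanana\ncherry\n"
}
```

Sorting needs the whole input, so unlike the streaming processors this one cannot emit anything until its input channel is closed.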

func SortLFParagraphsI

func SortLFParagraphsI(in <-chan rune) <-chan rune

SortLFParagraphsI reads the content of all paragraphs excluding the line terminator of a paragraph's last line, sorts that content in case-insensitive order, joins the items with "\n\n" and adds "\n" after the last item.

A paragraph consists of adjacent non-empty lines. Lines are terminated by "\n".
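
The same idea applied to paragraphs might look like this. The names sortParagraphsI and run are hypothetical. Lower-cased comparison and producing no output for input without paragraphs are assumptions:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortParagraphsI is a hypothetical stand-in: it splits the input into
// paragraphs, sorts them ignoring case, and joins them with "\n\n".
func sortParagraphsI(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		var b strings.Builder
		for r := range in {
			b.WriteRune(r)
		}
		var pars, cur []string
		flush := func() {
			if len(cur) > 0 {
				pars = append(pars, strings.Join(cur, "\n"))
				cur = nil
			}
		}
		for _, line := range strings.Split(b.String(), "\n") {
			if line == "" {
				flush() // an empty line ends the paragraph
				continue
			}
			cur = append(cur, line)
		}
		flush()
		if len(pars) == 0 {
			return // assumption: no paragraphs, no output
		}
		sort.SliceStable(pars, func(i, j int) bool {
			return strings.ToLower(pars[i]) < strings.ToLower(pars[j])
		})
		for _, r := range strings.Join(pars, "\n\n") + "\n" {
			out <- r
		}
	}()
	return out
}

// run feeds s through sortParagraphsI and collects the result.
func run(s string) string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var b strings.Builder
	for r := range sortParagraphsI(in) {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", run("b\nx\n\n\nA\n")) // prints "A\n\nb\nx\n"
}
```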

func TrimLFTrailingWhiteSpace

func TrimLFTrailingWhiteSpace(in <-chan rune) <-chan rune

TrimLFTrailingWhiteSpace removes white space at the end of lines. Lines are terminated by "\n".
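
One streaming approach is to hold back white space until it is known whether a non-space rune follows on the same line. In this sketch the names trimTrailing and run are hypothetical, and using unicode.IsSpace to decide what counts as white space is an assumption:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimTrailing is a hypothetical stand-in: it drops white space that is
// followed only by "\n" or by the end of the input.
func trimTrailing(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		var pending []rune // white space not yet known to be trailing
		for r := range in {
			switch {
			case r == '\n':
				pending = pending[:0] // it was trailing: drop it
				out <- r
			case unicode.IsSpace(r): // assumption: Unicode white space
				pending = append(pending, r)
			default:
				for _, p := range pending {
					out <- p
				}
				pending = pending[:0]
				out <- r
			}
		}
		// White space pending at the end of input is dropped as well.
	}()
	return out
}

// run feeds s through trimTrailing and collects the result.
func run(s string) string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var b strings.Builder
	for r := range trimTrailing(in) {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", run("a \t\n  b c  \nd ")) // prints "a\n  b c\nd"
}
```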

func TrimLeadingEmptyLFLines

func TrimLeadingEmptyLFLines(in <-chan rune) <-chan rune

TrimLeadingEmptyLFLines removes empty lines at the start of the input. Lines are terminated by "\n".
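
Because an empty line at the start of the input is just a leading "\n", a sketch of this step only has to skip "\n" runes until the first other rune arrives. The names trimLeading and run are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// trimLeading is a hypothetical stand-in: it skips every "\n" that comes
// before the first other rune and passes the rest through unchanged.
func trimLeading(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		skipping := true
		for r := range in {
			if skipping && r == '\n' {
				continue
			}
			skipping = false
			out <- r
		}
	}()
	return out
}

// run feeds s through trimLeading and collects the result.
func run(s string) string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var b strings.Builder
	for r := range trimLeading(in) {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", run("\n\n a\n\nb\n")) // prints " a\n\nb\n"
}
```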

func TrimTrailingEmptyLFLines

func TrimTrailingEmptyLFLines(in <-chan rune) <-chan rune

TrimTrailingEmptyLFLines removes empty lines at the end of the input. Lines are terminated by "\n".
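
Trailing empty lines are only known to be trailing once the input ends, so a streaming sketch has to count them and hold them back. The names trimTrailingEmpty and run are hypothetical; this illustrates the described behavior and is not the package's code:

```go
package main

import (
	"fmt"
	"strings"
)

// trimTrailingEmpty is a hypothetical stand-in: it withholds empty lines
// and releases them only if a non-empty line follows.
func trimTrailingEmpty(in <-chan rune) <-chan rune {
	out := make(chan rune)
	go func() {
		defer close(out)
		atLineStart := true
		pending := 0 // empty lines seen since the last non-empty line
		for r := range in {
			if r == '\n' {
				if atLineStart {
					pending++ // an empty line: hold it back
				} else {
					out <- r // terminator of a non-empty line
					atLineStart = true
				}
				continue
			}
			for ; pending > 0; pending-- {
				out <- '\n'
			}
			out <- r
			atLineStart = false
		}
		// Empty lines still pending here were trailing and are dropped.
	}()
	return out
}

// run feeds s through trimTrailingEmpty and collects the result.
func run(s string) string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var b strings.Builder
	for r := range trimTrailingEmpty(in) {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", run("a\n\n\nb\n\n\n")) // prints "a\n\n\nb\n"
}
```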

Types

type Processor

type Processor = func(<-chan rune) <-chan rune

A Processor processes runes.
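
Since every Processor has the same signature, processors compose by feeding one's output channel into the next. The sketch below shows that pattern with a locally defined alias of the same shape. The names processor, mapRunes, chain, tabToSpace, upper and demo are all hypothetical and not part of the package:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// processor has the same shape as the package's Processor type.
type processor = func(<-chan rune) <-chan rune

// mapRunes builds a processor that applies f to every rune.
func mapRunes(f func(rune) rune) processor {
	return func(in <-chan rune) <-chan rune {
		out := make(chan rune)
		go func() {
			defer close(out)
			for r := range in {
				out <- f(r)
			}
		}()
		return out
	}
}

// chain combines processors so that each one reads the previous output.
func chain(procs ...processor) processor {
	return func(in <-chan rune) <-chan rune {
		for _, p := range procs {
			in = p(in)
		}
		return in
	}
}

func tabToSpace(r rune) rune {
	if r == '\t' {
		return ' '
	}
	return r
}

func upper(r rune) rune { return unicode.ToUpper(r) }

// demo runs s through a two-stage pipeline and collects the result.
func demo(s string) string {
	in := make(chan rune)
	go func() {
		defer close(in)
		for _, r := range s {
			in <- r
		}
	}()
	var b strings.Builder
	for r := range chain(mapRunes(tabToSpace), mapRunes(upper))(in) {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", demo("a\tb")) // prints "A B"
}
```

The package's own functions, such as ConvertLineTerminatorsToLF and TrimLFTrailingWhiteSpace, have this signature and could be passed to a helper like chain in the same way.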

type Tokenizer (added in v2.1.0)

type Tokenizer = func(<-chan rune) <-chan []rune

A Tokenizer emits tokens.

Directories

Path Synopsis
Textproc processes text.