text

package
v0.1.0
Published: May 28, 2021 License: GPL-3.0 Imports: 9 Imported by: 0

Documentation

Constants

const (
	ReadDirectionForward = ReadDirection(iota)
	ReadDirectionBackward
)

Variables

This section is empty.

Functions

func Repeat

func Repeat(c rune, n int) string

Repeat creates a string with the same character repeated n times.
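
The behavior described above can be approximated with the standard library. This is a minimal sketch, not the package's implementation: repeatRune is a hypothetical name, and returning an empty string for n <= 0 is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// repeatRune is a hypothetical stand-in for Repeat. It returns a string
// made of the rune c repeated n times.
func repeatRune(c rune, n int) string {
	if n <= 0 {
		// Assumption: non-positive counts produce an empty string.
		// strings.Repeat would panic on a negative count.
		return ""
	}
	return strings.Repeat(string(c), n)
}

func main() {
	fmt.Println(repeatRune('-', 10))
	fmt.Println(repeatRune('é', 3))
}
```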

func Reverse

func Reverse(s string) string

Reverse reverses the bytes of a string. The result may not be valid UTF-8.

func SearchAllInString

func SearchAllInString(q string, text string) []uint64

func SearchNextInReader

func SearchNextInReader(q string, r io.Reader) (bool, uint64, error)

func ToggleRuneCase

func ToggleRuneCase(r rune) rune

ToggleRuneCase changes the case of the rune from lower-to-upper or vice-versa.
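
A case toggle of this kind can be sketched with the unicode package. toggleCase is a hypothetical name, and leaving caseless runes unchanged is an assumption about the behavior.

```go
package main

import (
	"fmt"
	"unicode"
)

// toggleCase flips upper-case runes to lower case and vice versa.
// Runes without case (digits, punctuation) are returned unchanged,
// which is an assumption for this sketch.
func toggleCase(r rune) rune {
	if unicode.IsUpper(r) {
		return unicode.ToLower(r)
	}
	if unicode.IsLower(r) {
		return unicode.ToUpper(r)
	}
	return r
}

func main() {
	for _, r := range "Go 1.16" {
		fmt.Printf("%c", toggleCase(r))
	}
	fmt.Println()
}
```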

Types

type CloneableReader

type CloneableReader interface {
	io.Reader

	// Clone creates a new, independent reader at the same position as the original reader.
	Clone() CloneableReader
}

CloneableReader is an io.Reader that can be cloned to produce a new, independent reader at the same position.

func NewCloneableReaderFromString

func NewCloneableReaderFromString(s string) CloneableReader

func NewSingleByteReader

func NewSingleByteReader(s string) CloneableReader

type CloneableRuneIter

type CloneableRuneIter interface {
	RuneIter

	// Clone returns a new, independent iterator at the same position as the original iterator.
	Clone() CloneableRuneIter
}

CloneableRuneIter is a RuneIter that can be cloned to produce a new, independent iterator at the same position as the original iterator.

func NewCloneableBackwardRuneIter

func NewCloneableBackwardRuneIter(in CloneableReader) CloneableRuneIter

NewCloneableBackwardRuneIter creates a CloneableRuneIter that interprets the reader output in reverse order.

func NewCloneableForwardRuneIter

func NewCloneableForwardRuneIter(in CloneableReader) CloneableRuneIter

NewCloneableForwardRuneIter creates a CloneableRuneIter for a stream of UTF-8 bytes. It assumes the provided reader produces a stream of valid UTF-8 bytes.

func NewRuneIterForSlice

func NewRuneIterForSlice(runeSlice []rune) CloneableRuneIter

NewRuneIterForSlice returns a RuneIter over the given slice. This assumes that runeSlice will be immutable for the lifetime of the iterator.

type ReadDirection

type ReadDirection int

func (ReadDirection) Reverse

func (d ReadDirection) Reverse() ReadDirection

func (ReadDirection) String

func (d ReadDirection) String() string
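
Reverse and String are undocumented here. The sketch below shows one plausible reading, assuming Reverse returns the opposite direction and String returns a human-readable name. The lower-case names and the exact strings are hypothetical.

```go
package main

import "fmt"

// readDirection mirrors the shape of the ReadDirection type.
type readDirection int

const (
	forward = readDirection(iota)
	backward
)

// reverse returns the opposite direction (assumed behavior).
func (d readDirection) reverse() readDirection {
	if d == forward {
		return backward
	}
	return forward
}

// String returns a readable name; the exact text is an assumption.
func (d readDirection) String() string {
	if d == forward {
		return "forward"
	}
	return "backward"
}

func main() {
	d := forward
	fmt.Println(d, d.reverse())
}
```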

type RuneIter

type RuneIter interface {
	// NextRune returns the next available rune.  If no rune is available, it returns the error io.EOF.
	NextRune() (rune, error)
}

RuneIter iterates over Unicode code points (runes) decoded from UTF-8 text.

type Searcher

type Searcher struct {
	// contains filtered or unexported fields
}

Searcher searches for an exact match of a query. It uses the Knuth-Morris-Pratt algorithm, which runs in O(n+m) time, where n is the length of the text and m is the length of the query.

func NewSearcher

func NewSearcher(query string) *Searcher

func (*Searcher) AllInString

func (s *Searcher) AllInString(text string, matchPositions []uint64) []uint64

AllInString finds all (possibly overlapping) matches of the query in a string. It returns the rune positions for the start of each match. If not nil, the matchPositions slice will be used to store the results (avoids allocating a new slice for each call).
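
For readers unfamiliar with Knuth-Morris-Pratt, here is a compact sketch of the technique that reports overlapping matches as rune positions. It is an illustration only: buildPrefix and allMatches are hypothetical names, and returning no matches for an empty query is an assumption.

```go
package main

import "fmt"

// buildPrefix computes the KMP failure table: prefix[i] is the length of
// the longest proper prefix of q[:i+1] that is also a suffix of it.
func buildPrefix(q []rune) []int {
	prefix := make([]int, len(q))
	k := 0
	for i := 1; i < len(q); i++ {
		for k > 0 && q[i] != q[k] {
			k = prefix[k-1]
		}
		if q[i] == q[k] {
			k++
		}
		prefix[i] = k
	}
	return prefix
}

// allMatches returns the rune position of the start of every match,
// including overlapping ones.
func allMatches(query, text string) []uint64 {
	q := []rune(query)
	if len(q) == 0 {
		return nil // assumption: an empty query matches nothing
	}
	prefix := buildPrefix(q)
	var matches []uint64
	k := 0   // number of query runes matched so far
	pos := 0 // rune position in text (not byte offset)
	for _, r := range text {
		for k > 0 && r != q[k] {
			k = prefix[k-1]
		}
		if r == q[k] {
			k++
		}
		if k == len(q) {
			matches = append(matches, uint64(pos+1-len(q)))
			// Fall back so that overlapping matches are still found.
			k = prefix[k-1]
		}
		pos++
	}
	return matches
}

func main() {
	fmt.Println(allMatches("aa", "aaaa"))
	fmt.Println(allMatches("a", "ééa"))
}
```

The text is scanned once and never rewound, which is what makes the same approach usable on a stream such as an io.Reader.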

func (*Searcher) NextInReader

func (s *Searcher) NextInReader(r io.Reader) (bool, uint64, error)

NextInReader finds the next occurrence of a query in the text produced by an io.Reader. If it finds a match, it returns the offset (in rune positions) from the start of the reader.

type SingleByteReader

type SingleByteReader struct {
	// contains filtered or unexported fields
}

SingleByteReader is a CloneableReader that produces a single byte at a time.

func (*SingleByteReader) Clone

func (r *SingleByteReader) Clone() CloneableReader

func (*SingleByteReader) Read

func (r *SingleByteReader) Read(p []byte) (n int, err error)

type Tree

type Tree struct {
	// contains filtered or unexported fields
}

text.Tree is a data structure for representing UTF-8 text. It supports efficient insertions, deletions, and lookup by character offset and line number. It is inspired by two papers:

Boehm, H. J., Atkinson, R., & Plass, M. (1995). Ropes: an alternative to strings. Software: Practice and Experience, 25(12), 1315-1330.

Rao, J., & Ross, K. A. (2000, May). Making B+-trees cache conscious in main memory. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data (pp. 475-486).

Like a rope, the tree maintains character counts at each level to efficiently locate a character at a given offset.

To use the CPU cache efficiently, all children of a node are pre-allocated in a group (what the Rao & Ross paper calls a "full" cache-sensitive B+ tree), and the parent uses offsets within the node group to identify child nodes. All nodes are carefully designed to fit as much data as possible within a 64-byte cache line.
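
The role of cached character counts can be shown with a deliberately simplified, single-level sketch: whole chunks are skipped using their counts, and only the chunk containing the target is scanned. The leaf type and runeAt function are hypothetical. The real tree is multi-level and uses node groups, which this sketch does not attempt to model.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// leaf holds a chunk of UTF-8 text plus a cached count of its runes.
type leaf struct {
	text     string
	numChars uint64
}

func newLeaf(s string) leaf {
	return leaf{text: s, numChars: uint64(utf8.RuneCountInString(s))}
}

// runeAt finds the rune at charPos. It skips whole leaves by subtracting
// their cached counts and decodes only the leaf that holds the target.
func runeAt(leaves []leaf, charPos uint64) (rune, bool) {
	for _, l := range leaves {
		if charPos >= l.numChars {
			charPos -= l.numChars
			continue
		}
		for _, r := range l.text {
			if charPos == 0 {
				return r, true
			}
			charPos--
		}
	}
	return 0, false // charPos is past the end of the text
}

func main() {
	leaves := []leaf{newLeaf("héllo "), newLeaf("wörld")}
	if r, ok := runeAt(leaves, 7); ok {
		fmt.Printf("%c\n", r)
	}
}
```

With counts stored at every level of a tree rather than in a flat list, the same skip-by-count step is applied from the root down, so the lookup cost depends on the tree height instead of the number of chunks.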

func NewTree

func NewTree() *Tree

NewTree returns a tree representing an empty string.

func NewTreeFromReader

func NewTreeFromReader(r io.Reader) (*Tree, error)

NewTreeFromReader creates a new Tree from a reader that produces UTF-8 text. This is more efficient than inserting the bytes into an empty tree. Returns an error if the bytes are invalid UTF-8.

func NewTreeFromString

func NewTreeFromString(s string) (*Tree, error)

NewTreeFromString creates a new Tree from a UTF-8 string.

func (*Tree) DeleteAtPosition

func (t *Tree) DeleteAtPosition(charPos uint64) (bool, rune)

DeleteAtPosition removes the UTF-8 character at the specified position (0-indexed). If charPos is past the end of the text, this has no effect.

func (*Tree) InsertAtPosition

func (t *Tree) InsertAtPosition(charPos uint64, c rune) error

InsertAtPosition inserts a UTF-8 character at the specified position (0-indexed). If charPos is past the end of the text, it will be appended at the end. Returns an error if c is not a valid UTF-8 character.
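
The boundary rules for DeleteAtPosition and InsertAtPosition can be modeled on a plain rune slice. This sketch is not how the tree works internally; insertAt and deleteAt are hypothetical helpers, and reading the (bool, rune) result as "whether a character was deleted, and which one" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// insertAt inserts c at charPos, appending if charPos is past the end.
func insertAt(text []rune, charPos uint64, c rune) ([]rune, error) {
	if !utf8.ValidRune(c) {
		return text, errors.New("invalid UTF-8 character")
	}
	if charPos > uint64(len(text)) {
		charPos = uint64(len(text))
	}
	text = append(text, 0)
	copy(text[charPos+1:], text[charPos:])
	text[charPos] = c
	return text, nil
}

// deleteAt removes the rune at charPos. Past the end, it does nothing.
// The bool and rune results follow the assumed meaning described above.
func deleteAt(text []rune, charPos uint64) ([]rune, bool, rune) {
	if charPos >= uint64(len(text)) {
		return text, false, 0
	}
	removed := text[charPos]
	return append(text[:charPos], text[charPos+1:]...), true, removed
}

func main() {
	text := []rune("ac")
	text, _ = insertAt(text, 1, 'b')
	text, _ = insertAt(text, 99, '!') // past the end: appended
	text, ok, removed := deleteAt(text, 0)
	fmt.Println(string(text), ok, string(removed))
}
```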

func (*Tree) LineNumForPosition

func (t *Tree) LineNumForPosition(charPos uint64) uint64

LineNumForPosition returns the line number (0-indexed) for the line containing the specified position.

func (*Tree) LineStartPosition

func (t *Tree) LineStartPosition(lineNum uint64) uint64

LineStartPosition returns the position of the first character at the specified line (0-indexed). If the line number is greater than the maximum line number, returns one past the position of the last character.
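
The relationship between positions and line numbers can be illustrated with a linear scan over a rune slice. The sketch assumes lines are separated by '\n' and that a newline belongs to the line it terminates; neither detail is stated in this documentation. The function names are hypothetical, and the tree answers the same questions without scanning the whole text.

```go
package main

import "fmt"

// lineNumForPosition counts the newlines that occur before charPos.
func lineNumForPosition(text []rune, charPos uint64) uint64 {
	var line uint64
	for i := uint64(0); i < charPos && i < uint64(len(text)); i++ {
		if text[i] == '\n' {
			line++
		}
	}
	return line
}

// lineStartPosition returns the position just after the lineNum-th
// newline, or one past the last character if there is no such line.
func lineStartPosition(text []rune, lineNum uint64) uint64 {
	if lineNum == 0 {
		return 0
	}
	var seen uint64
	for i, r := range text {
		if r == '\n' {
			seen++
			if seen == lineNum {
				return uint64(i + 1)
			}
		}
	}
	return uint64(len(text))
}

func main() {
	text := []rune("ab\ncd\nef")
	fmt.Println(lineNumForPosition(text, 4))
	fmt.Println(lineStartPosition(text, 2))
	fmt.Println(lineStartPosition(text, 5))
}
```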

func (*Tree) NumChars

func (t *Tree) NumChars() uint64

NumChars returns the total number of characters (runes) in the tree.

func (*Tree) NumLines

func (t *Tree) NumLines() uint64

NumLines returns the total number of lines in the tree.

func (*Tree) ReaderAtPosition

func (t *Tree) ReaderAtPosition(charPos uint64, direction ReadDirection) *TreeReader

ReaderAtPosition returns a reader starting at the UTF-8 character at the specified position (0-indexed). If the position is past the end of the text, the returned reader will read zero bytes.

func (*Tree) String

func (t *Tree) String() string

String returns the text in the tree as a string.

type TreeReader

type TreeReader struct {
	// contains filtered or unexported fields
}

TreeReader reads UTF-8 bytes from a text.Tree. It implements io.Reader and CloneableReader. text.Tree is NOT thread-safe, so reading from a tree while modifying it is undefined behavior!

func (*TreeReader) Clone

func (r *TreeReader) Clone() CloneableReader

Clone implements CloneableReader#Clone

func (*TreeReader) Read

func (r *TreeReader) Read(b []byte) (int, error)

Read implements io.Reader#Read

func (*TreeReader) SeekBackward

func (r *TreeReader) SeekBackward(offset uint64) error

SeekBackward implements parser.InputReader#SeekBackward
