text

package

v0.2.0 (latest) Published: Aug 21, 2021 License: GPL-3.0 Imports: 8 Imported by: 0

Repository

github.com/aretext/aretext

Documentation ¶

Index ¶

Constants
Variables
func Repeat(c rune, n int) string
func Reverse(s string) string
func SearchAllInString(q string, text string) []uint64
func SearchNextInReader(q string, r io.Reader) (bool, uint64, error)
func ToggleRuneCase(r rune) rune
type CloneableReader
- func NewCloneableReaderFromString(s string) CloneableReader
- func NewSingleByteReader(s string) CloneableReader
type CloneableRuneIter
- func NewCloneableBackwardRuneIter(in CloneableReader) CloneableRuneIter
- func NewCloneableForwardRuneIter(in CloneableReader) CloneableRuneIter
- func NewRuneIterForSlice(runeSlice []rune) CloneableRuneIter
type ReadDirection
- func (d ReadDirection) Reverse() ReadDirection
- func (d ReadDirection) String() string
type RuneIter
type Searcher
- func NewSearcher(query string) *Searcher
- func (s *Searcher) AllInString(text string, matchPositions []uint64) []uint64
- func (s *Searcher) Limit(offset uint64) *Searcher
- func (s *Searcher) NextInReader(r io.Reader) (bool, uint64, error)
type SingleByteReader
- func (r *SingleByteReader) Clone() CloneableReader
- func (r *SingleByteReader) Read(p []byte) (n int, err error)
type Tree
- func NewTree() *Tree
- func NewTreeFromReader(r io.Reader) (*Tree, error)
- func NewTreeFromString(s string) (*Tree, error)
- func (t *Tree) DeleteAtPosition(charPos uint64) (bool, rune)
- func (t *Tree) InsertAtPosition(charPos uint64, c rune) error
- func (t *Tree) LineNumForPosition(charPos uint64) uint64
- func (t *Tree) LineStartPosition(lineNum uint64) uint64
- func (t *Tree) NumChars() uint64
- func (t *Tree) NumLines() uint64
- func (t *Tree) ReaderAtPosition(charPos uint64, direction ReadDirection) *TreeReader
- func (t *Tree) String() string
type TreeReader
- func (r *TreeReader) Clone() CloneableReader
- func (r *TreeReader) Read(b []byte) (int, error)
- func (r *TreeReader) SeekBackward(offset uint64) error

Constants ¶

const (
	ReadDirectionForward = ReadDirection(iota)
	ReadDirectionBackward
)

Variables ¶

var (
	InvalidUtf8Error = errors.New("invalid UTF-8")
)

Functions ¶

func Repeat ¶

func Repeat(c rune, n int) string

Repeat creates a string with the same character repeated n times.

func Reverse ¶

func Reverse(s string) string

Reverse reverses the bytes of a string. The result may not be valid UTF-8.
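
The following is a minimal sketch of why a byte-wise reversal can produce invalid UTF-8. It uses only the standard library; `reverseBytes` is a hypothetical stand-in written for illustration, not the package's implementation.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseBytes is a hypothetical stand-in for Reverse: it reverses the
// bytes (not the runes) of s.
func reverseBytes(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

func main() {
	fmt.Println(reverseBytes("abc")) // prints "cba"

	// "é" is encoded as two bytes (0xC3 0xA9). Swapping them puts the
	// continuation byte first, which is not a valid UTF-8 sequence.
	fmt.Println(utf8.ValidString(reverseBytes("é"))) // prints "false"
}
```

Reversing twice restores the original bytes, so a byte-reversed string is still useful as an intermediate value even when it is not valid text on its own.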

func SearchAllInString ¶

func SearchAllInString(q string, text string) []uint64

func SearchNextInReader ¶

func SearchNextInReader(q string, r io.Reader) (bool, uint64, error)

func ToggleRuneCase ¶

func ToggleRuneCase(r rune) rune

ToggleRuneCase changes the case of the rune from lower-to-upper or vice-versa.
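
One way to get this behavior with the standard `unicode` package is sketched below. `toggleCase` is a hypothetical name, and leaving caseless runes unchanged is an assumption, since the description above does not say what happens to them.

```go
package main

import (
	"fmt"
	"unicode"
)

// toggleCase is a hypothetical stand-in for ToggleRuneCase.
// Assumption: runes that are neither upper nor lower case are returned as-is.
func toggleCase(r rune) rune {
	switch {
	case unicode.IsUpper(r):
		return unicode.ToLower(r)
	case unicode.IsLower(r):
		return unicode.ToUpper(r)
	default:
		return r
	}
}

func main() {
	// prints "A z 1"
	fmt.Println(string(toggleCase('a')), string(toggleCase('Z')), string(toggleCase('1')))
}
```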

Types ¶

type CloneableReader ¶

type CloneableReader interface {
	io.Reader

	// Clone creates a new, independent reader at the same position as the original reader.
	Clone() CloneableReader
}

CloneableReader is an io.Reader that can be cloned to produce a new, independent reader at the same position.
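
To illustrate what "independent reader at the same position" means, here is a small sketch of a string-backed implementation. The interface is copied from the declaration above; `stringReader` and `readN` are hypothetical names introduced for this example and are not part of the package.

```go
package main

import (
	"fmt"
	"io"
)

// CloneableReader mirrors the interface declared above.
type CloneableReader interface {
	io.Reader
	Clone() CloneableReader
}

// stringReader is a hypothetical implementation backed by an immutable string.
type stringReader struct {
	s   string
	pos int // byte offset of the next unread byte
}

func (r *stringReader) Read(p []byte) (int, error) {
	if r.pos >= len(r.s) {
		return 0, io.EOF
	}
	n := copy(p, r.s[r.pos:])
	r.pos += n
	return n, nil
}

// Clone copies the position, so the clone and the original advance separately.
func (r *stringReader) Clone() CloneableReader {
	return &stringReader{s: r.s, pos: r.pos}
}

// readN reads up to n bytes from r and returns them as a string.
func readN(r io.Reader, n int) string {
	buf := make([]byte, n)
	k, _ := io.ReadFull(r, buf)
	return string(buf[:k])
}

func main() {
	orig := &stringReader{s: "hello world"}
	fmt.Println(readN(orig, 6)) // consumes "hello "

	clone := orig.Clone()
	fmt.Println(readN(orig, 5))  // prints "world"
	fmt.Println(readN(clone, 5)) // prints "world" again: the clone kept its own position
}
```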

func NewCloneableReaderFromString ¶

func NewCloneableReaderFromString(s string) CloneableReader

func NewSingleByteReader ¶

func NewSingleByteReader(s string) CloneableReader

type CloneableRuneIter ¶

type CloneableRuneIter interface {
	RuneIter

	// Clone returns a new, independent iterator at the same position as the original iterator.
	Clone() CloneableRuneIter
}

CloneableRuneIter is a RuneIter that can be cloned to produce a new, independent iterator at the same position as the original iterator.

func NewCloneableBackwardRuneIter ¶

func NewCloneableBackwardRuneIter(in CloneableReader) CloneableRuneIter

NewCloneableBackwardRuneIter creates a CloneableRuneIter for a stream of UTF-8 bytes in reverse order.
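
Decoding runes from bytes that arrive in reverse order can be sketched as follows: buffer bytes until a UTF-8 start byte appears, then decode the buffered bytes in their original order. This is an illustration under the assumption that the underlying text is valid UTF-8; `nextRuneFromReversed` and `decodeReversed` are hypothetical helpers, not the package's code.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"unicode/utf8"
)

// nextRuneFromReversed reads one rune from a reader that yields the bytes of
// valid UTF-8 text in reverse order. Continuation bytes arrive first, so they
// are stored from the back of the buffer until the start byte is seen.
func nextRuneFromReversed(r io.ByteReader) (rune, error) {
	var buf [utf8.UTFMax]byte
	i := len(buf)
	for i > 0 {
		b, err := r.ReadByte()
		if err != nil {
			return 0, err
		}
		i--
		buf[i] = b
		if utf8.RuneStart(b) {
			c, _ := utf8.DecodeRune(buf[i:])
			return c, nil
		}
	}
	// More than UTFMax bytes without a start byte: the input was not valid UTF-8.
	return utf8.RuneError, nil
}

// decodeReversed collects every rune produced from the reversed byte stream.
func decodeReversed(reversed []byte) (string, error) {
	r := bytes.NewReader(reversed)
	var out []rune
	for {
		c, err := nextRuneFromReversed(r)
		if err == io.EOF {
			return string(out), nil
		}
		if err != nil {
			return "", err
		}
		out = append(out, c)
	}
}

func main() {
	text := []byte("aé世")
	reversed := make([]byte, len(text))
	for i, b := range text {
		reversed[len(text)-1-i] = b
	}
	s, err := decodeReversed(reversed)
	fmt.Println(s, err) // prints "世éa <nil>"
}
```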

func NewCloneableForwardRuneIter ¶

func NewCloneableForwardRuneIter(in CloneableReader) CloneableRuneIter

NewCloneableForwardRuneIter creates a CloneableRuneIter for a stream of UTF-8 bytes. It assumes the provided reader produces a stream of valid UTF-8 bytes.

func NewRuneIterForSlice ¶

func NewRuneIterForSlice(runeSlice []rune) CloneableRuneIter

NewRuneIterForSlice returns a RuneIter over the given slice. This assumes that runeSlice will be immutable for the lifetime of the iterator.
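
The immutability assumption follows naturally if clones share the backing slice rather than copying it. The sketch below shows that idea; `sliceRuneIter` is a hypothetical type written for this example, and the package's actual implementation may differ.

```go
package main

import (
	"fmt"
	"io"
)

// sliceRuneIter is a hypothetical iterator over a rune slice.
type sliceRuneIter struct {
	runes []rune // shared with clones, so it must not be mutated
	pos   int
}

// NextRune returns the next rune, or io.EOF when the slice is exhausted.
func (it *sliceRuneIter) NextRune() (rune, error) {
	if it.pos >= len(it.runes) {
		return 0, io.EOF
	}
	c := it.runes[it.pos]
	it.pos++
	return c, nil
}

// Clone copies only the position; the slice itself is shared.
func (it *sliceRuneIter) Clone() *sliceRuneIter {
	return &sliceRuneIter{runes: it.runes, pos: it.pos}
}

func main() {
	it := &sliceRuneIter{runes: []rune("abc")}
	it.NextRune() // consume 'a'

	clone := it.Clone()
	c1, _ := it.NextRune()
	c2, _ := clone.NextRune()
	fmt.Println(string(c1), string(c2)) // prints "b b"
}
```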

type ReadDirection ¶

type ReadDirection int

func (ReadDirection) Reverse ¶

func (d ReadDirection) Reverse() ReadDirection

func (ReadDirection) String ¶

func (d ReadDirection) String() string

type RuneIter ¶

type RuneIter interface {
	// NextRune returns the next available rune.  If no rune is available, it returns the error io.EOF.
	NextRune() (rune, error)
}

RuneIter iterates over unicode codepoints (runes).

type Searcher ¶

type Searcher struct {
	// contains filtered or unexported fields
}

Searcher searches for an exact match of a query. It uses the Knuth-Morris-Pratt algorithm, which runs in O(n+m) time, where n is the length of the text and m is the length of the query.

func NewSearcher ¶

func NewSearcher(query string) *Searcher

func (*Searcher) AllInString ¶

func (s *Searcher) AllInString(text string, matchPositions []uint64) []uint64

AllInString finds all (possibly overlapping) matches of the query in a string. It returns the rune positions for the start of each match. If not nil, the matchPositions slice will be used to store the results (avoids allocating a new slice for each call).
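
For readers unfamiliar with Knuth-Morris-Pratt, the sketch below shows how the algorithm reports overlapping matches as rune positions while reusing a caller-provided slice. It is an independent illustration, not the package's implementation: `buildFailure` and `searchAll` are hypothetical names, and returning no matches for an empty query is an assumption.

```go
package main

import "fmt"

// buildFailure computes the KMP failure table: fail[i] is the length of the
// longest proper prefix of q[:i+1] that is also a suffix of it.
func buildFailure(q []rune) []int {
	fail := make([]int, len(q))
	k := 0
	for i := 1; i < len(q); i++ {
		for k > 0 && q[i] != q[k] {
			k = fail[k-1]
		}
		if q[i] == q[k] {
			k++
		}
		fail[i] = k
	}
	return fail
}

// searchAll returns the rune position of the start of every match, including
// overlapping ones. Results are written into dst[:0] to reuse its storage.
// Assumption: an empty query produces no matches.
func searchAll(query, text string, dst []uint64) []uint64 {
	dst = dst[:0]
	q := []rune(query)
	if len(q) == 0 {
		return dst
	}
	fail := buildFailure(q)
	k := 0
	var pos uint64 // rune position of the current character
	for _, c := range text {
		for k > 0 && c != q[k] {
			k = fail[k-1]
		}
		if c == q[k] {
			k++
		}
		if k == len(q) {
			dst = append(dst, pos+1-uint64(len(q)))
			k = fail[k-1] // fall back so overlapping matches are found
		}
		pos++
	}
	return dst
}

func main() {
	fmt.Println(searchAll("aa", "aaa", nil))   // prints "[0 1]"
	fmt.Println(searchAll("éa", "éaéa", nil)) // prints "[0 2]" (rune positions, not byte offsets)
}
```

Each text character is examined once and the fallback loop only undoes earlier progress, which is where the O(n+m) bound comes from.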

func (*Searcher) Limit ¶ added in v0.2.0

func (s *Searcher) Limit(offset uint64) *Searcher

Limit sets the maximum offset (in rune positions) for the end of a match. For example, a limit of 3 would allow matches that end on the second rune from the reader, but not on the following runes.

func (*Searcher) NextInReader ¶

func (s *Searcher) NextInReader(r io.Reader) (bool, uint64, error)

NextInReader finds the next occurrence of a query in the text produced by an io.Reader. If it finds a match, it returns the offset (in rune positions) from the start of the reader.

type SingleByteReader ¶

type SingleByteReader struct {
	// contains filtered or unexported fields
}

SingleByteReader is a CloneableReader that produces a single byte at a time.
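
A reader that returns one byte per call can exercise code that must cope with multi-byte characters split across Read calls. The sketch below shows the idea with a hypothetical `oneByteReader`; it omits Clone and is not the package's implementation.

```go
package main

import (
	"fmt"
	"io"
)

// oneByteReader is a hypothetical reader that yields at most one byte per Read.
type oneByteReader struct {
	s   string
	pos int
}

func (r *oneByteReader) Read(p []byte) (int, error) {
	if r.pos >= len(r.s) {
		return 0, io.EOF
	}
	if len(p) == 0 {
		return 0, nil
	}
	p[0] = r.s[r.pos]
	r.pos++
	return 1, nil
}

func main() {
	r := &oneByteReader{s: "héllo"}
	buf := make([]byte, 8)
	n, err := r.Read(buf)
	fmt.Println(n, err) // prints "1 <nil>" even though the buffer has room for 8 bytes

	rest, _ := io.ReadAll(r)
	fmt.Println(string(buf[:n]) + string(rest)) // prints "héllo"
}
```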

func (*SingleByteReader) Clone ¶

func (r *SingleByteReader) Clone() CloneableReader

func (*SingleByteReader) Read ¶

func (r *SingleByteReader) Read(p []byte) (n int, err error)

type Tree ¶

type Tree struct {
	// contains filtered or unexported fields
}

text.Tree is a data structure for representing UTF-8 text. It supports efficient insertions, deletions, and lookup by character offset and line number.

It is inspired by two papers:

Boehm, H. J., Atkinson, R., & Plass, M. (1995). Ropes: an alternative to strings. Software: Practice and Experience, 25(12), 1315-1330.

Rao, J., & Ross, K. A. (2000, May). Making B+-trees cache conscious in main memory. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data (pp. 475-486).

Like a rope, the tree maintains character counts at each level to efficiently locate a character at a given offset. To use the CPU cache efficiently, all children of a node are pre-allocated in a group (what the Rao & Ross paper calls a "full" cache-sensitive B+ tree), and the parent uses offsets within the node group to identify child nodes. All nodes are carefully designed to fit as much data as possible within a 64-byte cache line.
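
The role of the cached character counts can be shown with a heavily simplified sketch: a flat list of leaves, each remembering how many runes it holds, so a lookup skips whole leaves without decoding them. The `leaf` type and `runeAt` function are hypothetical, and the sketch ignores the multi-level structure, node groups, and cache-line layout described above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// leaf holds a chunk of UTF-8 text plus a cached count of its runes.
type leaf struct {
	text     string
	numChars uint64
}

func newLeaf(s string) leaf {
	return leaf{text: s, numChars: uint64(utf8.RuneCountInString(s))}
}

// runeAt locates the rune at charPos. Leaves that end before charPos are
// skipped using their cached counts; only the containing leaf is decoded.
func runeAt(leaves []leaf, charPos uint64) (rune, bool) {
	for _, lf := range leaves {
		if charPos >= lf.numChars {
			charPos -= lf.numChars
			continue
		}
		for _, c := range lf.text {
			if charPos == 0 {
				return c, true
			}
			charPos--
		}
	}
	return 0, false // charPos is past the end of the text
}

func main() {
	leaves := []leaf{newLeaf("héllo "), newLeaf("wörld")}
	c, ok := runeAt(leaves, 7)
	fmt.Println(string(c), ok) // prints "ö true"
}
```

In a real tree the same subtraction happens at every level, so the number of nodes visited grows with the height of the tree rather than the length of the text.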

func NewTree ¶

func NewTree() *Tree

NewTree returns a tree representing an empty string.

func NewTreeFromReader ¶

func NewTreeFromReader(r io.Reader) (*Tree, error)

NewTreeFromReader creates a new Tree from a reader that produces UTF-8 text. This is more efficient than inserting the bytes into an empty tree. Returns an error if the bytes are invalid UTF-8.

func NewTreeFromString ¶

func NewTreeFromString(s string) (*Tree, error)

NewTreeFromString creates a new Tree from a UTF-8 string.

func (*Tree) DeleteAtPosition ¶

func (t *Tree) DeleteAtPosition(charPos uint64) (bool, rune)

DeleteAtPosition removes the UTF-8 character at the specified position (0-indexed). If charPos is past the end of the text, this has no effect.

func (*Tree) InsertAtPosition ¶

func (t *Tree) InsertAtPosition(charPos uint64, c rune) error

InsertAtPosition inserts a UTF-8 character at the specified position (0-indexed). If charPos is past the end of the text, it will be appended at the end. Returns an error if c is not a valid UTF-8 character.
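
The boundary behavior of DeleteAtPosition and InsertAtPosition can be sketched with a plain rune slice. This shows the documented semantics only, not the tree's algorithm. `runeBuf`, `insertAt`, `deleteAt`, and `errInvalidUtf8` are hypothetical names, and the meaning assigned to the two return values of `deleteAt` (whether a character was removed, and which one) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

var errInvalidUtf8 = errors.New("invalid UTF-8")

// runeBuf is a hypothetical, naive text buffer used only to show semantics.
type runeBuf struct {
	runes []rune
}

// insertAt inserts c at charPos, appending when charPos is past the end.
func (b *runeBuf) insertAt(charPos uint64, c rune) error {
	if !utf8.ValidRune(c) {
		return errInvalidUtf8
	}
	if n := uint64(len(b.runes)); charPos > n {
		charPos = n
	}
	b.runes = append(b.runes, 0)
	copy(b.runes[charPos+1:], b.runes[charPos:])
	b.runes[charPos] = c
	return nil
}

// deleteAt removes the rune at charPos; past the end it does nothing.
// Assumption: the results report whether a rune was removed and which one.
func (b *runeBuf) deleteAt(charPos uint64) (bool, rune) {
	if charPos >= uint64(len(b.runes)) {
		return false, 0
	}
	c := b.runes[charPos]
	b.runes = append(b.runes[:charPos], b.runes[charPos+1:]...)
	return true, c
}

func main() {
	b := &runeBuf{}
	b.insertAt(0, 'b')
	b.insertAt(0, 'a')
	b.insertAt(99, 'c') // past the end, so it is appended
	fmt.Println(string(b.runes)) // prints "abc"

	ok, c := b.deleteAt(1)
	fmt.Println(ok, string(c), string(b.runes)) // prints "true b ac"
}
```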

func (*Tree) LineNumForPosition ¶

func (t *Tree) LineNumForPosition(charPos uint64) uint64

LineNumForPosition returns the line number (0-indexed) for the line containing the specified position.

func (*Tree) LineStartPosition ¶

func (t *Tree) LineStartPosition(lineNum uint64) uint64

LineStartPosition returns the position of the first character at the specified line (0-indexed). If the line number is greater than the maximum line number, returns one past the position of the last character.
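
The relationship between LineNumForPosition and LineStartPosition is easiest to see with a linear scan over a string. The sketch assumes lines are separated by '\n' and positions are counted in runes. `lineNumForPosition` and `lineStartPosition` are hypothetical helpers that mimic the documented results without using the tree.

```go
package main

import "fmt"

// lineNumForPosition counts the newlines that occur before charPos.
func lineNumForPosition(text string, charPos uint64) uint64 {
	var line, pos uint64
	for _, c := range text {
		if pos >= charPos {
			break
		}
		if c == '\n' {
			line++
		}
		pos++
	}
	return line
}

// lineStartPosition returns the rune position of the first character on
// lineNum, or one past the last character if the line does not exist.
func lineStartPosition(text string, lineNum uint64) uint64 {
	var line, pos uint64
	for _, c := range text {
		if line == lineNum {
			return pos
		}
		if c == '\n' {
			line++
		}
		pos++
	}
	return pos
}

func main() {
	text := "ab\ncd"
	fmt.Println(lineNumForPosition(text, 4)) // prints "1": 'd' is on the second line
	fmt.Println(lineStartPosition(text, 1))  // prints "3": 'c' starts the second line
	fmt.Println(lineStartPosition(text, 9))  // prints "5": one past the last character
}
```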

func (*Tree) NumChars ¶

func (t *Tree) NumChars() uint64

NumChars returns the total number of characters (runes) in the tree.

func (*Tree) NumLines ¶

func (t *Tree) NumLines() uint64

NumLines returns the total number of lines in the tree.

func (*Tree) ReaderAtPosition ¶

func (t *Tree) ReaderAtPosition(charPos uint64, direction ReadDirection) *TreeReader

ReaderAtPosition returns a reader starting at the UTF-8 character at the specified position (0-indexed). If the position is past the end of the text, the returned reader will read zero bytes.

func (*Tree) String ¶

func (t *Tree) String() string

String returns the text in the tree as a string.

type TreeReader ¶

type TreeReader struct {
	// contains filtered or unexported fields
}

TreeReader reads UTF-8 bytes from a text.Tree. It implements io.Reader and CloneableReader. text.Tree is NOT thread-safe, so reading from a tree while modifying it is undefined behavior!

func (*TreeReader) Clone ¶

func (r *TreeReader) Clone() CloneableReader

Clone implements CloneableReader#Clone

func (*TreeReader) Read ¶

func (r *TreeReader) Read(b []byte) (int, error)

Read implements io.Reader#Read

func (*TreeReader) SeekBackward ¶

func (r *TreeReader) SeekBackward(offset uint64) error

SeekBackward implements parser.InputReader#SeekBackward

Directories ¶

Path	Synopsis
segment
utf8