Documentation ¶
Index ¶
- Constants
- Variables
- func Repeat(c rune, n int) string
- func Reverse(s string) string
- func SearchAllInString(q string, text string) []uint64
- func SearchNextInReader(q string, r io.Reader) (bool, uint64, error)
- func ToggleRuneCase(r rune) rune
- type CloneableReader
- type CloneableRuneIter
- type ReadDirection
- type RuneIter
- type Searcher
- type SingleByteReader
- type Tree
- func (t *Tree) DeleteAtPosition(charPos uint64) (bool, rune)
- func (t *Tree) InsertAtPosition(charPos uint64, c rune) error
- func (t *Tree) LineNumForPosition(charPos uint64) uint64
- func (t *Tree) LineStartPosition(lineNum uint64) uint64
- func (t *Tree) NumChars() uint64
- func (t *Tree) NumLines() uint64
- func (t *Tree) ReaderAtPosition(charPos uint64, direction ReadDirection) *TreeReader
- func (t *Tree) String() string
- type TreeReader
Constants ¶
const (
ReadDirectionForward = ReadDirection(iota)
ReadDirectionBackward
)
Variables ¶
var (
InvalidUtf8Error = errors.New("invalid UTF-8")
)
Functions ¶
func SearchAllInString ¶
func ToggleRuneCase ¶
ToggleRuneCase changes the case of the rune from lower-to-upper or vice-versa.
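The behavior described above can be approximated with the standard `unicode` package. This is a minimal sketch, not the package's implementation: `toggleCase` is a hypothetical name, and leaving caseless runes (digits, punctuation) unchanged is an assumption.

```go
package main

import (
	"fmt"
	"unicode"
)

// toggleCase is a hypothetical stand-in for ToggleRuneCase.
// Uppercase runes become lowercase, lowercase runes become uppercase,
// and runes without case are returned unchanged (an assumption).
func toggleCase(r rune) rune {
	switch {
	case unicode.IsUpper(r):
		return unicode.ToLower(r)
	case unicode.IsLower(r):
		return unicode.ToUpper(r)
	default:
		return r
	}
}

func main() {
	for _, r := range "aB1é" {
		fmt.Printf("%c -> %c\n", r, toggleCase(r))
	}
}
```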
Types ¶
type CloneableReader ¶
type CloneableReader interface {
io.Reader
// Clone creates a new, independent reader at the same position as the original reader.
Clone() CloneableReader
}
CloneableReader is an io.Reader that can be cloned to produce a new, independent reader at the same position.
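To illustrate the cloning contract, here is a small self-contained sketch of a string-backed reader. The names `cloneableReader`, `stringReader`, and `readRest` are hypothetical and defined locally; the package's own readers may be implemented quite differently.

```go
package main

import (
	"fmt"
	"io"
)

// cloneableReader mirrors the shape of the CloneableReader interface.
type cloneableReader interface {
	io.Reader
	Clone() cloneableReader
}

// stringReader is a hypothetical reader over an immutable string.
type stringReader struct {
	s   string
	pos int
}

// Read copies as many remaining bytes as fit into p.
func (r *stringReader) Read(p []byte) (int, error) {
	if r.pos >= len(r.s) {
		return 0, io.EOF
	}
	n := copy(p, r.s[r.pos:])
	r.pos += n
	return n, nil
}

// Clone returns a reader with its own position, starting where r is now.
func (r *stringReader) Clone() cloneableReader {
	return &stringReader{s: r.s, pos: r.pos}
}

// readRest drains a reader and returns the bytes as a string.
func readRest(r io.Reader) string {
	b, err := io.ReadAll(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	orig := &stringReader{s: "hello"}
	buf := make([]byte, 2)
	if _, err := io.ReadFull(orig, buf); err != nil { // consume "he"
		panic(err)
	}
	clone := orig.Clone()
	fmt.Println(readRest(orig))  // draining the original...
	fmt.Println(readRest(clone)) // ...does not move the clone
}
```

Because the clone copies only the position and shares the immutable string, cloning is cheap and the two readers never interfere with each other.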
func NewCloneableReaderFromString ¶
func NewCloneableReaderFromString(s string) CloneableReader
func NewSingleByteReader ¶
func NewSingleByteReader(s string) CloneableReader
type CloneableRuneIter ¶
type CloneableRuneIter interface {
RuneIter
// Clone returns a new, independent iterator at the same position as the original iterator.
Clone() CloneableRuneIter
}
CloneableRuneIter is a RuneIter that can be cloned to produce a new, independent iterator at the same position as the original iterator.
func NewCloneableBackwardRuneIter ¶
func NewCloneableBackwardRuneIter(in CloneableReader) CloneableRuneIter
NewCloneableBackwardRuneIter creates a CloneableRuneIter for a stream of UTF-8 bytes in reverse order.
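One way to decode runes from a reversed byte stream is to collect bytes until a UTF-8 start byte appears, then decode the collected bytes in their original order. The sketch below shows that idea under stated assumptions: `backwardRuneIter`, `reversedBytes`, and `readBackward` are hypothetical names, cloning is omitted, and the error handling is a guess rather than the package's behavior.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"unicode/utf8"
)

var errInvalidUTF8 = errors.New("invalid UTF-8")

// backwardRuneIter is a hypothetical iterator over a reader that yields
// the bytes of a UTF-8 text from last to first.
type backwardRuneIter struct {
	in io.Reader
}

// NextRune fills a small buffer from the back until it sees a start byte,
// then decodes the buffered bytes, which are now in forward order.
func (it *backwardRuneIter) NextRune() (rune, error) {
	var buf [utf8.UTFMax]byte
	var one [1]byte
	for i := len(buf) - 1; i >= 0; i-- {
		n, err := it.in.Read(one[:])
		if n == 0 {
			if err == io.EOF && i < len(buf)-1 {
				return 0, errInvalidUTF8 // input ended in the middle of a rune
			}
			if err == nil {
				err = io.ErrNoProgress
			}
			return 0, err
		}
		buf[i] = one[0]
		if utf8.RuneStart(one[0]) {
			r, size := utf8.DecodeRune(buf[i:])
			if size != len(buf)-i || (r == utf8.RuneError && size == 1) {
				return 0, errInvalidUTF8
			}
			return r, nil
		}
	}
	return 0, errInvalidUTF8 // too many continuation bytes in a row
}

// reversedBytes returns a reader over the bytes of s in reverse order.
func reversedBytes(s string) io.Reader {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return bytes.NewReader(b)
}

// readBackward collects every rune of s, last rune first.
func readBackward(s string) (string, error) {
	it := &backwardRuneIter{in: reversedBytes(s)}
	var out []rune
	for {
		r, err := it.NextRune()
		if err == io.EOF {
			return string(out), nil
		}
		if err != nil {
			return string(out), err
		}
		out = append(out, r)
	}
}

func main() {
	s, err := readBackward("héllo, 世界")
	fmt.Println(s, err)
}
```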
func NewCloneableForwardRuneIter ¶
func NewCloneableForwardRuneIter(in CloneableReader) CloneableRuneIter
NewCloneableForwardRuneIter creates a CloneableRuneIter for a stream of UTF-8 bytes. It assumes the provided reader produces a stream of valid UTF-8 bytes.
func NewRuneIterForSlice ¶
func NewRuneIterForSlice(runeSlice []rune) CloneableRuneIter
NewRuneIterForSlice returns a RuneIter over the given slice. This assumes that runeSlice will be immutable for the lifetime of the iterator.
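A slice-backed iterator can be sketched in a few lines. The names below (`runeIter`, `cloneableRuneIter`, `sliceRuneIter`, `drain`) are hypothetical and defined locally; the sketch only illustrates why the slice must stay immutable, since a clone shares the same backing array.

```go
package main

import (
	"fmt"
	"io"
)

// runeIter and cloneableRuneIter mirror the shapes of the package's interfaces.
type runeIter interface {
	NextRune() (rune, error)
}

type cloneableRuneIter interface {
	runeIter
	Clone() cloneableRuneIter
}

// sliceRuneIter is a hypothetical iterator over a rune slice.
type sliceRuneIter struct {
	runes []rune
	pos   int
}

// NextRune returns the next rune, or io.EOF once the slice is exhausted.
func (it *sliceRuneIter) NextRune() (rune, error) {
	if it.pos >= len(it.runes) {
		return 0, io.EOF
	}
	r := it.runes[it.pos]
	it.pos++
	return r, nil
}

// Clone shares the slice but copies the position, so mutating the slice
// afterwards would affect every clone.
func (it *sliceRuneIter) Clone() cloneableRuneIter {
	return &sliceRuneIter{runes: it.runes, pos: it.pos}
}

// drain collects the remaining runes of an iterator into a string.
func drain(it runeIter) string {
	var out []rune
	for {
		r, err := it.NextRune()
		if err != nil {
			return string(out)
		}
		out = append(out, r)
	}
}

func main() {
	it := &sliceRuneIter{runes: []rune("héllo")}
	it.NextRune() // skip 'h'
	clone := it.Clone()
	fmt.Println(drain(it), drain(clone))
}
```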
type ReadDirection ¶
type ReadDirection int
func (ReadDirection) Reverse ¶
func (d ReadDirection) Reverse() ReadDirection
func (ReadDirection) String ¶
func (d ReadDirection) String() string
type RuneIter ¶
type RuneIter interface {
// NextRune returns the next available rune. If no rune is available, it returns the error io.EOF.
NextRune() (rune, error)
}
RuneIter iterates over Unicode code points (runes).
type Searcher ¶
type Searcher struct {
// contains filtered or unexported fields
}
Searcher searches for an exact match of a query. It uses the Knuth-Morris-Pratt algorithm, which runs in O(n+m) time, where n is the length of the text and m is the length of the query.
func NewSearcher ¶
func (*Searcher) AllInString ¶
AllInString finds all (possibly overlapping) matches of the query in a string. It returns the rune positions for the start of each match. If not nil, the matchPositions slice will be used to store the results (avoids allocating a new slice for each call).
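For readers unfamiliar with Knuth-Morris-Pratt, the following is a compact sketch of the algorithm that reports rune positions of all matches, including overlapping ones. The function names `prefixTable` and `searchAll` are hypothetical, and returning nil for an empty query is an assumption; the package's Searcher may differ in both structure and edge cases.

```go
package main

import "fmt"

// prefixTable computes, for each prefix of q, the length of the longest
// proper prefix that is also a suffix (the KMP failure function).
func prefixTable(q []rune) []int {
	t := make([]int, len(q))
	k := 0
	for i := 1; i < len(q); i++ {
		for k > 0 && q[i] != q[k] {
			k = t[k-1]
		}
		if q[i] == q[k] {
			k++
		}
		t[i] = k
	}
	return t
}

// searchAll returns the rune position of the start of every match of
// query in text, including overlapping matches.
func searchAll(query, text string) []uint64 {
	q := []rune(query)
	if len(q) == 0 {
		return nil // assumption: an empty query matches nothing
	}
	t := prefixTable(q)
	var out []uint64
	var pos uint64 // rune index of the current character
	k := 0         // number of query runes currently matched
	for _, r := range text {
		for k > 0 && r != q[k] {
			k = t[k-1]
		}
		if r == q[k] {
			k++
		}
		if k == len(q) {
			out = append(out, pos+1-uint64(len(q)))
			k = t[k-1] // fall back so overlapping matches are found
		}
		pos++
	}
	return out
}

func main() {
	fmt.Println(searchAll("aba", "ababa")) // [0 2]
}
```

Each text rune is examined once and the fallback loop only undoes earlier progress, which is where the O(n+m) bound comes from.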
type SingleByteReader ¶
type SingleByteReader struct {
// contains filtered or unexported fields
}
SingleByteReader is a CloneableReader that produces a single byte at a time.
func (*SingleByteReader) Clone ¶
func (r *SingleByteReader) Clone() CloneableReader
type Tree ¶
type Tree struct {
// contains filtered or unexported fields
}
text.Tree is a data structure for representing UTF-8 text. It supports efficient insertions, deletions, and lookup by character offset and line number.
It is inspired by two papers:
- Boehm, H. J., Atkinson, R., & Plass, M. (1995). Ropes: an alternative to strings. Software: Practice and Experience, 25(12), 1315-1330.
- Rao, J., & Ross, K. A. (2000, May). Making B+-trees cache conscious in main memory. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data (pp. 475-486).
Like a rope, the tree maintains character counts at each level to efficiently locate a character at a given offset.
To use the CPU cache efficiently, all children of a node are pre-allocated in a group (what the Rao & Ross paper calls a "full" cache-sensitive B+ tree), and the parent uses offsets within the node group to identify child nodes. All nodes are carefully designed to fit as much data as possible within a 64-byte cache line.
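The rope idea of descending by character counts can be shown with a deliberately simplified sketch. It uses ordinary pointers and rune slices rather than node groups, offsets, or a cache-line-sized layout, so it illustrates only the lookup logic; `node`, `newLeaf`, `newInner`, and `charAt` are hypothetical names.

```go
package main

import "fmt"

// node is a hypothetical rope node. Inner nodes store how many characters
// live under each child; leaves store the characters themselves.
type node struct {
	children []*node
	counts   []uint64 // counts[i] is the number of characters under children[i]
	leaf     []rune   // used only when children is empty
}

func newLeaf(s string) *node { return &node{leaf: []rune(s)} }

func newInner(children ...*node) *node {
	n := &node{children: children}
	for _, c := range children {
		n.counts = append(n.counts, c.numChars())
	}
	return n
}

// numChars returns the number of characters stored under n.
func (n *node) numChars() uint64 {
	if len(n.children) == 0 {
		return uint64(len(n.leaf))
	}
	var total uint64
	for _, c := range n.counts {
		total += c
	}
	return total
}

// charAt descends from n, subtracting the counts of skipped children,
// until it reaches the leaf that holds position pos.
func (n *node) charAt(pos uint64) (rune, bool) {
	for len(n.children) > 0 {
		var next *node
		for i, c := range n.counts {
			if pos < c {
				next = n.children[i]
				break
			}
			pos -= c
		}
		if next == nil {
			return 0, false // pos is past the end of the text
		}
		n = next
	}
	if pos >= uint64(len(n.leaf)) {
		return 0, false
	}
	return n.leaf[pos], true
}

func main() {
	root := newInner(
		newInner(newLeaf("hel"), newLeaf("lo ")),
		newInner(newLeaf("wor"), newLeaf("ld")),
	)
	r, ok := root.charAt(6)
	fmt.Printf("%c %v\n", r, ok)
}
```

The lookup touches one node per level, so its cost grows with the height of the tree rather than the length of the text.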
func NewTreeFromReader ¶
NewTreeFromReader creates a new Tree from a reader that produces UTF-8 text. This is more efficient than inserting the bytes into an empty tree. Returns an error if the bytes are invalid UTF-8.
func NewTreeFromString ¶
NewTreeFromString creates a new Tree from a UTF-8 string.
func (*Tree) DeleteAtPosition ¶
DeleteAtPosition removes the UTF-8 character at the specified position (0-indexed). If charPos is past the end of the text, this has no effect.
func (*Tree) InsertAtPosition ¶
InsertAtPosition inserts a UTF-8 character at the specified position (0-indexed). If charPos is past the end of the text, the character is appended at the end. Returns an error if c is not a valid UTF-8 character.
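The boundary rules for insertion and deletion can be modeled on a plain rune slice. This sketch is a model of the documented semantics only, not of the tree: `insertAt` and `deleteAt` are hypothetical names, and reading DeleteAtPosition's results as "whether a character was deleted" and "the deleted character" is an assumption, since the documentation does not spell them out.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

var errInvalidUTF8 = errors.New("invalid UTF-8")

// insertAt inserts c at pos, clamping positions past the end to an append.
// It rejects runes that cannot be encoded as UTF-8 (for example surrogates).
func insertAt(text []rune, pos uint64, c rune) ([]rune, error) {
	if !utf8.ValidRune(c) {
		return text, errInvalidUTF8
	}
	if pos > uint64(len(text)) {
		pos = uint64(len(text))
	}
	text = append(text, 0)
	copy(text[pos+1:], text[pos:])
	text[pos] = c
	return text, nil
}

// deleteAt removes the character at pos. Positions past the end have no
// effect. The bool and rune results are assumed to report whether a
// character was removed and which one.
func deleteAt(text []rune, pos uint64) ([]rune, bool, rune) {
	if pos >= uint64(len(text)) {
		return text, false, 0
	}
	c := text[pos]
	return append(text[:pos], text[pos+1:]...), true, c
}

func main() {
	text := []rune("ac")
	text, _ = insertAt(text, 1, 'b')  // insert in the middle
	text, _ = insertAt(text, 99, 'd') // past the end: appended
	text, ok, removed := deleteAt(text, 0)
	fmt.Println(string(text), ok, string(removed))
}
```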
func (*Tree) LineNumForPosition ¶
LineNumForPosition returns the line number (0-indexed) for the line containing the specified position.
func (*Tree) LineStartPosition ¶
LineStartPosition returns the position of the first character at the specified line (0-indexed). If the line number is greater than the maximum line number, returns one past the position of the last character.
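The relationship between character positions and line numbers can be checked against a linear-time model. The sketch below assumes that lines are separated by '\n' and that a newline belongs to the line it terminates; neither is stated in the documentation. The function names are hypothetical, and the real tree presumably answers these queries from counts stored in its nodes rather than by scanning.

```go
package main

import "fmt"

// lineNumForPosition counts the newlines strictly before charPos.
func lineNumForPosition(text []rune, charPos uint64) uint64 {
	var line uint64
	for i, r := range text {
		if uint64(i) >= charPos {
			break
		}
		if r == '\n' {
			line++
		}
	}
	return line
}

// lineStartPosition returns the position just after the lineNum-th newline,
// or one past the last character if there are fewer lines than that.
func lineStartPosition(text []rune, lineNum uint64) uint64 {
	if lineNum == 0 {
		return 0
	}
	var line uint64
	for i, r := range text {
		if r == '\n' {
			line++
			if line == lineNum {
				return uint64(i) + 1
			}
		}
	}
	return uint64(len(text))
}

func main() {
	text := []rune("ab\ncd\nef")
	fmt.Println(lineNumForPosition(text, 3)) // position of 'c'
	fmt.Println(lineStartPosition(text, 2))  // start of "ef"
	fmt.Println(lineStartPosition(text, 5))  // past the last line
}
```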
func (*Tree) ReaderAtPosition ¶
func (t *Tree) ReaderAtPosition(charPos uint64, direction ReadDirection) *TreeReader
ReaderAtPosition returns a reader starting at the UTF-8 character at the specified position (0-indexed). If the position is past the end of the text, the returned reader will read zero bytes.
type TreeReader ¶
type TreeReader struct {
// contains filtered or unexported fields
}
TreeReader reads UTF-8 bytes from a text.Tree. It implements io.Reader and CloneableReader. text.Tree is NOT thread-safe, so reading from a tree while modifying it is undefined behavior!
func (*TreeReader) Clone ¶
func (r *TreeReader) Clone() CloneableReader
Clone implements CloneableReader#Clone
func (*TreeReader) Read ¶
func (r *TreeReader) Read(b []byte) (int, error)
Read implements io.Reader#Read
func (*TreeReader) SeekBackward ¶
func (r *TreeReader) SeekBackward(offset uint64) error
SeekBackward implements parser.InputReader#SeekBackward