Documentation ¶
Overview ¶
Package segmenter implements Unicode rules used to segment a paragraph of text according to several criteria. In particular, it provides a way of delimiting line break opportunities.
The API of the package follows the very nice iterator pattern proposed in github.com/npillmayer/uax, but uses a somewhat simpler internal implementation, inspired by Pango.
The reference documentation is at https://unicode.org/reports/tr14 and https://unicode.org/reports/tr29.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Grapheme ¶
type Grapheme struct {
// Text is a subslice of the original input slice, containing the delimited grapheme
Text []rune
// Offset is the start of the grapheme in the input rune slice
Offset int
}
Grapheme is the content of a grapheme delimited by the segmenter.
type GraphemeIterator ¶
type GraphemeIterator struct {
// contains filtered or unexported fields
}
GraphemeIterator provides a convenient way of iterating over the graphemes delimited by a `Segmenter`.
func (*GraphemeIterator) Grapheme ¶
func (gr *GraphemeIterator) Grapheme() Grapheme
Grapheme returns the current `Grapheme`.
func (*GraphemeIterator) Next ¶
func (gr *GraphemeIterator) Next() bool
Next advances the iterator and returns true if there is still a grapheme to process; otherwise it returns false.
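The Next/Grapheme pair can be illustrated with a small stand-in written against the standard library only. This is a sketch, not the package's implementation: `toyGraphemeIterator` and `graphemes` are hypothetical names, and the only rule applied is "a combining mark stays attached to the preceding rune", which is a small subset of the rules in https://unicode.org/reports/tr29.

```go
package main

import (
	"fmt"
	"unicode"
)

// Grapheme mirrors the documented struct: a subslice of the input
// and the rune offset at which it starts.
type Grapheme struct {
	Text   []rune
	Offset int
}

// toyGraphemeIterator is a hypothetical stand-in for GraphemeIterator.
// It only keeps combining marks with the rune before them; the real
// segmenter applies the full UAX #29 rule set.
type toyGraphemeIterator struct {
	src        []rune
	start, end int // the current grapheme is src[start:end]
}

// Next advances to the next grapheme and reports whether one was found.
func (it *toyGraphemeIterator) Next() bool {
	if it.end >= len(it.src) {
		return false
	}
	it.start = it.end
	it.end++
	for it.end < len(it.src) && unicode.IsMark(it.src[it.end]) {
		it.end++
	}
	return true
}

// Grapheme returns the grapheme found by the last successful call to Next.
func (it *toyGraphemeIterator) Grapheme() Grapheme {
	return Grapheme{Text: it.src[it.start:it.end], Offset: it.start}
}

// graphemes drains the iterator and collects every grapheme of s.
func graphemes(s string) []Grapheme {
	it := &toyGraphemeIterator{src: []rune(s)}
	var out []Grapheme
	for it.Next() {
		out = append(out, it.Grapheme())
	}
	return out
}

func main() {
	// 'e' followed by U+0301 (combining acute accent), then 'a'.
	for _, g := range graphemes("e\u0301a") {
		fmt.Printf("offset=%d runes=%d\n", g.Offset, len(g.Text))
	}
}
```

Because Text is a subslice of the input rather than a copy, a caller that needs to keep a grapheme after modifying the input would presumably have to copy it first.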
type Line ¶
type Line struct {
// Text is a subslice of the original input slice, containing the delimited line
Text []rune
// Offset is the start of the line in the input rune slice
Offset int
// IsMandatoryBreak is true if breaking (at the end of the line)
// is mandatory
IsMandatoryBreak bool
}
Line is the content of a line delimited by the segmenter.
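Since Offset counts runes in the input slice, it cannot be used directly to index a Go string, which is indexed by bytes. The helper below is a minimal sketch of one way to convert; `byteOffset` is a hypothetical name and is not part of this package.

```go
package main

import "fmt"

// byteOffset converts a rune offset (such as Line.Offset or
// Grapheme.Offset) into a byte offset in s. It returns len(s) when
// runeOffset is at or past the end of the string.
func byteOffset(s string, runeOffset int) int {
	n := 0
	for i := range s { // i is the byte index of each rune
		if n == runeOffset {
			return i
		}
		n++
	}
	return len(s)
}

func main() {
	s := "héllo wörld"
	// Suppose a segment starts at rune offset 6 and is 5 runes long.
	start := byteOffset(s, 6)
	end := byteOffset(s, 6+5)
	fmt.Println(s[start:end]) // prints "wörld"
}
```

The conversion is linear in the length of the string, so for many segments it may be cheaper to walk the string once and record byte positions as the iterator advances.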
type LineIterator ¶
type LineIterator struct {
// contains filtered or unexported fields
}
LineIterator provides a convenient way of iterating over the lines delimited by a `Segmenter`.
func (*LineIterator) Next ¶
func (li *LineIterator) Next() bool
Next advances the iterator and returns true if there is still a line to process; otherwise it returns false.
type Segmenter ¶
type Segmenter struct {
// contains filtered or unexported fields
}
Segmenter is the entry point of the package.
Usage:
	var seg Segmenter
	seg.Init(...)
	iter := seg.LineIterator()
	for iter.Next() {
		... // do something with iter.Line()
	}
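The Init, LineIterator, Next, Line sequence above can be tried out with a self-contained stand-in. Everything in this sketch is an assumption made for illustration: `toySegmenter`, `toyLineIterator`, `breakKind` and `lines` are hypothetical, Init is assumed to take the paragraph as a []rune, and the break rules are reduced to "a break is allowed after a run of spaces" and "a break is mandatory after a newline and at the end of the text". The real rules are those of https://unicode.org/reports/tr14.

```go
package main

import "fmt"

// Line mirrors the documented struct.
type Line struct {
	Text             []rune
	Offset           int
	IsMandatoryBreak bool
}

type breakKind uint8

const (
	noBreak breakKind = iota
	allowedBreak
	mandatoryBreak
)

// toySegmenter is a hypothetical stand-in for Segmenter.
type toySegmenter struct {
	text   []rune
	breaks []breakKind // breaks[i] describes the boundary after text[i]
}

// Init stores the input and precomputes one break attribute per rune.
func (sg *toySegmenter) Init(text []rune) {
	sg.text = text
	sg.breaks = make([]breakKind, len(text))
	for i, r := range text {
		switch {
		case r == '\n':
			sg.breaks[i] = mandatoryBreak
		case r == ' ' && i+1 < len(text) && text[i+1] != ' ' && text[i+1] != '\n':
			sg.breaks[i] = allowedBreak
		}
	}
	if len(text) > 0 {
		sg.breaks[len(text)-1] = mandatoryBreak // always break at end of text
	}
}

// toyLineIterator walks the boundaries computed by Init.
type toyLineIterator struct {
	sg         *toySegmenter
	start, end int
	mandatory  bool
}

func (sg *toySegmenter) LineIterator() *toyLineIterator {
	return &toyLineIterator{sg: sg}
}

// Next advances to the next break opportunity, if any.
func (li *toyLineIterator) Next() bool {
	if li.end >= len(li.sg.text) {
		return false
	}
	li.start = li.end
	for i := li.start; i < len(li.sg.text); i++ {
		if k := li.sg.breaks[i]; k != noBreak {
			li.end = i + 1
			li.mandatory = k == mandatoryBreak
			return true
		}
	}
	return false // not reached: the last rune always carries a break
}

// Line returns the line found by the last successful call to Next.
func (li *toyLineIterator) Line() Line {
	return Line{
		Text:             li.sg.text[li.start:li.end],
		Offset:           li.start,
		IsMandatoryBreak: li.mandatory,
	}
}

// lines runs the documented usage pattern and collects the results.
func lines(s string) []Line {
	var seg toySegmenter
	seg.Init([]rune(s))
	var out []Line
	iter := seg.LineIterator()
	for iter.Next() {
		out = append(out, iter.Line())
	}
	return out
}

func main() {
	for _, l := range lines("ab cd\nef") {
		fmt.Printf("%q offset=%d mandatory=%v\n", string(l.Text), l.Offset, l.IsMandatoryBreak)
	}
}
```

Each Line here ends at a break opportunity, not at a wrapped display line; a layout engine would typically merge consecutive non-mandatory segments until the available width is exhausted.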
func (*Segmenter) GraphemeIterator ¶
func (sg *Segmenter) GraphemeIterator() *GraphemeIterator
GraphemeIterator returns an iterator over the graphemes delimited in [Init].
func (*Segmenter) Init ¶
Init resets the segmenter storage with the given input, and computes the attributes required to segment the text.
func (*Segmenter) LineIterator ¶
func (sg *Segmenter) LineIterator() *LineIterator
LineIterator returns an iterator over the lines delimited in [Init].