Documentation ¶
Overview ¶
Package analysis provides an API for converting text into indexable/searchable tokens.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type TokenFilter ¶
type TokenFilter interface {
	// Filter filters the given list with the described behaviour
	Filter(list []Token) []Token
}
TokenFilter is responsible for removing, modifying, or otherwise altering tokens in the given token stream.
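A custom TokenFilter can be sketched as below. This is a minimal, self-contained illustration: the Token type and the lowercaseFilter implementation are assumptions for the example, not part of the package's API.

```go
package main

import (
	"fmt"
	"strings"
)

// Token stands in for the package's token type (assumed here to be string-like).
type Token = string

// TokenFilter mirrors the interface documented above.
type TokenFilter interface {
	Filter(list []Token) []Token
}

// lowercaseFilter is a hypothetical filter that lowercases every token.
type lowercaseFilter struct{}

func (lowercaseFilter) Filter(list []Token) []Token {
	out := make([]Token, 0, len(list))
	for _, t := range list {
		out = append(out, strings.ToLower(t))
	}
	return out
}

func main() {
	var f TokenFilter = lowercaseFilter{}
	fmt.Println(f.Filter([]Token{"Hello", "WORLD"})) // [hello world]
}
```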
func NewEnglishStemmerFilter ¶
func NewEnglishStemmerFilter() TokenFilter
NewEnglishStemmerFilter creates a new stemmer for English language.
func NewNormalizerFilter ¶
func NewNormalizerFilter(chars alphabet.Alphabet, pad string) TokenFilter
NewNormalizerFilter returns a token filter that normalizes tokens against the given alphabet, using pad as the replacement string.
func NewRussianStemmerFilter ¶
func NewRussianStemmerFilter() TokenFilter
NewRussianStemmerFilter creates a new stemmer for Russian language.
type Tokenizer ¶
type Tokenizer interface {
	// Tokenize splits the given text into a sequence of tokens
	Tokenize(text string) []Token
}
Tokenizer splits the given text into a sequence of tokens.
func NewFilterTokenizer ¶
func NewFilterTokenizer(tokenizer Tokenizer, filter TokenFilter) Tokenizer
NewFilterTokenizer creates a new tokenizer that applies the given filter to the tokens produced by the given tokenizer.
func NewNGramTokenizer ¶
NewNGramTokenizer creates a new n-gram Tokenizer.
func NewWordTokenizer ¶
NewWordTokenizer creates a new word Tokenizer.
func NewWrapTokenizer ¶
NewWrapTokenizer returns a tokenizer that wraps the provided text before tokenization.