Documentation
Index
Constants
This section is empty.
Variables
This section is empty.
Functions
func Filter
Filter removes elements from the input channel where the supplied predicate is satisfied. Filter is a Predicate aggregation.
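Filter's full signature is not shown in this excerpt, so the following is only a minimal, self-contained sketch of the documented behaviour: tokens arriving on a channel are dropped whenever one of the supplied predicates is satisfied. The Predicate type, the channel-based signature, and the helper names (filter, short) are assumptions, not the package's actual API.

package main

import (
	"fmt"
	"strings"
)

// Predicate is assumed here to be a func(string) bool, matching the
// "Predicate aggregation" wording above; the package's real definition
// may differ.
type Predicate func(string) bool

// filter is a stand-in for Filter: following the description above, it
// drops tokens for which any supplied predicate is satisfied and forwards
// the rest on a new channel.
func filter(in <-chan string, preds ...Predicate) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for v := range in {
			drop := false
			for _, p := range preds {
				if p(v) {
					drop = true
					break
				}
			}
			if !drop {
				out <- v
			}
		}
	}()
	return out
}

func main() {
	in := make(chan string)
	go func() {
		defer close(in)
		for _, w := range strings.Fields("the quick brown fox") {
			in <- w
		}
	}()
	// Drop tokens shorter than four characters.
	short := func(v string) bool { return len(v) < 4 }
	for w := range filter(in, short) {
		fmt.Println(w)
	}
}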
func IsNotStopWord
IsNotStopWord is the inverse function of IsStopWord.
func IsStopWord
IsStopWord performs a binary search against a list of known English stop words. It returns true if v is a stop word; false otherwise.
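As a rough illustration of the documented behaviour, the sketch below binary-searches a small sorted word list using sort.SearchStrings. The word list, the lower-case helper names, and the exact signatures are assumptions; the package's real list of English stop words is not shown here.

package main

import (
	"fmt"
	"sort"
)

// stopWords stands in for the package's list of known English stop words;
// a binary search requires the slice to be kept in sorted order.
var stopWords = []string{"a", "an", "and", "is", "of", "the", "to"}

// isStopWord mirrors the documented behaviour: binary-search the sorted
// list and report whether v is present.
func isStopWord(v string) bool {
	i := sort.SearchStrings(stopWords, v)
	return i < len(stopWords) && stopWords[i] == v
}

// isNotStopWord is the inverse, as described for IsNotStopWord.
func isNotStopWord(v string) bool { return !isStopWord(v) }

func main() {
	fmt.Println(isStopWord("the"))    // true
	fmt.Println(isNotStopWord("fox")) // true
}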
Types
type Classifier
type Classifier interface {
	// Train allows clients to train the classifier
	Train(io.Reader, string) error
	// TrainString allows clients to train the classifier using a string
	TrainString(string, string) error
	// Classify performs a classification on the input corpus and assumes that
	// the underlying classifier has been trained.
	Classify(io.Reader) (string, error)
	// ClassifyString performs text classification using a string
	ClassifyString(string) (string, error)
}
Classifier provides a simple interface for different text classifiers.
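The sketch below shows the intended call pattern for this interface: train the classifier one labelled document at a time, then classify new input. The Classifier declaration is copied from above; trainAndClassify, the example labels, and the training strings are illustrative assumptions, and the concrete classifier implementation (not shown in this excerpt) would be supplied by the caller.

package example

import (
	"fmt"
	"io"
	"log"
	"strings"
)

// Classifier is copied from the interface shown above.
type Classifier interface {
	Train(io.Reader, string) error
	TrainString(string, string) error
	Classify(io.Reader) (string, error)
	ClassifyString(string) (string, error)
}

// trainAndClassify demonstrates the call pattern: train first with labelled
// text, then classify unseen input. The labels and documents here are
// purely illustrative.
func trainAndClassify(c Classifier) {
	if err := c.TrainString("cheap pills, buy now", "spam"); err != nil {
		log.Fatal(err)
	}
	if err := c.TrainString("meeting notes for tomorrow", "ham"); err != nil {
		log.Fatal(err)
	}
	// Classify accepts any io.Reader, e.g. a strings.Reader.
	class, err := c.Classify(strings.NewReader("buy cheap pills"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("classified as:", class)
}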
type StdOption
type StdOption func(*StdTokenizer)
StdOption provides configuration settings for a StdTokenizer.
func BufferSize
BufferSize adjusts the size of the buffered channel.
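BufferSize's exact signature is not shown in this excerpt. The following self-contained sketch shows how a StdOption such as BufferSize typically works under the functional-options pattern; the lower-case stand-in types, the int parameter, and the default buffer size are all assumptions.

package main

import "fmt"

// stdTokenizer is a stand-in for StdTokenizer; the real type's fields are
// unexported and not shown in this documentation.
type stdTokenizer struct {
	bufferSize int
}

// stdOption mirrors type StdOption above: a function that mutates the
// tokenizer being configured.
type stdOption func(*stdTokenizer)

// bufferSize sketches how BufferSize likely works: it returns an option
// that adjusts the size of the tokenizer's buffered channel. The parameter
// name and type are assumptions.
func bufferSize(n int) stdOption {
	return func(t *stdTokenizer) {
		t.bufferSize = n
	}
}

// newTokenizer applies each supplied option to a tokenizer with defaults.
func newTokenizer(opts ...stdOption) *stdTokenizer {
	t := &stdTokenizer{bufferSize: 32} // default value is an assumption
	for _, opt := range opts {
		opt(t)
	}
	return t
}

func main() {
	t := newTokenizer(bufferSize(128))
	fmt.Println(t.bufferSize) // 128
}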
type StdTokenizer
type StdTokenizer struct {
// contains filtered or unexported fields
}
StdTokenizer provides a common document tokenizer that splits a document by word boundaries.
func NewTokenizer
func NewTokenizer(opts ...StdOption) *StdTokenizer
NewTokenizer initializes a new standard Tokenizer instance.
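The tokenizer's own methods are not listed in this excerpt, so the following is only a self-contained sketch of what a word-boundary tokenizer with a buffered output channel might look like; the tokenize helper, its use of bufio.ScanWords, and the buffer-size parameter are assumptions rather than the package's actual API.

package main

import (
	"bufio"
	"fmt"
	"strings"
)

// tokenize sketches what a StdTokenizer plausibly does with a document:
// split it on word boundaries and emit each token on a buffered channel
// (the buffer size being what BufferSize adjusts).
func tokenize(doc string, bufferSize int) <-chan string {
	out := make(chan string, bufferSize)
	go func() {
		defer close(out)
		sc := bufio.NewScanner(strings.NewReader(doc))
		sc.Split(bufio.ScanWords)
		for sc.Scan() {
			out <- sc.Text()
		}
	}()
	return out
}

func main() {
	for tok := range tokenize("The quick brown fox", 16) {
		fmt.Println(tok)
	}
}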