Documentation ¶
Overview ¶
Package spell provides fast spelling correction and string segmentation.
Index ¶
- Constants
- type DictionaryOption
- type Entry
- type LookupOption
- func DictionaryOpts(opts ...DictionaryOption) LookupOption
- func DistanceFunc(df func([]rune, []rune, int) int) LookupOption
- func EditDistance(dist uint32) LookupOption
- func PrefixLength(prefixLength uint32) LookupOption
- func SortFunc(sf func(SuggestionList)) LookupOption
- func SuggestionLevel(level suggestionLevel) LookupOption
- type Segment
- type SegmentOption
- type SegmentResult
- type Spell
- func (s *Spell) AddEntry(de Entry, opts ...DictionaryOption) (bool, error)
- func (s *Spell) GetEntry(word string, opts ...DictionaryOption) (*Entry, error)
- func (s *Spell) GetLongestWord() uint32
- func (s *Spell) Lookup(input string, opts ...LookupOption) (SuggestionList, error)
- func (s *Spell) RemoveEntry(word string, opts ...DictionaryOption) (bool, error)
- func (s *Spell) Save(filename string) error
- func (s *Spell) Segment(input string, opts ...SegmentOption) (*SegmentResult, error)
- type Suggestion
- type SuggestionList
- type WordData
Constants ¶
const (
	// LevelBest will yield 'best' suggestion.
	LevelBest suggestionLevel = iota
	// LevelClosest will yield closest suggestions.
	LevelClosest
	// LevelAll will yield all suggestions.
	LevelAll
)
Suggestion Levels used during Lookup.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type DictionaryOption ¶
type DictionaryOption func(*dictOptions) error
DictionaryOption is a function that controls the dictionary being used. An error will be returned if a dictionary option is invalid.
func DictionaryName ¶
func DictionaryName(name string) DictionaryOption
DictionaryName defines the name of the dictionary that should be used when storing, deleting, looking up words, etc. If not set, the default dictionary will be used.
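The option types in this package follow Go's functional-options pattern: each option is a function that mutates an unexported settings struct and may return an error. The sketch below shows the pattern in isolation. The names `options`, `Option`, `Name`, and `resolve` are hypothetical stand-ins, not the package's identifiers, and rejecting an empty name is an assumption made only to show the error path.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a hypothetical stand-in for the package's unexported dictOptions.
type options struct {
	name string
}

// Option mirrors the shape of DictionaryOption: it mutates the options and
// may reject an invalid value.
type Option func(*options) error

// Name selects a dictionary by name. Rejecting an empty name is an
// assumption made for this sketch.
func Name(name string) Option {
	return func(o *options) error {
		if name == "" {
			return errors.New("dictionary name must not be empty")
		}
		o.name = name
		return nil
	}
}

// resolve starts from a default dictionary name and applies each option in
// order, stopping at the first error.
func resolve(opts ...Option) (options, error) {
	o := options{name: "default"}
	for _, opt := range opts {
		if err := opt(&o); err != nil {
			return options{}, err
		}
	}
	return o, nil
}

func main() {
	o, _ := resolve()
	fmt.Println(o.name) // no option given, so the default name is kept

	o, _ = resolve(Name("french"))
	fmt.Println(o.name)

	_, err := resolve(Name(""))
	fmt.Println(err != nil)
}
```

Because options are variadic, callers that are happy with the default dictionary pass nothing at all, which matches the "If not set, the default dictionary will be used" behaviour described above.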
type Entry ¶
type Entry struct {
	Frequency uint64 `json:",omitempty"`
	Word      string
	WordData  WordData `json:",omitempty"`
}
Entry represents a word in the dictionary.
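The struct tags on Entry affect how it is encoded with encoding/json. The sketch below uses a local `entry` type (a hypothetical mirror of the Frequency and Word fields only; WordData is left out because its definition is not shown on this page) to illustrate what `omitempty` does to a zero Frequency.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// entry mirrors the Frequency and Word fields of Entry, including the
// omitempty tag. It is a local stand-in, not the package's type.
type entry struct {
	Frequency uint64 `json:",omitempty"`
	Word      string
}

// encode returns the JSON form of e, or an empty string if encoding fails.
func encode(e entry) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(entry{Frequency: 10, Word: "example"}))
	fmt.Println(encode(entry{Word: "example"})) // zero Frequency is omitted
}
```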
type LookupOption ¶
type LookupOption func(*lookupParams) error
LookupOption is a function that controls how a Lookup is performed. An error will be returned if the LookupOption is invalid.
func DictionaryOpts ¶
func DictionaryOpts(opts ...DictionaryOption) LookupOption
DictionaryOpts accepts one or more DictionaryOption values and controls which dictionary is used during lookup.
func DistanceFunc ¶
func DistanceFunc(df func([]rune, []rune, int) int) LookupOption
DistanceFunc accepts a function, f(str1, str2, maxDist), which calculates the distance between two strings. It should return -1 if the distance between the strings is greater than maxDist.
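As an illustration of that contract, here is a minimal sketch of a plain Levenshtein distance with the required signature, returning -1 once the distance exceeds maxDist. The function name `boundedLevenshtein` and the helper `min3` are made up for this example, and the implementation computes the full table rather than exiting early, so it favours clarity over speed.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// boundedLevenshtein returns the Levenshtein distance between a and b, or -1
// if that distance is greater than maxDist.
func boundedLevenshtein(a, b []rune, maxDist int) int {
	// prev[j] holds the distance between the processed prefix of a and b[:j].
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		curr[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			// Deletion, insertion, or substitution/match.
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	if d := prev[len(b)]; d <= maxDist {
		return d
	}
	return -1
}

func main() {
	fmt.Println(boundedLevenshtein([]rune("kitten"), []rune("sitting"), 3))
	fmt.Println(boundedLevenshtein([]rune("kitten"), []rune("sitting"), 2))
}
```

A function with this signature has the shape DistanceFunc expects, so it could be passed as `spell.DistanceFunc(boundedLevenshtein)`.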
func EditDistance ¶
func EditDistance(dist uint32) LookupOption
EditDistance allows the max edit distance to be set for the Lookup. Reducing the edit distance will improve lookup performance.
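To give a feel for why a smaller distance is cheaper: the comment on the Spell struct's MaxEditDistance field describes deletes being performed on each dictionary word. Assuming candidate generation works by deleting characters (this page does not spell out the algorithm), the sketch below counts how many delete variants a single word produces as the distance grows. The function `deletes` is hypothetical and is not part of the package.

```go
package main

import "fmt"

// deletes returns the distinct strings reachable from word by removing
// between 1 and maxDist runes.
func deletes(word string, maxDist int) map[string]struct{} {
	seen := make(map[string]struct{})
	frontier := []string{word}
	for d := 0; d < maxDist; d++ {
		var next []string
		for _, w := range frontier {
			r := []rune(w)
			for i := range r {
				// Remove the rune at position i.
				v := string(r[:i]) + string(r[i+1:])
				if _, ok := seen[v]; !ok {
					seen[v] = struct{}{}
					next = append(next, v)
				}
			}
		}
		frontier = next
	}
	return seen
}

func main() {
	for d := 1; d <= 3; d++ {
		fmt.Printf("max distance %d: %d variants\n", d, len(deletes("example", d)))
	}
}
```

The number of variants grows quickly with the distance, so each extra unit of edit distance means considerably more candidates to generate and compare.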
func PrefixLength ¶
func PrefixLength(prefixLength uint32) LookupOption
PrefixLength defines how many leading characters of the input word are used for the lookup.
func SortFunc ¶
func SortFunc(sf func(SuggestionList)) LookupOption
SortFunc allows the sorting of the SuggestionList to be configured. By default, suggestions will be sorted by their edit distance, then their frequency.
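The default ordering described above can be sketched with the standard sort package. The local `suggestion` type is a hypothetical stand-in for the fields the ordering needs, and breaking ties by descending frequency (more frequent words first) is an assumption; the text only says frequency is the second key.

```go
package main

import (
	"fmt"
	"sort"
)

// suggestion is a hypothetical stand-in carrying the fields the ordering uses.
type suggestion struct {
	Word      string
	Distance  int
	Frequency uint64
}

// sortByDistanceThenFrequency orders by ascending distance. Ties are broken
// by descending frequency, which is an assumption of this sketch.
func sortByDistanceThenFrequency(sl []suggestion) {
	sort.SliceStable(sl, func(i, j int) bool {
		if sl[i].Distance != sl[j].Distance {
			return sl[i].Distance < sl[j].Distance
		}
		return sl[i].Frequency > sl[j].Frequency
	})
}

func main() {
	sl := []suggestion{
		{Word: "far", Distance: 2, Frequency: 100},
		{Word: "rare", Distance: 1, Frequency: 1},
		{Word: "common", Distance: 1, Frequency: 50},
	}
	sortByDistanceThenFrequency(sl)
	for _, s := range sl {
		fmt.Println(s.Word, s.Distance, s.Frequency)
	}
}
```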
func SuggestionLevel ¶
func SuggestionLevel(level suggestionLevel) LookupOption
SuggestionLevel defines how many results are returned for the lookup. See the package constants for the levels available.
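One plausible reading of the three levels, sketched over an already sorted list: "best" keeps only the first suggestion, "closest" keeps every suggestion that shares the smallest distance, and "all" keeps everything. The types and the `filter` function below are hypothetical, and the exact semantics are an assumption based on the constant comments rather than on the package's code.

```go
package main

import "fmt"

// level is a local stand-in for the package's unexported suggestionLevel.
type level int

const (
	best level = iota
	closest
	all
)

// suggestion is a hypothetical stand-in with only the fields used here.
type suggestion struct {
	Word     string
	Distance int
}

// filter assumes sl is already sorted by ascending distance.
func filter(sl []suggestion, l level) []suggestion {
	if len(sl) == 0 || l == all {
		return sl
	}
	if l == best {
		return sl[:1]
	}
	// closest: keep the leading run that shares the smallest distance.
	n := 1
	for n < len(sl) && sl[n].Distance == sl[0].Distance {
		n++
	}
	return sl[:n]
}

func main() {
	sl := []suggestion{
		{Word: "example", Distance: 1},
		{Word: "examples", Distance: 1},
		{Word: "exemplar", Distance: 2},
	}
	fmt.Println(len(filter(sl, best)), len(filter(sl, closest)), len(filter(sl, all)))
}
```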
type SegmentOption ¶
type SegmentOption func(*segmentParams) error
SegmentOption is a function that controls how a Segment is performed. An error will be returned if the SegmentOption is invalid.
func SegmentLookupOpts ¶
func SegmentLookupOpts(opt ...LookupOption) SegmentOption
SegmentLookupOpts allows the Lookup() options for the current segmentation to be configured.
type SegmentResult ¶
SegmentResult holds the result of a call to Segment().
func (SegmentResult) GetWords ¶
func (s SegmentResult) GetWords() []string
GetWords returns a string slice of words for the segments.
func (SegmentResult) String ¶
func (s SegmentResult) String() string
String returns a string representation of the SegmentResult.
type Spell ¶
type Spell struct {
	// The max number of deletes that will be performed to each word in the
	// dictionary
	MaxEditDistance uint32

	// The prefix length that will be examined
	PrefixLength uint32
	// contains filtered or unexported fields
}
Spell provides access to functions for spelling correction.
func Load ¶
Load reads a dictionary from the file at filename. It returns a new Spell instance on success, or an error if there is a problem reading the file.
func (*Spell) AddEntry ¶
func (s *Spell) AddEntry(de Entry, opts ...DictionaryOption) (bool, error)
AddEntry adds an entry to the dictionary. If the word already exists, its data will be overwritten. Returns true if a new word was added, false otherwise. Returns an error if there was a problem adding the word.
Example ¶
package main

import (
	"fmt"

	"github.com/eskriett/spell"
)

func main() {
	// Create a new speller
	s := spell.New()

	// Add a new word, "example" to the dictionary
	_, _ = s.AddEntry(spell.Entry{
		Frequency: 10,
		Word:      "example",
	})

	// Overwrite the data for word "example"
	_, _ = s.AddEntry(spell.Entry{
		Frequency: 100,
		Word:      "example",
	})

	// Output the frequency for word "example"
	entry, _ := s.GetEntry("example")
	fmt.Printf("Output for word 'example' is: %v\n", entry.Frequency)
}
Output: Output for word 'example' is: 100
func (*Spell) GetEntry ¶
func (s *Spell) GetEntry(word string, opts ...DictionaryOption) (*Entry, error)
GetEntry returns the Entry for word. If a word does not exist, nil will be returned.
func (*Spell) GetLongestWord ¶
func (s *Spell) GetLongestWord() uint32
GetLongestWord returns the length of the longest word in the dictionary.
func (*Spell) Lookup ¶
func (s *Spell) Lookup(input string, opts ...LookupOption) (SuggestionList, error)
Lookup takes an input and returns suggestions from the dictionary for that word. By default, it will return the best suggestion for the word if it exists.
Accepts zero or more LookupOption that can be used to configure how lookup occurs.
Example ¶
package main

import (
	"fmt"

	"github.com/eskriett/spell"
)

func main() {
	// Create a new speller
	s := spell.New()
	_, _ = s.AddEntry(spell.Entry{
		Frequency: 1,
		Word:      "example",
	})

	// Perform a default lookup for example
	suggestions, _ := s.Lookup("eample")
	fmt.Printf("Suggestions are: %v\n", suggestions)
}
Output: Suggestions are: [example]
Example (ConfigureDistanceFunc) ¶
package main

import (
	"github.com/eskriett/spell"
	"github.com/eskriett/strmet"
)

func main() {
	// Create a new speller
	s := spell.New()
	_, _ = s.AddEntry(spell.Entry{
		Frequency: 1,
		Word:      "example",
	})

	// Configure the Lookup to use Levenshtein distance rather than the default
	// Damerau Levenshtein calculation
	_, _ = s.Lookup("example", spell.DistanceFunc(func(r1, r2 []rune, maxDist int) int {
		// Call the Levenshtein function from github.com/eskriett/strmet
		return strmet.LevenshteinRunes(r1, r2, maxDist)
	}))
}
Output:
Example (ConfigureEditDistance) ¶
package main

import (
	"fmt"

	"github.com/eskriett/spell"
)

func main() {
	// Create a new speller
	s := spell.New()
	_, _ = s.AddEntry(spell.Entry{
		Frequency: 1,
		Word:      "example",
	})

	// Lookup exact matches, i.e. edit distance = 0
	suggestions, _ := s.Lookup("eample", spell.EditDistance(0))
	fmt.Printf("Suggestions are: %v\n", suggestions)
}
Output: Suggestions are: []
Example (ConfigureSortFunc) ¶
package main

import (
	"sort"

	"github.com/eskriett/spell"
)

func main() {
	// Create a new speller
	s := spell.New()
	_, _ = s.AddEntry(spell.Entry{
		Frequency: 1,
		Word:      "example",
	})

	// Configure suggestions to be sorted solely by their frequency
	_, _ = s.Lookup("example", spell.SortFunc(func(sl spell.SuggestionList) {
		sort.Slice(sl, func(i, j int) bool {
			return sl[i].Frequency < sl[j].Frequency
		})
	}))
}
Output:
func (*Spell) RemoveEntry ¶
func (s *Spell) RemoveEntry(word string, opts ...DictionaryOption) (bool, error)
RemoveEntry removes an entry from the dictionary. Returns true if the entry was removed, false otherwise.
func (*Spell) Segment ¶
func (s *Spell) Segment(input string, opts ...SegmentOption) (*SegmentResult, error)
Segment takes an input string which may have word concatenations, and attempts to divide it into the most likely set of words by adding spaces at the most appropriate positions.
Accepts zero or more SegmentOption that can be used to configure how segmentation occurs.
Example ¶
package main

import (
	"fmt"

	"github.com/eskriett/spell"
)

func main() {
	// Create a new speller
	s := spell.New()
	_, _ = s.AddEntry(spell.Entry{Frequency: 1, Word: "the"})
	_, _ = s.AddEntry(spell.Entry{Frequency: 1, Word: "quick"})
	_, _ = s.AddEntry(spell.Entry{Frequency: 1, Word: "brown"})
	_, _ = s.AddEntry(spell.Entry{Frequency: 1, Word: "fox"})

	// Segment a string with words concatenated together
	segmentResult, _ := s.Segment("thequickbrownfox")
	fmt.Println(segmentResult)
}
Output: the quick brown fox
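For intuition about how a concatenated string can be divided into words, the sketch below uses a small dynamic-programming pass over a word set. It is a deliberately simplified stand-in, not the package's algorithm: it performs no spelling correction, ignores word frequency, and simply prefers the split with the fewest dictionary words. The function `segment` and its `dict` parameter are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// segment splits input into dictionary words using dynamic programming and
// prefers the split with the fewest words. It returns false if no split exists.
func segment(input string, dict map[string]bool) ([]string, bool) {
	r := []rune(input)
	n := len(r)
	// count[i] is the fewest words covering r[:i], or -1 if unreachable.
	count := make([]int, n+1)
	// prev[i] is the start index of the last word in the best split of r[:i].
	prev := make([]int, n+1)
	for i := 1; i <= n; i++ {
		count[i] = -1
		for j := 0; j < i; j++ {
			if count[j] < 0 || !dict[string(r[j:i])] {
				continue
			}
			if count[i] < 0 || count[j]+1 < count[i] {
				count[i] = count[j] + 1
				prev[i] = j
			}
		}
	}
	if count[n] < 0 {
		return nil, false
	}
	// Walk back through prev to recover the words in order.
	words := make([]string, count[n])
	for i, k := n, count[n]-1; i > 0; i, k = prev[i], k-1 {
		words[k] = string(r[prev[i]:i])
	}
	return words, true
}

func main() {
	dict := map[string]bool{"the": true, "quick": true, "brown": true, "fox": true}
	words, ok := segment("thequickbrownfox", dict)
	fmt.Println(strings.Join(words, " "), ok)
}
```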
type Suggestion ¶
type Suggestion struct {
	// The distance between this suggestion and the input word
	Distance int
	Entry
}
Suggestion is used to represent a suggested word from a lookup.
type SuggestionList ¶
type SuggestionList []Suggestion
SuggestionList is a slice of Suggestion.
func (SuggestionList) GetWords ¶
func (s SuggestionList) GetWords() []string
GetWords returns a string slice of words for the suggestions.
func (SuggestionList) String ¶
func (s SuggestionList) String() string
String returns a string representation of the SuggestionList.