nicenshtein

Version: v0.0.0-...-3df51a9
Published: Apr 16, 2018 License: MIT Imports: 4 Imported by: 0

README

Note: this was mostly meant for me to learn some basic Go. Right now it is really slow for distances > 2 (slower than the naive approach).

Nicenshtein

Efficiently index and search a dictionary by Levenshtein distance. This is done by building a trie (prefix tree) as an index and then walking the trie to collect all words within a given distance. The walk keeps track of the number of edits made so far and follows multiple paths at the same time until all edits are consumed.
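The walk described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the `node` type and the `add` and `collect` functions are hypothetical names, and the sketch simply tries every match, substitution, insertion, and deletion while the edit budget lasts.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical trie node. A non-empty word marks the end of a word.
type node struct {
	children map[rune]*node
	word     string
}

func newNode() *node { return &node{children: map[rune]*node{}} }

// add inserts word into the trie rooted at root, one rune per level.
func add(root *node, word string) {
	n := root
	for _, r := range word {
		child, ok := n.children[r]
		if !ok {
			child = newNode()
			n.children[r] = child
		}
		n = child
	}
	n.word = word
}

// collect walks the trie with at most budget edits and records, for every
// reachable word, the smallest number of edits that reached it.
func collect(n *node, rest []rune, used, budget int, out map[string]int) {
	if used > budget {
		return
	}
	if len(rest) == 0 && n.word != "" {
		if old, ok := out[n.word]; !ok || used < old {
			out[n.word] = used
		}
	}
	for r, child := range n.children {
		// Insertion: take a trie rune without consuming a query rune.
		collect(child, rest, used+1, budget, out)
		if len(rest) > 0 {
			// Match costs nothing; substitution costs one edit.
			cost := 1
			if r == rest[0] {
				cost = 0
			}
			collect(child, rest[1:], used+cost, budget, out)
		}
	}
	if len(rest) > 0 {
		// Deletion: skip a query rune without moving in the trie.
		collect(n, rest[1:], used+1, budget, out)
	}
}

func main() {
	root := newNode()
	for _, w := range []string{"cat", "cart", "car", "bat", "dog"} {
		add(root, w)
	}
	out := map[string]int{}
	collect(root, []rune("cat"), 0, 1, out)

	words := make([]string, 0, len(out))
	for w := range out {
		words = append(words, w)
	}
	sort.Strings(words)
	for _, w := range words {
		fmt.Println(w, out[w])
	}
}
```

With a budget of one edit, the query "cat" reaches "cat", "cart", "car", and "bat" but not "dog". Because every edit path is explored, the same word can be reached several times, which is one plausible reason a walk like this slows down quickly as the allowed distance grows.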

It is safe to use with UTF-8 strings because it operates on runes internally.
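To illustrate why runes matter here, the following small sketch compares byte length with rune length for a word containing a non-ASCII character. The `runeLen` helper is a hypothetical name used only for this example; an edit distance computed over bytes would count a change to a multi-byte character as more than one edit.

```go
package main

import "fmt"

// runeLen counts Unicode code points instead of bytes.
func runeLen(s string) int {
	return len([]rune(s))
}

func main() {
	word := "h\u00e9llo" // "héllo", where é is encoded as two bytes in UTF-8

	fmt.Println(len(word))     // number of bytes
	fmt.Println(runeLen(word)) // number of runes

	// Ranging over a string yields runes, not bytes.
	for i, r := range word {
		fmt.Printf("byte offset %d: %c\n", i, r)
	}
}
```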

Check out nicenshtein-server as well; it has a live demo at https://nicenshtein.now.sh.

API

NewNicenshtein()

Returns a new instance of a Nicenshtein index with the following methods:

IndexFile(filePath string): error

Indexes every single line in the given file using AddWord.

AddWord(word string)

Adds a word to the index.

ContainsWord(word string): bool

Returns whether or not the index contains the given word.

CollectWords(out *map[string]int, word string, maxDistance int)

Fills out (a map from words to their distances) with all words that are within maxDistance of word.

Documentation

Types

type Nicenshtein

type Nicenshtein struct {
	// contains filtered or unexported fields
}

func NewNicenshtein

func NewNicenshtein() Nicenshtein

func (*Nicenshtein) AddWord

func (nice *Nicenshtein) AddWord(word string)

func (*Nicenshtein) CollectWords

func (nice *Nicenshtein) CollectWords(out *map[string]int, word string, maxDistance int)

func (*Nicenshtein) ContainsWord

func (nice *Nicenshtein) ContainsWord(word string) bool

func (*Nicenshtein) IndexFile

func (nice *Nicenshtein) IndexFile(filePath string) error

type RuneNode

type RuneNode struct {
	// contains filtered or unexported fields
}

A trie structure that maps runes to a list of following (child) runes. `word` serves two purposes:

1. If it is not an empty string, it marks the end of a word, like a flag.
2. It stores the word that the path to it spells.
