speyl

package module
v0.0.0-...-96e0f7f Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 22, 2025 License: MIT Imports: 7 Imported by: 0

README

Speyl

A Golang Based spell checker

Installation

Add:

import "github.com/Descent098/speyl"

To your go file, then run go mod tidy to download. Here's a minimal example:

package main

import (
	"fmt"

	"github.com/Descent098/speyl"
)

func main() {
	validWords := []string{"hi", "hello", "bonjour", "alumni"}

	inputWord := "alumni"
	s := speyl.SuggestWord(inputWord, validWords) // Returns algorithms.Suggestion{Likelihood: 1.0, Word: "alumni"}
	// Prints: alumni is alumni with %100.00 Likelihood
	fmt.Printf("%s is %s with %%%.2f Likelihood\n", inputWord, s.Word, s.Likelihood*100)

	misspeltWord := "almni"
	s = speyl.SuggestWord(misspeltWord, validWords) // Returns algorithms.Suggestion{Likelihood: 0.944, Word: "alumni"}
	// Prints: almni is alumni with %94.44 Likelihood
	fmt.Printf("%s is %s with %%%.2f Likelihood\n", misspeltWord, s.Word, s.Likelihood*100)
}

Usage

The generally recommended usage is to use the SuggestWord() function in the main package, this uses the Jaro Similarity, and is the fastest algorithm by far:

func SuggestWord(word string, validWords []string) algorithms.Suggestion{}

Which returns a struct:

type Suggestion struct {
	Likelihood float32 // How confident the suggestion is
	Word       string  // The suggested word
}

If you want to use a different algorithm you can do so with:

func SuggestWordWithSpecificAlgorithm(word string, validWords []string, algorithm algorithms.SimilarityAlgorithm) algorithms.Suggestion {}

You can find the various available SimilarityAlgorithm's in the algorithm package.

Performance

Below is the performance tests of the various algorithms and their implementations. They were tested using words.txt a corpus of ~370,000 words. There were two separate tests. The first was the synchronus execution using algorithms.SuggestWord().

Algorithm Time Taken (miliseconds)
Jaro 27
Levenshtein (Dynamic Programming) 105
Indel 1,975
Levenshtein (Recursive Damerau) 2,997
Levenshtein (Recursive) 18,078

You may assume that SuggestWord() is concurrent, but actually because of the execution speed of the Jaro and Dynamic Programming Levenshtein, it was slower to do the task asynchronously than synchronously. Since these would be the two that are most likely to get practical use I left the main implementations synchronous. If your list of valid words is over ~1 mil it might be worth implementing the concurrent version yourself. Here the code I tried (I also tried with channels and it was still slower):

func SuggestWord(word string, validWords []string, algorithm algorithms.SimilarityAlgorithm) algorithms.Suggestion {
	writeResultLock := sync.Mutex{}

	highestRatio := float32(0)
	currentSuggestion := ""

	wg := sync.WaitGroup{}

	for _, currentWord := range validWords {
		wg.Add(1)
		go func(wordToCompare string) {
			defer wg.Done()
			defer writeResultLock.Unlock()
			res := algorithm(word, wordToCompare)
			writeResultLock.Lock()
			if res > highestRatio {
				highestRatio = res
				currentSuggestion = currentWord
			}
		}(currentWord)

	}
	wg.Wait()
	return algorithms.Suggestion{
		Likelihood: highestRatio,
		Word:       currentSuggestion,
	}
}
Algorithm Time Taken (miliseconds)
Jaro 70
Levenshtein (Dynamic Programming) 163
Indel 205
Levenshtein (Recursive Damerau) 620
Levenshtein (Recursive) 1542

Documentation

Overview

A library to help calculate the similarity between two strings

Functions

CheckSimilarity(): Check the similarity of a word to a corpus of validWords
LoadWords(): Returns a slice of a huge number of words (>350,000)

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func LoadPremadeWords

func LoadPremadeWords() []string

Helper function to load a default corpus of over 350,000 words

Returns

[]string: A slice with the words in the corpus

func SuggestWord

func SuggestWord(word string, validWords []string) algorithms.Suggestion

Used to get a suggestion using Jaro Similarity, fastest solution available

Parameters

inputWord (string): The word to find a similar word for
validWords ([]string): A slice with the words that are considered valid

Returns

Suggestion: A suggestion struct with the word and it's likelihood

func SuggestWordWithSpecificAlgorithm

func SuggestWordWithSpecificAlgorithm(word string, validWords []string, algorithm algorithms.SimilarityAlgorithm) algorithms.Suggestion

Used to get a suggested word with a specific algorithm

Parameters

inputWord (string): The word to find a similar word for
validWords ([]string): A slice with the words in the corpus
algorithm (algorithms.SimilarityAlgorithm): The algorithm to use to calculate the similarity of the words

Returns

algorithms.Suggestion: A pointer to the suggestion struct with the word and it's likelihood

Types

This section is empty.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL