spellchecker

package module
v3.0.0 Latest
Published: Dec 18, 2025 License: MIT Imports: 13 Imported by: 0

README

Spellchecker

Yet another spellchecker written in Go.

Features:

  • very compact database: ~1 MB for 30,000 unique words
  • average time to fix a single word: ~35 µs
  • achieves about 71–74% accuracy on Peter Norvig’s test sets (see benchmarks)
  • no built-in dictionary — you can provide any custom words, and the spellchecker will only know them

Installation

go get -v github.com/f1monkey/spellchecker/v3

Usage

Quick start
  1. Initialize the spellchecker. You need to pass an alphabet: a set of allowed characters that will be used for indexing and primary word checks. (All other characters will be ignored for these operations.)
	// Create a new instance
	sc, err := spellchecker.New(
		"abcdefghijklmnopqrstuvwxyz1234567890", // allowed symbols, other symbols will be ignored
	)
	if err != nil {
		panic(err)
	}
  2. Add some words to the dictionary:

    1. From any io.Reader:
    	in, _ := os.Open("data/sample.txt")
    	sc.AddFrom(in)
    
    2. Or add words manually:
    	sc.AddMany([]string{"lock", "stock", "and", "two", "smoking"})
    	sc.Add("barrels")
    
  3. Use the spellchecker:

    1. Check if a word is correct:
    	result := sc.IsCorrect("stock")
    	fmt.Println(result) // true
    
    2. Suggest corrections:
    	// Find up to 10 suggestions for a word
    	matches := sc.Suggest("rang", 10)
    	fmt.Println(matches.Suggestions) // e.g. range, orange (each with its score)
    
Options

The spellchecker supports customizable options for both searching/suggesting corrections and adding words to the dictionary.

Search/Suggestion Options

These options are created with the SuggestWith... helpers and passed to the Suggest method.

  • SuggestWithMaxErrors(maxErrors int)
    Sets the maximum allowed edit distance (in "bits") between the input word and dictionary candidates.

    • Deletion: 1 bit (e.g., "proble" → "problem")
    • Insertion: 1 bit (e.g., "problemm" → "problem")
    • Substitution: 2 bits (e.g., "problam" → "problem")
    • Transposition: 0 bits (e.g., "problme" → "problem")

    Default: 2. Increasing this value beyond 2 is not recommended as it can significantly degrade performance.

  • SuggestWithFilterFunc(f FilterFunc)
    Replaces the default scoring/filtering function with a custom one.
    The function receives:

    • src: runes of the input word
    • candidate: runes of the dictionary word
    • count: frequency count of the candidate in the dictionary

    It must return:

    • a float64 score (higher = better suggestion)
    • a bool indicating whether the candidate should be kept

    The default filter uses Levenshtein distance (with costs: insert/delete=1, substitute=1, transpose=1), filters out candidates exceeding maxErrors, and boosts score based on word frequency and shared prefix/suffix length.

Example usage:

matches := sc.Suggest(
	"rang",
	10,
	spellchecker.SuggestWithMaxErrors(1),
	spellchecker.SuggestWithFilterFunc(myCustomFilter),
)
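
The myCustomFilter passed above can be any function with the FilterFunc signature. As a minimal sketch (the name samePrefixFilter and its scoring rule are hypothetical and not part of the package), the filter below keeps only candidates that start with the same letter as the input and differ in length by at most one rune, and ranks them by the logarithm of their frequency count:

```go
package main

import (
	"fmt"
	"math"
)

// samePrefixFilter is a hypothetical filter that matches the FilterFunc
// signature: func(src, candidate []rune, count uint) (float64, bool).
func samePrefixFilter(src, candidate []rune, count uint) (float64, bool) {
	// Drop candidates that do not share the first letter with the input.
	if len(src) == 0 || len(candidate) == 0 || src[0] != candidate[0] {
		return 0, false
	}
	// Drop candidates whose length differs by more than one rune.
	diff := len(src) - len(candidate)
	if diff < 0 {
		diff = -diff
	}
	if diff > 1 {
		return 0, false
	}
	// More frequent words score higher; a length mismatch costs one point.
	return math.Log1p(float64(count)) - float64(diff), true
}

func main() {
	for _, candidate := range []string{"range", "orange", "rank"} {
		score, keep := samePrefixFilter([]rune("rang"), []rune(candidate), 10)
		fmt.Println(candidate, score, keep)
	}
}
```

With the package imported, such a function would be passed as spellchecker.SuggestWithFilterFunc(samePrefixFilter). Because it replaces the default filter, it would also need to enforce any edit-distance limit it cares about itself.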
Add Options

These options are passed to Add, AddMany, or AddFrom.

  • AddWithWeight(weight uint) Sets the frequency weight for added word(s). Higher weight increases the chance that the word will appear higher in suggestion results. Default: 1.

  • AddWithSplitter(splitter bufio.SplitFunc) Customizes how AddFrom(reader) splits the input stream into words.

    The default splitter:

    • Uses bufio.ScanWords as base
    • Converts to lowercase
    • Keeps only sequences matching [-\pL]+ (letters and hyphens)

Example:

sc.AddFrom(
	file,
	spellchecker.AddWithWeight(10),          // these words are very common
	spellchecker.AddWithSplitter(customSplitter),
)

sc.AddMany([]string{"hello", "world"},
	spellchecker.AddWithWeight(5),
)
Save/load
	sc, err := spellchecker.New("abc")
	if err != nil {
		panic(err)
	}

	// Save data to any io.Writer
	out, err := os.Create("data/out.bin")
	if err != nil {
		panic(err)
	}
	if err := sc.Save(out); err != nil {
		panic(err)
	}
	out.Close()

	// Load data back from io.Reader
	in, err := os.Open("data/out.bin")
	if err != nil {
		panic(err)
	}
	sc, err = spellchecker.Load(in)
	if err != nil {
		panic(err)
	}

Benchmarks

Tests are based on data from Peter Norvig's article about spelling correction.

Test set 1:
Running tool: /usr/bin/go test -benchmem -run=^$ -bench ^Benchmark_Norvig1$ github.com/f1monkey/spellchecker -count=1

goos: linux
goarch: amd64
pkg: github.com/f1monkey/spellchecker
cpu: 13th Gen Intel(R) Core(TM) i9-13980HX
Benchmark_Norvig1-32    	     357	   3305052 ns/op	        74.44 success_percent	       201.0 success_words	       270.0 total_words	  768899 B/op	   13302 allocs/op
PASS
ok  	github.com/f1monkey/spellchecker	3.801s
Test set 2:
Running tool: /usr/bin/go test -benchmem -run=^$ -bench ^Benchmark_Norvig2$ github.com/f1monkey/spellchecker -count=1

goos: linux
goarch: amd64
pkg: github.com/f1monkey/spellchecker
cpu: 13th Gen Intel(R) Core(TM) i9-13980HX
Benchmark_Norvig2-32    	     236	   5257185 ns/op	        71.25 success_percent	       285.0 success_words	       400.0 total_words	 1201260 B/op	   19346 allocs/op
PASS
ok  	github.com/f1monkey/spellchecker	4.350s

Documentation

Constants

const DefaultAlphabet = "abcdefghijklmnopqrstuvwxyz"
const DefaultMaxErrors = 2

Variables

This section is empty.

Functions

This section is empty.

Types

type AddOptionFunc

type AddOptionFunc func(opts *addOptions)

func AddWithSplitter

func AddWithSplitter(splitter bufio.SplitFunc) AddOptionFunc

AddWithSplitter sets the split function that AddFrom uses to break its reader into words.

func AddWithWeight

func AddWithWeight(weight uint) AddOptionFunc

AddWithWeight sets weight for added words. The weight increases the likelihood that the word will be chosen as a correction.

type FilterFunc

type FilterFunc func(src, candidate []rune, count uint) (float64, bool)

FilterFunc compares the source word with a candidate word. It returns the candidate's score and a boolean flag. If the flag is false, the candidate will be completely filtered out.

type Match

type Match struct {
	Value string
	Score float64
}

type SearchOptionFunc

type SearchOptionFunc func(opts *searchOptions)

func SuggestWithFilterFunc

func SuggestWithFilterFunc(f FilterFunc) SearchOptionFunc

SuggestWithFilterFunc sets a custom FilterFunc that replaces the default scoring/filtering function.

func SuggestWithMaxErrors

func SuggestWithMaxErrors(maxErrors int) SearchOptionFunc

SuggestWithMaxErrors sets the maximum allowed difference in bits between the "search word" and a "dictionary word".

  • deletion is a 1-bit change (proble → problem)
  • insertion is a 1-bit change (problemm → problem)
  • substitution is a 2-bit change (problam → problem)
  • transposition is a 0-bit change (problme → problem)

It is not recommended to set this value greater than 2, as it can significantly affect performance.

type Spellchecker

type Spellchecker struct {
	// contains filtered or unexported fields
}

func Load

func Load(reader io.Reader) (*Spellchecker, error)

Load reads spellchecker data from the provided reader and decodes it.

func New

func New(alphabet string) (*Spellchecker, error)

func (*Spellchecker) Add

func (m *Spellchecker) Add(word string, opts ...AddOptionFunc)

Add adds the provided word to the dictionary.

func (*Spellchecker) AddFrom

func (m *Spellchecker) AddFrom(input io.Reader, opts ...AddOptionFunc) error

AddFrom reads the input, splits it with the spellchecker's splitter func, and adds the resulting words to the dictionary.

func (*Spellchecker) AddMany

func (m *Spellchecker) AddMany(words []string, opts ...AddOptionFunc)

AddMany adds the provided words to the dictionary.

func (*Spellchecker) IsCorrect

func (s *Spellchecker) IsCorrect(word string) bool

IsCorrect checks whether the provided word is in the dictionary.

func (*Spellchecker) Save

func (m *Spellchecker) Save(w io.Writer) error

Save encodes spellchecker data and writes it to the provided writer.

func (*Spellchecker) Suggest

func (s *Spellchecker) Suggest(word string, n int, opts ...SearchOptionFunc) SuggestionResult

Suggest finds the top n suggestions for the word. It returns the spellchecker scores along with the words.

type SuggestionResult

type SuggestionResult struct {
	ExactMatch  bool // if true, the word is correct
	Suggestions []Match
}
