spellcheck

package
v0.42.0 Latest
Published: Jun 15, 2026 License: MIT Imports: 13 Imported by: 0

Documentation

Overview

Package spellcheck provides dictionary-backed spell checking for the composer.

Dictionaries follow the Hunspell .dic format (word list, optional /flags per line). Affix rules are ignored: each base form is added to a flat word set. Dictionaries are downloaded from the wooorm/dictionaries GitHub repository on demand.
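As an illustration of the flat word set described above, here is a minimal sketch of reading .dic content. It is not the package's implementation: parseDic is a hypothetical helper, and skipping a numeric first line (the entry count that Hunspell files usually begin with) and lowercasing entries are assumptions.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseDic (hypothetical) reads Hunspell .dic content into a flat word set.
// Anything after '/' on a line is treated as affix flags and dropped,
// so only base forms are kept.
func parseDic(content string) (map[string]struct{}, error) {
	words := make(map[string]struct{})
	sc := bufio.NewScanner(strings.NewReader(content))
	first := true
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		if first {
			first = false
			// Assumption: a numeric first line is the entry count, not a word.
			if _, err := strconv.Atoi(line); err == nil {
				continue
			}
		}
		// Keep only the base form before any "/flags" suffix.
		if i := strings.IndexByte(line, '/'); i >= 0 {
			line = line[:i]
		}
		if line != "" {
			// Assumption: entries are stored lowercased.
			words[strings.ToLower(line)] = struct{}{}
		}
	}
	if err := sc.Err(); err != nil {
		return nil, fmt.Errorf("read dictionary: %w", err)
	}
	return words, nil
}

func main() {
	set, err := parseDic("3\nhello\nwalk/DGS\nworld/S\n")
	if err != nil {
		panic(err)
	}
	_, ok := set["walk"]
	fmt.Println(len(set), ok) // prints: 3 true
}
```

Because affix rules are ignored, inflected forms such as "walked" would only be recognised if they appear as their own entries.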

Constants

const DefaultLanguage = "en"

DefaultLanguage is the language code installed automatically the first time the composer opens.

Variables

var DictURLTemplate = "https://raw.githubusercontent.com/wooorm/dictionaries/main/dictionaries/%s/index.dic"

DictURLTemplate is the URL used to fetch Hunspell .dic files. It is a variable to allow tests and the CLI to override the source.
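The template contains a single %s verb, so it is presumably expanded with the language code. A minimal sketch of that expansion, and of how an override might look, follows. dictURL is a hypothetical helper, the local address is made up for illustration, and the use of fmt.Sprintf is an assumption about how the package fills in the template.

```go
package main

import "fmt"

// dictURLTemplate copies the documented default value of DictURLTemplate.
const dictURLTemplate = "https://raw.githubusercontent.com/wooorm/dictionaries/main/dictionaries/%s/index.dic"

// dictURL (hypothetical) fills the language code into a template.
// Assumption: the package expands the template in a similar way.
func dictURL(template, lang string) string {
	return fmt.Sprintf(template, lang)
}

func main() {
	fmt.Println(dictURL(dictURLTemplate, "en"))
	// A test or CLI flag could point the template at another source;
	// this address is purely illustrative.
	fmt.Println(dictURL("http://127.0.0.1:8080/%s.dic", "en"))
}
```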

Functions

func DictInstalled

func DictInstalled(lang string) bool

DictInstalled reports whether the dictionary for lang exists on disk.

func DictPath

func DictPath(lang string) (string, error)

DictPath returns the on-disk path for a given language code.

func DictsDir

func DictsDir() (string, error)

DictsDir returns the directory where dictionaries are stored.

func Download

func Download(lang string) (string, error)

Download fetches the dictionary for lang from DictURLTemplate and writes it atomically to the dicts directory.

func EnsureDefault

func EnsureDefault() (string, error)

EnsureDefault downloads the default English dictionary if it is not already installed and returns the language code that is available.

func Highlight

func Highlight(text string, c *Checker, skipLine int) string

Highlight walks rendered text and wraps misspelled words in a red dotted underline. ANSI sequences already present in the input are preserved.

The text is processed line by line. The line at index skipLine is left untouched; pass -1 to highlight every line.
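The line-by-line behaviour can be sketched as follows. This is a simplified stand-in, not the package's code: highlight takes a plain func(string) bool instead of a *Checker, the regular expressions are approximations, and the escape sequences (4:4 for a dotted underline, 58;5;1 for a red underline colour) are assumptions that only some terminals support.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// ansiRe matches CSI escape sequences such as "\x1b[1m".
	ansiRe = regexp.MustCompile(`\x1b\[[0-9;:]*[A-Za-z]`)
	// wordRe matches letter runs with internal apostrophes or hyphens.
	wordRe = regexp.MustCompile(`\p{L}+(?:['-]\p{L}+)*`)
)

const (
	markOn  = "\x1b[4:4m\x1b[58;5;1m" // assumed: dotted underline, red underline colour
	markOff = "\x1b[59m\x1b[24m"      // assumed: reset underline colour, underline off
)

// markWords wraps every word that known rejects in markOn/markOff.
func markWords(s string, known func(string) bool) string {
	return wordRe.ReplaceAllStringFunc(s, func(w string) string {
		if known(w) {
			return w
		}
		return markOn + w + markOff
	})
}

// highlight (hypothetical) processes text line by line, leaves the line at
// index skipLine untouched, and copies existing escape sequences verbatim.
func highlight(text string, known func(string) bool, skipLine int) string {
	lines := strings.Split(text, "\n")
	for n, line := range lines {
		if n == skipLine {
			continue
		}
		var b strings.Builder
		pos := 0
		for _, loc := range ansiRe.FindAllStringIndex(line, -1) {
			b.WriteString(markWords(line[pos:loc[0]], known)) // plain text before the sequence
			b.WriteString(line[loc[0]:loc[1]])                // the sequence itself, unchanged
			pos = loc[1]
		}
		b.WriteString(markWords(line[pos:], known))
		lines[n] = b.String()
	}
	return strings.Join(lines, "\n")
}

func main() {
	known := func(w string) bool { return w == "hello" }
	fmt.Printf("%q\n", highlight("hello wrld\nwrld again", known, 1))
}
```

One limitation of this sketch: a word interrupted by an escape sequence is treated as two separate words, and markOff also switches off any underline that was already active.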

func IsCheckable

func IsCheckable(word string) bool

IsCheckable returns true when the token looks like a natural-language word worth spell-checking. URLs, email-like fragments, numbers, single letters, and all-uppercase short tokens (likely acronyms) are skipped.
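The filtering described above might look roughly like the sketch below. looksCheckable is a hypothetical stand-in for IsCheckable; the exact rules are assumptions, in particular the 5-rune cutoff for acronyms and the choice to reject any token containing a digit.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// looksCheckable (hypothetical) reports whether a token resembles a
// natural-language word worth spell-checking.
func looksCheckable(word string) bool {
	// Empty tokens and single letters carry too little signal.
	if utf8.RuneCountInString(word) < 2 {
		return false
	}
	// URL and email-like fragments.
	if strings.Contains(word, "://") || strings.HasPrefix(word, "www.") || strings.Contains(word, "@") {
		return false
	}
	hasLetter, allUpper := false, true
	for _, r := range word {
		switch {
		case unicode.IsDigit(r):
			// Assumption: any digit disqualifies the token ("2024", "v2").
			return false
		case unicode.IsLetter(r):
			hasLetter = true
			if !unicode.IsUpper(r) {
				allUpper = false
			}
		}
	}
	if !hasLetter {
		return false
	}
	// Short all-uppercase tokens are likely acronyms; 5 is an assumed cutoff.
	if allUpper && utf8.RuneCountInString(word) <= 5 {
		return false
	}
	return true
}

func main() {
	for _, w := range []string{"hello", "NASA", "a", "2024", "user@example.com", "https://example.com"} {
		fmt.Printf("%-24q %v\n", w, looksCheckable(w))
	}
}
```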

Types

type Checker

type Checker struct {
	// contains filtered or unexported fields
}

Checker holds a loaded word set and reports whether tokens are known.

func NewChecker

func NewChecker() *Checker

NewChecker returns an empty checker. Load must be called before Check returns useful results.

func (*Checker) Check

func (c *Checker) Check(word string) bool

Check reports whether the word is recognised. Words that are shorter than 2 runes, numeric, or made up only of punctuation are always treated as correct. Words that contain letter runes outside the loaded dictionary's alphabet (e.g. Cyrillic text against an English dictionary, or accented characters not present in the dictionary's base forms) are also treated as correct, because the dictionary gives no signal for judging them.
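The alphabet rule can be illustrated with a small self-contained sketch. miniChecker is a hypothetical type, not the package's Checker; it assumes the alphabet is simply the set of letter runes seen in the loaded words and that lookups are case-insensitive.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// miniChecker (hypothetical) holds a word set and the letters it uses.
type miniChecker struct {
	words    map[string]struct{}
	alphabet map[rune]struct{}
}

func newMiniChecker(words []string) *miniChecker {
	c := &miniChecker{
		words:    make(map[string]struct{}),
		alphabet: make(map[rune]struct{}),
	}
	for _, w := range words {
		w = strings.ToLower(w)
		c.words[w] = struct{}{}
		for _, r := range w {
			if unicode.IsLetter(r) {
				c.alphabet[r] = struct{}{}
			}
		}
	}
	return c
}

// check returns true for known words and for tokens it cannot judge.
func (c *miniChecker) check(word string) bool {
	lower := strings.ToLower(word)
	if utf8.RuneCountInString(lower) < 2 {
		return true // too short to judge
	}
	hasLetter := false
	for _, r := range lower {
		if !unicode.IsLetter(r) {
			continue
		}
		hasLetter = true
		if _, ok := c.alphabet[r]; !ok {
			return true // letter outside the dictionary's alphabet: no signal
		}
	}
	if !hasLetter {
		return true // numeric or punctuation-only token
	}
	_, ok := c.words[lower]
	return ok
}

func main() {
	c := newMiniChecker([]string{"hello", "world"})
	for _, w := range []string{"Hello", "helo", "привет", "caf\u00e9", "42"} {
		fmt.Printf("%q %v\n", w, c.check(w))
	}
}
```

In this sketch only "helo" is reported as misspelled: it uses letters the dictionary knows but is not in the word set.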

func (*Checker) Language

func (c *Checker) Language() string

Language returns the language code of the loaded dictionary.

func (*Checker) Load

func (c *Checker) Load(path, language string) error

Load reads a dictionary file from disk and replaces the current word set.

func (*Checker) LoadLang

func (c *Checker) LoadLang(lang string) error

LoadLang loads the dictionary for the given language code from the configured dicts directory.

func (*Checker) Loaded

func (c *Checker) Loaded() bool

Loaded reports whether the checker has a dictionary ready.

func (*Checker) Suggest

func (c *Checker) Suggest(word string, limit int) []string

Suggest returns up to limit candidate corrections for word, ranked by ascending edit distance and then alphabetically. It returns nil when the checker has no dictionary loaded or when word is too short.
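The ranking can be sketched with a plain Levenshtein distance. This is an illustration under assumptions, not the package's algorithm: suggest is a hypothetical function, the documentation does not say which edit-distance variant is used, and the maxDist cutoff and the 2-rune minimum are invented for the example.

```go
package main

import (
	"fmt"
	"sort"
)

func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the number of single-rune insertions, deletions,
// and substitutions needed to turn a into b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	cur := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev, cur = cur, prev
	}
	return prev[len(rb)]
}

// suggest (hypothetical) ranks dictionary words by distance, then alphabetically.
func suggest(words []string, word string, limit, maxDist int) []string {
	if limit <= 0 || len([]rune(word)) < 2 {
		return nil
	}
	type cand struct {
		w string
		d int
	}
	var cands []cand
	for _, w := range words {
		if w == word {
			continue // assumption: the word itself is not a suggestion
		}
		if d := levenshtein(word, w); d <= maxDist {
			cands = append(cands, cand{w, d})
		}
	}
	sort.Slice(cands, func(i, j int) bool {
		if cands[i].d != cands[j].d {
			return cands[i].d < cands[j].d
		}
		return cands[i].w < cands[j].w
	})
	if len(cands) > limit {
		cands = cands[:limit]
	}
	var out []string
	for _, c := range cands {
		out = append(out, c.w)
	}
	return out
}

func main() {
	words := []string{"hole", "world", "help", "hello"}
	fmt.Println(suggest(words, "helo", 5, 2)) // prints: [hello help hole]
}
```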

type Token

type Token struct {
	Word  string
	Start int
	End   int
}

Token records a word and its byte offsets inside the original text.

func Tokenize

func Tokenize(s string) []Token

Tokenize splits s into word tokens. A word is a maximal run of letters optionally containing internal apostrophes or hyphens. Leading and trailing connector characters are stripped.
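A tokenizer with these rules can be sketched as a single pass over the string. The code below is a hypothetical re-implementation for illustration: token mirrors the documented Token fields, but which runes count as letters or connectors (here unicode.IsLetter, the ASCII apostrophe, and the ASCII hyphen) is an assumption.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// token mirrors the documented Token fields; offsets are in bytes.
type token struct {
	Word  string
	Start int
	End   int
}

// isConnector reports whether r may appear inside a word (assumed set).
func isConnector(r rune) bool { return r == '\'' || r == '-' }

// tokenize (hypothetical) returns letter runs with internal connectors.
func tokenize(s string) []token {
	var out []token
	i := 0
	for i < len(s) {
		r, size := utf8.DecodeRuneInString(s[i:])
		if !unicode.IsLetter(r) {
			i += size // skips separators and leading connectors
			continue
		}
		// Extend over letters and connectors; end tracks the last letter,
		// so trailing connectors are left out of the token.
		start, end := i, i+size
		j := i + size
		for j < len(s) {
			r, size = utf8.DecodeRuneInString(s[j:])
			if unicode.IsLetter(r) {
				j += size
				end = j
			} else if isConnector(r) {
				j += size
			} else {
				break
			}
		}
		out = append(out, token{Word: s[start:end], Start: start, End: end})
		i = j
	}
	return out
}

func main() {
	for _, t := range tokenize("it's a well-known 'fact'") {
		fmt.Printf("%q %d %d\n", t.Word, t.Start, t.End)
	}
}
```

Because the offsets are byte positions, s[t.Start:t.End] always equals t.Word, including for multi-byte characters.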
