suggest

module
v0.0.0-...-5eefd20 Latest
Published: Jun 8, 2026 License: MIT


Suggest

Library for Top-k Approximate String Matching, autocomplete and spell checking.


The library was mostly inspired by

Library Usage

The library is organized into sub-packages under pkg/. Below are concrete examples for the most common use cases.

1. Fuzzy search

Find the top-K most similar strings in a dictionary:

import (
    "context"
    "fmt"
    "log"

    "github.com/suggest-go/suggest/pkg/suggest"
    "github.com/suggest-go/suggest/pkg/store"
    "github.com/suggest-go/suggest/pkg/metric"
)

func main() {
    ctx := context.Background()

    // Load a dictionary from disk
    source, err := store.OpenStoreFromFile(ctx, "cars.txt")
    if err != nil {
        log.Fatalf("open store: %v", err)
    }
    defer source.Close()

    // Configure the suggester: Jaro-Winkler distance, top-5 results
    config := suggest.Config{
        Source:        source,
        Metric:        metric.NewJaroWinkler(),
        SuggestAmount: 5,
    }
    suggester, err := suggest.New(config)
    if err != nil {
        log.Fatalf("create suggester: %v", err)
    }

    // Query
    results, err := suggester.Suggest(ctx, "teslla model 3")
    if err != nil {
        log.Fatalf("suggest: %v", err)
    }

    for _, r := range results {
        fmt.Printf("  %s (score=%.3f)\n", r.Value, r.Score)
    }
}
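
Top-k retrieval can also be illustrated independently of the library. The following is a minimal sketch, not the library's implementation: it scores every dictionary entry with a toy character-bigram Jaccard similarity and keeps the k best entries in a min-heap. The names candidate, minHeap, bigrams, jaccard and topK are hypothetical and exist only for this example.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// candidate pairs a dictionary entry with its similarity score (hypothetical type).
type candidate struct {
	Value string
	Score float64
}

// minHeap keeps the lowest-scoring candidate at the root so it can be evicted cheaply.
type minHeap []candidate

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].Score < h[j].Score }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(candidate)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// bigrams returns the set of character bigrams of s.
func bigrams(s string) map[string]struct{} {
	r := []rune(s)
	set := make(map[string]struct{})
	for i := 0; i+1 < len(r); i++ {
		set[string(r[i:i+2])] = struct{}{}
	}
	return set
}

// jaccard computes |A and B| / |A or B| over the bigram sets of a and b.
func jaccard(a, b string) float64 {
	x, y := bigrams(a), bigrams(b)
	if len(x) == 0 && len(y) == 0 {
		return 1
	}
	inter := 0
	for g := range x {
		if _, ok := y[g]; ok {
			inter++
		}
	}
	return float64(inter) / float64(len(x)+len(y)-inter)
}

// topK returns up to k entries most similar to query, best first.
func topK(dict []string, query string, k int) []candidate {
	if k <= 0 {
		return nil
	}
	h := &minHeap{}
	for _, w := range dict {
		c := candidate{Value: w, Score: jaccard(query, w)}
		if h.Len() < k {
			heap.Push(h, c)
		} else if c.Score > (*h)[0].Score {
			// Replace the current worst candidate and restore heap order.
			(*h)[0] = c
			heap.Fix(h, 0)
		}
	}
	out := []candidate(*h)
	sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	dict := []string{"tesla model 3", "tesla model s", "toyota corolla", "ford focus"}
	for _, c := range topK(dict, "teslla model 3", 2) {
		fmt.Printf("%s (score=%.3f)\n", c.Value, c.Score)
	}
}
```

This sketch scans the whole dictionary, so it costs O(n log k) per query. An indexed approach, such as the inverted index over n-grams listed under pkg/index, avoids scoring every entry.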

2. Spellchecking

Detect and correct misspelled words based on a language model:

import (
    "context"
    "fmt"

    "github.com/suggest-go/suggest/pkg/spellchecker"
)

func main() {
    ctx := context.Background()

    // Initialize from a pre-built language model directory
    sc, err := spellchecker.New(ctx, "path/to/lm-folder")
    if err != nil {
        panic(err)
    }
    defer sc.Close()

    // Check a single word, asking for up to 5 candidates
    suggestions, err := sc.Suggest(ctx, "recieve", 5)
    if err != nil {
        panic(err)
    }
    for _, s := range suggestions {
        fmt.Printf("  %s\n", s.Value)
    }
    // Expected top suggestion: receive
}

3. Custom metric

Implement your own similarity metric by satisfying the metric.Metric interface:

import "github.com/suggest-go/suggest/pkg/metric"

type MyMetric struct{}

func (m *MyMetric) Compare(a, b string) float64 {
    // Return 1.0 for identical, 0.0 for unrelated
    // ... your custom comparison here ...
    return 0.0
}

func (m *MyMetric) IsSimilar(a, b string, threshold float64) bool {
    return m.Compare(a, b) >= threshold
}

Then plug it into suggest.Config{Metric: &MyMetric{}}.
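
As an illustration, here is a sketch of a metric with the two methods shown above, based on normalized Levenshtein distance. The Metric interface is redeclared locally as an assumption that mirrors the snippet; the actual definition in pkg/metric may differ. LevenshteinMetric, editDistance and min3 are hypothetical names.

```go
package main

import "fmt"

// Metric mirrors the two methods used in the snippet above (assumed shape).
type Metric interface {
	Compare(a, b string) float64
	IsSimilar(a, b string, threshold float64) bool
}

// LevenshteinMetric is a hypothetical metric: 1 - editDistance/maxLen.
type LevenshteinMetric struct{}

// Compile-time check that the type satisfies the assumed interface.
var _ Metric = (*LevenshteinMetric)(nil)

func min3(x, y, z int) int {
	if y < x {
		x = y
	}
	if z < x {
		x = z
	}
	return x
}

// editDistance computes the Levenshtein distance over runes using two rolling rows.
func editDistance(a, b []rune) int {
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		curr[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}

// Compare returns 1.0 for identical strings and 0.0 for strings with no characters aligned.
func (m *LevenshteinMetric) Compare(a, b string) float64 {
	ra, rb := []rune(a), []rune(b)
	maxLen := len(ra)
	if len(rb) > maxLen {
		maxLen = len(rb)
	}
	if maxLen == 0 {
		return 1.0 // two empty strings are identical
	}
	return 1.0 - float64(editDistance(ra, rb))/float64(maxLen)
}

// IsSimilar reports whether the similarity reaches the threshold.
func (m *LevenshteinMetric) IsSimilar(a, b string, threshold float64) bool {
	return m.Compare(a, b) >= threshold
}

func main() {
	m := &LevenshteinMetric{}
	fmt.Printf("%.3f\n", m.Compare("kitten", "sitting"))
	fmt.Println(m.IsSimilar("kitten", "sitting", 0.5))
}
```

Normalizing by the longer string keeps the score in [0, 1], which matches the convention in the comment above (1.0 for identical, 0.0 for unrelated).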

4. HTTP service

The package ships with a built-in HTTP server. See cmd/suggest/service.go for an example. Quick start:

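import (
    "log"
    "net/http"

    // Aliased to avoid a name clash with net/http. Go only allows importing
    // internal/ packages from inside the suggest module itself.
    suggesthttp "github.com/suggest-go/suggest/internal/http"
)

func main() {
    handler, err := suggesthttp.NewHandler("path/to/config.json")
    if err != nil {
        log.Fatal(err)
    }
    http.HandleFunc("/suggest", handler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}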
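
Because internal packages cannot be imported from outside the module, code in another module would need its own handler. The following is a sketch using only the standard library, under stated assumptions: suggestFunc, prefixSuggester, newSuggestHandler and call are hypothetical names, the toy suggester does plain prefix matching, and the q query parameter and JSON array response are choices made for this example rather than the service's documented API. The handler is exercised in-process with httptest so the example runs without binding a port.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// suggestFunc is a hypothetical stand-in for a real suggester.
type suggestFunc func(query string) []string

// prefixSuggester returns a toy suggester doing case-insensitive prefix matching.
func prefixSuggester(dict []string) suggestFunc {
	return func(query string) []string {
		q := strings.ToLower(query)
		out := []string{}
		for _, w := range dict {
			if strings.HasPrefix(strings.ToLower(w), q) {
				out = append(out, w)
			}
		}
		return out
	}
}

// newSuggestHandler answers requests carrying ?q=... with a JSON array of matches.
func newSuggestHandler(s suggestFunc) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		q := r.URL.Query().Get("q")
		if q == "" {
			http.Error(w, "missing query parameter q", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(s(q)); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})
}

// call exercises the handler in-process, without opening a socket.
func call(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	dict := []string{"Tesla Model 3", "Tesla Model S", "Toyota Corolla"}
	h := newSuggestHandler(prefixSuggester(dict))
	code, body := call(h, "/suggest?q=tes")
	fmt.Println(code, body)
}
```

To serve real traffic, the same handler could be registered with http.Handle("/suggest", h) and started with http.ListenAndServe.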

Configuration

suggest is configured via a JSON file. Minimal example:

{
    "name": "my-suggester",
    "source": {
        "type": "file",
        "path": "data/items.txt"
    },
    "metric": "jaro-winkler",
    "suggest_amount": 5,
    "min_score": 0.5
}

The schema is documented in pkg/suggest/config.go.
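
For orientation, the minimal example above could be decoded and sanity-checked as follows. This is a sketch only: fileConfig and parseConfig are hypothetical names, the struct simply mirrors the JSON keys shown, and the validation rules (positive suggest_amount, min_score within [0, 1]) are assumptions, not the schema in pkg/suggest/config.go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fileConfig mirrors the JSON keys of the minimal example (assumed shape).
type fileConfig struct {
	Name   string `json:"name"`
	Source struct {
		Type string `json:"type"`
		Path string `json:"path"`
	} `json:"source"`
	Metric        string  `json:"metric"`
	SuggestAmount int     `json:"suggest_amount"`
	MinScore      float64 `json:"min_score"`
}

// parseConfig decodes raw JSON and applies the assumed sanity checks.
func parseConfig(raw string) (fileConfig, error) {
	var cfg fileConfig
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields() // surface typos in key names early
	if err := dec.Decode(&cfg); err != nil {
		return cfg, fmt.Errorf("decode config: %w", err)
	}
	if cfg.SuggestAmount <= 0 {
		return cfg, fmt.Errorf("suggest_amount must be positive, got %d", cfg.SuggestAmount)
	}
	if cfg.MinScore < 0 || cfg.MinScore > 1 {
		return cfg, fmt.Errorf("min_score must be in [0, 1], got %v", cfg.MinScore)
	}
	return cfg, nil
}

func main() {
	raw := `{
		"name": "my-suggester",
		"source": {"type": "file", "path": "data/items.txt"},
		"metric": "jaro-winkler",
		"suggest_amount": 5,
		"min_score": 0.5
	}`
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: top %d from %s using %s\n", cfg.Name, cfg.SuggestAmount, cfg.Source.Path, cfg.Metric)
}
```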

Package Overview

Sub-package        Purpose
pkg/suggest        Core suggester engine (Top-K retrieval)
pkg/spellchecker   Context-aware spellchecking with language models
pkg/store          Storage backends (in-memory, file-based)
pkg/metric         Distance metrics (Jaro-Winkler, Levenshtein, Cosine)
pkg/dictionary     Dictionary loaders (plain text, gzip)
pkg/index          Inverted index for fast lookup
pkg/mph            Minimal perfect hashing
pkg/vgram          Variable-length n-grams
pkg/lm             Language model integration (KenLM)
pkg/merger         Result merging & deduplication
pkg/compression    Compact storage formats
pkg/utils          Shared helpers

Performance Tips

  • Use store.Memory for small dictionaries (<100k entries); it is the fastest option.
  • Use store.File for large dictionaries to save RAM.
  • For spellchecking, use the pre-built lm-folder shipped with the language model.
  • For autocomplete at scale, batch queries with SuggestBatch(ctx, queries).

Further Reading

Docs

See the documentation with examples, the demo, and the API documentation.

Demo

Fuzzy string search in a dictionary

The demo shows an approximate string search in a vehicle dictionary with more than 2k model names.

You can also run it locally:

$ make build
$ ./build/suggest eval -c pkg/suggest/testdata/config.json -d cars -s 0.5 -k 5

or by using Docker:

$ make build-docker
$ docker run -p 8080:8080 -v $(pwd)/pkg/suggest/testdata:/data/testdata suggest /data/build/suggest service-run -c /data/testdata/config.json


Spellchecker

Spellchecker recognizes a misspelled word based on the context of the surrounding words. To run the spellchecker demo:

$ make build
$ ./build/spellchecker eval -c lm-folder/config.json
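
The idea of using context to choose a correction can be sketched with a toy bigram model. This is an illustration under stated assumptions, not the spellchecker's algorithm: bigramCounts and pick are hypothetical names, only the preceding word is used as context, and the counts are made up for the example.

```go
package main

import "fmt"

// bigramCounts is a toy language model: counts[prev][word] = how often word follows prev.
type bigramCounts map[string]map[string]int

// pick returns the candidate seen most often after prev.
// Ties keep the earlier candidate; an empty candidate list yields "".
func pick(lm bigramCounts, prev string, candidates []string) string {
	best, bestCount := "", -1
	for _, c := range candidates {
		// Indexing a missing (nil) inner map is safe in Go and yields 0.
		if n := lm[prev][c]; n > bestCount {
			best, bestCount = c, n
		}
	}
	return best
}

func main() {
	// Made-up counts, for illustration only.
	lm := bigramCounts{
		"a":     {"piece": 8, "peace": 1},
		"world": {"peace": 9, "piece": 0},
	}
	candidates := []string{"peace", "piece"}
	fmt.Println(pick(lm, "a", candidates))
	fmt.Println(pick(lm, "world", candidates))
}
```

The same two candidates are ranked differently depending on the preceding word, which is the effect a context-aware spellchecker relies on.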


Contributions

When contributing to this repository, please first discuss the change you wish to make via issue, email, or any other method with the owners of this repository before making a change.

Directories

Path                 Synopsis
cmd/language-model   command
cmd/spellchecker     command
cmd/suggest          command
internal
pkg/alphabet         Package alphabet provides API for manipulating with the list of defined characters
pkg/analysis         Package analysis represents API to convert text into indexable/searchable tokens
pkg/compression      Package compression holds the algorithms for compressing the list of sorted lists of integers
pkg/dictionary       Package dictionary represents storage for keeping an index vocabulary
pkg/index            Package index provides an inverted index implementation and basic functionality for work with this data structure
pkg/lm               Package lm provides a library for storing large n-gram language models in memory.
pkg/merger           Package merger provides a different set of algorithms for solving T-overlap occurrence problem of sorted lists of integers.
pkg/metric           Package metric holds different metrics for sets similarity compare
pkg/mph              Package mph represents minimal perfect hash function implementation
pkg/spellchecker     Package spellchecker provides spellcheck functionality
pkg/store            Package store provides an abstract way to work with i/o
pkg/suggest          Package suggest provides fuzzy search and autocomplete functionality
pkg/utils            Package utils holds the list of code helpers
