fuzzymatch-go

Version: v0.4.1
Published: Jun 18, 2019
License: LGPL-3.0

README

Fuzzy Match

This repository contains a Go language implementation of approximate string matching algorithms.

Preset functions

The package github.com/abhabongse/fuzzymatch-go/factory/presets provides the function PlainSimilarityScore, which computes the similarity score between two generic input strings. A second function, ThaiNameSimilarityScore, is a customized variant of the scoring function that adds pre-processing and string comparison logic tailored to names of Thai people.

Both functions return a value between 0 and 1, where 0 means that the two strings are very distinct and 1 means that the two strings are very similar. Here are the signatures for both functions.

func PlainSimilarityScore(fst, snd string) float64 { ... }

func ThaiNameSimilarityScore(fst, snd string) float64 { ... }
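
To make the 0-to-1 scale concrete, here is a minimal, self-contained sketch of one common way to obtain such a score: a Levenshtein edit distance normalized by the length of the longer string. This is an illustrative assumption, not the scoring logic used by PlainSimilarityScore; the names levenshtein, min3, and sketchSimilarity are hypothetical and are not part of this module.

```go
package main

import "fmt"

// min3 returns the smallest of three integers.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein returns the edit distance between a and b, counted in runes
// (insertions, deletions, and substitutions each cost 1).
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// sketchSimilarity (hypothetical) maps the edit distance onto [0, 1]:
// 1 for identical strings, 0 when every rune must change.
func sketchSimilarity(fst, snd string) float64 {
	longest := len([]rune(fst))
	if n := len([]rune(snd)); n > longest {
		longest = n
	}
	if longest == 0 {
		return 1 // two empty strings are treated as identical
	}
	return 1 - float64(levenshtein(fst, snd))/float64(longest)
}

func main() {
	fmt.Printf("%.2f\n", sketchSimilarity("kitten", "kitten"))  // prints 1.00
	fmt.Printf("%.2f\n", sketchSimilarity("kitten", "sitting")) // prints 0.57
	fmt.Printf("%.2f\n", sketchSimilarity("abc", "xyz"))        // prints 0.00
}
```

The preset functions in this module may weigh edits differently or combine several metrics, so their scores for the same inputs will not necessarily match this sketch.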

Customization

The logical flow for string similarity scoring functions may be customized by using the factory (inside the package github.com/abhabongse/fuzzymatch-go/factory) to construct a new function. The factory accepts various options; see the documentation for more information.

Notes

All source code for this project is released under the GNU Lesser General Public License v3.0.

Directories

Path              Synopsis
candidate         Package candidate provides functions which generate a sequence of all possible variants of a given input string.
dicecoeff         Package dicecoeff provides a function which computes the Sørensen–Dice Coefficient (sometimes called the Dice Similarity Coefficient; DSC).
editdist          Package editdist provides a collection of functions to compute the distances between a pair of strings under various distance metrics in string spaces.
editdist/extra    Package extra provides additional editdist-related functions that are customized for certain languages and scripts, such as Thai.
factory           Package factory provides a higher-order function which constructs string similarity scoring functions based on various configurable settings (such as how strings are sanitized before similarity scores are computed, or which rune distance metrics are used to measure distances between a pair of strings).
factory/preset    Package preset provides a collection of pre-built string similarity scoring functions generated by the higher-order function in the factory parent package.
runedata          Package runedata provides additional data regarding Unicode characters which are not part of the built-in Go packages, especially those involving nuances in languages such as Thai.
sanitary          Package sanitary provides a collection of string processing functions that pre-process or clean up user-input strings (a process called "sanitization").
sanitary/extra    Package extra provides additional string sanitization functions that are customized for certain languages and scripts, such as Thai.
