BK-tree in go
Implementation of a simple BK-tree in go, using the Damerau-Levenshtein distance to calculate the distance between the strings.
A simple CLI tool was also made to use the program.
Why
I recently wanted to find out how fuzzy string matching works.
I read this article about fuzzy string matching. I then decided to try and implement a BK-tree in go, using the Damerau-Levenshtein distance to calculate the distance.
Code for the Damerau-Levenshtein distance heavily inspired from:
CLI
I then built a simple cli to call the program. In order to use the program, pipe data into it. You also must supply a search string. You can also optionally supply a tolerance. The default tolerance is 10.
A testfile is supplied in the repo to showcase example use.
Usage
- Without providing a tolerance
cat testfile | ./bk-treego -s dawg
Output
dog
cat
lion
mouse
tiger
turtle
elephant
cat testfile | ./bk-tree-go -s dawg -t 2
Output:
dog
Note the output is sorted by distance, therefore it is possible to get the closest match using head
cat testfile | ./bk-tree-go -s dawg | head -1
Output:
dog