# levenshtein

package module
Version: v0.0.0-...-cae8b0e Latest Latest

Go to latest
Published: Aug 5, 2020 License: MIT

## Documentation ¶

### Overview ¶

This package implements the Levenshtein algorithm for computing the similarity between two strings. The central function is MatrixForStrings, which computes the Levenshtein matrix. The functions DistanceForMatrix, EditScriptForMatrix and RatioForMatrix read various interesting properties off the matrix. The package also provides the convenience functions DistanceForStrings, EditScriptForStrings and RatioForStrings for going directly from two strings to the property of interest.

### Constants ¶

View Source
```const (
Ins = iota
Del
Sub
Match
)```

### Variables ¶

This section is empty.

### Functions ¶

#### func DistanceForMatrix ¶

`func DistanceForMatrix(matrix [][]int) int`

DistanceForMatrix reads the edit distance off the given Levenshtein matrix.

#### func DistanceForStrings ¶

`func DistanceForStrings(source []rune, target []rune, op Options) int`

DistanceForStrings returns the edit distance between source and target.

It has a runtime proportional to len(source) * len(target) and memory use proportional to len(target).

Example
```source := "a"
target := "aa"
distance := DistanceForStrings([]rune(source), []rune(target), DefaultOptions)
fmt.Printf(`Distance between "%s" and "%s" computed as %d`, source, target, distance)
```
```Output:

Distance between "a" and "aa" computed as 1
```

#### func IdenticalRunes ¶

`func IdenticalRunes(a rune, b rune) bool`

IdenticalRunes is the default MatchFunction: it checks whether two runes are identical.

#### func LogMatrix ¶

`func LogMatrix(source []rune, target []rune, matrix [][]int)`

LogMatrix writes a visual representation of the given matrix for the given strings to os.Stderr. This function is deprecated, use WriteMatrix(source, target, matrix, os.Stderr) instead.

#### func MatrixForStrings ¶

`func MatrixForStrings(source []rune, target []rune, op Options) [][]int`

MatrixForStrings generates a 2-D array representing the dynamic programming table used by the Levenshtein algorithm, as described e.g. here: http://www.let.rug.nl/kleiweg/lev/ The reason for putting the creation of the table into a separate function is that it cannot only be used for reading of the edit distance between two strings, but also e.g. to backtrace an edit script that provides an alignment between the characters of both strings.

#### func RatioForMatrix ¶

`func RatioForMatrix(matrix [][]int) float64`

RatioForMatrix returns the Levenshtein ratio for the given matrix. The ratio is computed as follows:

```(sourceLength + targetLength - distance) / (sourceLength + targetLength)
```

#### func RatioForStrings ¶

`func RatioForStrings(source []rune, target []rune, op Options) float64`

RatioForStrings returns the Levenshtein ratio for the given strings. The ratio is computed as follows:

```(sourceLength + targetLength - distance) / (sourceLength + targetLength)
```

#### func WriteMatrix ¶

`func WriteMatrix(source []rune, target []rune, matrix [][]int, writer io.Writer)`

WriteMatrix writes a visual representation of the given matrix for the given strings to the given writer.

Example
```source := []rune("neighbor")
target := []rune("Neighbour")
matrix := MatrixForStrings(source, target, DefaultOptions)
WriteMatrix(source, target, matrix, os.Stdout)
```
```Output:

N  e  i  g  h  b  o  u  r
0  1  2  3  4  5  6  7  8  9
n  1  2  3  4  5  6  7  8  9 10
e  2  3  2  3  4  5  6  7  8  9
i  3  4  3  2  3  4  5  6  7  8
g  4  5  4  3  2  3  4  5  6  7
h  5  6  5  4  3  2  3  4  5  6
b  6  7  6  5  4  3  2  3  4  5
o  7  8  7  6  5  4  3  2  3  4
r  8  9  8  7  6  5  4  3  4  3
```

### Types ¶

#### type EditOperation ¶

`type EditOperation int`

#### func (EditOperation) String ¶

`func (operation EditOperation) String() string`

#### type EditScript ¶

`type EditScript []EditOperation`

#### func EditScriptForMatrix ¶

`func EditScriptForMatrix(matrix [][]int, op Options) EditScript`

EditScriptForMatrix returns an optimal edit script based on the given Levenshtein matrix.

#### func EditScriptForStrings ¶

`func EditScriptForStrings(source []rune, target []rune, op Options) EditScript`

EditScriptForStrings returns an optimal edit script to turn source into target.

#### type MatchFunction ¶

`type MatchFunction func(rune, rune) bool`

#### type Options ¶

```type Options struct {
InsCost int
DelCost int
SubCost int
Matches MatchFunction
}```
```var DefaultOptions Options = Options{
InsCost: 1,
DelCost: 1,
SubCost: 2,
Matches: IdenticalRunes,
}```

DefaultOptions is the default options without substitution: insertion cost is 1, deletion cost is 1, substitution cost is 2 (meaning insert and delete will be used instead), and two runes match iff they are identical.

```var DefaultOptionsWithSub Options = Options{
InsCost: 1,
DelCost: 1,
SubCost: 1,
Matches: IdenticalRunes,
}```

DefaultOptionsWithSub is the default options with substitution: insertion cost is 1, deletion cost is 1, substitution cost is 1, and two runes match iff they are identical.