Documentation ¶
Overview ¶
Package similarity provides a diff-like implementation for determining how similar two byte streams are.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func LCS ¶
LCS is an implementation of the longest common subsequence problem[1], optimized for space.
Most LCS implementations are optimized for time and allocate heavily for memoization, which makes them unsuitable for larger inputs.
Since in our use case we only need the length of the common chunk, we can avoid most of the allocations.
In the worst case (a and b have almost nothing in common) the time complexity is O(N^2). In the best case (a == b) the time complexity is O(N).
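The space/time trade-off described above can be illustrated with a minimal sketch. This is not the package's own LCS code: `lcsLen` is a hypothetical helper that uses the textbook two-row dynamic program, plus common prefix and suffix trimming so that identical inputs finish in a single linear pass. It is only meant to show how computing the length alone avoids a full memoization table.

```go
package main

import "fmt"

// lcsLen is a hypothetical helper, NOT the package's LCS function.
// It returns the length of the longest common subsequence of a and b
// while keeping only two rows of the dynamic-programming table.
func lcsLen(a, b []byte) int {
	// Trim the common prefix: every matching leading byte belongs to the LCS.
	prefix := 0
	for prefix < len(a) && prefix < len(b) && a[prefix] == b[prefix] {
		prefix++
	}
	a, b = a[prefix:], b[prefix:]

	// Trim the common suffix the same way.
	suffix := 0
	for suffix < len(a) && suffix < len(b) &&
		a[len(a)-1-suffix] == b[len(b)-1-suffix] {
		suffix++
	}
	a, b = a[:len(a)-suffix], b[:len(b)-suffix]

	// Use the shorter input for the rows, so memory is O(min(len(a), len(b))).
	if len(a) < len(b) {
		a, b = b, a
	}
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			switch {
			case a[i-1] == b[j-1]:
				curr[j] = prev[j-1] + 1
			case prev[j] >= curr[j-1]:
				curr[j] = prev[j]
			default:
				curr[j] = curr[j-1]
			}
		}
		// The row just computed becomes the previous row for the next i.
		prev, curr = curr, prev
	}
	return prefix + suffix + prev[len(b)]
}

func main() {
	// Identical inputs are consumed entirely by the prefix scan.
	fmt.Println(lcsLen([]byte("hello"), []byte("hello")))
	// Inputs with nothing in common go through the full quadratic loop.
	fmt.Println(lcsLen([]byte("abc"), []byte("xyz")))
	fmt.Println(lcsLen([]byte("AGGTAB"), []byte("GXTXAYB")))
}
```

With this sketch, the quadratic loop only runs over the part of the inputs that differs, which is consistent with the best-case and worst-case behavior stated above, though the package may reach those bounds by a different route.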
func MaxSimilarity ¶
MaxSimilarity returns the larger of Similarity(a, b) / len(a) and Similarity(a, b) / len(b).
1 means either they are identical or one is a superset of the other (for example, a = "abcdef" and b = "abcfoodef").
func MinSimilarity ¶
MinSimilarity returns the smaller of Similarity(a, b) / len(a) and Similarity(a, b) / len(b).
1 means they are identical, 0 means they have nothing in common.
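To make the two ratios behind MaxSimilarity and MinSimilarity concrete, here is a small sketch. The helper `ratios` is hypothetical, and its `common` parameter stands in for the value of Similarity(a, b). Returning 0 for empty inputs is an assumption of this sketch; the documentation does not say how the package handles them.

```go
package main

import "fmt"

// ratios is a hypothetical helper, not part of the package.
// common stands in for Similarity(a, b); lenA and lenB are len(a) and len(b).
// It returns the smaller and the larger of common/lenA and common/lenB.
func ratios(common, lenA, lenB int) (minRatio, maxRatio float64) {
	// Assumption: treat an empty input as "nothing in common".
	if lenA == 0 || lenB == 0 {
		return 0, 0
	}
	ra := float64(common) / float64(lenA)
	rb := float64(common) / float64(lenB)
	if ra < rb {
		return ra, rb
	}
	return rb, ra
}

func main() {
	// a = "abcdef", b = "abcfoodef": the common chunks total 6 bytes.
	lo, hi := ratios(6, len("abcdef"), len("abcfoodef"))
	fmt.Printf("min=%.2f max=%.2f\n", lo, hi) // min=0.67 max=1.00
}
```

The larger ratio reaches 1 because all of a is contained in b, while the smaller ratio stays below 1 because b carries the extra "foo".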
func Similarity ¶
Similarity returns how similar a and b are.
The return value is the total length of the chunks that a and b have in common. For example, when a is "abcdef" and b is "abcfoodef", they have two chunks in common, "abc" and "def", so 6 is returned.
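The chunk example above can be reproduced with a short sketch. It assumes that the "chunks" are the contiguous runs of one longest-common-subsequence alignment, which the documentation suggests but does not state outright. The helper `commonChunks` is hypothetical and, unlike the space-optimized approach the package describes, builds the full table so that it can walk through it and list the runs.

```go
package main

import "fmt"

// commonChunks is a hypothetical helper, not part of the package.
// It returns the contiguous runs matched by one LCS alignment of a and b.
func commonChunks(a, b string) []string {
	n, m := len(a), len(b)

	// dp[i][j] is the LCS length of the suffixes a[i:] and b[j:].
	dp := make([][]int, n+1)
	for i := range dp {
		dp[i] = make([]int, m+1)
	}
	for i := n - 1; i >= 0; i-- {
		for j := m - 1; j >= 0; j-- {
			switch {
			case a[i] == b[j]:
				dp[i][j] = dp[i+1][j+1] + 1
			case dp[i+1][j] >= dp[i][j+1]:
				dp[i][j] = dp[i+1][j]
			default:
				dp[i][j] = dp[i][j+1]
			}
		}
	}

	// Walk forward through the table; consecutive matches form one chunk.
	var chunks []string
	start := -1 // index in a where the current chunk began, or -1 if none
	i, j := 0, 0
	for i < n && j < m {
		if a[i] == b[j] {
			if start < 0 {
				start = i
			}
			i++
			j++
			continue
		}
		// A mismatch ends the current chunk, if there is one.
		if start >= 0 {
			chunks = append(chunks, a[start:i])
			start = -1
		}
		if dp[i+1][j] >= dp[i][j+1] {
			i++
		} else {
			j++
		}
	}
	if start >= 0 {
		chunks = append(chunks, a[start:i])
	}
	return chunks
}

func main() {
	total := 0
	for _, c := range commonChunks("abcdef", "abcfoodef") {
		fmt.Printf("%q\n", c)
		total += len(c)
	}
	fmt.Println(total) // 6
}
```

Summing the chunk lengths gives the same number as the LCS length, so when only the total is needed the full table and the walk are unnecessary.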
Types ¶
This section is empty.