Documentation
Overview ¶
Package diff calculates the differences between two sequences.
It implements the algorithm from "An Algorithm for Differential File Comparison" by Hunt and McIlroy: https://www.cs.dartmouth.edu/~doug/diff.pdf
For flexibility, the algorithm itself operates on a sequence of integers. This allows you to compare arbitrary sequences, as long as you can map their elements to a uint64.
To generate a diff for text, the inputs need to be split into tokens and hashed. Splitting reduces the cost of the algorithm, which is O(m•n•log(m)) in the worst case, and it produces diffs that are better suited for human consumption. Hashing means that collisions are possible, but they should be rare enough in practice not to matter. If one does happen, the resulting diff might be suboptimal.
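The split-and-hash step described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: hashLines is a hypothetical helper, and the choice of FNV-1a is an assumption (the package leaves its default hash unspecified).

```go
package main

import (
	"bytes"
	"fmt"
	"hash/fnv"
)

// hashLines is a hypothetical helper (not part of package diff). It splits
// text at newlines and maps every line to a uint64 using FNV-1a, producing
// the kind of integer sequence the diff algorithm operates on.
func hashLines(text []byte) []uint64 {
	var out []uint64
	for _, line := range bytes.Split(text, []byte("\n")) {
		h := fnv.New64a()
		h.Write(line) // Write on a hash.Hash never returns an error.
		out = append(out, h.Sum64())
	}
	return out
}

func main() {
	a := hashLines([]byte("a\nb\nc"))
	b := hashLines([]byte("a\nx\nc"))
	// Equal lines hash to equal integers, so comparing the integer
	// sequences stands in for comparing the lines themselves.
	fmt.Println(a[0] == b[0], a[1] == b[1], a[2] == b[2]) // true false true
}
```

Comparing one integer per line, rather than raw bytes, is what keeps m and n small in the complexity bound above.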
Functions ¶
func SplitLines ¶
SplitLines splits at newlines, stripping them in the process.
Types ¶
type HashFunc ¶
A HashFunc maps a token to an integer.
func DefaultHash ¶
func DefaultHash() HashFunc
DefaultHash returns a sensible (but unspecified) hash function for Text.
type Op ¶
type Op int
const (
OpA, OpEq, OpB Op = -1, 0, 1
)
func Uint64 ¶
Uint64 calculates a minimal diff between a and b as a series of operations. See the example for how to interpret the result.
Example ¶
package main

import (
	"fmt"

	"github.com/Merovius/diff"
)

func main() {
	a := []uint64{1, 1, 1, 3, 4, 4}
	b := []uint64{0, 1, 0, 1, 0, 3, 1, 4, 5, 4, 6}
	for _, o := range diff.Uint64(a, b) {
		switch o {
		case diff.OpA:
			fmt.Printf("-%d\n", a[0])
			a = a[1:]
		case diff.OpEq:
			fmt.Printf(" %d\n", a[0])
			a, b = a[1:], b[1:]
		case diff.OpB:
			fmt.Printf("+%d\n", b[0])
			b = b[1:]
		}
	}
}
Output:
+0
 1
+0
 1
+0
+3
 1
-3
 4
+5
 4
+6
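One way to read the result is as an edit script that consumes a and b from left to right. The sketch below checks that reading against the example's inputs. It assumes nothing beyond the Op constants shown above, which it redeclares locally so it runs without the dependency. checkScript is a hypothetical helper, and the ops slice is transcribed by hand from the example output rather than computed.

```go
package main

import "fmt"

// Op mirrors the package's Op type and constants for this standalone sketch.
type Op int

const (
	OpA, OpEq, OpB Op = -1, 0, 1
)

// checkScript is a hypothetical helper. It reports whether ops is a valid
// edit script from a to b: OpA consumes an element of a, OpB consumes an
// element of b, and OpEq consumes one element of each, which must match.
func checkScript(ops []Op, a, b []uint64) bool {
	for _, o := range ops {
		switch o {
		case OpA:
			if len(a) == 0 {
				return false
			}
			a = a[1:]
		case OpEq:
			if len(a) == 0 || len(b) == 0 || a[0] != b[0] {
				return false
			}
			a, b = a[1:], b[1:]
		case OpB:
			if len(b) == 0 {
				return false
			}
			b = b[1:]
		default:
			return false
		}
	}
	// A complete script leaves nothing unconsumed on either side.
	return len(a) == 0 && len(b) == 0
}

func main() {
	a := []uint64{1, 1, 1, 3, 4, 4}
	b := []uint64{0, 1, 0, 1, 0, 3, 1, 4, 5, 4, 6}
	// Transcribed from the example output: "+" is OpB, "-" is OpA, " " is OpEq.
	ops := []Op{OpB, OpEq, OpB, OpEq, OpB, OpB, OpEq, OpA, OpEq, OpB, OpEq, OpB}
	fmt.Println(checkScript(ops, a, b)) // prints true
}
```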
type SplitFunc ¶
A SplitFunc splits the next token off the front of b. tok specifies the length of that token and skip specifies how many bytes to skip after it. If neither of them is positive or either of them is negative, Text will panic.
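The tok/skip contract can be illustrated with a small driver loop. This is a sketch under stated assumptions: the signature func(b []byte) (tok, skip int) is inferred from the description above, not confirmed by it, and splitWords and tokenize are hypothetical names, not part of the package.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitFunc mirrors the contract described above. The exact signature is an
// assumption inferred from the parameter names b, tok, and skip.
type splitFunc func(b []byte) (tok, skip int)

// splitWords is a hypothetical splitter: the token ends at the first space,
// and that single space is skipped.
func splitWords(b []byte) (tok, skip int) {
	if i := bytes.IndexByte(b, ' '); i >= 0 {
		return i, 1
	}
	return len(b), 0 // last token: nothing left to skip
}

// tokenize applies s repeatedly until the input is consumed. It panics when
// either value is negative or when neither is positive, since the loop
// could not make progress in that case.
func tokenize(b []byte, s splitFunc) [][]byte {
	var toks [][]byte
	for len(b) > 0 {
		tok, skip := s(b)
		if tok < 0 || skip < 0 || tok+skip == 0 {
			panic("split function made no progress")
		}
		toks = append(toks, b[:tok])
		b = b[tok+skip:]
	}
	return toks
}

func main() {
	for _, t := range tokenize([]byte("the quick fox"), splitWords) {
		fmt.Printf("%q\n", t)
	}
}
```

Returning a skip count separately from the token length lets a splitter drop separators, such as newlines, without copying the input.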
type TextDelta ¶
TextDelta describes a line of the resulting diff.
func Text ¶
Text calculates a diff between a and b. s is used to separate them into tokens and h is used to map those to integers. If s is nil, SplitLines is used. If h is nil, DefaultHash is used.
The resulting diff will contain separate TextDelta values per token (even if consecutive elements use the same Op). See the example for how to construct the diff from it.
In case of an OpEq delta where the corresponding tokens of a and b differ (but hash to the same value), it is unspecified which of the two is returned.
Example ¶
package main

import (
	"fmt"

	"github.com/Merovius/diff"
)

func main() {
	a := []byte("a\nb\nc\nd\nf\ng\nh\nj\nq\nz")
	b := []byte("a\nb\nc\nd\ne\nf\ng\ni\nj\nk\nr\nx\ny\nz")
	for _, δ := range diff.Text(a, b, nil, nil) {
		switch δ.Op {
		case diff.OpA:
			fmt.Printf("- %s\n", δ.Text)
		case diff.OpEq:
			fmt.Printf(" %s\n", δ.Text)
		case diff.OpB:
			fmt.Printf("+ %s\n", δ.Text)
		}
	}
}
Output:
 a
 b
 c
 d
+ e
 f
 g
- h
+ i
 j
- q
+ k
+ r
+ x
+ y
 z
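Because every token gets its own delta, both inputs can be recovered from the result: the OpA and OpEq tokens make up a, and the OpEq and OpB tokens make up b. The sketch below shows this with local stand-ins. delta is a hypothetical type that assumes only the Op and Text fields used in the example, with Text as a string for brevity. sides is a hypothetical helper, and it ignores the hash-collision caveat for OpEq deltas.

```go
package main

import (
	"fmt"
	"strings"
)

// Op mirrors the package's Op type and constants for this standalone sketch.
type Op int

const (
	OpA, OpEq, OpB Op = -1, 0, 1
)

// delta is a local stand-in for TextDelta. Only the Op and Text fields seen
// in the example are assumed; the string type for Text is an assumption.
type delta struct {
	Op   Op
	Text string
}

// sides is a hypothetical helper that rebuilds both inputs from a
// per-token diff, joining the tokens with newlines.
func sides(ds []delta) (a, b string) {
	var as, bs []string
	for _, d := range ds {
		if d.Op != OpB { // OpA and OpEq tokens belong to a
			as = append(as, d.Text)
		}
		if d.Op != OpA { // OpEq and OpB tokens belong to b
			bs = append(bs, d.Text)
		}
	}
	return strings.Join(as, "\n"), strings.Join(bs, "\n")
}

func main() {
	ds := []delta{{OpEq, "a"}, {OpB, "e"}, {OpEq, "f"}, {OpA, "h"}, {OpB, "i"}}
	a, b := sides(ds)
	fmt.Printf("%q\n%q\n", a, b)
}
```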