Documentation
¶
Overview ¶
Package diff provides functions to efficiently compare two slices similar to the Unix diff command line tool to compare files.
The main functions are Hunks, which groups changes into contextual blocks, and Edits, which returns every individual change. By default, the algorithms are optimized for performance and may use heuristics for very large inputs. Use Optimal to disable these heuristics when you need the shortest possible diff.
Performance: Default complexity is O(N^1.5 log N) time and O(N) space. With Optimal, time complexity becomes O(ND) where N = len(x) + len(y) and D is the number of edits.
Index ¶
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Edit ¶
Edit describes a single edit of a diff.
- For Match, both X and Y contain the matching element.
- For Delete, X contains the deleted element and Y is unset (zero value).
- For Insert, Y contains the inserted element and X is unset (zero value).
func Edits ¶
func Edits[T comparable](x, y []T, opts ...Option) []Edit[T]
Edits compares the contents of x and y and returns the changes necessary to convert from one to the other.
Edits returns one edit for every element in the input slices. If x and y are identical, the output will consist of a match edit for every input element.
The following option is supported: diff.Optimal, diff.Fast
Important: The output is not guaranteed to be stable and may change with minor version upgrades. DO NOT rely on the output being stable.
Example (Runes) ¶
Compare two strings rune by rune.
package main import ( "fmt" "znkr.io/diff" ) func main() { x := []rune("Hello, World") y := []rune("Hello, 世界") edits := diff.Edits(x, y) for _, edit := range edits { switch edit.Op { case diff.Match: fmt.Printf("%s", string(edit.X)) case diff.Delete: fmt.Printf("[-%s-]", string(edit.X)) case diff.Insert: fmt.Printf("{+%s+}", string(edit.Y)) default: panic("never reached") } } }
Output: Hello, [-W-][-o-][-r-][-l-][-d-]{+世+}{+界+}
Example (Words) ¶
Compare two strings word by word.
package main import ( "fmt" "strings" "znkr.io/diff" ) func main() { x := strings.Fields("calm seas reflect the sky") y := strings.Fields("restless seas reflect the sky defiantly") edits := diff.Edits(x, y) for i, edit := range edits { if i > 0 { fmt.Print(" ") } switch edit.Op { case diff.Match: fmt.Printf("%s", edit.X) case diff.Delete: fmt.Printf("[-%s-]", edit.X) case diff.Insert: fmt.Printf("{+%s+}", edit.Y) default: panic("never reached") } } }
Output: [-calm-] {+restless+} seas reflect the sky {+defiantly+}
func EditsFunc ¶
EditsFunc compares the contents of x and y using the provided equality comparison and returns the changes necessary to convert from one to the other.
EditsFunc returns edits for every element in the input. If both x and y are identical, the output will consist of a match edit for every input element.
The following option is supported: diff.Optimal
Note that this function has generally worse performance than Edits for diffs with many changes.
Important: The output is not guaranteed to be stable and may change with minor version upgrades. DO NOT rely on the output being stable.
type Hunk ¶
type Hunk[T any] struct { PosX, EndX int // Start and end position in x. PosY, EndY int // Start and end position in y. Edits []Edit[T] // Edits to transform x[PosX:EndX] to y[PosY:EndY] }
Hunk describes a sequence of consecutive edits.
func Hunks ¶
func Hunks[T comparable](x, y []T, opts ...Option) []Hunk[T]
Hunks compares the contents of x and y and returns the changes necessary to convert from one to the other.
The output is a sequence of hunks. A hunk represents a contiguous block of changes (insertions and deletions) along with some surrounding context. The amount of context can be configured using Context.
If x and y are identical, the output has length zero.
The following options are supported: diff.Context, diff.Optimal, diff.Fast
Important: The output is not guaranteed to be stable and may change with minor version upgrades. DO NOT rely on the output being stable.
Example (PseudoUnified) ¶
Compare to strings line by line and output the difference as a pseudo-unified diff output (i.e. it's similar to what diff -u would produce). The format is not a correct unified diff though, in particular line endings (esp. at the end of the input) are handled differently.
More generally, comparing text line by line is better solved with the textdiff subpackage.
package main import ( "fmt" "strings" "znkr.io/diff" ) func main() { x := `this paragraph is not changed and barely long enough to create a new hunk this paragraph is going to be removed` y := `this is a new paragraph that is inserted at the top this paragraph is not changed and barely long enough to create a new hunk` xlines := strings.Split(x, "\n") ylines := strings.Split(y, "\n") hunks := diff.Hunks(xlines, ylines) for _, h := range hunks { fmt.Printf("@@ -%d,%d +%d,%d @@\n", h.PosX+1, h.EndX-h.PosX, h.PosY+1, h.EndY-h.PosY) for _, edit := range h.Edits { switch edit.Op { case diff.Match: fmt.Printf(" %s\n", edit.X) case diff.Delete: fmt.Printf("-%s\n", edit.X) case diff.Insert: fmt.Printf("+%s\n", edit.Y) default: panic("never reached") } } } }
Output: @@ -1,3 +1,6 @@ +this is a new paragraph +that is inserted at the top + this paragraph is not changed and @@ -5,7 +8,3 @@ enough to create a new hunk - -this paragraph -is going to be -removed
func HunksFunc ¶
HunksFunc compares the contents of x and y using the provided equality comparison and returns the changes necessary to convert from one to the other.
The output is a sequence of hunks that each describe a number of consecutive edits. Hunks include a number of matching elements before and after the last delete or insert operation. The number of elements can be configured using Context.
If x and y are identical, the output has length zero.
The following options are supported: diff.Context, diff.Optimal
Note that this function has generally worse performance than Hunks for diffs with many changes.
Important: The output is not guaranteed to be stable and may change with minor version upgrades. DO NOT rely on the output being stable.
type Option ¶
Option configures the behavior of comparison functions.
func Context ¶
Context sets the number of unchanged elements to include around each hunk. The default is 3.
Context anchors diffs in the surrounding context in addition to position information. For example, with Context(2), you'll see 2 unchanged elements before and after each group of changes.
Only supported by functions that return hunks.
Example ¶
package main import ( "fmt" "strings" "znkr.io/diff" ) func main() { x := strings.Fields("calm seas reflect the sky") y := strings.Fields("restless seas reflect the sky defiantly") hunks := diff.Hunks(x, y, diff.Context(1)) for i, h := range hunks { if i > 0 { fmt.Print(" … ") } for i, edit := range h.Edits { if i > 0 { fmt.Print(" ") } switch edit.Op { case diff.Match: fmt.Printf("%s", edit.X) case diff.Delete: fmt.Printf("[-%s-]", edit.X) case diff.Insert: fmt.Printf("{+%s+}", edit.Y) default: panic("never reached") } } } }
Output: [-calm-] {+restless+} seas … sky {+defiantly+}
func Fast ¶
func Fast() Option
Fast uses a heuristic to find a reasonable diff instead of trying to find a minimal diff.
This option trades diff minimality for runtime performance. The resulting diff can be a lot larger than the diff created by default. The speedup from using Fast only really manifests for relatively few, very large inputs because the default already use the underlying heuristic to speed up large inputs.
The heuristic only works for comparable types.
Performance impact: This option changes the complexity to O(N log N).
func Optimal ¶
func Optimal() Option
Optimal ensures the diff algorithm finds the shortest possible diff by disabling performance heuristics.
By default, the diff functions use heuristics to speed up computation for large inputs with many changes, which may produce slightly longer diffs. Use this option when you need the absolute shortest diff, at the cost of potentially slower performance.
Performance impact: Changes time complexity from O(N^1.5 log N) to O(ND) where N = len(x) + len(y) and D is the number of differences.
Directories
¶
Path | Synopsis |
---|---|
internal
|
|
byteview
Package byteview provides a mechanism to handle strings and []byte as immutable byte views.
|
Package byteview provides a mechanism to handle strings and []byte as immutable byte views. |
cmd/eval
command
eval provides a way to validate the diffing algorithm by applying the resulting diffs using the unix patch tool and checking that they produce the input again.
|
eval provides a way to validate the diffing algorithm by applying the resulting diffs using the unix patch tool and checking that they produce the input again. |
cmd/eval/internal/git
Package git provides a simplified git interface for reading a repository for evaluations
|
Package git provides a simplified git interface for reading a repository for evaluations |
cmd/gitdiff
command
gitdiff is a tool that can be used with git using GIT_EXTERNAL_DIFF.
|
gitdiff is a tool that can be used with git using GIT_EXTERNAL_DIFF. |
cmd/specializemyers
command
specializemyers is a bit of an abomination, it takes the myers implementation and generates a specialization for int.
|
specializemyers is a bit of an abomination, it takes the myers implementation and generates a specialization for int. |
config
Package config provides shared configuration mechanisms for packages this module.
|
Package config provides shared configuration mechanisms for packages this module. |
impl
Package impl contains an implementation of diff algorithms.
|
Package impl contains an implementation of diff algorithms. |
indentheuristic
Package indentheuristic is an implementation of the indentation heuristic by Michael Haggerty (https://github.com/mhagger/diff-slider-tools).
|
Package indentheuristic is an implementation of the indentation heuristic by Michael Haggerty (https://github.com/mhagger/diff-slider-tools). |
rvecs
Package rvecs contains functions to work with the result vectors, the internal representation that's used by the myers algorithm and is then translated to a user facing API.
|
Package rvecs contains functions to work with the result vectors, the internal representation that's used by the myers algorithm and is then translated to a user facing API. |
unixpatch
Package unixpatch provides a simple wrapper around the unix patch tool.
|
Package unixpatch provides a simple wrapper around the unix patch tool. |
Package textdiff provides functions to efficiently compare text line-by-line.
|
Package textdiff provides functions to efficiently compare text line-by-line. |