Documentation ¶
Overview ¶
specscore: feat-recordops/diff
Package recordops provides pure, dependency-free analytical helpers over collections of dalgo records.
The first and only capability in MVP is Diff (and its sibling DiffFunc) — a streaming, single-pass comparison of one baseline recordset against N candidate recordsets. Inputs are pull-based iter.Seq2 streams that MUST be sorted ascending by ID. Output is also iter.Seq2: one IDDiff per ID where at least one candidate diverges from baseline. Use WithIncludeMatched to emit fully matched IDs too.
The algorithm is a K-way merge over the N+1 input streams. Memory footprint at any point: O(N) records (one current per stream) plus the in-flight IDDiff being yielded.
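The merge described above can be sketched with plain sorted slices of integer IDs. This is only an illustration of the single-pass technique under simplifying assumptions (in-memory slices instead of iter.Seq2 streams, no field comparison); the names presence and mergeSorted are hypothetical and are not part of recordops.

```go
package main

import "fmt"

// presence reports, for one ID, which input streams contained it.
// Hypothetical type for this sketch; not part of recordops.
type presence struct {
	ID int
	In []bool // In[i] is true when stream i holds this ID
}

// mergeSorted walks k ascending-sorted ID slices in a single pass.
// It keeps only one cursor per stream, so the working state is O(k).
func mergeSorted(streams [][]int) []presence {
	cursors := make([]int, len(streams))
	var out []presence
	for {
		// Find the smallest ID currently at the head of any stream.
		minID, found := 0, false
		for i, s := range streams {
			if cursors[i] < len(s) && (!found || s[cursors[i]] < minID) {
				minID, found = s[cursors[i]], true
			}
		}
		if !found {
			return out // every stream is exhausted
		}
		// Advance every stream whose head equals the smallest ID.
		p := presence{ID: minID, In: make([]bool, len(streams))}
		for i, s := range streams {
			if cursors[i] < len(s) && s[cursors[i]] == minID {
				p.In[i] = true
				cursors[i]++
			}
		}
		out = append(out, p)
	}
}

func main() {
	baseline := []int{1, 2, 4}
	candidate := []int{2, 3, 4}
	for _, p := range mergeSorted([][]int{baseline, candidate}) {
		fmt.Println(p.ID, p.In)
	}
}
```

In this sketch stream 0 plays the role of baseline: an ID present only in stream 0 corresponds to Missing for the candidate, and an ID present only in stream 1 corresponds to Extra.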
Each IDDiff carries the baseline snapshot once (the single source of truth) and per-candidate deltas — never duplicates of baseline values across candidates.
Renderers translate the structured stream into output formats: RenderYAMLGitStyle (per-candidate git-diff style — the visual anchor that matches the source idea spec/ideas/recordops.md), RenderYAMLByID (cross-candidate divergence view), RenderYAML and RenderJSON (structured serialization).
Renderers consume the input stream exactly once; consumers that need multiple views must materialize it first. Note that slices.Collect accepts only iter.Seq, so an iter.Seq2 stream needs a small collecting loop or an equivalent helper.
Index ¶
- Variables
- func Diff[K cmp.Ordered](baseline RecordSeq[K], candidates []RecordSeq[K], opts ...Option) iter.Seq2[IDDiff[K], error]
- func DiffFunc[K comparable](baseline RecordSeq[K], candidates []RecordSeq[K], less func(a, b K) bool, ...) iter.Seq2[IDDiff[K], error]
- func RenderJSON[K comparable](diffs iter.Seq2[IDDiff[K], error], collectionName string) (string, error)
- func RenderYAML[K comparable](diffs iter.Seq2[IDDiff[K], error], collectionName string) (string, error)
- func RenderYAMLByID[K comparable](diffs iter.Seq2[IDDiff[K], error], collectionName string) (string, error)
- func RenderYAMLGitStyle[K comparable](diffs iter.Seq2[IDDiff[K], error], candidateIndex int, collectionName string) (string, error)
- type CandidateState
- type FieldValue
- type IDDiff
- type Option
- type RecordSeq
- type RecordSnapshot
- type RecordStatus
Variables ¶
var ErrDuplicateID = errors.New("recordops: duplicate ID in input stream")
ErrDuplicateID indicates an input stream yielded two records with the same ID. Within a single stream, IDs must be unique.
var ErrIncomparableField = errors.New("recordops: incomparable field")
ErrIncomparableField indicates field comparison via reflect.DeepEqual panicked (e.g., a func or chan field). The panic is recovered and surfaced as a stream error wrapping this sentinel.
var ErrInvalidArgument = errors.New("recordops: invalid argument")
ErrInvalidArgument indicates a programmer error in calling Diff/DiffFunc (e.g., nil less function passed to DiffFunc).
var ErrUnsortedInput = errors.New("recordops: input stream not sorted ascending by ID")
ErrUnsortedInput indicates an input stream yielded a record whose ID is not strictly greater than the previously yielded ID from the same stream. Diff requires ID-sorted input streams.
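A per-stream ordering check of the kind these two sentinels describe might look like the sketch below. The sentinels here are local stand-ins, checkAscending is a hypothetical helper, and mapping an equal ID to the duplicate error (rather than the unsorted one) is an assumption about how the two cases are told apart.

```go
package main

import (
	"cmp"
	"errors"
	"fmt"
)

// Local stand-ins for the package sentinels, so the sketch compiles alone.
var (
	errDuplicateID   = errors.New("duplicate ID in input stream")
	errUnsortedInput = errors.New("input stream not sorted ascending by ID")
)

// checkAscending reports the first ordering violation in ids, wrapping
// the sentinel so callers can still match it with errors.Is.
func checkAscending[K cmp.Ordered](ids []K) error {
	for i := 1; i < len(ids); i++ {
		switch {
		case ids[i] == ids[i-1]:
			return fmt.Errorf("position %d: %w", i, errDuplicateID)
		case ids[i] < ids[i-1]:
			return fmt.Errorf("position %d: %w", i, errUnsortedInput)
		}
	}
	return nil
}

func main() {
	err := checkAscending([]int{1, 2, 2})
	fmt.Println(err, errors.Is(err, errDuplicateID))

	err = checkAscending([]string{"b", "a"})
	fmt.Println(err, errors.Is(err, errUnsortedInput))

	fmt.Println(checkAscending([]int{1, 2, 3}))
}
```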
Functions ¶
func Diff ¶
func Diff[K cmp.Ordered]( baseline RecordSeq[K], candidates []RecordSeq[K], opts ...Option, ) iter.Seq2[IDDiff[K], error]
Diff compares baseline against candidates via K-way merge over ID-sorted streams. By default it yields one IDDiff per ID where at least one candidate diverges; with WithIncludeMatched it yields one IDDiff for every ID touched by any input.
Inputs MUST be sorted ascending by ID. Monotonicity is validated per stream; violations terminate with ErrUnsortedInput. Duplicate IDs within a stream terminate with ErrDuplicateID. Upstream stream errors propagate verbatim.
Diff requires K to be cmp.Ordered (string/int/float/etc.). For types that are comparable but not orderable (e.g., [16]byte UUIDs), use DiffFunc with an explicit less function.
See the package overview for the K-way merge model and memory footprint, and spec/features/recordops/diff for the full contract.
Renderers consume the returned stream once; consumers that need multiple views must materialize it first with a slices.Collect-style helper.
func DiffFunc ¶
func DiffFunc[K comparable]( baseline RecordSeq[K], candidates []RecordSeq[K], less func(a, b K) bool, opts ...Option, ) iter.Seq2[IDDiff[K], error]
DiffFunc is Diff for any comparable K, with a caller-supplied strict weak order. For UUID-keyed records typed as [16]byte, pass func(a, b [16]byte) bool { return bytes.Compare(a[:], b[:]) < 0 } as less.
less MUST be a strict weak order (irreflexive, asymmetric, transitive, with transitive incomparability). If less is nil, the returned stream yields exactly one (zero, ErrInvalidArgument) pair and stops.
Example ¶
ExampleDiffFunc shows the canonical DiffFunc use case: comparing two recordsets keyed by [16]byte UUIDs, where ID ordering is provided by bytes.Compare instead of cmp.Ordered. The baseline has u1 only; the candidate has u2 only — one Missing emission and one Extra emission.
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"

	"github.com/dal-go/dalgo/dal"
	"github.com/dal-go/dalgo/record"
	"github.com/dal-go/dalgo/recordops"
)

func main() {
	type uuid = [16]byte
	u1 := uuid{0x01}
	u2 := uuid{0x02}
	mk := func(id uuid) record.WithID[uuid] {
		key := dal.NewKeyWithID("Users", hex.EncodeToString(id[:]))
		r := dal.NewRecordWithData(key, map[string]any{"name": "alice"})
		r.SetError(nil)
		return record.WithID[uuid]{ID: id, Record: r}
	}
	// Inputs MUST be sorted ascending by ID.
	baseline := recordops.SliceToSeq([]record.WithID[uuid]{mk(u1)})
	cand := recordops.SliceToSeq([]record.WithID[uuid]{mk(u2)})
	less := func(a, b uuid) bool { return bytes.Compare(a[:], b[:]) < 0 }
	for d, err := range recordops.DiffFunc[uuid](
		baseline,
		[]recordops.RecordSeq[uuid]{cand},
		less,
	) {
		if err != nil {
			fmt.Println("err:", err)
			return
		}
		fmt.Printf("id=%s status=%d\n", hex.EncodeToString(d.ID[:]), d.Candidates[0].Status)
	}
}

Output:
id=01000000000000000000000000000000 status=0
id=02000000000000000000000000000000 status=1
func RenderJSON ¶
func RenderJSON[K comparable]( diffs iter.Seq2[IDDiff[K], error], collectionName string, ) (string, error)
RenderJSON serializes the entire diff stream as a JSON document with a single top-level key matching collectionName, whose value is the array of IDDiff entries in stream order.
The diff stream is consumed exactly once. If the stream yields an error, RenderJSON returns ("", err) verbatim. If nothing was emitted, the output is a one-key object with an empty array, e.g. {"users":[]}.
Output is deterministic for a given input: encoding/json marshals slices in order and the wrapper map has a single key. Indented for diffability.
RecordStatus is serialized as its numeric int8 value (0=Missing, 1=Extra, 2=Matched, 3=Changed). Consumers needing the string form should map the int themselves; this keeps the renderer faithful to the wire types.
FieldValue's `absent` flag round-trips natively via the json struct tag, preserving the Absent vs. nil-value distinction.
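The Absent versus nil-value distinction can be seen directly with encoding/json. The sketch below uses a local mirror of the documented FieldValue struct tags (named fieldValue here) so that it runs without the recordops package; it shows only the per-field encoding, not the full RenderJSON document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fieldValue mirrors the documented FieldValue struct tags. It is a local
// copy so the sketch does not depend on the recordops package.
type fieldValue struct {
	Name   string `json:"name"`
	Value  any    `json:"value,omitempty"`
	Absent bool   `json:"absent,omitempty"`
}

// encode marshals one field and returns the JSON text.
func encode(fv fieldValue) string {
	b, err := json.Marshal(fv)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Field absent from the candidate: the absent flag is emitted.
	fmt.Println(encode(fieldValue{Name: "email", Absent: true}))
	// Field present with a nil value: omitempty drops both value and absent.
	fmt.Println(encode(fieldValue{Name: "email", Value: nil}))
	// Field present with a real value.
	fmt.Println(encode(fieldValue{Name: "email", Value: "a@example.com"}))
}
```

With these tags a nil value is encoded by omitting the value key rather than by writing null, so the two cases differ only in whether "absent": true appears.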
func RenderYAML ¶
func RenderYAML[K comparable]( diffs iter.Seq2[IDDiff[K], error], collectionName string, ) (string, error)
RenderYAML serializes the entire diff stream as a YAML document with a single top-level key matching collectionName, whose value is the sequence of IDDiff entries in stream order.
The diff stream is consumed exactly once. If the stream yields an error, RenderYAML returns ("", err) verbatim. If nothing was emitted, the output is a one-key mapping with an empty sequence.
Output is deterministic for a given input: yaml.v3 marshals slices in order and the wrapper map has a single key.
RecordStatus is serialized as its numeric int8 value (0=Missing, 1=Extra, 2=Matched, 3=Changed). Consumers needing the string form should map the int themselves; this keeps the renderer faithful to the wire types.
FieldValue's `absent` flag round-trips natively via the yaml struct tag, preserving the Absent vs. nil-value distinction.
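Since both RenderJSON and RenderYAML emit RecordStatus numerically, a consumer that wants labels has to map them. A minimal sketch of such a mapping, using the numeric values documented above, follows; statusName is a hypothetical consumer-side helper, not a recordops function.

```go
package main

import "fmt"

// statusName maps the documented numeric RecordStatus values to labels.
// Hypothetical consumer-side helper; not part of recordops.
func statusName(status int8) string {
	switch status {
	case 0:
		return "Missing"
	case 1:
		return "Extra"
	case 2:
		return "Matched"
	case 3:
		return "Changed"
	default:
		return fmt.Sprintf("Unknown(%d)", status)
	}
}

func main() {
	for _, s := range []int8{0, 1, 2, 3, 9} {
		fmt.Println(s, statusName(s))
	}
}
```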
func RenderYAMLByID ¶
func RenderYAMLByID[K comparable]( diffs iter.Seq2[IDDiff[K], error], collectionName string, ) (string, error)
RenderYAMLByID emits the cross-candidate divergence view — one block per emitted IDDiff in the stream, showing baseline (if present) and each candidate (in index order) with its status and any deltas.
The top-level YAML container is keyed by collectionName. Each ID maps to a block with an optional baseline section and a candidates section keyed by stringified integer index ("0", "1", ...).
Per-candidate field encoding:
- Changed candidates emit a fields map. A normal value delta renders as {new: <value>}. A field absent from the candidate (FieldValue.Absent == true) renders as {absent: true} — structurally distinct from a real nil value, which renders as YAML null inside {new: null}.
The renderer consumes the stream ONCE. Callers wanting multiple renders of the same Diff result must materialize first. If the stream yields a (zero, err) pair, RenderYAMLByID returns ("", err).
Output is valid YAML and is deterministic for a given input stream. Empty streams still emit a valid empty mapping: "<collectionName>: {}\n".
func RenderYAMLGitStyle ¶
func RenderYAMLGitStyle[K comparable]( diffs iter.Seq2[IDDiff[K], error], candidateIndex int, collectionName string, ) (string, error)
RenderYAMLGitStyle renders a single candidate's diff view as a YAML-shaped string with git-diff markers ("- " for missing IDs, "+ " for extra IDs, and per-field "- " / "+ " lines for changed records).
The diff stream is consumed exactly once. Callers needing multi-view rendering must materialize the stream first.
Matched candidates and candidateIndex values outside the [0, len(Candidates)) range are silently skipped. If the stream yields an error, RenderYAMLGitStyle returns ("", err) verbatim. If nothing was emitted, the output is "<collectionName>: {}\n" — an explicit empty collection.
Types ¶
type CandidateState ¶
type CandidateState struct {
Status RecordStatus
Fields []FieldValue
}
CandidateState describes one candidate's state for one ID:
- Status: Missing | Extra | Matched | Changed
- Fields: deltas only — never duplicates baseline values. See the per-Status semantics in the Feature spec (spec/features/recordops/diff/ REQ id-diff-shape).
type FieldValue ¶
type FieldValue struct {
Name string `json:"name" yaml:"name"`
Value any `json:"value,omitempty" yaml:"value,omitempty"`
Absent bool `json:"absent,omitempty" yaml:"absent,omitempty"`
}
FieldValue is used in BOTH RecordSnapshot.Fields and CandidateState.Fields. In RecordSnapshot.Fields, Value is the baseline's value for Name; Absent is always false. In CandidateState.Fields, Value is the candidate's value (only for Extra and Changed statuses; Missing and Matched have Fields == nil). When a field exists in baseline but is absent from a Changed candidate's record, Absent is true and Value is the zero value — consumers MUST NOT interpret Value when Absent is true. This is structurally distinct from Value == nil with Absent == false (a real Go-nil value the candidate explicitly holds).
Name may be empty for future helpers that ingest positional/unnamed-column records; MVP comparison paths always produce non-empty Name.
type IDDiff ¶
type IDDiff[K comparable] struct {
	ID         K
	Baseline   *RecordSnapshot
	Candidates []CandidateState
}
IDDiff is the per-ID emission of Diff/DiffFunc. It carries the baseline snapshot (if baseline had this ID) and each candidate's state for this ID, in parallel-index order with the input candidates slice — Candidates[i] always describes input candidates[i].
type Option ¶
type Option func(*options)
Option configures Diff/DiffFunc behavior. The package exports four orthogonal options:
- WithIgnoreFields(names...) — exclude named fields from comparison.
- WithIncludeMatched() — emit IDDiff for every ID, including fully matched.
- WithOnlyChangedFields() — trim Baseline.Fields to only fields with deltas.
- WithAbsentEqualsNil() — treat field-absent as equivalent to field-with-nil-value during comparison.
func WithAbsentEqualsNil ¶
func WithAbsentEqualsNil() Option
WithAbsentEqualsNil instructs Diff to treat "field absent from a record" as equivalent to "field present with nil value" during comparison. Default is to distinguish the two via FieldValue.Absent. Use this when the dataset is sourced from heterogeneous backends where one stores "no value" as an absent column and another stores it as NULL.
When set: a baseline field with nil value and a candidate that lacks the field (or vice versa) produces no delta. Records whose differences all reduce to absent-vs-nil report Status == Matched.
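The difference between the default and the absent-equals-nil behavior can be illustrated on map-backed records. This is a sketch of the comparison rule only, assuming records shaped as map[string]any; fieldsEqual is a hypothetical helper and says nothing about how recordops implements the option internally.

```go
package main

import (
	"fmt"
	"reflect"
)

// fieldsEqual compares one named field across two map-backed records.
// With absentEqualsNil set, a missing key and a nil value are treated alike.
func fieldsEqual(base, cand map[string]any, name string, absentEqualsNil bool) bool {
	bv, bok := base[name]
	cv, cok := cand[name]
	if absentEqualsNil {
		// A missing key reads as a nil interface, so presence no longer matters.
		return reflect.DeepEqual(bv, cv)
	}
	// Default: presence must agree as well as the value.
	return bok == cok && reflect.DeepEqual(bv, cv)
}

func main() {
	base := map[string]any{"name": "alice", "nick": nil}
	cand := map[string]any{"name": "alice"} // "nick" is absent

	fmt.Println(fieldsEqual(base, cand, "nick", false)) // absent differs from nil
	fmt.Println(fieldsEqual(base, cand, "nick", true))  // absent equals nil
}
```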
func WithIgnoreFields ¶
WithIgnoreFields instructs Diff to omit named fields from comparison. Matching is by Go struct field name (when Record.Data() returns a struct) or by map key (when Record.Data() returns a map[string]any). Case-sensitive. Multiple calls compose additively. Unknown names are silently ignored.
Canonical use case: WithIgnoreFields("UpdatedAt") drops a timestamp field that always changes between snapshots.
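The effect of ignoring fields can be sketched on map-backed records as follows. changedKeys is a hypothetical helper written for this illustration; it mirrors the documented rules (case-sensitive names, unknown names silently ignored) but is not the package's implementation.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// changedKeys lists the keys whose presence or value differs between two
// map-backed records, skipping any key named in ignore (case-sensitive).
func changedKeys(base, cand map[string]any, ignore ...string) []string {
	skip := make(map[string]bool, len(ignore))
	for _, name := range ignore {
		skip[name] = true
	}
	seen := make(map[string]bool)
	var changed []string
	check := func(name string) {
		if skip[name] || seen[name] {
			return
		}
		seen[name] = true
		bv, bok := base[name]
		cv, cok := cand[name]
		if bok != cok || !reflect.DeepEqual(bv, cv) {
			changed = append(changed, name)
		}
	}
	for name := range base {
		check(name)
	}
	for name := range cand {
		check(name)
	}
	sort.Strings(changed) // map iteration order is random; sort for stable output
	return changed
}

func main() {
	base := map[string]any{"name": "alice", "UpdatedAt": 1}
	cand := map[string]any{"name": "alice", "UpdatedAt": 2}

	fmt.Println(changedKeys(base, cand))
	fmt.Println(changedKeys(base, cand, "UpdatedAt"))
	fmt.Println(changedKeys(base, cand, "updatedat")) // wrong case: not ignored
}
```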
func WithIncludeMatched ¶
func WithIncludeMatched() Option
WithIncludeMatched instructs Diff to emit IDDiff for every ID touched by any input — including IDs where every candidate is Matched. Default is to skip those.
func WithOnlyChangedFields ¶
func WithOnlyChangedFields() Option
WithOnlyChangedFields trims IDDiff.Baseline.Fields to only the fields that have a delta on at least one candidate. Default is to populate the full baseline record snapshot for context.
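The trimming rule can be pictured as a filter over the baseline fields. The sketch below uses a simplified local fieldValue type and a hypothetical trimToChanged helper; it illustrates the documented outcome rather than the package's code.

```go
package main

import "fmt"

// fieldValue is a simplified local stand-in for the documented FieldValue.
type fieldValue struct {
	Name  string
	Value any
}

// trimToChanged keeps only the baseline fields that appear as a delta on
// at least one candidate, preserving baseline order.
func trimToChanged(baseline []fieldValue, candidateDeltas [][]fieldValue) []fieldValue {
	changed := make(map[string]bool)
	for _, deltas := range candidateDeltas {
		for _, d := range deltas {
			changed[d.Name] = true
		}
	}
	var out []fieldValue
	for _, f := range baseline {
		if changed[f.Name] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	baseline := []fieldValue{{"name", "alice"}, {"email", "a@example.com"}, {"age", 30}}
	deltas := [][]fieldValue{
		{{"email", "alice@example.com"}}, // candidate 0 changed email
		nil,                              // candidate 1 matched
	}
	fmt.Println(trimToChanged(baseline, deltas))
}
```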
type RecordSeq ¶
RecordSeq is the streaming input shape for Diff and DiffFunc. Implementations MUST yield records sorted ascending by ID and MUST propagate any source error as a (zero, err) pair (after which iteration stops).
func ReaderToSeq ¶
func ReaderToSeq[K comparable](r dal.RecordsReader, idOf func(dal.Record) (K, error)) RecordSeq[K]
ReaderToSeq adapts a dalgo dal.RecordsReader to a RecordSeq. idOf extracts the ID from each dal.Record yielded by the reader. Reader errors propagate via the seq2 error channel.
The underlying reader is Closed exactly once when iteration ends — whether by exhausting records (dal.ErrNoMoreRecords), by the consumer breaking out of the range loop early, or by any upstream stream error.
dal.Reader.Cursor() is NOT surfaced through this bridge in MVP; callers needing pagination must drive the reader directly. See spec/ideas/dal-records-reader-iter-seq.md.
func SliceToSeq ¶
func SliceToSeq[K comparable](records []record.WithID[K]) RecordSeq[K]
SliceToSeq turns an already-sorted slice into a RecordSeq. The slice MUST be sorted ascending by ID; SliceToSeq does NOT sort. A nil or empty slice produces a stream that yields zero items.
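Because the slice is not sorted for the caller, unsorted data has to be ordered first. A minimal sketch of that preparation step is shown below with a hypothetical rec type standing in for record.WithID; the call to SliceToSeq itself is left out so the example runs with the standard library alone.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// rec is a hypothetical stand-in for a record keyed by a string ID.
type rec struct {
	ID   string
	Name string
}

// sortedByID returns a copy of records ordered ascending by ID, leaving
// the caller's slice untouched.
func sortedByID(records []rec) []rec {
	out := slices.Clone(records)
	slices.SortFunc(out, func(a, b rec) int { return cmp.Compare(a.ID, b.ID) })
	return out
}

func main() {
	unsorted := []rec{{"u3", "carol"}, {"u1", "alice"}, {"u2", "bob"}}
	for _, r := range sortedByID(unsorted) {
		fmt.Println(r.ID, r.Name)
	}
}
```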
type RecordSnapshot ¶
type RecordSnapshot struct {
Fields []FieldValue
}
RecordSnapshot is baseline's record contents for a given ID — the single source of truth for field values. Candidates carry only deltas; consumers reading "the old value for a changed field" look it up here by Name.
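Looking up the old value for a changed field might be done along the lines of the sketch below. It uses a local fieldValue type that mirrors the documented FieldValue fields; oldValue and describeDelta are hypothetical consumer-side helpers.

```go
package main

import "fmt"

// fieldValue mirrors the documented FieldValue fields for this sketch.
type fieldValue struct {
	Name   string
	Value  any
	Absent bool
}

// oldValue finds the baseline value for a field name, if the baseline has it.
func oldValue(baseline []fieldValue, name string) (any, bool) {
	for _, f := range baseline {
		if f.Name == name {
			return f.Value, true
		}
	}
	return nil, false
}

// describeDelta renders one candidate delta next to its baseline value.
// Value is deliberately not read when Absent is true.
func describeDelta(baseline []fieldValue, delta fieldValue) string {
	oldText := "<no baseline value>"
	if old, ok := oldValue(baseline, delta.Name); ok {
		oldText = fmt.Sprintf("%v", old)
	}
	if delta.Absent {
		return fmt.Sprintf("%s: %s -> <absent>", delta.Name, oldText)
	}
	return fmt.Sprintf("%s: %s -> %v", delta.Name, oldText, delta.Value)
}

func main() {
	baseline := []fieldValue{{Name: "name", Value: "alice"}}
	fmt.Println(describeDelta(baseline, fieldValue{Name: "name", Value: "bob"}))
	fmt.Println(describeDelta(baseline, fieldValue{Name: "name", Absent: true}))
	fmt.Println(describeDelta(baseline, fieldValue{Name: "name", Value: nil}))
}
```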
type RecordStatus ¶
type RecordStatus int8
RecordStatus classifies one candidate's relationship to baseline for one ID.
const (
	// Missing: baseline has this ID; this candidate doesn't.
	Missing RecordStatus = iota
	// Extra: this candidate has this ID; baseline doesn't.
	Extra
	// Matched: both have the ID; all fields equal.
	Matched
	// Changed: both have the ID; at least one field differs.
	Changed
)