gedcomgo

package module
v1.2.0
Published: Feb 9, 2026 License: MIT Imports: 6 Imported by: 0

README

gedcom-go

A pure Go library for parsing and validating GEDCOM (GEnealogical Data COMmunication) files.

Features

  • Multi-version Support: Parse and write GEDCOM 5.5, 5.5.1, and 7.0 with automatic version detection
  • Version Conversion: Bidirectional conversion between versions with transformation tracking
  • Historical Calendar Support: Parse dates in Julian, Hebrew, and French Republican calendars with conversion
  • Streaming APIs: Memory-efficient parsing and encoding for very large files (1M+ records)
  • Comprehensive Validation: Date logic, orphaned references, duplicates, and quality reports
  • Vendor Extensions: Parse Ancestry.com and FamilySearch custom tags
  • Zero Dependencies: Uses only the Go standard library
  • Well-tested: 93% test coverage with multi-platform CI

See FEATURES.md for the complete feature list including all supported record types, events, attributes, and encoding details.
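
The calendar conversion listed among the features can be illustrated with Julian Day Numbers (JDN). This is a minimal sketch of the standard integer formulas, not the library's implementation: julianToJDN and jdnToGregorian are hypothetical names, and the arithmetic assumes common-era dates so that every intermediate value stays non-negative.

```go
package main

import "fmt"

// julianToJDN converts a Julian-calendar date to a Julian Day Number.
// Hypothetical helper for illustration only.
func julianToJDN(year, month, day int) int {
	a := (14 - month) / 12 // 1 for January and February, else 0
	y := year + 4800 - a
	m := month + 12*a - 3 // March becomes month 0
	return day + (153*m+2)/5 + 365*y + y/4 - 32083
}

// jdnToGregorian converts a Julian Day Number to a Gregorian date.
// Hypothetical helper for illustration only.
func jdnToGregorian(jdn int) (year, month, day int) {
	a := jdn + 32044
	b := (4*a + 3) / 146097 // completed 400-year cycles
	c := a - 146097*b/4
	d := (4*c + 3) / 1461 // completed years within the cycle
	e := c - 1461*d/4
	m := (5*e + 2) / 153
	day = e - (153*m+2)/5 + 1
	month = m + 3 - 12*(m/10)
	year = 100*b + d - 4800 + m/10
	return year, month, day
}

func main() {
	// Julian 5 Oct 1582 is the same day as Gregorian 15 Oct 1582.
	y, m, d := jdnToGregorian(julianToJDN(1582, 10, 5))
	fmt.Printf("%04d-%02d-%02d\n", y, m, d) // prints 1582-10-15
}
```

Going through a day number keeps each calendar's rules in one function, so adding another calendar only needs a conversion to and from JDN.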

Compatibility

Support status for common genealogy software:

Software             Status
RootsMagic           ⚠️ Tested (older version)
Legacy Family Tree   ⚠️ Tested (older version)
Family Tree Maker    ⚠️ Tested (older version)
Gramps               🧪 Synthetic test only
Ancestry             🧪 Synthetic test only

Full compatibility matrix: docs/COMPATIBILITY.md

GEDCOM Specification: Full support for 5.5, 5.5.1, and 7.0

Installation

go get github.com/cacack/gedcom-go

Requirements

  • Go 1.24 or later

This library tracks Go's release policy, supporting the two most recent major versions. When a Go version reaches end-of-life and no longer receives security patches, we bump our minimum accordingly.

Quick Start

The library provides a simple, single-import API for common operations. Import with an alias for cleaner code:

import gedcomgo "github.com/cacack/gedcom-go"

Parse a GEDCOM File

package main

import (
    "fmt"
    "log"
    "os"

    gedcomgo "github.com/cacack/gedcom-go"
)

func main() {
    f, err := os.Open("family.ged")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    doc, err := gedcomgo.Decode(f)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("GEDCOM Version: %s\n", doc.Header.Version)
    fmt.Printf("Individuals: %d\n", len(doc.Individuals()))
    fmt.Printf("Families: %d\n", len(doc.Families()))
}

Validate a Document

// Basic validation (returns []error).
// Named errs so it does not shadow the standard "errors" package.
errs := gedcomgo.Validate(doc)

// Comprehensive validation with severity levels (returns []Issue)
issues := gedcomgo.ValidateAll(doc)
for _, issue := range issues {
    fmt.Printf("[%s] %s\n", issue.Severity, issue.Message)
}

Write a GEDCOM File

f, err := os.Create("output.ged")
if err != nil {
    log.Fatal(err)
}
defer f.Close()

err = gedcomgo.Encode(f, doc) // check err before relying on the output

Convert Between Versions

// Convert to GEDCOM 7.0
converted, report, err := gedcomgo.Convert(doc, gedcomgo.Version70)
if err != nil {
    log.Fatal(err) // check err before using converted or report
}
if report.HasDataLoss() {
    for _, item := range report.DataLoss {
        fmt.Printf("Lost: %s - %s\n", item.Feature, item.Reason)
    }
}

Working with Records

// Find and display individuals
for _, individual := range doc.Individuals() {
    if len(individual.Names) > 0 {
        fmt.Printf("Name: %s\n", individual.Names[0].Full)
    }

    // Access events
    for _, event := range individual.Events {
        fmt.Printf("  %s: %s\n", event.Tag, event.Date)
    }
}

// O(1) lookup by cross-reference ID
person := doc.GetIndividual("@I1@")
if person != nil && len(person.Names) > 0 {
    fmt.Printf("Found: %s\n", person.Names[0].Full)
}

// Navigate family relationships
family := doc.GetFamily("@F1@")
if family != nil {
    husband := doc.GetIndividual(family.Husband)
    wife := doc.GetIndividual(family.Wife)
    // Check each spouse for nil before use, as with person above
    fmt.Println(husband != nil, wife != nil)
}
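
The O(1) lookup above is typically achieved by indexing records by cross-reference ID when the document is built. The sketch below shows that idea with a plain map. The person type and buildIndex function are hypothetical stand-ins, not the library's types, and how gedcom-go maintains its index internally is not stated in this README.

```go
package main

import "fmt"

// person is a hypothetical stand-in for a parsed individual record.
type person struct {
	XRef string
	Name string
}

// buildIndex maps each cross-reference ID to its record, so a later
// lookup is one map access instead of a scan over the slice.
func buildIndex(people []*person) map[string]*person {
	idx := make(map[string]*person, len(people))
	for _, p := range people {
		idx[p.XRef] = p
	}
	return idx
}

func main() {
	people := []*person{
		{XRef: "@I1@", Name: "John /Smith/"},
		{XRef: "@I2@", Name: "Jane /Doe/"},
	}
	idx := buildIndex(people)

	// A missing ID yields ok == false rather than a panic.
	if p, ok := idx["@I2@"]; ok {
		fmt.Println("Found:", p.Name)
	}
}
```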

Parse with Diagnostics

Parse GEDCOM files that contain errors while still extracting as much valid data as possible:

result, err := gedcomgo.DecodeWithDiagnostics(f)
if err != nil {
    log.Fatal(err) // Fatal I/O error
}

// Check for parse issues
if result.Diagnostics.HasErrors() {
    fmt.Printf("Found %d errors\n", len(result.Diagnostics.Errors()))
    for _, d := range result.Diagnostics {
        fmt.Printf("  Line %d: %s\n", d.Line, d.Message)
    }
}

// Use the partial document
doc := result.Document
fmt.Printf("Parsed %d individuals\n", len(doc.Individuals()))
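
To make the lenient behaviour concrete, here is a minimal sketch of a parse loop that records a diagnostic for each bad line and keeps going instead of returning at the first error. The diagnostic type and parseLenient function are hypothetical, and the only rules checked (a non-negative integer level followed by a tag) are a simplification of what a real decoder validates.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// diagnostic is a hypothetical record of one recoverable problem.
type diagnostic struct {
	Line    int
	Message string
}

// parseLenient counts well-formed lines and collects a diagnostic for
// every malformed one, rather than stopping at the first failure.
func parseLenient(input string) (valid int, diags []diagnostic) {
	sc := bufio.NewScanner(strings.NewReader(input))
	for n := 1; sc.Scan(); n++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue // blank lines are skipped silently in this sketch
		}
		level, rest, _ := strings.Cut(line, " ")
		if v, err := strconv.Atoi(level); err != nil || v < 0 {
			diags = append(diags, diagnostic{n, "invalid level " + strconv.Quote(level)})
			continue
		}
		if rest == "" {
			diags = append(diags, diagnostic{n, "missing tag"})
			continue
		}
		valid++
	}
	return valid, diags
}

func main() {
	valid, diags := parseLenient("0 HEAD\n1 GEDC\nBAD LINE\n0 TRLR\n")
	fmt.Printf("%d valid lines, %d diagnostics\n", valid, len(diags))
	for _, d := range diags {
		fmt.Printf("  Line %d: %s\n", d.Line, d.Message)
	}
}
```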

Documentation

Advanced Usage

For advanced use cases requiring custom options, import the underlying packages directly:

import (
    "github.com/cacack/gedcom-go/decoder"
    "github.com/cacack/gedcom-go/encoder"
    "github.com/cacack/gedcom-go/validator"
    "github.com/cacack/gedcom-go/converter"
)

Custom Decode Options

// Decode with progress reporting and timeout.
// Needs the "context" and "time" imports; fileInfo and reader are assumed
// to come from an already opened file.
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

opts := &decoder.DecodeOptions{
    Context:   ctx,
    TotalSize: fileInfo.Size(),
    OnProgress: func(bytesRead, totalBytes int64) {
        if totalBytes > 0 { // skip the percentage when the size is unknown
            fmt.Printf("\rProgress: %d%%", bytesRead*100/totalBytes)
        }
    },
}
doc, err := decoder.DecodeWithOptions(reader, opts)

Custom Validation Configuration

// Configure validation strictness and duplicate detection
config := &validator.ValidatorConfig{
    Strictness: validator.StrictnessStrict,
    Duplicates: &validator.DuplicateConfig{
        RequireExactSurname: true,
        MinNameSimilarity:   0.8,
    },
}
v := validator.NewWithConfig(config)
issues := v.ValidateAll(doc)
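
A threshold such as MinNameSimilarity: 0.8 implies some score between 0 and 1 for a pair of names. The README does not say which metric the validator uses, so the sketch below assumes a normalized Levenshtein distance purely for illustration. Both levenshtein and similarity are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between two rune slices.
func levenshtein(a, b []rune) int {
	prev := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		cur := make([]int, len(b)+1)
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(b)]
}

// similarity returns 1 for identical names (ignoring case) and moves
// toward 0 as more edits are needed. Assumed metric, not the library's.
func similarity(a, b string) float64 {
	ra, rb := []rune(strings.ToLower(a)), []rune(strings.ToLower(b))
	longest := len(ra)
	if len(rb) > longest {
		longest = len(rb)
	}
	if longest == 0 {
		return 1
	}
	return 1 - float64(levenshtein(ra, rb))/float64(longest)
}

func main() {
	for _, pair := range [][2]string{{"Smith", "Smyth"}, {"Smith", "Jones"}} {
		fmt.Printf("%s vs %s: %.2f\n", pair[0], pair[1], similarity(pair[0], pair[1]))
	}
}
```

Under this assumed metric a higher threshold flags fewer, closer matches, which is the trade-off a setting like MinNameSimilarity controls.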

Custom Encoder Options

// Encode with custom line endings and line length
opts := &encoder.EncodeOptions{
    LineEnding:    encoder.LineEndingLF,
    MaxLineLength: 255,
}
err := encoder.EncodeWithOptions(writer, doc, opts)
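
Custom Conversion Options

// Convert with strict data loss checking.
// gedcom.Version55 also needs the "github.com/cacack/gedcom-go/gedcom" import.
opts := &converter.ConvertOptions{
    Validate:       true,
    StrictDataLoss: true,  // Fail on any data loss
}
converted, report, err := converter.ConvertWithOptions(doc, gedcom.Version55, opts)
if err != nil {
    log.Fatal(err) // in strict mode, data loss is reported as an error
}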

Packages

For fine-grained control, these packages are available:

  • charset - Character encoding utilities with UTF-8 validation
  • converter - Version conversion with transformation tracking
  • decoder - High-level GEDCOM decoding with automatic version detection
  • encoder - GEDCOM document writing with configurable line endings
  • gedcom - Core data types (Document, Individual, Family, Source, etc.)
  • parser - Low-level line parsing with detailed error reporting
  • validator - Document validation with error categorization
  • version - GEDCOM version detection (header and heuristic-based)
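
As background for the parser package, a GEDCOM line has the shape `LEVEL [XREF] TAG [VALUE]`. The sketch below splits one line into those parts using only the standard library. The line type and parseLine function here are hypothetical and simplified. They are not the parser package's API, and they skip details such as escape handling.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// line is a hypothetical representation of one GEDCOM line.
type line struct {
	Level int
	XRef  string // optional, e.g. "@I1@"
	Tag   string
	Value string // optional
}

// parseLine splits "LEVEL [XREF] TAG [VALUE]" into its parts.
func parseLine(s string) (line, error) {
	s = strings.TrimRight(s, "\r\n")
	levelStr, rest, ok := strings.Cut(s, " ")
	if !ok {
		return line{}, fmt.Errorf("missing tag in %q", s)
	}
	level, err := strconv.Atoi(levelStr)
	if err != nil || level < 0 {
		return line{}, fmt.Errorf("invalid level %q", levelStr)
	}
	out := line{Level: level}
	if strings.HasPrefix(rest, "@") {
		// A leading @...@ token is the record's cross-reference ID.
		xref, after, found := strings.Cut(rest, " ")
		if !found {
			return line{}, fmt.Errorf("missing tag after xref in %q", s)
		}
		out.XRef, rest = xref, after
	}
	out.Tag, out.Value, _ = strings.Cut(rest, " ")
	if out.Tag == "" {
		return line{}, fmt.Errorf("empty tag in %q", s)
	}
	return out, nil
}

func main() {
	for _, s := range []string{"0 @I1@ INDI", "1 NAME John /Smith/", "x NAME"} {
		l, err := parseLine(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", l)
	}
}
```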

API Stability

This library follows Semantic Versioning. We do not break exported types in v1+ without a major version bump.

Stable Packages

Package     Key APIs
gedcom      Document, Individual, Family, Event, Date
decoder     Decode(), DecodeWithOptions()
encoder     Encode(), EncodeWithOptions()
converter   Convert(), ConvertWithOptions()
parser      Parse(), ParseLine()
validator   Validate(), ValidateAll()
charset     NewReader()
version     Detect()

What May Change

  • Experimental features (streaming APIs, duplicate detection) may evolve in minor versions

GEDCOM Spec Evolution

As GEDCOM 7.x evolves, we add support additively. New tags and structures are added without breaking existing code.

Vendor Extensions

Vendor extensions (Ancestry, FamilySearch) are best-effort and not covered by stability guarantees.

For the complete policy including deprecation process, see docs/API_STABILITY.md.

Development

Quick Start with Makefile

The project includes a Makefile for common development tasks:

# Show all available commands
make help

# Run all checks and build
make all

# Run tests
make test

# Run tests with coverage (93% coverage)
make test-coverage

# Generate HTML coverage report
make coverage-html

# Run benchmarks
make bench

# Format code
make fmt

# Run linters
make vet
make lint

# Run pre-commit checks
make pre-commit

# Clean build artifacts
make clean

Manual Commands

You can also use Go commands directly:

# Run all tests
go test ./...

# Run tests with coverage
go test -cover ./...

# Run benchmarks
go test -bench=. ./...

# Download dependencies
go mod download

# Build all packages
go build ./...

# Format code
go fmt ./...

# Run static analysis
go vet ./...

Performance

The library is designed for high performance with efficient memory usage:

  • Parser: 66ns/op for simple lines, ~700μs for 1000 individuals
  • Decoder: 13ms for 1000 individuals with full document structure
  • Encoder: 1.15ms for 1000 individuals
  • Validator: 5.91μs for 1000 individuals, zero allocations for valid documents

Benchmarking

# Run all benchmarks
make bench

# Run specific package benchmarks
make bench-parse
make bench-decode
make bench-encode

# Save baseline for comparison
make bench-save

# Compare current performance with baseline
make bench-compare

Performance Regression Testing

Automated regression detection with 10% threshold:

# Run regression tests
make perf-regression
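
The 10% threshold amounts to a relative comparison between a saved baseline and the current measurement. The sketch below shows that arithmetic. The regressed function is hypothetical, and what the make target actually runs is not described here.

```go
package main

import "fmt"

// regressed reports whether current is slower than baseline by more
// than threshold, where threshold 0.10 means 10%. Both timings must be
// in the same unit, for example ns/op.
func regressed(baseline, current, threshold float64) bool {
	if baseline <= 0 {
		return false // no usable baseline to compare against
	}
	return (current-baseline)/baseline > threshold
}

func main() {
	// Using the 13ms decoder figure quoted above as the baseline.
	fmt.Println(regressed(13.0, 13.9, 0.10)) // about 7% slower: within threshold
	fmt.Println(regressed(13.0, 15.0, 0.10)) // about 15% slower: flagged
}
```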

For detailed performance metrics, profiling guides, and optimization opportunities, see docs/PERFORMANCE.md.

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please ensure:

  • All tests pass (go test ./...)
  • Code coverage remains ≥85%
  • Code is formatted (go fmt ./...)
  • No linter warnings (go vet ./...)

See CONTRIBUTING.md for detailed guidelines.

Documentation

Overview

Package gedcomgo provides a unified API for processing GEDCOM genealogical data files.

This package is the recommended entry point for most users. It provides simple, high-level functions for common operations while re-exporting the most frequently used types for single-import convenience.

Quick Start

Parse a GEDCOM file:

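file, err := os.Open("family.ged")
if err != nil {
    log.Fatal(err)
}
defer file.Close()

doc, err := gedcomgo.Decode(file)
if err != nil {
    log.Fatal(err)
}

for _, ind := range doc.Individuals() {
    if len(ind.Names) > 0 { // a record may have no NAME
        fmt.Println(ind.Names[0].Full)
    }
}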

Write a GEDCOM file:

file, _ := os.Create("output.ged") // handle this error in real code
defer file.Close()
err := gedcomgo.Encode(file, doc)

Validate a document:

errs := gedcomgo.Validate(doc) // errs avoids shadowing the "errors" package
for _, err := range errs {
    fmt.Println(err)
}

Convert between versions:

converted, report, err := gedcomgo.Convert(doc, gedcomgo.Version70)

Power Users

For advanced use cases requiring custom options, import the underlying packages directly:

  • github.com/cacack/gedcom-go/decoder - Custom decode options, progress callbacks, diagnostics
  • github.com/cacack/gedcom-go/encoder - Custom line endings, encoding options
  • github.com/cacack/gedcom-go/validator - Configurable validation rules, quality reports
  • github.com/cacack/gedcom-go/converter - Custom conversion options, strict mode


Functions

func Convert

func Convert(doc *Document, targetVersion Version) (converted *Document, report *ConversionReport, err error)

Convert converts a GEDCOM document to the target version. It returns the converted document, a report detailing any transformations or data loss, and an error if conversion failed.

Supported conversions:

  • 5.5 <-> 5.5.1 (minimal changes, mostly compatible)
  • 5.5 <-> 7.0 (text handling, xref normalization)
  • 5.5.1 <-> 7.0 (text handling, xref normalization)

For custom options (strict data loss mode, validation), use the converter package directly: converter.ConvertWithOptions().

func Encode

func Encode(w io.Writer, doc *Document) error

Encode writes a GEDCOM document to a writer using default options. The output uses CRLF line endings as per the GEDCOM specification.

For custom options (line endings, encoding), use the encoder package directly: encoder.EncodeWithOptions().
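
The choice of line ending only changes the terminator appended to each serialized line. The sketch below makes that visible with the standard library alone. The writeLines and render helpers are hypothetical and are unrelated to the encoder package's actual implementation.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// writeLines writes each line followed by the given terminator
// ("\r\n" for CRLF, "\n" for LF) through a buffered writer.
func writeLines(w io.Writer, lines []string, ending string) error {
	bw := bufio.NewWriter(w)
	for _, l := range lines {
		if _, err := bw.WriteString(l + ending); err != nil {
			return err
		}
	}
	return bw.Flush()
}

// render is a convenience wrapper that returns the output as a string.
func render(lines []string, ending string) string {
	var sb strings.Builder
	_ = writeLines(&sb, lines, ending) // strings.Builder writes do not fail
	return sb.String()
}

func main() {
	lines := []string{"0 HEAD", "0 TRLR"}
	fmt.Printf("%q\n", render(lines, "\r\n")) // prints "0 HEAD\r\n0 TRLR\r\n"
	fmt.Printf("%q\n", render(lines, "\n"))   // prints "0 HEAD\n0 TRLR\n"
}
```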

func Validate

func Validate(doc *Document) []error

Validate validates a GEDCOM document and returns any validation errors. This performs basic structural validation including cross-reference checks and required field validation.

For comprehensive validation with severity levels, use ValidateAll(). For custom validation configuration, use the validator package directly.
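
A cross-reference check of the kind described above boils down to confirming that every ID a record points at exists. Here is a minimal sketch with stand-in types. The family struct and orphanedRefs function are hypothetical, and the real validator covers more record types than this.

```go
package main

import "fmt"

// family is a hypothetical stand-in holding xref strings only.
type family struct {
	XRef     string
	Husband  string
	Wife     string
	Children []string
}

// orphanedRefs returns one message per reference that points at an
// individual ID not present in the known set.
func orphanedRefs(individuals map[string]bool, families []family) []string {
	var missing []string
	for _, f := range families {
		refs := append([]string{f.Husband, f.Wife}, f.Children...)
		for _, ref := range refs {
			if ref != "" && !individuals[ref] {
				missing = append(missing,
					fmt.Sprintf("%s references missing individual %s", f.XRef, ref))
			}
		}
	}
	return missing
}

func main() {
	known := map[string]bool{"@I1@": true, "@I3@": true}
	fams := []family{
		{XRef: "@F1@", Husband: "@I1@", Wife: "@I2@", Children: []string{"@I3@"}},
	}
	for _, msg := range orphanedRefs(known, fams) {
		fmt.Println(msg) // prints "@F1@ references missing individual @I2@"
	}
}
```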

Types

type ConversionReport

type ConversionReport = gedcom.ConversionReport

ConversionReport contains the results of a GEDCOM version conversion.

type DecodeResult

type DecodeResult = decoder.DecodeResult

DecodeResult contains the result of decoding a GEDCOM file with diagnostics. In lenient mode, Document may contain partial data even when diagnostics are present.

func DecodeWithDiagnostics

func DecodeWithDiagnostics(r io.Reader) (*DecodeResult, error)

DecodeWithDiagnostics parses a GEDCOM file and returns both the document and any diagnostics. In lenient mode (the default), parse errors are collected as diagnostics rather than stopping parsing, allowing partial documents to be returned.

For custom options (strict mode, progress callbacks), use the decoder package directly: decoder.DecodeWithDiagnostics() with custom options.

type Document

type Document = gedcom.Document

Document represents a complete GEDCOM file with all its records. Use Individuals(), Families(), Sources() to access typed collections.

func Decode

func Decode(r io.Reader) (*Document, error)

Decode parses a GEDCOM file from an io.Reader and returns a Document. This is the simplest way to parse a GEDCOM file using default options.

For custom options (progress callbacks, context cancellation), use the decoder package directly: decoder.DecodeWithOptions().

type Family

type Family = gedcom.Family

Family represents a family unit (husband, wife, and children).

type Individual

type Individual = gedcom.Individual

Individual represents a person in the GEDCOM file.

type Issue

type Issue = validator.Issue

Issue represents a validation finding with severity, context, and actionable information.

func ValidateAll

func ValidateAll(doc *Document) []Issue

ValidateAll returns comprehensive validation as Issues with severity levels. This is the enhanced API that provides more detail than Validate(), including date logic validation, reference checking, and quality analysis.

Issues are categorized by severity: Error, Warning, and Info. For custom validation configuration (strictness, thresholds), use the validator package directly: validator.NewWithConfig().

type Version

type Version = gedcom.Version

Version represents a GEDCOM specification version.

const (
	// Version55 represents GEDCOM 5.5 specification.
	Version55 Version = gedcom.Version55

	// Version551 represents GEDCOM 5.5.1 specification.
	Version551 Version = gedcom.Version551

	// Version70 represents GEDCOM 7.0 specification.
	Version70 Version = gedcom.Version70
)

Version constants for convenience.
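
Header-based detection relies on the `2 VERS` line nested under `1 GEDC` in the HEAD record. The sketch below reads just that value. It is a simplified illustration with a hypothetical detectVersion function, not the version package's Detect(), and it ignores heuristics, byte-order marks, and malformed headers.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// detectVersion returns the VERS value found under HEAD.GEDC, or false
// if the header ends without one. Other VERS lines, such as the one
// under HEAD.SOUR (the producing program's version), are ignored.
func detectVersion(header string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(header))
	inGEDC := false
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		switch {
		case fields[0] == "0" && fields[1] != "HEAD":
			return "", false // left the header without finding a version
		case fields[0] == "1":
			inGEDC = fields[1] == "GEDC"
		case fields[0] == "2" && inGEDC && fields[1] == "VERS" && len(fields) >= 3:
			return fields[2], true
		}
	}
	return "", false
}

func main() {
	header := "0 HEAD\n1 SOUR MyApp\n2 VERS 3.1\n1 GEDC\n2 VERS 5.5.1\n0 @I1@ INDI\n"
	if v, ok := detectVersion(header); ok {
		fmt.Println("GEDCOM version:", v) // prints "GEDCOM version: 5.5.1"
	}
}
```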

Directories

Path                         Synopsis
charset                      Package charset provides character encoding utilities for GEDCOM files.
converter                    Package converter provides GEDCOM version conversion functionality.
decoder                      Package decoder provides high-level GEDCOM file decoding functionality.
encoder                      Package encoder provides functionality to write GEDCOM documents to files.
examples
examples/date-parsing (command)  Package main demonstrates parsing GEDCOM date values including modifiers, ranges, periods, and date comparison.
examples/encode (command)    Package main demonstrates creating GEDCOM documents programmatically and encoding them to standard GEDCOM format.
examples/parse (command)     Package main demonstrates basic GEDCOM file parsing with summary statistics, record counting, and validation.
examples/query (command)     Package main demonstrates querying GEDCOM data including individual lookups, family traversal, and relationship navigation.
examples/validate (command)  Package main demonstrates GEDCOM file validation with error categorization, grouping, and detailed reporting.
gedcom                       Package gedcom defines the core data types for representing GEDCOM genealogy data.
testing                      Package testing provides round-trip test helpers for GEDCOM documents.
parser                       Package parser provides low-level GEDCOM line parsing functionality.
validator                    Package validator provides GEDCOM document validation functionality.
version                      Package version provides GEDCOM version detection and validation.
