swhid

package module
v0.0.0-...-4b2addb Latest
Published: May 11, 2026 | License: MIT | Imports: 14 | Imported by: 0

README

swhid-go

A Go library and CLI for computing Software Heritage Identifiers (SWHIDs).

SWHIDs are intrinsic identifiers for digital objects based on cryptographic hashes. They're used by Software Heritage to uniquely identify source code artifacts.

Installation

go get github.com/andrew/swhid-go

For the CLI:

go install github.com/andrew/swhid-go/cmd/swhid@latest

Library Usage

package main

import (
    "fmt"
    "log"

    swhid "github.com/andrew/swhid-go"
    "github.com/andrew/swhid-go/objects"
)

func main() {
    // Compute SWHID for content
    id := swhid.FromContent([]byte("hello\n"))
    fmt.Println(id) // swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a

    // Parse an existing SWHID; Parse returns an error for malformed input
    parsed, err := swhid.Parse("swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(parsed.ObjectType) // cnt
    fmt.Println(parsed.ObjectHash) // ce013625030ba8dba906f756967f9e9ca394464a

    // Compute SWHID for a directory from explicit entries
    entries := []objects.DirectoryEntry{
        {Name: "hello.txt", Type: objects.EntryTypeFile, Target: "ce013625030ba8dba906f756967f9e9ca394464a"},
    }
    dirID := swhid.FromDirectory(entries)
    fmt.Println(dirID) // swh:1:dir:...

    // Hash a directory from the filesystem
    fsID, err := swhid.FromDirectoryPath("/path/to/dir")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(fsID)

    // Hash a git commit
    revID, err := swhid.FromRevision("/path/to/repo", "HEAD")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(revID)
}

CLI Usage

# Parse and validate a SWHID
swhid parse swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a

# Generate SWHID from file content (stdin)
echo "hello" | swhid content

# Generate SWHID from directory
swhid directory /path/to/dir

# Generate SWHID from git commit
swhid revision /path/to/repo
swhid revision /path/to/repo main
swhid revision /path/to/repo abc123

# Generate SWHID from annotated git tag
swhid release /path/to/repo v1.0.0

# Generate SWHID for repository snapshot
swhid snapshot /path/to/repo

# JSON output (flag before positional args)
swhid parse -f json swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a

# Add qualifiers
echo "hello" | swhid content -q origin=https://github.com/example/repo

Object Types

Type       Code  Description
Content    cnt   File content (blob)
Directory  dir   Directory tree
Revision   rev   Git commit
Release    rel   Annotated tag
Snapshot   snp   Repository state

SWHID Format

swh:1:<type>:<hash>[;<qualifier>=<value>...]
  • swh - scheme
  • 1 - version
  • <type> - object type (cnt, dir, rev, rel, snp)
  • <hash> - 40-character SHA-1 hex digest
  • <qualifier> - optional qualifiers (origin, visit, anchor, path, lines, bytes)
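
The format above can be split with nothing but the standard library. The following is a minimal sketch, not the package's own parser: parsedSWHID, parseSWHID, and isLowerHex40 are hypothetical names, and the lowercase-hex requirement is an assumption about how strict validation should be. It does not decode percent-escapes inside qualifier values.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parsedSWHID is a hypothetical holder for the parts of a SWHID string.
type parsedSWHID struct {
	ObjectType string
	ObjectHash string
	Qualifiers map[string]string
}

// isLowerHex40 reports whether s is exactly 40 lowercase hex characters.
func isLowerHex40(s string) bool {
	if len(s) != 40 {
		return false
	}
	for _, c := range s {
		if !(c >= '0' && c <= '9') && !(c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// parseSWHID splits on ";" first, so that ":" inside a qualifier value
// (for example an origin URL) does not interfere with the core fields.
func parseSWHID(s string) (parsedSWHID, error) {
	parts := strings.Split(s, ";")
	core := strings.Split(parts[0], ":")
	if len(core) != 4 {
		return parsedSWHID{}, errors.New("expected swh:1:<type>:<hash>")
	}
	if core[0] != "swh" || core[1] != "1" {
		return parsedSWHID{}, errors.New("unsupported scheme or version")
	}
	switch core[2] {
	case "cnt", "dir", "rev", "rel", "snp":
	default:
		return parsedSWHID{}, fmt.Errorf("unknown object type %q", core[2])
	}
	if !isLowerHex40(core[3]) {
		return parsedSWHID{}, errors.New("hash must be 40 lowercase hex characters")
	}
	quals := make(map[string]string)
	for _, q := range parts[1:] {
		key, value, ok := strings.Cut(q, "=")
		if !ok || key == "" {
			return parsedSWHID{}, fmt.Errorf("malformed qualifier %q", q)
		}
		quals[key] = value
	}
	return parsedSWHID{ObjectType: core[2], ObjectHash: core[3], Qualifiers: quals}, nil
}

func main() {
	p, err := parseSWHID("swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a;origin=https://github.com/example/repo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.ObjectType)           // cnt
	fmt.Println(p.Qualifiers["origin"]) // https://github.com/example/repo
}
```

In real code, swhid.Parse is the entry point to use; the sketch only shows how the fields line up.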

License

MIT

Documentation

Overview

Package swhid provides functionality for computing and parsing Software Heritage Identifiers (SWHIDs).

SWHIDs are intrinsic identifiers for digital objects (source code files, directories, commits, etc.) based on cryptographic hashes. This package implements the SWHID specification v1.

Basic usage:

// Compute SWHID for file content
id := swhid.FromContent([]byte("hello world"))
fmt.Println(id) // swh:1:cnt:...

// Parse an existing SWHID
id, err := swhid.Parse("swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2")

Index

Constants

const (
	Scheme        = "swh"
	SchemeVersion = 1
	ObjectIDLen   = 40
)

Variables

var (
	ErrEmptySWHID        = errors.New("SWHID string cannot be nil or empty")
	ErrInvalidFormat     = errors.New("invalid SWHID format")
	ErrInvalidScheme     = errors.New("invalid scheme")
	ErrInvalidVersion    = errors.New("invalid version")
	ErrInvalidObjectType = errors.New("invalid object type")
	ErrInvalidObjectHash = errors.New("invalid object hash")
)

Error types

Types

type Identifier

type Identifier struct {
	Scheme     string
	Version    int
	ObjectType ObjectType
	ObjectHash string
	Qualifiers map[string]string
}

Identifier represents a parsed SWHID.

func FromContent

func FromContent(data []byte) *Identifier

FromContent computes the SWHID for file content.
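
To show what "intrinsic" means for content, here is a standard-library sketch of the hash behind a cnt identifier. It assumes the Git blob convention: SHA-1 over the header "blob <length>" followed by a NUL byte and then the raw bytes. contentSWHID is a hypothetical helper, not part of this package.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// contentSWHID hashes data the way Git hashes a blob: a "blob <len>\x00"
// header followed by the raw bytes, then SHA-1 over the whole stream.
func contentSWHID(data []byte) string {
	h := sha1.New()
	fmt.Fprintf(h, "blob %d\x00", len(data))
	h.Write(data)
	return fmt.Sprintf("swh:1:cnt:%x", h.Sum(nil))
}

func main() {
	fmt.Println(contentSWHID([]byte("hello\n")))
	// swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a
}
```

The result for "hello\n" matches the identifier shown in the README, and it is the same value that git hash-object reports for a file with that content.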

func FromDirectory

func FromDirectory(entries []objects.DirectoryEntry) *Identifier

FromDirectory computes the SWHID for a directory with the given entries.
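
A directory identifier can be sketched the same way, assuming it follows the Git tree object layout: entries sorted by name (directory names compared as if they ended in "/"), each serialized as "<mode> <name>", a NUL byte, and the 20 raw hash bytes. treeEntry, sortKey, and directorySWHID are hypothetical names used only for illustration; objects.DirectoryEntry may be shaped differently.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"sort"
)

// treeEntry is a hypothetical stand-in for a directory entry.
type treeEntry struct {
	Name   string
	Mode   string // Git mode, e.g. "100644" (file), "100755" (executable), "40000" (directory)
	Target string // 40-character hex hash of the entry's content or subtree
}

// sortKey mirrors Git's ordering rule: directories sort as if their
// name had a trailing slash.
func sortKey(e treeEntry) string {
	if e.Mode == "40000" {
		return e.Name + "/"
	}
	return e.Name
}

// directorySWHID serializes the entries in Git tree format and hashes them.
func directorySWHID(entries []treeEntry) (string, error) {
	sorted := append([]treeEntry(nil), entries...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool { return sortKey(sorted[i]) < sortKey(sorted[j]) })

	var body bytes.Buffer
	for _, e := range sorted {
		raw, err := hex.DecodeString(e.Target)
		if err != nil || len(raw) != sha1.Size {
			return "", fmt.Errorf("entry %q: target must be 40 hex characters", e.Name)
		}
		fmt.Fprintf(&body, "%s %s\x00", e.Mode, e.Name)
		body.Write(raw)
	}

	h := sha1.New()
	fmt.Fprintf(h, "tree %d\x00", body.Len())
	h.Write(body.Bytes())
	return fmt.Sprintf("swh:1:dir:%x", h.Sum(nil)), nil
}

func main() {
	empty, _ := directorySWHID(nil)
	fmt.Println(empty) // swh:1:dir:4b825dc642cb6eb9a060e54bf8d69288fbee4904

	id, err := directorySWHID([]treeEntry{
		{Name: "hello.txt", Mode: "100644", Target: "ce013625030ba8dba906f756967f9e9ca394464a"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id)
}
```

Because entries are sorted before hashing, the order in which they are supplied does not affect the result.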

func FromDirectoryPath

func FromDirectoryPath(path string) (*Identifier, error)

FromDirectoryPath computes the SWHID for a directory on the filesystem. It recursively hashes all files and subdirectories. If the directory is within a Git repository, it uses the Git index for file permissions.

func FromDirectoryPathWithOptions

func FromDirectoryPathWithOptions(path string, gitRepo *git.Repository, permissions map[string]os.FileMode) (*Identifier, error)

FromDirectoryPathWithOptions computes the SWHID for a directory on the filesystem with custom options. If gitRepo is provided, file permissions are read from its Git index. If permissions is provided, it is used as a map of path -> mode for explicit permissions.

func FromRelease

func FromRelease(repoPath, tagName string) (*Identifier, error)

FromRelease computes the SWHID for a Git release (annotated tag).

func FromReleaseMetadata

func FromReleaseMetadata(meta objects.ReleaseMetadata) *Identifier

FromReleaseMetadata computes the SWHID for a release with the given metadata.

func FromRevision

func FromRevision(repoPath, ref string) (*Identifier, error)

FromRevision computes the SWHID for a Git revision (commit).

func FromRevisionMetadata

func FromRevisionMetadata(meta objects.RevisionMetadata) *Identifier

FromRevisionMetadata computes the SWHID for a revision with the given metadata.
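
For intuition about which metadata feeds a rev identifier, here is a sketch that assumes the plain Git commit object layout: tree, parent, author, and committer header lines, a blank line, then the message. commitMeta and revisionSWHID are hypothetical; the fields of objects.RevisionMetadata are not shown in this documentation and may differ. Signatures, extra headers, and non-UTF-8 encodings are ignored.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"fmt"
)

// commitMeta is a hypothetical stand-in for revision metadata.
type commitMeta struct {
	Tree      string   // 40-character hex hash of the root directory
	Parents   []string // hex hashes of parent commits, in order
	Author    string   // "Name <email> <unix-seconds> <tz-offset>"
	Committer string   // same layout as Author
	Message   string
}

// revisionSWHID serializes the metadata as a Git commit object and hashes it.
func revisionSWHID(m commitMeta) string {
	var body bytes.Buffer
	fmt.Fprintf(&body, "tree %s\n", m.Tree)
	for _, p := range m.Parents {
		fmt.Fprintf(&body, "parent %s\n", p)
	}
	fmt.Fprintf(&body, "author %s\n", m.Author)
	fmt.Fprintf(&body, "committer %s\n", m.Committer)
	body.WriteString("\n") // blank line separates headers from the message
	body.WriteString(m.Message)

	h := sha1.New()
	fmt.Fprintf(h, "commit %d\x00", body.Len())
	h.Write(body.Bytes())
	return fmt.Sprintf("swh:1:rev:%x", h.Sum(nil))
}

func main() {
	// Illustrative values only; the tree hash is Git's well-known empty tree.
	fmt.Println(revisionSWHID(commitMeta{
		Tree:      "4b825dc642cb6eb9a060e54bf8d69288fbee4904",
		Author:    "Example Author <author@example.com> 1700000000 +0000",
		Committer: "Example Author <author@example.com> 1700000000 +0000",
		Message:   "Initial commit\n",
	}))
}
```

Changing any field, including a timestamp or a trailing newline in the message, produces a different identifier.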

func FromSnapshot

func FromSnapshot(repoPath string) (*Identifier, error)

FromSnapshot computes the SWHID for a Git repository snapshot.

func FromSnapshotBranches

func FromSnapshotBranches(branches []objects.Branch) *Identifier

FromSnapshotBranches computes the SWHID for a snapshot with the given branches.

func NewIdentifier

func NewIdentifier(objectType ObjectType, objectHash string, qualifiers map[string]string) (*Identifier, error)

NewIdentifier creates a new Identifier with validation.

func Parse

func Parse(swhidString string) (*Identifier, error)

Parse parses a SWHID string into an Identifier.

func (*Identifier) CoreSWHID

func (id *Identifier) CoreSWHID() string

CoreSWHID returns the core SWHID without qualifiers.

func (*Identifier) Equal

func (id *Identifier) Equal(other *Identifier) bool

Equal returns true if two identifiers are equal.

func (*Identifier) String

func (id *Identifier) String() string

String returns the canonical SWHID string representation.
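
Since Qualifiers is a map and Go map iteration order is random, a stable string form needs a fixed qualifier order. The sketch below emits qualifiers in the order the README lists them (origin, visit, anchor, path, lines, bytes). That order, and the names qualifierOrder and formatSWHID, are assumptions made for illustration; the order String actually uses is not documented here. Unknown keys are skipped and values are not percent-escaped.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifierOrder is an assumed fixed order, taken from the README's list.
var qualifierOrder = []string{"origin", "visit", "anchor", "path", "lines", "bytes"}

// formatSWHID appends known qualifiers to a core SWHID in a fixed order,
// so the output does not depend on map iteration order.
func formatSWHID(core string, quals map[string]string) string {
	var b strings.Builder
	b.WriteString(core)
	for _, key := range qualifierOrder {
		if v, ok := quals[key]; ok {
			b.WriteString(";" + key + "=" + v)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(formatSWHID(
		"swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a",
		map[string]string{
			"lines":  "1-2",
			"origin": "https://github.com/example/repo",
		},
	))
	// swh:1:cnt:ce013625030ba8dba906f756967f9e9ca394464a;origin=https://github.com/example/repo;lines=1-2
}
```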

func (*Identifier) WithQualifiers

func (id *Identifier) WithQualifiers(qualifiers map[string]string) *Identifier

WithQualifiers returns a new Identifier with the given qualifiers.

type ObjectType

type ObjectType string

ObjectType represents the type of object identified by a SWHID.

const (
	ObjectTypeContent   ObjectType = "cnt"
	ObjectTypeDirectory ObjectType = "dir"
	ObjectTypeRevision  ObjectType = "rev"
	ObjectTypeRelease   ObjectType = "rel"
	ObjectTypeSnapshot  ObjectType = "snp"
)

Directories

Path       Synopsis
cmd/swhid  command
