manifest

package
v0.0.0-...-a6c8558 Latest
Warning

This package is not in the latest version of its module.

Published: May 7, 2026 License: GPL-3.0 Imports: 14 Imported by: 0

Documentation

Overview

Package manifest defines a YAML-based DSL for declaring datasets to download and verify in bulk. A manifest lists entries; each entry maps a canonical accession (compact "source:id" form) to a folder identifier and an optional list of expected files with hashes for post-download verification.


Constants

This section is empty.

Variables

This section is empty.

Functions

func RenderEntry

func RenderEntry(e *Entry) (string, error)

RenderEntry emits a YAML snippet for a single entry as one item of a top-level sequence (suitable for appending to a manifest file).
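To make the "one item of a top-level sequence" shape concrete, here is a minimal sketch of what such a snippet could look like. It is not the package's implementation: renderEntry is a hypothetical hand-rolled stand-in, the local Entry and File types copy only some of the documented fields, and the accession "examplesrc:ABC123" is a placeholder. It assumes all values are plain scalars that need no YAML quoting.

```go
package main

import (
	"fmt"
	"strings"
)

// File and Entry mirror a subset of the documented manifest types.
type File struct {
	Name string
	Hash string
}

type Entry struct {
	Identifier string
	Accession  string
	URL        string
	Hash       string
	Files      []File
}

// renderEntry is a hypothetical, simplified stand-in for RenderEntry.
// It writes one "- " item of a top-level YAML sequence and skips empty
// optional fields, mirroring the omitempty tags. It assumes values are
// plain scalars that need no YAML quoting or escaping.
func renderEntry(e Entry) string {
	var b strings.Builder
	fmt.Fprintf(&b, "- identifier: %s\n", e.Identifier)
	if e.Accession != "" {
		fmt.Fprintf(&b, "  accession: %s\n", e.Accession)
	}
	if e.URL != "" {
		fmt.Fprintf(&b, "  url: %s\n", e.URL)
	}
	if e.Hash != "" {
		fmt.Fprintf(&b, "  hash: %s\n", e.Hash)
	}
	if len(e.Files) > 0 {
		b.WriteString("  files:\n")
		for _, f := range e.Files {
			fmt.Fprintf(&b, "    - name: %s\n", f.Name)
			if f.Hash != "" {
				fmt.Fprintf(&b, "      hash: %s\n", f.Hash)
			}
		}
	}
	return b.String()
}

func main() {
	// Placeholder values; the accession source name is hypothetical.
	fmt.Print(renderEntry(Entry{
		Identifier: "example-dataset",
		Accession:  "examplesrc:ABC123",
		Files:      []File{{Name: "counts.tsv", Hash: "md5:abc123"}},
	}))
}
```

Because each snippet starts with "- " at column zero, several snippets can be concatenated into one manifest file that Load would read as a single sequence.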

func ResolveSource

func ResolveSource(e Entry) (source, id string, err error)

ResolveSource returns (source, id) for an entry, handling both the "accession: source:id" form and the "url: https://..." shorthand.

func SplitAccession

func SplitAccession(acc string) (source, id string, err error)

SplitAccession parses "source:id" into its parts.
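As an illustration of the "source:id" convention, the sketch below splits on the first colon. The function splitAccession is a hypothetical re-implementation, not the package's code; rejecting an empty source or id and keeping any later colons inside the id are assumptions, since the documentation does not spell out those edge cases.

```go
package main

import (
	"fmt"
	"strings"
)

// splitAccession is a hypothetical sketch of SplitAccession.
// Assumptions: the first ':' is the separator, and both parts must be
// non-empty. The real function may validate differently.
func splitAccession(acc string) (string, string, error) {
	source, id, ok := strings.Cut(acc, ":")
	if !ok {
		return "", "", fmt.Errorf("accession %q: missing ':' separator", acc)
	}
	if source == "" || id == "" {
		return "", "", fmt.Errorf("accession %q: empty source or id", acc)
	}
	return source, id, nil
}

func main() {
	// "examplesrc:ABC123" is a placeholder accession.
	source, id, err := splitAccession("examplesrc:ABC123")
	fmt.Println(source, id, err)

	_, _, err = splitAccession("no-separator")
	fmt.Println(err)
}
```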

func VerifyFile

func VerifyFile(path, spec string) error

VerifyFile checks a file at path against an "<algo>:<hex>" hash spec. An empty spec is a no-op (returns nil). Unknown algorithms return an error.
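The documented behaviour (empty spec is a no-op, unknown algorithm is an error) can be sketched as follows. This is an assumption-laden illustration rather than the package's code: verifyBytes and verifyFile are hypothetical names, the supported algorithms are assumed to include md5 and sha256 (only md5 appears in the documentation), and hex comparison is assumed to be case-insensitive. For brevity the sketch reads the whole file into memory; a production version would more likely stream it through the hasher.

```go
package main

import (
	"crypto/md5"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
	"os"
	"strings"
)

// verifyBytes checks data against an "<algo>:<hex>" spec.
// The algorithm set (md5, sha256) is an assumption for this sketch.
func verifyBytes(data []byte, spec string) error {
	if spec == "" {
		return nil // empty spec: nothing to verify
	}
	algo, want, ok := strings.Cut(spec, ":")
	if !ok {
		return fmt.Errorf("hash spec %q: want <algo>:<hex>", spec)
	}
	var h hash.Hash
	switch strings.ToLower(algo) {
	case "md5":
		h = md5.New()
	case "sha256":
		h = sha256.New()
	default:
		return fmt.Errorf("unknown hash algorithm %q", algo)
	}
	h.Write(data) // hash.Hash.Write never returns an error
	got := hex.EncodeToString(h.Sum(nil))
	if !strings.EqualFold(got, want) {
		return fmt.Errorf("hash mismatch: got %s, want %s", got, want)
	}
	return nil
}

// verifyFile is a hypothetical stand-in for VerifyFile.
func verifyFile(path, spec string) error {
	if spec == "" {
		return nil // no-op: the file is not even opened
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return verifyBytes(data, spec)
}

func main() {
	// The MD5 digest of the five bytes "hello".
	fmt.Println(verifyBytes([]byte("hello"), "md5:5d41402abc4b2a76b9719d911017c592"))
	fmt.Println(verifyBytes([]byte("hello"), "crc32:00"))
}
```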

Types

type Entry

type Entry struct {
	Identifier string   `yaml:"identifier"`
	Accession  string   `yaml:"accession,omitempty"`
	URL        string   `yaml:"url,omitempty"`
	Hash       string   `yaml:"hash,omitempty"`
	Files      []File   `yaml:"files,omitempty"`
	Options    *Options `yaml:"options,omitempty"`
}

Entry declares one dataset to download into <parent>/<identifier>.

func FromWitness

func FromWitness(witnessPath string) (*Entry, error)

FromWitness builds a manifest entry from a hapiq.json witness file. The identifier defaults to the basename of the directory containing the witness.
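The identifier default described above can be sketched with the standard path/filepath package. The helper defaultIdentifier is a hypothetical name, and the path is a placeholder; how the real function treats relative paths such as a bare "hapiq.json" (where the containing directory is ".") is not documented here.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// defaultIdentifier returns the basename of the directory that contains
// witnessPath. Hypothetical helper; the real FromWitness may resolve the
// path to an absolute one first.
func defaultIdentifier(witnessPath string) string {
	return filepath.Base(filepath.Dir(witnessPath))
}

func main() {
	fmt.Println(defaultIdentifier("downloads/my-dataset/hapiq.json"))
}
```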

func Load

func Load(path string) ([]Entry, error)

Load parses a manifest YAML file from disk. The top-level document is a sequence of entries. Unknown fields are rejected to keep the schema flat and explicit.

type File

type File struct {
	Name string `yaml:"name"`
	Hash string `yaml:"hash,omitempty"`
}

File names a specific downloaded file (relative to the entry folder) and its expected hash in "<algo>:<hex>" form (e.g. md5:abc123).

type Options

type Options struct {
	IncludeExts          []string `yaml:"include_ext,omitempty"`
	ExcludeExts          []string `yaml:"exclude_ext,omitempty"`
	MaxFileSize          string   `yaml:"max_file_size,omitempty"`
	FilenameGlob         string   `yaml:"filename_pattern,omitempty"`
	Subset               []string `yaml:"subset,omitempty"`
	Organism             string   `yaml:"organism,omitempty"`
	ExcludeRaw           bool     `yaml:"exclude_raw,omitempty"`
	ExcludeSupplementary bool     `yaml:"exclude_supplementary,omitempty"`
	IncludeSRA           bool     `yaml:"include_sra,omitempty"`
	LimitFiles           int      `yaml:"limit_files,omitempty"`
}

Options is a per-entry subset of downloader options.
