archive

package
v1.4.7 Latest
Published: Jun 3, 2026 License: MIT Imports: 23 Imported by: 1

Documentation

Overview

Package archive: write, read, copy, append, list primitives across all supported formats

  • Copyright (c) 2018-2026, NVIDIA CORPORATION. All rights reserved.

Constants

const (
	ExtTar    = ".tar"
	ExtTgz    = ".tgz"
	ExtTarGz  = ".tar.gz"
	ExtZip    = ".zip"
	ExtTarLz4 = ".tar.lz4"
)

supported archive types (file extensions); see also archExts in cmd/cli/cli/const.go

NOTE: when adding/removing formats - update:

  • FileExtensions
  • allMagics
  • ext/dsort/shard/rw.go
const (
	ExtGz  = ".gz"
	ExtLz4 = ".lz4"
)

compression formats - not necessarily compressed TAR

const (

	// ShardIdxMinLen: preamble + four single-byte uvarints (lower bound, not exact).
	ShardIdxMinLen = shardIdxPrefLen + 4
)
const TarBlockSize = 512 // Size of each block in a tar stream

Variables

var ErrShardIdxStale = errors.New("shard index: stale")

ErrShardIdxStale is returned by core.LoadShardIndex when the stored index was built from a prior version of the shard (checksum or size mismatch). The caller should rebuild.

var ErrTarIsEmpty = errors.New("tar is empty")
var FileExtensions = [...]string{ExtTar, ExtTgz, ExtTarGz, ExtZip, ExtTarLz4}
var MatchMode = [...]string{
	"regexp",
	"prefix",
	"suffix",
	"substr",
	"wdskey",
}
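
As an illustration of how such matching modes are typically applied to archived file names, here is a minimal sketch. It is not the package's implementation: `matchName` is a hypothetical helper, and the "wdskey" mode is deliberately left out because its exact semantics are not described in this section. The empty-pattern behavior follows the Reader documentation (an empty pattern matches all archived files).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchName (hypothetical) reports whether name matches pattern under mode.
// Only four of the listed modes are sketched; "wdskey" is not covered.
func matchName(name, pattern, mode string) (bool, error) {
	if pattern == "" {
		return true, nil // an empty pattern matches every archived file
	}
	switch mode {
	case "regexp":
		re, err := regexp.Compile(pattern)
		if err != nil {
			return false, err
		}
		return re.MatchString(name), nil
	case "prefix":
		return strings.HasPrefix(name, pattern), nil
	case "suffix":
		return strings.HasSuffix(name, pattern), nil
	case "substr":
		return strings.Contains(name, pattern), nil
	default:
		return false, fmt.Errorf("unsupported match mode %q", mode)
	}
}

func main() {
	ok, err := matchName("train/0001.jpg", ".jpg", "suffix")
	fmt.Println(ok, err)
}
```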

Functions

func ContentTypeFromExt added in v1.4.1

func ContentTypeFromExt(ext string) string

func DetectCompression added in v1.3.30

func DetectCompression(r io.ReaderAt) (string, error)

DetectCompression inspects the first bytes of r and returns the matching compression extension (ExtGz or ExtLz4); an empty string means no compression was detected.
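
Detection of this kind is usually done by comparing leading "magic" bytes. The sketch below is an assumption about the general technique, not the package's actual code: `detect` and `detectBytes` are hypothetical names, and the magic numbers are the standard gzip header (0x1f 0x8b) and the LZ4 frame magic (0x184D2204, stored little-endian).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

var (
	magicGzip = []byte{0x1f, 0x8b}             // gzip member header
	magicLz4  = []byte{0x04, 0x22, 0x4d, 0x18} // LZ4 frame magic, little-endian
)

// detect (hypothetical) reads up to four leading bytes and returns
// ".gz", ".lz4", or "" when neither magic number is present.
func detect(r io.ReaderAt) (string, error) {
	var buf [4]byte
	n, err := r.ReadAt(buf[:], 0)
	if err != nil && err != io.EOF {
		return "", err
	}
	head := buf[:n] // short inputs yield fewer than four bytes
	switch {
	case bytes.HasPrefix(head, magicGzip):
		return ".gz", nil
	case bytes.HasPrefix(head, magicLz4):
		return ".lz4", nil
	}
	return "", nil
}

// detectBytes is a convenience wrapper for in-memory data.
func detectBytes(b []byte) (string, error) {
	return detect(bytes.NewReader(b))
}

func main() {
	ext, err := detectBytes([]byte{0x1f, 0x8b, 0x08, 0x00})
	fmt.Printf("%q %v\n", ext, err)
}
```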

func EqExt added in v1.3.19

func EqExt(ext1, ext2 string) bool

(currently used only by dsort)

func ExtFromContentType added in v1.4.1

func ExtFromContentType(ct string) string

func IsErrUnknownFileExt

func IsErrUnknownFileExt(err error) bool

func IsErrUnknownMime

func IsErrUnknownMime(err error) bool

func Mime

func Mime(mime, filename string) (string, error)

func MimeFQN

func MimeFQN(smm *memsys.MMSA, mime, archname string) (m string, err error)

NOTE:

  • on purpose redundant vs the above - not to open file if can be avoided
  • convention: caller may pass nil `smm` _not_ to spend time (usage: listing and reading)

func MimeFile

func MimeFile(lh cos.LomReader, smm *memsys.MMSA, mime, archname string) (m string, err error)

func OpenTarForAppend added in v1.3.24

func OpenTarForAppend(cname, workFQN string) (rwfh *os.File, tarFormat tar.Format, offset int64, err error)

func SetTarHeader

func SetTarHeader(hdr any)

func SplitAtExtension added in v1.4.1

func SplitAtExtension(path string) (shardName, fileName string)

func Strict

func Strict(mime, filename string) (m string, err error)

motivation: prevent the creation of archives with non-standard extensions
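
To illustrate the motivation, a strict check can reject any file name that does not end in one of the supported extensions. This is a minimal sketch under assumptions: `knownExt` is a hypothetical helper and does not reproduce how Strict treats its `mime` argument; the extension list mirrors FileExtensions.

```go
package main

import (
	"fmt"
	"strings"
)

// supported mirrors the FileExtensions list from this package.
var supported = []string{".tar", ".tgz", ".tar.gz", ".zip", ".tar.lz4"}

// knownExt (hypothetical) returns the supported extension that filename
// ends with, or an error when none matches.
func knownExt(filename string) (string, error) {
	for _, ext := range supported {
		if strings.HasSuffix(filename, ext) {
			return ext, nil
		}
	}
	return "", fmt.Errorf("unknown file extension in %q", filename)
}

func main() {
	fmt.Println(knownExt("shard-001.tar.gz"))
	fmt.Println(knownExt("notes.gz"))
}
```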

func ValidateMatchMode added in v1.3.23

func ValidateMatchMode(mmode string) (_ string, err error)

Types

type ArchRCB added in v1.3.23

type ArchRCB interface {
	Call(filename string, reader cos.ReadCloseSizer, hdr any) (bool, error)
}

to use, construct (`NewReader`) and iterate (`ReadUntil`) (all supported formats); simple/single selection is also supported (`ReadOne`)

type Drain added in v1.4.1

type Drain struct {
	// contains filtered or unexported fields
}

func (*Drain) Call added in v1.4.1

func (drain *Drain) Call(_ string, r cos.ReadCloseSizer, _ any) (bool, error)

func (*Drain) Totals added in v1.4.1

func (drain *Drain) Totals() (size, num int64)

type Entry

type Entry struct {
	Name string
	Size int64 // uncompressed size
}

archived file entry

func List

func List(fqn string) ([]*Entry, error)

type ErrMatchMode added in v1.3.23

type ErrMatchMode struct {
	// contains filtered or unexported fields
}

func (*ErrMatchMode) Error added in v1.3.23

func (e *ErrMatchMode) Error() string

type HeaderCallback

type HeaderCallback func(any)

type Opts

type Opts struct {
	CB        HeaderCallback
	TarFormat tar.Format
	Serialize bool
}

type Reader

type Reader interface {
	// - call rcb (reader's callback) with each matching archived file, where:
	//   - `regex` is the matching string that gets interpreted according
	//      to one of the enumerated "matching modes" (see MatchMode);
	//   - an empty `regex` is just another case of cos.EmptyMatchAll - i.e., matches all archived files
	// - stop upon EOF, or when rcb returns true (i.e., stop) or any error
	ReadUntil(rcb ArchRCB, regex, mmode string) error

	// simple/single selection of a given archived filename (full path)
	ReadOne(filename string) (cos.ReadCloseSizer, error)
	// contains filtered or unexported methods
}

to use, construct (`NewReader`) and iterate (`ReadUntil`) (all supported formats); simple/single selection is also supported (`ReadOne`)
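
The construct-and-iterate pattern can be sketched with the standard library alone. The code below is an illustration limited to plain TAR, not this package's Reader: `callback`, `readUntil`, `makeTar`, and `countEntries` are hypothetical names. The callback returns true to stop, and the counting callback loosely resembles what a draining callback with totals might do.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// callback (hypothetical) is invoked per archived file; return true to stop.
type callback func(name string, r io.Reader, hdr *tar.Header) (stop bool, err error)

// readUntil iterates a TAR stream until EOF, a stop request, or an error.
func readUntil(src io.Reader, cb callback) error {
	tr := tar.NewReader(src)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		stop, err := cb(hdr.Name, tr, hdr)
		if err != nil || stop {
			return err
		}
	}
}

type file struct{ name, data string }

// makeTar builds a small in-memory TAR for the demo.
func makeTar(files []file) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Typeflag: tar.TypeReg, Name: f.name, Mode: 0o644, Size: int64(len(f.data))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(f.data)); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// countEntries drains each visited file and totals count and bytes;
// iteration stops after the file named stopAfter (if non-empty and present).
func countEntries(archive []byte, stopAfter string) (num, size int64, err error) {
	cb := func(name string, r io.Reader, _ *tar.Header) (bool, error) {
		n, cerr := io.Copy(io.Discard, r)
		num++
		size += n
		return name == stopAfter, cerr
	}
	err = readUntil(bytes.NewReader(archive), cb)
	return num, size, err
}

func main() {
	data, err := makeTar([]file{{"a.txt", "hello"}, {"b.txt", "abc"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(countEntries(data, ""))
}
```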

func NewReader

func NewReader(mime string, fh io.Reader, size ...int64) (ar Reader, err error)

type ShardIndex added in v1.4.5

type ShardIndex struct {
	Entries map[string]ShardIndexEntry
	// SrcCksum and SrcSize are the LOM's checksum and size captured at index-build time;
	// they are used to detect re-uploaded shards without reading the TAR content.
	// Set by the caller before passing the index to SaveShardIndex.
	SrcCksum *cos.Cksum
	SrcSize  int64
	// contains filtered or unexported fields
}

func BuildShardIndex added in v1.4.5

func BuildShardIndex(r io.ReaderAt, size int64) (*ShardIndex, error)

BuildShardIndex performs one sequential scan of a TAR and returns an index mapping each regular file's name to its exact byte location within the archive.
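
One way to picture such a scan, using only the standard library: wrap the input in a byte-counting reader, and after each `tar.Reader.Next` call the count marks where the file's data begins, so the header block sits one TAR block earlier. This is a sketch under assumptions and not the package's implementation; `indexEntry`, `countingReader`, `buildIndex`, and the helpers are hypothetical names.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

const blockSize = 512 // TAR block size

type indexEntry struct {
	Offset int64 // offset of the file's header block
	Size   int64 // file size from the header
}

// countingReader tracks how many bytes have been consumed so far.
type countingReader struct {
	r io.Reader
	n int64
}

func (c *countingReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	c.n += int64(n)
	return n, err
}

// buildIndex scans the TAR once and maps regular file names to locations.
func buildIndex(r io.ReaderAt, size int64) (map[string]indexEntry, error) {
	cr := &countingReader{r: io.NewSectionReader(r, 0, size)}
	tr := tar.NewReader(cr)
	idx := make(map[string]indexEntry)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return idx, nil
		}
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag != tar.TypeReg {
			continue
		}
		// After Next, cr.n points at the start of the file data.
		idx[hdr.Name] = indexEntry{Offset: cr.n - blockSize, Size: hdr.Size}
	}
}

// sampleTar builds a two-file in-memory TAR for the demo.
func sampleTar() ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range [][2]string{{"a.txt", "hello"}, {"b.txt", "abc"}} {
		hdr := &tar.Header{Typeflag: tar.TypeReg, Name: f[0], Mode: 0o644, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// indexBytes and payload are small helpers for in-memory data.
func indexBytes(data []byte) (map[string]indexEntry, error) {
	return buildIndex(bytes.NewReader(data), int64(len(data)))
}

func payload(data []byte, e indexEntry) string {
	start := e.Offset + blockSize
	return string(data[start : start+e.Size])
}

func main() {
	data, err := sampleTar()
	if err != nil {
		panic(err)
	}
	idx, err := indexBytes(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(idx["b.txt"], payload(data, idx["b.txt"]))
}
```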

func (*ShardIndex) IsStale added in v1.4.5

func (idx *ShardIndex) IsStale(cksum *cos.Cksum, size int64) bool

IsStale reports whether the index was built from a different version of the shard. Always checks size; also compares cksum when SrcCksum is set.
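
The described check can be sketched as follows. The types are hypothetical stand-ins (`checksum` for cos.Cksum, `index` for ShardIndex), and treating a missing caller checksum as a mismatch when the stored one is set is an assumption of this sketch, not documented behavior.

```go
package main

import "fmt"

// checksum is a hypothetical stand-in for cos.Cksum.
type checksum struct{ typ, value string }

// index holds only the fields relevant to the staleness check.
type index struct {
	srcCksum *checksum
	srcSize  int64
}

// isStale always compares size, and compares checksums only when
// the index has a stored checksum.
func (idx *index) isStale(ck *checksum, size int64) bool {
	if idx.srcSize != size {
		return true
	}
	if idx.srcCksum == nil {
		return false
	}
	// Assumption: a nil caller checksum counts as a mismatch.
	return ck == nil || *idx.srcCksum != *ck
}

func main() {
	idx := &index{srcCksum: &checksum{"xxhash", "abc"}, srcSize: 100}
	fmt.Println(idx.isStale(&checksum{"xxhash", "abc"}, 100))
	fmt.Println(idx.isStale(&checksum{"xxhash", "def"}, 100))
}
```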

func (*ShardIndex) Pack added in v1.4.5

func (idx *ShardIndex) Pack() ([]byte, error)

Pack serializes the index into a compact binary format.

func (*ShardIndex) Unpack added in v1.4.5

func (idx *ShardIndex) Unpack(b []byte) error

Unpack deserializes a packed ShardIndex produced by Pack.
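
The actual wire format is not described in this section beyond the mention of a preamble and uvarints (see ShardIdxMinLen). Purely as an illustration of a compact uvarint-based encoding, the sketch below round-trips a name-to-location map. The layout (entry count, then name length, name, offset, size per entry) and the names `pack`, `unpack`, and `entry` are hypothetical and omit any preamble.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"sort"
)

type entry struct{ Offset, Size int64 }

var errCorrupt = errors.New("packed index: corrupt")

// pack encodes entries in sorted name order using uvarints (hypothetical layout).
func pack(entries map[string]entry) []byte {
	names := make([]string, 0, len(entries))
	for name := range entries {
		names = append(names, name)
	}
	sort.Strings(names)
	b := binary.AppendUvarint(nil, uint64(len(names)))
	for _, name := range names {
		e := entries[name]
		b = binary.AppendUvarint(b, uint64(len(name)))
		b = append(b, name...)
		b = binary.AppendUvarint(b, uint64(e.Offset))
		b = binary.AppendUvarint(b, uint64(e.Size))
	}
	return b
}

// unpack decodes the layout produced by pack.
func unpack(b []byte) (map[string]entry, error) {
	next := func() (uint64, error) {
		v, n := binary.Uvarint(b)
		if n <= 0 {
			return 0, errCorrupt
		}
		b = b[n:]
		return v, nil
	}
	count, err := next()
	if err != nil {
		return nil, err
	}
	entries := make(map[string]entry)
	for i := uint64(0); i < count; i++ {
		nameLen, err := next()
		if err != nil || nameLen > uint64(len(b)) {
			return nil, errCorrupt
		}
		name := string(b[:nameLen])
		b = b[nameLen:]
		off, err := next()
		if err != nil {
			return nil, err
		}
		size, err := next()
		if err != nil {
			return nil, err
		}
		entries[name] = entry{Offset: int64(off), Size: int64(size)}
	}
	return entries, nil
}

func main() {
	packed := pack(map[string]entry{"a.txt": {0, 5}, "b.txt": {1024, 3}})
	out, err := unpack(packed)
	fmt.Println(len(packed), out, err)
}
```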

type ShardIndexEntry added in v1.4.5

type ShardIndexEntry struct {
	// Offset is the byte offset of the file's 512-byte TAR header block within the archive.
	// File data begins immediately after: Offset + TarBlockSize.
	// Always a multiple of TarBlockSize; the first entry in a shard can be at offset 0.
	Offset int64

	// File size in bytes (as recorded in the TAR header).
	Size int64
}

func (ShardIndexEntry) DataOffset added in v1.4.5

func (e ShardIndexEntry) DataOffset() int64

DataOffset returns the byte offset of the file's data within the archive. Callers use this for direct random access: io.NewSectionReader(r, entry.DataOffset(), entry.Size).
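
A minimal sketch of that random-access read, using stand-in types rather than this package's: `entry`, `dataOffset`, `readEntry`, and `fakeShard` are hypothetical, and the "shard" is a zeroed 512-byte block standing in for a TAR header followed by a padded payload.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

const tarBlockSize = 512

// entry is a stand-in for ShardIndexEntry.
type entry struct{ Offset, Size int64 }

// dataOffset skips the header block to reach the file data.
func (e entry) dataOffset() int64 { return e.Offset + tarBlockSize }

// readEntry reads exactly the file's bytes without scanning the archive.
func readEntry(r io.ReaderAt, e entry) ([]byte, error) {
	return io.ReadAll(io.NewSectionReader(r, e.dataOffset(), e.Size))
}

// fakeShard lays out one zeroed header block followed by the payload,
// padded to a whole block (not a valid TAR, just the same geometry).
func fakeShard(payload string) []byte {
	b := make([]byte, 2*tarBlockSize)
	copy(b[tarBlockSize:], payload)
	return b
}

// readFrom is a convenience wrapper for in-memory data.
func readFrom(data []byte, e entry) (string, error) {
	b, err := readEntry(bytes.NewReader(data), e)
	return string(b), err
}

func main() {
	got, err := readFrom(fakeShard("hello"), entry{Offset: 0, Size: 5})
	fmt.Printf("%q %v\n", got, err)
}
```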

type Writer

type Writer interface {
	// Init specific writer
	Write(nameInArch string, oah cos.OAH, reader io.Reader) error
	// Close, cleanup
	Fini() error
	// Copy arch, with potential subsequent APPEND
	Copy(src io.Reader, size ...int64) error

	Flush() error
	// contains filtered or unexported methods
}

TODO: consider adding Size() for the number of bytes already written to the underlying writer - compressed bytes for compressed formats (note that cos.CksumHashSize has size)

func NewWriter

func NewWriter(mime string, w io.Writer, cksum *cos.CksumHashSize, opts *Opts) (aw Writer)

calls init() -> open(),alloc()
