Documentation ¶
Overview ¶
Package archive: write, read, copy, append, list primitives across all supported formats
- Copyright (c) 2018-2026, NVIDIA CORPORATION. All rights reserved.
Index ¶
- Constants
- Variables
- func ContentTypeFromExt(ext string) string
- func DetectCompression(r io.ReaderAt) (string, error)
- func EqExt(ext1, ext2 string) bool
- func ExtFromContentType(ct string) string
- func IsErrUnknownFileExt(err error) bool
- func IsErrUnknownMime(err error) bool
- func Mime(mime, filename string) (string, error)
- func MimeFQN(smm *memsys.MMSA, mime, archname string) (m string, err error)
- func MimeFile(lh cos.LomReader, smm *memsys.MMSA, mime, archname string) (m string, err error)
- func OpenTarForAppend(cname, workFQN string) (rwfh *os.File, tarFormat tar.Format, offset int64, err error)
- func SetTarHeader(hdr any)
- func SplitAtExtension(path string) (shardName, fileName string)
- func Strict(mime, filename string) (m string, err error)
- func ValidateMatchMode(mmode string) (_ string, err error)
- type ArchRCB
- type Drain
- type Entry
- type ErrMatchMode
- type HeaderCallback
- type Opts
- type Reader
- type ShardIndex
- type ShardIndexEntry
- type Writer
Constants ¶
const (
	ExtTar    = ".tar"
	ExtTgz    = ".tgz"
	ExtTarGz  = ".tar.gz"
	ExtZip    = ".zip"
	ExtTarLz4 = ".tar.lz4"
)
Supported archive types (file extensions); see also archExts in cmd/cli/cli/const.go.
NOTE: when adding or removing formats, also update:
- FileExtensions
- allMagics
- ext/dsort/shard/rw.go
const (
	ExtGz  = ".gz"
	ExtLz4 = ".lz4"
)
Compression formats (not necessarily compressed TAR).
const (
// ShardIdxMinLen: preamble + four single-byte uvarints (lower bound, not exact).
ShardIdxMinLen = shardIdxPrefLen + 4
)
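The "single-byte uvarints" in the lower bound above rest on a property of Go's uvarint encoding: any value below 128 occupies exactly one byte. The following is a minimal sketch of that property only; `uvarintLen` is a hypothetical helper, and the actual index layout (including the preamble) is not described in this section.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// uvarintLen reports how many bytes the uvarint encoding of v occupies.
// Hypothetical helper for illustration; not part of the package.
func uvarintLen(v uint64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, v)
}

func main() {
	// Each encoded byte carries 7 bits of payload, so the length grows
	// at 128, 16384, and so on.
	for _, v := range []uint64{0, 127, 128, 16383, 16384} {
		fmt.Printf("%d -> %d byte(s)\n", v, uvarintLen(v))
	}
}
```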
const TarBlockSize = 512 // Size of each block in a tar stream
Variables ¶
var ErrShardIdxStale = errors.New("shard index: stale")
ErrShardIdxStale is returned by core.LoadShardIndex when the stored index was built from a prior version of the shard (checksum or size mismatch). The caller should rebuild.
var ErrTarIsEmpty = errors.New("tar is empty")
var MatchMode = [...]string{
"regexp",
"prefix",
"suffix",
"substr",
"wdskey",
}
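To illustrate how a matching mode changes the interpretation of the same pattern string, here is a minimal sketch. The `matches` helper is hypothetical and is not the package's implementation; it covers only four of the listed modes, and "wdskey" is left out because its semantics are not described in this section. Treating an empty pattern as "match all" follows the Reader interface comments.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matches interprets pattern according to mode (hypothetical helper).
func matches(name, pattern, mode string) (bool, error) {
	if pattern == "" {
		return true, nil // an empty pattern matches every archived file
	}
	switch mode {
	case "regexp":
		re, err := regexp.Compile(pattern)
		if err != nil {
			return false, err
		}
		return re.MatchString(name), nil
	case "prefix":
		return strings.HasPrefix(name, pattern), nil
	case "suffix":
		return strings.HasSuffix(name, pattern), nil
	case "substr":
		return strings.Contains(name, pattern), nil
	}
	// "wdskey" and anything else: not covered by this sketch
	return false, fmt.Errorf("unsupported match mode %q", mode)
}

func main() {
	for _, mode := range []string{"regexp", "prefix", "suffix", "substr"} {
		ok, err := matches("train/0001.jpg", "0001", mode)
		fmt.Println(mode, ok, err)
	}
}
```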
Functions ¶
func ContentTypeFromExt ¶ added in v1.4.1
func DetectCompression ¶ added in v1.3.30
DetectCompression inspects the first bytes of r and returns the matching compression extension (ExtGz or ExtLz4); an empty string means that no compression was detected.
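Detection of this kind is typically done by comparing the leading bytes against well-known magic numbers. The sketch below assumes the gzip signature (0x1f 0x8b) and the LZ4 frame magic number (0x184D2204, stored little-endian); `sniffCompression` and `sniffBytes` are hypothetical names, and the package's actual detection logic is not shown in this section.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

var (
	magicGzip = []byte{0x1f, 0x8b}             // gzip ID1, ID2
	magicLz4  = []byte{0x04, 0x22, 0x4d, 0x18} // LZ4 frame magic, little-endian
)

// sniffCompression reads up to four leading bytes without moving any
// read cursor and returns ".gz", ".lz4", or "" (no known compression).
func sniffCompression(r io.ReaderAt) (string, error) {
	var buf [4]byte
	n, err := r.ReadAt(buf[:], 0)
	if err != nil && err != io.EOF {
		return "", err // io.EOF only means the input is shorter than 4 bytes
	}
	head := buf[:n]
	switch {
	case bytes.HasPrefix(head, magicGzip):
		return ".gz", nil
	case bytes.HasPrefix(head, magicLz4):
		return ".lz4", nil
	}
	return "", nil
}

// sniffBytes is a convenience wrapper for in-memory data.
func sniffBytes(b []byte) (string, error) {
	return sniffCompression(bytes.NewReader(b))
}

func main() {
	ext, err := sniffBytes([]byte{0x1f, 0x8b, 0x08, 0x00})
	fmt.Printf("%q %v\n", ext, err)
}
```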
func ExtFromContentType ¶ added in v1.4.1
func IsErrUnknownFileExt ¶
func IsErrUnknownMime ¶
func MimeFQN ¶
NOTE:
- intentionally redundant with the above, so that the file is not opened when that can be avoided
- convention: the caller may pass a nil `smm` to avoid spending the extra time (usage: listing and reading)
func OpenTarForAppend ¶ added in v1.3.24
func SetTarHeader ¶
func SetTarHeader(hdr any)
func SplitAtExtension ¶ added in v1.4.1
func ValidateMatchMode ¶ added in v1.3.23
Types ¶
type ArchRCB ¶ added in v1.3.23
To use: construct (`NewReader`) and iterate (`ReadUntil`); this works for all supported formats. Simple, single-file selection is also supported (`ReadOne`).
type Drain ¶ added in v1.4.1
type Drain struct {
// contains filtered or unexported fields
}
type ErrMatchMode ¶ added in v1.3.23
type ErrMatchMode struct {
// contains filtered or unexported fields
}
func (*ErrMatchMode) Error ¶ added in v1.3.23
func (e *ErrMatchMode) Error() string
type HeaderCallback ¶
type HeaderCallback func(any)
type Reader ¶
type Reader interface {
// - call rcb (reader's callback) with each matching archived file, where:
// - `regex` is the matching string that gets interpreted according
// to one of the enumerated "matching modes" (see MatchMode);
// - an empty `regex` is just another case of cos.EmptyMatchAll - i.e., matches all archived files
// - stop upon EOF, when rcb returns true (i.e., stop), or upon any error
ReadUntil(rcb ArchRCB, regex, mmode string) error
// simple/single selection of a given archived filename (full path)
ReadOne(filename string) (cos.ReadCloseSizer, error)
// contains filtered or unexported methods
}
To use: construct (`NewReader`) and iterate (`ReadUntil`); this works for all supported formats. Simple, single-file selection is also supported (`ReadOne`).
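The callback contract in the interface comments (call the callback per archived file; stop on EOF, on a true return, or on error) can be sketched with the standard library alone. Everything here is an illustration under stated assumptions: `entryCallback`, `readUntil`, `makeTar`, and `visitUntil` are hypothetical names, the sketch handles plain TAR only, and it does not reproduce the package's ArchRCB type or its matching logic.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// entryCallback is a hypothetical stand-in for the package's callback;
// returning true asks the iterator to stop.
type entryCallback func(name string, r io.Reader) (stop bool, err error)

// readUntil calls cb for every regular file in a plain TAR stream until
// EOF, until cb returns true, or until any error occurs.
func readUntil(r io.Reader, cb entryCallback) error {
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		if hdr.Typeflag != tar.TypeReg {
			continue
		}
		stop, err := cb(hdr.Name, tr)
		if err != nil || stop {
			return err
		}
	}
}

// makeTar builds a small in-memory TAR; each body is "data:" + name.
func makeTar(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, name := range names {
		body := []byte("data:" + name)
		hdr := &tar.Header{Name: name, Mode: 0o644, Size: int64(len(body)), Typeflag: tar.TypeReg}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(body); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// visitUntil returns the names visited before (and including) stopAt.
func visitUntil(archive []byte, stopAt string) ([]string, error) {
	var seen []string
	err := readUntil(bytes.NewReader(archive), func(name string, _ io.Reader) (bool, error) {
		seen = append(seen, name)
		return name == stopAt, nil
	})
	return seen, err
}

func main() {
	archive, err := makeTar("a.txt", "b.txt", "c.txt")
	if err != nil {
		panic(err)
	}
	seen, err := visitUntil(archive, "b.txt")
	fmt.Println(seen, err)
}
```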
type ShardIndex ¶ added in v1.4.5
type ShardIndex struct {
Entries map[string]ShardIndexEntry
// SrcCksum and SrcSize are the LOM's checksum and size captured at index-build time;
// they are used to detect re-uploaded shards without reading the TAR content.
// Set by the caller before passing the index to SaveShardIndex.
SrcCksum *cos.Cksum
SrcSize int64
// contains filtered or unexported fields
}
func BuildShardIndex ¶ added in v1.4.5
func BuildShardIndex(r io.ReaderAt, size int64) (*ShardIndex, error)
BuildShardIndex performs one sequential scan of a TAR and returns an index mapping each regular file's name to its exact byte location within the archive.
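A single-pass scan of this kind can be approximated with archive/tar by asking the underlying seekable reader for its position after each header is parsed. This is a sketch under assumptions, not the package's implementation: `buildIndex`, `entry`, `makeTar`, `indexBytes`, and `readEntry` are hypothetical names, and it relies on tar.Reader not reading ahead, so that after `Next` the position is the first data byte and the header block is the 512 bytes just before it (sparse-format extension blocks are not considered).

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

const blockSize = 512 // TAR block size

// entry records where a file's header block starts and how large the file is.
type entry struct {
	Offset int64
	Size   int64
}

// buildIndex scans the TAR once and maps each regular file name to its entry.
func buildIndex(r io.ReaderAt, size int64) (map[string]entry, error) {
	sr := io.NewSectionReader(r, 0, size)
	tr := tar.NewReader(sr)
	idx := make(map[string]entry)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return idx, nil
		}
		if err != nil {
			return nil, err
		}
		// Assumption: sr now points at the first byte of the file's data.
		pos, err := sr.Seek(0, io.SeekCurrent)
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag != tar.TypeReg {
			continue
		}
		idx[hdr.Name] = entry{Offset: pos - blockSize, Size: hdr.Size}
	}
}

// makeTar builds a small in-memory USTAR archive; each body is "data:" + name.
func makeTar(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, name := range names {
		body := []byte("data:" + name)
		hdr := &tar.Header{
			Name: name, Mode: 0o644, Size: int64(len(body)),
			Typeflag: tar.TypeReg, Format: tar.FormatUSTAR,
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(body); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func indexBytes(archive []byte) (map[string]entry, error) {
	return buildIndex(bytes.NewReader(archive), int64(len(archive)))
}

// readEntry reads one file directly, without scanning the archive again.
func readEntry(archive []byte, e entry) (string, error) {
	sec := io.NewSectionReader(bytes.NewReader(archive), e.Offset+blockSize, e.Size)
	b, err := io.ReadAll(sec)
	return string(b), err
}

func main() {
	archive, err := makeTar("a.txt", "b.txt")
	if err != nil {
		panic(err)
	}
	idx, err := indexBytes(archive)
	if err != nil {
		panic(err)
	}
	for name, e := range idx {
		body, _ := readEntry(archive, e)
		fmt.Println(name, e.Offset, e.Size, body)
	}
}
```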
func (*ShardIndex) IsStale ¶ added in v1.4.5
func (idx *ShardIndex) IsStale(cksum *cos.Cksum, size int64) bool
IsStale reports whether the index was built from a different version of the shard. Always checks size; also compares cksum when SrcCksum is set.
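The described check (size always, checksum only when one was stored) can be sketched as follows. The types are simplified, hypothetical stand-ins for ShardIndex and cos.Cksum, and the handling of a nil current checksum is an assumption of this sketch rather than documented behavior.

```go
package main

import "fmt"

// cksum is a simplified stand-in for a typed checksum value.
type cksum struct {
	Type  string
	Value string
}

// shardIndex keeps only the two fields relevant to staleness detection.
type shardIndex struct {
	SrcCksum *cksum
	SrcSize  int64
}

// isStale reports whether the index was built from a different shard version.
func (idx *shardIndex) isStale(ck *cksum, size int64) bool {
	if idx.SrcSize != size {
		return true // a size mismatch alone is sufficient
	}
	if idx.SrcCksum == nil {
		return false // nothing stored to compare against
	}
	if ck == nil {
		return false // assumption: no current checksum, so size alone decides
	}
	return *idx.SrcCksum != *ck
}

func main() {
	idx := &shardIndex{SrcCksum: &cksum{"xxhash", "abc"}, SrcSize: 4096}
	fmt.Println(idx.isStale(&cksum{"xxhash", "abc"}, 4096))
	fmt.Println(idx.isStale(&cksum{"xxhash", "def"}, 4096))
	fmt.Println(idx.isStale(&cksum{"xxhash", "abc"}, 8192))
}
```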
func (*ShardIndex) Pack ¶ added in v1.4.5
func (idx *ShardIndex) Pack() ([]byte, error)
Pack serializes the index into a compact binary format.
func (*ShardIndex) Unpack ¶ added in v1.4.5
func (idx *ShardIndex) Unpack(b []byte) error
Unpack deserializes a packed ShardIndex produced by Pack.
type ShardIndexEntry ¶ added in v1.4.5
type ShardIndexEntry struct {
// Offset is the byte offset of the file's 512-byte TAR header block within the archive.
// File data begins immediately after: Offset + TarBlockSize.
// Always a multiple of TarBlockSize; the first entry in a shard can be at offset 0.
Offset int64
// File size in bytes (as recorded in the TAR header).
Size int64
}
func (ShardIndexEntry) DataOffset ¶ added in v1.4.5
func (e ShardIndexEntry) DataOffset() int64
DataOffset returns the byte offset of the file's data within the archive. Callers use this for direct random access: io.NewSectionReader(r, entry.DataOffset(), entry.Size).
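The random-access pattern can be tried end to end on an in-memory archive. In this sketch `indexEntry`, `singleFileTar`, and `roundTrip` are hypothetical stand-ins; the entry is filled in by hand on the assumption that the first (and only) file of a USTAR archive has its header block at offset 0.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

const tarBlockSize = 512

// indexEntry mirrors the documented fields: header offset and file size.
type indexEntry struct {
	Offset int64
	Size   int64
}

// dataOffset is the first byte after the 512-byte header block.
func (e indexEntry) dataOffset() int64 { return e.Offset + tarBlockSize }

// singleFileTar builds a one-file USTAR archive in memory.
func singleFileTar(name string, body []byte) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{
		Name: name, Mode: 0o644, Size: int64(len(body)),
		Typeflag: tar.TypeReg, Format: tar.FormatUSTAR,
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if _, err := tw.Write(body); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// roundTrip archives body, then reads it back through a section reader
// positioned at the entry's data offset, bypassing the TAR parser.
func roundTrip(body string) (string, error) {
	archive, err := singleFileTar("a.txt", []byte(body))
	if err != nil {
		return "", err
	}
	e := indexEntry{Offset: 0, Size: int64(len(body))}
	sec := io.NewSectionReader(bytes.NewReader(archive), e.dataOffset(), e.Size)
	got, err := io.ReadAll(sec)
	return string(got), err
}

func main() {
	got, err := roundTrip("hello, shard")
	fmt.Printf("%q %v\n", got, err)
}
```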
type Writer ¶
type Writer interface {
// Init specific writer
Write(nameInArch string, oah cos.OAH, reader io.Reader) error
// Close, cleanup
Fini() error
// Copy arch, with potential subsequent APPEND
Copy(src io.Reader, size ...int64) error
Flush() error
// contains filtered or unexported methods
}
TODO: consider adding Size() for the number of bytes already written to the underlying writer - compressed bytes for compressed formats (note that cos.CksumHashSize has size)