Documentation ¶
Overview ¶
Package archive owns local archive operations: format identification, compact-extraction (with up to MaxCompactFlattenLayers of duplicate-folder flattening), creation, and listing.
The package is deliberately isolated from CLI concerns — the cmd layer resolves sources (URL fetch, git clone, cwd auto-pick) and then feeds concrete local paths into the functions defined here. That separation is what lets the same archive engine power Slice 3 of the downloader feature later without an import cycle.
Format coverage (via github.com/mholt/archives):
    read + write : zip, tar, tar.gz, tar.bz2, tar.xz, tar.zst, gz, bz2, xz, zst
    read-only    : 7z, rar
7z/rar writing is rejected with a clear error in CreateArchive so the CLI can surface "use zip/tar.* for outputs" without crashing.
Package archive — write side. Builds zip / tar / tar.* archives from a heterogeneous list of local source paths using mholt/archives.
Compression mode → library knobs:
    Best     → DEFLATE max / gzip 9       / bz2 9
    Standard → DEFLATE def / gzip default / bz2 default
    Fast     → DEFLATE 1   / gzip 1       / bz2 1
Filtering: optional include / exclude glob lists run against the in-archive name (NameInArchive). An entry survives when either no includes are set OR it matches at least one include, AND it does NOT match any exclude.
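The survival rule above can be sketched as a small predicate. This is an illustration only: the helper names matchesAny and survives are hypothetical, and the use of path.Match for glob semantics is an assumption, since the package does not state which glob dialect it applies.

```go
package main

import (
	"fmt"
	"path"
)

// matchesAny reports whether name matches at least one glob in patterns.
// Assumption: path.Match semantics; malformed patterns count as non-matching.
func matchesAny(name string, patterns []string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// survives applies the documented rule:
// (no includes set OR matches an include) AND matches no exclude.
func survives(name string, includes, excludes []string) bool {
	included := len(includes) == 0 || matchesAny(name, includes)
	return included && !matchesAny(name, excludes)
}

func main() {
	includes := []string{"*.go", "docs/*"}
	excludes := []string{"*_test.go"}
	names := []string{"main.go", "main_test.go", "docs/guide.md", "LICENSE"}
	for _, name := range names {
		fmt.Printf("%-14s keep=%v\n", name, survives(name, includes, excludes))
	}
}
```

Note that with path.Match a `*` does not cross `/`, so a pattern such as `*.go` would only match top-level names; the real package may behave differently.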
Package archive — source resolution helpers used by the cmd layer to turn user-supplied strings (local paths, HTTPS URLs, git URLs) into concrete on-disk paths the extract / create engines can consume.
Network operations are deliberately kept small: we shell out to aria2c when available, fall back to net/http otherwise, and shell out to git for clone. The downloader package will replace this with the full engine in a later slice — until then this keeps `gitmap uzc <url>` and `gitmap zip <git-url>` functional.
Index ¶
- Variables
- func AutoDetectSingleArchive(dir string) (string, error)
- func CleanupResolved(r ResolvedSource)
- func FlateLevelForMode(mode CompressionMode) int
- func ListEntries(ctx context.Context, path string) ([]Entry, Format, error)
- type CompressionMode
- type CreateOptions
- type CreateResult
- type Entry
- type ExtractResult
- type Format
- type ResolvedSource
- type SourceKind
Constants ¶
This section is empty.
Variables ¶
var ErrUnknownFormat = errors.New("unknown archive format")
ErrUnknownFormat is returned by CreateArchive when the output extension is not recognized. Surfaced as a typed error so the cmd layer can translate it into a friendly user message.
Functions ¶
func AutoDetectSingleArchive ¶
func AutoDetectSingleArchive(dir string) (string, error)
AutoDetectSingleArchive scans dir for exactly one file with a recognized archive extension. It returns the file's absolute path on success, or an error stating that zero or more than one archive was found. Used by `gitmap uzc` when the user passes no explicit source.
func CleanupResolved ¶
func CleanupResolved(r ResolvedSource)
CleanupResolved removes any temp workspace recorded on the source. Always safe to call.
func FlateLevelForMode ¶
func FlateLevelForMode(mode CompressionMode) int
FlateLevelForMode returns the DEFLATE compression level that corresponds to mode. It is exported so the cmd layer's --list banner can show users the level their chosen mode selects.
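The mode-to-level mapping from the overview (Best → DEFLATE max, Standard → default, Fast → 1) can be sketched with the constants from compress/flate. The string values of the mode constants below are placeholders, because the real ones live in the constants package, and falling back to the default level for unknown modes is an assumption.

```go
package main

import (
	"compress/flate"
	"fmt"
)

type CompressionMode string

// Placeholder values; the real constants come from the constants package.
const (
	ModeStandard CompressionMode = "standard"
	ModeBest     CompressionMode = "best"
	ModeFast     CompressionMode = "fast"
)

// flateLevelForMode maps a mode to a compress/flate level.
// Assumption: unknown modes fall back to the default level.
func flateLevelForMode(mode CompressionMode) int {
	switch mode {
	case ModeBest:
		return flate.BestCompression // 9
	case ModeFast:
		return flate.BestSpeed // 1
	default:
		return flate.DefaultCompression // -1
	}
}

func main() {
	for _, m := range []CompressionMode{ModeBest, ModeStandard, ModeFast} {
		fmt.Printf("%-8s level=%d\n", m, flateLevelForMode(m))
	}
}
```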
Types ¶
type CompressionMode ¶
type CompressionMode string
CompressionMode is the user-facing knob persisted in ArchiveHistory.CompressionMode.
const (
	ModeStandard CompressionMode = constants.CompressionStandard
	ModeBest     CompressionMode = constants.CompressionBest
	ModeFast     CompressionMode = constants.CompressionFast
)
type CreateOptions ¶
type CreateOptions struct {
OutputPath string
Sources []string // absolute local paths
Mode CompressionMode
Includes []string // optional glob list
Excludes []string // optional glob list
}
CreateOptions bundles every knob `gitmap zip` exposes.
type CreateResult ¶
CreateResult is returned to the cmd layer for printing + history rows.
func CreateArchive ¶
func CreateArchive(ctx context.Context, opts CreateOptions) (CreateResult, error)
CreateArchive walks every source, applies include/exclude filters, and writes the archive to opts.OutputPath using the format derived from the output extension.
type Entry ¶
ListEntries walks the archive and returns a flat list of entry names + sizes for the `--list` mode. Bounded internally to 50_000 entries to keep a malicious archive from exhausting memory.
type ExtractResult ¶
type ExtractResult struct {
OutputDir string
Format Format
EntriesWritten int
UsedTempDir bool
FlattenedLayers int
}
ExtractResult is what a compact-extract returns to the caller so it can be persisted into ArchiveHistory and printed to the user.
func CompactExtract ¶
func CompactExtract(ctx context.Context, srcArchive, destBaseDir string) (ExtractResult, error)
CompactExtract extracts srcArchive into a single normalized directory under destBaseDir, named after the archive's base name (sans extension).
Algorithm: temp-dir-then-move. We always extract into a fresh temp dir inside destBaseDir, then walk it to find the "real root" — the first directory that either holds >1 entry OR holds at least one non-dir entry. That real root is then moved (or its contents merged) into `<destBaseDir>/<archiveBaseName>/`. This guarantees:
xap.zip → xap/xap/<files> becomes destBaseDir/xap/<files> (any number of duplicate-name layers up to MaxCompactFlattenLayers is collapsed; we do not require the inner names to match xap — we just promote single-child directories until we hit content.)
xlt.zip → <files> becomes destBaseDir/xlt/<files> (no flatten, just a wrap.)
mixed.zip → README + src/ becomes destBaseDir/mixed/{README,src} (no flatten, the temp dir contents move directly under the wrap.)
The temp dir is always cleaned, even on failure mid-extract.
type Format ¶
type Format string
Format is a string tag persisted in ArchiveHistory.ArchiveFormat. It reads cleanly in PascalCase logs ("Zip", "TarGz") yet round-trips through the canonical extension via FormatFromExt / Format.Extension.
const (
	FormatZip     Format = "Zip"
	FormatTar     Format = "Tar"
	FormatTarGz   Format = "TarGz"
	FormatTarBz2  Format = "TarBz2"
	FormatTarXz   Format = "TarXz"
	FormatTarZst  Format = "TarZst"
	FormatGz      Format = "Gz"
	FormatBz2     Format = "Bz2"
	FormatXz      Format = "Xz"
	FormatZst     Format = "Zst"
	Format7z      Format = "SevenZip"
	FormatRar     Format = "Rar"
	FormatUnknown Format = ""
)
func FormatFromPath ¶
FormatFromPath inspects a file name and returns the matching Format, or FormatUnknown when nothing matches. Multi-extension forms (".tar.gz", ".tar.bz2", ".tar.xz", ".tar.zst") are checked first so a plain ".gz" never wins over ".tar.gz".
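The "multi-extension first" ordering can be sketched with an ordered suffix table. The function formatFromPath and the table extTable are hypothetical, only a subset of formats is shown, and the case-insensitive comparison is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

type Format string

// A subset of the package's Format constants, redeclared for the sketch.
const (
	FormatZip     Format = "Zip"
	FormatTar     Format = "Tar"
	FormatTarGz   Format = "TarGz"
	FormatTarBz2  Format = "TarBz2"
	FormatGz      Format = "Gz"
	FormatBz2     Format = "Bz2"
	FormatUnknown Format = ""
)

// extTable is ordered so that multi-extension forms are tested before the
// single extensions they end with.
var extTable = []struct {
	ext    string
	format Format
}{
	{".tar.gz", FormatTarGz},
	{".tar.bz2", FormatTarBz2},
	{".zip", FormatZip},
	{".tar", FormatTar},
	{".gz", FormatGz},
	{".bz2", FormatBz2},
}

// formatFromPath returns the first table entry whose extension is a suffix
// of name, or FormatUnknown when none matches.
func formatFromPath(name string) Format {
	lower := strings.ToLower(name) // assumption: matching ignores case
	for _, e := range extTable {
		if strings.HasSuffix(lower, e.ext) {
			return e.format
		}
	}
	return FormatUnknown
}

func main() {
	for _, name := range []string{"site.tar.gz", "log.gz", "notes.txt"} {
		fmt.Printf("%-12s -> %q\n", name, formatFromPath(name))
	}
}
```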
func IdentifyArchive ¶
IdentifyArchive opens the file and asks mholt/archives to sniff the magic bytes. Used as the authoritative format check after extension-based guesses, since a misnamed file (foo.zip that is really a tarball) would otherwise produce a misleading ArchiveHistory.ArchiveFormat row.
type ResolvedSource ¶
type ResolvedSource struct {
Original string
Kind SourceKind
LocalPath string
CleanupDir string
}
ResolvedSource is the materialized form of one user-supplied input. LocalPath is always populated; CleanupDir, when non-empty, must be removed by the caller after the operation completes (this is how the HTTP and git branches signal they used a temp workspace).
func ResolveSource ¶
func ResolveSource(ctx context.Context, raw string) (ResolvedSource, error)
ResolveSource turns one input string into a usable local path. The caller is responsible for invoking CleanupResolved afterwards.
type SourceKind ¶
type SourceKind int
SourceKind classifies one entry on the `gitmap zip` / `gitmap uzc` command line. The cmd layer dispatches per-kind; the archive engine only ever sees concrete local paths.
const (
	SourceLocal SourceKind = iota
	SourceHTTP
	SourceGit
)
func ClassifySource ¶
func ClassifySource(s string) SourceKind
ClassifySource is the cheap, pure-function classifier the command layer uses BEFORE doing any IO. Decision order matters: a path like `git@github.com:foo/bar.git` parses as a URL with no scheme, so we detect git first.
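The git-first decision order might be sketched as below. The helper names classify and looksLikeGit and the specific heuristics (a `git@` prefix, `ssh://` or `git://` schemes, a `.git` suffix) are assumptions for illustration; the package's actual rules are not spelled out.

```go
package main

import (
	"fmt"
	"strings"
)

type SourceKind int

const (
	SourceLocal SourceKind = iota
	SourceHTTP
	SourceGit
)

// looksLikeGit uses simple string heuristics (assumptions): scp-style
// "git@host:path", ssh:// or git:// schemes, or a ".git" suffix.
func looksLikeGit(s string) bool {
	return strings.HasPrefix(s, "git@") ||
		strings.HasPrefix(s, "ssh://") ||
		strings.HasPrefix(s, "git://") ||
		strings.HasSuffix(s, ".git")
}

// classify checks git first, then HTTP(S), and treats everything else as a
// local path. It performs no IO.
func classify(s string) SourceKind {
	s = strings.TrimSpace(s)
	switch {
	case looksLikeGit(s):
		return SourceGit
	case strings.HasPrefix(s, "http://"), strings.HasPrefix(s, "https://"):
		return SourceHTTP
	default:
		return SourceLocal
	}
}

func main() {
	inputs := []string{
		"git@github.com:foo/bar.git",
		"https://github.com/foo/bar.git",
		"https://example.com/release.zip",
		"./release.zip",
	}
	for _, in := range inputs {
		fmt.Printf("%-34s kind=%d\n", in, classify(in))
	}
}
```

One consequence of these heuristics is that a local directory whose name ends in `.git` would be classified as a git source; a real classifier may need a stricter test.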