transfer

package

Version: v0.27.0 (latest) | Published: May 20, 2026 | License: MIT | Imports: 15 | Imported by: 0

Repository

github.com/pbs-plus/pxar

Documentation ¶

Overview ¶

Package transfer provides utilities for transferring files between pxar archives.

The package implements archive-to-archive copy, directory tree walking, catalog extraction, lazy chunk loading, and dedup-aware writing, along with streaming read/write adapters for PBS remote stores, local chunk stores, and raw io.Writer streams.

Readers ¶

All source formats implement ArchiveReader:

- FileReader: standalone .pxar files via io.ReadSeeker
- ChunkedReader: lazy on-demand chunk loading from .didx indexes
- SplitReader: v2 split archives (.mpxar.didx + .ppxar.didx)
- PBSReader: PBS remote stores via H2 reader protocol
- DecryptingReader: wraps any ArchiveReader for encrypted archives

Writers ¶

All target formats implement ArchiveWriter:

- StreamWriter: encodes to io.Writer (v1 or v2 split)
- DedupWriter: same-datastore dedup with chunk reuse
- RemoteDedupWriter: PBS remote dedup with chunk injection
- SessionWriter: uploads via BackupSession

Transfer Functions ¶

Copy copies specific paths between archives with optional path mapping. CopyTree copies entire directory trees.

Walk Functions ¶

WalkTree visits every entry with optional content reading. WalkTreeWith supports metadata-only mode, type filters, and skip counts. WalkTreeMetadata performs metadata-only traversal with a type filter.

Dedup Utilities ¶

RecordMax provides monotonic offset validation for dedup writers. MapFileToPayloadChunks maps file content to payload chunk ranges. ReadChunkedFile reads file content from specific chunks. ComputeContentDigest computes SHA-256 without full stream reconstruction.

Lazy Chunk Loading ¶

ReadSeeker implements io.ReadSeeker over chunked data with configurable chunk cache. DecryptSource wraps ChunkSource for encrypted chunks.

Index ¶

Variables
func ComputeContentDigest(source datastore.ChunkSource, payloadIdx *datastore.DynamicIndexReader, ...) ([32]byte, error)
func Copy(src ArchiveReader, dst ArchiveWriter, mappings []PathMapping, opts CopyOption) error
func CopyTree(src ArchiveReader, dst ArchiveWriter, srcPath, dstPath string, opts CopyOption) error
func ReadChunkedFile(source datastore.ChunkSource, payloadIdx *datastore.DynamicIndexReader, ...) ([]byte, error)
func RecordMax(last **uint64, offset uint64) bool
func WalkTree(reader ArchiveReader, rootPath string, fn WalkFunc) error
func WalkTreeMetadata(reader ArchiveReader, rootPath string, filter WalkFilter, fn MetadataWalkFunc) error
func WalkTreeWith(reader ArchiveReader, rootPath string, opts WalkOption, fn WalkFunc) error
type ArchiveReader
type ArchiveWriter
type CatalogEntry
type ChunkRange
- func MapFileToPayloadChunks(payloadIdx *datastore.DynamicIndexReader, payloadOffset, fileSize uint64) []ChunkRange
type ChunkedReader
- func NewChunkedReader(idxData []byte, source datastore.ChunkSource) (*ChunkedReader, error)
- func NewChunkedReaderEager(idxData []byte, source datastore.ChunkSource) (*ChunkedReader, error)
- func (r *ChunkedReader) Close() error
- func (r *ChunkedReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *ChunkedReader) Lookup(path string) (*pxar.Entry, error)
- func (r *ChunkedReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *ChunkedReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *ChunkedReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
- func (r *ChunkedReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *ChunkedReader) ReadRoot() (*pxar.Entry, error)
- func (r *ChunkedReader) ReaderAt() io.ReaderAt
type CopyOption
type DecryptSource
- func NewDecryptSource(inner datastore.ChunkSource, cc *datastore.CryptConfig) *DecryptSource
- func (d *DecryptSource) GetChunk(digest [32]byte) ([]byte, error)
type DecryptingReader
- func NewDecryptingReader(inner ArchiveReader) *DecryptingReader
- func (r *DecryptingReader) Close() error
- func (r *DecryptingReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *DecryptingReader) Lookup(path string) (*pxar.Entry, error)
- func (r *DecryptingReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *DecryptingReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *DecryptingReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *DecryptingReader) ReadRoot() (*pxar.Entry, error)
type DedupWriter
- func NewDedupWriter(store *datastore.ChunkStore, source datastore.ChunkSource, ...) *DedupWriter
- func (w *DedupWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
- func (w *DedupWriter) BeginDirectory(name string, meta *pxar.Metadata) error
- func (w *DedupWriter) Close() error
- func (w *DedupWriter) DedupStats() (hits, total int)
- func (w *DedupWriter) EndDirectory() error
- func (w *DedupWriter) Finish() error
- func (w *DedupWriter) MetaIndexData() []byte
- func (w *DedupWriter) PayloadIndexData() []byte
- func (w *DedupWriter) ReferenceSourcePayloadChunks()
- func (w *DedupWriter) WriteEntry(entry *pxar.Entry, content []byte) error
- func (w *DedupWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
type FileReader
- func NewFileReader(reader io.ReadSeeker) *FileReader
- func NewSplitFileReader(metaReader, payloadReader io.ReadSeeker) *FileReader
- func (r *FileReader) Close() error
- func (r *FileReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *FileReader) Lookup(path string) (*pxar.Entry, error)
- func (r *FileReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *FileReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *FileReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
- func (r *FileReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *FileReader) ReadRoot() (*pxar.Entry, error)
type MetadataWalkFunc
type Options
type PBSReader
- func NewPBSReader(ctx context.Context, cfg PBSReaderConfig) (*PBSReader, error)
- func (r *PBSReader) Close() error
- func (r *PBSReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *PBSReader) Lookup(path string) (*pxar.Entry, error)
- func (r *PBSReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *PBSReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *PBSReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *PBSReader) ReadRoot() (*pxar.Entry, error)
type PBSReaderConfig
type PathMapping
type ReadSeeker
- func NewReadSeeker(idx *datastore.DynamicIndexReader, source datastore.ChunkSource, maxCache int) *ReadSeeker
- func (r *ReadSeeker) Close() error
- func (r *ReadSeeker) Read(p []byte) (int, error)
- func (r *ReadSeeker) ReadAt(p []byte, offset int64) (int, error)
- func (r *ReadSeeker) Seek(offset int64, whence int) (int64, error)
- func (r *ReadSeeker) SetCacheSize(n int)
type RemoteDedupWriter
- func NewRemoteDedupWriter(ctx context.Context, session backupproxy.BackupSession, ...) (*RemoteDedupWriter, error)
- func (w *RemoteDedupWriter) AdvancePayloadPosition(n uint64) error
- func (w *RemoteDedupWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
- func (w *RemoteDedupWriter) BeginDirectory(name string, meta *pxar.Metadata) error
- func (w *RemoteDedupWriter) Close() error
- func (w *RemoteDedupWriter) Encoder() *encoder.Encoder
- func (w *RemoteDedupWriter) EndDirectory() error
- func (w *RemoteDedupWriter) Finish() error
- func (w *RemoteDedupWriter) WriteEntry(entry *pxar.Entry, content []byte) error
- func (w *RemoteDedupWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
- func (w *RemoteDedupWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
type SessionWriter
- func NewSessionWriter(ctx context.Context, session backupproxy.BackupSession, ...) *SessionWriter
- func (w *SessionWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
- func (w *SessionWriter) BeginDirectory(name string, meta *pxar.Metadata) error
- func (w *SessionWriter) Close() error
- func (w *SessionWriter) Encoder() *encoder.Encoder
- func (w *SessionWriter) EndDirectory() error
- func (w *SessionWriter) Finish() error
- func (w *SessionWriter) WriteEntry(entry *pxar.Entry, content []byte) error
- func (w *SessionWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
- func (w *SessionWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
type SplitReader
- func NewSplitReader(metaIdxData, payloadIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
- func NewSplitReaderEager(metaIdxData, payloadIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
- func NewSplitReaderMetaOnly(metaIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
- func (r *SplitReader) Close() error
- func (r *SplitReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *SplitReader) Lookup(path string) (*pxar.Entry, error)
- func (r *SplitReader) PayloadReaderAt() io.ReaderAt
- func (r *SplitReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *SplitReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *SplitReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
- func (r *SplitReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *SplitReader) ReadRoot() (*pxar.Entry, error)
- func (r *SplitReader) SetPayloadCacheSize(n int)
type StreamWriter
- func NewSplitStreamWriter(output, payloadOut io.Writer) *StreamWriter
- func NewStreamWriter(output io.Writer) *StreamWriter
- func (w *StreamWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
- func (w *StreamWriter) BeginDirectory(name string, meta *pxar.Metadata) error
- func (w *StreamWriter) Close() error
- func (w *StreamWriter) Encoder() *encoder.Encoder
- func (w *StreamWriter) EndDirectory() error
- func (w *StreamWriter) Finish() error
- func (w *StreamWriter) WriteEntry(entry *pxar.Entry, content []byte) error
- func (w *StreamWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
- func (w *StreamWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
- func (w *StreamWriter) WriteHardlink(name string, target string, targetOffset encoder.LinkOffset) error
type TreeWalker
- func NewTreeWalker(reader ArchiveReader, opts WalkOption) *TreeWalker
- func (w *TreeWalker) Entry() *pxar.Entry
- func (w *TreeWalker) Err() error
- func (w *TreeWalker) Init(rootPath string) error
- func (w *TreeWalker) Next() bool
type WalkFilter
type WalkFunc
type WalkOption

Constants ¶

This section is empty.

Variables ¶

var ErrSkipDir = fmt.Errorf("skip directory")

ErrSkipDir can be returned by a WalkFunc to skip a directory's children.

var WalkMetadataOnly = WalkOption{MetaOnly: true}

WalkMetadataOnly is a convenience WalkOption for metadata-only walks.

Functions ¶

func ComputeContentDigest ¶

func ComputeContentDigest(source datastore.ChunkSource, payloadIdx *datastore.DynamicIndexReader, payloadOffset, fileSize uint64) ([32]byte, error)

ComputeContentDigest computes the SHA-256 digest of a file's content from the source archive without reconstructing the entire payload stream. It loads only the chunks needed for that specific file.

func Copy ¶

func Copy(src ArchiveReader, dst ArchiveWriter, mappings []PathMapping, opts CopyOption) error

Copy copies files from the source archive to the target writer. Each PathMapping specifies a source path (inside the source archive) and a destination path (inside the target archive). If the source entry is a directory, the entire subtree is copied with paths remapped from Src to Dst prefix. If the source entry is a file, the file is written with its path remapped.

func CopyTree ¶

func CopyTree(src ArchiveReader, dst ArchiveWriter, srcPath, dstPath string, opts CopyOption) error

CopyTree copies a directory tree from srcPath in the source archive to dstPath in the target. All entries under the source directory have their paths remapped from the srcPath prefix to the dstPath prefix.
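The prefix remapping described for Copy and CopyTree could look roughly like the following. This is a sketch under assumptions: `remap` is a hypothetical helper, and how the package normalizes paths (leading slashes, trailing slashes) is not stated in the documentation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// remap rewrites p from the src prefix to the dst prefix. It reports false
// when p is neither src itself nor located under src. Hypothetical helper;
// the package's own normalization rules may differ.
func remap(p, src, dst string) (string, bool) {
	p, src, dst = path.Clean("/"+p), path.Clean("/"+src), path.Clean("/"+dst)
	if p == src {
		return dst, true
	}
	prefix := src
	if prefix != "/" {
		prefix += "/" // match whole path components only
	}
	if !strings.HasPrefix(p, prefix) {
		return "", false
	}
	return path.Join(dst, p[len(prefix):]), true
}

func main() {
	fmt.Println(remap("/etc/ssh/sshd_config", "/etc", "/backup/etc"))
	fmt.Println(remap("/etcetera/notes", "/etc", "/backup/etc"))
}
```

Appending a slash to the source prefix before matching keeps `/etcetera` from being treated as a child of `/etc`.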

func ReadChunkedFile ¶ added in v0.24.0

func ReadChunkedFile(source datastore.ChunkSource, payloadIdx *datastore.DynamicIndexReader, payloadOffset, fileSize uint64) ([]byte, error)

ReadChunkedFile reads a file's content by loading only the necessary payload chunks. This is more efficient than reconstructing the entire payload stream when you only need specific files.

func RecordMax ¶ added in v0.24.0

func RecordMax(last **uint64, offset uint64) bool

RecordMax mirrors Rust's try_record_strictly_greater from pbs-client/src/pxar/create.rs. It records a strictly increasing offset into `last`. It returns true if offset is greater than the recorded value (or on the first call, when *last == nil) and false otherwise. Rejected offsets do not update state.

This prevents a corrupt previous archive from injecting backwards PXAR_PAYLOAD_REF offsets, which the encoder's strict offset check would reject, aborting the backup.
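The documented semantics can be sketched as follows. `recordMax` is a hypothetical re-implementation written from the description above, not the package's source.

```go
package main

import "fmt"

// recordMax follows the behaviour documented for RecordMax: accept an offset
// only if it is strictly greater than the last accepted one. Hypothetical
// re-implementation for illustration.
func recordMax(last **uint64, offset uint64) bool {
	if *last != nil && offset <= **last {
		return false // not strictly greater: reject and keep state
	}
	v := offset
	*last = &v
	return true
}

func main() {
	var last *uint64 // nil until the first offset is recorded
	for _, off := range []uint64{16, 64, 64, 32, 128} {
		fmt.Printf("offset %d accepted=%v\n", off, recordMax(&last, off))
	}
}
```

A caller would typically fall back to re-encoding the file content whenever the check returns false, rather than emitting a payload reference the encoder would reject.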

func WalkTree ¶

func WalkTree(reader ArchiveReader, rootPath string, fn WalkFunc) error

WalkTree walks a directory tree from an ArchiveReader, calling fn for each entry in encoder-compatible order. For directories, fn is called before children are walked. Content is populated for regular files; for other entry types, content is nil.

If fn returns ErrSkipDir for a directory entry, the directory's children are skipped. If fn returns any other error, walking stops.

func WalkTreeMetadata ¶ added in v0.18.0

func WalkTreeMetadata(reader ArchiveReader, rootPath string, filter WalkFilter, fn MetadataWalkFunc) error

WalkTreeMetadata walks a directory tree in metadata-only mode with a simplified callback. Content is never read, and the filter mask controls which entry types are visited. Use WalkAll to visit all types.

func WalkTreeWith ¶ added in v0.9.0

func WalkTreeWith(reader ArchiveReader, rootPath string, opts WalkOption, fn WalkFunc) error

WalkTreeWith walks a directory tree with the given options. When opts.MetaOnly is true, file content is never read and content is always nil. When opts.Filter is non-zero, entries not matching the filter are skipped.

Types ¶

type ArchiveReader ¶

type ArchiveReader interface {
	// ReadRoot returns the root directory entry.
	ReadRoot() (*pxar.Entry, error)

	// Lookup finds an entry by archive-internal path.
	Lookup(path string) (*pxar.Entry, error)

	// ListDirectory streams directory entries without materializing a slice.
	// For each entry, fn is called with a pointer valid only during the callback.
	// If fn returns a non-nil error, iteration stops.
	ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error

	// ReadFileContentReader returns a streaming reader for file content.
	// The caller must close the reader. Use this for large files to avoid
	// buffering the entire content in memory.
	ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)

	// ReadEntryAt reads a full pxar entry at the given archive byte offset.
	// Used for re-reading entries with full metadata after a minimal ListDirectory.
	ReadEntryAt(offset int64) (*pxar.Entry, error)

	// ReadCatalog streams the full directory tree via a callback with
	// minimal decoding. For each entry, fn is called. If fn returns a
	// non-nil error, iteration stops and the error is returned.
	// Significantly faster than WalkTree for indexing.
	ReadCatalog(fn func(CatalogEntry) error) error

	// Close releases resources.
	Close() error
}

ArchiveReader provides unified read access to any pxar archive format.

type ArchiveWriter ¶

type ArchiveWriter interface {
	// Begin starts writing to a new archive with the given root metadata.
	Begin(rootMeta *pxar.Metadata, opts Options) error

	// WriteEntry writes an entry (file, symlink, device, etc.) to the archive.
	// For regular files, content is the file data. For other types, content may be nil.
	WriteEntry(entry *pxar.Entry, content []byte) error

	// WriteEntryRef writes an entry that references existing payload data
	// without writing the payload itself. The payloadOffset is the byte offset
	// in the original payload stream. Used for chunk-level deduplication.
	WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error

	// WriteEntryReader writes a file entry with content streamed from r.
	// size is the total byte count. For non-file entries (symlink, device,
	// fifo, socket), r and size are ignored and content is nil.
	// The caller must ensure r provides exactly size bytes.
	WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error

	// BeginDirectory pushes a directory context.
	BeginDirectory(name string, meta *pxar.Metadata) error

	// EndDirectory pops a directory context.
	EndDirectory() error

	// Finish finalizes the archive.
	Finish() error

	// Close releases resources.
	Close() error
}

ArchiveWriter provides unified write access to any pxar archive format.

type CatalogEntry ¶ added in v0.10.0

type CatalogEntry struct {
	Path       string
	ParentPath string
	Kind       pxar.EntryKind
	FileSize   uint64
}

CatalogEntry is a stripped-down entry for index-building. It contains only the fields needed for cataloging: path, kind, size, and parent.

type ChunkRange ¶

type ChunkRange struct {
	ChunkIndex   int
	Digest       [32]byte
	ChunkStart   uint64 // start offset in the payload stream
	ChunkEnd     uint64 // end offset in the payload stream
	ContentStart uint64 // start of overlap with file content
	ContentEnd   uint64 // end of overlap with file content
	IsFullChunk  bool   // true if the entire chunk is within the file's content
}

ChunkRange describes a chunk's overlap with a file's content.

func MapFileToPayloadChunks ¶

func MapFileToPayloadChunks(payloadIdx *datastore.DynamicIndexReader, payloadOffset, fileSize uint64) []ChunkRange

MapFileToPayloadChunks maps a file's content in the source payload stream to the chunk digests that contain it. This is used to know which source chunks are needed for a specific file without downloading the entire payload stream.

It returns one ChunkRange for each chunk that overlaps the file's content range. Each ChunkRange carries the chunk index and digest together with the byte offsets of the overlap with the file content.
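The overlap computation can be sketched without the package's index type. In this sketch `chunkRange` mirrors the offset fields of ChunkRange (the digest is omitted), `chunkEnds` is an assumed index layout of cumulative chunk end offsets, and ContentStart/ContentEnd are expressed as payload-stream offsets, which is also an assumption.

```go
package main

import "fmt"

// chunkRange mirrors the offset fields of ChunkRange; Digest is omitted.
type chunkRange struct {
	ChunkIndex               int
	ChunkStart, ChunkEnd     uint64
	ContentStart, ContentEnd uint64
	IsFullChunk              bool
}

// mapFileToChunks returns the chunks overlapping [payloadOffset,
// payloadOffset+fileSize). chunkEnds is a hypothetical index: the cumulative
// end offset of each chunk in the payload stream.
func mapFileToChunks(chunkEnds []uint64, payloadOffset, fileSize uint64) []chunkRange {
	if fileSize == 0 {
		return nil
	}
	fileEnd := payloadOffset + fileSize
	var out []chunkRange
	var start uint64
	for i, end := range chunkEnds {
		if end > payloadOffset && start < fileEnd {
			cs, ce := start, end
			if payloadOffset > cs {
				cs = payloadOffset
			}
			if fileEnd < ce {
				ce = fileEnd
			}
			out = append(out, chunkRange{
				ChunkIndex: i, ChunkStart: start, ChunkEnd: end,
				ContentStart: cs, ContentEnd: ce,
				IsFullChunk: cs == start && ce == end,
			})
		}
		start = end
	}
	return out
}

func main() {
	// Three chunks covering [0,100), [100,250), [250,400); file at [80,280).
	for _, r := range mapFileToChunks([]uint64{100, 250, 400}, 80, 200) {
		fmt.Printf("%+v\n", r)
	}
}
```

Only the middle chunk is marked as a full chunk; the first and last are partially covered by the file.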

type ChunkedReader ¶ added in v0.24.0

type ChunkedReader struct {
	// contains filtered or unexported fields
}

ChunkedReader reads from a chunked archive (.pxar.didx). It lazily loads chunks on demand using a ReadSeeker, avoiding full-stream-in-memory reconstruction. For small archives where full reconstruction is acceptable, use NewChunkedReaderEager.

func NewChunkedReader ¶ added in v0.24.0

func NewChunkedReader(idxData []byte, source datastore.ChunkSource) (*ChunkedReader, error)

NewChunkedReader creates a reader for a chunked .pxar.didx archive using lazy on-demand chunk loading. This avoids reconstructing the entire stream in memory: only the chunks needed for Lookup and ReadFileContentReader calls are loaded.

func NewChunkedReaderEager ¶ added in v0.24.0

func NewChunkedReaderEager(idxData []byte, source datastore.ChunkSource) (*ChunkedReader, error)

NewChunkedReaderEager creates a reader that reconstructs the entire stream into memory upfront. Use this for small archives or when you need guaranteed sequential access performance.

func (*ChunkedReader) Close ¶ added in v0.24.0

func (r *ChunkedReader) Close() error

func (*ChunkedReader) ListDirectory ¶ added in v0.24.0

func (r *ChunkedReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error

func (*ChunkedReader) Lookup ¶ added in v0.24.0

func (r *ChunkedReader) Lookup(path string) (*pxar.Entry, error)

func (*ChunkedReader) ReadCatalog ¶ added in v0.24.0

func (r *ChunkedReader) ReadCatalog(fn func(CatalogEntry) error) error

func (*ChunkedReader) ReadEntryAt ¶ added in v0.24.0

func (r *ChunkedReader) ReadEntryAt(offset int64) (*pxar.Entry, error)

func (*ChunkedReader) ReadEntryAtMinimal ¶ added in v0.24.0

func (r *ChunkedReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)

func (*ChunkedReader) ReadFileContentReader ¶ added in v0.24.0

func (r *ChunkedReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)

func (*ChunkedReader) ReadRoot ¶ added in v0.24.0

func (r *ChunkedReader) ReadRoot() (*pxar.Entry, error)

func (*ChunkedReader) ReaderAt ¶ added in v0.24.0

func (r *ChunkedReader) ReaderAt() io.ReaderAt

ReaderAt returns the underlying io.ReaderAt for the archive stream. Returns nil for eager readers backed by bytes.Reader (use the FileReader directly if you need ReaderAt on those). The returned ReaderAt is safe for concurrent use.

type CopyOption ¶ added in v0.24.0

type CopyOption struct {
	SourceCryptConfig *datastore.CryptConfig
	TargetCryptConfig *datastore.CryptConfig
	OnProgress        func(path string, bytes uint64)
	TargetFormat      format.FormatVersion
	Overwrite         bool
}

CopyOption configures a file transfer operation.

type DecryptSource ¶ added in v0.24.0

type DecryptSource struct {
	// contains filtered or unexported fields
}

DecryptSource wraps a ChunkSource and decrypts/decompresses chunks on the fly. This is used when reading from an encrypted archive where the raw chunks are encrypted blobs that need to be decoded before restoration.

When a CryptConfig is provided, encrypted blobs are decrypted. All blobs are decoded (decompressed and, where applicable, decrypted) before being returned, producing the raw chunk data that the Restorer expects.

func NewDecryptSource ¶ added in v0.24.0

func NewDecryptSource(inner datastore.ChunkSource, cc *datastore.CryptConfig) *DecryptSource

NewDecryptSource creates a chunk source that decrypts chunks after retrieval. Pass nil for cc if the archive is not encrypted (only decompression is needed).

func (*DecryptSource) GetChunk ¶ added in v0.24.0

func (d *DecryptSource) GetChunk(digest [32]byte) ([]byte, error)

type DecryptingReader ¶

type DecryptingReader struct {
	// contains filtered or unexported fields
}

DecryptingReader is a placeholder for per-file decryption support. For most cases, using DecryptSource when constructing the underlying reader (ChunkedReader/SplitReader) is preferred since it handles decryption at the chunk level before stream reconstruction. This type simply delegates to the inner reader.

func NewDecryptingReader ¶

func NewDecryptingReader(inner ArchiveReader) *DecryptingReader

NewDecryptingReader wraps an ArchiveReader for transparent access. The underlying reader should already be configured with a DecryptSource if decryption is needed.

func (*DecryptingReader) Close ¶

func (r *DecryptingReader) Close() error

func (*DecryptingReader) ListDirectory ¶

func (r *DecryptingReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error

func (*DecryptingReader) Lookup ¶

func (r *DecryptingReader) Lookup(path string) (*pxar.Entry, error)

func (*DecryptingReader) ReadCatalog ¶ added in v0.18.0

func (r *DecryptingReader) ReadCatalog(fn func(CatalogEntry) error) error

func (*DecryptingReader) ReadEntryAt ¶ added in v0.20.0

func (r *DecryptingReader) ReadEntryAt(offset int64) (*pxar.Entry, error)

func (*DecryptingReader) ReadFileContentReader ¶ added in v0.18.0

func (r *DecryptingReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)

func (*DecryptingReader) ReadRoot ¶

func (r *DecryptingReader) ReadRoot() (*pxar.Entry, error)

type DedupWriter ¶ added in v0.24.0

type DedupWriter struct {
	// contains filtered or unexported fields
}

DedupWriter writes a v2 split archive into the same chunk store as the source, reusing existing payload chunks instead of re-uploading them.

For same-datastore transfers, this avoids:

- Re-encoding file content into a new payload buffer
- Re-chunking the payload stream (which would produce different chunks)
- Re-uploading chunks that already exist in the store

Instead, it builds the new payload index by referencing source payload chunk digests directly. Only the metadata stream needs to be re-chunked and uploaded (it's small since it only contains filenames and metadata).

The writer assumes the source's payload chunks exist in the chunk store. For files not present in the source (new content), it falls back to normal encoding and chunking.

func NewDedupWriter ¶ added in v0.24.0

func NewDedupWriter(
	store *datastore.ChunkStore,
	source datastore.ChunkSource,
	config buzhash.Config,
	compress bool,
	sourcePayloadIdx *datastore.DynamicIndexReader,
) *DedupWriter

NewDedupWriter creates a writer that reuses source payload chunks. sourcePayloadIdx is the source archive's .ppxar.didx index. source is the ChunkSource for reading source chunks (same store as target).

func (*DedupWriter) Begin ¶ added in v0.24.0

func (w *DedupWriter) Begin(rootMeta *pxar.Metadata, opts Options) error

func (*DedupWriter) BeginDirectory ¶ added in v0.24.0

func (w *DedupWriter) BeginDirectory(name string, meta *pxar.Metadata) error

func (*DedupWriter) Close ¶ added in v0.24.0

func (w *DedupWriter) Close() error

func (*DedupWriter) DedupStats ¶ added in v0.24.0

func (w *DedupWriter) DedupStats() (hits, total int)

DedupStats returns payload chunk counts as (hits, total), where hits is the number of chunks that already existed in the store.

func (*DedupWriter) EndDirectory ¶ added in v0.24.0

func (w *DedupWriter) EndDirectory() error

func (*DedupWriter) Finish ¶ added in v0.24.0

func (w *DedupWriter) Finish() error

func (*DedupWriter) MetaIndexData ¶ added in v0.24.0

func (w *DedupWriter) MetaIndexData() []byte

MetaIndexData returns the .mpxar.didx index data after Finish.

func (*DedupWriter) PayloadIndexData ¶ added in v0.24.0

func (w *DedupWriter) PayloadIndexData() []byte

PayloadIndexData returns the .ppxar.didx index data after Finish.

func (*DedupWriter) ReferenceSourcePayloadChunks ¶ added in v0.24.0

func (w *DedupWriter) ReferenceSourcePayloadChunks()

ReferenceSourcePayloadChunks marks chunks from the source's payload index as already existing in the store. Call this before Finish to enable dedup tracking. The ChunkStore.InsertChunk call will skip these chunks since they already exist on disk.

This is mainly useful for reporting — the actual dedup happens automatically via ChunkStore.InsertChunk.
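Why insertion into a content-addressed store deduplicates on its own can be shown with a small in-memory sketch. `memStore` and its `insert` method are hypothetical; the signature and on-disk behaviour of ChunkStore.InsertChunk are not shown in this documentation.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// memStore is a hypothetical in-memory chunk store keyed by SHA-256 digest.
type memStore struct {
	chunks      map[[32]byte][]byte
	hits, total int // hits: inserts that found the chunk already present
}

func newMemStore() *memStore {
	return &memStore{chunks: make(map[[32]byte][]byte)}
}

// insert stores data under its digest and skips the write if it is present.
func (s *memStore) insert(data []byte) [32]byte {
	d := sha256.Sum256(data)
	s.total++
	if _, ok := s.chunks[d]; ok {
		s.hits++ // same digest, same content: nothing to write
		return d
	}
	s.chunks[d] = append([]byte(nil), data...)
	return d
}

func main() {
	s := newMemStore()
	for _, c := range []string{"alpha", "beta", "alpha"} {
		s.insert([]byte(c))
	}
	fmt.Println(s.hits, s.total, len(s.chunks)) // prints "1 3 2"
}
```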

func (*DedupWriter) WriteEntry ¶ added in v0.24.0

func (w *DedupWriter) WriteEntry(entry *pxar.Entry, content []byte) error

func (*DedupWriter) WriteEntryReader ¶ added in v0.24.0

func (w *DedupWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error

type FileReader ¶ added in v0.24.0

type FileReader struct {
	// contains filtered or unexported fields
}

FileReader reads from a standalone .pxar file using an io.ReadSeeker. For split archives (v2), provide both the metadata and payload readers.

func NewFileReader ¶ added in v0.24.0

func NewFileReader(reader io.ReadSeeker) *FileReader

NewFileReader creates a reader for a standalone .pxar file.

func NewSplitFileReader ¶ added in v0.24.0

func NewSplitFileReader(metaReader, payloadReader io.ReadSeeker) *FileReader

NewSplitFileReader creates a reader for a split (v2) archive with separate metadata and payload streams.

func (*FileReader) Close ¶ added in v0.24.0

func (r *FileReader) Close() error

func (*FileReader) ListDirectory ¶ added in v0.24.0

func (r *FileReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error

func (*FileReader) Lookup ¶ added in v0.24.0

func (r *FileReader) Lookup(path string) (*pxar.Entry, error)

func (*FileReader) ReadCatalog ¶ added in v0.24.0

func (r *FileReader) ReadCatalog(fn func(CatalogEntry) error) error

func (*FileReader) ReadEntryAt ¶ added in v0.24.0

func (r *FileReader) ReadEntryAt(offset int64) (*pxar.Entry, error)

func (*FileReader) ReadEntryAtMinimal ¶ added in v0.24.0

func (r *FileReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)

ReadEntryAtMinimal reads a pxar entry with minimal decoding (stat only).

func (*FileReader) ReadFileContentReader ¶ added in v0.24.0

func (r *FileReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)

func (*FileReader) ReadRoot ¶ added in v0.24.0

func (r *FileReader) ReadRoot() (*pxar.Entry, error)

type MetadataWalkFunc ¶ added in v0.18.0

type MetadataWalkFunc func(entry *pxar.Entry) error

MetadataWalkFunc is called for each entry during a metadata-only walk. Unlike WalkFunc, no content parameter is provided since content is never read.

type Options ¶ added in v0.24.0

type Options struct {
	Prelude []byte
	Format  format.FormatVersion
}

Options configures how an ArchiveWriter creates archives.

type PBSReader ¶ added in v0.24.0

type PBSReader struct {
	// contains filtered or unexported fields
}

PBSReader reads archives from a PBS remote store. It downloads the index file(s) and reconstructs the archive stream using chunks from the PBS reader protocol.

func NewPBSReader ¶ added in v0.24.0

func NewPBSReader(ctx context.Context, cfg PBSReaderConfig) (*PBSReader, error)

NewPBSReader creates a reader for a PBS remote archive. For v1 archives, set ArchiveName. For v2 split archives, set MetaName and PayloadName.

func (*PBSReader) Close ¶ added in v0.24.0

func (r *PBSReader) Close() error

func (*PBSReader) ListDirectory ¶ added in v0.24.0

func (r *PBSReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error

func (*PBSReader) Lookup ¶ added in v0.24.0

func (r *PBSReader) Lookup(path string) (*pxar.Entry, error)

func (*PBSReader) ReadCatalog ¶ added in v0.24.0

func (r *PBSReader) ReadCatalog(fn func(CatalogEntry) error) error

func (*PBSReader) ReadEntryAt ¶ added in v0.24.0

func (r *PBSReader) ReadEntryAt(offset int64) (*pxar.Entry, error)

func (*PBSReader) ReadFileContentReader ¶ added in v0.24.0

func (r *PBSReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)

func (*PBSReader) ReadRoot ¶ added in v0.24.0

func (r *PBSReader) ReadRoot() (*pxar.Entry, error)

type PBSReaderConfig ¶ added in v0.24.0

type PBSReaderConfig struct {
	BackupType  string
	BackupID    string
	ArchiveName string
	MetaName    string
	PayloadName string
	Config      backupproxy.PBSConfig
	BackupTime  int64
	MetaOnly    bool
}

PBSReaderConfig holds the configuration for opening a PBS archive.

type PathMapping ¶

type PathMapping struct {
	Src string // path in the source archive
	Dst string // path in the target archive
}

PathMapping maps a source path to a destination path inside the archives.

type ReadSeeker ¶ added in v0.24.0

type ReadSeeker struct {
	// contains filtered or unexported fields
}

ReadSeeker implements io.ReadSeeker over a chunked archive stream. Instead of reconstructing the entire stream into memory, it lazily loads and decodes chunks on demand using the dynamic index and a chunk source. This is critical for same-datastore transfers where only a subset of files are needed — it avoids downloading the entire payload stream from PBS.

func NewReadSeeker ¶ added in v0.24.0

func NewReadSeeker(idx *datastore.DynamicIndexReader, source datastore.ChunkSource, maxCache int) *ReadSeeker

NewReadSeeker creates a lazy read-seeker over chunked data. maxCache controls how many decoded chunks are kept in memory (0 = unlimited).

func (*ReadSeeker) Close ¶ added in v0.24.0

func (r *ReadSeeker) Close() error

Close clears the chunk cache.

func (*ReadSeeker) Read ¶ added in v0.24.0

func (r *ReadSeeker) Read(p []byte) (int, error)

func (*ReadSeeker) ReadAt ¶ added in v0.24.0

func (r *ReadSeeker) ReadAt(p []byte, offset int64) (int, error)

ReadAt reads len(p) bytes starting at the given offset without mutating the seeker's internal position. It is safe for concurrent use.

func (*ReadSeeker) Seek ¶ added in v0.24.0

func (r *ReadSeeker) Seek(offset int64, whence int) (int64, error)

func (*ReadSeeker) SetCacheSize ¶ added in v0.24.0

func (r *ReadSeeker) SetCacheSize(n int)

SetCacheSize adjusts the maximum number of decoded chunks kept in memory. Setting it to 0 disables caching entirely: each chunk is decoded on demand and immediately discarded. This is appropriate for payload streams where content is streamed sequentially and caching would accumulate unbounded memory. Existing cached entries are evicted if the new size is lower.

type RemoteDedupWriter ¶ added in v0.24.0

type RemoteDedupWriter struct {
	// contains filtered or unexported fields
}

RemoteDedupWriter writes a split archive to PBS with chunk-level dedup.

For files that are unchanged from the original archive (pxar-only entries), it uses AddPayloadRef to reference original payload offsets without reading file content. The original payload chunks are injected into the new DIDX directly.

For new/modified files (backed entries), it writes payload data normally.

The architecture mirrors the Rust PBS client (pxar_backup_stream.rs): the encoder writes to a bufio.Writer wrapping a bounded channel sender. A separate goroutine reads the channel and presents an io.Reader to UploadPayloadWithInjection. This decouples encoding from uploading while bounding memory to roughly 10 × bufioSize, about 2.5 MB of in-flight payload data.
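The producer/consumer arrangement described above can be sketched with the standard library alone. This is an illustration of the pattern, not the package's implementation: chanWriter, chanReader, and pipeThrough are hypothetical names, and a bytes.Buffer stands in for the upload call.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// chanWriter sends each Write as one message on a bounded channel. When the
// channel is full the producer blocks, which caps the in-flight data.
type chanWriter struct{ ch chan<- []byte }

func (w chanWriter) Write(p []byte) (int, error) {
	b := make([]byte, len(p)) // copy: bufio reuses its buffer after Write returns
	copy(b, p)
	w.ch <- b
	return len(p), nil
}

// chanReader presents the receiving side of the channel as an io.Reader.
type chanReader struct {
	ch  <-chan []byte
	cur []byte
}

func (r *chanReader) Read(p []byte) (int, error) {
	for len(r.cur) == 0 {
		b, ok := <-r.ch
		if !ok {
			return 0, io.EOF // channel closed: the producer has finished
		}
		r.cur = b
	}
	n := copy(p, r.cur)
	r.cur = r.cur[n:]
	return n, nil
}

// pipeThrough writes records on the calling goroutine and drains them on a
// second goroutine. At most about depth+2 buffers of bufSize bytes exist at once.
func pipeThrough(records []string, bufSize, depth int) (string, error) {
	ch := make(chan []byte, depth)
	done := make(chan struct{})
	var out bytes.Buffer // stand-in for the uploader consuming an io.Reader
	var copyErr error
	go func() {
		defer close(done)
		_, copyErr = io.Copy(&out, &chanReader{ch: ch})
	}()

	bw := bufio.NewWriterSize(chanWriter{ch: ch}, bufSize)
	var writeErr error
	for _, rec := range records {
		if _, writeErr = bw.WriteString(rec); writeErr != nil {
			break
		}
	}
	if writeErr == nil {
		writeErr = bw.Flush()
	}
	close(ch) // signals EOF to the consumer
	<-done
	if writeErr != nil {
		return "", writeErr
	}
	return out.String(), copyErr
}

func main() {
	got, err := pipeThrough([]string{"entry-1;", "entry-2;", "entry-3;"}, 8, 10)
	fmt.Println(got, err)
}
```

The channel depth and buffer size together set the memory ceiling, so a slow consumer throttles the producer instead of letting encoded data pile up.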

func NewRemoteDedupWriter ¶ added in v0.24.0

func NewRemoteDedupWriter(
	ctx context.Context,
	session backupproxy.BackupSession,
	metaName, payloadName string,
	origPayloadIndex []byte,
) (*RemoteDedupWriter, error)

NewRemoteDedupWriter creates a dedup writer for PBS uploads. origPayloadIndex is the raw DIDX bytes from the original .ppxar.didx.

func (*RemoteDedupWriter) AdvancePayloadPosition ¶ added in v0.24.0

func (w *RemoteDedupWriter) AdvancePayloadPosition(n uint64) error

AdvancePayloadPosition advances the encoder's payload write position. Call it after all AddPayloadRef calls and before writing new files, to account for the original stream's TAIL_MARKER.

func (*RemoteDedupWriter) Begin ¶ added in v0.24.0

func (w *RemoteDedupWriter) Begin(rootMeta *pxar.Metadata, opts Options) error

func (*RemoteDedupWriter) BeginDirectory ¶ added in v0.24.0

func (w *RemoteDedupWriter) BeginDirectory(name string, meta *pxar.Metadata) error

func (*RemoteDedupWriter) Close ¶ added in v0.24.0

func (w *RemoteDedupWriter) Close() error

func (*RemoteDedupWriter) Encoder ¶ added in v0.24.0

func (w *RemoteDedupWriter) Encoder() *encoder.Encoder

Encoder returns the underlying encoder.

func (*RemoteDedupWriter) EndDirectory ¶ added in v0.24.0

func (w *RemoteDedupWriter) EndDirectory() error

func (*RemoteDedupWriter) Finish ¶ added in v0.24.0

func (w *RemoteDedupWriter) Finish() error

func (*RemoteDedupWriter) WriteEntry ¶ added in v0.24.0

func (w *RemoteDedupWriter) WriteEntry(entry *pxar.Entry, content []byte) error

func (*RemoteDedupWriter) WriteEntryReader ¶ added in v0.24.0

func (w *RemoteDedupWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error

func (*RemoteDedupWriter) WriteEntryRef ¶ added in v0.24.0

func (w *RemoteDedupWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error

WriteEntryRef writes an entry referencing existing payload data. Returns an error if payloadOffset is not strictly greater than the last accepted offset (mirrors Rust's try_record_strictly_greater validation).
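The strictly-increasing rule can be pictured with a small guard like the one below. This is a hedged sketch: offsetGuard is a hypothetical type, and the assumption that the very first offset is always accepted (including 0) is mine, not something stated here.

```go
package main

import "fmt"

// offsetGuard remembers the highest payload offset accepted so far.
type offsetGuard struct {
	last uint64
	seen bool // false until the first offset is accepted (assumption of this sketch)
}

// accept records off if it is strictly greater than the last accepted offset
// and returns an error otherwise. A rejected offset leaves the state unchanged.
func (g *offsetGuard) accept(off uint64) error {
	if g.seen && off <= g.last {
		return fmt.Errorf("payload offset %d is not greater than last accepted offset %d", off, g.last)
	}
	g.last, g.seen = off, true
	return nil
}

func main() {
	var g offsetGuard
	for _, off := range []uint64{16, 4096, 4096, 8192} {
		fmt.Println(off, g.accept(off))
	}
}
```

A caller that references existing payload data would therefore need to emit references in ascending payload order; a repeated or smaller offset is rejected.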

type SessionWriter ¶ added in v0.24.0

type SessionWriter struct {
	SplitResult *backupproxy.SplitArchiveResult
	// contains filtered or unexported fields
}

SessionWriter writes a split (v2) archive by uploading both metadata and payload streams through a BackupSession.

func NewSessionWriter ¶ added in v0.24.0

func NewSessionWriter(ctx context.Context, session backupproxy.BackupSession, metaName, payloadName string) *SessionWriter

NewSessionWriter creates a split writer that uploads via a BackupSession.

func (*SessionWriter) Begin ¶ added in v0.24.0

func (w *SessionWriter) Begin(rootMeta *pxar.Metadata, opts Options) error

func (*SessionWriter) BeginDirectory ¶ added in v0.24.0

func (w *SessionWriter) BeginDirectory(name string, meta *pxar.Metadata) error

func (*SessionWriter) Close ¶ added in v0.24.0

func (w *SessionWriter) Close() error

func (*SessionWriter) Encoder ¶ added in v0.24.0

func (w *SessionWriter) Encoder() *encoder.Encoder

Encoder returns the underlying encoder for advanced operations.

func (*SessionWriter) EndDirectory ¶ added in v0.24.0

func (w *SessionWriter) EndDirectory() error

func (*SessionWriter) Finish ¶ added in v0.24.0

func (w *SessionWriter) Finish() error

func (*SessionWriter) WriteEntry ¶ added in v0.24.0

func (w *SessionWriter) WriteEntry(entry *pxar.Entry, content []byte) error

func (*SessionWriter) WriteEntryReader ¶ added in v0.24.0

func (w *SessionWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error

func (*SessionWriter) WriteEntryRef ¶ added in v0.24.0

func (w *SessionWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error

type SplitReader ¶ added in v0.24.0

type SplitReader struct {
	// contains filtered or unexported fields
}

SplitReader reads from a split chunked archive (.mpxar.didx + .ppxar.didx). It uses lazy on-demand chunk loading for both metadata and payload streams, avoiding full-stream-in-memory reconstruction. For small archives, use NewSplitReaderEager.

func NewSplitReader ¶ added in v0.24.0

func NewSplitReader(metaIdxData, payloadIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)

NewSplitReader creates a reader for a split chunked archive using lazy on-demand chunk loading. Only the chunks needed for Lookup and ReadFileContent calls are loaded, which is critical for same-datastore PBS transfers where downloading the entire payload stream is expensive.

func NewSplitReaderEager ¶ added in v0.24.0

func NewSplitReaderEager(metaIdxData, payloadIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)

NewSplitReaderEager creates a reader that reconstructs both streams into memory upfront. Use for small archives or when you need guaranteed sequential access performance.

func NewSplitReaderMetaOnly ¶ added in v0.24.0

func NewSplitReaderMetaOnly(metaIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)

NewSplitReaderMetaOnly creates a reader for a split archive that only downloads and uses the metadata stream. The payload stream is never fetched. ReadFileContent/ReadFileContentReader will return errors for files stored in the payload stream (PayloadOffset > 0).

func (*SplitReader) Close ¶ added in v0.24.0

func (r *SplitReader) Close() error

func (*SplitReader) ListDirectory ¶ added in v0.24.0

func (r *SplitReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error

func (*SplitReader) Lookup ¶ added in v0.24.0

func (r *SplitReader) Lookup(path string) (*pxar.Entry, error)

func (*SplitReader) PayloadReaderAt ¶ added in v0.24.0

func (r *SplitReader) PayloadReaderAt() io.ReaderAt

PayloadReaderAt returns the underlying io.ReaderAt for the payload stream. Returns nil for meta-only or eager readers that don't use a ReadSeeker. The returned ReaderAt is safe for concurrent use.
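Because an io.ReaderAt carries no cursor, several goroutines can read different ranges through it at once. The sketch below shows one way a caller might fan out such reads; it uses a strings.Reader as a stand-in for the payload stream, and span, readSpans, and readFromString are hypothetical helpers. A real caller would first check that the returned ReaderAt is not nil.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"sync"
)

// span is a byte range inside the payload stream.
type span struct{ off, size int64 }

// readSpans reads every span concurrently through a shared io.ReaderAt.
func readSpans(ra io.ReaderAt, spans []span) ([][]byte, error) {
	out := make([][]byte, len(spans))
	errs := make([]error, len(spans))
	var wg sync.WaitGroup
	for i, s := range spans {
		wg.Add(1)
		go func(i int, s span) {
			defer wg.Done()
			buf := make([]byte, s.size)
			// A SectionReader gives each goroutine its own cursor over the shared source.
			_, errs[i] = io.ReadFull(io.NewSectionReader(ra, s.off, s.size), buf)
			out[i] = buf
		}(i, s)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return out, nil
}

// readFromString runs readSpans over an in-memory string for demonstration.
func readFromString(data string, spans []span) ([]string, error) {
	raw, err := readSpans(strings.NewReader(data), spans)
	if err != nil {
		return nil, err
	}
	res := make([]string, len(raw))
	for i, b := range raw {
		res[i] = string(b)
	}
	return res, nil
}

func main() {
	got, err := readFromString("0123456789", []span{{0, 3}, {7, 3}, {3, 4}})
	fmt.Println(got, err)
}
```

Each goroutine writes only to its own slot in the result slices, so no locking is needed beyond the WaitGroup.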

func (*SplitReader) ReadCatalog ¶ added in v0.24.0

func (r *SplitReader) ReadCatalog(fn func(CatalogEntry) error) error

func (*SplitReader) ReadEntryAt ¶ added in v0.24.0

func (r *SplitReader) ReadEntryAt(offset int64) (*pxar.Entry, error)

func (*SplitReader) ReadEntryAtMinimal ¶ added in v0.24.0

func (r *SplitReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)

func (*SplitReader) ReadFileContentReader ¶ added in v0.24.0

func (r *SplitReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)

func (*SplitReader) ReadRoot ¶ added in v0.24.0

func (r *SplitReader) ReadRoot() (*pxar.Entry, error)

func (*SplitReader) SetPayloadCacheSize ¶ added in v0.24.0

func (r *SplitReader) SetPayloadCacheSize(n int)

SetPayloadCacheSize adjusts the payload chunk cache size. See ReadSeeker.SetCacheSize for details.

type StreamWriter ¶ added in v0.24.0

type StreamWriter struct {
	// contains filtered or unexported fields
}

StreamWriter writes a pxar archive to one or two io.Writer streams. For v1 format, only output is used. For v2 format, both output and payloadOut are used.

func NewSplitStreamWriter ¶ added in v0.24.0

func NewSplitStreamWriter(output, payloadOut io.Writer) *StreamWriter

NewSplitStreamWriter creates a writer for v2 (split) format.

func NewStreamWriter ¶ added in v0.24.0

func NewStreamWriter(output io.Writer) *StreamWriter

NewStreamWriter creates a writer for v1 (unified) format.

func (*StreamWriter) Begin ¶ added in v0.24.0

func (w *StreamWriter) Begin(rootMeta *pxar.Metadata, opts Options) error

func (*StreamWriter) BeginDirectory ¶ added in v0.24.0

func (w *StreamWriter) BeginDirectory(name string, meta *pxar.Metadata) error

func (*StreamWriter) Close ¶ added in v0.24.0

func (w *StreamWriter) Close() error

func (*StreamWriter) Encoder ¶ added in v0.24.0

func (w *StreamWriter) Encoder() *encoder.Encoder

Encoder returns the underlying encoder for advanced operations. This is useful for getting file offsets for hardlink tracking.

func (*StreamWriter) EndDirectory ¶ added in v0.24.0

func (w *StreamWriter) EndDirectory() error

func (*StreamWriter) Finish ¶ added in v0.24.0

func (w *StreamWriter) Finish() error

func (*StreamWriter) WriteEntry ¶ added in v0.24.0

func (w *StreamWriter) WriteEntry(entry *pxar.Entry, content []byte) error

func (*StreamWriter) WriteEntryReader ¶ added in v0.24.0

func (w *StreamWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error

func (*StreamWriter) WriteEntryRef ¶ added in v0.24.0

func (w *StreamWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error

func (*StreamWriter) WriteHardlink ¶ added in v0.24.0

func (w *StreamWriter) WriteHardlink(name string, target string, targetOffset encoder.LinkOffset) error

WriteHardlink writes a hard link entry with an explicit target offset.

type TreeWalker ¶ added in v0.9.0

type TreeWalker struct {
	// contains filtered or unexported fields
}

TreeWalker provides a pull-based iterator for walking a pxar archive tree. It reuses a single Entry across all Next() calls, producing zero heap allocations per iteration.

Example:

walker := transfer.NewTreeWalker(reader, transfer.WalkOption{
    MetaOnly: true,
    Filter:   transfer.WalkFiles | transfer.WalkDirs,
})
if err := walker.Init("/"); err != nil {
    return err
}
for walker.Next() {
    entry := walker.Entry()
    // entry is reused each iteration; copy any values you need to keep
}
if err := walker.Err(); err != nil {
    return err
}

func NewTreeWalker ¶ added in v0.9.0

func NewTreeWalker(reader ArchiveReader, opts WalkOption) *TreeWalker

NewTreeWalker creates a pull-based walker for the archive. Call Init to set the root path before calling Next.

func (*TreeWalker) Entry ¶ added in v0.9.0

func (w *TreeWalker) Entry() *pxar.Entry

Entry returns the current entry. The returned pointer is valid only until the next call to Next. The same Entry memory is reused each iteration.
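The reuse rule has a practical consequence: storing the returned pointer keeps an alias to memory that the next call overwrites. The following self-contained sketch demonstrates the pitfall with hypothetical entry and walker types that only imitate the pull-based shape; they are not this package's types.

```go
package main

import "fmt"

// entry is a hypothetical, simplified archive entry.
type entry struct {
	Path string
	Size uint64
}

// walker is a pull-based iterator that reuses one entry value.
type walker struct {
	items []entry
	pos   int
	cur   entry // overwritten on every call to Next
}

// Next advances to the next item and reports whether one was available.
func (w *walker) Next() bool {
	if w.pos >= len(w.items) {
		return false
	}
	w.cur = w.items[w.pos]
	w.pos++
	return true
}

// Entry returns a pointer to the reused entry.
func (w *walker) Entry() *entry { return &w.cur }

// collect gathers entries both ways to contrast the results.
func collect(w *walker) (ptrs []*entry, copies []entry) {
	for w.Next() {
		e := w.Entry()
		ptrs = append(ptrs, e)      // pitfall: every element aliases the same memory
		copies = append(copies, *e) // safe: a value copy taken before the next Next
	}
	return ptrs, copies
}

func main() {
	w := &walker{items: []entry{{"/a", 1}, {"/b", 2}}}
	ptrs, copies := collect(w)
	fmt.Println(ptrs[0].Path, ptrs[1].Path)     // both show the last entry visited
	fmt.Println(copies[0].Path, copies[1].Path) // each shows its own entry
}
```

A plain value copy is enough for this toy struct. If an entry type holds slices or pointers, a deeper copy may be needed; whether that applies to pxar.Entry is not shown here.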

func (*TreeWalker) Err ¶ added in v0.9.0

func (w *TreeWalker) Err() error

Err returns the error that stopped iteration, if any.

func (*TreeWalker) Init ¶ added in v0.9.0

func (w *TreeWalker) Init(rootPath string) error

Init resolves the root entry and prepares the walker for iteration. Must be called before Next.

func (*TreeWalker) Next ¶ added in v0.9.0

func (w *TreeWalker) Next() bool

Next advances to the next entry matching the walk filter. Returns false when there are no more entries or on error (check Err).

type WalkFilter ¶ added in v0.9.0

type WalkFilter uint

WalkFilter is a bitmask that controls which entry types are visited during a walk. Entries whose type is not in the mask are skipped entirely — the callback is never invoked for them, and directories are not descended into.

const (
	WalkFiles     WalkFilter = 1 << iota // regular files
	WalkDirs                             // directories
	WalkSymlinks                         // symbolic links
	WalkHardlinks                        // hard links
	WalkDevices                          // device nodes
	WalkFifos                            // named pipes (FIFOs)
	WalkSockets                          // unix sockets

	WalkAll WalkFilter = WalkFiles | WalkDirs | WalkSymlinks |
		WalkHardlinks | WalkDevices | WalkFifos | WalkSockets
)
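One consequence of the pruning rule is easy to miss: a filter without WalkDirs never reaches files inside subdirectories. The sketch below replays that rule on a toy in-memory tree. The WalkFilter declaration is a local copy so the program compiles on its own; node, walk, visit, and sampleTree are hypothetical, and treating a zero filter as "accept all" follows the WalkOption documentation.

```go
package main

import "fmt"

// Local copy of the declaration above, so this sketch is self-contained.
type WalkFilter uint

const (
	WalkFiles WalkFilter = 1 << iota
	WalkDirs
	WalkSymlinks
	WalkHardlinks
	WalkDevices
	WalkFifos
	WalkSockets

	WalkAll WalkFilter = WalkFiles | WalkDirs | WalkSymlinks |
		WalkHardlinks | WalkDevices | WalkFifos | WalkSockets
)

// node is a hypothetical in-memory tree entry.
type node struct {
	name     string
	kind     WalkFilter
	children []node
}

// walk appends the path of every visited node to out.
func walk(prefix string, nodes []node, filter WalkFilter, out *[]string) {
	for _, n := range nodes {
		if filter != 0 && filter&n.kind == 0 {
			continue // filtered out: no visit and, for directories, no descent
		}
		p := prefix + "/" + n.name
		*out = append(*out, p)
		if n.kind == WalkDirs {
			walk(p, n.children, filter, out)
		}
	}
}

// visit returns the paths visited under the given filter.
func visit(nodes []node, filter WalkFilter) []string {
	var out []string
	walk("", nodes, filter, &out)
	return out
}

// sampleTree builds a directory with a file and a symlink, plus a top-level file.
func sampleTree() []node {
	return []node{
		{name: "etc", kind: WalkDirs, children: []node{
			{name: "hosts", kind: WalkFiles},
			{name: "rc", kind: WalkSymlinks},
		}},
		{name: "README", kind: WalkFiles},
	}
}

func main() {
	tree := sampleTree()
	fmt.Println(visit(tree, WalkFiles))          // /etc is pruned, so /etc/hosts is never reached
	fmt.Println(visit(tree, WalkFiles|WalkDirs)) // directories are visited and descended into
	fmt.Println(visit(tree, 0))                  // zero filter accepts every type
}
```

To visit all regular files in an archive under these rules, the filter would need to include WalkDirs and the callback would ignore directory entries itself.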

type WalkFunc ¶

type WalkFunc func(entry *pxar.Entry, content []byte) error

WalkFunc is called for each entry encountered during WalkTree. entry is the archive entry. content is the file data (nil for non-files). Return nil to continue, or an error to stop.

type WalkOption ¶ added in v0.9.0

type WalkOption struct {
	// MetaOnly skips reading file content. When true, content is never read
	// from the archive and the content parameter passed to WalkFunc is always nil.
	MetaOnly bool

	// Filter is a bitmask of entry types to include. Entries not matching
	// the filter are skipped without invoking the callback. Directories that
	// are filtered out are not descended into. Zero means accept all types.
	Filter WalkFilter

	// SkipCount fast-forwards past the first N entries without invoking the
	// callback. Entries are still decoded but the walk callback is skipped.
	// Useful for resuming a previous walk.
	SkipCount int
}

WalkOption configures walk behavior. The zero value walks all entry types and reads file content (equivalent to the original WalkTree behavior).
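SkipCount is described as useful for resuming a previous walk. One way to use such a parameter is to process a listing in batches: stop the walk with a sentinel error, remember how many entries were handled, and restart with that count. The sketch below shows the pattern over a flat slice. The walk function here is a hypothetical stand-in, and it assumes the skip is counted in traversal order; how SkipCount interacts with Filter is not specified in this section.

```go
package main

import (
	"errors"
	"fmt"
)

// errStop is a sentinel the callback returns to end a walk early.
var errStop = errors.New("stop walk")

// walk calls fn for each entry after silently skipping the first skip entries.
func walk(entries []string, skip int, fn func(string) error) error {
	for i, e := range entries {
		if i < skip {
			continue
		}
		if err := fn(e); err != nil {
			return err
		}
	}
	return nil
}

// walkInBatches repeatedly restarts the walk, each time skipping the entries
// already processed, and returns the entries grouped by batch.
func walkInBatches(entries []string, batch int) [][]string {
	if batch <= 0 {
		return nil
	}
	var batches [][]string
	done := 0
	for {
		var cur []string
		err := walk(entries, done, func(e string) error {
			if len(cur) == batch {
				return errStop // batch is full; this entry is handled next round
			}
			cur = append(cur, e)
			return nil
		})
		done += len(cur)
		if len(cur) > 0 {
			batches = append(batches, cur)
		}
		if err == nil {
			return batches // the walk ran to completion
		}
	}
}

func main() {
	fmt.Println(walkInBatches([]string{"a", "b", "c", "d", "e"}, 2))
}
```

Each restart still iterates over the skipped entries, which matches the note that skipped entries are decoded but not passed to the callback.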

