Documentation ¶
Overview ¶
Package transfer provides utilities for transferring files between pxar archives.
Transfer implements archive-to-archive copy, directory tree walking, catalog extraction, lazy chunk loading, dedup-aware writing, and streaming read/write adapters for PBS remote stores, local chunk stores, and raw io.Writer streams.
Readers ¶
All source formats implement ArchiveReader:
- FileReader: standalone .pxar files via io.ReadSeeker
- ChunkedReader: lazy on-demand chunk loading from .didx indexes
- SplitReader: v2 split archives (.mpxar.didx + .ppxar.didx)
- PBSReader: PBS remote stores via H2 reader protocol
- DecryptingReader: pass-through wrapper around any ArchiveReader; decryption itself is handled at the chunk level by DecryptSource
Writers ¶
All target formats implement ArchiveWriter:
- StreamWriter: encodes to io.Writer (v1 or v2 split)
- DedupWriter: same-datastore dedup with chunk reuse
- RemoteDedupWriter: PBS remote dedup with chunk injection
- SessionWriter: uploads via BackupSession
Transfer Functions ¶
Copy copies specific paths between archives with optional path mapping. CopyTree copies entire directory trees.
Walk Functions ¶
WalkTree visits every entry with optional content reading. WalkTreeWith supports metadata-only mode, type filters, and skip counts. WalkTreeMetadata performs metadata-only traversal with a type filter.
Dedup Utilities ¶
RecordMax provides monotonic offset validation for dedup writers. MapFileToPayloadChunks maps file content to payload chunk ranges. ReadChunkedFile reads file content from specific chunks. ComputeContentDigest computes SHA-256 without full stream reconstruction.
Lazy Chunk Loading ¶
ReadSeeker implements io.ReadSeeker over chunked data with configurable chunk cache. DecryptSource wraps ChunkSource for encrypted chunks.
Index ¶
- Variables
- func ComputeContentDigest(source datastore.ChunkSource, payloadIdx *datastore.DynamicIndexReader, ...) ([32]byte, error)
- func Copy(src ArchiveReader, dst ArchiveWriter, mappings []PathMapping, opts CopyOption) error
- func CopyTree(src ArchiveReader, dst ArchiveWriter, srcPath, dstPath string, opts CopyOption) error
- func ReadChunkedFile(source datastore.ChunkSource, payloadIdx *datastore.DynamicIndexReader, ...) ([]byte, error)
- func RecordMax(last **uint64, offset uint64) bool
- func WalkTree(reader ArchiveReader, rootPath string, fn WalkFunc) error
- func WalkTreeMetadata(reader ArchiveReader, rootPath string, filter WalkFilter, fn MetadataWalkFunc) error
- func WalkTreeWith(reader ArchiveReader, rootPath string, opts WalkOption, fn WalkFunc) error
- type ArchiveReader
- type ArchiveWriter
- type CatalogEntry
- type ChunkRange
- type ChunkedReader
- func (r *ChunkedReader) Close() error
- func (r *ChunkedReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *ChunkedReader) Lookup(path string) (*pxar.Entry, error)
- func (r *ChunkedReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *ChunkedReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *ChunkedReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
- func (r *ChunkedReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *ChunkedReader) ReadRoot() (*pxar.Entry, error)
- func (r *ChunkedReader) ReaderAt() io.ReaderAt
- type CopyOption
- type DecryptSource
- type DecryptingReader
- func (r *DecryptingReader) Close() error
- func (r *DecryptingReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *DecryptingReader) Lookup(path string) (*pxar.Entry, error)
- func (r *DecryptingReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *DecryptingReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *DecryptingReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *DecryptingReader) ReadRoot() (*pxar.Entry, error)
- type DedupWriter
- func (w *DedupWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
- func (w *DedupWriter) BeginDirectory(name string, meta *pxar.Metadata) error
- func (w *DedupWriter) Close() error
- func (w *DedupWriter) DedupStats() (hits, total int)
- func (w *DedupWriter) EndDirectory() error
- func (w *DedupWriter) Finish() error
- func (w *DedupWriter) MetaIndexData() []byte
- func (w *DedupWriter) PayloadIndexData() []byte
- func (w *DedupWriter) ReferenceSourcePayloadChunks()
- func (w *DedupWriter) WriteEntry(entry *pxar.Entry, content []byte) error
- func (w *DedupWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
- type FileReader
- func (r *FileReader) Close() error
- func (r *FileReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *FileReader) Lookup(path string) (*pxar.Entry, error)
- func (r *FileReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *FileReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *FileReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
- func (r *FileReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *FileReader) ReadRoot() (*pxar.Entry, error)
- type MetadataWalkFunc
- type Options
- type PBSReader
- func (r *PBSReader) Close() error
- func (r *PBSReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *PBSReader) Lookup(path string) (*pxar.Entry, error)
- func (r *PBSReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *PBSReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *PBSReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *PBSReader) ReadRoot() (*pxar.Entry, error)
- type PBSReaderConfig
- type PathMapping
- type ReadSeeker
- type RemoteDedupWriter
- func (w *RemoteDedupWriter) AdvancePayloadPosition(n uint64) error
- func (w *RemoteDedupWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
- func (w *RemoteDedupWriter) BeginDirectory(name string, meta *pxar.Metadata) error
- func (w *RemoteDedupWriter) Close() error
- func (w *RemoteDedupWriter) Encoder() *encoder.Encoder
- func (w *RemoteDedupWriter) EndDirectory() error
- func (w *RemoteDedupWriter) Finish() error
- func (w *RemoteDedupWriter) WriteEntry(entry *pxar.Entry, content []byte) error
- func (w *RemoteDedupWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
- func (w *RemoteDedupWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
- type SessionWriter
- func (w *SessionWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
- func (w *SessionWriter) BeginDirectory(name string, meta *pxar.Metadata) error
- func (w *SessionWriter) Close() error
- func (w *SessionWriter) Encoder() *encoder.Encoder
- func (w *SessionWriter) EndDirectory() error
- func (w *SessionWriter) Finish() error
- func (w *SessionWriter) WriteEntry(entry *pxar.Entry, content []byte) error
- func (w *SessionWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
- func (w *SessionWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
- type SplitReader
- func NewSplitReader(metaIdxData, payloadIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
- func NewSplitReaderEager(metaIdxData, payloadIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
- func NewSplitReaderMetaOnly(metaIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
- func (r *SplitReader) Close() error
- func (r *SplitReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
- func (r *SplitReader) Lookup(path string) (*pxar.Entry, error)
- func (r *SplitReader) PayloadReaderAt() io.ReaderAt
- func (r *SplitReader) ReadCatalog(fn func(CatalogEntry) error) error
- func (r *SplitReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
- func (r *SplitReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
- func (r *SplitReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
- func (r *SplitReader) ReadRoot() (*pxar.Entry, error)
- func (r *SplitReader) SetPayloadCacheSize(n int)
- type StreamWriter
- func (w *StreamWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
- func (w *StreamWriter) BeginDirectory(name string, meta *pxar.Metadata) error
- func (w *StreamWriter) Close() error
- func (w *StreamWriter) Encoder() *encoder.Encoder
- func (w *StreamWriter) EndDirectory() error
- func (w *StreamWriter) Finish() error
- func (w *StreamWriter) WriteEntry(entry *pxar.Entry, content []byte) error
- func (w *StreamWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
- func (w *StreamWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
- func (w *StreamWriter) WriteHardlink(name string, target string, targetOffset encoder.LinkOffset) error
- type TreeWalker
- type WalkFilter
- type WalkFunc
- type WalkOption
Constants ¶
This section is empty.
Variables ¶
var ErrSkipDir = fmt.Errorf("skip directory")
ErrSkipDir can be returned by a WalkFunc to skip a directory's children.
var WalkMetadataOnly = WalkOption{MetaOnly: true}
WalkMetadataOnly is a convenience WalkOption for metadata-only walks.
Functions ¶
func ComputeContentDigest ¶
func ComputeContentDigest(source datastore.ChunkSource, payloadIdx *datastore.DynamicIndexReader, payloadOffset, fileSize uint64) ([32]byte, error)
ComputeContentDigest computes the SHA-256 digest of a file's content from the source archive without reconstructing the entire payload stream. It loads only the chunks needed for that file.
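The idea of hashing a file while touching only the chunks that overlap it can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: digestRange, the load callback, and the convention that ends[i] is the exclusive end offset of chunk i are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digestRange hashes stream bytes [off, off+size) and reports how many chunks
// it had to load. ends[i] is the exclusive end offset of chunk i (assumption),
// and load is a hypothetical callback returning the decoded bytes of chunk i.
func digestRange(ends []uint64, load func(i int) []byte, off, size uint64) ([32]byte, int) {
	h := sha256.New()
	fileEnd := off + size
	loaded := 0
	var start uint64
	for i, end := range ends {
		if size > 0 && end > off && start < fileEnd {
			data := load(i) // only overlapping chunks are fetched
			loaded++
			lo, hi := uint64(0), end-start
			if off > start {
				lo = off - start
			}
			if fileEnd < end {
				hi = fileEnd - start
			}
			h.Write(data[lo:hi])
		}
		start = end
	}
	var sum [32]byte
	copy(sum[:], h.Sum(nil))
	return sum, loaded
}

// demoDigest hashes "pxar" out of the three-chunk stream "hello pxar world".
func demoDigest() (sum, want [32]byte, loaded int) {
	chunks := []string{"hello ", "pxar ", "world"}
	ends := []uint64{6, 11, 16}
	load := func(i int) []byte { return []byte(chunks[i]) }
	sum, loaded = digestRange(ends, load, 6, 4)
	return sum, sha256.Sum256([]byte("pxar")), loaded
}

func main() {
	sum, want, loaded := demoDigest()
	fmt.Println(sum == want, loaded) // prints "true 1"
}
```

In the demo the file lies entirely inside the second chunk, so the other two chunks are never requested.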
func Copy ¶
func Copy(src ArchiveReader, dst ArchiveWriter, mappings []PathMapping, opts CopyOption) error
Copy copies files from the source archive to the target writer. Each PathMapping specifies a source path (inside the source archive) and a destination path (inside the target archive). If the source entry is a directory, the entire subtree is copied with paths remapped from Src to Dst prefix. If the source entry is a file, the file is written with its path remapped.
func CopyTree ¶
func CopyTree(src ArchiveReader, dst ArchiveWriter, srcPath, dstPath string, opts CopyOption) error
CopyTree copies a directory tree from srcPath in the source archive to dstPath in the target. All entries under the source directory have their paths remapped from the srcPath prefix to the dstPath prefix.
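The prefix remapping described for Copy and CopyTree can be pictured as follows. remapPath is a hypothetical helper written for this illustration; the package may normalize paths differently.

```go
package main

import (
	"fmt"
	"strings"
)

// remapPath is a hypothetical helper: it rewrites p from the src prefix to the
// dst prefix and reports false when p does not live under src.
func remapPath(p, src, dst string) (string, bool) {
	src = strings.TrimSuffix(src, "/")
	dst = strings.TrimSuffix(dst, "/")
	if p == src {
		return dst, true
	}
	// Require a path separator after the prefix so that "/etc/nginxfoo"
	// is not treated as a child of "/etc/nginx".
	if !strings.HasPrefix(p, src+"/") {
		return "", false
	}
	return dst + p[len(src):], true
}

func main() {
	out, ok := remapPath("/etc/nginx/conf.d/site.conf", "/etc/nginx", "/backup/nginx")
	fmt.Println(out, ok) // prints "/backup/nginx/conf.d/site.conf true"
}
```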
func ReadChunkedFile ¶ added in v0.24.0
func ReadChunkedFile(source datastore.ChunkSource, payloadIdx *datastore.DynamicIndexReader, payloadOffset, fileSize uint64) ([]byte, error)
ReadChunkedFile reads a file's content by loading only the necessary payload chunks. This is more efficient than reconstructing the entire payload stream when you only need specific files.
func RecordMax ¶ added in v0.24.0
func RecordMax(last **uint64, offset uint64) bool
RecordMax mirrors Rust's try_record_strictly_greater from pbs-client/src/pxar/create.rs. It records a strictly increasing offset into `last`. It returns true if offset > **last (or on the first call, when *last == nil) and false otherwise. Rejected offsets do not update state.
This prevents a corrupt previous archive from injecting backwards PXAR_PAYLOAD_REF offsets, which the encoder's strict offset check would reject, aborting the backup.
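The documented contract is small enough to restate as code. The function below is an illustrative reimplementation of the behaviour described above under the lowercase name recordMax; it is not copied from the package source.

```go
package main

import "fmt"

// recordMax accepts offset only if it is strictly greater than the last
// accepted offset (or if nothing has been accepted yet).
func recordMax(last **uint64, offset uint64) bool {
	if *last != nil && offset <= **last {
		return false // rejected: state is left untouched
	}
	v := offset
	*last = &v
	return true
}

func main() {
	var last *uint64
	for _, off := range []uint64{16, 64, 64, 32, 128} {
		fmt.Println(off, recordMax(&last, off))
	}
	// Prints: 16 true, 64 true, 64 false, 32 false, 128 true (one per line).
}
```

A caller would typically skip the payload reference and fall back to writing the content when the function returns false.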
func WalkTree ¶
func WalkTree(reader ArchiveReader, rootPath string, fn WalkFunc) error
WalkTree walks a directory tree from an ArchiveReader, calling fn for each entry in encoder-compatible order. For directories, fn is called before children are walked. Content is populated for regular files; for other entry types, content is nil.
If fn returns ErrSkipDir for a directory entry, the directory's children are skipped. If fn returns any other error, walking stops.
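The skip-directory convention can be shown on an in-memory tree. Everything here (node, walk, errSkipDir) is a stand-in invented for the sketch; the real WalkFunc receives a *pxar.Entry and file content. The sketch also assumes that returning the sentinel for a non-directory entry is treated like any other error.

```go
package main

import (
	"errors"
	"fmt"
)

// errSkipDir plays the role of ErrSkipDir in this sketch.
var errSkipDir = errors.New("skip directory")

// node is a hypothetical in-memory stand-in for an archive entry.
type node struct {
	name     string
	isDir    bool
	children []*node
}

// walk calls fn for n before descending into its children.
func walk(n *node, path string, fn func(path string, n *node) error) error {
	err := fn(path, n)
	if n.isDir && errors.Is(err, errSkipDir) {
		return nil // skip this directory's children, continue with siblings
	}
	if err != nil {
		return err // any other error stops the walk
	}
	for _, c := range n.children {
		if err := walk(c, path+"/"+c.name, fn); err != nil {
			return err
		}
	}
	return nil
}

func demoTree() *node {
	return &node{isDir: true, children: []*node{
		{name: "etc", isDir: true, children: []*node{{name: "passwd"}}},
		{name: "proc", isDir: true, children: []*node{{name: "cpuinfo"}}},
		{name: "README"},
	}}
}

// visit walks root and returns every visited path, skipping the directory skip.
func visit(root *node, skip string) ([]string, error) {
	var seen []string
	err := walk(root, "", func(p string, n *node) error {
		if n.isDir && p == skip {
			return errSkipDir
		}
		seen = append(seen, p)
		return nil
	})
	return seen, err
}

func main() {
	seen, err := visit(demoTree(), "/proc")
	fmt.Printf("%q %v\n", seen, err)
}
```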
func WalkTreeMetadata ¶ added in v0.18.0
func WalkTreeMetadata(reader ArchiveReader, rootPath string, filter WalkFilter, fn MetadataWalkFunc) error
WalkTreeMetadata walks a directory tree in metadata-only mode with a simplified callback. Content is never read, and the filter mask controls which entry types are visited. Use WalkAll to visit all types.
func WalkTreeWith ¶ added in v0.9.0
func WalkTreeWith(reader ArchiveReader, rootPath string, opts WalkOption, fn WalkFunc) error
WalkTreeWith walks a directory tree with the given options. When opts.MetaOnly is true, file content is never read and content is always nil. When opts.Filter is non-zero, entries not matching the filter are skipped.
Types ¶
type ArchiveReader ¶
type ArchiveReader interface {
// ReadRoot returns the root directory entry.
ReadRoot() (*pxar.Entry, error)
// Lookup finds an entry by archive-internal path.
Lookup(path string) (*pxar.Entry, error)
// ListDirectory streams directory entries without materializing a slice.
// For each entry, fn is called with a pointer valid only during the callback.
// If fn returns a non-nil error, iteration stops.
ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
// ReadFileContentReader returns a streaming reader for file content.
// The caller must close the reader. Use this for large files to avoid
// buffering the entire content in memory.
ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
// ReadEntryAt reads a full pxar entry at the given archive byte offset.
// Used for re-reading entries with full metadata after a minimal ListDirectory.
ReadEntryAt(offset int64) (*pxar.Entry, error)
// ReadCatalog streams the full directory tree via a callback with
// minimal decoding. For each entry, fn is called. If fn returns a
// non-nil error, iteration stops and the error is returned.
// Significantly faster than WalkTree for indexing.
ReadCatalog(fn func(CatalogEntry) error) error
// Close releases resources.
Close() error
}
ArchiveReader provides unified read access to any pxar archive format.
type ArchiveWriter ¶
type ArchiveWriter interface {
// Begin starts writing to a new archive with the given root metadata.
Begin(rootMeta *pxar.Metadata, opts Options) error
// WriteEntry writes an entry (file, symlink, device, etc.) to the archive.
// For regular files, content is the file data. For other types, content may be nil.
WriteEntry(entry *pxar.Entry, content []byte) error
// WriteEntryRef writes an entry that references existing payload data
// without writing the payload itself. The payloadOffset is the byte offset
// in the original payload stream. Used for chunk-level deduplication.
WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
// WriteEntryReader writes a file entry with content streamed from r.
// size is the total byte count. For non-file entries (symlink, device,
// fifo, socket), r and size are ignored and content is nil.
// The caller must ensure r provides exactly size bytes.
WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
// BeginDirectory pushes a directory context.
BeginDirectory(name string, meta *pxar.Metadata) error
// EndDirectory pops a directory context.
EndDirectory() error
// Finish finalizes the archive.
Finish() error
// Close releases resources.
Close() error
}
ArchiveWriter provides unified write access to any pxar archive format.
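To picture what "pushes a directory context" and "pops a directory context" mean for callers, the following sketch tracks full paths with a stack. pathTracker is a hypothetical type with simplified method signatures (names only, no pxar metadata); it illustrates the nesting discipline, not the behaviour of any concrete writer in this package.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// pathTracker is a hypothetical writer that only records the full path of each
// entry, to show how a directory context stack resolves names.
type pathTracker struct {
	stack []string
	Paths []string
}

// full joins the current directory stack with name.
func (w *pathTracker) full(name string) string {
	parts := append(append([]string{}, w.stack...), name)
	return "/" + strings.Join(parts, "/")
}

// BeginDirectory records the directory and pushes it as the current context.
func (w *pathTracker) BeginDirectory(name string) {
	w.Paths = append(w.Paths, w.full(name))
	w.stack = append(w.stack, name)
}

// WriteEntry records an entry inside the current directory context.
func (w *pathTracker) WriteEntry(name string) {
	w.Paths = append(w.Paths, w.full(name))
}

// EndDirectory pops the current context and fails on unbalanced calls.
func (w *pathTracker) EndDirectory() error {
	if len(w.stack) == 0 {
		return errors.New("EndDirectory without matching BeginDirectory")
	}
	w.stack = w.stack[:len(w.stack)-1]
	return nil
}

func main() {
	w := &pathTracker{}
	w.BeginDirectory("etc")
	w.WriteEntry("passwd")
	w.BeginDirectory("ssh")
	w.WriteEntry("sshd_config")
	_ = w.EndDirectory()
	_ = w.EndDirectory()
	w.WriteEntry("README")
	fmt.Println(strings.Join(w.Paths, "\n"))
}
```

Every BeginDirectory needs a matching EndDirectory before the archive is finished; the sketch surfaces an unbalanced call as an error.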
type CatalogEntry ¶ added in v0.10.0
CatalogEntry is a stripped-down entry for index-building. It contains only the fields needed for cataloging: path, kind, size, and parent.
type ChunkRange ¶
type ChunkRange struct {
ChunkIndex int
Digest [32]byte
ChunkStart uint64 // start offset in the payload stream
ChunkEnd uint64 // end offset in the payload stream
ContentStart uint64 // start of overlap with file content
ContentEnd uint64 // end of overlap with file content
IsFullChunk bool // true if the entire chunk is within the file's content
}
ChunkRange describes a chunk's overlap with a file's content.
func MapFileToPayloadChunks ¶
func MapFileToPayloadChunks(payloadIdx *datastore.DynamicIndexReader, payloadOffset, fileSize uint64) []ChunkRange
MapFileToPayloadChunks maps a file's content in the source payload stream to the chunk digests that contain it. This is used to know which source chunks are needed for a specific file without downloading the entire payload stream.
It returns one ChunkRange per chunk that overlaps the file's content range, carrying the chunk index, the digest, and the byte offsets of the overlap with the file content.
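The overlap computation can be sketched without the datastore types. The example assumes that the index exposes each chunk's exclusive end offset (ends[i]) and that ContentStart and ContentEnd are expressed as payload-stream offsets; both are assumptions for illustration, and chunkRange and mapFileToChunks are hypothetical names that omit the digest.

```go
package main

import "fmt"

// chunkRange mirrors the offset fields of ChunkRange, minus the digest.
type chunkRange struct {
	ChunkIndex               int
	ChunkStart, ChunkEnd     uint64
	ContentStart, ContentEnd uint64
	IsFullChunk              bool
}

// mapFileToChunks returns the chunks overlapping [payloadOffset, payloadOffset+fileSize).
// Chunk i spans [ends[i-1], ends[i]), with chunk 0 starting at offset 0.
func mapFileToChunks(ends []uint64, payloadOffset, fileSize uint64) []chunkRange {
	if fileSize == 0 {
		return nil
	}
	fileEnd := payloadOffset + fileSize
	var out []chunkRange
	var start uint64
	for i, end := range ends {
		if end > payloadOffset && start < fileEnd {
			// Clamp the chunk span to the file's content range.
			cs, ce := start, end
			if payloadOffset > cs {
				cs = payloadOffset
			}
			if fileEnd < ce {
				ce = fileEnd
			}
			out = append(out, chunkRange{
				ChunkIndex: i, ChunkStart: start, ChunkEnd: end,
				ContentStart: cs, ContentEnd: ce,
				IsFullChunk: cs == start && ce == end,
			})
		}
		start = end
	}
	return out
}

func main() {
	// A 200-byte file at offset 80 over chunks [0,100), [100,250), [250,400).
	for _, r := range mapFileToChunks([]uint64{100, 250, 400}, 80, 200) {
		fmt.Printf("%+v\n", r)
	}
}
```

Only the middle chunk is fully covered by the file in this demo, so it is the only one that could be reused as-is; the first and last are partial overlaps.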
type ChunkedReader ¶ added in v0.24.0
type ChunkedReader struct {
// contains filtered or unexported fields
}
ChunkedReader reads from a chunked archive (.pxar.didx). It lazily loads chunks on demand using a ReadSeeker, avoiding full-stream-in-memory reconstruction. For small archives where full reconstruction is acceptable, use NewChunkedReaderEager.
func NewChunkedReader ¶ added in v0.24.0
func NewChunkedReader(idxData []byte, source datastore.ChunkSource) (*ChunkedReader, error)
NewChunkedReader creates a reader for a chunked .pxar.didx archive using lazy on-demand chunk loading. This avoids reconstructing the entire stream in memory: only the chunks needed for Lookup and ReadFileContentReader calls are loaded.
func NewChunkedReaderEager ¶ added in v0.24.0
func NewChunkedReaderEager(idxData []byte, source datastore.ChunkSource) (*ChunkedReader, error)
NewChunkedReaderEager creates a reader that reconstructs the entire stream into memory upfront. Use this for small archives or when you need guaranteed sequential access performance.
func (*ChunkedReader) Close ¶ added in v0.24.0
func (r *ChunkedReader) Close() error
func (*ChunkedReader) ListDirectory ¶ added in v0.24.0
func (r *ChunkedReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
func (*ChunkedReader) Lookup ¶ added in v0.24.0
func (r *ChunkedReader) Lookup(path string) (*pxar.Entry, error)
func (*ChunkedReader) ReadCatalog ¶ added in v0.24.0
func (r *ChunkedReader) ReadCatalog(fn func(CatalogEntry) error) error
func (*ChunkedReader) ReadEntryAt ¶ added in v0.24.0
func (r *ChunkedReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
func (*ChunkedReader) ReadEntryAtMinimal ¶ added in v0.24.0
func (r *ChunkedReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
func (*ChunkedReader) ReadFileContentReader ¶ added in v0.24.0
func (r *ChunkedReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
func (*ChunkedReader) ReadRoot ¶ added in v0.24.0
func (r *ChunkedReader) ReadRoot() (*pxar.Entry, error)
func (*ChunkedReader) ReaderAt ¶ added in v0.24.0
func (r *ChunkedReader) ReaderAt() io.ReaderAt
ReaderAt returns the underlying io.ReaderAt for the archive stream. Returns nil for eager readers backed by bytes.Reader (use the FileReader directly if you need ReaderAt on those). The returned ReaderAt is safe for concurrent use.
type CopyOption ¶ added in v0.24.0
type CopyOption struct {
SourceCryptConfig *datastore.CryptConfig
TargetCryptConfig *datastore.CryptConfig
OnProgress func(path string, bytes uint64)
TargetFormat format.FormatVersion
Overwrite bool
}
CopyOption configures a file transfer operation.
type DecryptSource ¶ added in v0.24.0
type DecryptSource struct {
// contains filtered or unexported fields
}
DecryptSource wraps a ChunkSource and decrypts/decompresses chunks on the fly. This is used when reading from an encrypted archive where the raw chunks are encrypted blobs that need to be decoded before restoration.
When a CryptConfig is provided, encrypted blobs are decrypted. All blobs are decoded (decompressed and, where applicable, decrypted) before being returned, producing the raw chunk data that the Restorer expects.
func NewDecryptSource ¶ added in v0.24.0
func NewDecryptSource(inner datastore.ChunkSource, cc *datastore.CryptConfig) *DecryptSource
NewDecryptSource creates a chunk source that decrypts chunks after retrieval. Pass nil for cc if the archive is not encrypted (only decompression is needed).
type DecryptingReader ¶
type DecryptingReader struct {
// contains filtered or unexported fields
}
DecryptingReader is a placeholder for per-file decryption support. For most cases, using DecryptSource when constructing the underlying reader (ChunkedReader/SplitReader) is preferred since it handles decryption at the chunk level before stream reconstruction. This type simply delegates to the inner reader.
func NewDecryptingReader ¶
func NewDecryptingReader(inner ArchiveReader) *DecryptingReader
NewDecryptingReader wraps an ArchiveReader for transparent access. The underlying reader should already be configured with a DecryptSource if decryption is needed.
func (*DecryptingReader) Close ¶
func (r *DecryptingReader) Close() error
func (*DecryptingReader) ListDirectory ¶
func (r *DecryptingReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
func (*DecryptingReader) Lookup ¶
func (r *DecryptingReader) Lookup(path string) (*pxar.Entry, error)
func (*DecryptingReader) ReadCatalog ¶ added in v0.18.0
func (r *DecryptingReader) ReadCatalog(fn func(CatalogEntry) error) error
func (*DecryptingReader) ReadEntryAt ¶ added in v0.20.0
func (r *DecryptingReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
func (*DecryptingReader) ReadFileContentReader ¶ added in v0.18.0
func (r *DecryptingReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
type DedupWriter ¶ added in v0.24.0
type DedupWriter struct {
// contains filtered or unexported fields
}
DedupWriter writes a v2 split archive into the same chunk store as the source, reusing existing payload chunks instead of re-uploading them.
For same-datastore transfers, this avoids:
- Re-encoding file content into a new payload buffer
- Re-chunking the payload stream (which would produce different chunks)
- Re-uploading chunks that already exist in the store
Instead, it builds the new payload index by referencing source payload chunk digests directly. Only the metadata stream needs to be re-chunked and uploaded (it's small since it only contains filenames and metadata).
The writer assumes the source's payload chunks exist in the chunk store. For files not present in the source (new content), it falls back to normal encoding and chunking.
func NewDedupWriter ¶ added in v0.24.0
func NewDedupWriter(store *datastore.ChunkStore, source datastore.ChunkSource, config buzhash.Config, compress bool, sourcePayloadIdx *datastore.DynamicIndexReader) *DedupWriter
NewDedupWriter creates a writer that reuses source payload chunks. sourcePayloadIdx is the source archive's .ppxar.didx index. source is the ChunkSource for reading source chunks (same store as target).
func (*DedupWriter) Begin ¶ added in v0.24.0
func (w *DedupWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
func (*DedupWriter) BeginDirectory ¶ added in v0.24.0
func (w *DedupWriter) BeginDirectory(name string, meta *pxar.Metadata) error
func (*DedupWriter) Close ¶ added in v0.24.0
func (w *DedupWriter) Close() error
func (*DedupWriter) DedupStats ¶ added in v0.24.0
func (w *DedupWriter) DedupStats() (hits, total int)
DedupStats returns (already_existed, total) payload chunk counts.
func (*DedupWriter) EndDirectory ¶ added in v0.24.0
func (w *DedupWriter) EndDirectory() error
func (*DedupWriter) Finish ¶ added in v0.24.0
func (w *DedupWriter) Finish() error
func (*DedupWriter) MetaIndexData ¶ added in v0.24.0
func (w *DedupWriter) MetaIndexData() []byte
MetaIndexData returns the .mpxar.didx index data after Finish.
func (*DedupWriter) PayloadIndexData ¶ added in v0.24.0
func (w *DedupWriter) PayloadIndexData() []byte
PayloadIndexData returns the .ppxar.didx index data after Finish.
func (*DedupWriter) ReferenceSourcePayloadChunks ¶ added in v0.24.0
func (w *DedupWriter) ReferenceSourcePayloadChunks()
ReferenceSourcePayloadChunks marks chunks from the source's payload index as already existing in the store. Call this before Finish to enable dedup tracking. The ChunkStore.InsertChunk call will skip these chunks since they already exist on disk.
This is mainly useful for reporting — the actual dedup happens automatically via ChunkStore.InsertChunk.
func (*DedupWriter) WriteEntry ¶ added in v0.24.0
func (w *DedupWriter) WriteEntry(entry *pxar.Entry, content []byte) error
func (*DedupWriter) WriteEntryReader ¶ added in v0.24.0
func (w *DedupWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
type FileReader ¶ added in v0.24.0
type FileReader struct {
// contains filtered or unexported fields
}
FileReader reads from a standalone .pxar file using an io.ReadSeeker. For split (v2) archives, use NewSplitFileReader to supply both the metadata and payload readers.
func NewFileReader ¶ added in v0.24.0
func NewFileReader(reader io.ReadSeeker) *FileReader
NewFileReader creates a reader for a standalone .pxar file.
func NewSplitFileReader ¶ added in v0.24.0
func NewSplitFileReader(metaReader, payloadReader io.ReadSeeker) *FileReader
NewSplitFileReader creates a reader for a split (v2) archive with separate metadata and payload streams.
func (*FileReader) Close ¶ added in v0.24.0
func (r *FileReader) Close() error
func (*FileReader) ListDirectory ¶ added in v0.24.0
func (r *FileReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
func (*FileReader) Lookup ¶ added in v0.24.0
func (r *FileReader) Lookup(path string) (*pxar.Entry, error)
func (*FileReader) ReadCatalog ¶ added in v0.24.0
func (r *FileReader) ReadCatalog(fn func(CatalogEntry) error) error
func (*FileReader) ReadEntryAt ¶ added in v0.24.0
func (r *FileReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
func (*FileReader) ReadEntryAtMinimal ¶ added in v0.24.0
func (r *FileReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
ReadEntryAtMinimal reads a pxar entry with minimal decoding (stat only).
func (*FileReader) ReadFileContentReader ¶ added in v0.24.0
func (r *FileReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
type MetadataWalkFunc ¶ added in v0.18.0
MetadataWalkFunc is called for each entry during a metadata-only walk. Unlike WalkFunc, no content parameter is provided since content is never read.
type Options ¶ added in v0.24.0
type Options struct {
Prelude []byte
Format format.FormatVersion
}
Options configures how an ArchiveWriter creates archives.
type PBSReader ¶ added in v0.24.0
type PBSReader struct {
// contains filtered or unexported fields
}
PBSReader reads archives from a PBS remote store. It downloads the index file(s) and reconstructs the archive stream using chunks from the PBS reader protocol.
func NewPBSReader ¶ added in v0.24.0
func NewPBSReader(ctx context.Context, cfg PBSReaderConfig) (*PBSReader, error)
NewPBSReader creates a reader for a PBS remote archive. For v1 archives, set ArchiveName. For v2 split archives, set MetaName and PayloadName.
func (*PBSReader) ListDirectory ¶ added in v0.24.0
func (r *PBSReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
func (*PBSReader) ReadCatalog ¶ added in v0.24.0
func (r *PBSReader) ReadCatalog(fn func(CatalogEntry) error) error
func (*PBSReader) ReadEntryAt ¶ added in v0.24.0
func (r *PBSReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
func (*PBSReader) ReadFileContentReader ¶ added in v0.24.0
func (r *PBSReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
type PBSReaderConfig ¶ added in v0.24.0
type PBSReaderConfig struct {
BackupType string
BackupID string
ArchiveName string
MetaName string
PayloadName string
Config backupproxy.PBSConfig
BackupTime int64
MetaOnly bool
}
PBSReaderConfig holds the configuration for opening a PBS archive.
type PathMapping ¶
type PathMapping struct {
Src string // path in the source archive
Dst string // path in the target archive
}
PathMapping maps a source path to a destination path inside the archives.
type ReadSeeker ¶ added in v0.24.0
type ReadSeeker struct {
// contains filtered or unexported fields
}
ReadSeeker implements io.ReadSeeker over a chunked archive stream. Instead of reconstructing the entire stream into memory, it lazily loads and decodes chunks on demand using the dynamic index and a chunk source. This is critical for same-datastore transfers where only a subset of files is needed, since it avoids downloading the entire payload stream from PBS.
func NewReadSeeker ¶ added in v0.24.0
func NewReadSeeker(idx *datastore.DynamicIndexReader, source datastore.ChunkSource, maxCache int) *ReadSeeker
NewReadSeeker creates a lazy read-seeker over chunked data. maxCache controls how many decoded chunks are kept in memory (0 = unlimited).
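A rough picture of a lazy, cache-bounded read-seeker is given below. It is a sketch under stated assumptions, not the package's code: lazyReader, its load callback, the ends slice of exclusive chunk end offsets, and the first-in-first-out eviction policy are all invented for the example. It treats maxCache <= 0 as unlimited, following the NewReadSeeker description.

```go
package main

import (
	"fmt"
	"io"
	"sort"
)

// lazyReader is an illustrative io.ReadSeeker over a chunked stream.
// ends[i] is the exclusive end offset of chunk i; load fetches chunk i.
type lazyReader struct {
	ends     []uint64
	size     int64 // total stream length
	load     func(i int) ([]byte, error)
	maxCache int // <= 0 means unlimited in this sketch
	cache    map[int][]byte
	order    []int // cached chunk indexes, oldest first
	pos      int64
	loads    int // number of load calls, kept for the demo
}

// chunk returns chunk i from the cache, loading and caching it on a miss.
func (r *lazyReader) chunk(i int) ([]byte, error) {
	if data, ok := r.cache[i]; ok {
		return data, nil
	}
	data, err := r.load(i)
	if err != nil {
		return nil, err
	}
	r.loads++
	r.cache[i] = data
	r.order = append(r.order, i)
	if r.maxCache > 0 && len(r.order) > r.maxCache {
		delete(r.cache, r.order[0]) // evict the oldest cached chunk
		r.order = r.order[1:]
	}
	return data, nil
}

// Read serves bytes from the single chunk that contains the current position.
func (r *lazyReader) Read(p []byte) (int, error) {
	if r.pos >= r.size {
		return 0, io.EOF
	}
	i := sort.Search(len(r.ends), func(i int) bool { return r.ends[i] > uint64(r.pos) })
	data, err := r.chunk(i)
	if err != nil {
		return 0, err
	}
	var start uint64
	if i > 0 {
		start = r.ends[i-1]
	}
	n := copy(p, data[uint64(r.pos)-start:])
	r.pos += int64(n)
	return n, nil
}

// Seek only moves the position; no chunk is touched until the next Read.
func (r *lazyReader) Seek(offset int64, whence int) (int64, error) {
	base := map[int]int64{io.SeekStart: 0, io.SeekCurrent: r.pos, io.SeekEnd: r.size}
	b, ok := base[whence]
	if !ok || b+offset < 0 {
		return 0, fmt.Errorf("invalid seek: offset %d, whence %d", offset, whence)
	}
	r.pos = b + offset
	return r.pos, nil
}

// readRange seeks to off and reads exactly n bytes.
func readRange(r *lazyReader, off int64, n int) (string, error) {
	if _, err := r.Seek(off, io.SeekStart); err != nil {
		return "", err
	}
	buf := make([]byte, n)
	_, err := io.ReadFull(r, buf)
	return string(buf), err
}

func newDemoReader() *lazyReader {
	chunks := []string{"hello ", "pxar ", "world"}
	return &lazyReader{
		ends: []uint64{6, 11, 16}, size: 16, maxCache: 2,
		load:  func(i int) ([]byte, error) { return []byte(chunks[i]), nil },
		cache: make(map[int][]byte),
	}
}

func main() {
	r := newDemoReader()
	s, err := readRange(r, 6, 4)
	fmt.Println(s, r.loads, err) // prints "pxar 1 <nil>"
}
```

Reading four bytes at offset 6 loads only the second chunk; the first and third are never fetched unless a later read needs them.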
func (*ReadSeeker) Close ¶ added in v0.24.0
func (r *ReadSeeker) Close() error
Close clears the chunk cache.
func (*ReadSeeker) ReadAt ¶ added in v0.24.0
func (r *ReadSeeker) ReadAt(p []byte, offset int64) (int, error)
ReadAt reads len(p) bytes starting at the given offset without mutating the seeker's internal position. It is safe for concurrent use.
func (*ReadSeeker) Seek ¶ added in v0.24.0
func (r *ReadSeeker) Seek(offset int64, whence int) (int64, error)
func (*ReadSeeker) SetCacheSize ¶ added in v0.24.0
func (r *ReadSeeker) SetCacheSize(n int)
SetCacheSize adjusts the maximum number of decoded chunks kept in memory. Setting to 0 disables caching entirely — each chunk is decoded on demand and immediately discarded. This is appropriate for payload streams where content is streamed sequentially and caching would accumulate unbounded memory. Existing cached entries are evicted if the new size is lower.
type RemoteDedupWriter ¶ added in v0.24.0
type RemoteDedupWriter struct {
// contains filtered or unexported fields
}
RemoteDedupWriter writes a split archive to PBS with chunk-level dedup.
For files that are unchanged from the original archive (pxar-only entries), it uses AddPayloadRef to reference original payload offsets without reading file content. The original payload chunks are injected into the new DIDX directly.
For new/modified files (backed entries), it writes payload data normally.
Architecture mirrors the Rust PBS client (pxar_backup_stream.rs): the encoder writes to a bufio.Writer wrapping a bounded channel sender. A separate goroutine reads the channel and presents an io.Reader to UploadPayloadWithInjection. This decouples encoding from uploading with bounded memory (~10 × bufioSize = ~2.5 MB in-flight payload data).
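The decoupling described above (a buffered writer feeding a bounded channel that a reader drains) can be sketched with the standard library. chanWriter, chanReader, and pipeline are hypothetical names; in this sketch the producer runs in the goroutine and the caller consumes, which reverses the roles described above but exercises the same back-pressure mechanism. In-flight data is bounded by roughly depth × bufSize.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// chanWriter sends each Write as one message on a bounded channel.
type chanWriter struct{ ch chan<- []byte }

func (w chanWriter) Write(p []byte) (int, error) {
	buf := make([]byte, len(p))
	copy(buf, p) // the caller may reuse p after Write returns
	w.ch <- buf  // blocks while the channel is full: this is the back-pressure
	return len(p), nil
}

// chanReader presents the channel as an io.Reader.
type chanReader struct {
	ch   <-chan []byte
	rest []byte // unread remainder of the last message
}

func (r *chanReader) Read(p []byte) (int, error) {
	for len(r.rest) == 0 {
		buf, ok := <-r.ch
		if !ok {
			return 0, io.EOF // producer closed the channel
		}
		r.rest = buf
	}
	n := copy(p, r.rest)
	r.rest = r.rest[n:]
	return n, nil
}

// pipeline writes parts through a bufio.Writer of bufSize bytes into a channel
// holding at most depth messages, and reads everything back on the other side.
func pipeline(parts []string, depth, bufSize int) (string, error) {
	ch := make(chan []byte, depth)
	go func() {
		defer close(ch)
		bw := bufio.NewWriterSize(chanWriter{ch}, bufSize)
		for _, part := range parts {
			bw.WriteString(part)
		}
		bw.Flush()
	}()
	out, err := io.ReadAll(&chanReader{ch: ch})
	return string(out), err
}

func main() {
	out, err := pipeline([]string{"meta", "-", "payload"}, 2, 4)
	fmt.Println(out, err) // prints "meta-payload <nil>"
}
```

Because the channel has a fixed capacity, a slow consumer stalls the producer instead of letting encoded data pile up in memory.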
func NewRemoteDedupWriter ¶ added in v0.24.0
func NewRemoteDedupWriter(ctx context.Context, session backupproxy.BackupSession, metaName, payloadName string, origPayloadIndex []byte) (*RemoteDedupWriter, error)
NewRemoteDedupWriter creates a dedup writer for PBS uploads. origPayloadIndex is the raw DIDX bytes from the original .ppxar.didx.
func (*RemoteDedupWriter) AdvancePayloadPosition ¶ added in v0.24.0
func (w *RemoteDedupWriter) AdvancePayloadPosition(n uint64) error
AdvancePayloadPosition advances the encoder's payload write position. Call after all AddPayloadRef calls to account for the original stream's TAIL_MARKER before writing new files.
func (*RemoteDedupWriter) Begin ¶ added in v0.24.0
func (w *RemoteDedupWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
func (*RemoteDedupWriter) BeginDirectory ¶ added in v0.24.0
func (w *RemoteDedupWriter) BeginDirectory(name string, meta *pxar.Metadata) error
func (*RemoteDedupWriter) Close ¶ added in v0.24.0
func (w *RemoteDedupWriter) Close() error
func (*RemoteDedupWriter) Encoder ¶ added in v0.24.0
func (w *RemoteDedupWriter) Encoder() *encoder.Encoder
Encoder returns the underlying encoder.
func (*RemoteDedupWriter) EndDirectory ¶ added in v0.24.0
func (w *RemoteDedupWriter) EndDirectory() error
func (*RemoteDedupWriter) Finish ¶ added in v0.24.0
func (w *RemoteDedupWriter) Finish() error
func (*RemoteDedupWriter) WriteEntry ¶ added in v0.24.0
func (w *RemoteDedupWriter) WriteEntry(entry *pxar.Entry, content []byte) error
func (*RemoteDedupWriter) WriteEntryReader ¶ added in v0.24.0
func (w *RemoteDedupWriter) WriteEntryReader(entry *pxar.Entry, r io.Reader, size uint64) error
func (*RemoteDedupWriter) WriteEntryRef ¶ added in v0.24.0
func (w *RemoteDedupWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
WriteEntryRef writes an entry referencing existing payload data. Returns an error if payloadOffset is not strictly greater than the last accepted offset (mirrors Rust's try_record_strictly_greater validation).
type SessionWriter ¶ added in v0.24.0
type SessionWriter struct {
SplitResult *backupproxy.SplitArchiveResult
// contains filtered or unexported fields
}
SessionWriter writes a split (v2) archive by uploading both metadata and payload streams through a BackupSession.
func NewSessionWriter ¶ added in v0.24.0
func NewSessionWriter(ctx context.Context, session backupproxy.BackupSession, metaName, payloadName string) *SessionWriter
NewSessionWriter creates a split writer that uploads via a BackupSession.
func (*SessionWriter) Begin ¶ added in v0.24.0
func (w *SessionWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
func (*SessionWriter) BeginDirectory ¶ added in v0.24.0
func (w *SessionWriter) BeginDirectory(name string, meta *pxar.Metadata) error
func (*SessionWriter) Close ¶ added in v0.24.0
func (w *SessionWriter) Close() error
func (*SessionWriter) Encoder ¶ added in v0.24.0
func (w *SessionWriter) Encoder() *encoder.Encoder
Encoder returns the underlying encoder for advanced operations.
func (*SessionWriter) EndDirectory ¶ added in v0.24.0
func (w *SessionWriter) EndDirectory() error
func (*SessionWriter) Finish ¶ added in v0.24.0
func (w *SessionWriter) Finish() error
func (*SessionWriter) WriteEntry ¶ added in v0.24.0
func (w *SessionWriter) WriteEntry(entry *pxar.Entry, content []byte) error
func (*SessionWriter) WriteEntryReader ¶ added in v0.24.0
func (*SessionWriter) WriteEntryRef ¶ added in v0.24.0
func (w *SessionWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
type SplitReader ¶ added in v0.24.0
type SplitReader struct {
// contains filtered or unexported fields
}
SplitReader reads from a split chunked archive (.mpxar.didx + .ppxar.didx). It uses lazy on-demand chunk loading for both metadata and payload streams, avoiding full-stream-in-memory reconstruction. For small archives, use NewSplitReaderEager.
func NewSplitReader ¶ added in v0.24.0
func NewSplitReader(metaIdxData, payloadIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
NewSplitReader creates a reader for a split chunked archive using lazy on-demand chunk loading. Only the chunks needed to serve Lookup and ReadFileContent calls are loaded, which is critical for same-datastore PBS transfers where downloading the entire payload stream is expensive.
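To illustrate what lazy on-demand chunk loading means in practice, here is a simplified sketch of an io.ReaderAt that fetches only the chunks overlapping a requested range. It does not reflect the package's internals: lazyReaderAt, newLazyReaderAt, the in-memory fetch function, and the unbounded map cache are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"io"
	"sort"
)

// lazyReaderAt is a hypothetical reader over a chunked stream.
// ends[i] is the exclusive end offset of chunk i in the full stream.
type lazyReaderAt struct {
	ends  []int64
	fetch func(i int) ([]byte, error) // stand-in for a chunk source
	cache map[int][]byte              // unbounded here; a real cache would be size-limited
	loads int                         // number of fetch calls, for illustration
}

// newLazyReaderAt builds the index from in-memory chunks (demo only).
func newLazyReaderAt(chunks [][]byte) *lazyReaderAt {
	r := &lazyReaderAt{cache: make(map[int][]byte)}
	var end int64
	for _, c := range chunks {
		end += int64(len(c))
		r.ends = append(r.ends, end)
	}
	r.fetch = func(i int) ([]byte, error) { return chunks[i], nil }
	return r
}

// chunk returns chunk i, fetching it only on first use.
func (r *lazyReaderAt) chunk(i int) ([]byte, error) {
	if b, ok := r.cache[i]; ok {
		return b, nil
	}
	b, err := r.fetch(i)
	if err != nil {
		return nil, err
	}
	r.loads++
	r.cache[i] = b
	return b, nil
}

// ReadAt implements io.ReaderAt and touches only the chunks that
// overlap the range [off, off+len(p)).
func (r *lazyReaderAt) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, fmt.Errorf("negative offset %d", off)
	}
	n := 0
	for n < len(p) {
		pos := off + int64(n)
		// Binary search for the first chunk whose end lies beyond pos.
		i := sort.Search(len(r.ends), func(k int) bool { return r.ends[k] > pos })
		if i == len(r.ends) {
			return n, io.EOF
		}
		start := int64(0)
		if i > 0 {
			start = r.ends[i-1]
		}
		data, err := r.chunk(i)
		if err != nil {
			return n, err
		}
		n += copy(p[n:], data[pos-start:])
	}
	return n, nil
}

func main() {
	r := newLazyReaderAt([][]byte{[]byte("hello "), []byte("lazy "), []byte("world")})
	buf := make([]byte, 4)
	if _, err := r.ReadAt(buf, 7); err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("%q loads=%d\n", buf, r.loads) // prints "azy " loads=1
}
```

The read at offset 7 falls entirely inside the second chunk, so only that chunk is fetched. An eager reader would instead load all three before serving the first byte.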
func NewSplitReaderEager ¶ added in v0.24.0
func NewSplitReaderEager(metaIdxData, payloadIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
NewSplitReaderEager creates a reader that reconstructs both streams into memory upfront. Use for small archives or when you need guaranteed sequential access performance.
func NewSplitReaderMetaOnly ¶ added in v0.24.0
func NewSplitReaderMetaOnly(metaIdxData []byte, source datastore.ChunkSource) (*SplitReader, error)
NewSplitReaderMetaOnly creates a reader for a split archive that only downloads and uses the metadata stream. The payload stream is never fetched. ReadFileContent/ReadFileContentReader will return errors for files stored in the payload stream (PayloadOffset > 0).
func (*SplitReader) Close ¶ added in v0.24.0
func (r *SplitReader) Close() error
func (*SplitReader) ListDirectory ¶ added in v0.24.0
func (r *SplitReader) ListDirectory(dirOffset int64, opts accessor.ListOption, fn func(*pxar.Entry) error) error
func (*SplitReader) Lookup ¶ added in v0.24.0
func (r *SplitReader) Lookup(path string) (*pxar.Entry, error)
func (*SplitReader) PayloadReaderAt ¶ added in v0.24.0
func (r *SplitReader) PayloadReaderAt() io.ReaderAt
PayloadReaderAt returns the underlying io.ReaderAt for the payload stream. Returns nil for meta-only or eager readers that don't use a ReadSeeker. The returned ReaderAt is safe for concurrent use.
func (*SplitReader) ReadCatalog ¶ added in v0.24.0
func (r *SplitReader) ReadCatalog(fn func(CatalogEntry) error) error
func (*SplitReader) ReadEntryAt ¶ added in v0.24.0
func (r *SplitReader) ReadEntryAt(offset int64) (*pxar.Entry, error)
func (*SplitReader) ReadEntryAtMinimal ¶ added in v0.24.0
func (r *SplitReader) ReadEntryAtMinimal(offset int64) (*pxar.Entry, error)
func (*SplitReader) ReadFileContentReader ¶ added in v0.24.0
func (r *SplitReader) ReadFileContentReader(entry *pxar.Entry) (io.ReadCloser, error)
func (*SplitReader) ReadRoot ¶ added in v0.24.0
func (r *SplitReader) ReadRoot() (*pxar.Entry, error)
func (*SplitReader) SetPayloadCacheSize ¶ added in v0.24.0
func (r *SplitReader) SetPayloadCacheSize(n int)
SetPayloadCacheSize adjusts the payload chunk cache size. See ReadSeeker.SetCacheSize for details.
type StreamWriter ¶ added in v0.24.0
type StreamWriter struct {
// contains filtered or unexported fields
}
StreamWriter writes a pxar archive to one or two io.Writer streams. For v1 format, only output is used. For v2 format, both output and payloadOut are used.
func NewSplitStreamWriter ¶ added in v0.24.0
func NewSplitStreamWriter(output, payloadOut io.Writer) *StreamWriter
NewSplitStreamWriter creates a writer for v2 (split) format.
func NewStreamWriter ¶ added in v0.24.0
func NewStreamWriter(output io.Writer) *StreamWriter
NewStreamWriter creates a writer for v1 (unified) format.
func (*StreamWriter) Begin ¶ added in v0.24.0
func (w *StreamWriter) Begin(rootMeta *pxar.Metadata, opts Options) error
func (*StreamWriter) BeginDirectory ¶ added in v0.24.0
func (w *StreamWriter) BeginDirectory(name string, meta *pxar.Metadata) error
func (*StreamWriter) Close ¶ added in v0.24.0
func (w *StreamWriter) Close() error
func (*StreamWriter) Encoder ¶ added in v0.24.0
func (w *StreamWriter) Encoder() *encoder.Encoder
Encoder returns the underlying encoder for advanced operations. This is useful for getting file offsets for hardlink tracking.
func (*StreamWriter) EndDirectory ¶ added in v0.24.0
func (w *StreamWriter) EndDirectory() error
func (*StreamWriter) Finish ¶ added in v0.24.0
func (w *StreamWriter) Finish() error
func (*StreamWriter) WriteEntry ¶ added in v0.24.0
func (w *StreamWriter) WriteEntry(entry *pxar.Entry, content []byte) error
func (*StreamWriter) WriteEntryReader ¶ added in v0.24.0
func (*StreamWriter) WriteEntryRef ¶ added in v0.24.0
func (w *StreamWriter) WriteEntryRef(entry *pxar.Entry, payloadOffset uint64) error
func (*StreamWriter) WriteHardlink ¶ added in v0.24.0
func (w *StreamWriter) WriteHardlink(name string, target string, targetOffset encoder.LinkOffset) error
WriteHardlink writes a hard link entry with an explicit target offset.
type TreeWalker ¶ added in v0.9.0
type TreeWalker struct {
// contains filtered or unexported fields
}
TreeWalker provides a pull-based iterator for walking a pxar archive tree. It reuses a single Entry across all Next() calls, producing zero heap allocations per iteration.
Example:
	walker := transfer.NewTreeWalker(reader, transfer.WalkOption{
		MetaOnly: true,
		Filter:   transfer.WalkFiles | transfer.WalkDirs,
	})
	if err := walker.Init("/"); err != nil {
		return err
	}
	for walker.Next() {
		entry := walker.Entry()
		// entry is reused each iteration; copy any values you need to keep
		_ = entry
	}
	if err := walker.Err(); err != nil {
		return err
	}
func NewTreeWalker ¶ added in v0.9.0
func NewTreeWalker(reader ArchiveReader, opts WalkOption) *TreeWalker
NewTreeWalker creates a pull-based walker for the archive. Call Init to set the root path before calling Next.
func (*TreeWalker) Entry ¶ added in v0.9.0
func (w *TreeWalker) Entry() *pxar.Entry
Entry returns the current entry. The returned pointer is valid only until the next call to Next. The same Entry memory is reused each iteration.
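The reuse rule is easy to get wrong when collecting entries. The sketch below reproduces the pitfall with hypothetical stand-in types (entry, walker, collect), not the package's own. Retained pointers all alias the same memory, while value copies keep each entry's data.

```go
package main

import "fmt"

// entry and walker are hypothetical stand-ins for illustration only.
type entry struct {
	Path string
	Size uint64
}

type walker struct {
	items []entry
	pos   int
	cur   entry // the single entry reused across Next calls
}

// Next overwrites the shared entry in place and reports whether one was produced.
func (w *walker) Next() bool {
	if w.pos >= len(w.items) {
		return false
	}
	w.cur = w.items[w.pos]
	w.pos++
	return true
}

// Entry returns a pointer to the shared entry.
func (w *walker) Entry() *entry { return &w.cur }

// collect gathers entries both ways to show the difference.
func collect(w *walker) (ptrs []*entry, copies []entry) {
	for w.Next() {
		e := w.Entry()
		ptrs = append(ptrs, e)      // wrong: every element points at the same memory
		copies = append(copies, *e) // right: a value copy survives the next call
	}
	return ptrs, copies
}

func main() {
	w := &walker{items: []entry{{"/a", 1}, {"/b", 2}, {"/c", 3}}}
	ptrs, copies := collect(w)
	fmt.Println(ptrs[0].Path, copies[0].Path) // prints "/c /a"
}
```

A plain value copy is enough for this stand-in type. If the real entry type holds slices or pointers that are also reused, those fields would need their own copies.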
func (*TreeWalker) Err ¶ added in v0.9.0
func (w *TreeWalker) Err() error
Err returns the error that stopped iteration, if any.
func (*TreeWalker) Init ¶ added in v0.9.0
func (w *TreeWalker) Init(rootPath string) error
Init resolves the root entry and prepares the walker for iteration. Must be called before Next.
func (*TreeWalker) Next ¶ added in v0.9.0
func (w *TreeWalker) Next() bool
Next advances to the next entry matching the walk filter. Returns false when there are no more entries or on error (check Err).
type WalkFilter ¶ added in v0.9.0
type WalkFilter uint
WalkFilter is a bitmask that controls which entry types are visited during a walk. Entries whose type is not in the mask are skipped entirely — the callback is never invoked for them, and directories are not descended into.
const (
	WalkFiles     WalkFilter = 1 << iota // regular files
	WalkDirs                             // directories
	WalkSymlinks                         // symbolic links
	WalkHardlinks                        // hard links
	WalkDevices                          // device nodes
	WalkFifos                            // named pipes (FIFOs)
	WalkSockets                          // unix sockets

	WalkAll WalkFilter = WalkFiles | WalkDirs | WalkSymlinks | WalkHardlinks | WalkDevices | WalkFifos | WalkSockets
)
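As a rough illustration of how such a bitmask is tested, the sketch below redeclares the constants locally and adds a hypothetical accepts helper. The helper is not part of the package, and treating a zero mask as "accept all" follows the WalkOption.Filter documentation.

```go
package main

import "fmt"

// WalkFilter and its constants are redeclared here so the sketch is self-contained.
type WalkFilter uint

const (
	WalkFiles WalkFilter = 1 << iota
	WalkDirs
	WalkSymlinks
	WalkHardlinks
	WalkDevices
	WalkFifos
	WalkSockets

	WalkAll WalkFilter = WalkFiles | WalkDirs | WalkSymlinks | WalkHardlinks | WalkDevices | WalkFifos | WalkSockets
)

// accepts is a hypothetical helper: it reports whether an entry of the
// given kind passes the filter. A zero filter accepts every kind.
func accepts(filter, kind WalkFilter) bool {
	if filter == 0 {
		return true
	}
	return filter&kind != 0
}

func main() {
	f := WalkFiles | WalkDirs
	fmt.Println(accepts(f, WalkDirs))     // prints true
	fmt.Println(accepts(f, WalkSymlinks)) // prints false
	fmt.Printf("%07b\n", WalkAll)         // prints 1111111
}
```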
type WalkFunc ¶
WalkFunc is called for each entry encountered during WalkTree. entry is the archive entry. content is the file data (nil for non-files). Return nil to continue, or an error to stop.
type WalkOption ¶ added in v0.9.0
type WalkOption struct {
// MetaOnly skips reading file content. When true, content is never read
// from the archive and the content parameter passed to WalkFunc is always nil.
MetaOnly bool
// Filter is a bitmask of entry types to include. Entries not matching
// the filter are skipped without invoking the callback. Directories that
// are filtered out are not descended into. Zero means accept all types.
Filter WalkFilter
// SkipCount fast-forwards past the first N entries without invoking the
// callback. Entries are still decoded but the walk callback is skipped.
// Useful for resuming a previous walk.
SkipCount int
}
WalkOption configures walk behavior. The zero value walks all entry types and reads file content (equivalent to the original WalkTree behavior).
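The resume use case for SkipCount can be modeled as follows. This is a sketch under stated assumptions: walkWithSkip is a hypothetical function over a flat slice of names, and it counts every entry toward the skip. Whether the real walker counts entries removed by Filter is not stated here.

```go
package main

import (
	"errors"
	"fmt"
)

// walkWithSkip is a hypothetical model of SkipCount: the first skipCount
// entries are iterated over but the callback is not invoked for them.
// It returns the number of entries for which fn succeeded.
func walkWithSkip(entries []string, skipCount int, fn func(string) error) (int, error) {
	visited := 0
	for i, e := range entries {
		if i < skipCount {
			continue // still iterated, callback skipped
		}
		if err := fn(e); err != nil {
			return visited, err
		}
		visited++
	}
	return visited, nil
}

func main() {
	entries := []string{"a", "b", "c", "d"}

	// First pass fails on "c" after two entries were processed.
	done, err := walkWithSkip(entries, 0, func(e string) error {
		if e == "c" {
			return errors.New("transient failure")
		}
		return nil
	})
	fmt.Println("processed:", done, "err:", err)

	// Resume: skip the entries that were already processed.
	_, _ = walkWithSkip(entries, done, func(e string) error {
		fmt.Println("resumed at:", e)
		return nil
	})
}
```

The caller is responsible for persisting the count of processed entries between walks. The walk order must also be stable across runs for the skip to land on the same entry.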