Documentation ¶
Overview ¶
Package datastore provides chunk storage, indexing, and backup catalog management for pxar archives.
The package implements the Proxmox Backup Server data model: backup data is split into chunks, each chunk is stored as a DataBlob (with optional zstd compression and CRC32 verification), and chunk references are tracked in dynamic or fixed index files.
Chunk Store ¶
ChunkStore manages chunk storage on the local filesystem. Each chunk is identified by its SHA-256 digest and stored under a .chunks directory:
store, err := datastore.NewChunkStore("/backup/datastore")
if err != nil {
	log.Fatal(err)
}
// Store a chunk; the first result reports whether it already existed
digest := sha256.Sum256(data)
exists, size, err := store.InsertChunk(digest, blobData)
// Load a chunk (plain assignment: blobData and err are already declared)
blobData, err = store.LoadChunk(digest)
Data Blobs ¶
All chunk data is wrapped in a DataBlob envelope containing a magic number and CRC32 checksum:
blob, err := datastore.EncodeBlob(rawChunk)
encoded := blob.Bytes()
// Decode
decoded, err := datastore.DecodeBlob(encoded)
Use EncodeCompressedBlob for zstd compression.
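The envelope layout can be illustrated with a standalone sketch that does not use this package. It assumes the CRC32 uses the IEEE polynomial, covers only the payload, and is stored little-endian; those details, and the helper names encodeBlob and decodeBlob, are assumptions made for illustration rather than the package's confirmed behavior.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// magicUncompressed mirrors MagicUncompressedBlob from the Variables section.
var magicUncompressed = [8]byte{66, 171, 56, 7, 190, 131, 112, 161}

// headerSize mirrors BlobHeaderSize: magic(8) + crc32(4).
const headerSize = 12

// encodeBlob wraps payload in a 12-byte header.
// Assumption: CRC32 (IEEE) over the payload, stored little-endian.
func encodeBlob(payload []byte) []byte {
	out := make([]byte, headerSize+len(payload))
	copy(out[:8], magicUncompressed[:])
	binary.LittleEndian.PutUint32(out[8:12], crc32.ChecksumIEEE(payload))
	copy(out[headerSize:], payload)
	return out
}

// decodeBlob checks the magic and the CRC, then returns the payload.
func decodeBlob(raw []byte) ([]byte, error) {
	if len(raw) < headerSize {
		return nil, errors.New("blob too short")
	}
	if !bytes.Equal(raw[:8], magicUncompressed[:]) {
		return nil, errors.New("unexpected magic")
	}
	payload := raw[headerSize:]
	if binary.LittleEndian.Uint32(raw[8:12]) != crc32.ChecksumIEEE(payload) {
		return nil, errors.New("crc mismatch")
	}
	return payload, nil
}

func main() {
	raw := encodeBlob([]byte("hello chunk"))
	payload, err := decodeBlob(raw)
	fmt.Println(string(payload), err)

	raw[len(raw)-1] ^= 0xff // corrupt one payload byte
	_, err = decodeBlob(raw)
	fmt.Println(err)
}
```

Flipping a single payload byte makes the decode step fail, which is the property the CRC in the header is there to provide.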
Index Files ¶
Dynamic indexes (.didx) map variable-size chunks (from buzhash chunking) to their digests and offsets:
writer := datastore.NewDynamicIndexWriter(time.Now().Unix())
writer.Add(endOffset, digest) // endOffset is the offset just past the chunk
indexData, err := writer.Finish()
// Read back
reader, err := datastore.ReadDynamicIndex(indexData)
count := reader.Count()
info, ok := reader.ChunkInfo(0)
Fixed indexes (.fidx) are used for fixed-size chunks (e.g., raw disk images).
Store Chunker ¶
StoreChunker wires together buzhash chunking, blob encoding, and chunk storage into a single pipeline:
sc := datastore.NewStoreChunker(store, chunkCfg, true) // true = compress
results, idxWriter, err := sc.ChunkStream(archiveReader)
Backup Catalog ¶
BackupType, BackupGroup, BackupDir, and BackupInfo model the PBS backup namespace hierarchy (type/id/timestamp). Manifest tracks all files in a backup snapshot:
manifest := &datastore.Manifest{
	BackupType: datastore.BackupHost.String(),
	BackupID:   "myhost",
	BackupTime: time.Now().Unix(),
	Files:      []datastore.FileInfo{...},
}
data, err := manifest.Marshal()
Index ¶
- Constants
- Variables
- func BlobHeaderSizeFor(magic [8]byte) int
- func DecodeBlob(raw []byte) ([]byte, error)
- func DecodeBlobInto(dst []byte, raw []byte) ([]byte, error)
- func EncodeBlobTo(dst []byte, data []byte) ([]byte, error)
- func EncodeCompressedBlobTo(dst []byte, data []byte) ([]byte, error)
- func IsCompressedMagic(magic [8]byte) bool
- func IsEncryptedMagic(magic [8]byte) bool
- type BackupDir
- type BackupGroup
- type BackupInfo
- type BackupType
- type BlobHeader
- type ChunkInfo
- type ChunkResult
- type ChunkSource
- type ChunkStore
- type ChunkStoreSource
- type DataBlob
- type DynamicEntry
- type DynamicIndexHeader
- type DynamicIndexReader
- func (r *DynamicIndexReader) CTime() int64
- func (r *DynamicIndexReader) ChunkFromOffset(offset uint64) (int, bool)
- func (r *DynamicIndexReader) ChunkInfo(pos int) (ChunkInfo, bool)
- func (r *DynamicIndexReader) ComputeCsum() ([32]byte, uint64)
- func (r *DynamicIndexReader) Count() int
- func (r *DynamicIndexReader) Entry(i int) DynamicEntry
- func (r *DynamicIndexReader) IndexBytes() uint64
- func (r *DynamicIndexReader) IndexDigest(pos int) ([32]byte, bool)
- type DynamicIndexWriter
- type EncryptedBlobHeader
- type FileInfo
- type FixedIndexHeader
- type FixedIndexReader
- func (r *FixedIndexReader) CTime() int64
- func (r *FixedIndexReader) ChunkFromOffset(offset uint64) (int, bool)
- func (r *FixedIndexReader) ChunkInfo(pos int) (ChunkInfo, bool)
- func (r *FixedIndexReader) ComputeCsum() ([32]byte, uint64)
- func (r *FixedIndexReader) Count() int
- func (r *FixedIndexReader) IndexBytes() uint64
- func (r *FixedIndexReader) IndexDigest(pos int) ([32]byte, bool)
- type FixedIndexWriter
- type Manifest
- type Restorer
- type StoreChunker
Constants ¶
const (
	BlobHeaderSize          = 12 // magic(8) + crc32(4)
	EncryptedBlobHeaderSize = 48 // magic(8) + crc32(4) + iv(16) + tag(16)
	IndexHeaderSize         = 4096
	DynamicEntrySize        = 40 // end_offset(8) + digest(32)
	FixedDigestSize         = 32
	MaxBlobSize             = 128 * 1024 * 1024 // 128MB
)
Variables ¶
var (
	MagicUncompressedBlob  = [8]byte{66, 171, 56, 7, 190, 131, 112, 161}
	MagicCompressedBlob    = [8]byte{49, 185, 88, 66, 111, 182, 163, 127}
	MagicEncryptedBlob     = [8]byte{123, 103, 133, 190, 34, 45, 76, 240}
	MagicEncrComprBlob     = [8]byte{230, 89, 27, 191, 11, 191, 216, 11}
	MagicFixedChunkIndex   = [8]byte{47, 127, 65, 237, 145, 253, 15, 205}
	MagicDynamicChunkIndex = [8]byte{28, 145, 78, 165, 25, 186, 179, 205}
)
Magic numbers from Proxmox Backup Server (file_formats.rs).
Functions ¶
func BlobHeaderSizeFor ¶
BlobHeaderSizeFor returns the header size for the given blob magic. Panics for unknown magic values.
func DecodeBlob ¶
DecodeBlob decodes a raw blob, verifies CRC, and returns the payload data.
func DecodeBlobInto ¶ added in v0.3.2
DecodeBlobInto decodes a raw blob into dst, verifying CRC. For compressed blobs, dst is used as the decompression output buffer (grown if needed). For uncompressed blobs, returns a slice into raw (zero allocation).
func EncodeBlobTo ¶ added in v0.3.2
EncodeBlobTo encodes data as an uncompressed blob into dst, which must have capacity of at least BlobHeaderSize+len(data). Returns the slice of dst containing the encoded blob. This avoids the DataBlob wrapper allocation.
func EncodeCompressedBlobTo ¶ added in v0.3.2
EncodeCompressedBlobTo encodes data as a compressed blob into dst. If compression doesn't reduce size, falls back to uncompressed format. Returns the slice of dst containing the encoded blob.
func IsCompressedMagic ¶
IsCompressedMagic returns true for compressed blob types.
func IsEncryptedMagic ¶
IsEncryptedMagic returns true for encrypted blob types.
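The magic-based helpers above can be pictured as plain array comparisons against the values listed in the Variables section. The sketch below is standalone and its function names are hypothetical. Unlike BlobHeaderSizeFor, which panics on unknown magic values, the sketch returns a boolean so it can be exercised safely; the header sizes are taken from the Constants section.

```go
package main

import "fmt"

// Magic values copied from the Variables section.
var (
	magicUncompressed = [8]byte{66, 171, 56, 7, 190, 131, 112, 161}
	magicCompressed   = [8]byte{49, 185, 88, 66, 111, 182, 163, 127}
	magicEncrypted    = [8]byte{123, 103, 133, 190, 34, 45, 76, 240}
	magicEncrCompr    = [8]byte{230, 89, 27, 191, 11, 191, 216, 11}
)

// isCompressed reports whether the magic denotes a compressed blob type.
func isCompressed(m [8]byte) bool {
	return m == magicCompressed || m == magicEncrCompr
}

// isEncrypted reports whether the magic denotes an encrypted blob type.
func isEncrypted(m [8]byte) bool {
	return m == magicEncrypted || m == magicEncrCompr
}

// headerSizeFor returns the header size for a blob magic.
// The sizes follow BlobHeaderSize and EncryptedBlobHeaderSize.
// ok is false for an unknown magic (the real function panics instead).
func headerSizeFor(m [8]byte) (size int, ok bool) {
	switch m {
	case magicUncompressed, magicCompressed:
		return 12, true
	case magicEncrypted, magicEncrCompr:
		return 48, true
	}
	return 0, false
}

func main() {
	for _, m := range [][8]byte{magicUncompressed, magicCompressed, magicEncrypted, magicEncrCompr} {
		size, _ := headerSizeFor(m)
		fmt.Println(isCompressed(m), isEncrypted(m), size)
	}
}
```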
Types ¶
type BackupDir ¶
type BackupDir struct {
	Group     BackupGroup
	Timestamp time.Time
}
BackupDir represents a single backup snapshot.
func (BackupDir) Info ¶
func (d BackupDir) Info() (*BackupInfo, error)
Info returns detailed information about this backup snapshot.
type BackupGroup ¶
type BackupGroup struct {
	Type BackupType
	ID   string
	Base string // base directory (datastore root)
}
BackupGroup represents a collection of backup snapshots (e.g., vm/100).
func ListBackupGroups ¶
func ListBackupGroups(base string) ([]BackupGroup, error)
ListBackupGroups returns all backup groups in the datastore base directory.
func (BackupGroup) Destroy ¶
func (g BackupGroup) Destroy() error
Destroy removes the backup group directory.
func (BackupGroup) FullPath ¶
func (g BackupGroup) FullPath() string
FullPath returns the absolute path under the base directory.
func (BackupGroup) ListSnapshots ¶
func (g BackupGroup) ListSnapshots() ([]BackupDir, error)
ListSnapshots returns all backup snapshots in this group.
func (BackupGroup) Path ¶
func (g BackupGroup) Path() string
Path returns the relative path for this group (e.g., "vm/100").
type BackupInfo ¶
BackupInfo holds metadata about a backup snapshot.
func (*BackupInfo) Protect ¶
func (info *BackupInfo) Protect() error
Protect marks the backup as protected by creating a .protected file.
func (*BackupInfo) Unprotect ¶
func (info *BackupInfo) Unprotect() error
Unprotect removes the protection marker.
type BackupType ¶
type BackupType int
BackupType identifies the kind of backup.
const (
	BackupVM BackupType = iota
	BackupCT
	BackupHost
)
func ParseBackupType ¶
func ParseBackupType(s string) (BackupType, error)
ParseBackupType parses a backup type string.
func (BackupType) String ¶
func (bt BackupType) String() string
type BlobHeader ¶
BlobHeader is the 12-byte header for uncompressed and compressed blobs.
func UnmarshalBlobHeader ¶
func UnmarshalBlobHeader(data []byte) (BlobHeader, error)
UnmarshalBlobHeader parses a BlobHeader from raw bytes.
func (*BlobHeader) MarshalTo ¶
func (h *BlobHeader) MarshalTo(buf []byte)
MarshalTo writes the header to buf (must be at least BlobHeaderSize bytes).
type ChunkResult ¶
type ChunkResult struct {
	Digest [32]byte // SHA-256 of raw chunk data
	Offset uint64   // start offset in the original stream
	Size   int      // chunk data size in bytes
	Exists bool     // true if chunk was already in the store
}
ChunkResult describes a single chunk produced by the chunker pipeline.
type ChunkSource ¶
type ChunkSource interface {
	// GetChunk retrieves a chunk by its SHA-256 digest.
	// Returns the raw chunk data (not decoded/blob-wrapped).
	GetChunk(digest [32]byte) ([]byte, error)
}
ChunkSource provides access to chunks by their digest.
type ChunkStore ¶
type ChunkStore struct {
// contains filtered or unexported fields
}
ChunkStore manages chunk storage on the filesystem. Chunks are stored under base/.chunks/XX/XXYY... where XX are the first two hex characters of the SHA-256 digest.
func NewChunkStore ¶
func NewChunkStore(base string) (*ChunkStore, error)
NewChunkStore creates a ChunkStore rooted at base, creating the .chunks directory if needed.
func (*ChunkStore) ChunkPath ¶
func (cs *ChunkStore) ChunkPath(digest [32]byte) string
ChunkPath returns the filesystem path for a chunk identified by digest.
func (*ChunkStore) InsertChunk ¶
InsertChunk stores a chunk. Returns (exists, size, error). If the chunk already exists, returns (true, existingSize, nil).
func (*ChunkStore) LoadChunk ¶
func (cs *ChunkStore) LoadChunk(digest [32]byte) ([]byte, error)
LoadChunk reads a chunk from disk.
func (*ChunkStore) TouchChunk ¶
func (cs *ChunkStore) TouchChunk(digest [32]byte) error
TouchChunk updates the access time of a chunk file.
type ChunkStoreSource ¶
type ChunkStoreSource struct {
// contains filtered or unexported fields
}
ChunkStoreSource adapts a ChunkStore to the ChunkSource interface.
func NewChunkStoreSource ¶
func NewChunkStoreSource(store *ChunkStore) *ChunkStoreSource
NewChunkStoreSource creates a chunk source from a local chunk store.
type DataBlob ¶
type DataBlob struct {
// contains filtered or unexported fields
}
DataBlob represents a stored data blob with optional compression. The raw data contains the magic, CRC, and payload.
func EncodeBlob ¶
EncodeBlob creates an uncompressed blob from data.
func EncodeCompressedBlob ¶
EncodeCompressedBlob creates a compressed blob. Falls back to uncompressed if compression doesn't reduce size.
func (*DataBlob) IsCompressed ¶
IsCompressed returns true if the blob uses compression.
func (*DataBlob) IsEncrypted ¶
IsEncrypted returns true if the blob uses encryption.
type DynamicEntry ¶
DynamicEntry is a single entry in a dynamic index (40 bytes).
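The 40-byte layout follows the DynamicEntrySize comment in the Constants section: end_offset(8) followed by digest(32). A standalone sketch of packing and unpacking one entry is shown below. Little-endian byte order and the type and method names are assumptions for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// entrySize mirrors DynamicEntrySize: end_offset(8) + digest(32).
const entrySize = 40

// entry is a hypothetical stand-in for DynamicEntry.
type entry struct {
	EndOffset uint64
	Digest    [32]byte
}

// marshalTo writes the entry into buf, which must hold at least entrySize bytes.
// Assumption: the offset is stored little-endian.
func (e entry) marshalTo(buf []byte) {
	binary.LittleEndian.PutUint64(buf[:8], e.EndOffset)
	copy(buf[8:entrySize], e.Digest[:])
}

// unmarshalEntry parses one entry from the start of buf.
func unmarshalEntry(buf []byte) (entry, error) {
	if len(buf) < entrySize {
		return entry{}, errors.New("entry too short")
	}
	var e entry
	e.EndOffset = binary.LittleEndian.Uint64(buf[:8])
	copy(e.Digest[:], buf[8:entrySize])
	return e, nil
}

func main() {
	in := entry{EndOffset: 4096}
	in.Digest[0] = 0xab

	buf := make([]byte, entrySize)
	in.marshalTo(buf)

	out, err := unmarshalEntry(buf)
	fmt.Println(out.EndOffset, out.Digest[0], err)
}
```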
type DynamicIndexHeader ¶
DynamicIndexHeader is the 4096-byte header for dynamic chunk index files.
func UnmarshalDynamicIndexHeader ¶
func UnmarshalDynamicIndexHeader(data []byte) (DynamicIndexHeader, error)
UnmarshalDynamicIndexHeader parses a DynamicIndexHeader from raw bytes.
func (*DynamicIndexHeader) MarshalTo ¶
func (h *DynamicIndexHeader) MarshalTo(buf []byte)
MarshalTo writes the header to buf (must be at least IndexHeaderSize bytes).
type DynamicIndexReader ¶
type DynamicIndexReader struct {
// contains filtered or unexported fields
}
DynamicIndexReader reads a dynamic chunk index.
func ReadDynamicIndex ¶
func ReadDynamicIndex(data []byte) (*DynamicIndexReader, error)
ReadDynamicIndex parses a dynamic index from raw bytes.
func (*DynamicIndexReader) CTime ¶
func (r *DynamicIndexReader) CTime() int64
CTime returns the creation timestamp.
func (*DynamicIndexReader) ChunkFromOffset ¶
func (r *DynamicIndexReader) ChunkFromOffset(offset uint64) (int, bool)
ChunkFromOffset returns the chunk index containing the given byte offset. Uses binary search for O(log n) lookup.
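Since each dynamic entry records an end offset, the lookup amounts to finding the first entry whose end offset is greater than the requested byte offset. The following standalone sketch shows that idea with sort.Search; the function operates on a plain slice of end offsets, which is a simplification of the reader's internal state.

```go
package main

import (
	"fmt"
	"sort"
)

// chunkFromOffset returns the index of the chunk covering offset.
// Chunk i covers [endOffsets[i-1], endOffsets[i]), with chunk 0 starting at 0.
// endOffsets must be strictly increasing.
func chunkFromOffset(endOffsets []uint64, offset uint64) (int, bool) {
	i := sort.Search(len(endOffsets), func(i int) bool {
		return endOffsets[i] > offset
	})
	if i == len(endOffsets) {
		return 0, false // offset lies at or past the end of the data
	}
	return i, true
}

func main() {
	ends := []uint64{10, 25, 40} // three chunks of sizes 10, 15, 15
	for _, off := range []uint64{0, 9, 10, 39, 40} {
		idx, ok := chunkFromOffset(ends, off)
		fmt.Println(off, idx, ok)
	}
}
```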
func (*DynamicIndexReader) ChunkInfo ¶
func (r *DynamicIndexReader) ChunkInfo(pos int) (ChunkInfo, bool)
ChunkInfo returns the chunk info at position pos.
func (*DynamicIndexReader) ComputeCsum ¶
func (r *DynamicIndexReader) ComputeCsum() ([32]byte, uint64)
ComputeCsum computes the SHA-256 checksum over all entry data.
func (*DynamicIndexReader) Count ¶
func (r *DynamicIndexReader) Count() int
Count returns the number of entries.
func (*DynamicIndexReader) Entry ¶
func (r *DynamicIndexReader) Entry(i int) DynamicEntry
Entry returns the entry at position i.
func (*DynamicIndexReader) IndexBytes ¶
func (r *DynamicIndexReader) IndexBytes() uint64
IndexBytes returns the total virtual size (end offset of last entry).
func (*DynamicIndexReader) IndexDigest ¶
func (r *DynamicIndexReader) IndexDigest(pos int) ([32]byte, bool)
IndexDigest returns the digest at position pos.
type DynamicIndexWriter ¶
type DynamicIndexWriter struct {
// contains filtered or unexported fields
}
DynamicIndexWriter builds a dynamic chunk index.
func NewDynamicIndexWriter ¶
func NewDynamicIndexWriter(ctime int64) *DynamicIndexWriter
NewDynamicIndexWriter creates a new writer with the given creation time.
func (*DynamicIndexWriter) Add ¶
func (w *DynamicIndexWriter) Add(endOffset uint64, digest [32]byte)
Add appends an entry with the given end offset and digest.
func (*DynamicIndexWriter) Csum ¶ added in v0.3.0
func (w *DynamicIndexWriter) Csum() [32]byte
Csum returns the SHA-256 checksum over all entry data (end_offset || digest pairs). This matches PBS's compute_csum() and is the checksum stored in the manifest. The result is cached and invalidated by Add().
func (*DynamicIndexWriter) Finish ¶
func (w *DynamicIndexWriter) Finish() ([]byte, error)
Finish writes the complete index and returns the raw bytes.
type EncryptedBlobHeader ¶
EncryptedBlobHeader is the 48-byte header for encrypted blobs.
func UnmarshalEncryptedBlobHeader ¶
func UnmarshalEncryptedBlobHeader(data []byte) (EncryptedBlobHeader, error)
UnmarshalEncryptedBlobHeader parses an EncryptedBlobHeader from raw bytes.
func (*EncryptedBlobHeader) MarshalTo ¶
func (h *EncryptedBlobHeader) MarshalTo(buf []byte)
MarshalTo writes the header to buf (must be at least EncryptedBlobHeaderSize bytes).
type FileInfo ¶
type FileInfo struct {
	Filename  string `json:"filename"`
	CryptMode string `json:"crypt-mode,omitempty"`
	Size      uint64 `json:"size"`
	CSum      string `json:"csum"`
}
FileInfo describes a file in a backup manifest.
type FixedIndexHeader ¶
type FixedIndexHeader struct {
	Magic     [8]byte
	UUID      [16]byte
	Ctime     int64
	IndexCsum [32]byte
	Size      uint64
	ChunkSize uint64
}
FixedIndexHeader is the 4096-byte header for fixed chunk index files.
func UnmarshalFixedIndexHeader ¶
func UnmarshalFixedIndexHeader(data []byte) (FixedIndexHeader, error)
UnmarshalFixedIndexHeader parses a FixedIndexHeader from raw bytes.
func (*FixedIndexHeader) MarshalTo ¶
func (h *FixedIndexHeader) MarshalTo(buf []byte)
MarshalTo writes the header to buf (must be at least IndexHeaderSize bytes).
type FixedIndexReader ¶
type FixedIndexReader struct {
// contains filtered or unexported fields
}
FixedIndexReader reads a fixed-size chunk index.
func ReadFixedIndex ¶
func ReadFixedIndex(data []byte) (*FixedIndexReader, error)
ReadFixedIndex parses a fixed index from raw bytes.
func (*FixedIndexReader) CTime ¶
func (r *FixedIndexReader) CTime() int64
CTime returns the creation timestamp.
func (*FixedIndexReader) ChunkFromOffset ¶
func (r *FixedIndexReader) ChunkFromOffset(offset uint64) (int, bool)
ChunkFromOffset returns the chunk index for the given byte offset.
func (*FixedIndexReader) ChunkInfo ¶
func (r *FixedIndexReader) ChunkInfo(pos int) (ChunkInfo, bool)
ChunkInfo returns chunk info at position pos.
func (*FixedIndexReader) ComputeCsum ¶
func (r *FixedIndexReader) ComputeCsum() ([32]byte, uint64)
ComputeCsum computes the SHA-256 checksum over all digests.
func (*FixedIndexReader) Count ¶
func (r *FixedIndexReader) Count() int
Count returns the number of chunks.
func (*FixedIndexReader) IndexBytes ¶
func (r *FixedIndexReader) IndexBytes() uint64
IndexBytes returns the total virtual size.
func (*FixedIndexReader) IndexDigest ¶
func (r *FixedIndexReader) IndexDigest(pos int) ([32]byte, bool)
IndexDigest returns the digest at position pos.
type FixedIndexWriter ¶
type FixedIndexWriter struct {
// contains filtered or unexported fields
}
FixedIndexWriter builds a fixed-size chunk index.
func NewFixedIndexWriter ¶
func NewFixedIndexWriter(ctime int64, size, chunkSize uint64) (*FixedIndexWriter, error)
NewFixedIndexWriter creates a writer. ChunkSize must be a power of 2.
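The arithmetic behind a fixed index is simple enough to show in isolation: a power-of-two check for the chunk size, the number of chunks needed to cover a given size, and the offset-to-chunk mapping. The sketch below assumes the chunk count rounds up so a trailing partial chunk still gets an entry; that rounding and the helper names are assumptions for illustration.

```go
package main

import "fmt"

// isPowerOfTwo reports whether n has exactly one bit set.
func isPowerOfTwo(n uint64) bool {
	return n != 0 && n&(n-1) == 0
}

// chunkCount returns how many chunks of chunkSize cover size bytes.
// Assumption: a trailing partial chunk counts as a full entry.
func chunkCount(size, chunkSize uint64) uint64 {
	return (size + chunkSize - 1) / chunkSize
}

// chunkFromOffset maps a byte offset to its chunk index.
func chunkFromOffset(offset, size, chunkSize uint64) (int, bool) {
	if offset >= size {
		return 0, false
	}
	return int(offset / chunkSize), true
}

func main() {
	const size, chunkSize = 10 << 20, 4 << 20 // 10 MiB image, 4 MiB chunks
	fmt.Println(isPowerOfTwo(chunkSize), chunkCount(size, chunkSize))
	fmt.Println(chunkFromOffset(9<<20, size, chunkSize))
}
```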
func (*FixedIndexWriter) Finish ¶
func (w *FixedIndexWriter) Finish() ([]byte, error)
Finish writes the complete index and returns raw bytes.
func (*FixedIndexWriter) Set ¶
func (w *FixedIndexWriter) Set(i int, digest [32]byte)
Set sets the digest for chunk at index i.
type Manifest ¶
type Manifest struct {
	BackupType string     `json:"backup-type"`
	BackupID   string     `json:"backup-id"`
	BackupTime int64      `json:"backup-time"`
	Files      []FileInfo `json:"files"`
	Signature  string     `json:"signature,omitempty"`
}
Manifest represents a backup manifest (index.json).
func UnmarshalManifest ¶
UnmarshalManifest parses a manifest from JSON.
type Restorer ¶
type Restorer struct {
// contains filtered or unexported fields
}
Restorer reconstructs files from dynamic indexes using a chunk source.
func NewRestorer ¶
func NewRestorer(source ChunkSource) *Restorer
NewRestorer creates a new restorer with the given chunk source.
func (*Restorer) FileSize ¶
func (r *Restorer) FileSize(idx *DynamicIndexReader) uint64
FileSize returns the total size of the file represented by the index.
func (*Restorer) RestoreFile ¶
func (r *Restorer) RestoreFile(idx *DynamicIndexReader, w io.Writer) error
RestoreFile reconstructs a complete file from a dynamic index. Writes the reconstructed file content to w.
func (*Restorer) RestoreRange ¶
RestoreRange reconstructs a specific byte range from a dynamic index. Useful for partial reads without downloading the entire file.
type StoreChunker ¶
type StoreChunker struct {
// contains filtered or unexported fields
}
StoreChunker splits a data stream into variable-size chunks using buzhash content-defined chunking, computes digests, stores chunks via ChunkStore, and builds a DynamicIndexWriter.
func NewStoreChunker ¶
func NewStoreChunker(store *ChunkStore, config buzhash.Config, compress bool) *StoreChunker
NewStoreChunker creates a chunker pipeline. If compress is true, chunks are stored as compressed DataBlobs; otherwise as uncompressed blobs.
func (*StoreChunker) ChunkStream ¶
func (sc *StoreChunker) ChunkStream(r io.Reader) ([]ChunkResult, *DynamicIndexWriter, error)
ChunkStream reads all data from r, splits it into chunks, stores each chunk, and builds a dynamic index. Returns the chunk results and the completed index writer (Finish has NOT been called on it yet).
func (*StoreChunker) ChunkStreamCallback ¶
func (sc *StoreChunker) ChunkStreamCallback(r io.Reader, fn func(ChunkResult) error) ([]ChunkResult, *DynamicIndexWriter, error)
ChunkStreamCallback is like ChunkStream but calls fn for each chunk after it is stored. If fn returns a non-nil error, chunking stops and the error is returned. If fn is nil, no callback is made.