content

package
v0.9.0-rc1
Warning

This package is not in the latest version of its module.

Published: Sep 11, 2021 License: Apache-2.0 Imports: 42 Imported by: 4

Documentation

Overview

Package content implements repository support for content-addressable storage.

Constants

const (
	PackBlobIDPrefixRegular blob.ID = "p"
	PackBlobIDPrefixSpecial blob.ID = "q"

	NoCompression compression.HeaderID = 0

	FormatLogModule = "kopia/format"

	DefaultIndexVersion = 2
)

Prefixes for pack blobs.

const BlobIDPrefixSession blob.ID = "s"

BlobIDPrefixSession is the prefix for blob IDs indicating active sessions. Each blob ID will consist of {sessionID}.{suffix}.

const IndexBlobPrefix = "n"

IndexBlobPrefix is the prefix for all index blobs.

const TextLogBlobPrefix = "_log_"

TextLogBlobPrefix is a prefix given to text logs stored in the repository.

Variables

var AllIDs = IDRange{"", maxIDCharacterPlus1}

AllIDs is an IDRange that contains all valid IDs.

var AllNonPrefixedIDs = IDRange{"0", "g"}

AllNonPrefixedIDs is an IDRange that contains all valid non-prefixed IDs ('0' .. 'f').

var AllPrefixedIDs = IDRange{"g", maxIDCharacterPlus1}

AllPrefixedIDs is an IDRange that contains all valid prefixed IDs ('g' .. 'z').

var ErrContentNotFound = errors.New("content not found")

ErrContentNotFound is returned when content is not found.

PackBlobIDPrefixes contains all possible prefixes for pack blobs.

Functions

func ValidatePrefix added in v0.6.0

func ValidatePrefix(prefix ID) error

ValidatePrefix returns an error if a given prefix is invalid.

Types

type CachingOptions

type CachingOptions struct {
	CacheDirectory            string `json:"cacheDirectory,omitempty"`
	MaxCacheSizeBytes         int64  `json:"maxCacheSize,omitempty"`
	MaxMetadataCacheSizeBytes int64  `json:"maxMetadataCacheSize,omitempty"`
	MaxListCacheDurationSec   int    `json:"maxListCacheDuration,omitempty"`
	HMACSecret                []byte `json:"-"`
}

CachingOptions specifies configuration of local cache.

func (*CachingOptions) CloneOrDefault added in v0.6.0

func (c *CachingOptions) CloneOrDefault() *CachingOptions

CloneOrDefault returns a clone of the caching options or empty options for nil.
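
The nil-safe clone behavior described above can be sketched as follows. This is a minimal illustration, not the package's source: the struct is redeclared locally so the example compiles on its own, and the shallow copy (HMACSecret shares its backing array with the original) is an assumption.

```go
package main

import "fmt"

// CachingOptions mirrors the documented struct; it is redeclared here
// only so that the sketch is self-contained.
type CachingOptions struct {
	CacheDirectory            string `json:"cacheDirectory,omitempty"`
	MaxCacheSizeBytes         int64  `json:"maxCacheSize,omitempty"`
	MaxMetadataCacheSizeBytes int64  `json:"maxMetadataCacheSize,omitempty"`
	MaxListCacheDurationSec   int    `json:"maxListCacheDuration,omitempty"`
	HMACSecret                []byte `json:"-"`
}

// CloneOrDefault returns a copy of c, or empty options when c is nil.
// Assumption: the copy is shallow, so HMACSecret is shared.
func (c *CachingOptions) CloneOrDefault() *CachingOptions {
	if c == nil {
		return &CachingOptions{}
	}

	c2 := *c

	return &c2
}

func main() {
	// Calling the method on a nil pointer is safe and yields empty options.
	var none *CachingOptions
	fmt.Println(none.CloneOrDefault().MaxCacheSizeBytes)

	// Changing a scalar field of the clone leaves the original untouched.
	orig := &CachingOptions{CacheDirectory: "/tmp/cache", MaxCacheSizeBytes: 1 << 20}
	cp := orig.CloneOrDefault()
	cp.MaxCacheSizeBytes = 42
	fmt.Println(orig.MaxCacheSizeBytes, cp.MaxCacheSizeBytes)
}
```

A caller that accepts optional caching settings can call CloneOrDefault once at the boundary and then use the result without further nil checks.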

type CompactOptions

type CompactOptions struct {
	MaxSmallBlobs                    int
	AllIndexes                       bool
	DropDeletedBefore                time.Time
	DropContents                     []ID
	DisableEventualConsistencySafety bool
}

CompactOptions provides options for compaction.

type Crypter added in v0.9.0

type Crypter struct {
	HashFunction hashing.HashFunc
	Encryptor    encryption.Encryptor
}

Crypter encapsulates hashing and encryption and provides utilities for whole-BLOB encryption. Whole-BLOB encryption relies on BLOB identifiers formatted as:

<prefix><hash>[-optionalSuffix]

Where:

'prefix' is an arbitrary string without dashes
'hash' is base16-encoded 128-bit hash of contents, used as initialization vector (IV)
       for the encryption. In case of longer hash functions, we use last 16 bytes of
       their outputs.
'optionalSuffix' can be any string
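
The identifier layout above suggests how an initialization vector could be recovered from a BLOB ID. The sketch below is an illustration under stated assumptions: deriveIV is a hypothetical helper (not part of this package's exported API), and it assumes the IV is the last 32 hex characters that precede the first dash.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// ivHexLen is the length of a base16-encoded 128-bit (16-byte) value.
const ivHexLen = 32

// deriveIV is a hypothetical helper. It drops the optional "-suffix",
// then decodes the trailing 32 hex characters of what remains.
func deriveIV(blobID string) ([]byte, error) {
	if p := strings.Index(blobID, "-"); p >= 0 {
		blobID = blobID[:p]
	}

	if len(blobID) < ivHexLen {
		return nil, errors.New("blob ID too short to contain a 128-bit hash")
	}

	return hex.DecodeString(blobID[len(blobID)-ivHexLen:])
}

func main() {
	// "n" is the prefix, followed by 32 hex characters and an optional suffix.
	iv, err := deriveIV("n00112233445566778899aabbccddeeff-s1234")
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Printf("%x (%d bytes)\n", iv, len(iv))
}
```

Taking the trailing characters rather than a fixed offset means the prefix length does not need to be known, which matches the "arbitrary string without dashes" wording.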

func CreateCrypter added in v0.9.0

func CreateCrypter(f *FormattingOptions) (*Crypter, error)

CreateCrypter returns a Crypter based on the specified formatting options.

func (*Crypter) DecryptBLOB added in v0.9.0

func (c *Crypter) DecryptBLOB(payload gather.Bytes, blobID blob.ID, output *gather.WriteBuffer) error

DecryptBLOB decrypts the provided data using provided blobID to derive initialization vector.

func (*Crypter) EncryptBLOB added in v0.9.0

func (c *Crypter) EncryptBLOB(payload gather.Bytes, prefix blob.ID, sessionID SessionID, output *gather.WriteBuffer) (blob.ID, error)

EncryptBLOB encrypts the given data using crypter-defined key and returns a name that should be used to save the blob in the repository.

type FormatV1 added in v0.9.0

type FormatV1 struct {
	Version    byte   // format version number must be 0x01
	KeySize    byte   // size of each key in bytes
	EntrySize  uint16 // size of each entry in bytes, big-endian
	EntryCount uint32 // number of sorted (key,value) entries that follow

	Entries []struct {
		Key   []byte // key bytes (KeySize)
		Entry indexEntryInfoV1
	}

	ExtraData []byte // extra data
}

FormatV1 describes a format of a single pack index. The actual structure is not used, it's purely for documentation purposes. The struct is byte-aligned.
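
Since the struct is byte-aligned, the fixed-size fields at the start of a V1 index occupy 8 bytes. A minimal sketch of decoding them follows; parseHeaderV1 and headerV1 are hypothetical names, and treating EntryCount as big-endian is an assumption (the comments above only state the byte order of EntrySize).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerV1 holds the fixed-size fields at the start of a FormatV1 index.
type headerV1 struct {
	Version    byte
	KeySize    byte
	EntrySize  uint16
	EntryCount uint32
}

// headerV1Size is 1 + 1 + 2 + 4 bytes, with no padding.
const headerV1Size = 8

// parseHeaderV1 is a hypothetical helper that decodes the first 8 bytes.
func parseHeaderV1(b []byte) (headerV1, error) {
	if len(b) < headerV1Size {
		return headerV1{}, errors.New("index too short")
	}

	h := headerV1{
		Version:    b[0],
		KeySize:    b[1],
		EntrySize:  binary.BigEndian.Uint16(b[2:4]),
		EntryCount: binary.BigEndian.Uint32(b[4:8]), // assumed big-endian
	}

	if h.Version != 1 {
		return headerV1{}, fmt.Errorf("unexpected version %d", h.Version)
	}

	return h, nil
}

func main() {
	// Version 1, 16-byte keys, 20-byte entries, 3 entries.
	raw := []byte{0x01, 0x10, 0x00, 0x14, 0x00, 0x00, 0x00, 0x03}

	h, err := parseHeaderV1(raw)
	fmt.Println(h, err)
}
```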

type FormatV2 added in v0.9.0

type FormatV2 struct {
	Header struct {
		Version           byte   // format version number must be 0x02
		KeySize           byte   // size of each key in bytes
		EntrySize         uint16 // size of each entry in bytes, big-endian
		EntryCount        uint32 // number of sorted (key,value) entries that follow
		EntriesOffset     uint32 // offset where `Entries` begins
		FormatInfosOffset uint32 // offset where `Formats` begins
		NumFormatInfos    uint32
		PacksOffset       uint32 // offset where `Packs` begins
		NumPacks          uint32
		BaseTimestamp     uint32 // base timestamp in unix seconds
	}

	Entries []struct {
		Key   []byte // key bytes (KeySize)
		Entry indexV2EntryInfo
	}

	// each entry contains offset+length of the name of the pack blob, so that each entry can refer to the index
	// and it resolves to a name.
	Packs []struct {
		PackNameLength byte   // length of the filename
		PackNameOffset uint32 // offset to data (within extra data)
	}

	// each entry represents unique content format.
	Formats []indexV2FormatInfo

	ExtraData []byte // extra data
}

FormatV2 describes a format of a single pack index. The actual structure is not used, it's purely for documentation purposes. The struct is byte-aligned.

type FormatVersion added in v0.9.0

type FormatVersion int

FormatVersion denotes content format version.

const (
	FormatVersion1 FormatVersion = 1
	FormatVersion2 FormatVersion = 2 // new in v0.9
)

Supported format versions.

type FormattingOptions

type FormattingOptions struct {
	Hash       string `json:"hash,omitempty"`       // identifier of the hash algorithm used
	Encryption string `json:"encryption,omitempty"` // identifier of the encryption algorithm used
	HMACSecret []byte `json:"secret,omitempty"`     // HMAC secret used to generate encryption keys
	MasterKey  []byte `json:"masterKey,omitempty"`  // master encryption key (SIV-mode encryption only)
	MutableParameters

	EnablePasswordChange bool `json:"enablePasswordChange"` // disables replication of kopia.repository blob in packs
}

FormattingOptions describes the rules for formatting contents in repository.

func (*FormattingOptions) GetEncryptionAlgorithm added in v0.5.2

func (f *FormattingOptions) GetEncryptionAlgorithm() string

GetEncryptionAlgorithm implements encryption.Parameters.

func (*FormattingOptions) GetHashFunction added in v0.5.2

func (f *FormattingOptions) GetHashFunction() string

GetHashFunction implements hashing.Parameters.

func (*FormattingOptions) GetHmacSecret added in v0.8.0

func (f *FormattingOptions) GetHmacSecret() []byte

GetHmacSecret implements hashing.Parameters.

func (*FormattingOptions) GetMasterKey added in v0.5.2

func (f *FormattingOptions) GetMasterKey() []byte

GetMasterKey implements encryption.Parameters.

func (*FormattingOptions) ResolveFormatVersion added in v0.9.0

func (f *FormattingOptions) ResolveFormatVersion() error

ResolveFormatVersion applies format options parameters based on the format version.

type ID

type ID string

ID is an identifier of content in content-addressable storage.

func (ID) HasPrefix

func (i ID) HasPrefix() bool

HasPrefix determines if the given ID has a non-empty prefix.

func (ID) Prefix

func (i ID) Prefix() ID

Prefix returns a one-character prefix of a content ID or an empty string.
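
One way to picture the prefix rule is to classify an ID by its first character, using the 'g' .. 'z' range documented for AllPrefixedIDs. The sketch below is an assumption-based illustration; prefixOf is a hypothetical helper and the package's own Prefix method may use a different test.

```go
package main

import "fmt"

// ID is redeclared locally so the sketch is self-contained.
type ID string

// prefixOf is a hypothetical helper. It treats a leading character in
// 'g'..'z' as the one-character prefix and returns "" otherwise.
func prefixOf(id ID) ID {
	if len(id) > 0 && id[0] >= 'g' && id[0] <= 'z' {
		return id[:1]
	}

	return ""
}

func main() {
	fmt.Printf("%q\n", prefixOf("k0123abc")) // ID with a prefix character
	fmt.Printf("%q\n", prefixOf("0123abc"))  // plain hexadecimal ID
}
```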

type IDRange added in v0.6.0

type IDRange struct {
	StartID ID // inclusive
	EndID   ID // exclusive
}

IDRange represents a range of IDs.

func PrefixRange added in v0.6.0

func PrefixRange(prefix ID) IDRange

PrefixRange returns ID range that contains all IDs with a given prefix.

func (IDRange) Contains added in v0.6.0

func (r IDRange) Contains(id ID) bool

Contains determines whether given ID is in the range.
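
The half-open range semantics (StartID inclusive, EndID exclusive) can be sketched with plain string comparison. This is an illustration, not the package's source: the types are redeclared locally, lexicographic ordering is assumed, and the value of the unexported maxIDCharacterPlus1 is assumed to be the character immediately after 'z'.

```go
package main

import "fmt"

// ID and IDRange are redeclared locally so the sketch is self-contained.
type ID string

type IDRange struct {
	StartID ID // inclusive
	EndID   ID // exclusive
}

// maxIDCharacterPlus1 is assumed to be the character right after 'z'.
const maxIDCharacterPlus1 = "{"

// Contains reports whether id falls in [StartID, EndID), assuming
// byte-wise lexicographic ordering of IDs.
func (r IDRange) Contains(id ID) bool {
	return id >= r.StartID && id < r.EndID
}

// PrefixRange returns the range covering every ID that starts with prefix.
func PrefixRange(prefix ID) IDRange {
	return IDRange{prefix, prefix + maxIDCharacterPlus1}
}

func main() {
	nonPrefixed := IDRange{"0", "g"}
	fmt.Println(nonPrefixed.Contains("0abc"), nonPrefixed.Contains("kabc"))

	k := PrefixRange("k")
	fmt.Println(k.Contains("kabc"), k.Contains("l000"))
}
```

Because EndID is exclusive, adjacent ranges such as AllNonPrefixedIDs and AllPrefixedIDs can share the boundary value "g" without overlapping.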

type IndexBlobInfo

type IndexBlobInfo struct {
	blob.Metadata
	Superseded []blob.Metadata
}

IndexBlobInfo is information about a single index blob managed by Manager.

type Info

type Info interface {
	GetContentID() ID
	GetPackBlobID() blob.ID
	GetTimestampSeconds() int64
	Timestamp() time.Time
	GetOriginalLength() uint32
	GetPackedLength() uint32
	GetPackOffset() uint32
	GetDeleted() bool
	GetFormatVersion() byte
	GetCompressionHeaderID() compression.HeaderID
	GetEncryptionKeyID() byte
}

Info is information about a single piece of content managed by Manager.

func ParseIndexBlob added in v0.9.0

func ParseIndexBlob(ctx context.Context, blobID blob.ID, encrypted gather.Bytes, crypter *Crypter) ([]Info, error)

ParseIndexBlob loads entries in a given index blob and returns them.

type InfoStruct added in v0.9.0

type InfoStruct struct {
	ContentID           ID                   `json:"contentID"`
	PackBlobID          blob.ID              `json:"packFile,omitempty"`
	TimestampSeconds    int64                `json:"time"`
	OriginalLength      uint32               `json:"originalLength"`
	PackedLength        uint32               `json:"length"`
	PackOffset          uint32               `json:"packOffset,omitempty"`
	Deleted             bool                 `json:"deleted"`
	FormatVersion       byte                 `json:"formatVersion"`
	CompressionHeaderID compression.HeaderID `json:"compression,omitempty"`
	EncryptionKeyID     byte                 `json:"encryptionKeyID,omitempty"`
}

InfoStruct is an implementation of Info based on a structure.

func ToInfoStruct added in v0.9.0

func ToInfoStruct(i Info) *InfoStruct

ToInfoStruct converts the provided Info to *InfoStruct.

func (*InfoStruct) GetCompressionHeaderID added in v0.9.0

func (i *InfoStruct) GetCompressionHeaderID() compression.HeaderID

GetCompressionHeaderID implements the Info interface.

func (*InfoStruct) GetContentID added in v0.9.0

func (i *InfoStruct) GetContentID() ID

GetContentID implements the Info interface.

func (*InfoStruct) GetDeleted added in v0.9.0

func (i *InfoStruct) GetDeleted() bool

GetDeleted implements the Info interface.

func (*InfoStruct) GetEncryptionKeyID added in v0.9.0

func (i *InfoStruct) GetEncryptionKeyID() byte

GetEncryptionKeyID implements the Info interface.

func (*InfoStruct) GetFormatVersion added in v0.9.0

func (i *InfoStruct) GetFormatVersion() byte

GetFormatVersion implements the Info interface.

func (*InfoStruct) GetOriginalLength added in v0.9.0

func (i *InfoStruct) GetOriginalLength() uint32

GetOriginalLength implements the Info interface.

func (*InfoStruct) GetPackBlobID added in v0.9.0

func (i *InfoStruct) GetPackBlobID() blob.ID

GetPackBlobID implements the Info interface.

func (*InfoStruct) GetPackOffset added in v0.9.0

func (i *InfoStruct) GetPackOffset() uint32

GetPackOffset implements the Info interface.

func (*InfoStruct) GetPackedLength added in v0.9.0

func (i *InfoStruct) GetPackedLength() uint32

GetPackedLength implements the Info interface.

func (*InfoStruct) GetTimestampSeconds added in v0.9.0

func (i *InfoStruct) GetTimestampSeconds() int64

GetTimestampSeconds implements the Info interface.

func (*InfoStruct) Timestamp added in v0.9.0

func (i *InfoStruct) Timestamp() time.Time

Timestamp implements the Info interface.

type IterateCallback

type IterateCallback func(Info) error

IterateCallback is the function type used as a callback during content iteration.

type IterateOptions

type IterateOptions struct {
	Range          IDRange
	IncludeDeleted bool
	Parallel       int
}

IterateOptions contains the options used for iterating over content.

type IteratePackOptions

type IteratePackOptions struct {
	IncludePacksWithOnlyDeletedContent bool
	IncludeContentInfos                bool
	Prefixes                           []blob.ID
}

IteratePackOptions are the options used to iterate over packs.

type IteratePacksCallback

type IteratePacksCallback func(PackInfo) error

IteratePacksCallback is the function type used as callback during pack iteration.

type ManagerOptions added in v0.6.0

type ManagerOptions struct {
	RepositoryFormatBytes []byte
	TimeNow               func() time.Time // Time provider
	DisableInternalLog    bool
}

ManagerOptions are the optional parameters for manager creation.

func (*ManagerOptions) CloneOrDefault added in v0.8.0

func (o *ManagerOptions) CloneOrDefault() *ManagerOptions

CloneOrDefault returns a clone of provided ManagerOptions or default empty struct if nil.

type MutableParameters added in v0.9.0

type MutableParameters struct {
	Version         FormatVersion    `json:"version,omitempty"`         // version number, must be "1" or "2"
	MaxPackSize     int              `json:"maxPackSize,omitempty"`     // maximum size of a pack object
	IndexVersion    int              `json:"indexVersion,omitempty"`    // force particular index format version (1,2,..)
	EpochParameters epoch.Parameters `json:"epochParameters,omitempty"` // epoch manager parameters
}

MutableParameters represents parameters of the content manager that can be mutated after the repository is created.

func (*MutableParameters) Validate added in v0.9.0

func (v *MutableParameters) Validate() error

Validate validates the parameters.

type PackInfo

type PackInfo struct {
	PackID       blob.ID
	ContentCount int
	TotalSize    int64
	ContentInfos []Info
}

PackInfo contains the data for a pack.

type Reader added in v0.8.0

type Reader interface {
	SupportsContentCompression() bool
	ContentFormat() FormattingOptions
	GetContent(ctx context.Context, id ID) ([]byte, error)
	ContentInfo(ctx context.Context, id ID) (Info, error)
	IterateContents(ctx context.Context, opts IterateOptions, callback IterateCallback) error
	IteratePacks(ctx context.Context, opts IteratePackOptions, callback IteratePacksCallback) error
	ListActiveSessions(ctx context.Context) (map[SessionID]*SessionInfo, error)
	EpochManager() (*epoch.Manager, bool)
}

Reader defines content read API.

type SessionID added in v0.8.0

type SessionID string

SessionID represents identifier of a session.

func SessionIDFromBlobID added in v0.8.0

func SessionIDFromBlobID(b blob.ID) SessionID

SessionIDFromBlobID returns session ID from a given blob ID or empty string if it's not a session blob ID.

type SessionInfo added in v0.8.0

type SessionInfo struct {
	ID             SessionID `json:"id"`
	StartTime      time.Time `json:"startTime"`
	CheckpointTime time.Time `json:"checkpointTime"`
	User           string    `json:"username"`
	Host           string    `json:"hostname"`
}

SessionInfo describes a particular session and is persisted in Session blob.
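
The struct tags determine the JSON field names of the persisted form; note that User and Host are stored as "username" and "hostname". The round trip below is a sketch using only encoding/json on a local copy of the struct; encodeSession and decodeSession are hypothetical helpers, and how the package actually encrypts and stores the session blob is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

type SessionID string

// SessionInfo mirrors the documented struct, redeclared for the sketch.
type SessionInfo struct {
	ID             SessionID `json:"id"`
	StartTime      time.Time `json:"startTime"`
	CheckpointTime time.Time `json:"checkpointTime"`
	User           string    `json:"username"`
	Host           string    `json:"hostname"`
}

// encodeSession is a hypothetical helper that serializes a session to JSON.
func encodeSession(si SessionInfo) (string, error) {
	b, err := json.Marshal(si)
	return string(b), err
}

// decodeSession is a hypothetical helper that parses the JSON form.
func decodeSession(s string) (SessionInfo, error) {
	var si SessionInfo
	err := json.Unmarshal([]byte(s), &si)

	return si, err
}

func main() {
	start := time.Date(2021, 9, 11, 0, 0, 0, 0, time.UTC)
	si := SessionInfo{
		ID:             "s1234",
		StartTime:      start,
		CheckpointTime: start.Add(5 * time.Minute),
		User:           "alice",
		Host:           "laptop",
	}

	s, err := encodeSession(si)
	fmt.Println(s, err)
}
```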

type SessionOptions added in v0.8.0

type SessionOptions struct {
	SessionUser string
	SessionHost string
	OnUpload    func(int64)
}

SessionOptions specifies session options.

type SharedManager added in v0.8.0

type SharedManager struct {
	Stats *Stats
	// contains filtered or unexported fields
}

SharedManager is responsible for read-only access to committed data.

func NewSharedManager added in v0.8.0

func NewSharedManager(ctx context.Context, st blob.Storage, f *FormattingOptions, caching *CachingOptions, opts *ManagerOptions) (*SharedManager, error)

NewSharedManager returns SharedManager that is used by SessionWriteManagers on top of a repository.

func (*SharedManager) CompactIndexes added in v0.9.0

func (sm *SharedManager) CompactIndexes(ctx context.Context, opt CompactOptions) error

CompactIndexes performs compaction of index blobs, ensuring that the number of small index blobs is below opt.MaxSmallBlobs.

func (*SharedManager) Crypter added in v0.9.0

func (sm *SharedManager) Crypter() *Crypter

Crypter returns the crypter.

func (*SharedManager) EpochManager added in v0.9.0

func (sm *SharedManager) EpochManager() (*epoch.Manager, bool)

EpochManager returns the epoch manager.

func (*SharedManager) IndexBlobs added in v0.8.0

func (sm *SharedManager) IndexBlobs(ctx context.Context, includeInactive bool) ([]IndexBlobInfo, error)

IndexBlobs returns the list of active index blobs.

func (*SharedManager) InternalLogger added in v0.9.0

func (sm *SharedManager) InternalLogger() logging.Logger

InternalLogger returns the internal logger.

func (*SharedManager) PrepareUpgradeToIndexBlobManagerV1 added in v0.9.0

func (sm *SharedManager) PrepareUpgradeToIndexBlobManagerV1(ctx context.Context, params epoch.Parameters) error

PrepareUpgradeToIndexBlobManagerV1 prepares the repository for migrating to IndexBlobManagerV1.

func (*SharedManager) Refresh added in v0.9.0

func (sm *SharedManager) Refresh(ctx context.Context) error

Refresh reloads the committed content indexes.

type Stats

type Stats struct {
	// contains filtered or unexported fields
}

Stats exposes statistics about content operation.

func (*Stats) DecryptedBytes

func (s *Stats) DecryptedBytes() int64

DecryptedBytes returns the approximate total number of decrypted bytes.

func (*Stats) EncryptedBytes

func (s *Stats) EncryptedBytes() int64

EncryptedBytes returns the approximate total number of encrypted bytes.

func (*Stats) HashedContent added in v0.6.0

func (s *Stats) HashedContent() (count uint32, bytes int64)

HashedContent returns the approximate hashed content count and their total size in bytes.

func (*Stats) InvalidContents

func (s *Stats) InvalidContents() uint32

InvalidContents returns the approximate count of invalid contents found.

func (*Stats) ReadContent added in v0.6.0

func (s *Stats) ReadContent() (count uint32, bytes int64)

ReadContent returns the approximate read content count and their total size in bytes.

func (*Stats) Reset

func (s *Stats) Reset()

Reset clears all content statistics.

func (*Stats) ValidContents

func (s *Stats) ValidContents() uint32

ValidContents returns the approximate count of valid contents found.

func (*Stats) WrittenContent added in v0.6.0

func (s *Stats) WrittenContent() (count uint32, bytes int64)

WrittenContent returns the approximate written content count and their total size in bytes.

type WriteManager added in v0.8.0

type WriteManager struct {
	*SharedManager
	// contains filtered or unexported fields
}

WriteManager builds content-addressable storage with encryption, deduplication and packaging on top of BLOB store.

func NewManagerForTesting added in v0.9.0

func NewManagerForTesting(ctx context.Context, st blob.Storage, f *FormattingOptions, caching *CachingOptions, options *ManagerOptions) (*WriteManager, error)

NewManagerForTesting creates new content manager with given packing options and a formatter.

func NewWriteManager added in v0.8.0

func NewWriteManager(ctx context.Context, sm *SharedManager, options SessionOptions, writeManagerID string) *WriteManager

NewWriteManager returns a session write manager.

func (*WriteManager) Close added in v0.8.0

func (bm *WriteManager) Close(ctx context.Context) error

Close closes the content manager.

func (*WriteManager) ContentFormat added in v0.8.0

func (bm *WriteManager) ContentFormat() FormattingOptions

ContentFormat returns formatting options.

func (*WriteManager) ContentInfo added in v0.8.0

func (bm *WriteManager) ContentInfo(ctx context.Context, contentID ID) (Info, error)

ContentInfo returns information about a single content.

func (*WriteManager) DeleteContent added in v0.8.0

func (bm *WriteManager) DeleteContent(ctx context.Context, contentID ID) error

DeleteContent marks the given contentID as deleted.

NOTE: To avoid race conditions, only contents that cannot possibly be re-created should ever be deleted. This means the data of such contents should include some element of randomness or a contemporaneous timestamp that will never reappear.

func (*WriteManager) DisableIndexFlush added in v0.8.0

func (bm *WriteManager) DisableIndexFlush(ctx context.Context)

DisableIndexFlush increments the counter preventing automatic index flushes.

func (*WriteManager) DisableIndexRefresh added in v0.9.0

func (bm *WriteManager) DisableIndexRefresh()

DisableIndexRefresh disables index refresh for the remainder of this session.

func (*WriteManager) EnableIndexFlush added in v0.8.0

func (bm *WriteManager) EnableIndexFlush(ctx context.Context)

EnableIndexFlush decrements the counter preventing automatic index flushes. The flushes will be re-enabled when the counter drops to zero.
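
The counter semantics of DisableIndexFlush and EnableIndexFlush allow nested callers to suppress flushes independently. A minimal sketch of that pattern follows; flushGate is a hypothetical type, and the use of an atomic counter is an assumption rather than a statement about the package's internals.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// flushGate is a hypothetical type that tracks how many callers have
// asked for automatic flushes to be suppressed.
type flushGate struct {
	disabled int32
}

// Disable increments the counter; flushes stay off while it is non-zero.
func (g *flushGate) Disable() { atomic.AddInt32(&g.disabled, 1) }

// Enable decrements the counter; it must pair with an earlier Disable.
func (g *flushGate) Enable() { atomic.AddInt32(&g.disabled, -1) }

// Allowed reports whether automatic flushes may run.
func (g *flushGate) Allowed() bool { return atomic.LoadInt32(&g.disabled) == 0 }

func main() {
	var g flushGate

	g.Disable() // outer operation
	g.Disable() // nested operation
	g.Enable()  // nested operation done; outer still holds the gate
	fmt.Println(g.Allowed())

	g.Enable() // outer operation done
	fmt.Println(g.Allowed())
}
```

A counter is used instead of a boolean so that one caller re-enabling flushes cannot cancel the request of another caller that is still in progress.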

func (*WriteManager) Flush added in v0.8.0

func (bm *WriteManager) Flush(ctx context.Context) error

Flush completes writing any pending packs and writes pack indexes to the underlying storage. Any pending writes completed before Flush() has started are guaranteed to be committed to the repository before Flush() returns.

func (*WriteManager) GetContent added in v0.8.0

func (bm *WriteManager) GetContent(ctx context.Context, contentID ID) (v []byte, err error)

GetContent gets the contents of a given content. If the content is not found returns ErrContentNotFound.

func (*WriteManager) IterateContents added in v0.8.0

func (bm *WriteManager) IterateContents(ctx context.Context, opts IterateOptions, callback IterateCallback) error

IterateContents invokes the provided callback for each content starting with a specified prefix and possibly including deleted items.

func (*WriteManager) IteratePacks added in v0.8.0

func (bm *WriteManager) IteratePacks(ctx context.Context, options IteratePackOptions, callback IteratePacksCallback) error

IteratePacks invokes the provided callback for all pack blobs.

func (*WriteManager) IterateUnreferencedBlobs added in v0.8.0

func (bm *WriteManager) IterateUnreferencedBlobs(ctx context.Context, blobPrefixes []blob.ID, parallellism int, callback func(blob.Metadata) error) error

IterateUnreferencedBlobs invokes the provided callback for each unreferenced storage blob.

func (*WriteManager) ListActiveSessions added in v0.8.0

func (bm *WriteManager) ListActiveSessions(ctx context.Context) (map[SessionID]*SessionInfo, error)

ListActiveSessions returns a set of all active sessions in a given storage.

func (*WriteManager) RecoverIndexFromPackBlob added in v0.8.0

func (bm *WriteManager) RecoverIndexFromPackBlob(ctx context.Context, packFile blob.ID, packFileLength int64, commit bool) ([]Info, error)

RecoverIndexFromPackBlob attempts to recover index blob entries from a given pack file. Pack file length may be provided (if known) to reduce the number of bytes that are read from the storage.

func (*WriteManager) Revision added in v0.8.0

func (bm *WriteManager) Revision() int64

Revision returns data revision number that changes on each write or refresh.

func (*WriteManager) RewriteContent added in v0.8.0

func (bm *WriteManager) RewriteContent(ctx context.Context, contentID ID) error

RewriteContent reads and re-writes a given content using the most recent format. TODO(jkowalski): this will currently always re-encrypt and re-compress data, perhaps consider a pass-through mode that preserves encrypted/compressed bits.

func (*WriteManager) SupportsContentCompression added in v0.9.0

func (bm *WriteManager) SupportsContentCompression() bool

SupportsContentCompression returns true if content manager supports content-compression.

func (*WriteManager) SyncMetadataCache added in v0.8.0

func (bm *WriteManager) SyncMetadataCache(ctx context.Context) error

SyncMetadataCache synchronizes metadata cache with metadata blobs in storage.

func (*WriteManager) UndeleteContent added in v0.8.0

func (bm *WriteManager) UndeleteContent(ctx context.Context, contentID ID) error

UndeleteContent rewrites the content with the given ID if the content exists and is marked deleted. If the content exists and is not marked deleted, this operation is a no-op.

func (*WriteManager) WriteContent added in v0.8.0

func (bm *WriteManager) WriteContent(ctx context.Context, data []byte, prefix ID, comp compression.HeaderID) (ID, error)

WriteContent saves a given content of data to a pack group with a provided name and returns a contentID that's based on the contents of data written.
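
Because the returned ID is derived from the data itself, writing the same bytes twice yields the same ID and the second write can be deduplicated. The sketch below illustrates that idea only: contentID and store are hypothetical, the choice of HMAC-SHA256 truncated to 16 bytes is an assumption (the real hash is selected by FormattingOptions.Hash), and packing, compression and encryption are omitted.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

type ID string

// contentID is a hypothetical helper: HMAC-SHA256 over the data, truncated
// to 16 bytes, hex-encoded, with an optional one-character prefix.
func contentID(secret, data []byte, prefix ID) ID {
	h := hmac.New(sha256.New, secret)
	h.Write(data)
	sum := h.Sum(nil)

	return prefix + ID(hex.EncodeToString(sum[:16]))
}

// store is a hypothetical in-memory content-addressable store.
type store map[ID][]byte

// write stores data under its content-derived ID, skipping duplicates.
func (s store) write(secret, data []byte, prefix ID) ID {
	id := contentID(secret, data, prefix)
	if _, ok := s[id]; !ok {
		s[id] = append([]byte(nil), data...) // copy so callers can reuse data
	}

	return id
}

func main() {
	secret := []byte("example-secret")
	s := store{}

	a := s.write(secret, []byte("hello"), "")
	b := s.write(secret, []byte("hello"), "")
	k := s.write(secret, []byte("hello"), "k")

	fmt.Println(a == b, len(s)) // identical data maps to one entry per prefix
	fmt.Println(k)
}
```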
