filesystem

package
v0.0.0-...-20d847b Latest
Warning

This package is not in the latest version of its module.

Published: Nov 28, 2024 License: Apache-2.0 Imports: 23 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func CreateDirectoryMerkleTree

func CreateDirectoryMerkleTree[TDirectory, TFile model_core.ReferenceMetadata](
	ctx context.Context,
	concurrency *semaphore.Weighted,
	group *errgroup.Group,
	directoryParameters *DirectoryCreationParameters,
	directory CapturableDirectory[TDirectory, TFile],
	capturer DirectoryMerkleTreeCapturer[TDirectory, TFile],
	out *model_core.PatchedMessage[*model_filesystem_pb.Directory, TDirectory],
) error

CreateDirectoryMerkleTree creates a Merkle tree that corresponds to the contents of a given directory. Upon success, a Directory message corresponding to the root directory is stored in out.

Computation of the Merkle trees of the individual files can be done in parallel. These processes may terminate asynchronously, meaning that group.Wait() needs to be called to ensure that the capturer is invoked for all objects, and the output message is set.

func CreateFileMerkleTree

CreateFileMerkleTree creates a Merkle tree structure that corresponds to the contents of a single file. If a file is small, it stores all of its contents in a single object. If a file is large, it creates a B-tree, where chunks of data are joined together using a FileContentsList message.

Chunking of large files is performed using the MaxCDC algorithm. The resulting B-tree is a Prolly tree. This ensures that minor changes to a file also result in minor changes to the resulting Merkle tree.

func NewCapturedDirectoryWalker

func NewCapturedDirectoryWalker(directoryParameters *DirectoryAccessParameters, fileParameters *FileCreationParameters, rootDirectory CapturedDirectory, rootObject *CapturedObject) dag.ObjectContentsWalker

NewCapturedDirectoryWalker returns an implementation of ObjectContentsWalker that is capable of walking over a hierarchy of Directory and Leaves objects that were created using CreateDirectoryMerkleTree and captured using FileDiscardingDirectoryMerkleTreeCapturer. This makes it possible to upload such directory hierarchies to a storage server.

These Merkle trees do not contain any file contents, but it is permitted for the storage server to request them. If that happens, we must reobtain them from the underlying file system. This is why the caller must provide a handle to the root directory on which the provided Merkle tree is based.

func NewCapturedFileWalker

func NewCapturedFileWalker(fileParameters *FileCreationParameters, r filesystem.FileReader, fileReference object.LocalReference, fileSizeBytes uint64, fileObject *CapturedObject) dag.ObjectContentsWalker

func NewDirectoryClusterObjectParser

func NewDirectoryClusterObjectParser[TReference DirectoryClusterObjectParserReference[TReference]](leavesReader parser.ParsedObjectReader[TReference, model_core.Message[*model_filesystem_pb.Leaves]]) parser.ObjectParser[TReference, DirectoryCluster]

NewDirectoryClusterObjectParser creates an ObjectParser that is capable of parsing directory objects. These directory objects may be empty, contain subdirectories, or contain leaves.

func NewFileContentsListObjectParser

func NewFileContentsListObjectParser[TReference FileContentsListObjectParserReference[TReference]]() parser.ObjectParser[TReference, FileContentsList]

NewFileContentsListObjectParser creates an ObjectParser that is capable of parsing FileContentsList messages, turning them into a list of entries that can be processed by FileContentsIterator.

Types

type CapturableDirectory

type CapturableDirectory[TDirectory, TFile model_core.ReferenceMetadata] interface {
	// Identical to filesystem.Directory.
	Close() error
	ReadDir() ([]filesystem.FileInfo, error)
	Readlink(name path.Component) (path.Parser, error)

	// Enter a directory, so that it may be traversed. The
	// implementation has the possibility to return an existing
	// Directory message. This can be of use when computing a Merkle
	// tree that is based on an existing directory structure that is
	// altered slightly.
	EnterCapturableDirectory(name path.Component) (model_core.PatchedMessage[*model_filesystem_pb.Directory, TDirectory], CapturableDirectory[TDirectory, TFile], error)

	// Open a file, so that a Merkle tree of its contents can be
	// computed. The actual Merkle tree computation is performed by
	// calling CapturableFile.CreateFileMerkleTree(). That way file
	// Merkle tree computation can happen in parallel.
	OpenForFileMerkleTreeCreation(name path.Component) (CapturableFile[TFile], error)
}

CapturableDirectory is an interface for a directory that can be traversed by CreateDirectoryMerkleTree().

type CapturableFile

type CapturableFile[TFile model_core.ReferenceMetadata] interface {
	CreateFileMerkleTree(ctx context.Context) (model_core.PatchedMessage[*model_filesystem_pb.FileContents, TFile], error)
	Discard()
}

CapturableFile is called into by CreateDirectoryMerkleTree() to obtain a Merkle tree for a given file. Exactly one of its two methods will be called, exactly once.

type CapturedDirectory

type CapturedDirectory interface {
	Close() error
	EnterCapturedDirectory(name path.Component) (CapturedDirectory, error)
	OpenRead(name path.Component) (filesystem.FileReader, error)
}

CapturedDirectory is called into by CapturedDirectoryWalker to traverse a directory hierarchy and read file contents.

type CapturedObject

type CapturedObject struct {
	Contents *object.Contents
	Children []CapturedObject
}

func (CapturedObject) Discard

func (CapturedObject) Discard()

type Directory

type Directory struct {
	Directories      []DirectoryNode
	Leaves           model_core.Message[*model_filesystem_pb.Leaves]
	LeavesReferences object.OutgoingReferencesList
}

Directory contained in a DirectoryCluster.

type DirectoryAccessParameters

type DirectoryAccessParameters struct {
	// contains filtered or unexported fields
}

DirectoryAccessParameters contains parameters that were used when creating Merkle trees of directories and that must also be applied when accessing their contents afterwards. Parameters include whether files were compressed or encrypted.

func NewDirectoryAccessParametersFromProto

func NewDirectoryAccessParametersFromProto(m *model_filesystem_pb.DirectoryAccessParameters, referenceFormat object.ReferenceFormat) (*DirectoryAccessParameters, error)

NewDirectoryAccessParametersFromProto creates an instance of DirectoryAccessParameters that matches the configuration stored in a Protobuf message. This, for example, permits a server to access files that were uploaded by a client.

func (*DirectoryAccessParameters) DecodeDirectory

func (p *DirectoryAccessParameters) DecodeDirectory(contents *object.Contents) (*model_filesystem_pb.Directory, error)

func (*DirectoryAccessParameters) DecodeLeaves

func (*DirectoryAccessParameters) GetEncoder

type DirectoryCluster

type DirectoryCluster []Directory

DirectoryCluster is a list of all Directory messages that are contained in a single object in storage. Directories are stored in topological order, meaning that the root directory is located at index zero.
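The ordering property can be illustrated with a small sketch that flattens an in-memory tree into a slice. Everything here is hypothetical: tree and flatDirectory are simplified stand-ins, and pre-order traversal is just one assumed way to obtain a topological order in which every parent precedes its children and the root lands at index zero. The package may order directories differently.

```go
package main

import "fmt"

// tree is a hypothetical in-memory directory tree.
type tree struct {
	name     string
	children []*tree
}

// flatDirectory is a simplified stand-in for one entry of a
// DirectoryCluster. Children are referred to by index into the slice.
type flatDirectory struct {
	name         string
	childIndices []int
}

// flatten emits directories in pre-order, so that every parent is
// stored before its children and the root is stored at index zero.
func flatten(root *tree) []flatDirectory {
	var cluster []flatDirectory
	var visit func(t *tree) int
	visit = func(t *tree) int {
		index := len(cluster)
		cluster = append(cluster, flatDirectory{name: t.name})
		for _, c := range t.children {
			// Compute the child index first, as visit() may
			// cause the slice to be reallocated.
			childIndex := visit(c)
			cluster[index].childIndices = append(cluster[index].childIndices, childIndex)
		}
		return index
	}
	visit(root)
	return cluster
}

func main() {
	root := &tree{name: "root", children: []*tree{
		{name: "a", children: []*tree{{name: "b"}}},
		{name: "c"},
	}}
	for i, d := range flatten(root) {
		fmt.Println(i, d.name, d.childIndices)
	}
}
```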

type DirectoryClusterObjectParserReference

type DirectoryClusterObjectParserReference[T any] interface {
	GetLocalReference() object.LocalReference
	WithLocalReference(localReference object.LocalReference) T
}

DirectoryClusterObjectParserReference is a constraint on the reference types accepted by the ObjectParser returned by NewDirectoryClusterObjectParser.

type DirectoryCreationParameters

type DirectoryCreationParameters struct {
	*DirectoryAccessParameters
	// contains filtered or unexported fields
}

type DirectoryInfo

type DirectoryInfo struct {
	ClusterReference object.LocalReference
	DirectoryIndex   uint
	DirectoriesCount uint32
}

DirectoryInfo holds all of the properties of a directory that could be derived from its parent directory.

type DirectoryMerkleTreeCapturer

type DirectoryMerkleTreeCapturer[TDirectory, TFile any] interface {
	CaptureFileNode(TFile) TDirectory
	CaptureDirectory(contents *object.Contents, children []TDirectory) TDirectory
	CaptureLeaves(contents *object.Contents, children []TDirectory) TDirectory
}

var FileDiscardingDirectoryMerkleTreeCapturer DirectoryMerkleTreeCapturer[CapturedObject, model_core.NoopReferenceMetadata] = fileDiscardingDirectoryMerkleTreeCapturer{}

FileDiscardingDirectoryMerkleTreeCapturer is an instance of DirectoryMerkleTreeCapturer that keeps any Directory and Leaves objects, but discards FileContentsList and file chunk objects.

Discarding the contents of files is typically the right approach for uploading directory structures with changes to only a small number of files. The Merkle trees of files can be recomputed if it turns out they still need to be uploaded.

type DirectoryMerkleTreeFileResolver

type DirectoryMerkleTreeFileResolver struct {
	// contains filtered or unexported fields
}

DirectoryMerkleTreeFileResolver is an implementation of path.ComponentWalker that resolves the FileProperties corresponding to a given path. It can be used to look up files contained in a Merkle tree of Directory and Leaves messages.

func NewDirectoryMerkleTreeFileResolver

NewDirectoryMerkleTreeFileResolver creates a DirectoryMerkleTreeFileResolver that starts resolution within a provided root directory.

func (*DirectoryMerkleTreeFileResolver) GetFileProperties

func (*DirectoryMerkleTreeFileResolver) OnDirectory

func (*DirectoryMerkleTreeFileResolver) OnTerminal

func (DirectoryMerkleTreeFileResolver) OnUp

type DirectoryNode

type DirectoryNode struct {
	Name path.Component
	Info DirectoryInfo
}

DirectoryNode contains the name and properties of a directory that is contained within another directory.

type FileAccessParameters

type FileAccessParameters struct {
	// contains filtered or unexported fields
}

FileAccessParameters contains parameters that were used when creating Merkle trees of files and that must also be applied when accessing their contents afterwards. Parameters include whether files were compressed or encrypted.

func NewFileAccessParametersFromProto

func NewFileAccessParametersFromProto(m *model_filesystem_pb.FileAccessParameters, referenceFormat object.ReferenceFormat) (*FileAccessParameters, error)

NewFileAccessParametersFromProto creates an instance of FileAccessParameters that matches the configuration stored in a Protobuf message. This, for example, permits a server to access files that were uploaded by a client.

func (*FileAccessParameters) DecodeFileContentsList

func (p *FileAccessParameters) DecodeFileContentsList(contents *object.Contents) (*model_filesystem_pb.FileContentsList, error)

DecodeFileContentsList extracts the FileContentsList that is stored in an object backed by storage.

func (*FileAccessParameters) GetChunkEncoder

func (p *FileAccessParameters) GetChunkEncoder() encoding.BinaryEncoder

func (*FileAccessParameters) GetFileContentsListEncoder

func (p *FileAccessParameters) GetFileContentsListEncoder() encoding.BinaryEncoder

type FileContentsEntry

type FileContentsEntry struct {
	EndBytes  uint64
	Reference object.LocalReference
}

FileContentsEntry contains the properties of a part of a concatenated file.

func NewFileContentsEntryFromProto

func NewFileContentsEntryFromProto(fileContents model_core.Message[*model_filesystem_pb.FileContents], referenceFormat object.ReferenceFormat) (FileContentsEntry, error)

NewFileContentsEntryFromProto constructs a FileContentsEntry based on the contents of a single FileContents Protobuf message, referring to the file as a whole.

type FileContentsIterator

type FileContentsIterator struct {
	// contains filtered or unexported fields
}

FileContentsIterator is a helper type for iterating over the chunks of a concatenated file sequentially.

func NewFileContentsIterator

func NewFileContentsIterator(root FileContentsEntry, initialOffsetBytes uint64) FileContentsIterator

NewFileContentsIterator creates a FileContentsIterator that starts iteration at the provided offset within the file. It is the caller's responsibility to ensure the provided offset is less than the size of the file.

func (*FileContentsIterator) GetCurrentPart

func (i *FileContentsIterator) GetCurrentPart() (reference object.LocalReference, offsetBytes, sizeBytes uint64)

GetCurrentPart returns the reference of the part of the file that contains the data corresponding with the current offset. It also returns the offset within the part from which data should be read, and the expected total size of the part.

It is the caller's responsibility to track whether iteration has reached the end of the file. Once the end of the file has been reached, GetCurrentPart() may no longer be called.

func (*FileContentsIterator) PushFileContentsList

func (i *FileContentsIterator) PushFileContentsList(list FileContentsList) error

PushFileContentsList can be invoked after GetCurrentPart() to signal that the current part does not refer to a chunk of data, but another FileContentsList. After calling this method, another call to GetCurrentPart() can be made to retry resolution of the part within the provided FileContentsList.

func (*FileContentsIterator) ToNextPart

func (i *FileContentsIterator) ToNextPart()

ToNextPart can be invoked after GetCurrentPart() to signal that the current part refers to a chunk of data. The next call to GetCurrentPart() will return the reference of the part that is stored after the current one.

type FileContentsList

type FileContentsList []FileContentsEntry

FileContentsList contains the properties of parts of a concatenated file. Parts are stored in the order in which they should be concatenated, with EndBytes increasing.
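Because EndBytes increases monotonically, the part containing a given offset can be located with a binary search. The sketch below assumes that EndBytes is the exclusive end offset of each part, measured from the start of the list; that interpretation, the entry type, the string reference, and findPart are all illustrative and not part of the package.

```go
package main

import (
	"fmt"
	"sort"
)

// entry mirrors the shape of FileContentsEntry. The string reference is
// a hypothetical stand-in for object.LocalReference.
type entry struct {
	EndBytes  uint64
	Reference string
}

// findPart returns the index of the part that contains offset, the
// offset within that part, and the size of the part. It reports
// ok == false if offset lies at or beyond the end of the list.
func findPart(list []entry, offset uint64) (index int, offsetInPart, sizeBytes uint64, ok bool) {
	// Find the first part whose end lies beyond the requested offset.
	index = sort.Search(len(list), func(i int) bool {
		return list[i].EndBytes > offset
	})
	if index == len(list) {
		return 0, 0, 0, false
	}
	var start uint64
	if index > 0 {
		start = list[index-1].EndBytes
	}
	return index, offset - start, list[index].EndBytes - start, true
}

func main() {
	list := []entry{{10, "a"}, {25, "b"}, {40, "c"}}
	index, offsetInPart, sizeBytes, ok := findPart(list, 12)
	fmt.Println(list[index].Reference, offsetInPart, sizeBytes, ok)
}
```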

type FileContentsListObjectParserReference

type FileContentsListObjectParserReference[T any] interface {
	GetSizeBytes() int
}

FileContentsListObjectParserReference is a constraint on the reference types accepted by the ObjectParser returned by NewFileContentsListObjectParser.

type FileCreationParameters

type FileCreationParameters struct {
	*FileAccessParameters
	// contains filtered or unexported fields
}

func (*FileCreationParameters) EncodeChunk

func (p *FileCreationParameters) EncodeChunk(data []byte) (*object.Contents, error)

type FileMerkleTreeCapturer

type FileMerkleTreeCapturer[T any] interface {
	CaptureChunk(contents *object.Contents) T
	CaptureFileContentsList(contents *object.Contents, children []T) T
}

FileMerkleTreeCapturer is provided by callers of CreateFileMerkleTree to provide logic for how the resulting Merkle tree of the file should be captured.

A no-op implementation can be used by the caller to simply compute a reference of the file. An implementation that actually captures the provided contents can be used to prepare a Merkle tree for uploading.

The methods below return metadata. The metadata for the root object will be returned by CreateFileMerkleTree.
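The difference between a no-op capturer and one that records information can be sketched with a local interface of the same shape. This is an illustration under stated assumptions: the capturer interface below uses []byte where the package uses *object.Contents, and noopCapturer, sizeCapturer, and the capture driver are hypothetical names that do not exist in the package.

```go
package main

import "fmt"

// capturer mirrors the shape of FileMerkleTreeCapturer, with []byte
// standing in for *object.Contents.
type capturer[T any] interface {
	CaptureChunk(contents []byte) T
	CaptureFileContentsList(contents []byte, children []T) T
}

// noopCapturer discards everything and returns empty metadata.
type noopCapturer struct{}

func (noopCapturer) CaptureChunk([]byte) struct{} { return struct{}{} }

func (noopCapturer) CaptureFileContentsList([]byte, []struct{}) struct{} {
	return struct{}{}
}

// sizeCapturer returns, as metadata, the total number of bytes stored
// in the subtree rooted at each object.
type sizeCapturer struct{}

func (sizeCapturer) CaptureChunk(contents []byte) int { return len(contents) }

func (sizeCapturer) CaptureFileContentsList(contents []byte, children []int) int {
	total := len(contents)
	for _, n := range children {
		total += n
	}
	return total
}

// capture is a hypothetical driver that builds a two-level tree: one
// list object that has every chunk as a direct child. It returns the
// metadata of the root object.
func capture[T any](c capturer[T], chunks [][]byte, list []byte) T {
	children := make([]T, 0, len(chunks))
	for _, chunk := range chunks {
		children = append(children, c.CaptureChunk(chunk))
	}
	return c.CaptureFileContentsList(list, children)
}

func main() {
	chunks := [][]byte{[]byte("abc"), []byte("de")}
	fmt.Println(capture[int](sizeCapturer{}, chunks, []byte("L")))
	fmt.Println(capture[struct{}](noopCapturer{}, chunks, []byte("L")))
}
```

Metadata flows bottom-up: the values returned for children are handed to the parent's capture call, which is how the root's metadata can summarize the whole tree.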

var ChunkDiscardingFileMerkleTreeCapturer FileMerkleTreeCapturer[CapturedObject] = chunkDiscardingFileMerkleTreeCapturer{}

ChunkDiscardingFileMerkleTreeCapturer is an implementation of FileMerkleTreeCapturer that only preserves the FileContentsList messages of the Merkle tree. This can be of use when incrementally replicating the contents of a file. In those cases it's wasteful to store the full contents of a file in memory.

var NoopFileMerkleTreeCapturer FileMerkleTreeCapturer[model_core.NoopReferenceMetadata] = noopFileMerkleTreeCapturer{}

NoopFileMerkleTreeCapturer is a no-op implementation of FileMerkleTreeCapturer. It can be used when only a reference of a file needs to be computed, and there is no need to capture the resulting Merkle tree.

type FileReader

type FileReader struct {
	// contains filtered or unexported fields
}

func (*FileReader) FileOpenRead

func (fr *FileReader) FileOpenRead(ctx context.Context, fileContents FileContentsEntry, offsetBytes uint64) *SequentialFileReader

func (*FileReader) FileOpenReadAt

func (fr *FileReader) FileOpenReadAt(ctx context.Context, fileContents FileContentsEntry) io.ReaderAt

func (*FileReader) FileReadAll

func (fr *FileReader) FileReadAll(ctx context.Context, fileContents FileContentsEntry, maximumSizeBytes uint64) ([]byte, error)

func (*FileReader) FileReadAt

func (fr *FileReader) FileReadAt(ctx context.Context, fileContents FileContentsEntry, p []byte, offsetBytes uint64) (int, error)

type SequentialFileReader

type SequentialFileReader struct {
	// contains filtered or unexported fields
}

func (*SequentialFileReader) Read

func (r *SequentialFileReader) Read(p []byte) (int, error)

func (*SequentialFileReader) ReadByte

func (r *SequentialFileReader) ReadByte() (byte, error)
