fileset

package
v2.9.4
Published: Apr 8, 2024 License: Apache-2.0 Imports: 54 Imported by: 0

Documentation

Overview

Package fileset provides access to files through file sets.

A file set is the basic unit of storage for files. File sets come in two types: primitive and composite. A primitive file set stores a list of files added and a list of files deleted. A composite file set composes multiple composite and/or primitive file sets into layers, which can be merged to produce a new file set. Composite file sets allow file operations to be applied lazily, which can reduce write and storage costs.
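The layering model described above can be pictured with a toy in-memory sketch. The `primitive` type and `merge` function are hypothetical and do not mirror the package's storage format; the choice to apply a layer's deletes before its adds is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// primitive is a hypothetical stand-in for a primitive file set:
// a set of added files (path -> content) and a set of deleted paths.
type primitive struct {
	added   map[string]string
	deleted map[string]bool
}

// merge applies the layers in order, oldest first. Within each layer,
// deletes are applied before adds (an assumption of this sketch).
func merge(layers []primitive) map[string]string {
	out := make(map[string]string)
	for _, l := range layers {
		for p := range l.deleted {
			delete(out, p)
		}
		for p, c := range l.added {
			out[p] = c
		}
	}
	return out
}

// sortedPaths returns the merged paths in lexicographical order.
func sortedPaths(files map[string]string) []string {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths
}

func main() {
	layers := []primitive{
		{added: map[string]string{"/a": "1", "/b": "2"}},
		{added: map[string]string{"/c": "3"}, deleted: map[string]bool{"/a": true}},
	}
	fmt.Println(sortedPaths(merge(layers)))
}
```

Keeping the layers separate until a merge is requested is what makes the operations lazy: adding a layer is cheap, and the cost of combining is paid only when a merged view is needed.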

File sets are ephemeral by default, so unreferenced file sets will eventually be garbage collected. Processing that involves creating and using file sets should be done inside a renewer to keep the file sets alive while the processing is occurring.

File sets can be created with a file set writer and files can be retrieved by opening a file set with optional indexing and filtering, then iterating through the files. Files must be added to a file set in lexicographical order and files are emitted by a file set in lexicographical order. Various file set wrappers are available that can perform computations and provide filtering based on the files emitted by a file set.
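The ordering requirement can be illustrated with a small sketch. The `orderedWriter` type and `errOutOfOrder` error are hypothetical names, not part of this package; the sketch assumes that a path comparing lower than the previously added path is rejected, and that repeating the same path is allowed.

```go
package main

import (
	"errors"
	"fmt"
)

// errOutOfOrder is a hypothetical error used only in this sketch.
var errOutOfOrder = errors.New("path added out of lexicographical order")

// orderedWriter is a toy writer that accepts paths only in
// non-decreasing lexicographical order.
type orderedWriter struct {
	last  string
	paths []string
}

// Add records path, or returns an error if it sorts before the last path.
func (w *orderedWriter) Add(path string) error {
	if len(w.paths) > 0 && path < w.last {
		return fmt.Errorf("%w: %q after %q", errOutOfOrder, path, w.last)
	}
	w.last = path
	w.paths = append(w.paths, path)
	return nil
}

func main() {
	w := &orderedWriter{}
	for _, p := range []string{"/a", "/b", "/a/x"} {
		if err := w.Add(p); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("added:", p)
	}
}
```

Note that "/a/x" sorts before "/b" under byte-wise string comparison, so it is rejected when added after "/b".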

Backwards compatibility:

The algorithms for file and index chunking have changed throughout 2.x, and we must support previously written data. Here are some examples of conditions in past data which current code will not generate:

  • a file that is split across multiple chunks may share some of them with other files (in the current code, a file split across chunks will be the only file in those chunks).
  • relatedly, even small files may be split across multiple chunks.
  • an index range data reference may start part way through a chunk.

Constants

const (
	// DefaultMemoryThreshold is the default for the memory threshold that must
	// be met before a file set part is serialized (excluding close).
	DefaultMemoryThreshold = units.GB
	// DefaultCompactionLevelFactor is the default factor that level sizes increase by in a compacted fileset.
	DefaultCompactionLevelFactor = 10
	DefaultPrefetchLimit         = 10
	DefaultBatchThreshold        = units.MB
	// DefaultIndexCacheSize is the default size of the index cache.
	DefaultIndexCacheSize = 100

	// TrackerPrefix is used for creating tracker objects for filesets
	TrackerPrefix = "fileset/"

	// DefaultFileDatum is the default file datum.
	DefaultFileDatum = "default"
)
const CacheTrackerPrefix = "cache/"

Variables

var (
	// ErrFileSetExists path already exists
	ErrFileSetExists = fmt.Errorf("path already exists")
	// ErrFileSetNotExists path does not exist
	ErrFileSetNotExists = fmt.Errorf("path does not exist")
	// ErrNoTTLSet no ttl set on path
	ErrNoTTLSet = fmt.Errorf("no ttl set on path")
)
var (
	// ErrNoFileSetFound is returned by the methods on Storage when a fileset does not exist
	ErrNoFileSetFound = errors.Errorf("no fileset found")
)
var File_internal_storage_fileset_fileset_proto protoreflect.FileDescriptor

Functions

func Clean

func Clean(p string, isDir bool) string

Clean cleans a file path.

func CopyDeletedFiles

func CopyDeletedFiles(ctx context.Context, w *Writer, fs FileSet, opts ...index.Option) error

CopyDeletedFiles copies the deleted files from a file set to a file set writer.

func CopyFiles

func CopyFiles(ctx context.Context, w *Writer, fs FileSet, opts ...index.Option) error

CopyFiles copies files from a file set to a file set writer.

func CreatePostgresCacheV1

func CreatePostgresCacheV1(ctx context.Context, tx *pachsql.Tx) error

CreatePostgresCacheV1 creates the table for a cache. DO NOT MODIFY THIS FUNCTION: IT HAS BEEN USED IN A RELEASED MIGRATION.

func IDsToHexStrings

func IDsToHexStrings(ids []ID) []string

func IsClean

func IsClean(x string, isDir bool) bool

IsClean checks if a file path is clean.

func IsDir

func IsDir(p string) bool

IsDir determines if a path is for a directory.

func SetupPostgresStoreV0

func SetupPostgresStoreV0(ctx context.Context, tx *pachsql.Tx) error

SetupPostgresStoreV0 sets up the tables for a Store. DO NOT MODIFY THIS FUNCTION: IT HAS BEEN USED IN A RELEASED MIGRATION.

func SizeFromIndex

func SizeFromIndex(idx *index.Index) (size int64)

func StoreTestSuite

func StoreTestSuite(t *testing.T, newStore func(t testing.TB) MetadataStore)

StoreTestSuite is a suite of tests for a Store.

func WriteTarEntry

func WriteTarEntry(ctx context.Context, w io.Writer, f File) error

WriteTarEntry writes a tar entry for f to w.

func WriteTarStream

func WriteTarStream(ctx context.Context, w io.Writer, fs FileSet) error

WriteTarStream writes an entire tar stream to w. It will contain an entry for each File in fs.
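As a rough illustration of what writing one tar entry per file involves, the following sketch uses only the standard library's archive/tar package. The `memFile` type and the helper functions are hypothetical and stand in for the package's File and FileSet; the header fields chosen here (name, mode, size) are assumptions, not a statement of what WriteTarEntry emits.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// memFile is a hypothetical in-memory file used only for this sketch.
type memFile struct {
	path    string
	content []byte
}

// writeTarStream writes one tar entry per file, then the tar trailer.
func writeTarStream(w io.Writer, files []memFile) error {
	tw := tar.NewWriter(w)
	for _, f := range files {
		hdr := &tar.Header{
			Name: f.path,
			Mode: 0o644,
			Size: int64(len(f.content)),
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if _, err := tw.Write(f.content); err != nil {
			return err
		}
	}
	// Close flushes padding and writes the end-of-archive marker.
	return tw.Close()
}

// listTar reads the entry names back from a tar stream.
func listTar(r io.Reader) ([]string, error) {
	var names []string
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// roundTrip writes files to an in-memory tar stream and lists its entries.
func roundTrip(files []memFile) ([]string, error) {
	var buf bytes.Buffer
	if err := writeTarStream(&buf, files); err != nil {
		return nil, err
	}
	return listTar(&buf)
}

func main() {
	names, err := roundTrip([]memFile{
		{path: "a.txt", content: []byte("hello")},
		{path: "b.txt", content: []byte("world")},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```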

Types

type Buffer

type Buffer struct {
	// contains filtered or unexported fields
}

func NewBuffer

func NewBuffer() *Buffer

func (*Buffer) Add

func (b *Buffer) Add(path, datum string) io.Writer

func (*Buffer) Copy

func (b *Buffer) Copy(file File, datum string)

func (*Buffer) Count

func (b *Buffer) Count() int

Count returns the number of paths tracked in the buffer. It is meant as a proxy for metadata memory usage.

func (*Buffer) Delete

func (b *Buffer) Delete(path, datum string)

func (*Buffer) Empty

func (b *Buffer) Empty() bool

func (*Buffer) WalkAdditive

func (b *Buffer) WalkAdditive(onAdd func(path, datum string, r io.Reader) error, onCopy func(file File, datum string) error) error

func (*Buffer) WalkDeletive

func (b *Buffer) WalkDeletive(cb func(path, datum string) error) error

type Cache

type Cache struct {
	// contains filtered or unexported fields
}

func NewCache

func NewCache(db *pachsql.DB, tracker track.Tracker, maxSize int) *Cache

func (*Cache) Clear

func (c *Cache) Clear(ctx context.Context, tagPrefix string) error

func (*Cache) Get

func (c *Cache) Get(ctx context.Context, key string) (*anypb.Any, error)

func (*Cache) Put

func (c *Cache) Put(ctx context.Context, key string, value *anypb.Any, ids []ID, tag string) error

type CompactCallback

type CompactCallback func(context.Context, []ID, time.Duration) (*ID, error)

CompactCallback is the standard callback signature for a compaction operation.

type CompactionConfig

type CompactionConfig struct {
	LevelFactor int64
}

type Composite

type Composite struct {
	Layers []string `protobuf:"bytes,1,rep,name=layers,proto3" json:"layers,omitempty"`
	// contains filtered or unexported fields
}

func (*Composite) Descriptor deprecated

func (*Composite) Descriptor() ([]byte, []int)

Deprecated: Use Composite.ProtoReflect.Descriptor instead.

func (*Composite) GetLayers

func (x *Composite) GetLayers() []string

func (*Composite) MarshalLogObject

func (x *Composite) MarshalLogObject(enc zapcore.ObjectEncoder) error

func (*Composite) PointsTo

func (c *Composite) PointsTo() ([]ID, error)

PointsTo returns the IDs of the filesets which this composite fileset points to

func (*Composite) ProtoMessage

func (*Composite) ProtoMessage()

func (*Composite) ProtoReflect added in v2.7.0

func (x *Composite) ProtoReflect() protoreflect.Message

func (*Composite) Reset

func (x *Composite) Reset()

func (*Composite) String

func (x *Composite) String() string

func (*Composite) Validate added in v2.8.0

func (m *Composite) Validate() error

Validate checks the field values on Composite with the rules defined in the proto definition for this message. If any rules are violated, the first error encountered is returned, or nil if there are no violations.

func (*Composite) ValidateAll added in v2.8.0

func (m *Composite) ValidateAll() error

ValidateAll checks the field values on Composite with the rules defined in the proto definition for this message. If any rules are violated, the result is a list of violation errors wrapped in CompositeMultiError, or nil if none found.

type CompositeMultiError added in v2.8.0

type CompositeMultiError []error

CompositeMultiError is an error wrapping multiple validation errors returned by Composite.ValidateAll() if the designated constraints aren't met.

func (CompositeMultiError) AllErrors added in v2.8.0

func (m CompositeMultiError) AllErrors() []error

AllErrors returns a list of validation violation errors.

func (CompositeMultiError) Error added in v2.8.0

func (m CompositeMultiError) Error() string

Error returns a concatenation of all the error messages it wraps.
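The difference between Validate (first error only) and ValidateAll (all errors, wrapped in a multi-error) can be sketched as follows. The `multiError` type, the `validate` function, and the rule being checked (no empty layer names) are all hypothetical; the actual rules come from the proto definition, and the separator used when joining messages is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// multiError is a hypothetical analogue of CompositeMultiError.
type multiError []error

// Error joins the wrapped messages; the separator is an assumption.
func (m multiError) Error() string {
	msgs := make([]string, 0, len(m))
	for _, err := range m {
		msgs = append(msgs, err.Error())
	}
	return strings.Join(msgs, "; ")
}

// validate checks a made-up rule: no layer name may be empty.
// With all=false it returns the first violation; with all=true it
// collects every violation into a multiError.
func validate(layers []string, all bool) error {
	var errs multiError
	for i, l := range layers {
		if l != "" {
			continue
		}
		err := fmt.Errorf("layers[%d]: must not be empty", i)
		if !all {
			return err
		}
		errs = append(errs, err)
	}
	if len(errs) > 0 {
		return errs
	}
	// Return an untyped nil so callers can compare against nil safely.
	return nil
}

// countErrors reports how many violations an error value carries.
func countErrors(err error) int {
	if err == nil {
		return 0
	}
	if me, ok := err.(multiError); ok {
		return len(me)
	}
	return 1
}

func main() {
	layers := []string{"a", "", "b", ""}
	fmt.Println(validate(layers, false))
	fmt.Println(validate(layers, true))
}
```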

type CompositeValidationError added in v2.8.0

type CompositeValidationError struct {
	// contains filtered or unexported fields
}

CompositeValidationError is the validation error returned by Composite.Validate if the designated constraints aren't met.

func (CompositeValidationError) Cause added in v2.8.0

func (e CompositeValidationError) Cause() error

Cause function returns cause value.

func (CompositeValidationError) Error added in v2.8.0

func (e CompositeValidationError) Error() string

Error satisfies the builtin error interface

func (CompositeValidationError) ErrorName added in v2.8.0

func (e CompositeValidationError) ErrorName() string

ErrorName returns error name.

func (CompositeValidationError) Field added in v2.8.0

func (e CompositeValidationError) Field() string

Field function returns field value.

func (CompositeValidationError) Key added in v2.8.0

func (e CompositeValidationError) Key() bool

Key function returns key value.

func (CompositeValidationError) Reason added in v2.8.0

func (e CompositeValidationError) Reason() string

Reason function returns reason value.

type File

type File interface {
	// Index returns the index for the file.
	Index() *index.Index
	// Content writes the content of the file.
	Content(ctx context.Context, w io.Writer, opts ...chunk.ReaderOption) error
	// Hash returns the hash of the file.
	Hash(ctx context.Context) ([]byte, error)
}

File represents a file.

type FileReader

type FileReader struct {
	// contains filtered or unexported fields
}

FileReader is an abstraction for reading a file.

func (*FileReader) Content

func (fr *FileReader) Content(ctx context.Context, w io.Writer, opts ...chunk.ReaderOption) error

Content writes the content of the file.

func (*FileReader) Hash

func (fr *FileReader) Hash(_ context.Context) ([]byte, error)

Hash returns the hash of the file.

func (*FileReader) Index

func (fr *FileReader) Index() *index.Index

Index returns the index for the file.

type FileSet

type FileSet interface {
	// Iterate iterates over the files in the file set.
	Iterate(ctx context.Context, cb func(File) error, opts ...index.Option) error
	// IterateDeletes iterates over the deleted files in the file set.
	IterateDeletes(ctx context.Context, cb func(File) error, opts ...index.Option) error
	// Shards returns a list of shards for the file set.
	Shards(ctx context.Context, opts ...index.Option) ([]*index.PathRange, error)
}

FileSet represents a set of files.

func NewDirInserter

func NewDirInserter(x FileSet, lower string) FileSet

NewDirInserter creates a file set that inserts directory entries.

func NewIndexFilter

func NewIndexFilter(fs FileSet, predicate func(idx *index.Index) bool, full ...bool) FileSet

NewIndexFilter filters fs using predicate.

func NewIndexMapper

func NewIndexMapper(x FileSet, fn func(*index.Index) *index.Index) FileSet

NewIndexMapper performs a map operation on the index entries of the files in the file set.

func NewPrefetcher

func NewPrefetcher(storage *Storage, fileSet FileSet, upper string) FileSet

type ID

type ID [16]byte

ID is the unique identifier for a fileset

func HexStringsToIDs

func HexStringsToIDs(xs []string) ([]ID, error)

func ParseID

func ParseID(x string) (*ID, error)

ParseID parses a string into an ID or returns an error

func (ID) HexString

func (id ID) HexString() string

HexString returns the ID encoded with the hex alphabet.

func (*ID) Scan

func (id *ID) Scan(src interface{}) error

Scan implements sql.Scanner

func (ID) TrackerID

func (id ID) TrackerID() string

TrackerID returns the tracker ID for the fileset.

func (ID) Value

func (id ID) Value() (driver.Value, error)

Value implements sql.Valuer
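Since ID is a 16-byte array, converting to and from a hex string can be sketched with encoding/hex. The lowercase names `id`, `hexString`, and `parseID` below are hypothetical re-implementations for illustration; the package's own ParseID may accept or reject inputs differently.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// idLen matches the documented size of ID ([16]byte).
const idLen = 16

// id is a hypothetical copy of the ID type for this sketch.
type id [idLen]byte

// hexString encodes the ID as 32 lowercase hex characters.
func (x id) hexString() string {
	return hex.EncodeToString(x[:])
}

// parseID decodes a hex string and checks that it is exactly idLen bytes.
func parseID(s string) (*id, error) {
	b, err := hex.DecodeString(s)
	if err != nil {
		return nil, fmt.Errorf("parse id: %w", err)
	}
	if len(b) != idLen {
		return nil, fmt.Errorf("parse id: want %d bytes, got %d", idLen, len(b))
	}
	var out id
	copy(out[:], b)
	return &out, nil
}

func main() {
	orig := id{0xde, 0xad, 0xbe, 0xef}
	s := orig.hexString()
	parsed, err := parseID(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(s, *parsed == orig)
}
```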

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator provides functionality for imperative iteration over a file set.

func NewIterator

func NewIterator(ctx context.Context, iter iterFunc, opts ...index.Option) *Iterator

func (*Iterator) Next

func (i *Iterator) Next() (File, error)

Next returns the next file and progresses the iterator.

func (*Iterator) Peek

func (i *Iterator) Peek() (File, error)

Peek returns the next file without progressing the iterator.
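The relationship between Next and Peek can be sketched with a slice-backed iterator. The `iterator` type below is hypothetical and iterates over strings rather than File values; signalling exhaustion with io.EOF is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"io"
)

// iterator is a hypothetical slice-backed iterator for this sketch.
type iterator struct {
	items []string
	pos   int
}

// Peek returns the next item without advancing.
func (it *iterator) Peek() (string, error) {
	if it.pos >= len(it.items) {
		return "", io.EOF
	}
	return it.items[it.pos], nil
}

// Next returns the next item and advances past it.
func (it *iterator) Next() (string, error) {
	item, err := it.Peek()
	if err != nil {
		return "", err
	}
	it.pos++
	return item, nil
}

func main() {
	it := &iterator{items: []string{"/a", "/b"}}
	for {
		p, err := it.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err)
		}
		fmt.Println(p)
	}
}
```

Peek is useful when merging several sorted streams: the caller can compare the heads of each stream and advance only the one with the smallest path.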

type MergeFileReader

type MergeFileReader struct {
	// contains filtered or unexported fields
}

MergeFileReader is an abstraction for reading a merged file.

func (*MergeFileReader) Content

func (mfr *MergeFileReader) Content(ctx context.Context, w io.Writer, opts ...chunk.ReaderOption) error

Content returns the content of the merged file.

func (*MergeFileReader) Hash

func (mfr *MergeFileReader) Hash(ctx context.Context) ([]byte, error)

Hash returns the hash of the file. TODO: It would be good to remove this potential performance footgun, but it would require removing the append functionality.

func (*MergeFileReader) Index

func (mfr *MergeFileReader) Index() *index.Index

Index returns the index for the merged file. TODO: Removed clone because it had a significant performance impact for small files. May want to revisit.

type MergeReader

type MergeReader struct {
	// contains filtered or unexported fields
}

MergeReader is an abstraction for reading merged file sets.

func (*MergeReader) Iterate

func (mr *MergeReader) Iterate(ctx context.Context, cb func(File) error, opts ...index.Option) error

Iterate iterates over the files in the merge reader.

func (*MergeReader) IterateDeletes

func (mr *MergeReader) IterateDeletes(ctx context.Context, cb func(File) error, opts ...index.Option) error

func (*MergeReader) Shards

func (mr *MergeReader) Shards(ctx context.Context, opts ...index.Option) ([]*index.PathRange, error)

TODO: Look at the sizes? TODO: Come up with better heuristics for sharding.

type Metadata

type Metadata struct {

	// Types that are assignable to Value:
	//
	//	*Metadata_Primitive
	//	*Metadata_Composite
	Value isMetadata_Value `protobuf_oneof:"value"`
	// contains filtered or unexported fields
}

func (*Metadata) Descriptor deprecated

func (*Metadata) Descriptor() ([]byte, []int)

Deprecated: Use Metadata.ProtoReflect.Descriptor instead.

func (*Metadata) GetComposite

func (x *Metadata) GetComposite() *Composite

func (*Metadata) GetPrimitive

func (x *Metadata) GetPrimitive() *Primitive

func (*Metadata) GetValue

func (m *Metadata) GetValue() isMetadata_Value

func (*Metadata) MarshalLogObject

func (x *Metadata) MarshalLogObject(enc zapcore.ObjectEncoder) error

func (*Metadata) ProtoMessage

func (*Metadata) ProtoMessage()

func (*Metadata) ProtoReflect added in v2.7.0

func (x *Metadata) ProtoReflect() protoreflect.Message

func (*Metadata) Reset

func (x *Metadata) Reset()

func (*Metadata) String

func (x *Metadata) String() string

func (*Metadata) Validate added in v2.8.0

func (m *Metadata) Validate() error

Validate checks the field values on Metadata with the rules defined in the proto definition for this message. If any rules are violated, the first error encountered is returned, or nil if there are no violations.

func (*Metadata) ValidateAll added in v2.8.0

func (m *Metadata) ValidateAll() error

ValidateAll checks the field values on Metadata with the rules defined in the proto definition for this message. If any rules are violated, the result is a list of violation errors wrapped in MetadataMultiError, or nil if none found.

type MetadataMultiError added in v2.8.0

type MetadataMultiError []error

MetadataMultiError is an error wrapping multiple validation errors returned by Metadata.ValidateAll() if the designated constraints aren't met.

func (MetadataMultiError) AllErrors added in v2.8.0

func (m MetadataMultiError) AllErrors() []error

AllErrors returns a list of validation violation errors.

func (MetadataMultiError) Error added in v2.8.0

func (m MetadataMultiError) Error() string

Error returns a concatenation of all the error messages it wraps.

type MetadataStore

type MetadataStore interface {
	DB() *pachsql.DB
	SetTx(tx *pachsql.Tx, id ID, md *Metadata) error
	Get(ctx context.Context, id ID) (*Metadata, error)
	GetTx(tx *pachsql.Tx, id ID) (*Metadata, error)
	DeleteTx(tx *pachsql.Tx, id ID) error
	Exists(ctx context.Context, id ID) (bool, error)
}

MetadataStore stores filesets. A fileset is a path -> index relationship. All filesets exist in the same keyspace and can be merged by prefix.

func NewPostgresStore

func NewPostgresStore(db *pachsql.DB) MetadataStore

NewPostgresStore returns a Store backed by db. TODO: Expose configuration for cache size?

func NewTestStore

func NewTestStore(ctx context.Context, t testing.TB, db *pachsql.DB) MetadataStore

NewTestStore returns a Store scoped to the lifetime of the test.

type MetadataValidationError added in v2.8.0

type MetadataValidationError struct {
	// contains filtered or unexported fields
}

MetadataValidationError is the validation error returned by Metadata.Validate if the designated constraints aren't met.

func (MetadataValidationError) Cause added in v2.8.0

func (e MetadataValidationError) Cause() error

Cause function returns cause value.

func (MetadataValidationError) Error added in v2.8.0

func (e MetadataValidationError) Error() string

Error satisfies the builtin error interface

func (MetadataValidationError) ErrorName added in v2.8.0

func (e MetadataValidationError) ErrorName() string

ErrorName returns error name.

func (MetadataValidationError) Field added in v2.8.0

func (e MetadataValidationError) Field() string

Field function returns field value.

func (MetadataValidationError) Key added in v2.8.0

func (e MetadataValidationError) Key() bool

Key function returns key value.

func (MetadataValidationError) Reason added in v2.8.0

func (e MetadataValidationError) Reason() string

Reason function returns reason value.

type Metadata_Composite

type Metadata_Composite struct {
	Composite *Composite `protobuf:"bytes,2,opt,name=composite,proto3,oneof"`
}

type Metadata_Primitive

type Metadata_Primitive struct {
	Primitive *Primitive `protobuf:"bytes,1,opt,name=primitive,proto3,oneof"`
}

type Primitive

type Primitive struct {
	Deletive  *index.Index `protobuf:"bytes,1,opt,name=deletive,proto3" json:"deletive,omitempty"`
	Additive  *index.Index `protobuf:"bytes,2,opt,name=additive,proto3" json:"additive,omitempty"`
	SizeBytes int64        `protobuf:"varint,3,opt,name=size_bytes,json=sizeBytes,proto3" json:"size_bytes,omitempty"`
	// contains filtered or unexported fields
}

func (*Primitive) Descriptor deprecated

func (*Primitive) Descriptor() ([]byte, []int)

Deprecated: Use Primitive.ProtoReflect.Descriptor instead.

func (*Primitive) GetAdditive

func (x *Primitive) GetAdditive() *index.Index

func (*Primitive) GetDeletive

func (x *Primitive) GetDeletive() *index.Index

func (*Primitive) GetSizeBytes

func (x *Primitive) GetSizeBytes() int64

func (*Primitive) MarshalLogObject

func (x *Primitive) MarshalLogObject(enc zapcore.ObjectEncoder) error

func (*Primitive) PointsTo

func (p *Primitive) PointsTo() []chunk.ID

PointsTo returns a slice of the chunk.IDs which this fileset immediately points to. Transitively reachable chunks are not included in the slice.

func (*Primitive) ProtoMessage

func (*Primitive) ProtoMessage()

func (*Primitive) ProtoReflect added in v2.7.0

func (x *Primitive) ProtoReflect() protoreflect.Message

func (*Primitive) Reset

func (x *Primitive) Reset()

func (*Primitive) String

func (x *Primitive) String() string

func (*Primitive) Validate added in v2.8.0

func (m *Primitive) Validate() error

Validate checks the field values on Primitive with the rules defined in the proto definition for this message. If any rules are violated, the first error encountered is returned, or nil if there are no violations.

func (*Primitive) ValidateAll added in v2.8.0

func (m *Primitive) ValidateAll() error

ValidateAll checks the field values on Primitive with the rules defined in the proto definition for this message. If any rules are violated, the result is a list of violation errors wrapped in PrimitiveMultiError, or nil if none found.

type PrimitiveMultiError added in v2.8.0

type PrimitiveMultiError []error

PrimitiveMultiError is an error wrapping multiple validation errors returned by Primitive.ValidateAll() if the designated constraints aren't met.

func (PrimitiveMultiError) AllErrors added in v2.8.0

func (m PrimitiveMultiError) AllErrors() []error

AllErrors returns a list of validation violation errors.

func (PrimitiveMultiError) Error added in v2.8.0

func (m PrimitiveMultiError) Error() string

Error returns a concatenation of all the error messages it wraps.

type PrimitiveValidationError added in v2.8.0

type PrimitiveValidationError struct {
	// contains filtered or unexported fields
}

PrimitiveValidationError is the validation error returned by Primitive.Validate if the designated constraints aren't met.

func (PrimitiveValidationError) Cause added in v2.8.0

func (e PrimitiveValidationError) Cause() error

Cause function returns cause value.

func (PrimitiveValidationError) Error added in v2.8.0

func (e PrimitiveValidationError) Error() string

Error satisfies the builtin error interface

func (PrimitiveValidationError) ErrorName added in v2.8.0

func (e PrimitiveValidationError) ErrorName() string

ErrorName returns error name.

func (PrimitiveValidationError) Field added in v2.8.0

func (e PrimitiveValidationError) Field() string

Field function returns field value.

func (PrimitiveValidationError) Key added in v2.8.0

func (e PrimitiveValidationError) Key() bool

Key function returns key value.

func (PrimitiveValidationError) Reason added in v2.8.0

func (e PrimitiveValidationError) Reason() string

Reason function returns reason value.

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader is an abstraction for reading a file set.

func (*Reader) Iterate

func (r *Reader) Iterate(ctx context.Context, cb func(File) error, opts ...index.Option) error

func (*Reader) IterateDeletes

func (r *Reader) IterateDeletes(ctx context.Context, cb func(File) error, opts ...index.Option) error

func (*Reader) Shards

func (r *Reader) Shards(ctx context.Context, opts ...index.Option) ([]*index.PathRange, error)

type Renewer

type Renewer struct {
	// contains filtered or unexported fields
}

Renewer renews file sets by ID. It is a renew.StringSet wrapped for type safety.

func (*Renewer) Add

func (r *Renewer) Add(ctx context.Context, id ID) error

func (*Renewer) Close

func (r *Renewer) Close() error

func (*Renewer) Context

func (r *Renewer) Context() context.Context

type Storage

type Storage struct {
	// contains filtered or unexported fields
}

Storage is an abstraction for interfacing with file sets. A storage instance:

  • Provides methods for writing file sets, opening file sets for reading, and managing file sets.
  • Manages tracker state to keep internal file sets alive while writing file sets.
  • Manages an internal index cache that supports logarithmic lookup in the multilevel indexes.
  • Provides methods for processing file set compaction tasks.
  • Provides a method for creating a garbage collector.

func NewStorage

func NewStorage(mds MetadataStore, tr track.Tracker, chunks *chunk.Storage, opts ...StorageOption) *Storage

NewStorage creates a new Storage.

func NewTestStorage

func NewTestStorage(ctx context.Context, t testing.TB, db *pachsql.DB, tr track.Tracker, opts ...StorageOption) *Storage

NewTestStorage constructs a local storage instance scoped to the lifetime of the test

func (*Storage) CloneTx

func (s *Storage) CloneTx(tx *pachsql.Tx, id ID, ttl time.Duration) (*ID, error)

CloneTx creates a new fileset, identical to the fileset at id, but with the specified ttl. The ttl can be ignored by using track.NoTTL

func (*Storage) Compact

func (s *Storage) Compact(ctx context.Context, ids []ID, ttl time.Duration, opts ...index.Option) (*ID, error)

Compact compacts the contents of ids into a new file set with the specified ttl and returns the ID. Compact always returns the ID of a primitive file set. Compact does not renew ids. It is the responsibility of the caller to renew ids. In some cases they may be permanent and not require renewal.

func (*Storage) CompactLevelBased

func (s *Storage) CompactLevelBased(ctx context.Context, ids []ID, maxFanIn int, ttl time.Duration, compact CompactCallback) (*ID, error)

CompactLevelBased performs a level-based compaction on the passed in file sets.

func (*Storage) Compose

func (s *Storage) Compose(ctx context.Context, ids []ID, ttl time.Duration) (*ID, error)

Compose produces a composite fileset from the filesets under ids. It does not perform a merge or check the filesets at ids in any way other than ensuring that they exist.

func (*Storage) ComposeTx

func (s *Storage) ComposeTx(tx *pachsql.Tx, ids []ID, ttl time.Duration) (*ID, error)

ComposeTx produces a composite fileset from the filesets under ids. It does not perform a merge or check the filesets at ids in any way other than ensuring that they exist.

func (*Storage) Concat

func (s *Storage) Concat(ctx context.Context, ids []ID, ttl time.Duration) (*ID, error)

Concat is a special case of Merge, where the filesets each contain paths for distinct ranges. The path ranges must be non-overlapping and the ranges must be lexicographically sorted. Concat always returns the ID of a primitive fileset.
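The precondition on Concat's inputs can be expressed as a simple check. The `pathRange` type and `checkConcatenable` function are hypothetical; the sketch treats each range as closed (both bounds inclusive), which is an assumption rather than a statement about index.PathRange.

```go
package main

import "fmt"

// pathRange is a hypothetical closed range [lower, upper] of paths.
type pathRange struct {
	lower, upper string
}

// checkConcatenable returns an error unless the ranges are sorted
// lexicographically and pairwise non-overlapping.
func checkConcatenable(ranges []pathRange) error {
	for i, r := range ranges {
		if r.lower > r.upper {
			return fmt.Errorf("range %d is inverted: %q > %q", i, r.lower, r.upper)
		}
		if i > 0 && ranges[i-1].upper >= r.lower {
			return fmt.Errorf("range %d overlaps or precedes range %d", i, i-1)
		}
	}
	return nil
}

func main() {
	ok := []pathRange{{"/a", "/f"}, {"/g", "/m"}, {"/n", "/z"}}
	bad := []pathRange{{"/a", "/h"}, {"/g", "/m"}}
	fmt.Println(checkConcatenable(ok))
	fmt.Println(checkConcatenable(bad))
}
```

When the check holds, the outputs can be joined end to end without comparing individual paths across inputs, which is why concatenation is cheaper than a general merge.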

func (*Storage) Drop

func (s *Storage) Drop(ctx context.Context, id ID) error

Drop allows a fileset to be deleted if it is not otherwise referenced.

func (*Storage) Flatten

func (s *Storage) Flatten(ctx context.Context, ids []ID, cb func(id ID) error) error

Flatten iterates through IDs and replaces references to composite file sets with all their layers in place and executes the user provided callback against each primitive file set.

func (*Storage) FlattenAll

func (s *Storage) FlattenAll(ctx context.Context, ids []ID) ([]ID, error)

FlattenAll is like Flatten, but collects the primitives to return to the user.
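The in-place expansion performed by Flatten and FlattenAll can be sketched as a recursive walk. The `store` type below is a hypothetical model in which a map records the layers of each composite file set and any ID missing from the map is treated as primitive; it uses string IDs instead of ID values and does no I/O.

```go
package main

import "fmt"

// store maps a composite file set ID to its layer IDs. IDs absent from
// the map are treated as primitive. This is a hypothetical model.
type store map[string][]string

// flatten visits ids in order, expanding composites into their layers
// in place and calling cb for each primitive ID.
func (s store) flatten(ids []string, cb func(id string) error) error {
	for _, id := range ids {
		layers, isComposite := s[id]
		if !isComposite {
			if err := cb(id); err != nil {
				return err
			}
			continue
		}
		if err := s.flatten(layers, cb); err != nil {
			return err
		}
	}
	return nil
}

// flattenAll collects the primitive IDs visited by flatten.
func (s store) flattenAll(ids []string) ([]string, error) {
	var out []string
	err := s.flatten(ids, func(id string) error {
		out = append(out, id)
		return nil
	})
	return out, err
}

func main() {
	s := store{
		"c1": {"p1", "c2"},
		"c2": {"p2", "p3"},
	}
	prims, err := s.flattenAll([]string{"p0", "c1", "p4"})
	if err != nil {
		panic(err)
	}
	fmt.Println(prims)
}
```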

func (*Storage) IsCompacted

func (s *Storage) IsCompacted(ctx context.Context, id ID) (bool, error)

IsCompacted returns true if the file sets are already in a compacted form.

func (*Storage) NewGC

func (*Storage) NewUnorderedWriter

func (s *Storage) NewUnorderedWriter(ctx context.Context, opts ...UnorderedWriterOption) (*UnorderedWriter, error)

NewUnorderedWriter creates a new unordered file set writer.

func (*Storage) NewWriter

func (s *Storage) NewWriter(ctx context.Context, opts ...WriterOption) *Writer

NewWriter creates a new file set writer.

func (*Storage) Open

func (s *Storage) Open(ctx context.Context, ids []ID) (FileSet, error)

Open opens a file set for reading.

func (*Storage) SetTTL

func (s *Storage) SetTTL(ctx context.Context, id ID, ttl time.Duration) (time.Time, error)

SetTTL sets the time-to-live for the fileset at id

func (*Storage) Shard

func (s *Storage) Shard(ctx context.Context, ids []ID, pathRange *index.PathRange) ([]*index.PathRange, error)

Shard shards the file set into path ranges. TODO This should be extended to be more configurable (different criteria for creating shards).

func (*Storage) ShardConfig

func (s *Storage) ShardConfig() *index.ShardConfig

func (*Storage) Size

func (s *Storage) Size(ctx context.Context, id ID) (int64, error)

Size returns the size of the data in the file set in bytes.

func (*Storage) SizeUpperBound

func (s *Storage) SizeUpperBound(ctx context.Context, id ID) (int64, error)

SizeUpperBound returns an upper bound for the size of the data in the file set in bytes. The upper bound is cheaper to compute than the actual size.

func (*Storage) WithRenewer

func (s *Storage) WithRenewer(ctx context.Context, ttl time.Duration, cb func(context.Context, *Renewer) error) (retErr error)

WithRenewer calls cb with a Renewer, and a context which will be canceled if the renewer is unable to renew a path.

type StorageOption

type StorageOption func(*Storage)

StorageOption configures a storage.

func WithLevelFactor

func WithLevelFactor(x int64) StorageOption

WithLevelFactor sets the factor by which level sizes increase in a compacted fileset.

func WithMaxOpenFileSets

func WithMaxOpenFileSets(max int) StorageOption

WithMaxOpenFileSets sets the maximum number of filesets that will be open (potentially buffered in memory) at a time.

func WithMemoryThreshold

func WithMemoryThreshold(threshold int64) StorageOption

WithMemoryThreshold sets the memory threshold that must be met before a file set part is serialized (excluding close).

func WithShardCountThreshold

func WithShardCountThreshold(threshold int64) StorageOption

WithShardCountThreshold sets the count threshold that must be met before a shard is created by the shard function.

func WithShardSizeThreshold

func WithShardSizeThreshold(threshold int64) StorageOption

WithShardSizeThreshold sets the size threshold that must be met before a shard is created by the shard function.
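StorageOption follows the functional options pattern: each option is a function that mutates the value being constructed. A minimal sketch of that pattern is shown here; the `storage` struct, its fields, and the default values are placeholders chosen for illustration, not the package's internals.

```go
package main

import "fmt"

// storage is a hypothetical stand-in for Storage with two settings.
type storage struct {
	memoryThreshold int64
	levelFactor     int64
}

// storageOption mirrors the documented shape: a function over the storage.
type storageOption func(*storage)

// withMemoryThreshold overrides the memory threshold.
func withMemoryThreshold(threshold int64) storageOption {
	return func(s *storage) { s.memoryThreshold = threshold }
}

// withLevelFactor overrides the level factor.
func withLevelFactor(x int64) storageOption {
	return func(s *storage) { s.levelFactor = x }
}

// newStorage starts from placeholder defaults and applies options in order.
func newStorage(opts ...storageOption) *storage {
	s := &storage{
		memoryThreshold: 1_000_000_000, // placeholder default
		levelFactor:     10,            // placeholder default
	}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	s := newStorage(withLevelFactor(4))
	fmt.Println(s.memoryThreshold, s.levelFactor)
}
```

Because options are applied in order, a later option overrides an earlier one that sets the same field.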

type TestCacheValue

type TestCacheValue struct {
	FileSetId string `protobuf:"bytes,1,opt,name=file_set_id,json=fileSetId,proto3" json:"file_set_id,omitempty"`
	// contains filtered or unexported fields
}

func (*TestCacheValue) Descriptor deprecated

func (*TestCacheValue) Descriptor() ([]byte, []int)

Deprecated: Use TestCacheValue.ProtoReflect.Descriptor instead.

func (*TestCacheValue) GetFileSetId

func (x *TestCacheValue) GetFileSetId() string

func (*TestCacheValue) MarshalLogObject

func (x *TestCacheValue) MarshalLogObject(enc zapcore.ObjectEncoder) error

func (*TestCacheValue) ProtoMessage

func (*TestCacheValue) ProtoMessage()

func (*TestCacheValue) ProtoReflect added in v2.7.0

func (x *TestCacheValue) ProtoReflect() protoreflect.Message

func (*TestCacheValue) Reset

func (x *TestCacheValue) Reset()

func (*TestCacheValue) String

func (x *TestCacheValue) String() string

func (*TestCacheValue) Validate added in v2.8.0

func (m *TestCacheValue) Validate() error

Validate checks the field values on TestCacheValue with the rules defined in the proto definition for this message. If any rules are violated, the first error encountered is returned, or nil if there are no violations.

func (*TestCacheValue) ValidateAll added in v2.8.0

func (m *TestCacheValue) ValidateAll() error

ValidateAll checks the field values on TestCacheValue with the rules defined in the proto definition for this message. If any rules are violated, the result is a list of violation errors wrapped in TestCacheValueMultiError, or nil if none found.

type TestCacheValueMultiError added in v2.8.0

type TestCacheValueMultiError []error

TestCacheValueMultiError is an error wrapping multiple validation errors returned by TestCacheValue.ValidateAll() if the designated constraints aren't met.

func (TestCacheValueMultiError) AllErrors added in v2.8.0

func (m TestCacheValueMultiError) AllErrors() []error

AllErrors returns a list of validation violation errors.

func (TestCacheValueMultiError) Error added in v2.8.0

func (m TestCacheValueMultiError) Error() string

Error returns a concatenation of all the error messages it wraps.

type TestCacheValueValidationError added in v2.8.0

type TestCacheValueValidationError struct {
	// contains filtered or unexported fields
}

TestCacheValueValidationError is the validation error returned by TestCacheValue.Validate if the designated constraints aren't met.

func (TestCacheValueValidationError) Cause added in v2.8.0

func (e TestCacheValueValidationError) Cause() error

Cause function returns cause value.

func (TestCacheValueValidationError) Error added in v2.8.0

func (e TestCacheValueValidationError) Error() string

Error satisfies the builtin error interface

func (TestCacheValueValidationError) ErrorName added in v2.8.0

func (e TestCacheValueValidationError) ErrorName() string

ErrorName returns error name.

func (TestCacheValueValidationError) Field added in v2.8.0

func (e TestCacheValueValidationError) Field() string

Field function returns field value.

func (TestCacheValueValidationError) Key added in v2.8.0

func (e TestCacheValueValidationError) Key() bool

Key function returns key value.

func (TestCacheValueValidationError) Reason added in v2.8.0

func (e TestCacheValueValidationError) Reason() string

Reason function returns reason value.

type UnorderedWriter

type UnorderedWriter struct {
	// contains filtered or unexported fields
}

UnorderedWriter allows writing Files, unordered by path, into multiple ordered filesets. This may be a full filesystem or a subfilesystem (e.g. datum / datum set / shard).
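One way to picture how unordered writes can become multiple ordered file sets is a buffer that is sorted and flushed whenever it fills. The `unorderedWriter` type below is a hypothetical sketch: it flushes on a path-count threshold, whereas the package documents a memory threshold, and it records only sorted path lists rather than real file sets.

```go
package main

import (
	"fmt"
	"sort"
)

// unorderedWriter is a hypothetical sketch: it buffers puts in any order
// and flushes them as sorted parts.
type unorderedWriter struct {
	threshold int               // flush when this many paths are buffered
	buf       map[string]string // path -> content
	parts     [][]string        // each part is a sorted list of paths
}

func newUnorderedWriter(threshold int) *unorderedWriter {
	return &unorderedWriter{threshold: threshold, buf: make(map[string]string)}
}

// Put buffers a file and flushes once the threshold is reached.
func (uw *unorderedWriter) Put(path, content string) {
	uw.buf[path] = content
	if len(uw.buf) >= uw.threshold {
		uw.flush()
	}
}

// flush sorts the buffered paths and records them as one ordered part.
func (uw *unorderedWriter) flush() {
	if len(uw.buf) == 0 {
		return
	}
	paths := make([]string, 0, len(uw.buf))
	for p := range uw.buf {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	uw.parts = append(uw.parts, paths)
	uw.buf = make(map[string]string)
}

// Close flushes any remaining files and returns all parts.
func (uw *unorderedWriter) Close() [][]string {
	uw.flush()
	return uw.parts
}

func main() {
	uw := newUnorderedWriter(2)
	for _, p := range []string{"/z", "/a", "/m", "/b", "/c"} {
		uw.Put(p, "data")
	}
	fmt.Println(uw.Close())
}
```

Each part is ordered on its own, but the parts overlap in path range, so reading them back as one view requires a merge.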

func (*UnorderedWriter) AddFileSet

func (uw *UnorderedWriter) AddFileSet(ctx context.Context, id ID) error

func (*UnorderedWriter) Close

func (uw *UnorderedWriter) Close(ctx context.Context) (*ID, error)

Close closes the writer.

func (*UnorderedWriter) Copy

func (uw *UnorderedWriter) Copy(ctx context.Context, fs FileSet, datum string, appendFile bool, opts ...index.Option) error

func (*UnorderedWriter) Delete

func (uw *UnorderedWriter) Delete(ctx context.Context, p, datum string) error

Delete deletes a file from the file set.

func (*UnorderedWriter) Put

func (uw *UnorderedWriter) Put(ctx context.Context, p, datum string, appendFile bool, r io.Reader) (retErr error)

type UnorderedWriterOption

type UnorderedWriterOption func(*UnorderedWriter)

UnorderedWriterOption configures an UnorderedWriter.

func WithCompact

func WithCompact(maxFanIn int) UnorderedWriterOption

WithCompact sets the unordered writer to compact the created file sets if they exceed the passed in fan in.

func WithParentID

func WithParentID(getParentID func() (*ID, error)) UnorderedWriterOption

WithParentID sets a factory for the parent fileset ID for the unordered writer. This is used for converting directory deletions into a set of point deletes for the files contained within the directory.

func WithRenewal

func WithRenewal(ttl time.Duration, r *Renewer) UnorderedWriterOption

WithRenewal configures the UnorderedWriter to renew subfileset paths with the provided renewer.

func WithValidator

func WithValidator(validator func(string) error) UnorderedWriterOption

WithValidator sets the validator for paths being written to the unordered writer.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer provides functionality for writing a file set.

func (*Writer) Add

func (w *Writer) Add(path, datum string, r io.Reader) error

func (*Writer) Close

func (w *Writer) Close() (*ID, error)

Close closes the writer.

func (*Writer) Copy

func (w *Writer) Copy(file File, datum string) error

Copy copies a file to the file set writer.

func (*Writer) Delete

func (w *Writer) Delete(path, datum string) error

Delete creates a delete operation for a file.

type WriterOption

type WriterOption func(w *Writer)

WriterOption configures a file set writer.

func WithTTL

func WithTTL(ttl time.Duration) WriterOption

WithTTL sets the ttl for the fileset

Directories

Path Synopsis
Package index provides access to files through multilevel indexes.