index

package
v2.9.4 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 8, 2024 License: Apache-2.0 Imports: 29 Imported by: 0

Documentation

Overview

Package index provides access to files through multilevel indexes.

A multilevel index contains one or more levels. The lowest level contains file type index entries, which directly reference file content. The levels above contain range type index entries, which directly reference a range of index entries and indirectly reference ranges of file content. Multilevel indexes are created using writers, which can create indexes for new files or build indexes from other indexes (rooted by file or range type indexes). Reading a multilevel index requires setting up a reader, which provides various indexing strategies and filters.
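The layering described above can be pictured with a small in-memory model. This is only a sketch: `entry`, `walk`, `buildExample`, and `collectPaths` are hypothetical names, not part of this package, and a real index serializes its levels into chunk storage rather than holding pointers in memory.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for an index entry. A file type entry
// references content directly; a range type entry references lower entries.
type entry struct {
	path     string   // file path, or the first path covered by a range
	content  string   // file type only: stands in for referenced file content
	children []*entry // range type only: the entries this range covers
}

// isFile reports whether e is a file type entry.
func (e *entry) isFile() bool { return len(e.children) == 0 }

// walk descends through range type entries and calls cb on each file type
// entry in order, stopping at the first error.
func walk(e *entry, cb func(*entry) error) error {
	if e.isFile() {
		return cb(e)
	}
	for _, child := range e.children {
		if err := walk(child, cb); err != nil {
			return err
		}
	}
	return nil
}

// buildExample returns a small index with two range levels above four files.
func buildExample() *entry {
	file := func(path, content string) *entry {
		return &entry{path: path, content: content}
	}
	left := &entry{path: "/a", children: []*entry{file("/a", "A"), file("/b", "B")}}
	right := &entry{path: "/c", children: []*entry{file("/c", "C"), file("/d", "D")}}
	return &entry{path: "/a", children: []*entry{left, right}}
}

// collectPaths gathers the paths of all file type entries under top.
func collectPaths(top *entry) []string {
	var paths []string
	// The callback never fails here, so the returned error is always nil.
	_ = walk(top, func(e *entry) error {
		paths = append(paths, e.path)
		return nil
	})
	return paths
}

func main() {
	fmt.Println(collectPaths(buildExample()))
}
```

The callback shape mirrors the `cb func(*Index) error` parameter used by Reader.Iterate, which likewise visits only the lowest level entries.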

Index

Constants

const (
	// DefaultShardNumThreshold is the default for the NumFiles threshold that must
	// be met before a shard is created.
	DefaultShardNumThreshold = 1000000
	// DefaultShardSizeThreshold is the default for the SizeBytes threshold that must
	// be met before a shard is created.
	DefaultShardSizeThreshold = units.GB
)
const (
	DefaultBatchThreshold = units.MB
)

Variables

var File_internal_storage_fileset_index_index_proto protoreflect.FileDescriptor

Functions

func Generate

func Generate(s string) []string

Generate returns all permutations of the given string, sorted.

func Merge

func Merge(ctx context.Context, storage *chunk.Storage, indexes []*Index, cb func(*Index) error) error

func Perm

func Perm(a []rune, f func([]rune))

Perm calls f with each permutation of a.

func PointsTo

func PointsTo(idx *Index) []chunk.ID

PointsTo returns a list of all the chunks this index references.

func SizeBytes

func SizeBytes(idx *Index) int64

SizeBytes computes the size of the indexed data in bytes.
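To make PointsTo and SizeBytes more concrete, here is a rough sketch over stand-in types. It assumes each data reference carries a chunk identifier and a byte size, which this page does not show; `dataRef`, `index`, `pointsTo`, and `sizeBytes` are hypothetical names, and the sketch only sums sizes for file type entries. It should not be read as the package's implementation.

```go
package main

import "fmt"

// dataRef is a hypothetical stand-in for chunk.DataRef. Its fields are
// assumptions made for this sketch.
type dataRef struct {
	chunkID   string
	sizeBytes int64
}

type fileEntry struct{ dataRefs []dataRef }
type rangeEntry struct{ chunkRef dataRef }

// index mirrors the shape of Index: rng and file are mutually exclusive.
type index struct {
	path string
	rng  *rangeEntry
	file *fileEntry
}

// pointsTo lists the chunk IDs referenced directly by idx.
func pointsTo(idx *index) []string {
	var ids []string
	if idx.rng != nil {
		ids = append(ids, idx.rng.chunkRef.chunkID)
	}
	if idx.file != nil {
		for _, ref := range idx.file.dataRefs {
			ids = append(ids, ref.chunkID)
		}
	}
	return ids
}

// sizeBytes sums the sizes of the data references of a file type entry.
// Range type entries are not handled in this sketch and yield 0.
func sizeBytes(idx *index) int64 {
	var size int64
	if idx.file != nil {
		for _, ref := range idx.file.dataRefs {
			size += ref.sizeBytes
		}
	}
	return size
}

func main() {
	f := &index{path: "/a", file: &fileEntry{dataRefs: []dataRef{{"c1", 10}, {"c2", 32}}}}
	fmt.Println(pointsTo(f), sizeBytes(f))
}
```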

Types

type Cache

type Cache struct {
	// contains filtered or unexported fields
}

func NewCache

func NewCache(storage *chunk.Storage, size int) *Cache

func (*Cache) Get

func (c *Cache) Get(ctx context.Context, chunkRef *chunk.DataRef, filter *pathFilter, w io.Writer) error

type File

type File struct {
	Datum    string           `protobuf:"bytes,1,opt,name=datum,proto3" json:"datum,omitempty"`
	DataRefs []*chunk.DataRef `protobuf:"bytes,2,rep,name=data_refs,json=dataRefs,proto3" json:"data_refs,omitempty"`
	// contains filtered or unexported fields
}

func (*File) Descriptor deprecated

func (*File) Descriptor() ([]byte, []int)

Deprecated: Use File.ProtoReflect.Descriptor instead.

func (*File) GetDataRefs

func (x *File) GetDataRefs() []*chunk.DataRef

func (*File) GetDatum

func (x *File) GetDatum() string

func (*File) MarshalLogObject

func (x *File) MarshalLogObject(enc zapcore.ObjectEncoder) error

func (*File) ProtoMessage

func (*File) ProtoMessage()

func (*File) ProtoReflect added in v2.7.0

func (x *File) ProtoReflect() protoreflect.Message

func (*File) Reset

func (x *File) Reset()

func (*File) String

func (x *File) String() string

func (*File) Validate added in v2.8.0

func (m *File) Validate() error

Validate checks the field values on File with the rules defined in the proto definition for this message. If any rules are violated, the first error encountered is returned, or nil if there are no violations.

func (*File) ValidateAll added in v2.8.0

func (m *File) ValidateAll() error

ValidateAll checks the field values on File with the rules defined in the proto definition for this message. If any rules are violated, the result is a list of violation errors wrapped in FileMultiError, or nil if none found.

type FileMultiError added in v2.8.0

type FileMultiError []error

FileMultiError is an error wrapping multiple validation errors returned by File.ValidateAll() if the designated constraints aren't met.

func (FileMultiError) AllErrors added in v2.8.0

func (m FileMultiError) AllErrors() []error

AllErrors returns a list of validation violation errors.

func (FileMultiError) Error added in v2.8.0

func (m FileMultiError) Error() string

Error returns a concatenation of all the error messages it wraps.

type FileValidationError added in v2.8.0

type FileValidationError struct {
	// contains filtered or unexported fields
}

FileValidationError is the validation error returned by File.Validate if the designated constraints aren't met.

func (FileValidationError) Cause added in v2.8.0

func (e FileValidationError) Cause() error

Cause function returns cause value.

func (FileValidationError) Error added in v2.8.0

func (e FileValidationError) Error() string

Error satisfies the builtin error interface.

func (FileValidationError) ErrorName added in v2.8.0

func (e FileValidationError) ErrorName() string

ErrorName returns error name.

func (FileValidationError) Field added in v2.8.0

func (e FileValidationError) Field() string

Field function returns field value.

func (FileValidationError) Key added in v2.8.0

func (e FileValidationError) Key() bool

Key function returns key value.

func (FileValidationError) Reason added in v2.8.0

func (e FileValidationError) Reason() string

Reason function returns reason value.

type Index

type Index struct {
	Path string `protobuf:"bytes,1,opt,name=path,proto3" json:"path,omitempty"`
	// NOTE: range and file are mutually exclusive.
	Range *Range `protobuf:"bytes,2,opt,name=range,proto3" json:"range,omitempty"`
	File  *File  `protobuf:"bytes,3,opt,name=file,proto3" json:"file,omitempty"`
	// NOTE: num_files and size_bytes did not exist in older versions of 2.x, so
	// they will not be set.
	NumFiles  int64 `protobuf:"varint,4,opt,name=num_files,json=numFiles,proto3" json:"num_files,omitempty"`
	SizeBytes int64 `protobuf:"varint,5,opt,name=size_bytes,json=sizeBytes,proto3" json:"size_bytes,omitempty"`
	// contains filtered or unexported fields
}

Index stores an index to and metadata about a range of files or a file.

func (*Index) Descriptor deprecated

func (*Index) Descriptor() ([]byte, []int)

Deprecated: Use Index.ProtoReflect.Descriptor instead.

func (*Index) GetFile

func (x *Index) GetFile() *File

func (*Index) GetNumFiles

func (x *Index) GetNumFiles() int64

func (*Index) GetPath

func (x *Index) GetPath() string

func (*Index) GetRange

func (x *Index) GetRange() *Range

func (*Index) GetSizeBytes

func (x *Index) GetSizeBytes() int64

func (*Index) MarshalLogObject

func (x *Index) MarshalLogObject(enc zapcore.ObjectEncoder) error

func (*Index) ProtoMessage

func (*Index) ProtoMessage()

func (*Index) ProtoReflect added in v2.7.0

func (x *Index) ProtoReflect() protoreflect.Message

func (*Index) Reset

func (x *Index) Reset()

func (*Index) String

func (x *Index) String() string

func (*Index) Validate added in v2.8.0

func (m *Index) Validate() error

Validate checks the field values on Index with the rules defined in the proto definition for this message. If any rules are violated, the first error encountered is returned, or nil if there are no violations.

func (*Index) ValidateAll added in v2.8.0

func (m *Index) ValidateAll() error

ValidateAll checks the field values on Index with the rules defined in the proto definition for this message. If any rules are violated, the result is a list of violation errors wrapped in IndexMultiError, or nil if none found.

type IndexMultiError added in v2.8.0

type IndexMultiError []error

IndexMultiError is an error wrapping multiple validation errors returned by Index.ValidateAll() if the designated constraints aren't met.

func (IndexMultiError) AllErrors added in v2.8.0

func (m IndexMultiError) AllErrors() []error

AllErrors returns a list of validation violation errors.

func (IndexMultiError) Error added in v2.8.0

func (m IndexMultiError) Error() string

Error returns a concatenation of all the error messages it wraps.

type IndexValidationError added in v2.8.0

type IndexValidationError struct {
	// contains filtered or unexported fields
}

IndexValidationError is the validation error returned by Index.Validate if the designated constraints aren't met.

func (IndexValidationError) Cause added in v2.8.0

func (e IndexValidationError) Cause() error

Cause function returns cause value.

func (IndexValidationError) Error added in v2.8.0

func (e IndexValidationError) Error() string

Error satisfies the builtin error interface.

func (IndexValidationError) ErrorName added in v2.8.0

func (e IndexValidationError) ErrorName() string

ErrorName returns error name.

func (IndexValidationError) Field added in v2.8.0

func (e IndexValidationError) Field() string

Field function returns field value.

func (IndexValidationError) Key added in v2.8.0

func (e IndexValidationError) Key() bool

Key function returns key value.

func (IndexValidationError) Reason added in v2.8.0

func (e IndexValidationError) Reason() string

Reason function returns reason value.

type Option

type Option func(r *Reader)

Option configures an index reader.

func WithDatum

func WithDatum(datum string) Option

WithDatum adds a datum filter that matches a single datum.

func WithPrefix

func WithPrefix(prefix string) Option

WithPrefix sets a prefix filter for the read.

func WithRange

func WithRange(pathRange *PathRange) Option

WithRange sets a range filter for the read.

func WithShardConfig

func WithShardConfig(config *ShardConfig) Option

WithShardConfig sets the sharding configuration.
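Option follows Go's functional options pattern: each With* function returns a closure that mutates the reader while it is being constructed. The sketch below shows the mechanics on stand-in types. `reader`, `option`, `newReader`, and the field names are hypothetical, since the real Reader's fields are unexported and not documented here.

```go
package main

import "fmt"

// reader is a hypothetical stand-in for Reader with a few filter fields.
type reader struct {
	prefix string
	datum  string
}

// option configures a reader, mirroring the shape of Option.
type option func(r *reader)

// withPrefix returns an option that sets a prefix filter.
func withPrefix(prefix string) option {
	return func(r *reader) { r.prefix = prefix }
}

// withDatum returns an option that sets a datum filter.
func withDatum(datum string) option {
	return func(r *reader) { r.datum = datum }
}

// newReader applies the options in order; later options override earlier ones.
func newReader(opts ...option) *reader {
	r := &reader{}
	for _, opt := range opts {
		opt(r)
	}
	return r
}

func main() {
	r := newReader(withPrefix("/logs/"), withDatum("d1"))
	fmt.Printf("%+v\n", *r)
}
```

The variadic `opts ...Option` parameter of NewReader accepts options in the same way, so a reader with no filters is simply constructed with no options.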

type PathRange

type PathRange struct {
	Lower, Upper string
}

PathRange is a range of paths. The range is half-open: Lower is inclusive and Upper is exclusive, [Lower, Upper).
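A membership check for such a range could look like the sketch below. `pathRange` and `contains` are hypothetical stand-ins. The sketch compares paths as plain strings and treats both bounds literally; how the package interprets an empty Lower or Upper is not stated here.

```go
package main

import "fmt"

// pathRange is a stand-in with the same fields as PathRange.
type pathRange struct {
	Lower, Upper string
}

// contains reports whether p falls in [Lower, Upper) under string ordering.
func (r *pathRange) contains(p string) bool {
	return p >= r.Lower && p < r.Upper
}

func main() {
	r := &pathRange{Lower: "/a", Upper: "/c"}
	for _, p := range []string{"/a", "/b/x", "/c"} {
		fmt.Println(p, r.contains(p))
	}
}
```

Because the upper bound is exclusive, adjacent ranges such as [/a, /c) and [/c, /e) cover every path between /a and /e exactly once.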

func (*PathRange) String

func (r *PathRange) String() string

type Range

type Range struct {
	Offset   int64          `protobuf:"varint,1,opt,name=offset,proto3" json:"offset,omitempty"`
	LastPath string         `protobuf:"bytes,2,opt,name=last_path,json=lastPath,proto3" json:"last_path,omitempty"`
	ChunkRef *chunk.DataRef `protobuf:"bytes,3,opt,name=chunk_ref,json=chunkRef,proto3" json:"chunk_ref,omitempty"`
	// contains filtered or unexported fields
}

func (*Range) Descriptor deprecated

func (*Range) Descriptor() ([]byte, []int)

Deprecated: Use Range.ProtoReflect.Descriptor instead.

func (*Range) GetChunkRef

func (x *Range) GetChunkRef() *chunk.DataRef

func (*Range) GetLastPath

func (x *Range) GetLastPath() string

func (*Range) GetOffset

func (x *Range) GetOffset() int64

func (*Range) MarshalLogObject

func (x *Range) MarshalLogObject(enc zapcore.ObjectEncoder) error

func (*Range) ProtoMessage

func (*Range) ProtoMessage()

func (*Range) ProtoReflect added in v2.7.0

func (x *Range) ProtoReflect() protoreflect.Message

func (*Range) Reset

func (x *Range) Reset()

func (*Range) String

func (x *Range) String() string

func (*Range) Validate added in v2.8.0

func (m *Range) Validate() error

Validate checks the field values on Range with the rules defined in the proto definition for this message. If any rules are violated, the first error encountered is returned, or nil if there are no violations.

func (*Range) ValidateAll added in v2.8.0

func (m *Range) ValidateAll() error

ValidateAll checks the field values on Range with the rules defined in the proto definition for this message. If any rules are violated, the result is a list of violation errors wrapped in RangeMultiError, or nil if none found.

type RangeMultiError added in v2.8.0

type RangeMultiError []error

RangeMultiError is an error wrapping multiple validation errors returned by Range.ValidateAll() if the designated constraints aren't met.

func (RangeMultiError) AllErrors added in v2.8.0

func (m RangeMultiError) AllErrors() []error

AllErrors returns a list of validation violation errors.

func (RangeMultiError) Error added in v2.8.0

func (m RangeMultiError) Error() string

Error returns a concatenation of all the error messages it wraps.

type RangeValidationError added in v2.8.0

type RangeValidationError struct {
	// contains filtered or unexported fields
}

RangeValidationError is the validation error returned by Range.Validate if the designated constraints aren't met.

func (RangeValidationError) Cause added in v2.8.0

func (e RangeValidationError) Cause() error

Cause function returns cause value.

func (RangeValidationError) Error added in v2.8.0

func (e RangeValidationError) Error() string

Error satisfies the builtin error interface.

func (RangeValidationError) ErrorName added in v2.8.0

func (e RangeValidationError) ErrorName() string

ErrorName returns error name.

func (RangeValidationError) Field added in v2.8.0

func (e RangeValidationError) Field() string

Field function returns field value.

func (RangeValidationError) Key added in v2.8.0

func (e RangeValidationError) Key() bool

Key function returns key value.

func (RangeValidationError) Reason added in v2.8.0

func (e RangeValidationError) Reason() string

Reason function returns reason value.

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader is used for reading a multilevel index.

func NewReader

func NewReader(chunks *chunk.Storage, cache *Cache, topIdx *Index, opts ...Option) *Reader

NewReader creates a new Reader.

func (*Reader) Iterate

func (r *Reader) Iterate(ctx context.Context, cb func(*Index) error) error

Iterate iterates over the lowest level (file type) indexes.

func (*Reader) Shards

func (r *Reader) Shards(ctx context.Context) ([]*PathRange, error)

Shards creates shards for the index based on the sharding configuration provided to the reader. Sharding takes advantage of the NumFiles and SizeBytes index metadata to efficiently traverse the multilevel index. A subtree is traversed only when a split point exists within it, which we know based on the NumFiles and SizeBytes values at the root of each subtree.
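The traversal idea can be illustrated with a toy tree. The sketch below is not the package's algorithm. It assumes a shard is cut before the next file once either threshold has been met, and it uses an empty string for an unbounded Lower or Upper; both are assumptions. `node`, `sharder`, `shards`, `rangeOf`, and `exampleTree` are hypothetical names.

```go
package main

import "fmt"

type shardConfig struct{ NumFiles, SizeBytes int64 }
type pathRange struct{ Lower, Upper string }

// node is a hypothetical index entry. A node with children is range type and
// its numFiles and sizeBytes summarize everything beneath it.
type node struct {
	path      string
	numFiles  int64
	sizeBytes int64
	children  []*node
}

// sharder accumulates counts for the shard currently being built.
type sharder struct {
	cfg     shardConfig
	files   int64
	size    int64
	lower   string
	shards  []pathRange
	visited int // number of nodes inspected, to show the skipping
}

func (s *sharder) visit(n *node) {
	s.visited++
	if len(n.children) == 0 {
		// File entry: cut a shard before it if a threshold was already met.
		if s.files >= s.cfg.NumFiles || s.size >= s.cfg.SizeBytes {
			s.shards = append(s.shards, pathRange{Lower: s.lower, Upper: n.path})
			s.lower, s.files, s.size = n.path, 0, 0
		}
		s.files++
		s.size += n.sizeBytes
		return
	}
	// Range entry: if the whole subtree fits without meeting a threshold,
	// no split point lies inside it, so absorb its totals without descending.
	if s.files+n.numFiles < s.cfg.NumFiles && s.size+n.sizeBytes < s.cfg.SizeBytes {
		s.files += n.numFiles
		s.size += n.sizeBytes
		return
	}
	for _, c := range n.children {
		s.visit(c)
	}
}

// shards returns the path ranges and the number of nodes inspected.
func shards(top *node, cfg shardConfig) ([]pathRange, int) {
	s := &sharder{cfg: cfg}
	s.visit(top)
	s.shards = append(s.shards, pathRange{Lower: s.lower}) // final, open-ended shard
	return s.shards, s.visited
}

// rangeOf builds a range node over one-byte files at the given paths.
func rangeOf(paths ...string) *node {
	n := &node{path: paths[0]}
	for _, p := range paths {
		n.children = append(n.children, &node{path: p, numFiles: 1, sizeBytes: 1})
		n.numFiles++
		n.sizeBytes++
	}
	return n
}

// exampleTree builds a top range over two ranges of three files each.
func exampleTree() *node {
	left, right := rangeOf("/a", "/b", "/c"), rangeOf("/d", "/e", "/f")
	return &node{path: "/a", numFiles: 6, sizeBytes: 6, children: []*node{left, right}}
}

func main() {
	got, visited := shards(exampleTree(), shardConfig{NumFiles: 4, SizeBytes: 1000})
	fmt.Println(got, visited)
}
```

In this example the first range of three files is absorbed without visiting its children, which is the saving the NumFiles and SizeBytes metadata makes possible.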

type ShardConfig

type ShardConfig struct {
	NumFiles  int64
	SizeBytes int64
}

ShardConfig is a sharding configuration. NumFiles is the number of files to target for each shard. SizeBytes is the size, in bytes, to target for each shard.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer is used for creating a multilevel index into a serialized file set. Each index level is a stream of index entries, each encoded with its length in bytes, stored in chunk storage. Both file and range type indexes can be written to a writer. New levels above the written indexes are created when the serialized indexes reach the batching threshold.

func NewWriter

func NewWriter(ctx context.Context, chunks *chunk.Storage, tmpID string) *Writer

NewWriter creates a new Writer.

func (*Writer) Close

func (w *Writer) Close() (*Index, error)

Close finishes the index and returns the serialized top index level.

func (*Writer) WriteIndex

func (w *Writer) WriteIndex(idx *Index) error

WriteIndex writes an index entry.
