chunk

package
v1.9.4-6277238fa1e72d2...
Published: Sep 10, 2019 License: Apache-2.0 Imports: 17 Imported by: 2

Documentation

Constants

const (
	// MB is Megabytes.
	MB = 1024 * 1024
	// WindowSize is the size of the rolling hash window.
	WindowSize = 64
)

Variables

var (
	ErrInvalidLengthChunk = fmt.Errorf("proto: negative length found during unmarshaling")
	ErrIntOverflowChunk   = fmt.Errorf("proto: integer overflow")
)

Functions

func Cleanup

func Cleanup(objC obj.Client, chunks *Storage)

Cleanup cleans up a local chunk storage instance.

func RandSeq

func RandSeq(n int) []byte

RandSeq generates a random sequence of n bytes.

Types

type Annotation

type Annotation struct {
	Offset      int64
	RefDataRefs []*DataRef
	NextDataRef *DataRef
	Meta        interface{}
}

Annotation is used to associate information with a set of bytes written into the chunk storage layer.

type Chunk

type Chunk struct {
	Hash                 string   `protobuf:"bytes,1,opt,name=hash,proto3" json:"hash,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}

func (*Chunk) Descriptor

func (*Chunk) Descriptor() ([]byte, []int)

func (*Chunk) GetHash

func (m *Chunk) GetHash() string

func (*Chunk) Marshal

func (m *Chunk) Marshal() (dAtA []byte, err error)

func (*Chunk) MarshalTo

func (m *Chunk) MarshalTo(dAtA []byte) (int, error)

func (*Chunk) MarshalToSizedBuffer

func (m *Chunk) MarshalToSizedBuffer(dAtA []byte) (int, error)

func (*Chunk) ProtoMessage

func (*Chunk) ProtoMessage()

func (*Chunk) Reset

func (m *Chunk) Reset()

func (*Chunk) Size

func (m *Chunk) Size() (n int)

func (*Chunk) String

func (m *Chunk) String() string

func (*Chunk) Unmarshal

func (m *Chunk) Unmarshal(dAtA []byte) error

func (*Chunk) XXX_DiscardUnknown

func (m *Chunk) XXX_DiscardUnknown()

func (*Chunk) XXX_Marshal

func (m *Chunk) XXX_Marshal(b []byte, deterministic bool) ([]byte, error)

func (*Chunk) XXX_Merge

func (m *Chunk) XXX_Merge(src proto.Message)

func (*Chunk) XXX_Size

func (m *Chunk) XXX_Size() int

func (*Chunk) XXX_Unmarshal

func (m *Chunk) XXX_Unmarshal(b []byte) error

type DataRef

type DataRef struct {
	// The chunk the referenced data is located in.
	Chunk *Chunk `protobuf:"bytes,1,opt,name=chunk,proto3" json:"chunk,omitempty"`
	// The hash of the data being referenced.
	// This field is empty when it is equal to the chunk hash (the ref is the whole chunk).
	Hash string `protobuf:"bytes,2,opt,name=hash,proto3" json:"hash,omitempty"`
	// The offset and size used for accessing the data within the chunk.
	OffsetBytes          int64    `protobuf:"varint,3,opt,name=offset_bytes,json=offsetBytes,proto3" json:"offset_bytes,omitempty"`
	SizeBytes            int64    `protobuf:"varint,4,opt,name=size_bytes,json=sizeBytes,proto3" json:"size_bytes,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_unrecognized     []byte   `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}

DataRef is a reference to data within a chunk.
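The field comments on DataRef suggest how a reference is resolved: the offset and size select a byte range inside the chunk, and an empty Hash means the reference covers the whole chunk. The following is a minimal sketch of that idea, not the package's implementation. The dataRef type, resolve, and effectiveHash are hypothetical stand-ins that only mirror the documented fields.

```go
package main

import "fmt"

// dataRef is a hypothetical stand-in that mirrors the documented DataRef fields.
type dataRef struct {
	ChunkHash   string // stands in for Chunk.Hash
	Hash        string // empty when the reference is the whole chunk
	OffsetBytes int64
	SizeBytes   int64
}

// resolve returns the bytes that ref points at within the given chunk contents.
func resolve(chunk []byte, ref dataRef) ([]byte, error) {
	end := ref.OffsetBytes + ref.SizeBytes
	if ref.OffsetBytes < 0 || ref.SizeBytes < 0 || end > int64(len(chunk)) {
		return nil, fmt.Errorf("data ref [%d, %d) is out of range for a chunk of %d bytes",
			ref.OffsetBytes, end, len(chunk))
	}
	return chunk[ref.OffsetBytes:end], nil
}

// effectiveHash returns the hash that identifies the referenced data:
// the reference's own hash when set, otherwise the chunk hash.
func effectiveHash(ref dataRef) string {
	if ref.Hash != "" {
		return ref.Hash
	}
	return ref.ChunkHash
}

func main() {
	chunk := []byte("hello, chunk storage")
	ref := dataRef{ChunkHash: "abc", OffsetBytes: 7, SizeBytes: 5}
	data, err := resolve(chunk, ref)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s\n", data, effectiveHash(ref))
}
```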

func (*DataRef) Descriptor

func (*DataRef) Descriptor() ([]byte, []int)

func (*DataRef) GetChunk

func (m *DataRef) GetChunk() *Chunk

func (*DataRef) GetHash

func (m *DataRef) GetHash() string

func (*DataRef) GetOffsetBytes

func (m *DataRef) GetOffsetBytes() int64

func (*DataRef) GetSizeBytes

func (m *DataRef) GetSizeBytes() int64

func (*DataRef) Marshal

func (m *DataRef) Marshal() (dAtA []byte, err error)

func (*DataRef) MarshalTo

func (m *DataRef) MarshalTo(dAtA []byte) (int, error)

func (*DataRef) MarshalToSizedBuffer

func (m *DataRef) MarshalToSizedBuffer(dAtA []byte) (int, error)

func (*DataRef) ProtoMessage

func (*DataRef) ProtoMessage()

func (*DataRef) Reset

func (m *DataRef) Reset()

func (*DataRef) Size

func (m *DataRef) Size() (n int)

func (*DataRef) String

func (m *DataRef) String() string

func (*DataRef) Unmarshal

func (m *DataRef) Unmarshal(dAtA []byte) error

func (*DataRef) XXX_DiscardUnknown

func (m *DataRef) XXX_DiscardUnknown()

func (*DataRef) XXX_Marshal

func (m *DataRef) XXX_Marshal(b []byte, deterministic bool) ([]byte, error)

func (*DataRef) XXX_Merge

func (m *DataRef) XXX_Merge(src proto.Message)

func (*DataRef) XXX_Size

func (m *DataRef) XXX_Size() int

func (*DataRef) XXX_Unmarshal

func (m *DataRef) XXX_Unmarshal(b []byte) error

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader reads a set of DataRefs from chunk storage.

func (*Reader) Close

func (r *Reader) Close() error

Close closes the reader. Currently a no-op, but will be used when streaming is implemented.

func (*Reader) Len

func (r *Reader) Len() int64

Len returns the number of bytes left.

func (*Reader) NextRange

func (r *Reader) NextRange(dataRefs []*DataRef)

NextRange sets the next range for the reader.

func (*Reader) Read

func (r *Reader) Read(data []byte) (int, error)

Read reads from the byte stream produced by the set of DataRefs.
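To illustrate what "the byte stream produced by the set of DataRefs" could look like, here is a rough sketch that concatenates the referenced ranges into one io.Reader. It assumes an in-memory map of chunk contents keyed by hash. The dataRef type, newRefReader, and readRefs are hypothetical names, and the real Reader may fetch and buffer chunks quite differently.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// dataRef is a hypothetical stand-in that mirrors the documented DataRef fields.
type dataRef struct {
	ChunkHash   string
	OffsetBytes int64
	SizeBytes   int64
}

// newRefReader returns a reader over the referenced ranges, in order.
// chunks is an assumed in-memory lookup from chunk hash to chunk contents.
func newRefReader(chunks map[string][]byte, refs []dataRef) (io.Reader, error) {
	readers := make([]io.Reader, 0, len(refs))
	for _, ref := range refs {
		chunk, ok := chunks[ref.ChunkHash]
		if !ok {
			return nil, fmt.Errorf("chunk %q not found", ref.ChunkHash)
		}
		end := ref.OffsetBytes + ref.SizeBytes
		if ref.OffsetBytes < 0 || ref.SizeBytes < 0 || end > int64(len(chunk)) {
			return nil, fmt.Errorf("data ref out of range for chunk %q", ref.ChunkHash)
		}
		readers = append(readers, bytes.NewReader(chunk[ref.OffsetBytes:end]))
	}
	return io.MultiReader(readers...), nil
}

// readRefs drains the reader built from refs and returns the bytes as a string.
func readRefs(chunks map[string][]byte, refs []dataRef) (string, error) {
	r, err := newRefReader(chunks, refs)
	if err != nil {
		return "", err
	}
	data, err := io.ReadAll(r)
	return string(data), err
}

func main() {
	chunks := map[string][]byte{
		"a": []byte("hello world"),
		"b": []byte("chunked data"),
	}
	refs := []dataRef{
		{ChunkHash: "a", OffsetBytes: 0, SizeBytes: 6},
		{ChunkHash: "b", OffsetBytes: 8, SizeBytes: 4},
	}
	out, err := readRefs(chunks, refs)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```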

func (*Reader) WriteToN

func (r *Reader) WriteToN(w *Writer, n int64) error

WriteToN writes n bytes from the reader to the passed-in writer. When full chunks are being written to the writer, these writes are data reference copies.

type ReaderFunc

type ReaderFunc func() ([]*DataRef, error)

ReaderFunc is a callback that returns the next set of data references to a reader.

type Storage

type Storage struct {
	// contains filtered or unexported fields
}

Storage is the abstraction that manages chunk storage.

func LocalStorage

func LocalStorage(tb testing.TB) (obj.Client, *Storage)

LocalStorage creates a local chunk storage instance. Useful for storage layer tests.

func NewStorage

func NewStorage(objC obj.Client) *Storage

NewStorage creates a new Storage.

func (*Storage) DeleteAll

func (s *Storage) DeleteAll(ctx context.Context) error

DeleteAll deletes all of the chunks in object storage.

func (*Storage) List

func (s *Storage) List(ctx context.Context, f func(string) error) error

List lists all of the chunks in object storage.
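The callback-style signature of List lends itself to collecting or filtering results on the caller's side. The sketch below shows one way a caller might do that against any type with the documented List signature. The chunkLister interface, memLister, collect, firstN, and demo are hypothetical. Two further assumptions: the string passed to the callback is treated as an opaque chunk identifier, and returning an error from the callback is assumed to stop the listing.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
)

// chunkLister captures only the documented List signature.
type chunkLister interface {
	List(ctx context.Context, f func(string) error) error
}

// memLister is a hypothetical in-memory implementation used for illustration.
type memLister map[string]struct{}

// List calls f once per identifier in sorted order and stops at the first error.
func (m memLister) List(ctx context.Context, f func(string) error) error {
	ids := make([]string, 0, len(m))
	for id := range m {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	for _, id := range ids {
		if err := ctx.Err(); err != nil {
			return err
		}
		if err := f(id); err != nil {
			return err
		}
	}
	return nil
}

// collect gathers every listed identifier.
func collect(ctx context.Context, l chunkLister) ([]string, error) {
	var out []string
	err := l.List(ctx, func(id string) error {
		out = append(out, id)
		return nil
	})
	return out, err
}

// errStop is a sentinel used to end a listing early.
var errStop = errors.New("stop listing")

// firstN gathers at most n identifiers, ending the listing early via errStop.
func firstN(ctx context.Context, l chunkLister, n int) ([]string, error) {
	var out []string
	err := l.List(ctx, func(id string) error {
		if len(out) == n {
			return errStop
		}
		out = append(out, id)
		return nil
	})
	if errors.Is(err, errStop) {
		err = nil
	}
	return out, err
}

// demo runs both helpers against a small in-memory lister.
func demo() (all, first []string, err error) {
	l := memLister{"b": {}, "a": {}, "c": {}}
	ctx := context.Background()
	if all, err = collect(ctx, l); err != nil {
		return nil, nil, err
	}
	first, err = firstN(ctx, l, 2)
	return all, first, err
}

func main() {
	all, first, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(all, first)
}
```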

func (*Storage) NewReader

func (s *Storage) NewReader(ctx context.Context, f ...ReaderFunc) *Reader

NewReader creates an io.ReadCloser for a chunk. Note (bryce): the whole chunk is held in memory right now. This could be a problem with concurrency, particularly in the merge process. Concurrency may need to be handled here (for example, by passing in multiple data refs).

func (*Storage) NewWriter

func (s *Storage) NewWriter(ctx context.Context, averageBits int, f WriterFunc) *Writer

NewWriter creates an io.WriteCloser for a stream of bytes to be chunked. Chunks are created based on the content, then hashed and deduplicated/uploaded to object storage. The callback f is a WriterFunc: its arguments are a data reference to the chunk and the annotations within the chunk.
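The "hashed and deduplicated/uploaded" step can be pictured as a content-addressed store: a chunk is keyed by the hash of its bytes and uploaded only if that key is not already present. The sketch below is an illustration under assumptions. memStore and put are hypothetical, the store is an in-memory map rather than object storage, and SHA-256 is chosen for the example because the documentation does not state which hash function the package uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// memStore is a hypothetical in-memory stand-in for object storage.
type memStore struct {
	objects map[string][]byte
}

func newMemStore() *memStore {
	return &memStore{objects: make(map[string][]byte)}
}

// put stores chunk under its content hash and reports whether an upload
// was needed. SHA-256 is an assumption made for this example.
func (s *memStore) put(chunk []byte) (hash string, uploaded bool) {
	sum := sha256.Sum256(chunk)
	hash = hex.EncodeToString(sum[:])
	if _, ok := s.objects[hash]; ok {
		return hash, false // already stored: deduplicated
	}
	s.objects[hash] = append([]byte(nil), chunk...) // copy to avoid aliasing the caller's slice
	return hash, true
}

func main() {
	s := newMemStore()
	h1, up1 := s.put([]byte("same bytes"))
	h2, up2 := s.put([]byte("same bytes"))
	fmt.Println(h1 == h2, up1, up2, len(s.objects))
}
```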

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer splits a byte stream into content-defined chunks that are hashed and deduplicated/uploaded to object storage. Chunk split points are determined by a bit pattern in a rolling hash function (buzhash64 at https://github.com/chmduquesne/rollinghash).
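The general technique of content-defined chunking can be sketched as follows. This is not the package's algorithm: it swaps in a simple polynomial rolling hash in place of buzhash64, and the choice of "top averageBits bits are zero" as the split pattern is an assumption. The names split, concat, sharedChunks, and pseudoRandom are hypothetical. The sketch assumes 0 <= averageBits <= 64. With this pattern a boundary is expected roughly every 2^averageBits bytes on random input.

```go
package main

import "fmt"

const (
	windowSize = 64            // mirrors the documented WindowSize
	multiplier = 1099511628211 // odd multiplier for the polynomial hash (illustrative choice)
)

// split cuts data wherever the hash of the trailing windowSize bytes has
// its top averageBits bits equal to zero. Arithmetic wraps modulo 2^64.
func split(data []byte, averageBits int) [][]byte {
	// outPow = multiplier^(windowSize-1): the weight of the byte leaving the window.
	outPow := uint64(1)
	for i := 0; i < windowSize-1; i++ {
		outPow *= multiplier
	}
	shift := uint(64 - averageBits)
	var chunks [][]byte
	var hash uint64
	start := 0
	for i, b := range data {
		if i >= windowSize {
			hash -= uint64(data[i-windowSize]) * outPow // drop the outgoing byte
		}
		hash = hash*multiplier + uint64(b) // take in the incoming byte
		if i+1 >= windowSize && hash>>shift == 0 {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:]) // trailing bytes form the last chunk
	}
	return chunks
}

// concat joins chunks back into a single byte slice.
func concat(chunks [][]byte) []byte {
	var out []byte
	for _, c := range chunks {
		out = append(out, c...)
	}
	return out
}

// sharedChunks counts the chunks of a whose contents also appear in b.
func sharedChunks(a, b [][]byte) int {
	seen := make(map[string]bool, len(b))
	for _, c := range b {
		seen[string(c)] = true
	}
	n := 0
	for _, c := range a {
		if seen[string(c)] {
			n++
		}
	}
	return n
}

// pseudoRandom returns n deterministic bytes from a xorshift generator (seed must be nonzero).
func pseudoRandom(n int, seed uint64) []byte {
	out := make([]byte, n)
	x := seed
	for i := range out {
		x ^= x << 13
		x ^= x >> 7
		x ^= x << 17
		out[i] = byte(x >> 32)
	}
	return out
}

func main() {
	data := pseudoRandom(1<<16, 1)
	shifted := append([]byte{0xff}, data...) // same content with one byte prepended
	a, b := split(data, 10), split(shifted, 10)
	fmt.Printf("%d chunks, %d shared after a one-byte insert\n", len(a), sharedChunks(a, b))
}
```

Because each split decision depends only on the last windowSize bytes, prepending a byte changes at most the first chunk in this sketch. That realignment is what makes content-defined chunks deduplicate well compared with fixed-size blocks.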

func (*Writer) Annotate

func (w *Writer) Annotate(a *Annotation)

Annotate associates an annotation with the next set of bytes that are written.
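One way to picture annotation tracking is a writer that records where each annotated run of bytes begins. The sketch below is a loose illustration only. The annotation and annotatedBuffer types are hypothetical, the interpretation of Offset as "position in the stream where the annotated bytes start" is an assumption, and the real Writer also has to relate annotations to chunk boundaries, which is omitted here.

```go
package main

import (
	"bytes"
	"fmt"
)

// annotation is a hypothetical, reduced stand-in for the documented Annotation.
type annotation struct {
	Offset int64       // assumed meaning: where the annotated bytes begin
	Meta   interface{} // caller-supplied information
}

// annotatedBuffer records annotations alongside the bytes written after them.
type annotatedBuffer struct {
	buf         bytes.Buffer
	annotations []*annotation
}

// Annotate associates a with the bytes written from this point on.
func (w *annotatedBuffer) Annotate(a *annotation) {
	a.Offset = int64(w.buf.Len())
	w.annotations = append(w.annotations, a)
}

// Write appends data to the buffer.
func (w *annotatedBuffer) Write(p []byte) (int, error) {
	return w.buf.Write(p)
}

// AnnotatedBytesSize returns how many bytes were written under the current annotation.
func (w *annotatedBuffer) AnnotatedBytesSize() int64 {
	if len(w.annotations) == 0 {
		return 0
	}
	last := w.annotations[len(w.annotations)-1]
	return int64(w.buf.Len()) - last.Offset
}

// demo writes two annotated runs and reports the recorded offsets and current size.
func demo() (offsets []int64, size int64) {
	w := &annotatedBuffer{}
	w.Annotate(&annotation{Meta: "file-a"})
	_, _ = w.Write([]byte("aaaa"))
	w.Annotate(&annotation{Meta: "file-b"})
	_, _ = w.Write([]byte("bbbbbb"))
	for _, a := range w.annotations {
		offsets = append(offsets, a.Offset)
	}
	return offsets, w.AnnotatedBytesSize()
}

func main() {
	offsets, size := demo()
	fmt.Println(offsets, size)
}
```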

func (*Writer) AnnotatedBytesSize

func (w *Writer) AnnotatedBytesSize() int64

AnnotatedBytesSize returns the size of the bytes for the current annotation.

func (*Writer) ChunkCount

func (w *Writer) ChunkCount() int64

ChunkCount returns a count of the number of chunks created/referenced by the writer.

func (*Writer) Close

func (w *Writer) Close() error

Close closes the writer, flushes the remaining bytes to a chunk, and finishes the final range.

func (*Writer) Write

func (w *Writer) Write(data []byte) (int, error)

Write rolls through the data written, calling the writer's callback (the WriterFunc passed to NewWriter) when a chunk is found. Note: when making changes to this function, be wary of the performance implications (check performance before and after with the chunker benchmarks).

type WriterFunc

type WriterFunc func(*DataRef, []*Annotation) error

WriterFunc is a callback that receives a data reference to the next chunk and the annotations within the chunk.
