chunker

package
v0.5.2
Warning

This package is not in the latest version of its module.

Published: Jan 23, 2023 License: Apache-2.0, MPL-2.0 Imports: 3 Imported by: 0

Documentation

Types

type Chunk

type Chunk struct {
	// Offset is the number of bytes from the start of the reader to the beginning of
	// the chunk.
	Offset int64

	// Length is the length of the chunk in bytes. Same as len(Data).
	Length int64

	// Data is the chunk data.
	Data []byte

	// Fingerprint is the value of the rolling hash algorithm for the chunk data.
	Fingerprint uint64
}

Chunk stores a content-defined chunk returned by a Chunker.

type Chunker

type Chunker struct {
	// contains filtered or unexported fields
}

Chunker implements the FastCDC content-defined chunking algorithm. See https://www.usenix.org/system/files/conference/atc16/atc16-paper-xia.pdf.

func NewChunker

func NewChunker(rd io.Reader, opts Options) (*Chunker, error)

NewChunker returns a Chunker that reads from rd and is configured with the given Options.

func (*Chunker) Next

func (c *Chunker) Next() (Chunk, error)

Next returns the next Chunk from the reader, or io.EOF after the last chunk has been read. The chunk data is invalidated when Next is called again, so copy Data if it must outlive that call.
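The read loop implied by Next can be sketched as follows. This is a minimal illustration, not the package's own code: the local Chunk type only mirrors the documented fields, and chunkSource, collect, and reusingSource are hypothetical names introduced here so the example compiles without the real module. With the real package, the value returned by NewChunker would take the place of the fake source.

```go
package main

import (
	"errors"
	"io"
)

// Chunk is a local stand-in mirroring the documented fields; with the real
// package you would use its Chunk type instead.
type Chunk struct {
	Offset      int64
	Length      int64
	Data        []byte
	Fingerprint uint64
}

// chunkSource is a hypothetical interface with the same method shape as
// (*Chunker).Next.
type chunkSource interface {
	Next() (Chunk, error)
}

// collect drains src until io.EOF and returns a private copy of every
// chunk's data.
func collect(src chunkSource) ([][]byte, error) {
	var out [][]byte
	for {
		c, err := src.Next()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		// Copy before the next call: Data is only valid until Next runs again.
		out = append(out, append([]byte(nil), c.Data...))
	}
}

// reusingSource is a fake source that hands out slices of one shared buffer,
// imitating the invalidation behaviour described in the documentation.
type reusingSource struct {
	parts  []string
	buf    []byte
	offset int64
}

// Next returns the next part as a Chunk backed by the shared buffer.
func (s *reusingSource) Next() (Chunk, error) {
	if len(s.parts) == 0 {
		return Chunk{}, io.EOF
	}
	s.buf = append(s.buf[:0], s.parts[0]...)
	s.parts = s.parts[1:]
	c := Chunk{Offset: s.offset, Length: int64(len(s.buf)), Data: s.buf}
	s.offset += c.Length
	return c, nil
}
```

Calling collect(&reusingSource{parts: []string{"abc", "de", "fgh"}}) yields three independent byte slices. Without the copy inside collect, earlier entries would be overwritten as the fake source reuses its buffer.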

type Options

type Options struct {
	// AverageSize is the target chunk size. Typically a power of 2. It must be in the
	// range 64B to 1GiB.
	AverageSize int64

	// (Optional) MinSize is the minimum allowed chunk size. By default, it's set to
	// AverageSize / 4.
	MinSize int64

	// (Optional) MaxSize is the maximum allowed chunk size. By default, it's set to
	// AverageSize * 4.
	MaxSize int64

	// (Optional) Sets the chunk normalization level. It may be set to 1, 2 or 3,
	// unless DisableNormalization is set, in which case it's ignored. By default,
	// it's set to 2.
	Normalization int64

	// (Optional) DisableNormalization turns normalization off. By default, it's set to
	// false.
	DisableNormalization bool

	// (Optional) Seed alters the lookup table of the rolling hash algorithm to mitigate
	// chunk-size based fingerprinting attacks. It may be set to a random uint64.
	Seed uint64

	// (Optional) BufSize is the size of the internal buffer used while chunking. It has
	// no effect on the chunking output, but performance is improved with larger buffers.
	// It must be at least MaxSize. Recommended values are 1 to 3 times MaxSize. By
	// default it is set to MaxSize * 2.
	BufSize int64
}

Options configures a Chunker.
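The defaults and constraints described in the field comments can be worked through in a small sketch. It is an illustration of the documented rules only, not the package's implementation: the local Options type mirrors the documented fields, resolve and randomSeed are hypothetical helpers, and treating a zero value as "unset" is an assumption.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"errors"
)

// Options is a local stand-in that mirrors the documented fields.
type Options struct {
	AverageSize          int64
	MinSize              int64
	MaxSize              int64
	Normalization        int64
	DisableNormalization bool
	Seed                 uint64
	BufSize              int64
}

// resolve fills unset (zero) fields with the documented defaults and checks
// the documented constraints. Treating zero as "unset" is an assumption.
func resolve(o Options) (Options, error) {
	if o.AverageSize < 64 || o.AverageSize > 1<<30 {
		return o, errors.New("options: AverageSize must be in the range 64B to 1GiB")
	}
	if o.MinSize == 0 {
		o.MinSize = o.AverageSize / 4
	}
	if o.MaxSize == 0 {
		o.MaxSize = o.AverageSize * 4
	}
	// Normalization is ignored when DisableNormalization is set.
	if !o.DisableNormalization {
		if o.Normalization == 0 {
			o.Normalization = 2
		}
		if o.Normalization < 1 || o.Normalization > 3 {
			return o, errors.New("options: Normalization must be 1, 2 or 3")
		}
	}
	if o.BufSize == 0 {
		o.BufSize = o.MaxSize * 2
	}
	if o.BufSize < o.MaxSize {
		return o, errors.New("options: BufSize must be at least MaxSize")
	}
	return o, nil
}

// randomSeed draws a uint64 from the operating system's CSPRNG, one way to
// obtain the random Seed value the documentation mentions.
func randomSeed() (uint64, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return 0, err
	}
	return binary.LittleEndian.Uint64(b[:]), nil
}
```

For example, resolve(Options{AverageSize: 1 << 20}) gives MinSize 256 KiB, MaxSize 4 MiB, Normalization 2, and BufSize 8 MiB, matching the documented defaults for a 1 MiB average.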
