# chunker

package
Version: v0.1.0 Latest Latest

This package is not in the latest version of its module.

Go to latest
Published: Aug 21, 2015 License: BSD-2-Clause

## Documentation ¶

### Overview ¶

Package chunker implements Content Defined Chunking (CDC) based on a rolling Rabin Checksum.

#### Choosing a Random Irreducible Polynomial ¶

The function RandomPolynomial() returns a new random polynomial of degree 53 for use with the chunker. The degree 53 is chosen because it is the largest prime below 64-8 = 56, so that the top 8 bits of an uint64 can be used for optimising calculations in the chunker.

A random polynomial is chosen selecting 64 random bits, masking away bits 64..54 and setting bit 53 to one (otherwise the polynomial is not of the desired degree) and bit 0 to one (otherwise the polynomial is trivially reducible), so that 51 bits are chosen at random.

This process is repeated until Irreducible() returns true, then this polynomials is returned. If this doesn't happen after 1 million tries, the function returns an error. The probability for selecting an irreducible polynomial at random is about 7.5% ( (2^53-2)/53 / 2^51), so the probability that no irreducible polynomial has been found after 100 tries is lower than 0.04%.

#### Verifying Irreducible Polynomials ¶

During development the results have been verified using the computational discrete algebra system GAP, which can be obtained from the website at http://www.gap-system.org/.

For filtering a given list of polynomials in hexadecimal coefficient notation, the following script can be used:

```# create x over F_2 = GF(2)
x := Indeterminate(GF(2), "x");

# test if polynomial is irreducible, i.e. the number of factors is one
IrredPoly := function (poly)
return (Length(Factors(poly)) = 1);
end;;

# create a polynomial in x from the hexadecimal representation of the
# coefficients
Hex2Poly := function (s)
return ValuePol(CoefficientsQadic(IntHexString(s), 2), x);
end;;

# list of candidates, in hex
candidates := [ "3DA3358B4DC173" ];

# create real polynomials
L := List(candidates, Hex2Poly);

# filter and display the list of irreducible polynomials contained in L
Display(Filtered(L, x -> (IrredPoly(x))));
```

All irreducible polynomials from the list are written to the output.

#### Background Literature ¶

An introduction to Rabin Fingerprints/Checksums can be found in the following articles:

Michael O. Rabin (1981): "Fingerprinting by Random Polynomials" http://www.xmailserver.org/rabin.pdf

Ross N. Williams (1993): "A Painless Guide to CRC Error Detection Algorithms" http://www.zlib.net/crc_v3.txt

Andrei Z. Broder (1993): "Some Applications of Rabin's Fingerprinting Method" http://www.xmailserver.org/rabin_apps.pdf

Shuhong Gao and Daniel Panario (1997): "Tests and Constructions of Irreducible Polynomials over Finite Fields" http://www.math.clemson.edu/~sgao/papers/GP97a.pdf

Andrew Kadatch, Bob Jenkins (2007): "Everything we know about CRC but afraid to forget" http://crcutil.googlecode.com/files/crc-doc.1.0.pdf

### Constants ¶

View Source
```const (
KiB = 1024
MiB = 1024 * KiB

// MinSize is the minimal size of a chunk.
MinSize = 512 * KiB
// MaxSize is the maximal size of a chunk.
MaxSize = 8 * MiB
)```

### Variables ¶

This section is empty.

### Functions ¶

This section is empty.

### Types ¶

#### type Chunk ¶

```type Chunk struct {
Start  uint
Length uint
Cut    uint64
Digest []byte
}```

Chunk is one content-dependent chunk of bytes whose end was cut when the Rabin Fingerprint had the value stored in Cut.

#### func (Chunk) Reader ¶

`func (c Chunk) Reader(r io.ReaderAt) io.Reader`

#### type Chunker ¶

```type Chunker struct {
// contains filtered or unexported fields
}```

Chunker splits content with Rabin Fingerprints.

#### func New ¶

`func New(rd io.Reader, pol Pol, h hash.Hash) *Chunker`

New returns a new Chunker based on polynomial p that reads from rd with bufsize and pass all data to hash along the way.

#### func (*Chunker) Next ¶

`func (c *Chunker) Next() (*Chunk, error)`

Next returns the position and length of the next chunk of data. If an error occurs while reading, the error is returned with a nil chunk. The state of the current chunk is undefined. When the last chunk has been returned, all subsequent calls yield a nil chunk and an io.EOF error.

#### type Pol ¶

`type Pol uint64`

Pol is a polynomial from F_2[X].

#### func RandomPolynomial ¶

`func RandomPolynomial() (Pol, error)`

RandomPolynomial returns a new random irreducible polynomial of degree 53 (largest prime number below 64-8). There are (2^53-2/53) irreducible polynomials of degree 53 in F_2[X], c.f. Michael O. Rabin (1981): "Fingerprinting by Random Polynomials", page 4. If no polynomial could be found in one million tries, an error is returned.

#### func (Pol) Add ¶

`func (x Pol) Add(y Pol) Pol`

#### func (Pol) Deg ¶

`func (x Pol) Deg() int`

Deg returns the degree of the polynomial x. If x is zero, -1 is returned.

#### func (Pol) Div ¶

`func (x Pol) Div(d Pol) Pol`

Div returns the integer division result x / d.

#### func (Pol) DivMod ¶

`func (x Pol) DivMod(d Pol) (Pol, Pol)`

DivMod returns x / d = q, and remainder r, see https://en.wikipedia.org/wiki/Division_algorithm

#### func (Pol) Expand ¶

`func (x Pol) Expand() string`

Expand returns the string representation of the polynomial x.

#### func (Pol) GCD ¶

`func (x Pol) GCD(f Pol) Pol`

GCD computes the Greatest Common Divisor x and f.

#### func (Pol) Irreducible ¶

`func (x Pol) Irreducible() bool`

Irreducible returns true iff x is irreducible over F_2. This function uses Ben Or's reducibility test.

For details see "Tests and Constructions of Irreducible Polynomials over Finite Fields".

#### func (Pol) MarshalJSON ¶

`func (p Pol) MarshalJSON() ([]byte, error)`

#### func (Pol) Mod ¶

`func (x Pol) Mod(d Pol) Pol`

Mod returns the remainder of x / d

#### func (Pol) Mul ¶

`func (x Pol) Mul(y Pol) Pol`

Mul returns x*y. When an overflow occurs, Mul panics.

#### func (Pol) MulMod ¶

`func (x Pol) MulMod(f, g Pol) Pol`

MulMod computes x*f mod g

#### func (Pol) String ¶

`func (x Pol) String() string`

String returns the coefficients in hex.

#### func (*Pol) UnmarshalJSON ¶

`func (p *Pol) UnmarshalJSON(data []byte) error`