Documentation
Overview ¶
Package lz provides encoders and decoders for LZ77 sequences. A sequence, as described in the zstd specification, describes a number of literal bytes and a match.
A Sequencer is an encoder that converts a byte stream into blocks of sequences. A Decoder converts the block of sequences into the original decompressed byte stream. A wrapped Sequencer reads the byte stream from a reader. The sequencers are provided here separately because they are more efficient for encoding byte slices directly.
The module provides multiple sequencers that offer different trade-offs between encoding speed and compression ratio. Usually a slower sequencer achieves a better compression ratio.
The Decoder slides the decompression window through a larger buffer implemented by DecBuffer.
Index ¶
- Constants
- Variables
- type BDHSConfig
- type BHSConfig
- type BTreeConfig
- type BTreeHashConfig
- type BUHSConfig
- type Block
- type CostEstimator
- type DConfig
- type DHSConfig
- type DecBuffer
- func (buf *DecBuffer) Available() int
- func (buf *DecBuffer) ByteAtEnd(i int) byte
- func (buf *DecBuffer) Init(windowSize, max int) error
- func (buf *DecBuffer) Len() int
- func (buf *DecBuffer) Pos() int64
- func (buf *DecBuffer) Read(p []byte) (n int, err error)
- func (buf *DecBuffer) Reset()
- func (buf *DecBuffer) Write(p []byte) (n int, err error)
- func (buf *DecBuffer) WriteBlock(blk Block) (k, l, n int, err error)
- func (buf *DecBuffer) WriteByte(c byte) error
- func (buf *DecBuffer) WriteMatch(n, offset int) error
- func (buf *DecBuffer) WriteTo(w io.Writer) (n int64, err error)
- type Decoder
- type GSASConfig
- type GenericSequencerConfig
- type HSConfig
- type HashConfig
- type MatchFinder
- type MatchFinderConfig
- type Params
- type SBConfig
- type Seq
- type SeqBuffer
- func (w *SeqBuffer) Available() int
- func (w *SeqBuffer) Buffer() *SeqBuffer
- func (w *SeqBuffer) Buffered() int
- func (w *SeqBuffer) Init(cfg SBConfig) error
- func (w *SeqBuffer) Len() int
- func (w *SeqBuffer) Pos() int64
- func (w *SeqBuffer) ReadAt(p []byte, pos int64) (n int, err error)
- func (w *SeqBuffer) ReadByteAt(pos int64) (c byte, err error)
- func (w *SeqBuffer) ReadFrom(r io.Reader) (n int64, err error)
- func (w *SeqBuffer) Reset(data []byte) error
- func (w *SeqBuffer) Write(p []byte) (n int, err error)
- type SeqConfig
- type Sequencer
- type SimpleEstimator
- type WrappedSequencer
Constants ¶
const (
// NoTrailingLiterals tells a sequencer that trailing literals don't
// need to be included in the block.
NoTrailingLiterals = 1 << iota
)
Flags for the Sequence function.
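The flag is a bit value, so callers would typically combine and test it with bitwise operators. The following is a minimal sketch that redeclares the constant locally; the helper wantsTrailingLiterals is hypothetical and not part of the package.

```go
package main

import "fmt"

// Local copy of the flag so that the sketch compiles without importing
// the package.
const (
	NoTrailingLiterals = 1 << iota
)

// wantsTrailingLiterals is a hypothetical helper. It reports whether the
// caller expects trailing literals to be included in the block.
func wantsTrailingLiterals(flags int) bool {
	return flags&NoTrailingLiterals == 0
}

func main() {
	fmt.Println(wantsTrailingLiterals(0))
	fmt.Println(wantsTrailingLiterals(NoTrailingLiterals))
}
```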
Variables ¶
var ErrEmptyBuffer = errors.New("lz: empty buffer")
ErrEmptyBuffer indicates that the buffer is empty and no more data can be read or processed. More data must be provided to the buffer.
var ErrFullBuffer = errors.New("lz: full buffer")
ErrFullBuffer indicates that the buffer is full and no further data can be written.
Functions ¶
This section is empty.
Types ¶
type BDHSConfig ¶
type BDHSConfig struct {
// sequence buffer configuration
SBConfig
// smaller hash input length; range 2 to 8
InputLen1 int
// hash bits for the smaller hash input length
HashBits1 int
// larger input length; range 2 to 8
InputLen2 int
// hash bits for the larger hash input length
HashBits2 int
}
BDHSConfig provides the configuration parameters for the backward-looking double hash sequencer.
func (*BDHSConfig) ApplyDefaults ¶
func (cfg *BDHSConfig) ApplyDefaults()
ApplyDefaults sets the zero fields in the configuration to their default values.
func (BDHSConfig) NewSequencer ¶ added in v0.1.0
func (cfg BDHSConfig) NewSequencer() (s Sequencer, err error)
NewSequencer creates a new BackwardDoubleHashSequencer.
func (*BDHSConfig) Verify ¶
func (cfg *BDHSConfig) Verify() error
Verify checks the configuration for errors.
type BHSConfig ¶
type BHSConfig struct {
SBConfig
// number of bits of the hash index
HashBits int
// length of the input used; range [2,8]
InputLen int
}
BHSConfig provides the parameters for the backward hash sequencer.
func (*BHSConfig) ApplyDefaults ¶
func (cfg *BHSConfig) ApplyDefaults()
ApplyDefaults sets values that are zero to their default values.
func (BHSConfig) NewSequencer ¶ added in v0.1.0
NewSequencer creates a new backward hash sequencer.
type BTreeConfig ¶ added in v0.1.1
func (*BTreeConfig) ApplyDefaults ¶ added in v0.1.1
func (cfg *BTreeConfig) ApplyDefaults()
func (*BTreeConfig) NewMatchFinder ¶ added in v0.1.1
func (cfg *BTreeConfig) NewMatchFinder() (mf MatchFinder, err error)
func (*BTreeConfig) Verify ¶ added in v0.1.1
func (cfg *BTreeConfig) Verify() error
type BTreeHashConfig ¶ added in v0.1.1
func (*BTreeHashConfig) ApplyDefaults ¶ added in v0.1.1
func (cfg *BTreeHashConfig) ApplyDefaults()
func (*BTreeHashConfig) NewMatchFinder ¶ added in v0.1.1
func (cfg *BTreeHashConfig) NewMatchFinder() (mf MatchFinder, err error)
func (*BTreeHashConfig) Verify ¶ added in v0.1.1
func (cfg *BTreeHashConfig) Verify() error
type BUHSConfig ¶ added in v0.1.1
type BUHSConfig struct {
SBConfig
// number of bits of the hash index
HashBits int
// length of the input used; range [2,8]
InputLen int
// size of a bucket; range [1,128]
BucketSize int
}
BUHSConfig provides the configuration parameters for the bucket hash sequencer.
func (*BUHSConfig) ApplyDefaults ¶ added in v0.1.1
func (cfg *BUHSConfig) ApplyDefaults()
ApplyDefaults sets values that are zero to their default values.
func (BUHSConfig) NewSequencer ¶ added in v0.1.1
func (cfg BUHSConfig) NewSequencer() (s Sequencer, err error)
NewSequencer creates a new bucket hash sequencer.
func (*BUHSConfig) Verify ¶ added in v0.1.1
func (cfg *BUHSConfig) Verify() error
Verify checks the config for correctness.
type Block ¶
Block stores sequences and literals. Note that literals that are not consumed by the Sequences slice need to be added to the end of the reconstructed data.
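How a block turns back into bytes can be sketched as follows. The seq and block types are local stand-ins, and their field names (LitLen, MatchLen, Offset, Sequences, Literals) are assumptions for illustration rather than the package's declared fields.

```go
package main

import (
	"errors"
	"fmt"
)

// seq and block are hypothetical stand-ins; the field names are assumptions.
type seq struct {
	LitLen   uint32 // number of literals preceding the match
	MatchLen uint32 // length of the match
	Offset   uint32 // distance back from the current end of the output
}

type block struct {
	Sequences []seq
	Literals  []byte
}

// decode appends the data described by blk to dst.
func decode(dst []byte, blk block) ([]byte, error) {
	lits := blk.Literals
	for _, s := range blk.Sequences {
		if int(s.LitLen) > len(lits) {
			return dst, errors.New("not enough literals")
		}
		dst = append(dst, lits[:s.LitLen]...)
		lits = lits[s.LitLen:]
		if s.Offset == 0 || int(s.Offset) > len(dst) {
			return dst, errors.New("offset out of range")
		}
		// Copy byte by byte because a match may overlap its own output.
		for i := uint32(0); i < s.MatchLen; i++ {
			dst = append(dst, dst[len(dst)-int(s.Offset)])
		}
	}
	// Literals not consumed by any sequence end the reconstructed data.
	return append(dst, lits...), nil
}

func main() {
	blk := block{
		Sequences: []seq{{LitLen: 3, MatchLen: 6, Offset: 3}},
		Literals:  []byte("abcxyz"),
	}
	out, err := decode(nil, blk)
	fmt.Printf("%q %v\n", out, err)
}
```

In this example the sequence consumes "abc", the match repeats it twice, and the unconsumed literals "xyz" are appended at the end.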
type CostEstimator ¶ added in v0.1.1
CostEstimator estimates the cost of encoding matches and literals. A non-zero offset o requests the cost of a match; a zero o requests the cost of m literal bytes. Costs should be given in bits, but other measures such as 1/100th of a bit are also possible. The Update method updates the offset history.
type DConfig ¶
DConfig contains the configuration for a simple Decoder. It provides the window size and the MaxSize of the buffer.
func (*DConfig) ApplyDefaults ¶
func (cfg *DConfig) ApplyDefaults()
ApplyDefaults applies the defaults for the configuration.
type DHSConfig ¶
type DHSConfig struct {
SBConfig
// smaller hash input length; range 2 to 8
InputLen1 int
// hash bits for the smaller hash input length
HashBits1 int
// larger input length; range 2 to 8
InputLen2 int
// hash bits for the larger hash input length
HashBits2 int
}
DHSConfig provides the configuration parameters for the DoubleHashSequencer.
func (*DHSConfig) ApplyDefaults ¶
func (cfg *DHSConfig) ApplyDefaults()
ApplyDefaults uses the defaults for the configuration parameters that are set to zero.
func (DHSConfig) NewSequencer ¶ added in v0.1.0
NewSequencer creates a new DoubleHashSequencer.
type DecBuffer ¶ added in v0.1.1
type DecBuffer struct {
// contains filtered or unexported fields
}
DecBuffer provides a simple buffer to decode sequences. The max field gives a target that can be exceeded once.
func (*DecBuffer) Available ¶ added in v0.1.1
Available provides the amount of data that can be written into the buffer.
func (*DecBuffer) ByteAtEnd ¶ added in v0.1.1
ByteAtEnd reads the byte with offset i from the end. If it points outside the window, the value returned is 0.
func (*DecBuffer) Init ¶ added in v0.1.1
Init initializes the buffer. The window size must be larger than 1 and max must be larger than the windowSize.
func (*DecBuffer) Len ¶ added in v0.1.1
Len returns the number of bytes in the unread portion of the buffer.
func (*DecBuffer) Read ¶ added in v0.1.1
Read reads data from the buffer. The function never returns an error.
func (*DecBuffer) Reset ¶ added in v0.1.1
func (buf *DecBuffer) Reset()
Reset puts the buffer into its initial state.
func (*DecBuffer) Write ¶ added in v0.1.1
Write writes the provided byte slice into the buffer and extends the window accordingly.
func (*DecBuffer) WriteBlock ¶ added in v0.1.1
WriteBlock writes a whole list of sequences; each sequence will be written atomically. The function returns the number of sequences k written, the number of literals l consumed and the number of bytes n generated.
func (*DecBuffer) WriteByte ¶ added in v0.1.1
WriteByte writes a single byte to the buffer and extends the window.
func (*DecBuffer) WriteMatch ¶ added in v0.1.1
WriteMatch writes a match into the buffer and extends the window by the match.
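Writing a match means copying n bytes that start offset bytes before the current end of the buffer. The sketch below uses a hypothetical toyBuffer type and is not the DecBuffer implementation. It copies byte by byte so that a match longer than its offset repeats the pattern, which is the usual LZ77 behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// toyBuffer is a hypothetical, much simplified decode buffer.
type toyBuffer struct {
	data       []byte
	windowSize int
}

// writeMatch appends n bytes copied from offset bytes before the end.
func (b *toyBuffer) writeMatch(n, offset int) error {
	if n < 0 {
		return errors.New("negative match length")
	}
	if offset <= 0 || offset > len(b.data) || offset > b.windowSize {
		return errors.New("offset outside of window")
	}
	// The source and destination may overlap, so copy one byte at a time.
	for ; n > 0; n-- {
		b.data = append(b.data, b.data[len(b.data)-offset])
	}
	return nil
}

func main() {
	b := &toyBuffer{data: []byte("ab"), windowSize: 8}
	err := b.writeMatch(5, 2)
	fmt.Printf("%q %v\n", b.data, err)
}
```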
type Decoder ¶
type Decoder struct {
// contains filtered or unexported fields
}
A Decoder decodes sequences and writes data into the writer.
func NewDecoder ¶
NewDecoder allocates and initializes a decoder. If the windowSize is not positive an error will be returned.
func (*Decoder) Init ¶
Init initializes the decoder. Internal buffers will be reused if they are large enough.
func (*Decoder) WriteBlock ¶
WriteBlock writes a complete block into the decoder.
type GSASConfig ¶
GSASConfig defines the configuration parameters for the greedy suffix array sequencer.
func (*GSASConfig) ApplyDefaults ¶
func (cfg *GSASConfig) ApplyDefaults()
ApplyDefaults sets the configuration parameters to their defaults. The method does not check the configuration for consistency.
func (GSASConfig) NewSequencer ¶ added in v0.1.0
func (cfg GSASConfig) NewSequencer() (s Sequencer, err error)
NewSequencer generates a new sequencer using the configuration parameters in the structure.
func (*GSASConfig) Verify ¶
func (cfg *GSASConfig) Verify() error
Verify checks the configuration for inconsistencies.
type GenericSequencerConfig ¶ added in v0.1.1
type GenericSequencerConfig struct {
SBConfig
MatchFinderConfigs []MatchFinderConfig
CostEstimator CostEstimator
}
func (*GenericSequencerConfig) ApplyDefaults ¶ added in v0.1.1
func (cfg *GenericSequencerConfig) ApplyDefaults()
func (*GenericSequencerConfig) NewSequencer ¶ added in v0.1.1
func (cfg *GenericSequencerConfig) NewSequencer() (s Sequencer, err error)
func (*GenericSequencerConfig) Verify ¶ added in v0.1.1
func (cfg *GenericSequencerConfig) Verify() error
type HSConfig ¶
type HSConfig struct {
SBConfig
// number of bits of the hash index
HashBits int
// length of the input used; range [2,8]
InputLen int
}
HSConfig provides the configuration parameters for the HashSequencer.
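The two parameters can be pictured with a multiplicative hash: InputLen selects how many bytes of a 64-bit little-endian load take part, and HashBits sets the width of the table index. This is a sketch under assumptions. The package's actual hash function is not documented here, and the multiplier below is an arbitrary odd constant.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// prime is an arbitrary odd 64-bit multiplier chosen for this sketch.
const prime uint64 = 0x9E3779B185EBCA87

// load64 reads 8 bytes in little-endian order; p needs at least 8 bytes.
func load64(p []byte) uint64 {
	return binary.LittleEndian.Uint64(p)
}

// hash maps the first inputLen bytes contained in x to a value with
// hashBits bits. inputLen is expected in the range [2,8].
func hash(x uint64, inputLen, hashBits int) uint32 {
	x <<= uint(8-inputLen) * 8 // drop the bytes beyond inputLen
	return uint32((x * prime) >> uint(64-hashBits))
}

func main() {
	a := load64([]byte("abcdXXXX"))
	b := load64([]byte("abcdYYYY"))
	// Only the first four bytes take part, so both hashes agree.
	fmt.Println(hash(a, 4, 16) == hash(b, 4, 16))
}
```

A larger HashBits value spends more memory on the table and reduces collisions; a larger InputLen makes matches rarer but longer.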
func (*HSConfig) ApplyDefaults ¶
func (cfg *HSConfig) ApplyDefaults()
ApplyDefaults sets values that are zero to their default values.
func (HSConfig) NewSequencer ¶ added in v0.1.0
NewSequencer creates a new hash sequencer.
type HashConfig ¶ added in v0.1.1
func (*HashConfig) ApplyDefaults ¶ added in v0.1.1
func (cfg *HashConfig) ApplyDefaults()
func (*HashConfig) NewMatchFinder ¶ added in v0.1.1
func (cfg *HashConfig) NewMatchFinder() (mf MatchFinder, err error)
func (*HashConfig) Verify ¶ added in v0.1.1
func (cfg *HashConfig) Verify() error
type MatchFinder ¶ added in v0.1.1
type MatchFinder interface {
Add(pos uint32, x uint64)
AppendMatchesAndAdd(m []uint32, pos uint32, x uint64) []uint32
Adapt(delta uint32)
// Resets the match finder and sets the pointer to the new data slice. The
// pointer is used to ensure that length changes are available to the match
// finders.
Reset(pdata *[]byte)
}
type MatchFinderConfig ¶ added in v0.1.1
type MatchFinderConfig interface {
NewMatchFinder() (mf MatchFinder, err error)
ApplyDefaults()
Verify() error
}
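The configuration types in this package share the ApplyDefaults and Verify methods. A plausible way to use such a pair is to fill in defaults first and validate afterwards. The sketch below uses a hypothetical toyConfig with an invented default and range; whether the package's constructors perform these steps themselves is not stated here.

```go
package main

import (
	"errors"
	"fmt"
)

// toyConfig is a hypothetical configuration with a single parameter.
type toyConfig struct{ HashBits int }

// ApplyDefaults replaces zero fields by defaults.
func (c *toyConfig) ApplyDefaults() {
	if c.HashBits == 0 {
		c.HashBits = 16 // invented default
	}
}

// Verify checks the configuration for errors.
func (c *toyConfig) Verify() error {
	if c.HashBits < 1 || c.HashBits > 24 { // invented range
		return errors.New("toy: HashBits out of range")
	}
	return nil
}

// config lists the two methods the helper below relies on.
type config interface {
	ApplyDefaults()
	Verify() error
}

// prepare fills in defaults and then validates the result.
func prepare(c config) error {
	c.ApplyDefaults()
	return c.Verify()
}

func main() {
	c := &toyConfig{}
	fmt.Println(prepare(c), c.HashBits)
}
```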
type Params ¶ added in v0.1.1
type Params struct {
// MemoryBudget specifies the memory budget in bytes for the sequencer. The
// budget controls how much memory the sequencer has for the window size and the
// match search data structures. It doesn't control temporary memory
// allocations. Because it is a budget, it can be overdrawn.
MemoryBudget int
// Effort is a scale from 1 to 10 controlling the CPU consumption. A
// sequencer with an effort of 1 might be extremely fast but will have a
// worse compression ratio. The default effort is 6 and will provide a
// reasonable compromise between compression speed and compression
// ratio. Effort 10 will provide the best compression ratio but will be
// very slow.
Effort int
// BlockSize defines a maximum block size. Note that the configurator
// might create a smaller block size to fit the match search data
// structures into the memory budget. The main consumer is ZStandard
// which has a maximum block size of 128 kByte.
BlockSize int
// WindowSize fixes the window size.
WindowSize int
}
Params provides a general method to create sequencers.
func (*Params) ApplyDefaults ¶ added in v0.1.1
func (p *Params) ApplyDefaults()
ApplyDefaults applies the defaults to the Params structure. The memory budget is set to 2 MB, the effort to 5 and the block size to 128 kByte unless other non-zero values have been set.
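The zero-value convention described for ApplyDefaults can be sketched with a local stand-in type. The sketch reads "2 MB" as 2 MiB and "128 kByte" as 128 KiB, which is an assumption. It uses the effort value 5 named in the ApplyDefaults description, although the Effort field comment mentions 6, so the real default should be checked against the source.

```go
package main

import "fmt"

// params is a local stand-in mirroring three fields of Params.
type params struct {
	MemoryBudget int
	Effort       int
	BlockSize    int
}

// applyDefaults replaces zero fields and leaves non-zero fields untouched.
func (p *params) applyDefaults() {
	if p.MemoryBudget == 0 {
		p.MemoryBudget = 2 << 20 // "2 MB", read as 2 MiB (assumption)
	}
	if p.Effort == 0 {
		p.Effort = 5 // value from the ApplyDefaults description
	}
	if p.BlockSize == 0 {
		p.BlockSize = 128 << 10 // "128 kByte", read as 128 KiB (assumption)
	}
}

func main() {
	p := params{Effort: 9}
	p.applyDefaults()
	fmt.Printf("%+v\n", p)
}
```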
type SBConfig ¶ added in v0.1.1
type SBConfig struct {
// WindowSize is the maximum window size in bytes
WindowSize int
// ShrinkSize provides the size the buffer is shrunk to if the buffer
// has been completely filled and encoded. It must be smaller than the
// BufferSize, and should be significantly so.
ShrinkSize int
// BufferSize defines the maximum size of the buffer. The BufferSize
// must be greater than or equal to the window size.
BufferSize int
// BlockSize provides the block size.
BlockSize int
}
SBConfig stores the parameters for the sequencer buffer.
func (*SBConfig) ApplyDefaults ¶ added in v0.1.1
func (cfg *SBConfig) ApplyDefaults()
ApplyDefaults sets the defaults for the sequencer buffer configuration.
func (*SBConfig) BufferConfig ¶ added in v0.1.1
BufferConfig returns a pointer to the sequencer buffer configuration, SBConfig.
func (*SBConfig) SetWindowSize ¶ added in v0.1.1
SetWindowSize sets the window size. BufferSize and ShrinkSize will be adapted.
type Seq ¶
Seq represents a single Lempel-Ziv 77 Sequence describing a match, consisting of the offset, the length of the match and the number of literals preceding the match. The Aux field can be used on upper layers to store additional information.
type SeqBuffer ¶ added in v0.1.1
type SeqBuffer struct {
// SBConfig stores the configuration parameters
SBConfig
// contains filtered or unexported fields
}
SeqBuffer acts as a buffer for the sequencers. The buffer contains the window from which matches can be copied in a sequence. Data is written into the buffer, the sequencer creates Lempel-Ziv sequences and advances the window head. Since all positions behind the window head are in the window, we even save one check in the sequencer loop.
The Sequencer ensures that len(w.data)+7 < cap(w.data), which allows 64-bit reads on all byte positions of the window.
func (*SeqBuffer) Available ¶ added in v0.1.1
Available returns the number of bytes that are available for writing into the buffer.
func (*SeqBuffer) Buffer ¶ added in v0.1.1
Buffer returns a pointer to itself. It provides the function to the sequencer structures that embed SeqBuffer.
func (*SeqBuffer) Buffered ¶ added in v0.1.1
Buffered returns the number of bytes that are buffered but not yet part of the window. They have to be sequenced first.
func (*SeqBuffer) Init ¶ added in v0.1.1
Init initializes the window. The parameter size must be positive.
func (*SeqBuffer) ReadByteAt ¶ added in v0.1.1
ReadByteAt returns the byte at the absolute position pos unless pos is outside of the data stored in window.
func (*SeqBuffer) ReadFrom ¶ added in v0.1.1
ReadFrom transfers data from the reader into the buffer.
type SeqConfig ¶ added in v0.1.1
type SeqConfig interface {
NewSequencer() (s Sequencer, err error)
BufferConfig() *SBConfig
ApplyDefaults()
Verify() error
}
SeqConfig generates new sequencer instances.
type Sequencer ¶
type Sequencer interface {
// Sequence finds Lempel-Ziv sequences.
Sequence(blk *Block, flags int) (n int, err error)
// Shrink reduces the actual window length to make more buffer space
// available.
Shrink()
// Buffer returns a pointer to the sequencer buffer.
Buffer() *SeqBuffer
// Reset allows the reuse of the Sequencer. The data slice provides new
// data to sequence but Sequencers are usually also Writers for
// providing the data.
Reset(data []byte) error
}
Sequencer transforms byte streams into Lempel-Ziv sequences that allow the reconstruction of the input data.
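What a sequencer does can be illustrated with a toy greedy encoder. It is not one of the package's sequencers: the seq type, the minimum match length of 3, and the use of a Go map instead of a hash table are all assumptions made to keep the sketch short. The decode function only serves to check that the input can be reconstructed.

```go
package main

import "fmt"

// seq is a hypothetical sequence: LitLen literals followed by a match.
type seq struct{ LitLen, MatchLen, Offset int }

const minMatch = 3

// sequence greedily converts data into sequences and a literal slice.
// Literals after the last match remain at the end of lits.
func sequence(data []byte) (seqs []seq, lits []byte) {
	last := make(map[[minMatch]byte]int) // last position of each 3-byte string
	litStart, i := 0, 0
	for i+minMatch <= len(data) {
		var key [minMatch]byte
		copy(key[:], data[i:])
		j, ok := last[key]
		last[key] = i
		if !ok {
			i++
			continue
		}
		// Extend the match as long as the data agrees.
		n := minMatch
		for i+n < len(data) && data[j+n] == data[i+n] {
			n++
		}
		seqs = append(seqs, seq{LitLen: i - litStart, MatchLen: n, Offset: i - j})
		lits = append(lits, data[litStart:i]...)
		i += n
		litStart = i
	}
	return seqs, append(lits, data[litStart:]...)
}

// decode reverses sequence.
func decode(seqs []seq, lits []byte) []byte {
	var out []byte
	for _, s := range seqs {
		out = append(out, lits[:s.LitLen]...)
		lits = lits[s.LitLen:]
		for k := 0; k < s.MatchLen; k++ {
			out = append(out, out[len(out)-s.Offset])
		}
	}
	return append(out, lits...)
}

func main() {
	data := []byte("abcabcabcabcxyz")
	seqs, lits := sequence(data)
	fmt.Printf("%+v %q\n", seqs, lits)
	fmt.Printf("%q\n", decode(seqs, lits))
}
```

Slower strategies examine more match candidates per position, which is consistent with the trade-off between speed and compression ratio described in the overview.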
type SimpleEstimator ¶ added in v0.1.1
type SimpleEstimator struct {
Rep [4]uint32
}
SimpleEstimator provides a very simple cost model for compression. It supports offset repeats as in LZMA.
func (*SimpleEstimator) Cost ¶ added in v0.1.1
func (e *SimpleEstimator) Cost(m, o uint32) uint64
Cost provides a simple cost estimation for a match or, if the offset o is 0, for literals.
func (*SimpleEstimator) Push ¶ added in v0.1.1
func (e *SimpleEstimator) Push(o uint32)
Push writes the offset o into the history.
func (*SimpleEstimator) Reset ¶ added in v0.1.1
func (e *SimpleEstimator) Reset()
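A cost model with an offset history might look like the following sketch. The toyEstimator type and every bit cost in it are invented for illustration; they are not the values used by SimpleEstimator. The sketch only shows the idea that a repeated offset is cheaper to encode than a new one.

```go
package main

import (
	"fmt"
	"math/bits"
)

// toyEstimator is a hypothetical cost model; all bit costs are invented.
type toyEstimator struct {
	rep [4]uint32 // most recently used offsets, newest first
}

// cost returns an estimate in bits. With o == 0 it prices m literal bytes,
// otherwise a match of length m at offset o.
func (e *toyEstimator) cost(m, o uint32) uint64 {
	if o == 0 {
		return 9 * uint64(m)
	}
	lenBits := uint64(bits.Len32(m))
	for _, r := range e.rep {
		if r == o {
			return 6 + lenBits // repeated offsets are cheap
		}
	}
	return 8 + lenBits + uint64(bits.Len32(o))
}

// push records o as the most recent offset and drops the oldest one.
func (e *toyEstimator) push(o uint32) {
	copy(e.rep[1:], e.rep[:3])
	e.rep[0] = o
}

func main() {
	e := &toyEstimator{}
	fmt.Println(e.cost(4, 0))
	fmt.Println(e.cost(8, 100))
	e.push(100)
	fmt.Println(e.cost(8, 100))
}
```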
type WrappedSequencer ¶
type WrappedSequencer struct {
// contains filtered or unexported fields
}
WrappedSequencer is returned by the Wrap function. It provides the Sequence method and reads the data required automatically from the stored reader.
func Wrap ¶
func Wrap(r io.Reader, seq Sequencer) *WrappedSequencer
Wrap combines a reader and a Sequencer into a WrappedSequencer. The user doesn't need to take care of filling the Sequencer with additional data. The returned sequencer returns EOF if no further data is available.
Wrap chooses the minimum of 32 kbyte or half of the window size as shrink size.
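The shrink size rule amounts to a small calculation. The helper below is hypothetical and reads "32 kbyte" as 32*1024 bytes, which is an assumption.

```go
package main

import "fmt"

// shrinkSize returns the minimum of 32 KiB and half of the window size.
func shrinkSize(windowSize int) int {
	const limit = 32 << 10
	if half := windowSize / 2; half < limit {
		return half
	}
	return limit
}

func main() {
	fmt.Println(shrinkSize(8 << 10))
	fmt.Println(shrinkSize(1 << 20))
}
```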
func (*WrappedSequencer) MemSize ¶
func (s *WrappedSequencer) MemSize() uintptr
MemSize returns the memory consumption of the wrapped sequencer.
func (*WrappedSequencer) Reset ¶
func (s *WrappedSequencer) Reset(r io.Reader)
Reset puts the WrappedSequencer in its initial state and changes the wrapped reader to another reader.