lz

v0.0.3
Published: Dec 17, 2021 License: BSD-3-Clause Imports: 8 Imported by: 2

README

Module LZ

The LZ module provides sequencers that convert byte streams into blocks of Lempel-Ziv 77 sequences. It is designed to support multiple compression methods that differ in how they encode those LZ77 sequences.

Documentation

Overview

Package lz provides encoders and decoders for LZ77 sequences. A sequence, as described in the zstd specification, describes a number of literal bytes and a match.

A Sequencer is an encoder that converts a byte stream into blocks of sequences. A Decoder converts the block of sequences into the original decompressed byte stream. A wrapped Sequencer reads the byte stream from a reader. The sequencers are provided here separately because they are more efficient for encoding byte slices directly.

The module provides multiple sequencers that provide different combinations of encoding speed and compression ratios. Usually a slower sequencer will generate a better compression ratio.

We also provide two decoders. The Decoder slides the decompression window through a larger buffer implemented by Buffer. The RingDecoder uses the RingBuffer, which requires only a slice of the size of the window plus 1. The Decoder is significantly faster.

Index

Constants

const (
	// NoTrailingLiterals tells a sequencer that trailing literals don't
	// need to be included in the block.
	NoTrailingLiterals = 1 << iota
)

Flags for the Sequence function.
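The flag is a bit mask, so further flags could be combined with a bitwise OR. The following minimal sketch shows how a flags argument might be tested. The constant is a local stand-in that mirrors the declaration above, and wantsTrailingLiterals is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// Local stand-in for the documented constant; 1 << iota evaluates to 1 here.
const (
	NoTrailingLiterals = 1 << iota
)

// wantsTrailingLiterals is a hypothetical helper. It reports whether the
// unmatched tail of the input should be added to the block's literals.
func wantsTrailingLiterals(flags int) bool {
	return flags&NoTrailingLiterals == 0
}

func main() {
	fmt.Println(wantsTrailingLiterals(0))                  // true
	fmt.Println(wantsTrailingLiterals(NoTrailingLiterals)) // false
}
```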

Variables

var ErrEmptyBuffer = errors.New("lz: empty buffer")

ErrEmptyBuffer indicates that the buffer is empty and no more data can be read or processed. More data must be provided to the buffer.

var ErrFullBuffer = errors.New("lz: buffer is full")

ErrFullBuffer indicates that no more data can be buffered. Data must be read or processed.

Functions

This section is empty.

Types

type BDHSConfig

type BDHSConfig struct {
	// maximal window size
	WindowSize int
	// size of the window if the buffer is shrunk
	ShrinkSize int
	// maximum size of the buffer
	MaxSize int
	// BlockSize: target size for a block
	BlockSize int
	// smaller hash input length; range 2 to 8
	InputLen1 int
	// hash bits for the smaller hash input length
	HashBits1 int
	// larger input length; range 2 to 8
	InputLen2 int
	// hash bits for the larger hash input length
	HashBits2 int
}

BDHSConfig provides the configuration parameters for the BackwardDoubleHashSequencer.

func (*BDHSConfig) ApplyDefaults

func (cfg *BDHSConfig) ApplyDefaults()

ApplyDefaults uses the defaults for the configuration parameters that are set to zero.

func (BDHSConfig) NewInputSequencer

func (cfg BDHSConfig) NewInputSequencer() (s InputSequencer, err error)

NewInputSequencer creates a new BackwardDoubleHashSequencer.

func (*BDHSConfig) Verify

func (cfg *BDHSConfig) Verify() error

Verify checks the configuration for errors.

type BHSConfig

type BHSConfig struct {
	// maximal window size
	WindowSize int
	// size of the window if the buffer is shrunk
	ShrinkSize int
	// maximum size of the buffer
	MaxSize int
	// BlockSize: target size for a block
	BlockSize int
	// number of bits of the hash index
	HashBits int
	// length of the input used; range [2,8]
	InputLen int
}

BHSConfig provides the parameters for the backward hash sequencer.

func (*BHSConfig) ApplyDefaults

func (cfg *BHSConfig) ApplyDefaults()

ApplyDefaults sets values that are zero to their default values.

func (BHSConfig) NewInputSequencer

func (cfg BHSConfig) NewInputSequencer() (s InputSequencer, err error)

NewInputSequencer creates a new backward hash sequencer.

func (*BHSConfig) Verify

func (cfg *BHSConfig) Verify() error

Verify checks the config for correctness.

type BackwardDoubleHashSequencer

type BackwardDoubleHashSequencer struct {
	// contains filtered or unexported fields
}

BackwardDoubleHashSequencer uses two hashes and tries to extend matches backward.

func NewBackwardDoubleHashSequencer

func NewBackwardDoubleHashSequencer(cfg BDHSConfig) (s *BackwardDoubleHashSequencer, err error)

NewBackwardDoubleHashSequencer creates a new sequencer. If the configuration is invalid an error will be returned.

func (*BackwardDoubleHashSequencer) ByteAt added in v0.0.3

func (s *BackwardDoubleHashSequencer) ByteAt(pos int64) (c byte, err error)

func (*BackwardDoubleHashSequencer) Init

Init initializes the sequencer. The method returns an error if the configuration contains inconsistencies and the sequencer remains uninitialized.

func (*BackwardDoubleHashSequencer) MemSize

func (s *BackwardDoubleHashSequencer) MemSize() uintptr

MemSize returns the memory consumed by the sequencer.

func (*BackwardDoubleHashSequencer) Pos added in v0.0.3

func (s *BackwardDoubleHashSequencer) Pos() int64

Pos returns the position of the window head.

func (*BackwardDoubleHashSequencer) ReadFrom

func (s *BackwardDoubleHashSequencer) ReadFrom(r io.Reader) (n int64, err error)

ReadFrom is an alternative way to write data into the buffer.

func (*BackwardDoubleHashSequencer) RequestBuffer

func (s *BackwardDoubleHashSequencer) RequestBuffer() int

RequestBuffer returns the number of bytes that should be written into the sequencer.

func (*BackwardDoubleHashSequencer) Reset

func (s *BackwardDoubleHashSequencer) Reset()

Reset puts the sequencer into the state after initialization. The allocated memory in the buffer will be maintained.

func (*BackwardDoubleHashSequencer) Sequence

func (s *BackwardDoubleHashSequencer) Sequence(blk *Block, flags int) (n int, err error)

Sequence computes the LZ77 sequence for the next block. It returns the number of bytes actually sequenced. ErrEmptyBuffer will be returned if there is no data to sequence.

func (*BackwardDoubleHashSequencer) Shrink

func (s *BackwardDoubleHashSequencer) Shrink() int

Shrink moves the tail of the window, determined by ShrinkSize, to the front of the buffer and thereby makes more space available to write into the buffer.

func (*BackwardDoubleHashSequencer) WindowSize

func (s *BackwardDoubleHashSequencer) WindowSize() int

WindowSize returns the configured window size for the sequencer.

func (*BackwardDoubleHashSequencer) Write

func (s *BackwardDoubleHashSequencer) Write(p []byte) (n int, err error)

Write writes data into the buffer that will be later processed by the Sequence method.

type BackwardHashSequencer

type BackwardHashSequencer struct {
	// contains filtered or unexported fields
}

BackwardHashSequencer allows the creation of sequence blocks using a simple hash table. It extends found matches by looking backward in the input stream.
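Backward extension can be illustrated with a short sketch. It assumes that a hash lookup has already found a match of n bytes between positions j and i. The function extendBackward and its limit parameter are hypothetical and only illustrate the idea, not the package's implementation.

```go
package main

import "fmt"

// extendBackward assumes data[i:i+n] equals data[j:j+n] with j < i and grows
// the match to the left while the preceding bytes agree. It never moves i
// below limit, the first position not yet covered by an earlier sequence.
func extendBackward(data []byte, j, i, n, limit int) (nj, ni, nn int) {
	for j > 0 && i > limit && data[j-1] == data[i-1] {
		j--
		i--
		n++
	}
	return j, i, n
}

func main() {
	data := []byte("xabcd-yabcd")
	// A hash over "bcd" might report the match j=2 for position i=8.
	j, i, n := extendBackward(data, 2, 8, 3, 6)
	fmt.Println(j, i, n) // 1 7 4
}
```

The extended match covers "abcd" instead of "bcd", so one literal fewer has to be stored.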

func NewBackwardHashSequencer

func NewBackwardHashSequencer(cfg BHSConfig) (s *BackwardHashSequencer, err error)

NewBackwardHashSequencer creates a new backward hash sequencer.

func (*BackwardHashSequencer) ByteAt added in v0.0.3

func (s *BackwardHashSequencer) ByteAt(pos int64) (c byte, err error)

func (*BackwardHashSequencer) Init

func (s *BackwardHashSequencer) Init(cfg BHSConfig) error

Init initializes the backward hash sequencer. It returns an error if there is an issue with the configuration parameters.

func (*BackwardHashSequencer) MemSize

func (s *BackwardHashSequencer) MemSize() uintptr

MemSize returns the memory consumed by the sequencer.

func (*BackwardHashSequencer) Pos added in v0.0.3

func (s *BackwardHashSequencer) Pos() int64

Pos returns the position of the window head.

func (*BackwardHashSequencer) ReadFrom

func (s *BackwardHashSequencer) ReadFrom(r io.Reader) (n int64, err error)

ReadFrom is an alternative way to write data into the buffer.

func (*BackwardHashSequencer) RequestBuffer

func (s *BackwardHashSequencer) RequestBuffer() int

RequestBuffer provides the number of bytes that the sequencer requests to be filled.

func (*BackwardHashSequencer) Reset

func (s *BackwardHashSequencer) Reset()

Reset resets the backward hash sequencer to the initial state after Init has returned.

func (*BackwardHashSequencer) Sequence

func (s *BackwardHashSequencer) Sequence(blk *Block, flags int) (n int, err error)

Sequence converts the next block of k bytes to sequences. The block will be overwritten. The method returns the number of bytes sequenced and any error encountered. It returns ErrEmptyBuffer if there is no further data available.

If blk is nil the search structures will be filled. This mode can be used to ignore segments of data.

func (*BackwardHashSequencer) Shrink

func (s *BackwardHashSequencer) Shrink() int

Shrink moves the tail of the window, determined by ShrinkSize, to the front of the buffer and thereby makes more space available to write into the buffer.

func (*BackwardHashSequencer) WindowSize

func (s *BackwardHashSequencer) WindowSize() int

WindowSize returns the configured window size for the sequencer.

func (*BackwardHashSequencer) Write

func (s *BackwardHashSequencer) Write(p []byte) (n int, err error)

Write writes data into the buffer that will be later processed by the Sequence method.

type Block

type Block struct {
	Sequences []Seq
	Literals  []byte
}

Block stores sequences and literals. Note that literals that are not consumed by the Sequences slice need to be added to the end of the reconstructed data.
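How a block maps back to bytes can be sketched as follows. Seq and Block are local stand-ins with the documented fields, and decodeBlock is a hypothetical helper rather than the package's decoder. The sketch assumes that Offset counts bytes back from the end of the data decoded so far.

```go
package main

import (
	"errors"
	"fmt"
)

// Seq and Block are local stand-ins with the same fields as the documented types.
type Seq struct{ LitLen, MatchLen, Offset, Aux uint32 }

type Block struct {
	Sequences []Seq
	Literals  []byte
}

// decodeBlock appends the data described by blk to dst.
func decodeBlock(dst []byte, blk Block) ([]byte, error) {
	lits := blk.Literals
	for _, s := range blk.Sequences {
		if int(s.LitLen) > len(lits) {
			return dst, errors.New("sequence needs more literals than available")
		}
		dst = append(dst, lits[:s.LitLen]...)
		lits = lits[s.LitLen:]
		if s.MatchLen > 0 && (s.Offset == 0 || int(s.Offset) > len(dst)) {
			return dst, errors.New("match offset outside of decoded data")
		}
		// Copy byte by byte so that overlapping matches repeat data.
		for i := uint32(0); i < s.MatchLen; i++ {
			dst = append(dst, dst[len(dst)-int(s.Offset)])
		}
	}
	// Literals not consumed by any sequence form the tail of the data.
	return append(dst, lits...), nil
}

func main() {
	blk := Block{
		Sequences: []Seq{{LitLen: 3, MatchLen: 6, Offset: 3}},
		Literals:  []byte("abc!"),
	}
	out, err := decodeBlock(nil, blk)
	fmt.Printf("%q %v\n", out, err) // "abcabcabc!" <nil>
}
```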

func (*Block) Len

func (b *Block) Len() int64

Len computes the length of the block in bytes. It assumes that the sum of the literal lengths in the sequences doesn't exceed the length of the Literals byte slice.
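One plausible reading of this rule: every literal is counted once, whether consumed by a sequence or trailing, and every match adds its length. The sketch below uses local stand-in types and a hypothetical blockLen function. It is an assumption about the computation, not the package's source.

```go
package main

import "fmt"

// Local stand-ins with the documented fields.
type Seq struct{ LitLen, MatchLen, Offset, Aux uint32 }

type Block struct {
	Sequences []Seq
	Literals  []byte
}

// blockLen returns the number of bytes the block decodes to: all literals
// (consumed by sequences or trailing) plus all match lengths.
func blockLen(b *Block) int64 {
	n := int64(len(b.Literals))
	for _, s := range b.Sequences {
		n += int64(s.MatchLen)
	}
	return n
}

func main() {
	b := &Block{
		Sequences: []Seq{{LitLen: 3, MatchLen: 6, Offset: 3}},
		Literals:  []byte("abc!"),
	}
	fmt.Println(blockLen(b)) // 10
}
```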

type Buffer

type Buffer struct {
	// contains filtered or unexported fields
}

Buffer provides a simple buffer to decode sequences. The max field gives a target that can be exceeded once.

func (*Buffer) Available added in v0.0.3

func (buf *Buffer) Available() int

Available provides the amount of data that can be written into the buffer.

func (*Buffer) ByteAtEnd added in v0.0.3

func (buf *Buffer) ByteAtEnd(i int) byte

ByteAtEnd reads the byte with offset i from the end. If it points outside the window the value returned is 0.

func (*Buffer) Init

func (buf *Buffer) Init(windowSize, max int) error

Init initializes the buffer. The window size must be larger than 1 and max must be larger than the windowSize.

func (*Buffer) Len added in v0.0.3

func (buf *Buffer) Len() int

Len returns the number of bytes in the unread portion of the buffer.

func (*Buffer) Pos added in v0.0.3

func (buf *Buffer) Pos() int64

Pos returns the file position of the window head.

func (*Buffer) Read

func (buf *Buffer) Read(p []byte) (n int, err error)

Read reads data from the buffer.

func (*Buffer) Reset

func (buf *Buffer) Reset()

Reset puts the buffer into its initial state.

func (*Buffer) Write

func (buf *Buffer) Write(p []byte) (n int, err error)

Write writes the provided byte slice into the buffer and extends the window accordingly.

func (*Buffer) WriteBlock

func (buf *Buffer) WriteBlock(blk Block) (k, l, n int, err error)

WriteBlock writes a whole list of sequences; each sequence will be written atomically. The function returns the number of sequences k written, the number of literals l consumed and the number of bytes n generated.

func (*Buffer) WriteByte added in v0.0.3

func (buf *Buffer) WriteByte(c byte) error

WriteByte writes a single byte to the buffer and extends the window.

func (*Buffer) WriteMatch

func (buf *Buffer) WriteMatch(n, offset int) error

WriteMatch writes a match into the buffer and extends the window by the match.

func (*Buffer) WriteTo

func (buf *Buffer) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes all unread data to the writer.

type Config

type Config struct {
	// MemoryBudget specifies the memory budget in bytes for the sequencer. The
	// budget controls how much memory the sequencer has for the window size and the
	// match search data structures. It doesn't control temporary memory
	// allocations. It is a budget, so it can be overdrawn, right?
	MemoryBudget int
	// Effort is a scale from 1 to 10 controlling the CPU consumption. A
	// sequencer with an effort of 1 might be extremely fast but will have a
	// worse compression ratio. The default effort is 6 and will provide a
	// reasonable compromise between compression speed and compression
	// ratio. Effort 10 will provide the best compression ratio but will
	// be very slow.
	Effort int
	// BlockSize defines a maximum block size. Note that the configurator
	// might create a smaller block size to fit the match search data
	// structures into the memory budget. The main consumer is ZStandard
	// which has a maximum block size of 128 kByte.
	BlockSize int
	// WindowSize fixes the window size.
	WindowSize int
}

Config provides a general method to create sequencers.

func (*Config) ApplyDefaults

func (cfg *Config) ApplyDefaults()

ApplyDefaults applies the defaults to the Config structure. The memory budget is set to 2 MB, the effort to 5 and the block size to 128 kByte unless other non-zero values have been set.

func (*Config) NewInputSequencer

func (cfg *Config) NewInputSequencer() (s InputSequencer, err error)

NewInputSequencer creates a new sequencer according to the parameters provided. The function only returns an error if the parameters are negative; otherwise it always tries to satisfy the requirements.

func (*Config) Verify

func (cfg *Config) Verify() error

Verify checks the configuration for errors. Use ApplyDefaults before this function because it doesn't support zero values in all cases.

type Configurator

type Configurator interface {
	NewInputSequencer() (s InputSequencer, err error)
}

Configurator defines a general interface for sequencer configurations. The different sequencers all have different configuration parameters and require their own configuration. All configuration types must support the NewInputSequencer method.

In pattern language this is obviously a factory, but multiple factories are supported. A configuration structure like HSConfig creates only HashSequencers, while the general Config structure can build different InputSequencers.

type DConfig

type DConfig struct {
	WindowSize int
	MaxSize    int
}

DConfig contains the configuration for a simple Decoder. It provides the window size and the MaxSize of the buffer.

func (*DConfig) ApplyDefaults

func (cfg *DConfig) ApplyDefaults()

ApplyDefaults applies the defaults for the configuration.

func (*DConfig) Verify

func (cfg *DConfig) Verify() error

Verify checks the configuration and returns any errors.

type DHSConfig

type DHSConfig struct {
	// maximal window size
	WindowSize int
	// size of the window if the buffer is shrunk
	ShrinkSize int
	// maximum size of the buffer
	MaxSize int
	// BlockSize: target size for a block
	BlockSize int
	// smaller hash input length; range 2 to 8
	InputLen1 int
	// hash bits for the smaller hash input length
	HashBits1 int
	// larger input length; range 2 to 8
	InputLen2 int
	// hash bits for the larger hash input length
	HashBits2 int
}

DHSConfig provides the configuration parameters for the DoubleHashSequencer.

func (*DHSConfig) ApplyDefaults

func (cfg *DHSConfig) ApplyDefaults()

ApplyDefaults uses the defaults for the configuration parameters that are set to zero.

func (DHSConfig) NewInputSequencer

func (cfg DHSConfig) NewInputSequencer() (s InputSequencer, err error)

NewInputSequencer creates a new DoubleHashSequencer.

func (*DHSConfig) Verify

func (cfg *DHSConfig) Verify() error

Verify checks the configuration for errors.

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

A Decoder decodes sequences and writes data into the writer.

func NewDecoder

func NewDecoder(w io.Writer, cfg DConfig) (*Decoder, error)

NewDecoder allocates and initializes a decoder. If the windowSize is not positive an error will be returned.

func (*Decoder) Flush

func (d *Decoder) Flush() error

Flush writes all decoded data to the underlying writer.

func (*Decoder) Init

func (d *Decoder) Init(w io.Writer, cfg DConfig) error

Init initializes the decoder. Internal buffers will be reused if they are large enough.

func (*Decoder) Reset

func (d *Decoder) Reset(w io.Writer)

Reset resets the decoder to its initial state.

func (*Decoder) Write

func (d *Decoder) Write(p []byte) (n int, err error)

Write writes data directly into the decoder.

func (*Decoder) WriteBlock

func (d *Decoder) WriteBlock(blk Block) (k, l, n int, err error)

WriteBlock writes a complete block into the decoder.

func (*Decoder) WriteMatch

func (d *Decoder) WriteMatch(n int, offset int) error

WriteMatch writes a single match into the decoder.

type DoubleHashSequencer

type DoubleHashSequencer struct {
	// contains filtered or unexported fields
}

DoubleHashSequencer generates LZ77 sequences by using two hash tables. The input length for the two hash tables will be different. The double hash sequencer is slower than sequencers using a single hash, but the compression ratio is much better.

func NewDoubleHashSequencer

func NewDoubleHashSequencer(cfg DHSConfig) (s *DoubleHashSequencer, err error)

NewDoubleHashSequencer allocates a new DoubleHashSequencer value and initializes it. The function returns the first error found in the configuration.

func (*DoubleHashSequencer) ByteAt added in v0.0.3

func (s *DoubleHashSequencer) ByteAt(pos int64) (c byte, err error)

func (*DoubleHashSequencer) Init

func (s *DoubleHashSequencer) Init(cfg DHSConfig) error

Init initializes the DoubleHashSequencer. The first error found in the configuration will be returned.

func (*DoubleHashSequencer) MemSize

func (s *DoubleHashSequencer) MemSize() uintptr

MemSize returns the memory consumed by the data structure.

func (*DoubleHashSequencer) Pos added in v0.0.3

func (s *DoubleHashSequencer) Pos() int64

Pos returns the position of the window head.

func (*DoubleHashSequencer) ReadFrom

func (s *DoubleHashSequencer) ReadFrom(r io.Reader) (n int64, err error)

ReadFrom is an alternative way to write data into the buffer.

func (*DoubleHashSequencer) RequestBuffer

func (s *DoubleHashSequencer) RequestBuffer() int

RequestBuffer answers the question whether data needs to be provided to the sequencer. If no data is needed, 0 will be returned; otherwise it returns the number of bytes that can be added to the internal buffer of the sequencer.

func (*DoubleHashSequencer) Reset

func (s *DoubleHashSequencer) Reset()

Reset puts the DoubleHashSequencer in its initial state.

func (*DoubleHashSequencer) Sequence

func (s *DoubleHashSequencer) Sequence(blk *Block, flags int) (n int, err error)

Sequence generates the LZ77 sequences. It returns the number of bytes covered by the new sequences. The block will be overwritten but the memory for the slices will be reused.

func (*DoubleHashSequencer) Shrink

func (s *DoubleHashSequencer) Shrink() int

Shrink moves the tail of the window, determined by ShrinkSize, to the front of the buffer and thereby makes more space available to write into the buffer.

func (*DoubleHashSequencer) WindowSize

func (s *DoubleHashSequencer) WindowSize() int

WindowSize returns the configured window size for the sequencer.

func (*DoubleHashSequencer) Write

func (s *DoubleHashSequencer) Write(p []byte) (n int, err error)

Write writes data into the buffer that will be later processed by the Sequence method.

type GSASConfig

type GSASConfig struct {
	// maximal window size
	WindowSize int
	// size of the window if the buffer is shrunk
	ShrinkSize int
	// maximum size of the buffer
	MaxSize int
	// target size for a block
	BlockSize int
	// minimum match len
	MinMatchLen int
}

GSASConfig defines the configuration parameters for the greedy suffix array sequencer.

func (*GSASConfig) ApplyDefaults

func (cfg *GSASConfig) ApplyDefaults()

ApplyDefaults sets the configuration parameters to their defaults. It does not check them for consistency.

func (GSASConfig) NewInputSequencer

func (cfg GSASConfig) NewInputSequencer() (s InputSequencer, err error)

func (*GSASConfig) Verify

func (cfg *GSASConfig) Verify() error

Verify checks the configuration for inconsistencies.

type GreedySuffixArraySequencer

type GreedySuffixArraySequencer struct {
	// contains filtered or unexported fields
}

GreedySuffixArraySequencer provides a sequencer that uses a suffix array for the window and buffered data to create sequences. It looks for the two nearest entries that have the longest match.

Since computing the suffix array is rather slow, it consumes a lot of CPU. Double hash sequencers achieve almost the same compression ratio with much less CPU consumption.

func NewGreedySuffixArraySeqeuncer

func NewGreedySuffixArraySeqeuncer(cfg GSASConfig) (s *GreedySuffixArraySequencer, err error)

NewGreedySuffixArraySeqeuncer creates a new value using the provided configuration. If the configuration has inconsistencies an error will be returned and the value of the return value s will be nil.

func (*GreedySuffixArraySequencer) ByteAt added in v0.0.3

func (s *GreedySuffixArraySequencer) ByteAt(pos int64) (c byte, err error)

func (*GreedySuffixArraySequencer) Init

Init initializes the sequencer. If the configuration has inconsistencies or invalid values the method returns an error.

func (*GreedySuffixArraySequencer) MemSize

func (s *GreedySuffixArraySequencer) MemSize() uintptr

MemSize returns the memory consumed by the sequencer.

func (*GreedySuffixArraySequencer) Pos added in v0.0.3

func (s *GreedySuffixArraySequencer) Pos() int64

Pos returns the position of the window head.

func (*GreedySuffixArraySequencer) ReadFrom

func (s *GreedySuffixArraySequencer) ReadFrom(r io.Reader) (n int64, err error)

ReadFrom is an alternative way to write data into the buffer.

func (*GreedySuffixArraySequencer) RequestBuffer

func (s *GreedySuffixArraySequencer) RequestBuffer() int

RequestBuffer returns the number of bytes the sequencer should be provided with so that the next Sequence call does not run into an error. The suffix array may be reset if the buffer is changed.

func (*GreedySuffixArraySequencer) Reset

func (s *GreedySuffixArraySequencer) Reset()

Reset puts the sequencer in the initial state.

func (*GreedySuffixArraySequencer) Sequence

func (s *GreedySuffixArraySequencer) Sequence(blk *Block, flags int) (n int, err error)

Sequence computes the sequences for the next block. Data in the block will be overwritten. The NoTrailingLiterals flag is supported. It returns the number of bytes covered by the computed sequences. If the buffer is empty ErrEmptyBuffer will be returned.

The method might compute the suffix array anew using the sort method.

func (*GreedySuffixArraySequencer) Shrink

func (s *GreedySuffixArraySequencer) Shrink() int

Shrink moves the tail of the window, determined by ShrinkSize, to the front of the buffer and thereby makes more space available to write into the buffer.

func (*GreedySuffixArraySequencer) WindowSize

func (s *GreedySuffixArraySequencer) WindowSize() int

WindowSize returns the configured window size for the sequencer.

func (*GreedySuffixArraySequencer) Write

func (s *GreedySuffixArraySequencer) Write(p []byte) (n int, err error)

Write writes data into the buffer that will be later processed by the Sequence method.

type HSConfig

type HSConfig struct {
	// maximal window size
	WindowSize int
	// size of the window if the buffer is shrunk
	ShrinkSize int
	// maximum size of the buffer
	MaxSize int
	// BlockSize: target size for a block
	BlockSize int
	// number of bits of the hash index
	HashBits int
	// length of the input used; range [2,8]
	InputLen int
}

HSConfig provides the configuration parameters for the HashSequencer.

The pos-buffer contains the sliding window. If the window reaches the end of the buffer, part of it needs to be moved to the front of the buffer. The number of bytes to be moved is defined by the shrinkSize. A shrinkSize of 0 is supported.
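The move to the front can be sketched on a plain byte slice. The layout is an assumption made for illustration: buf[:w] is the processed window, buf[w:] is data not yet sequenced, and shrink is a hypothetical function, not the package's method.

```go
package main

import "fmt"

// shrink keeps the last shrinkSize bytes of the window buf[:w] plus the
// unprocessed data buf[w:], moves them to the front of buf and returns the
// shortened slice together with the new window head index.
func shrink(buf []byte, w, shrinkSize int) ([]byte, int) {
	if shrinkSize > w {
		shrinkSize = w
	}
	start := w - shrinkSize
	n := copy(buf, buf[start:]) // copy handles the overlapping ranges
	return buf[:n], shrinkSize
}

func main() {
	buf := make([]byte, 0, 16)
	buf = append(buf, "0123456789abcdef"...)
	// Window head at 12: "0123456789ab" is processed, "cdef" is pending.
	buf, w := shrink(buf, 12, 4)
	fmt.Printf("%q head=%d free=%d\n", buf, w, cap(buf)-len(buf))
	// "89abcdef" head=4 free=8
}
```

With a shrinkSize of 0 only the pending data survives, so the next matches cannot refer to earlier bytes.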

func (*HSConfig) ApplyDefaults

func (cfg *HSConfig) ApplyDefaults()

ApplyDefaults sets values that are zero to their default values.

func (HSConfig) NewInputSequencer

func (cfg HSConfig) NewInputSequencer() (s InputSequencer, err error)

NewInputSequencer creates a new hash sequencer.

func (*HSConfig) Verify

func (cfg *HSConfig) Verify() error

Verify checks the config for correctness.

type HashSequencer

type HashSequencer struct {
	// contains filtered or unexported fields
}

HashSequencer allows the creation of sequence blocks using a simple hash table.
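A greedy matcher built on a single hash table could look like the sketch below. The types are local stand-ins. The constants inputLen and hashBits, the multiplicative hash and the functions sequence and decode are assumptions made for illustration and do not reproduce the package's code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

type Seq struct{ LitLen, MatchLen, Offset, Aux uint32 }

type Block struct {
	Sequences []Seq
	Literals  []byte
}

const (
	inputLen = 4  // bytes hashed per position (hypothetical choice)
	hashBits = 12 // the table has 1<<hashBits entries
)

// hash maps the first 4 bytes of p to a table index.
func hash(p []byte) uint32 {
	return (binary.LittleEndian.Uint32(p) * 2654435761) >> (32 - hashBits)
}

// sequence greedily converts data into a block of sequences.
func sequence(data []byte) Block {
	var blk Block
	table := make([]int32, 1<<hashBits) // stores position+1; 0 means empty
	litStart, i := 0, 0
	for i+inputLen <= len(data) {
		h := hash(data[i:])
		j := int(table[h]) - 1
		table[h] = int32(i + 1)
		// Verify the candidate because different inputs may share a slot.
		if j < 0 || binary.LittleEndian.Uint32(data[j:]) != binary.LittleEndian.Uint32(data[i:]) {
			i++
			continue
		}
		n := inputLen
		for i+n < len(data) && data[j+n] == data[i+n] {
			n++ // extend the match forward
		}
		blk.Sequences = append(blk.Sequences, Seq{
			LitLen: uint32(i - litStart), MatchLen: uint32(n), Offset: uint32(i - j),
		})
		blk.Literals = append(blk.Literals, data[litStart:i]...)
		i += n
		litStart = i
	}
	blk.Literals = append(blk.Literals, data[litStart:]...) // trailing literals
	return blk
}

// decode reverses sequence; it is used here only to check the round trip.
func decode(blk Block) []byte {
	var out []byte
	lits := blk.Literals
	for _, s := range blk.Sequences {
		out = append(out, lits[:s.LitLen]...)
		lits = lits[s.LitLen:]
		for k := uint32(0); k < s.MatchLen; k++ {
			out = append(out, out[len(out)-int(s.Offset)])
		}
	}
	return append(out, lits...)
}

func main() {
	in := []byte("aaaaaaaaaaaaaaaa")
	blk := sequence(in)
	fmt.Println(len(blk.Sequences), string(decode(blk)) == string(in)) // 1 true
}
```

More hash bits reduce collisions at the cost of memory, which is the trade-off the HashBits parameter exposes.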

func NewHashSequencer

func NewHashSequencer(cfg HSConfig) (s *HashSequencer, err error)

NewHashSequencer creates a new hash sequencer.

func (*HashSequencer) ByteAt added in v0.0.3

func (s *HashSequencer) ByteAt(pos int64) (c byte, err error)

func (*HashSequencer) Init

func (s *HashSequencer) Init(cfg HSConfig) error

Init initializes the hash sequencer. It returns an error if there is an issue with the configuration parameters.

func (*HashSequencer) MemSize

func (s *HashSequencer) MemSize() uintptr

MemSize returns the memory that the HashSequencer occupies.

func (*HashSequencer) Pos added in v0.0.3

func (s *HashSequencer) Pos() int64

Pos returns the position of the window head.

func (*HashSequencer) ReadFrom

func (s *HashSequencer) ReadFrom(r io.Reader) (n int64, err error)

ReadFrom is an alternative way to write data into the buffer.

func (*HashSequencer) RequestBuffer

func (s *HashSequencer) RequestBuffer() int

RequestBuffer provides the number of bytes that the sequencer requests to be provided.

func (*HashSequencer) Reset

func (s *HashSequencer) Reset()

Reset resets the hash sequencer. The sequencer will be in the same state as after Init.

func (*HashSequencer) Sequence

func (s *HashSequencer) Sequence(blk *Block, flags int) (n int, err error)

Sequence converts the next block of k bytes to sequences. The block will be overwritten. The method returns the number of bytes sequenced and any error encountered. It returns ErrEmptyBuffer if there is no further data available.

If blk is nil the search structures will be filled. This mode can be used to ignore segments of data.

func (*HashSequencer) Shrink

func (s *HashSequencer) Shrink() int

Shrink moves the tail of the window, determined by ShrinkSize, to the front of the buffer and thereby makes more space available to write into the buffer.

func (*HashSequencer) WindowSize

func (s *HashSequencer) WindowSize() int

WindowSize returns the configured window size for the sequencer.

func (*HashSequencer) Write

func (s *HashSequencer) Write(p []byte) (n int, err error)

Write writes data into the buffer that will be later processed by the Sequence method.

type InputSequencer

type InputSequencer interface {
	Sequencer
	io.Writer
	io.ReaderFrom
	WindowSize() int
	RequestBuffer() int
	Reset()
	Pos() int64
	ByteAt(pos int64) (c byte, err error)
}

InputSequencer buffers the data to generate LZ77 sequences for. It has additional methods required to work with a WrappedSequencer. RequestBuffer provides the number of bytes that can be written to the InputSequencer. ByteAt returns the byte at absolute position pos and returns an error if pos refers to a position outside of the current buffer. Pos returns the absolute position of the window head.

The Sequence method will return ErrEmptyBuffer if no data is available in the sequencer buffer.

type OSASConfig

type OSASConfig struct {
	// maximal window size
	WindowSize int
	// size of the window if the buffer is shrunk
	ShrinkSize int
	// maximum size of the buffer
	MaxSize int
	// target size for a block
	BlockSize int
	// minimum match len
	MinMatchLen int
	// function for computing the costs of a match or literal string if
	// offset is zero in bits. Note these costs are independent of position.
	Cost func(offset, matchLen uint32) uint32 `json:"-"`
	// MatchesPerPos provides the number of matches that should be generated
	// per position in a block.
	MatchesPerPos int
}

OSASConfig defines the configuration parameters for the optimal suffix array sequencer.
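A function with the signature of the Cost field might look like the following sketch. The weights are invented for illustration and do not model a real format. Only the convention that an offset of zero denotes a literal string is taken from the field comment.

```go
package main

import (
	"fmt"
	"math/bits"
)

// cost estimates the encoded size in bits. An offset of zero denotes a
// literal string of matchLen bytes. All weights are hypothetical.
func cost(offset, matchLen uint32) uint32 {
	if offset == 0 {
		return 9 * matchLen // 8 data bits plus 1 flag bit per literal
	}
	// One flag bit plus two bits per significant bit of offset and length.
	return 1 + 2*uint32(bits.Len32(offset)) + 2*uint32(bits.Len32(matchLen))
}

func main() {
	fmt.Println(cost(0, 3))   // 27
	fmt.Println(cost(100, 3)) // 19
}
```

Under these weights a three-byte match at offset 100 is cheaper than three literals, so an optimal parser would prefer the match.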

func (*OSASConfig) ApplyDefaults

func (cfg *OSASConfig) ApplyDefaults()

ApplyDefaults sets the configuration parameters to their defaults. It does not check them for consistency.

func (OSASConfig) NewInputSequencer

func (cfg OSASConfig) NewInputSequencer() (s InputSequencer, err error)

func (*OSASConfig) Verify

func (cfg *OSASConfig) Verify() error

Verify checks the configuration for inconsistencies.

type OptimalSuffixArraySequencer

type OptimalSuffixArraySequencer struct {
	// contains filtered or unexported fields
}

func NewOptimalSuffixArraySequencer

func NewOptimalSuffixArraySequencer(cfg OSASConfig) (s *OptimalSuffixArraySequencer, err error)

func (*OptimalSuffixArraySequencer) ByteAt added in v0.0.3

func (s *OptimalSuffixArraySequencer) ByteAt(pos int64) (c byte, err error)

func (*OptimalSuffixArraySequencer) Init

func (*OptimalSuffixArraySequencer) MemSize

func (s *OptimalSuffixArraySequencer) MemSize() uintptr

func (*OptimalSuffixArraySequencer) Pos added in v0.0.3

func (s *OptimalSuffixArraySequencer) Pos() int64

Pos returns the position of the window head.

func (*OptimalSuffixArraySequencer) ReadFrom

func (s *OptimalSuffixArraySequencer) ReadFrom(r io.Reader) (n int64, err error)

ReadFrom is an alternative way to write data into the buffer.

func (*OptimalSuffixArraySequencer) RequestBuffer

func (s *OptimalSuffixArraySequencer) RequestBuffer() int

func (*OptimalSuffixArraySequencer) Reset

func (s *OptimalSuffixArraySequencer) Reset()

func (*OptimalSuffixArraySequencer) Sequence

func (s *OptimalSuffixArraySequencer) Sequence(blk *Block, flags int) (n int, err error)

func (*OptimalSuffixArraySequencer) Shrink

func (s *OptimalSuffixArraySequencer) Shrink() int

Shrink moves the tail of the window, determined by ShrinkSize, to the front of the buffer and thereby makes more space available to write into the buffer.

func (*OptimalSuffixArraySequencer) WindowSize

func (s *OptimalSuffixArraySequencer) WindowSize() int

WindowSize returns the configured window size for the sequencer.

func (*OptimalSuffixArraySequencer) Write

func (s *OptimalSuffixArraySequencer) Write(p []byte) (n int, err error)

Write writes data into the buffer that will be later processed by the Sequence method.

type RingBuffer

type RingBuffer struct {
	// contains filtered or unexported fields
}

RingBuffer supports the decoding of sequence blocks. It stores the window in a ring buffer. The decoded data must be read from the window and the simplest way to do that is the WriteTo method.

func (*RingBuffer) Init

func (buf *RingBuffer) Init(windowSize int) error

Init initializes the ring buffer. The existing data slice in the ring buffer will be reused if its capacity is at least windowSize+1.
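A common reason for allocating one byte more than the window size is that a ring with separate read and write indexes keeps one slot unused so that equal indexes always mean "empty". Whether RingBuffer works exactly this way is an assumption. The ring type below is a hypothetical sketch of that technique only.

```go
package main

import "fmt"

// ring is a byte ring buffer that keeps one slot unused so that r == w
// always means "empty" and never "full".
type ring struct {
	data []byte
	r, w int
}

func newRing(windowSize int) *ring {
	return &ring{data: make([]byte, windowSize+1)}
}

// buffered returns the number of bytes stored.
func (b *ring) buffered() int {
	n := b.w - b.r
	if n < 0 {
		n += len(b.data)
	}
	return n
}

// available returns the number of bytes that can still be written.
func (b *ring) available() int { return len(b.data) - 1 - b.buffered() }

// writeByte stores c and reports false if the ring is full.
func (b *ring) writeByte(c byte) bool {
	if b.available() == 0 {
		return false
	}
	b.data[b.w] = c
	b.w = (b.w + 1) % len(b.data)
	return true
}

func main() {
	b := newRing(4)
	for i := 0; i < 5; i++ {
		fmt.Print(b.writeByte('x'), " ")
	}
	fmt.Println(b.buffered()) // true true true true false 4
}
```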

func (*RingBuffer) Read

func (buf *RingBuffer) Read(p []byte) (n int, err error)

Read reads data from the ring buffer. It will always try to return as much data as possible.

func (*RingBuffer) Reset

func (buf *RingBuffer) Reset()

Reset puts the RingBuffer in its initial state.

func (*RingBuffer) Write

func (buf *RingBuffer) Write(p []byte) (n int, err error)

Write writes data into the ring buffer. If the Write cannot be completed no bytes will be written.

func (*RingBuffer) WriteBlock

func (buf *RingBuffer) WriteBlock(blk Block) (k, l, n int, err error)

WriteBlock writes a whole list of sequences; each sequence will be written atomically. The function returns the number of sequences k written, the number of literals l consumed and the number of bytes n generated.

func (*RingBuffer) WriteMatch

func (buf *RingBuffer) WriteMatch(n int, offset int) error

WriteMatch writes a match completely or not at all.

func (*RingBuffer) WriteTo

func (buf *RingBuffer) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes as much data as possible into the writer.

type RingDecoder

type RingDecoder struct {
	// contains filtered or unexported fields
}

A RingDecoder decodes sequences and writes data into the writer.

func NewRingDecoder

func NewRingDecoder(w io.Writer, windowSize int) (*RingDecoder, error)

NewRingDecoder allocates and initializes a decoder. If the windowSize is not positive an error will be returned.

func (*RingDecoder) Flush

func (d *RingDecoder) Flush() error

Flush writes all decoded data to the underlying writer.

func (*RingDecoder) Init

func (d *RingDecoder) Init(w io.Writer, windowSize int) error

Init initializes the decoder. Internal buffers will be reused if they are large enough.

func (*RingDecoder) Reset

func (d *RingDecoder) Reset(w io.Writer)

func (*RingDecoder) Write

func (d *RingDecoder) Write(p []byte) (n int, err error)

Write writes data directly into the decoder.

func (*RingDecoder) WriteBlock

func (d *RingDecoder) WriteBlock(blk Block) (k, l, n int, err error)

WriteBlock writes a complete block into the decoder.

func (*RingDecoder) WriteMatch

func (d *RingDecoder) WriteMatch(n int, offset int) error

WriteMatch writes a single match into the decoder.

type Seq

type Seq struct {
	LitLen   uint32
	MatchLen uint32
	Offset   uint32
	Aux      uint32
}

Seq represents a single Lempel-Ziv 77 Sequence describing a match, consisting of the offset, the length of the match and the number of literals preceding the match. The Aux field can be used on upper layers to store additional information.

func (Seq) Len

func (s Seq) Len() int64

Len returns the complete length of the sequence.

type Sequencer

type Sequencer interface {
	Sequence(blk *Block, flags int) (n int, err error)
}

Sequencer transforms byte streams into a block of sequences. The target block size is under the control of the sequencer. The method returns the actual number of bytes for which sequences have been generated. The block can be reused and will be overwritten. If the block is nil, k bytes will be skipped and no sequences will be generated.

Sequencer manages an internal buffer that provides a window on the data to be compressed.

type WrappedSequencer

type WrappedSequencer struct {
	// contains filtered or unexported fields
}

WrappedSequencer is returned by the Wrap function. It provides the Sequence method and reads the data required automatically from the stored reader.

func Wrap

Wrap combines a reader and an InputSequencer and makes a Sequencer. The user doesn't need to take care of filling the Sequencer with additional data. The returned sequencer returns io.EOF if no further data is available.

func (*WrappedSequencer) MemSize

func (s *WrappedSequencer) MemSize() uintptr

MemSize returns the memory consumption of the wrapped sequencer.

func (*WrappedSequencer) Reset

func (s *WrappedSequencer) Reset(r io.Reader)

Reset puts the WrappedSequencer in its initial state and changes the wrapped reader to another reader.

func (*WrappedSequencer) Sequence

func (s *WrappedSequencer) Sequence(blk *Block, flags int) (n int, err error)

Sequence creates a block of sequences but reads the required data from the reader if necessary. The function returns io.EOF if no further data is available.

Directories

Path Synopsis
Package suffix provides a suffix sort algorithm.