gozstd

Version: v1.9.1-0...-be86ad2
Published: Apr 21, 2021 License: MIT Imports: 6 Imported by: 0

README

gozstd - go wrapper for zstd

Difference from github.com/valyala/gozstd

This fork removes the concurrency-limiting channel, which was a significant performance bottleneck in a highly concurrent application running on a 64-core CPU.

Features

  • Vendors upstream zstd without any modifications.

  • Simple API.

  • Optimized for speed. The API may be easily used in zero allocations mode.

  • Compress* and Decompress* functions are optimized for high concurrency.

  • Proper Writer.Flush for network apps.

  • Supports the following features from upstream zstd:

    • Block / stream compression / decompression with all the supported compression levels and with dictionary support.
    • Dictionary building from a sample set. The created dictionary may be saved to persistent storage / transferred over the network.
    • Dictionary loading for compression / decompression.

    Pull requests for missing upstream zstd features are welcome.

Quick start

How to install gozstd?

    go get -u github.com/ClearcodeHQ/gozstd

How to compress data?

The easiest way is just to use Compress:

    compressedData := Compress(nil, data)

There is also StreamCompress and Writer for stream compression.

How to decompress data?

The easiest way is just to use Decompress:

    data, err := Decompress(nil, compressedData)

There is also StreamDecompress and Reader for stream decompression.

How to cross-compile gozstd?

If you're cross-compiling some code that uses gozstd and you stumble upon the following error:

# github.com/ClearcodeHQ/gozstd
/go/pkg/mod/github.com/ClearcodeHQ/gozstd@v1.6.2/stream.go:31:59: undefined: CDict
/go/pkg/mod/github.com/ClearcodeHQ/gozstd@v1.6.2/stream.go:35:64: undefined: CDict
/go/pkg/mod/github.com/ClearcodeHQ/gozstd@v1.6.2/stream.go:47:20: undefined: Writer

You can fix it by enabling CGO and using a cross-compiler (e.g. arm-linux-gnueabi-gcc):

env CC=arm-linux-gnueabi-gcc GOOS=linux GOARCH=arm CGO_ENABLED=1 go build ./main.go 

NOTE: Check #21 for more info.

Who uses gozstd?

FAQ

  • Q: Which go version is supported?
    A: go1.10 and newer. Pull requests for older go versions are accepted.

  • Q: Which platforms/architectures are supported?
    A: linux/amd64, linux/arm, linux/arm64, freebsd/amd64, darwin/amd64, windows/amd64. Pull requests for other platforms/architectures are accepted.

  • Q: I don't trust libzstd*.a binary files from the repo or these files don't work on my OS/ARCH. How to rebuild them?
    A: Just run make clean libzstd.a if your OS/ARCH is supported.

  • Q: How do I specify custom build flags when recompiling libzstd*.a?
    A: You can specify the MOREFLAGS=... variable when running make like this: MOREFLAGS=-fPIC make clean libzstd.a.

  • Q: Why does the repo contain libzstd*.a binary files?
    A: This simplifies package installation to the usual go get without additional steps for building the libzstd*.a files.

Documentation

Overview

Package gozstd is a Go wrapper for zstd.

Gozstd is used in https://github.com/VictoriaMetrics/VictoriaMetrics .

Constants

const (
	// WindowLogMin is the minimum value of the windowLog parameter.
	WindowLogMin = 10 // from zstd.h
	// WindowLogMax32 is the maximum value of the windowLog parameter on 32-bit architectures.
	WindowLogMax32 = 30 // from zstd.h
	// WindowLogMax64 is the maximum value of the windowLog parameter on 64-bit architectures.
	WindowLogMax64 = 31 // from zstd.h

	// DefaultWindowLog is the default value of the windowLog parameter.
	DefaultWindowLog = 0
)
const DefaultCompressionLevel = 3 // Obtained from ZSTD_CLEVEL_DEFAULT.

DefaultCompressionLevel is the default compression level.

Variables

This section is empty.

Functions

func BuildDict

func BuildDict(samples [][]byte, desiredDictLen int) []byte

BuildDict returns dictionary built from the given samples.

The resulting dictionary size will be close to desiredDictLen.

The returned dictionary may be passed to NewCDict* and NewDDict.

Example
// Collect samples for the dictionary.
var samples [][]byte
for i := 0; i < 1000; i++ {
	sample := fmt.Sprintf("this is a dict sample number %d", i)
	samples = append(samples, []byte(sample))
}

// Build a dictionary with the desired size of 8Kb.
dict := BuildDict(samples, 8*1024)

// Now the dict may be used for compression/decompression.

// Create CDict from the dict.
cd, err := NewCDict(dict)
if err != nil {
	log.Fatalf("cannot create CDict: %s", err)
}
defer cd.Release()

// Compress multiple blocks with the same CDict.
var compressedBlocks [][]byte
for i := 0; i < 3; i++ {
	plainData := fmt.Sprintf("this is line %d for dict compression", i)
	compressedData := CompressDict(nil, []byte(plainData), cd)
	compressedBlocks = append(compressedBlocks, compressedData)
}

// The compressedData must be decompressed with the same dict.

// Create DDict from the dict.
dd, err := NewDDict(dict)
if err != nil {
	log.Fatalf("cannot create DDict: %s", err)
}
defer dd.Release()

// Decompress multiple blocks with the same DDict.
for _, compressedData := range compressedBlocks {
	decompressedData, err := DecompressDict(nil, compressedData, dd)
	if err != nil {
		log.Fatalf("cannot decompress data: %s", err)
	}
	fmt.Printf("%s\n", decompressedData)
}
Output:

this is line 0 for dict compression
this is line 1 for dict compression
this is line 2 for dict compression

func Compress

func Compress(dst, src []byte) []byte

Compress appends compressed src to dst and returns the result.

Example (NoAllocs)
data := []byte("foo bar baz")

// Compressed data will be put into cbuf.
var cbuf []byte

for i := 0; i < 5; i++ {
	// Compress re-uses cbuf for the compressed data.
	cbuf = Compress(cbuf[:0], data)

	decompressedData, err := Decompress(nil, cbuf)
	if err != nil {
		log.Fatalf("cannot decompress data: %s", err)
	}

	fmt.Printf("%d. %s\n", i, decompressedData)
}
Output:

0. foo bar baz
1. foo bar baz
2. foo bar baz
3. foo bar baz
4. foo bar baz
Example (Simple)
data := []byte("foo bar baz")

// Compress and decompress data into new buffers.
compressedData := Compress(nil, data)
decompressedData, err := Decompress(nil, compressedData)
if err != nil {
	log.Fatalf("cannot decompress data: %s", err)
}

fmt.Printf("%s", decompressedData)
Output:

foo bar baz

func CompressDict

func CompressDict(dst, src []byte, cd *CDict) []byte

CompressDict appends compressed src to dst and returns the result.

The given dictionary is used for the compression.

func CompressLevel

func CompressLevel(dst, src []byte, compressionLevel int) []byte

CompressLevel appends compressed src to dst and returns the result.

The given compressionLevel is used for the compression.

func Decompress

func Decompress(dst, src []byte) ([]byte, error)

Decompress appends decompressed src to dst and returns the result.

Example (NoAllocs)
data := []byte("foo bar baz")

compressedData := Compress(nil, data)

// Decompressed data will be put into dbuf.
var dbuf []byte

for i := 0; i < 5; i++ {
	// Decompress re-uses dbuf for the decompressed data.
	var err error
	dbuf, err = Decompress(dbuf[:0], compressedData)
	if err != nil {
		log.Fatalf("cannot decompress data: %s", err)
	}

	fmt.Printf("%d. %s\n", i, dbuf)
}
Output:

0. foo bar baz
1. foo bar baz
2. foo bar baz
3. foo bar baz
4. foo bar baz
Example (Simple)
data := []byte("foo bar baz")

// Compress and decompress data into new buffers.
compressedData := Compress(nil, data)
decompressedData, err := Decompress(nil, compressedData)
if err != nil {
	log.Fatalf("cannot decompress data: %s", err)
}

fmt.Printf("%s", decompressedData)
Output:

foo bar baz

func DecompressDict

func DecompressDict(dst, src []byte, dd *DDict) ([]byte, error)

DecompressDict appends decompressed src to dst and returns the result.

The given dictionary dd is used for the decompression.

func StreamCompress

func StreamCompress(dst io.Writer, src io.Reader) error

StreamCompress compresses src into dst.

This function doesn't work with interactive network streams, since data read from src may be buffered before passing to dst for performance reasons. Use Writer.Flush for interactive network streams.

func StreamCompressDict

func StreamCompressDict(dst io.Writer, src io.Reader, cd *CDict) error

StreamCompressDict compresses src into dst using the given dict cd.

This function doesn't work with interactive network streams, since data read from src may be buffered before passing to dst for performance reasons. Use Writer.Flush for interactive network streams.

func StreamCompressLevel

func StreamCompressLevel(dst io.Writer, src io.Reader, compressionLevel int) error

StreamCompressLevel compresses src into dst using the given compressionLevel.

This function doesn't work with interactive network streams, since data read from src may be buffered before passing to dst for performance reasons. Use Writer.Flush for interactive network streams.

func StreamDecompress

func StreamDecompress(dst io.Writer, src io.Reader) error

StreamDecompress decompresses src into dst.

This function doesn't work with interactive network streams, since data read from src may be buffered before passing to dst for performance reasons. Use Reader for interactive network streams.

func StreamDecompressDict

func StreamDecompressDict(dst io.Writer, src io.Reader, dd *DDict) error

StreamDecompressDict decompresses src into dst using the given dictionary dd.

This function doesn't work with interactive network streams, since data read from src may be buffered before passing to dst for performance reasons. Use Reader for interactive network streams.

Types

type CDict

type CDict struct {
	// contains filtered or unexported fields
}

CDict is a dictionary used for compression.

A single CDict may be re-used in concurrently running goroutines.

func NewCDict

func NewCDict(dict []byte) (*CDict, error)

NewCDict creates new CDict from the given dict.

Call Release when the returned dict is no longer used.

func NewCDictLevel

func NewCDictLevel(dict []byte, compressionLevel int) (*CDict, error)

NewCDictLevel creates new CDict from the given dict using the given compressionLevel.

Call Release when the returned dict is no longer used.

func (*CDict) Release

func (cd *CDict) Release()

Release releases resources occupied by cd.

cd cannot be used after the release.

type DDict

type DDict struct {
	// contains filtered or unexported fields
}

DDict is a dictionary used for decompression.

A single DDict may be re-used in concurrently running goroutines.

func NewDDict

func NewDDict(dict []byte) (*DDict, error)

NewDDict creates new DDict from the given dict.

Call Release when the returned dict is no longer needed.

func (*DDict) Release

func (dd *DDict) Release()

Release releases resources occupied by dd.

dd cannot be used after the release.

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader implements zstd reader.

Example
// Compress the data.
compressedData := Compress(nil, []byte("line 0\nline 1\nline 2"))

// Read it via Reader.
r := bytes.NewReader(compressedData)
zr := NewReader(r)
defer zr.Release()

var a []int
for i := 0; i < 3; i++ {
	var n int
	if _, err := fmt.Fscanf(zr, "line %d\n", &n); err != nil {
		log.Fatalf("cannot read line: %s", err)
	}
	a = append(a, n)
}

// Make sure there are no data left in zr.
buf := make([]byte, 1)
if _, err := zr.Read(buf); err != io.EOF {
	log.Fatalf("unexpected error; got %v; want %v", err, io.EOF)
}

fmt.Println(a)
Output:

[0 1 2]

func NewReader

func NewReader(r io.Reader) *Reader

NewReader returns new zstd reader reading compressed data from r.

Call Release when the Reader is no longer needed.

func NewReaderDict

func NewReaderDict(r io.Reader, dd *DDict) *Reader

NewReaderDict returns new zstd reader reading compressed data from r using the given DDict.

Call Release when the Reader is no longer needed.

func (*Reader) Read

func (zr *Reader) Read(p []byte) (int, error)

Read reads up to len(p) bytes from zr to p.

func (*Reader) Release

func (zr *Reader) Release()

Release releases all the resources occupied by zr.

zr cannot be used after the release.

func (*Reader) Reset

func (zr *Reader) Reset(r io.Reader, dd *DDict)

Reset resets zr to read from r using the given dictionary dd.

Example
zr := NewReader(nil)
defer zr.Release()

// Read from different sources using the same Reader.
for i := 0; i < 3; i++ {
	compressedData := Compress(nil, []byte(fmt.Sprintf("line %d", i)))
	r := bytes.NewReader(compressedData)
	zr.Reset(r, nil)

	data, err := ioutil.ReadAll(zr)
	if err != nil {
		log.Fatalf("unexpected error when reading compressed data: %s", err)
	}
	fmt.Printf("%s\n", data)
}
Output:

line 0
line 1
line 2

func (*Reader) WriteTo

func (zr *Reader) WriteTo(w io.Writer) (int64, error)

WriteTo writes all the data from zr to w.

It returns the number of bytes written to w.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer implements zstd writer.

Example
// Compress data to bb.
var bb bytes.Buffer
zw := NewWriter(&bb)
defer zw.Release()

for i := 0; i < 3; i++ {
	fmt.Fprintf(zw, "line %d\n", i)
}
if err := zw.Close(); err != nil {
	log.Fatalf("cannot close writer: %s", err)
}

// Decompress the data and verify it is valid.
plainData, err := Decompress(nil, bb.Bytes())
fmt.Printf("err: %v\n%s", err, plainData)
Output:

err: <nil>
line 0
line 1
line 2

func NewWriter

func NewWriter(w io.Writer) *Writer

NewWriter returns new zstd writer writing compressed data to w.

The returned writer must be closed with Close call in order to finalize the compressed stream.

Call Release when the Writer is no longer needed.

func NewWriterDict

func NewWriterDict(w io.Writer, cd *CDict) *Writer

NewWriterDict returns new zstd writer writing compressed data to w using the given cd.

The returned writer must be closed with Close call in order to finalize the compressed stream.

Call Release when the Writer is no longer needed.

func NewWriterLevel

func NewWriterLevel(w io.Writer, compressionLevel int) *Writer

NewWriterLevel returns new zstd writer writing compressed data to w at the given compression level.

The returned writer must be closed with Close call in order to finalize the compressed stream.

Call Release when the Writer is no longer needed.

func NewWriterParams

func NewWriterParams(w io.Writer, params *WriterParams) *Writer

NewWriterParams returns new zstd writer writing compressed data to w using the given set of parameters.

The returned writer must be closed with Close call in order to finalize the compressed stream.

Call Release when the Writer is no longer needed.

func (*Writer) Close

func (zw *Writer) Close() error

Close finalizes the compressed stream and flushes all the compressed data to the underlying writer.

It doesn't close the underlying writer passed to New* functions.

func (*Writer) Flush

func (zw *Writer) Flush() error

Flush flushes the remaining data from zw to the underlying writer.

Example
var bb bytes.Buffer
zw := NewWriter(&bb)
defer zw.Release()

// Write some data to zw.
data := []byte("some data\nto compress")
if _, err := zw.Write(data); err != nil {
	log.Fatalf("cannot write data to zw: %s", err)
}

// Verify the data is cached in zw and isn't propagated to bb.
if bb.Len() > 0 {
	log.Fatalf("%d bytes unexpectedly propagated to bb", bb.Len())
}

// Flush the compressed data to bb.
if err := zw.Flush(); err != nil {
	log.Fatalf("cannot flush compressed data: %s", err)
}

// Verify the compressed data is propagated to bb.
if bb.Len() == 0 {
	log.Fatalf("the compressed data isn't propagated to bb")
}

// Try reading the compressed data with reader.
zr := NewReader(&bb)
defer zr.Release()
buf := make([]byte, len(data))
if _, err := io.ReadFull(zr, buf); err != nil {
	log.Fatalf("cannot read the compressed data: %s", err)
}
fmt.Printf("%s", buf)
Output:

some data
to compress

func (*Writer) ReadFrom

func (zw *Writer) ReadFrom(r io.Reader) (int64, error)

ReadFrom reads all the data from r and writes it to zw.

Returns the number of bytes read from r.

ReadFrom may not flush the compressed data to the underlying writer due to performance reasons. Call Flush or Close when the compressed data must propagate to the underlying writer.

func (*Writer) Release

func (zw *Writer) Release()

Release releases all the resources occupied by zw.

zw cannot be used after the release.

func (*Writer) Reset

func (zw *Writer) Reset(w io.Writer, cd *CDict, compressionLevel int)

Reset resets zw to write to w using the given dictionary cd and the given compressionLevel. Use ResetWriterParams if you wish to change other parameters that were set via WriterParams.

Example
zw := NewWriter(nil)
defer zw.Release()

// Write to different destinations using the same Writer.
for i := 0; i < 3; i++ {
	var bb bytes.Buffer
	zw.Reset(&bb, nil, DefaultCompressionLevel)
	if _, err := zw.Write([]byte(fmt.Sprintf("line %d", i))); err != nil {
		log.Fatalf("unexpected error when writing data: %s", err)
	}
	if err := zw.Close(); err != nil {
		log.Fatalf("unexpected error when closing zw: %s", err)
	}

	// Decompress the compressed data.
	plainData, err := Decompress(nil, bb.Bytes())
	if err != nil {
		log.Fatalf("unexpected error when decompressing data: %s", err)
	}
	fmt.Printf("%s\n", plainData)
}
Output:

line 0
line 1
line 2

func (*Writer) ResetWriterParams

func (zw *Writer) ResetWriterParams(w io.Writer, params *WriterParams)

ResetWriterParams resets zw to write to w using the given set of parameters.

func (*Writer) Write

func (zw *Writer) Write(p []byte) (int, error)

Write writes p to zw.

Write doesn't flush the compressed data to the underlying writer due to performance reasons. Call Flush or Close when the compressed data must propagate to the underlying writer.

type WriterParams

type WriterParams struct {
	// Compression level. Special value 0 means 'default compression level'.
	CompressionLevel int

	// WindowLog. Must be clamped between WindowLogMin and WindowLogMax32/64.
	// Special value 0 means 'use default windowLog'.
	//
	// Note: enabling long distance matching increases memory usage for both
	// compressor and decompressor. When set to a value greater than 27, the
	// decompressor requires special treatment.
	WindowLog int

	// Dict is optional dictionary used for compression.
	Dict *CDict
}

A WriterParams allows users to specify compression parameters by calling NewWriterParams.

Calling NewWriterParams with a nil WriterParams is equivalent to calling NewWriter.

Example
// Compress data to bb.
var bb bytes.Buffer
zw := NewWriterParams(&bb, &WriterParams{
	CompressionLevel: 10,
	WindowLog:        14,
})
defer zw.Release()

for i := 0; i < 3; i++ {
	fmt.Fprintf(zw, "line %d\n", i)
}
if err := zw.Close(); err != nil {
	log.Fatalf("cannot close writer: %s", err)
}

// Decompress the data and verify it is valid.
plainData, err := Decompress(nil, bb.Bytes())
fmt.Printf("err: %v\n%s", err, plainData)
Output:

err: <nil>
line 0
line 1
line 2
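
Since WindowLog must lie between WindowLogMin and the architecture-specific maximum, a caller may want to clamp user-supplied values before filling WriterParams. The sketch below uses only the standard library; clampWindowLog is a hypothetical helper, not part of gozstd, and the constants are copied from the Constants section so that the block compiles on its own.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Values copied from the Constants section of this documentation.
const (
	windowLogMin   = 10
	windowLogMax32 = 30
	windowLogMax64 = 31
)

// clampWindowLog is a hypothetical helper, not part of gozstd.
// It keeps 0 (the 'use default windowLog' value) unchanged and clamps any
// other value into the valid range for the current architecture.
func clampWindowLog(windowLog int) int {
	if windowLog == 0 {
		return 0
	}

	// Pick the upper bound based on the native word size.
	upper := windowLogMax64
	if bits.UintSize == 32 {
		upper = windowLogMax32
	}

	if windowLog < windowLogMin {
		return windowLogMin
	}
	if windowLog > upper {
		return upper
	}
	return windowLog
}

func main() {
	for _, wl := range []int{0, 5, 14, 40} {
		fmt.Printf("%d -> %d\n", wl, clampWindowLog(wl))
	}
}
```

The result could then be assigned to WriterParams.WindowLog; when importing the package, its exported WindowLogMin, WindowLogMax32 and WindowLogMax64 constants would replace the local copies.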
