fastqpacker module v0.8.0 (Latest)
Published: Feb 12, 2026 License: MIT

README

FastQPacker

The fastest FASTQ compressor available, with better compression than gzip/pigz/zstd while being 10-80x faster. Specialized tools like DSRC or Spring compress smaller, but are 2-6x slower.

Pre-built binaries are available for macOS and Linux (both ARM and x86_64). Each is a single binary with no dependencies.

Benchmarks

Tested on ERR532393_1 (8.9GB Illumina reads), M4 MacBook Pro:

Tool       Size       Ratio   Compress time   Decompress time   Compress speed
fqpack     2,961 MB   3.25x   3.24s           2.95s             2,967.3 MB/s
DSRC       2,150 MB   4.1x    12s             18s               742 MB/s
zstd       3,312 MB   2.7x    11s             8s                809 MB/s
pigz       3,278 MB   2.7x    79s             12s               113 MB/s
repaq      5,732 MB   1.6x    80s             27s               111 MB/s
repaq+xz   2,761 MB   3.2x    388s            40s               23 MB/s
7z         2,584 MB   3.4x    1,442s          83s               6 MB/s

fqpack is 14% smaller than pigz with 24x faster compression and 4x faster decompression. DSRC compresses 24% smaller but is 3.6x slower to compress and 6x slower to decompress. FQSqueezer achieves the best known compression (1,511 MB, 5.9x ratio) but is ~100x slower.

To re-run the fqpack-only 9 GB benchmark:

./scripts/benchmark_fqpack_9gb.sh 3

Installation

curl -fsSL https://raw.githubusercontent.com/vertti/fastqpacker/main/install.sh | sh

Or with Go:

go install github.com/vertti/fastqpacker/cmd/fqpack@latest

Usage

# Compress
fqpack -i reads.fq -o reads.fqz

# Compress from gzipped FASTQ
fqpack -i reads.fastq.gz -o reads.fqz

# Decompress
fqpack -d -i reads.fqz -o reads.fq

# Stdin/stdout (Unix pipes)
cat reads.fq | fqpack -c > reads.fqz
fqpack -d < reads.fqz > reads.fq

# Control parallelism
fqpack -w 4 -i reads.fq -o reads.fqz

How It Works

  • 2-bit sequence encoding: ACGT packed 4 bases per byte (N positions stored separately)
  • Delta-encoded quality scores: Adjacent quality scores are similar, deltas compress well
  • zstd compression: Modern entropy coding beats gzip's DEFLATE
  • Parallel block processing: Scales across all CPU cores
  • Built-in integrity verification: CRC32 checksums detect corruption on decompress
  • Auto-detected quality encoding: Phred+33 and Phred+64 handled transparently
  • Lossless FASTQ record preservation: Any optional text after the + on line 3 (+...) is preserved
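
The 2-bit sequence encoding listed above can be sketched in Go as follows. This is a minimal illustration, not fqpack's actual code: the function names PackSeq and UnpackSeq, the A/C/G/T code assignment, and the bit order within a byte are all assumptions made for the example. Every non-ACGT base is treated as N here, which is a simplification.

```go
package main

import "fmt"

// code2bit maps a base to a 2-bit code. The A=0, C=1, G=2, T=3 ordering is
// an assumption for this sketch, not necessarily the mapping fqpack uses.
func code2bit(b byte) (byte, bool) {
	switch b {
	case 'A':
		return 0, true
	case 'C':
		return 1, true
	case 'G':
		return 2, true
	case 'T':
		return 3, true
	}
	return 0, false
}

// PackSeq packs a sequence 4 bases per byte, first base in the high bits.
// Non-ACGT bases are stored as code 0 and their positions are returned
// separately so they can be restored on decode.
func PackSeq(seq []byte) (packed []byte, nPos []int) {
	packed = make([]byte, (len(seq)+3)/4)
	for i, b := range seq {
		code, ok := code2bit(b)
		if !ok {
			nPos = append(nPos, i)
		}
		shift := uint(6 - 2*(i%4))
		packed[i/4] |= code << shift
	}
	return packed, nPos
}

// UnpackSeq reverses PackSeq for a sequence of n bases, writing 'N' at
// every recorded position.
func UnpackSeq(packed []byte, n int, nPos []int) []byte {
	const alphabet = "ACGT"
	seq := make([]byte, n)
	for i := range seq {
		shift := uint(6 - 2*(i%4))
		seq[i] = alphabet[(packed[i/4]>>shift)&3]
	}
	for _, p := range nPos {
		seq[p] = 'N'
	}
	return seq
}

func main() {
	seq := []byte("ACGTNACG")
	packed, nPos := PackSeq(seq)
	fmt.Printf("%d bases -> %d bytes, N positions %v\n", len(seq), len(packed), nPos)
	fmt.Println(string(UnpackSeq(packed, len(seq), nPos)))
}
```

In this sketch an 8-base read occupies 2 bytes plus a short list of N positions, which is where the roughly 4x reduction for the sequence stream comes from before any general-purpose compression is applied.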

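The delta encoding of quality scores can be illustrated with a similarly small Go sketch. The names DeltaEncode and DeltaDecode are hypothetical, and the details (byte-wise differences with wraparound, starting from 0) are assumptions; the README does not say whether fqpack deltas per read or per block, or how it represents negative differences.

```go
package main

import "fmt"

// DeltaEncode replaces each quality byte with its difference from the
// previous byte (the first byte is taken relative to 0). Byte arithmetic
// wraps modulo 256, so the transform is exactly reversible.
func DeltaEncode(qual []byte) []byte {
	out := make([]byte, len(qual))
	var prev byte
	for i, q := range qual {
		out[i] = q - prev
		prev = q
	}
	return out
}

// DeltaDecode reverses DeltaEncode by accumulating the differences.
func DeltaDecode(deltas []byte) []byte {
	out := make([]byte, len(deltas))
	var prev byte
	for i, d := range deltas {
		prev += d
		out[i] = prev
	}
	return out
}

func main() {
	qual := []byte("IIIIHHHGGF")
	deltas := DeltaEncode(qual)
	fmt.Println(deltas)
	fmt.Println(string(DeltaDecode(deltas)))
}
```

For a slowly decaying quality string like the one above, the delta stream is dominated by zeros and a few values near 0 or 255, which gives a downstream compressor such as zstd a far more skewed symbol distribution to work with than the raw scores.
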
Limitations

  • Illumina 4-line FASTQ format only (no multi-line sequences)
  • No streaming decompression (full block buffering)

License

MIT

Directories

Path                 Synopsis
cmd/fqpack           (command) fqpack compresses and decompresses FASTQ files.
cmd/fqscramble       (command) fqscramble scrambles FASTQ files to remove identifiable sequence information while preserving realistic characteristics for benchmarking.
internal/compress    Package compress provides FASTQ compression and decompression.
internal/encoder     Package encoder provides encoding functions for FASTQ components.
internal/fqformat    Package fqformat defines the FQZ file format for compressed FASTQ data.
internal/fqparser    Package fqparser provides fast FASTQ file parsing.