encoding

package
v0.2.0
Published: Jun 26, 2026 License: MIT Imports: 3 Imported by: 0

Documentation

Overview

Package encoding implements the per-page value encodings that run before the block codec: the encode stage of the page pipeline pinned in the format spec. These encodings are central to how tatami keeps pages compact. Each one removes a kind of structured redundancy (narrow ranges, runs, monotonic steps) that a general block compressor would otherwise have to rediscover byte by byte.
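As an illustration of the "monotonic steps" case, here is a minimal sketch of delta coding on a []uint64. It is not the package's DELTA implementation or wire format; deltaEncode and deltaDecode are hypothetical names, and the choice to keep the first value unchanged is an assumption made for the example.

```go
package main

// deltaEncode replaces each value with its difference from the previous one.
// The first value is kept as-is (its predecessor is taken to be 0).
// Subtraction wraps modulo 2^64, so decoding is exact for any input.
// Hypothetical helper for illustration, not part of the package API.
func deltaEncode(vs []uint64) []uint64 {
	out := make([]uint64, len(vs))
	var prev uint64
	for i, v := range vs {
		out[i] = v - prev
		prev = v
	}
	return out
}

// deltaDecode reverses deltaEncode by keeping a running sum.
func deltaDecode(ds []uint64) []uint64 {
	out := make([]uint64, len(ds))
	var prev uint64
	for i, d := range ds {
		prev += d
		out[i] = prev
	}
	return out
}
```

For a slowly rising sequence such as 1000, 1001, 1003, 1006 the deltas after the first value are 1, 2, 3, which need far fewer bits than the original values.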

The package is deliberately physical and self-contained. It does not import the root tatami package and knows nothing of logical types or nulls. Integer encodings work on a dense []uint64 of present values (the caller widens its typed slice and handles nulls one level up), the bitmap encoding works on a []bool, and the FOR base is chosen in signed or unsigned order as the caller asks. Every encoder round-trips: Decode(Encode(vs)) == vs for any input, including the empty slice.
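The sketch below shows why the signed flag can matter for a frame-of-reference (FOR) base. It assumes the caller widens signed values by sign extension, which the package documentation does not specify; widen and forWidth are hypothetical helpers, not package functions.

```go
package main

import "math/bits"

// widen sign-extends int32 values into the dense []uint64 form.
// Assumption: sign extension is how a caller would widen signed values.
func widen(vs []int32) []uint64 {
	out := make([]uint64, len(vs))
	for i, v := range vs {
		out[i] = uint64(int64(v))
	}
	return out
}

// forWidth returns the minimum of vs in the requested order (the FOR base)
// and the number of bits needed to store v-base for every v in vs.
func forWidth(vs []uint64, signed bool) (base uint64, width int) {
	if len(vs) == 0 {
		return 0, 0
	}
	less := func(a, b uint64) bool {
		if signed {
			return int64(a) < int64(b)
		}
		return a < b
	}
	lo, hi := vs[0], vs[0]
	for _, v := range vs[1:] {
		if less(v, lo) {
			lo = v
		}
		if less(hi, v) {
			hi = v
		}
	}
	// hi-lo wraps modulo 2^64, which yields the correct span in either order.
	return lo, bits.Len64(hi - lo)
}
```

With the values -1, 0, 1, signed order gives a base of -1 and a 2-bit width, while unsigned order sees -1 as the largest uint64 and needs all 64 bits.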

Functions

func DecodeBitmap

func DecodeBitmap(src []byte, out []bool) error

DecodeBitmap reads len(out) bools packed by EncodeBitmap.

func DecodeInts

func DecodeInts(id ID, src []byte, out []uint64, signed bool) error

DecodeInts decodes len(out) values from src into out, interpreting src as a payload of encoding id.

func EncodeBitmap

func EncodeBitmap(dst []byte, vs []bool) []byte

EncodeBitmap packs bools into ceil(n/8) bytes, where n is len(vs): vs[i] is stored in bit i%8 of byte i/8, least significant bit first. This is the natural encoding for a dense boolean column: one bit per value, which zstd then squeezes further when the column is mostly one value.
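The bit layout described above can be sketched as follows. This is an illustration only, not the package source: packBits and unpackBits are hypothetical names, and the append-to-dst behavior and the short-payload error are assumptions inferred from the signatures.

```go
package main

import "errors"

// packBits appends ceil(len(vs)/8) bytes to dst. vs[i] lands in bit i%8 of
// byte i/8, least significant bit first. Unused high bits of the final byte
// are left as zero.
func packBits(dst []byte, vs []bool) []byte {
	start := len(dst)
	dst = append(dst, make([]byte, (len(vs)+7)/8)...)
	for i, v := range vs {
		if v {
			dst[start+i/8] |= 1 << (i % 8)
		}
	}
	return dst
}

// unpackBits reads len(out) bools from src using the same layout.
// It returns an error if src holds fewer than ceil(len(out)/8) bytes.
func unpackBits(src []byte, out []bool) error {
	if len(src) < (len(out)+7)/8 {
		return errors.New("bitmap: payload too short")
	}
	for i := range out {
		out[i] = src[i/8]&(1<<(i%8)) != 0
	}
	return nil
}
```

Nine values therefore occupy two bytes: the first eight fill byte 0 from bit 0 upward, and the ninth sits in bit 0 of byte 1.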

func EncodeInts

func EncodeInts(id ID, dst []byte, vs []uint64, signed bool) ([]byte, bool)

EncodeInts encodes vs with the given encoding and returns the payload and whether the encoding applies. GROUPVARINT and PFORDELTA only apply when every value fits in 32 bits, and PFORDELTA additionally needs at most 256 values so its one-byte exception positions stay in range; the sampler skips them otherwise. BITPACK_FOR, DELTA, and RLE apply to any input.
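The applicability rules above can be sketched as plain predicates. These helpers are hypothetical and are not part of the package; the sketch assumes "fits in 32 bits" means the widened value is at most math.MaxUint32.

```go
package main

import "math"

// maxPForValues is the largest input whose positions 0..255 each fit in
// one byte, matching the "at most 256 values" rule.
const maxPForValues = 256

// fitsIn32 reports whether every value is representable in 32 bits.
// Assumption: the check is on the unsigned widened value.
func fitsIn32(vs []uint64) bool {
	for _, v := range vs {
		if v > math.MaxUint32 {
			return false
		}
	}
	return true
}

// groupVarintApplies mirrors the stated GROUPVARINT precondition.
func groupVarintApplies(vs []uint64) bool {
	return fitsIn32(vs)
}

// pforDeltaApplies mirrors the stated PFORDELTA preconditions:
// 32-bit values and no more than 256 of them.
func pforDeltaApplies(vs []uint64) bool {
	return len(vs) <= maxPForValues && fitsIn32(vs)
}
```

A caller that holds such predicates could avoid an EncodeInts call that would return false, although checking the returned bool is sufficient on its own.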

Types

type ID

type ID uint8

ID is the encoding enum recorded in a page header. The values are pinned by the format canon and must never be renumbered; they match tatami.Encoding.

const (
	Plain       ID = 0
	RLE         ID = 1
	Dictionary  ID = 2
	BitpackFOR  ID = 3
	Delta       ID = 4
	GroupVarint ID = 5
	PForDelta   ID = 6
	FSST        ID = 7
	Bitmap      ID = 8
)
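
Because the ID is a single byte read from a page header, a reader may want to reject values outside the pinned range before dispatching on it. The following is a minimal sketch of that check; parseID is a hypothetical helper, not a package function, and the constants are repeated only so the example compiles on its own.

```go
package main

import "fmt"

// ID and its constants are copied from the listing above so that this
// example is self-contained.
type ID uint8

const (
	Plain       ID = 0
	RLE         ID = 1
	Dictionary  ID = 2
	BitpackFOR  ID = 3
	Delta       ID = 4
	GroupVarint ID = 5
	PForDelta   ID = 6
	FSST        ID = 7
	Bitmap      ID = 8
)

// parseID converts a raw header byte into an ID and rejects any value
// beyond the last pinned constant. Hypothetical helper for illustration.
func parseID(b byte) (ID, error) {
	if id := ID(b); id <= Bitmap {
		return id, nil
	}
	return 0, fmt.Errorf("encoding: unknown ID %d", b)
}
```

Since the values must never be renumbered, a range check like this stays valid until a new constant is appended.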
