Documentation ¶
Overview ¶
Package table holds Prism's columnar in-memory table type (`Table`), shared by every DAG node from Source through Encode. Columns are typed by a small `Kind` enum that buckets Pulse's 13 storage types down to five categories sufficient for spec validation and rendering.
D015 establishes that Table is materialised (not streaming); D016 establishes that storage is columnar with deferred bit-packing. D024 (queued for P02) records why `table/` is its own package rather than nested under `compile/` or `plan/`.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
Types ¶
type Column ¶
type Column interface {
	// Kind reports the Prism Kind bucket this column belongs to.
	Kind() Kind
	// Len returns the row count.
	Len() int
	// ValueAt returns the i-th value as an interface{} (any). Returns
	// nil if the column carries an explicit null sentinel at i.
	ValueAt(i int) any
	// IsNull reports whether row i carries an explicit null marker.
	// Implementations that don't track nullability return false for
	// every i (the safe backward-compatible answer).
	IsNull(i int) bool
	// NullCount returns the number of nulls in the column. Plain
	// slice-backed columns return 0; nullable wrappers consult their
	// bitmap. -1 is reserved for future implementations that defer
	// the count until requested.
	NullCount() int
}
Column is the read interface for one materialised column. Concrete impls are typed Go slices (no bit-packing in v1, per D016).
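As an illustration of the read interface, here is a minimal sketch of a plain slice-backed column and a consumer that walks it. The Kind and Column declarations are local stand-ins that mirror the shapes documented here, and sumInts is a hypothetical helper, not part of the package.

```go
package main

// Kind and Column are local stand-ins mirroring the documented shapes;
// they are not imported from the real package.
type Kind int

const (
	KindUnknown Kind = iota
	KindInt
	KindFloat
	KindString
	KindBool
	KindDate
)

type Column interface {
	Kind() Kind
	Len() int
	ValueAt(i int) any
	IsNull(i int) bool
	NullCount() int
}

// IntColumn is a plain slice-backed column with no null tracking,
// so IsNull is always false and NullCount is always 0.
type IntColumn []int64

func (c IntColumn) Kind() Kind        { return KindInt }
func (c IntColumn) Len() int          { return len(c) }
func (c IntColumn) ValueAt(i int) any { return c[i] }
func (c IntColumn) IsNull(int) bool   { return false }
func (c IntColumn) NullCount() int    { return 0 }

// sumInts (hypothetical) reads any Column through the interface and
// adds up its int64 values, skipping rows marked null.
func sumInts(col Column) int64 {
	var total int64
	for i := 0; i < col.Len(); i++ {
		if col.IsNull(i) {
			continue
		}
		if v, ok := col.ValueAt(i).(int64); ok {
			total += v
		}
	}
	return total
}
```

Because the consumer depends only on the interface, the same loop would work unchanged over a nullable wrapper.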
type DateColumn ¶
type DateColumn []int64
DateColumn is the storage for KindDate columns. Values are stored as int64 "days since epoch" using Pulse's wire format; scales/encoders convert to time.Time at the render boundary.
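The render-boundary conversion could look like the sketch below. It assumes the epoch is the Unix epoch (1970-01-01 UTC), which this page does not state explicitly; daysToTime is a hypothetical helper name.

```go
package main

import "time"

// daysToTime converts a days-since-epoch count into a UTC time.Time at
// midnight. Assumption: the epoch is the Unix epoch (1970-01-01).
func daysToTime(days int64) time.Time {
	const secondsPerDay = 86400
	return time.Unix(days*secondsPerDay, 0).UTC()
}
```

Negative day counts map to dates before the epoch, so no special casing is needed.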
type IntColumn ¶
type IntColumn []int64
IntColumn is the storage for KindInt columns. int64 is wide enough to hold u64 values up to math.MaxInt64; values above that are truncated at decode time and a warning surfaces in Source telemetry.
type Kind ¶
type Kind int
Kind buckets Pulse storage types into Prism's columnar categories. The mapping is intentionally lossy: every Pulse type folds into exactly one Kind, so downstream code (scales, encodings, format strings) reasons about one of five shapes instead of thirteen.
const (
	// KindUnknown is the zero value; a Column must never report it.
	KindUnknown Kind = iota
	// KindInt covers unsigned + bit-packed integer Pulse types.
	KindInt
	// KindFloat covers f32, f64, and decimal128 variants.
	KindFloat
	// KindString covers categorical types (rendered by dictionary).
	KindString
	// KindBool covers packed_bool. Null state lives in the per-record
	// null bitmap when Field.Nullable is set.
	KindBool
	// KindDate covers the date Pulse type (days-since-epoch).
	KindDate
)
func KindFromPulseFieldType ¶
KindFromPulseFieldType folds the 13 Pulse FieldType variants into the five Prism Kinds. Decimal types route to KindFloat in v1 because we surface them as numeric scalars at the encoding layer; revisit when a dedicated decimal renderer lands. Nullability is orthogonal to Kind — callers consult Field.Nullable separately.
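The folding can be pictured with a string-keyed sketch. This is not the real signature, which is not shown on this page: kindFromToken is hypothetical, and it lists only the type names that appear in this documentation ("u64" is assumed to be a valid token), not all 13 variants.

```go
package main

// Kind is a local stand-in for the documented enum.
type Kind int

const (
	KindUnknown Kind = iota
	KindInt
	KindFloat
	KindString
	KindBool
	KindDate
)

// kindFromToken is a hypothetical, string-keyed stand-in for
// KindFromPulseFieldType. It covers only type names mentioned on this
// page; anything else falls through to KindUnknown in this sketch.
func kindFromToken(token string) Kind {
	switch token {
	case "u64":
		return KindInt
	case "f32", "f64", "decimal128":
		// Decimals are surfaced as numeric scalars in v1.
		return KindFloat
	case "categorical_u8":
		return KindString
	case "packed_bool":
		return KindBool
	case "date":
		return KindDate
	default:
		return KindUnknown
	}
}
```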
type NullBitmap ¶
type NullBitmap struct {
	// contains filtered or unexported fields
}
NullBitmap is a packed bit set tracking which positions in a column carry an explicit null marker. Bits are LSB-first within each word. A nil *NullBitmap is treated as "no nulls" — callers should compare against nil before touching the methods.
The bitmap is grown lazily by Set; explicit pre-allocation via NewNullBitmap is preferred when the upper bound is known so the hash-join writer doesn't churn allocations as it appends nulls one row at a time.
func NewNullBitmap ¶
func NewNullBitmap(n int) *NullBitmap
NewNullBitmap returns a bitmap pre-sized for n positions. n may be zero; the slice grows on demand.
func (*NullBitmap) Capacity ¶
func (b *NullBitmap) Capacity() int
Capacity returns the high-water mark of positions ever set or the pre-allocated size — whichever is larger.
func (*NullBitmap) Count ¶
func (b *NullBitmap) Count() int
Count returns the number of bits set in the bitmap.
func (*NullBitmap) IsNull ¶
func (b *NullBitmap) IsNull(i int) bool
IsNull reports whether position i carries a null marker.
func (*NullBitmap) Set ¶
func (b *NullBitmap) Set(i int)
Set marks position i as null. Out-of-range positions extend the backing slice. Calling Set twice on the same position is a no-op for Count.
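A packed, LSB-first bit set with lazy growth can be sketched as follows. The bitmap type and its words field are assumptions for illustration (the real fields are unexported), and this sketch leaves the nil check to the caller, as advised above.

```go
package main

import "math/bits"

// bitmap is a hypothetical stand-in for NullBitmap. Position i lives in
// word i/64 at bit i%64, counting from the least significant bit.
type bitmap struct {
	words []uint64
}

// newBitmap pre-sizes the backing slice for n positions.
func newBitmap(n int) *bitmap {
	return &bitmap{words: make([]uint64, (n+63)/64)}
}

// Set marks position i, extending the backing slice when i is past
// the end. i must be non-negative.
func (b *bitmap) Set(i int) {
	w := i / 64
	for w >= len(b.words) {
		b.words = append(b.words, 0)
	}
	b.words[w] |= uint64(1) << (uint(i) % 64)
}

// IsNull reports whether position i is marked. Positions beyond the
// backing slice were never set, so they report false.
func (b *bitmap) IsNull(i int) bool {
	w := i / 64
	if i < 0 || w >= len(b.words) {
		return false
	}
	return b.words[w]&(uint64(1)<<(uint(i)%64)) != 0
}

// Count returns the number of set bits via a per-word popcount.
func (b *bitmap) Count() int {
	n := 0
	for _, w := range b.words {
		n += bits.OnesCount64(w)
	}
	return n
}
```

Because Set uses a bitwise OR, marking the same position twice leaves Count unchanged.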
type NullableColumn ¶
type NullableColumn struct {
	Inner Column
	Nulls *NullBitmap
}
NullableColumn wraps any Column with an optional null bitmap. When the bitmap is nil, the wrapper is transparent. When set, ValueAt returns nil for marked positions and IsNull / NullCount consult the bitmap.
The plan-stage hash join allocates a NullableColumn per output column that may receive unmatched rows; the resolver and inline loader wrap nullable Pulse fields the same way.
func (NullableColumn) Kind ¶
func (n NullableColumn) Kind() Kind
Kind implements Column by delegating to Inner.
func (NullableColumn) NullCount ¶
func (n NullableColumn) NullCount() int
NullCount implements Column.
func (NullableColumn) Unwrap ¶
func (n NullableColumn) Unwrap() Column
Unwrap returns the underlying column. Callers that need to feed a slice-backed reader (e.g. scale extent computation) can read the inner column directly, then consult IsNull per row.
func (NullableColumn) ValueAt ¶
func (n NullableColumn) ValueAt(i int) any
ValueAt returns nil for null rows; otherwise delegates to Inner.
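The wrapper behaviour and the Unwrap pattern can be sketched together. Everything below is a local stand-in: the Column interface omits Kind for brevity, nullSet (a map) replaces *NullBitmap, and extent is a hypothetical consumer in the spirit of scale extent computation.

```go
package main

// Column is a trimmed local stand-in (Kind omitted for brevity).
type Column interface {
	Len() int
	ValueAt(i int) any
	IsNull(i int) bool
	NullCount() int
}

type floats []float64

func (c floats) Len() int          { return len(c) }
func (c floats) ValueAt(i int) any { return c[i] }
func (c floats) IsNull(int) bool   { return false }
func (c floats) NullCount() int    { return 0 }

// nullSet stands in for *NullBitmap; a nil map means "no bitmap".
type nullSet map[int]bool

type nullable struct {
	Inner Column
	Nulls nullSet
}

func (n nullable) Len() int { return n.Inner.Len() }

// IsNull is transparent without a bitmap, otherwise consults it.
func (n nullable) IsNull(i int) bool {
	if n.Nulls == nil {
		return n.Inner.IsNull(i)
	}
	return n.Nulls[i]
}

// ValueAt returns nil for marked rows and delegates otherwise.
func (n nullable) ValueAt(i int) any {
	if n.IsNull(i) {
		return nil
	}
	return n.Inner.ValueAt(i)
}

func (n nullable) NullCount() int {
	if n.Nulls == nil {
		return n.Inner.NullCount()
	}
	count := 0
	for _, marked := range n.Nulls {
		if marked {
			count++
		}
	}
	return count
}

func (n nullable) Unwrap() Column { return n.Inner }

// extent (hypothetical) reads the inner column directly and consults
// IsNull per row, returning the min and max over non-null rows.
func extent(n nullable) (lo, hi float64, ok bool) {
	inner := n.Unwrap()
	for i := 0; i < inner.Len(); i++ {
		if n.IsNull(i) {
			continue
		}
		v := inner.ValueAt(i).(float64)
		if !ok || v < lo {
			lo = v
		}
		if !ok || v > hi {
			hi = v
		}
		ok = true
	}
	return lo, hi, ok
}
```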
type StringColumn ¶
type StringColumn []string
StringColumn is the storage for KindString columns (categorical values decoded against their Pulse dictionary at materialisation time).
type Table ¶
type Table struct {
	// contains filtered or unexported fields
}
Table is the columnar in-memory result of one DAG node. Tables are immutable once constructed (callers receive aliased columns; never mutate them in place). Hash is computed by the producer (Resolver, SourceNode, inline converter) and propagated as the cache key.
func Filter ¶
Filter returns a new Table holding only the rows for which keep[i] is true. The schema is preserved verbatim (same field order, same types); the row count equals the number of true entries in keep. The returned table's hash is derived from the source table's hash plus a content-addressed suffix carrying the partition tag, so cached tables remain content-addressed.
Filter is the partition primitive used by the encoder's facet fan-out (D054). It is intentionally narrow — no transform semantics, no column projection — so the hot path stays a single columnar walk.
partitionTag is an opaque string the caller uses to differentiate multiple sibling filters of the same parent. The encoder's facet path passes "facet:<rowVal>:<colVal>" so two distinct partitions of the same parent produce different hashes.
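The single columnar walk and the tag-dependent hash can be sketched for one column. Both helpers are hypothetical; in particular the hash layout (parent hash, a colon, then an FNV-1a digest of the tag) is an assumption chosen for illustration, not the package's actual derivation.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// filterRows (hypothetical) copies the rows of one typed column whose
// keep entry is true into a fresh slice. The input is never mutated,
// in line with the immutability rule for Table columns.
func filterRows[T any](col []T, keep []bool) []T {
	out := make([]T, 0, len(col))
	for i, v := range col {
		if i < len(keep) && keep[i] {
			out = append(out, v)
		}
	}
	return out
}

// derivedHash (hypothetical) combines the parent hash with a digest of
// the partition tag so sibling partitions get distinct cache keys.
// FNV-1a is used here only because it ships with the standard library.
func derivedHash(parentHash, partitionTag string) string {
	h := fnv.New64a()
	h.Write([]byte(partitionTag))
	return fmt.Sprintf("%s:%016x", parentHash, h.Sum64())
}
```

A real Filter would apply the same mask to every column of the table, which keeps the schema and field order intact.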
func FromInline ¶
func FromInline(name string, values []map[string]any, fields []spec.FieldSpec) (*Table, *encoding.Schema, error)
FromInline turns inline `data.values` rows (and optional `data.fields` declarations) into a *Table backed by a synthetic *encoding.Schema.
Type resolution:
- If fields is non-empty, every declared field is honoured verbatim; unknown type tokens fall back to KindString (categorical_u8).
- Otherwise the first row's JSON kinds drive inference:
    string                          → categorical_u8 / KindString
    float64 / json.Number           → f64 / KindFloat
    bool                            → packed_bool / KindBool
    other (nil, nested arrays/maps) → categorical_u8 / KindString,
                                      so downstream rules see a usable measure type.
Subsequent rows are validated against the resolved schema; a row whose JSON kind for a given field disagrees with the schema returns PRISM_RESOLVE_INLINE_TYPE_MISMATCH with row index and field name.
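First-row inference followed by per-row validation can be sketched as below. The helper names are hypothetical, and the treatment of missing or nil values in later rows is an assumption of this sketch (they are classified by the same fallback rule), since this page does not specify it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pulseTypeOf (hypothetical) maps one decoded JSON value to the
// storage type token given by the inference rules.
func pulseTypeOf(v any) string {
	switch v.(type) {
	case string:
		return "categorical_u8"
	case float64, json.Number:
		return "f64"
	case bool:
		return "packed_bool"
	default: // nil, nested arrays, nested maps
		return "categorical_u8"
	}
}

// inferTypes resolves a type token for every field of the first row.
func inferTypes(first map[string]any) map[string]string {
	out := make(map[string]string, len(first))
	for name, v := range first {
		out[name] = pulseTypeOf(v)
	}
	return out
}

// checkRows validates every row against the resolved types and reports
// a mismatch with the row index and field name.
func checkRows(rows []map[string]any, types map[string]string) error {
	for i, row := range rows {
		for name, want := range types {
			if got := pulseTypeOf(row[name]); got != want {
				return fmt.Errorf("PRISM_RESOLVE_INLINE_TYPE_MISMATCH: row %d, field %q: got %s, want %s",
					i, name, got, want)
			}
		}
	}
	return nil
}
```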
Hash is xxhash64 over a canonical JSON encoding of values (rows sorted by key per row, fields written in schema declaration order). Identical inputs map to identical hashes regardless of map iteration order.
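One way to get a hash that is independent of map iteration order is to serialise each row's values in field declaration order before hashing. The sketch below follows that reading; canonicalHash is hypothetical, and it substitutes FNV-1a from the standard library for xxhash64, which is not in the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// canonicalHash (hypothetical) writes each row as a JSON array whose
// elements follow the declared field order, then hashes the bytes.
// Map iteration order never reaches the encoder, so equal inputs
// always produce equal hashes. FNV-1a stands in for xxhash64 here.
func canonicalHash(rows []map[string]any, fields []string) (string, error) {
	h := fnv.New64a()
	for _, row := range rows {
		vals := make([]any, len(fields))
		for i, name := range fields {
			vals[i] = row[name] // a missing key encodes as null
		}
		encoded, err := json.Marshal(vals)
		if err != nil {
			return "", err
		}
		h.Write(encoded)
		h.Write([]byte{'\n'}) // row separator
	}
	return fmt.Sprintf("%016x", h.Sum64()), nil
}
```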
func NewTable ¶
func NewTable(schema *encoding.Schema, columns map[string]Column, rowCount int, hash string) (*Table, error)
NewTable builds and validates a Table.
Validation:
- schema must be non-nil and have at least one field.
- columns must contain exactly one entry per schema field (no extras, no missing).
- every column's Len() must equal rowCount.
- every column's Kind() must match KindFromPulseFieldType for its schema field's Type.
- rowCount must be in [0, limits.TableMaxRows()]. Exceeding the cap returns PRISM_RESOLVE_007.
hash is propagated verbatim; the producer owns hashing strategy.
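The validation rules above can be sketched as a standalone check. The types here are simplified stand-ins: fieldDef replaces the schema field (with the Kind already resolved), maxRows replaces limits.TableMaxRows(), and all error texts other than the PRISM_RESOLVE_007 code are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

type Kind int

const (
	KindUnknown Kind = iota
	KindInt
	KindString
)

// fieldDef is a hypothetical stand-in for a schema field whose Pulse
// type has already been folded into a Kind.
type fieldDef struct {
	Name string
	Kind Kind
}

type column interface {
	Kind() Kind
	Len() int
}

type ints []int64

func (c ints) Kind() Kind { return KindInt }
func (c ints) Len() int   { return len(c) }

// validate mirrors the documented checks. Field names are assumed to
// be unique, so equal counts plus "every field present" rules out
// both missing and extra columns.
func validate(fields []fieldDef, cols map[string]column, rowCount, maxRows int) error {
	if len(fields) == 0 {
		return errors.New("schema must have at least one field")
	}
	if rowCount < 0 {
		return fmt.Errorf("rowCount %d is negative", rowCount)
	}
	if rowCount > maxRows {
		return fmt.Errorf("PRISM_RESOLVE_007: rowCount %d exceeds cap %d", rowCount, maxRows)
	}
	if len(cols) != len(fields) {
		return fmt.Errorf("expected %d columns, got %d", len(fields), len(cols))
	}
	for _, f := range fields {
		c, ok := cols[f.Name]
		if !ok {
			return fmt.Errorf("missing column %q", f.Name)
		}
		if c.Len() != rowCount {
			return fmt.Errorf("column %q has %d rows, want %d", f.Name, c.Len(), rowCount)
		}
		if c.Kind() != f.Kind {
			return fmt.Errorf("column %q has the wrong kind", f.Name)
		}
	}
	return nil
}
```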
func (*Table) FieldNames ¶
FieldNames returns the field names in schema declaration order.