memory

package
v0.0.4 Latest
Published: May 14, 2026 | License: MIT | Imports: 15 | Imported by: 0

Documentation

Overview

Package memory provides a lightweight Go-slice-backed compute engine for the dataset package. It implements dataset.ColumnFactory, dataset.BuilderFactory, dataset.Aggregator, and dataset.Caster.

Usage:

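eng := memory.NewEngine(context.Background())
// NewEngine returns *Engine, so assign it to the interface instead of
// type-asserting (a type assertion on a concrete type does not compile).
var f dataset.ColumnFactory = eng
ds, err := f.FromColumns(
    dataset.NewSchema(dataset.FloatCol("x"), dataset.StringCol("label")),
    f.NewFloat64Column("x", []float64{1, 2, 3}),
    f.NewStringColumn("label", []string{"a", "b", "c"}),
)
if err != nil {
    log.Fatal(err)
}
_ = ds // ds is the resulting dataset.Table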

Index

Constants

This section is empty.

Variables

var (
	// ErrUnsupportedType is returned for unsupported column types.
	ErrUnsupportedType = errors.New("memory: unsupported column type")

	// ErrLengthMismatch is returned when column lengths don't match.
	ErrLengthMismatch = errors.New("memory: column length mismatch")

	// ErrEmptyColumn is returned when an operation requires non-empty data.
	ErrEmptyColumn = errors.New("memory: empty column")

	// ErrRequiresFloat64 is returned when a float64 column is required.
	ErrRequiresFloat64 = errors.New("memory: operation requires float64 column")

	// ErrRequiresInt64 is returned when an int64 column is required.
	ErrRequiresInt64 = errors.New("memory: operation requires int64 column")

	// ErrRequiresNumeric is returned when a numeric column is required.
	ErrRequiresNumeric = errors.New("memory: operation requires numeric column")

	// ErrJoinKeyMismatch is returned when join key types don't match.
	ErrJoinKeyMismatch = errors.New("memory: join key type mismatch")

	// ErrTakeTypeMismatch is returned when a Take/Select result has an unexpected type.
	ErrTakeTypeMismatch = errors.New("memory: unexpected result type from Take/Select")
)

Sentinel errors for the memory engine package.

Functions

This section is empty.

Types

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

Engine is the Go-slice compute backend.

func NewEngine

func NewEngine(ctx context.Context) *Engine

NewEngine creates a memory engine with the given lifecycle context.

func (*Engine) Abs

func (e *Engine) Abs(col dataset.AnyColumn) (dataset.AnyColumn, error)

Abs returns the absolute value of each element.

func (*Engine) Acos

func (e *Engine) Acos(col dataset.AnyColumn) (dataset.AnyColumn, error)

Acos returns the arccosine of each element.

func (*Engine) AddCols

func (e *Engine) AddCols(a, b dataset.AnyColumn) (dataset.AnyColumn, error)

AddCols returns element-wise addition of two float64 columns.

func (*Engine) AddScalar

func (e *Engine) AddScalar(col dataset.AnyColumn, val float64) (dataset.AnyColumn, error)

AddScalar adds a scalar value to each element of a float64 column.

func (*Engine) Asin

func (e *Engine) Asin(col dataset.AnyColumn) (dataset.AnyColumn, error)

Asin returns the arcsine of each element.

func (*Engine) Atan

func (e *Engine) Atan(col dataset.AnyColumn) (dataset.AnyColumn, error)

Atan returns the arctangent of each element.

func (*Engine) Atan2

func (e *Engine) Atan2(y, x dataset.AnyColumn) (dataset.AnyColumn, error)

Atan2 returns the two-argument arctangent of (y, x).

func (*Engine) BitAnd

func (e *Engine) BitAnd(a, b dataset.AnyColumn) (dataset.AnyColumn, error)

BitAnd returns element-wise bitwise AND of two int64 columns.

func (*Engine) BitNot

func (e *Engine) BitNot(col dataset.AnyColumn) (dataset.AnyColumn, error)

BitNot returns the bitwise complement of each int64 element.

func (*Engine) BitOr

func (e *Engine) BitOr(a, b dataset.AnyColumn) (dataset.AnyColumn, error)

BitOr returns element-wise bitwise OR of two int64 columns.

func (*Engine) BitShiftLeft

func (e *Engine) BitShiftLeft(col dataset.AnyColumn, n int) (dataset.AnyColumn, error)

BitShiftLeft shifts each int64 element left by n bits.

func (*Engine) BitShiftRight

func (e *Engine) BitShiftRight(col dataset.AnyColumn, n int) (dataset.AnyColumn, error)

BitShiftRight shifts each int64 element right by n bits.

func (*Engine) BitXor

func (e *Engine) BitXor(a, b dataset.AnyColumn) (dataset.AnyColumn, error)

BitXor returns element-wise bitwise XOR of two int64 columns.

func (*Engine) Cast

func (e *Engine) Cast(col dataset.AnyColumn, target dataset.DType) (dataset.AnyColumn, error)

Cast converts a column to the specified dtype.

func (*Engine) Ceil

func (e *Engine) Ceil(col dataset.AnyColumn) (dataset.AnyColumn, error)

Ceil returns the smallest integer ≥ each element.

func (*Engine) Combine

func (e *Engine) Combine(datasets ...dataset.Table) (dataset.Table, error)

Combine horizontally concatenates datasets with equal row counts.

func (*Engine) Complete

func (e *Engine) Complete(ds dataset.Table, cols ...string) (dataset.Table, error)

Complete generates all combinations of the specified columns' unique values, filling missing rows with null values.

func (*Engine) Concatenate

func (e *Engine) Concatenate(ds dataset.Table, col string, from []string, sep string) (dataset.Table, error)

Concatenate joins multiple string columns into one with a separator.

func (*Engine) Context

func (e *Engine) Context() context.Context

Context returns the engine's lifecycle context.

func (*Engine) Cos

func (e *Engine) Cos(col dataset.AnyColumn) (dataset.AnyColumn, error)

Cos returns the cosine of each element (radians).

func (*Engine) Count

func (e *Engine) Count(col dataset.AnyColumn) (dataset.AnyColumn, error)

Count returns the row count of a column as a single-row int64 column.

func (*Engine) CumMax

func (e *Engine) CumMax(col dataset.AnyColumn) (dataset.AnyColumn, error)

CumMax returns the cumulative maximum of a numeric column.

func (*Engine) CumMin

func (e *Engine) CumMin(col dataset.AnyColumn) (dataset.AnyColumn, error)

CumMin returns the cumulative minimum of a numeric column.

func (*Engine) CumSum

func (e *Engine) CumSum(col dataset.AnyColumn) (dataset.AnyColumn, error)

CumSum returns the cumulative sum of a float64 column.

func (*Engine) DenseRank

func (e *Engine) DenseRank(col dataset.AnyColumn) (dataset.AnyColumn, error)

DenseRank returns dense rank (no gaps). E.g. [10,20,20,30] → [1,2,2,3].

func (*Engine) DivCols

func (e *Engine) DivCols(a, b dataset.AnyColumn) (dataset.AnyColumn, error)

DivCols returns element-wise division of two float64 columns.

func (*Engine) DropNA

func (e *Engine) DropNA(ds dataset.Table, cols ...string) (dataset.Table, error)

DropNA removes rows containing null values in the specified columns.

func (*Engine) Erf

func (e *Engine) Erf(col dataset.AnyColumn) (dataset.AnyColumn, error)

Erf returns the Gauss error function of each element.

func (*Engine) Exp

func (e *Engine) Exp(col dataset.AnyColumn) (dataset.AnyColumn, error)

Exp returns e raised to the power of each element.

func (*Engine) Fill

Fill forward- or backward-fills null values in a column.

func (*Engine) Filter

func (e *Engine) Filter(ds dataset.Table, mask dataset.Masker) (dataset.Table, error)

Filter returns a new Table containing only rows where mask evaluates true.

func (*Engine) FilterIndices

func (e *Engine) FilterIndices(mask []bool) []int

FilterIndices returns the indices where mask is true.

func (*Engine) Floor

func (e *Engine) Floor(col dataset.AnyColumn) (dataset.AnyColumn, error)

Floor returns the greatest integer ≤ each element.

func (*Engine) FromColumns

func (e *Engine) FromColumns(schema *dataset.Schema, cols ...dataset.AnyColumn) (dataset.Table, error)

FromColumns constructs a Table from a schema and pre-built columns.

func (*Engine) Join

func (e *Engine) Join(left, right dataset.Table, spec dataset.JoinSpec) (dataset.Table, error)

Join implements the Joiner interface with a hash-join algorithm. It supports Inner, Left, Right, Full, Semi, and Anti joins.

func (*Engine) Lag

func (e *Engine) Lag(col dataset.AnyColumn, n int) (dataset.AnyColumn, error)

Lag shifts a column's values down by n positions, filling the vacated leading positions with NaN or zero.

func (*Engine) Lead

func (e *Engine) Lead(col dataset.AnyColumn, n int) (dataset.AnyColumn, error)

Lead shifts a column's values up by n positions, filling the vacated trailing positions with NaN or zero.

func (*Engine) Ln

Ln returns the natural logarithm of each element.

func (*Engine) Log2

func (e *Engine) Log2(col dataset.AnyColumn) (dataset.AnyColumn, error)

Log2 returns the base-2 logarithm of each element.

func (*Engine) Log10

func (e *Engine) Log10(col dataset.AnyColumn) (dataset.AnyColumn, error)

Log10 returns the base-10 logarithm of each element.

func (*Engine) Mean

func (e *Engine) Mean(col dataset.AnyColumn) (dataset.AnyColumn, error)

Mean returns the arithmetic mean of a float64 column as a single-row column.

func (*Engine) Median

func (e *Engine) Median(col dataset.AnyColumn) (dataset.AnyColumn, error)

Median returns the median of a float64 column as a single-row column.
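
As a reference point for what "median" means on an even-length input, here is a minimal sketch on a plain slice. It assumes the common convention of averaging the two middle values; whether the engine does the same is not stated above, and median is an illustrative name.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle value of vals, or false if vals is empty.
// For an even count it averages the two middle values (an assumed convention).
func median(vals []float64) (float64, bool) {
	if len(vals) == 0 {
		return 0, false
	}
	sorted := append([]float64(nil), vals...) // copy so the caller's slice is not reordered
	sort.Float64s(sorted)
	mid := len(sorted) / 2
	if len(sorted)%2 == 1 {
		return sorted[mid], true
	}
	return (sorted[mid-1] + sorted[mid]) / 2, true
}

func main() {
	m, _ := median([]float64{4, 1, 3, 2})
	fmt.Println(m) // 2.5
}
```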

func (*Engine) MinMax

MinMax returns two single-row columns containing the min and max values.

func (*Engine) MulCols

func (e *Engine) MulCols(a, b dataset.AnyColumn) (dataset.AnyColumn, error)

MulCols returns element-wise multiplication of two float64 columns.

func (*Engine) MulScalar

func (e *Engine) MulScalar(col dataset.AnyColumn, val float64) (dataset.AnyColumn, error)

MulScalar multiplies each element of a float64 column by a scalar value.

func (*Engine) Name

func (e *Engine) Name() string

Name returns "memory".

func (*Engine) Neg

func (e *Engine) Neg(col dataset.AnyColumn) (dataset.AnyColumn, error)

Neg returns the negation of each element.

func (*Engine) NewBoolColumn

func (e *Engine) NewBoolColumn(name string, data []bool) dataset.AnyColumn

NewBoolColumn creates a bool column from the given slice.

func (*Engine) NewBuilder

func (e *Engine) NewBuilder(schema *dataset.Schema) dataset.Builder

NewBuilder creates a typed row-appender for the given schema.

func (*Engine) NewFloat64Column

func (e *Engine) NewFloat64Column(name string, data []float64) dataset.AnyColumn

NewFloat64Column creates a float64 column from the given slice.

func (*Engine) NewInt64Column

func (e *Engine) NewInt64Column(name string, data []int64) dataset.AnyColumn

NewInt64Column creates an int64 column from the given slice.

func (*Engine) NewStringColumn

func (e *Engine) NewStringColumn(name string, data []string) dataset.AnyColumn

NewStringColumn creates a string column from the given slice.

func (*Engine) NewTimestampColumn

func (e *Engine) NewTimestampColumn(name string, data []int64) dataset.AnyColumn

NewTimestampColumn creates a timestamp column (int64-backed) from the given slice.

func (*Engine) PercentRank

func (e *Engine) PercentRank(col dataset.AnyColumn) (dataset.AnyColumn, error)

PercentRank returns (rank - 1) / (n - 1) as float64, where n is the number of rows. It returns 0 for a single-element column.

func (*Engine) PivotLonger

func (e *Engine) PivotLonger(ds dataset.Table, spec dataset.PivotLongerSpec) (dataset.Table, error)

PivotLonger reshapes a wide dataset to long format. Columns listed in spec.Cols are "gathered" into two new columns: spec.NamesTo (holds original column names) and spec.ValuesTo (holds values). All other columns are repeated for each gathered column.

func (*Engine) PivotWider

func (e *Engine) PivotWider(ds dataset.Table, spec dataset.PivotWiderSpec) (dataset.Table, error)

PivotWider reshapes a long dataset to wide format. spec.NamesFrom identifies the column whose unique values become new column names. spec.ValuesFrom identifies the column whose values fill the new columns. All other columns are the "id" columns that define unique rows.

func (*Engine) Pow

func (e *Engine) Pow(col dataset.AnyColumn, exp float64) (dataset.AnyColumn, error)

Pow raises each element to the given exponent.

func (*Engine) Rank

func (e *Engine) Rank(col dataset.AnyColumn) (dataset.AnyColumn, error)

Rank returns the competition rank (1-indexed). Tied values share a rank, and the ranks that follow skip by the number of ties. E.g. [10,20,20,30] → [1,2,2,4].

func (*Engine) ReadCSV

func (e *Engine) ReadCSV(_ context.Context, r io.Reader, cfg dataset.CSVConfig) (dataset.Table, error)

ReadCSV reads CSV data using go-simdcsv with schema inference.

func (*Engine) ReadParquet

func (e *Engine) ReadParquet(_ context.Context, r io.ReaderAt, size int64, _ dataset.ParquetConfig) (dataset.Table, error)

ReadParquet reads Parquet data using parquet-go (row-based reader).

func (*Engine) ReplaceNA

func (e *Engine) ReplaceNA(col dataset.AnyColumn, defaultVal float64) (dataset.AnyColumn, error)

ReplaceNA replaces null (NaN) values in a float64 column with defaultVal.

func (*Engine) Round

func (e *Engine) Round(col dataset.AnyColumn) (dataset.AnyColumn, error)

Round rounds each element to the nearest integer.

func (*Engine) RowNumber

func (e *Engine) RowNumber(n int) (dataset.AnyColumn, error)

RowNumber returns a 1-indexed sequential column of length n.

func (*Engine) Select

func (e *Engine) Select(col dataset.AnyColumn, indices []int) (dataset.AnyColumn, error)

Select returns a new column containing only the rows at the given indices.

func (*Engine) Separate

func (e *Engine) Separate(ds dataset.Table, col string, into []string, sep string) (dataset.Table, error)

Separate splits a string column by a delimiter into multiple columns.

func (*Engine) Sigmoid

func (e *Engine) Sigmoid(col dataset.AnyColumn) (dataset.AnyColumn, error)

Sigmoid returns the logistic sigmoid of each element.

func (*Engine) Sign

func (e *Engine) Sign(col dataset.AnyColumn) (dataset.AnyColumn, error)

Sign returns the sign of each element (-1, 0, or 1).

func (*Engine) Sin

func (e *Engine) Sin(col dataset.AnyColumn) (dataset.AnyColumn, error)

Sin returns the sine of each element (radians).

func (*Engine) Slice

func (e *Engine) Slice(col dataset.AnyColumn, start, end int) (dataset.AnyColumn, error)

Slice returns a sub-column from start (inclusive) to end (exclusive).

func (*Engine) SortIndices

func (e *Engine) SortIndices(col dataset.AnyColumn) ([]int, error)

SortIndices returns the permutation that sorts the column ascending.

func (*Engine) Sqrt

func (e *Engine) Sqrt(col dataset.AnyColumn) (dataset.AnyColumn, error)

Sqrt returns the square root of each element.

func (*Engine) Stack

func (e *Engine) Stack(datasets ...dataset.Table) (dataset.Table, error)

Stack vertically concatenates datasets with compatible schemas.

func (*Engine) SubCols

func (e *Engine) SubCols(a, b dataset.AnyColumn) (dataset.AnyColumn, error)

SubCols returns element-wise subtraction of two float64 columns.

func (*Engine) Sum

func (e *Engine) Sum(col dataset.AnyColumn) (dataset.AnyColumn, error)

Sum returns the sum of a numeric column as a single-row column.

func (*Engine) Tan

func (e *Engine) Tan(col dataset.AnyColumn) (dataset.AnyColumn, error)

Tan returns the tangent of each element (radians).

func (*Engine) Tanh

func (e *Engine) Tanh(col dataset.AnyColumn) (dataset.AnyColumn, error)

Tanh returns the hyperbolic tangent of each element.

func (*Engine) Variance

func (e *Engine) Variance(col dataset.AnyColumn) (dataset.AnyColumn, error)

Variance returns the sample variance of a float64 column as a single-row column.
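
"Sample variance" means dividing the sum of squared deviations by n - 1 rather than n. A minimal sketch on a plain slice follows; sampleVariance is an illustrative name, and returning an error for fewer than two values is this sketch's choice, since the engine's behaviour in that case is not documented above.

```go
package main

import (
	"errors"
	"fmt"
)

// sampleVariance returns sum((x - mean)^2) / (n - 1).
func sampleVariance(vals []float64) (float64, error) {
	n := len(vals)
	if n < 2 {
		return 0, errors.New("sample variance needs at least two values")
	}
	var sum float64
	for _, v := range vals {
		sum += v
	}
	mean := sum / float64(n)

	var ss float64
	for _, v := range vals {
		d := v - mean
		ss += d * d
	}
	return ss / float64(n-1), nil
}

func main() {
	v, err := sampleVariance([]float64{1, 2, 3})
	fmt.Println(v, err) // 1 <nil>
}
```

For [1, 2, 3] the mean is 2, the squared deviations sum to 2, and dividing by n - 1 = 2 gives 1.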

func (*Engine) WriteCSV

func (e *Engine) WriteCSV(_ context.Context, w io.Writer, ds dataset.Table, cfg dataset.CSVConfig) error

WriteCSV writes a dataset.Table as CSV using go-simdcsv.

func (*Engine) WriteParquet

func (e *Engine) WriteParquet(_ context.Context, w io.Writer, ds dataset.Table, _ dataset.ParquetConfig) error

WriteParquet writes a dataset.Table as Parquet using parquet-go.
