types

package
v0.0.2 Latest
Published: May 6, 2026 License: MIT Imports: 5 Imported by: 0

Documentation

Overview

Decimal provides exact decimal arithmetic without external dependencies.

Why Not float64?

float64 cannot represent most decimal fractions exactly:

0.1 + 0.2 = 0.30000000000000004  (IEEE 754 artifact)

Decimal stores the value as a scaled int64:

15.99  →  {value: 1599, scale: 2}
0.001  →  {value:    1, scale: 3}
100    →  {value:  100, scale: 0}

Addition, subtraction, and multiplication are exact. Division is intentionally omitted to avoid unbounded scale growth.
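The scaled-integer idea can be sketched as follows. The names `dec`, `add`, `mul`, and `pow10` are hypothetical stand-ins, not this package's implementation, and the sketch assumes results fit in int64 (it does no overflow checking).

```go
package main

import "fmt"

// dec is a hypothetical stand-in for a scaled decimal: value / 10^scale.
type dec struct {
	value int64
	scale uint8
}

// pow10 returns 10^n as an int64 (no overflow checking in this sketch).
func pow10(n uint8) int64 {
	p := int64(1)
	for i := uint8(0); i < n; i++ {
		p *= 10
	}
	return p
}

// add rescales the operand with the smaller scale, then adds the integers.
func add(a, b dec) dec {
	if a.scale < b.scale {
		a, b = b, a
	}
	return dec{a.value + b.value*pow10(a.scale-b.scale), a.scale}
}

// mul multiplies the integers and adds the scales.
func mul(a, b dec) dec {
	return dec{a.value * b.value, a.scale + b.scale}
}

func main() {
	fmt.Println(add(dec{1599, 2}, dec{1, 3}))  // {15991 3}, i.e. 15.991
	fmt.Println(mul(dec{1599, 2}, dec{15, 1})) // {23985 3}, i.e. 23.985
}
```

Both operations work purely on integers, which is why no rounding is introduced as long as the results stay within int64 range.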

Package types defines the core data types used throughout goframe.

Design Philosophy

In pandas (Python), every value in a Series can be any Python object, and Python's dynamic typing handles everything automatically. Go is statically typed, so we need an explicit "union type": a single Go type that can hold an int, float, string, bool, date-time, decimal, or null value.

We solve this with the Value type: a tagged union (also called a discriminated union or sum type). Each Value knows what type it holds via the Kind field, and the actual data lives in one of the concrete fields.

Why Not interface{}?

We could store everything as `interface{}` (or `any`), but that has downsides:

  • Type assertions everywhere make code messy
  • No compile-time safety about what kinds of values exist
  • Harder to implement fast type-specific operations (e.g., numeric sum)

Our tagged union gives us a closed set of supported types, which lets us write switch statements that cover every Kind. Note that the Go compiler does not check switch exhaustiveness itself; a linter such as exhaustive can flag missing cases.
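A tagged union of this shape might look like the following minimal sketch. The lowercase names (`kind`, `value`, `describe`) and the reduced set of kinds are assumptions made for illustration; they do not mirror the package's unexported fields.

```go
package main

import "fmt"

// kind is the tag that says which field of value is meaningful.
type kind int

const (
	kindNull kind = iota // zero value, so value{} is null
	kindInt
	kindFloat
)

// value holds the tag plus one concrete field per supported type.
type value struct {
	kind   kind
	intVal int64
	fltVal float64
}

// describe switches on the tag instead of using type assertions.
func describe(v value) string {
	switch v.kind {
	case kindNull:
		return "null"
	case kindInt:
		return fmt.Sprintf("int %d", v.intVal)
	case kindFloat:
		return fmt.Sprintf("float %g", v.fltVal)
	default:
		return "unknown kind"
	}
}

func main() {
	fmt.Println(describe(value{}))                          // null
	fmt.Println(describe(value{kind: kindInt, intVal: 42})) // int 42
}
```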

Types

type BoolColumn

type BoolColumn struct {
	// contains filtered or unexported fields
}

BoolColumn stores bool values without per-cell boxing.

func (*BoolColumn) Dtype

func (c *BoolColumn) Dtype() Kind

func (*BoolColumn) Get

func (c *BoolColumn) Get(i int) Value

func (*BoolColumn) IsNull

func (c *BoolColumn) IsNull(i int) bool

func (*BoolColumn) Len

func (c *BoolColumn) Len() int

func (*BoolColumn) RawAt

func (c *BoolColumn) RawAt(i int) bool

RawAt returns the raw bool at i; caller must verify IsNull first.

func (*BoolColumn) Slice

func (c *BoolColumn) Slice(start, end int) Column

type Column

type Column interface {
	Len() int
	Get(i int) Value
	IsNull(i int) bool
	Dtype() Kind
	Slice(start, end int) Column
}

Column is the internal typed storage interface for a Series. Each implementation stores a single native-type slice, eliminating the per-cell boxing overhead of []Value.

Value is still the public API type — Get boxes only when called. Internal aggregations bypass Get entirely via typed fast-path methods.

func NewColumn

func NewColumn(vals []Value) Column

NewColumn creates the most memory-efficient Column for the given values. Homogeneous (single-type) columns get typed storage; mixed columns fall back to GenericColumn which stores []Value directly.
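The choice between typed and generic storage could be made by a scan like the one below. This is a sketch under assumptions: `commonKind` is a hypothetical helper, and it treats nulls as compatible with any typed column, which the documentation does not state explicitly.

```go
package main

import "fmt"

type kind int

const (
	kindNull kind = iota
	kindInt
	kindString
)

type value struct {
	kind kind
}

// commonKind reports the single non-null kind shared by every value.
// It returns false for mixed-kind or all-null input, the cases where a
// generic fallback column would be chosen.
func commonKind(vals []value) (kind, bool) {
	found := kindNull
	for _, v := range vals {
		switch {
		case v.kind == kindNull:
			continue // assumption: nulls fit into any typed column
		case found == kindNull:
			found = v.kind
		case v.kind != found:
			return kindNull, false
		}
	}
	return found, found != kindNull
}

func main() {
	ints := []value{{kindInt}, {kindNull}, {kindInt}}
	mixed := []value{{kindInt}, {kindString}}
	fmt.Println(commonKind(ints))  // 1 true
	fmt.Println(commonKind(mixed)) // 0 false
}
```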

func NewFloatColumn

func NewFloatColumn(data []float64) Column

NewFloatColumn creates a typed FloatColumn directly from []float64 without boxing.

func NewIntColumn

func NewIntColumn(data []int64) Column

NewIntColumn creates a typed IntColumn directly from []int64 without boxing.

func NewStringColumn

func NewStringColumn(data []string) Column

NewStringColumn creates a typed StringColumn directly from []string without boxing.

type DateTimeColumn

type DateTimeColumn struct {
	// contains filtered or unexported fields
}

DateTimeColumn stores time.Time values without per-cell boxing.

func (*DateTimeColumn) Dtype

func (c *DateTimeColumn) Dtype() Kind

func (*DateTimeColumn) Get

func (c *DateTimeColumn) Get(i int) Value

func (*DateTimeColumn) IsNull

func (c *DateTimeColumn) IsNull(i int) bool

func (*DateTimeColumn) Len

func (c *DateTimeColumn) Len() int

func (*DateTimeColumn) Slice

func (c *DateTimeColumn) Slice(start, end int) Column

type Decimal

type Decimal struct {
	// contains filtered or unexported fields
}

Decimal is an exact decimal number backed by a scaled int64.

func NewDecimal

func NewDecimal(value int64, scale uint8) Decimal

NewDecimal creates a Decimal from an unscaled integer and a scale.

NewDecimal(1599, 2)  →  15.99
NewDecimal(100,  0)  →  100
NewDecimal(1,    3)  →  0.001

func ParseDecimal

func ParseDecimal(s string) (Decimal, error)

ParseDecimal parses a decimal string ("15.99", "-3.5", "100") into a Decimal. Returns an error if the string is not a valid decimal number.
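One way to parse such strings into an unscaled integer and a scale is sketched below. `parseScaled` is a hypothetical helper, not the package's parser; its handling of edge cases such as "15." or ".5" (both accepted here) is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseScaled splits s at the decimal point and parses all digits as one
// integer, so "15.99" becomes (1599, 2).
func parseScaled(s string) (int64, uint8, error) {
	intPart, frac, _ := strings.Cut(s, ".")
	// A sign is only allowed at the very start; the scale must fit in uint8.
	if strings.ContainsAny(frac, "+-") || len(frac) > 255 {
		return 0, 0, fmt.Errorf("invalid decimal %q", s)
	}
	v, err := strconv.ParseInt(intPart+frac, 10, 64)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid decimal %q: %w", s, err)
	}
	return v, uint8(len(frac)), nil
}

func main() {
	for _, s := range []string{"15.99", "-3.5", "100", "abc"} {
		v, scale, err := parseScaled(s)
		fmt.Println(v, scale, err)
	}
}
```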

func (Decimal) Add

func (d Decimal) Add(other Decimal) Decimal

Add returns d + other, exact.

func (Decimal) Cmp

func (d Decimal) Cmp(other Decimal) int

Cmp compares d and other. Returns -1, 0, or 1.

func (Decimal) Equal

func (d Decimal) Equal(other Decimal) bool

Equal returns true if d == other (value-equality, ignoring trailing zeros).

NewDecimal(150, 1).Equal(NewDecimal(1500, 2))  →  true  (both = 15.0)
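Value equality across different scales can be sketched by rescaling both operands to a common scale before comparing. The names `dec`, `pow10`, and `cmp` are hypothetical, and the sketch assumes the rescaled values fit in int64.

```go
package main

import "fmt"

// dec is a hypothetical scaled decimal: value / 10^scale.
type dec struct {
	value int64
	scale uint8
}

// pow10 returns 10^n as an int64 (no overflow checking in this sketch).
func pow10(n uint8) int64 {
	p := int64(1)
	for i := uint8(0); i < n; i++ {
		p *= 10
	}
	return p
}

// cmp brings both operands to the larger scale and compares the integers.
// It returns -1, 0, or 1.
func cmp(a, b dec) int {
	av, bv := a.value, b.value
	if a.scale < b.scale {
		av *= pow10(b.scale - a.scale)
	} else {
		bv *= pow10(a.scale - b.scale)
	}
	switch {
	case av < bv:
		return -1
	case av > bv:
		return 1
	}
	return 0
}

func main() {
	fmt.Println(cmp(dec{150, 1}, dec{1500, 2})) // 0: both are 15.0
	fmt.Println(cmp(dec{1, 3}, dec{1, 2}))      // -1: 0.001 < 0.01
}
```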

func (Decimal) LessThan

func (d Decimal) LessThan(other Decimal) bool

LessThan returns true if d < other.

func (Decimal) Mul

func (d Decimal) Mul(other Decimal) Decimal

Mul returns d * other, exact. The result scale is d.scale + other.scale.

func (Decimal) String

func (d Decimal) String() string

String returns the canonical decimal string.

Decimal{1599, 2}.String()  →  "15.99"
Decimal{100,  0}.String()  →  "100"
Decimal{1,    3}.String()  →  "0.001"
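Formatting a scaled integer as text amounts to inserting a decimal point `scale` digits from the right, padding with zeros where needed. The helper `formatScaled` below is a hypothetical sketch of that idea, not the package's String method.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatScaled renders value / 10^scale as a decimal string.
func formatScaled(value int64, scale uint8) string {
	digits := strconv.FormatInt(value, 10)
	if scale == 0 {
		return digits
	}
	neg := value < 0
	if neg {
		digits = digits[1:] // strip the sign; it is re-added at the end
	}
	// Pad so at least one digit remains before the decimal point.
	for len(digits) <= int(scale) {
		digits = "0" + digits
	}
	cut := len(digits) - int(scale)
	s := digits[:cut] + "." + digits[cut:]
	if neg {
		s = "-" + s
	}
	return s
}

func main() {
	fmt.Println(formatScaled(1599, 2)) // 15.99
	fmt.Println(formatScaled(100, 0))  // 100
	fmt.Println(formatScaled(1, 3))    // 0.001
	fmt.Println(formatScaled(-35, 1))  // -3.5
}
```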

func (Decimal) Sub

func (d Decimal) Sub(other Decimal) Decimal

Sub returns d - other, exact.

func (Decimal) ToFloat64

func (d Decimal) ToFloat64() float64

ToFloat64 converts to float64 (approximate). Use only for display or aggregation — not for exact arithmetic.

type DecimalColumn

type DecimalColumn struct {
	// contains filtered or unexported fields
}

DecimalColumn stores Decimal values without per-cell boxing.

func (*DecimalColumn) Dtype

func (c *DecimalColumn) Dtype() Kind

func (*DecimalColumn) Get

func (c *DecimalColumn) Get(i int) Value

func (*DecimalColumn) IsNull

func (c *DecimalColumn) IsNull(i int) bool

func (*DecimalColumn) Len

func (c *DecimalColumn) Len() int

func (*DecimalColumn) Slice

func (c *DecimalColumn) Slice(start, end int) Column

type FloatColumn

type FloatColumn struct {
	// contains filtered or unexported fields
}

FloatColumn stores float64 values without per-cell boxing.

func (*FloatColumn) Dtype

func (c *FloatColumn) Dtype() Kind

func (*FloatColumn) Get

func (c *FloatColumn) Get(i int) Value

func (*FloatColumn) IsNull

func (c *FloatColumn) IsNull(i int) bool

func (*FloatColumn) Len

func (c *FloatColumn) Len() int

func (*FloatColumn) MinMaxFloat

func (c *FloatColumn) MinMaxFloat() (float64, float64, int)

MinMaxFloat returns (min, max, count) of non-null, non-NaN values.

func (*FloatColumn) Slice

func (c *FloatColumn) Slice(start, end int) Column

func (*FloatColumn) SumFloat

func (c *FloatColumn) SumFloat() (float64, int)

SumFloat returns (sum, count) of non-null, non-NaN values without boxing.

type GenericColumn

type GenericColumn struct {
	// contains filtered or unexported fields
}

GenericColumn is the fallback for mixed-type or all-null columns. It stores []Value directly, preserving the original untyped behavior.

func (*GenericColumn) Dtype

func (c *GenericColumn) Dtype() Kind

func (*GenericColumn) Get

func (c *GenericColumn) Get(i int) Value

func (*GenericColumn) IsNull

func (c *GenericColumn) IsNull(i int) bool

func (*GenericColumn) Len

func (c *GenericColumn) Len() int

func (*GenericColumn) Slice

func (c *GenericColumn) Slice(start, end int) Column

type Index

type Index struct {
	// contains filtered or unexported fields
}

Index holds ordered, labeled row identifiers.

Invariant: len(labels) == length of any Series/DataFrame using this Index.

Invariant: posMap[labels[i].String()] == i for all i (when labels are unique).

func NewIndex

func NewIndex(labels []Value) *Index

NewIndex creates an Index from a slice of Values.

If any labels are duplicated, the posMap is NOT populated — label-based lookup will return an error for non-unique indexes, just like pandas raises when you try df.loc["x"] on a DataFrame with duplicate "x" rows.

func NewRangeIndex

func NewRangeIndex(n int) *Index

NewRangeIndex creates a default 0..n-1 integer index, equivalent to pandas' RangeIndex(n).

This is the default index when you create a Series or DataFrame without specifying row labels — matching pandas' behavior.

func NewStringIndex

func NewStringIndex(labels []string) *Index

NewStringIndex is a convenience constructor for string-labeled indexes. Equivalent to pd.Index(["a", "b", "c"]).

func (*Index) IsUnique

func (idx *Index) IsUnique() bool

IsUnique returns true if all labels are distinct.

func (*Index) Label

func (idx *Index) Label(i int) Value

Label returns the label at position i. Panics if i is out of bounds — use bounds-checked access in public APIs.

func (*Index) Labels

func (idx *Index) Labels() []Value

Labels returns a copy of all labels as a slice. We return a copy to preserve the Index's immutability invariant.

func (*Index) Len

func (idx *Index) Len() int

Len returns the number of labels in the index.

func (*Index) Locate

func (idx *Index) Locate(label Value) (int, error)

Locate returns the integer position of the given label. Returns -1 and an error if the label is not found or the index is non-unique.

This is the underlying mechanism for df.loc[label] in pandas.
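The lookup behavior described for NewIndex and Locate can be sketched with a position map that is only populated when labels are unique. The `index` type below uses plain strings as labels and hypothetical names throughout; it is not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// index is a hypothetical simplified Index with string labels.
type index struct {
	labels []string
	posMap map[string]int // nil when labels are not unique
}

// newIndex builds the position map, abandoning it on the first duplicate.
func newIndex(labels []string) *index {
	pos := make(map[string]int, len(labels))
	for i, l := range labels {
		if _, dup := pos[l]; dup {
			return &index{labels: labels}
		}
		pos[l] = i
	}
	return &index{labels: labels, posMap: pos}
}

// locate returns the position of label, or -1 and an error.
func (idx *index) locate(label string) (int, error) {
	if idx.posMap == nil {
		return -1, errors.New("index is not unique")
	}
	if i, ok := idx.posMap[label]; ok {
		return i, nil
	}
	return -1, fmt.Errorf("label %q not found", label)
}

func main() {
	idx := newIndex([]string{"a", "b", "c"})
	fmt.Println(idx.locate("b")) // 1 <nil>

	dup := newIndex([]string{"x", "x"})
	fmt.Println(dup.locate("x")) // -1 index is not unique
}
```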

func (*Index) Slice

func (idx *Index) Slice(start, end int) *Index

Slice returns a new Index containing only positions [start, end). Used when slicing a Series or DataFrame.

func (*Index) String

func (idx *Index) String() string

String returns a readable representation for debugging.

type IntColumn

type IntColumn struct {
	// contains filtered or unexported fields
}

IntColumn stores int64 values without per-cell boxing. nulls is nil when the column has no null values (common case).

func (*IntColumn) Dtype

func (c *IntColumn) Dtype() Kind

func (*IntColumn) Get

func (c *IntColumn) Get(i int) Value

func (*IntColumn) IsNull

func (c *IntColumn) IsNull(i int) bool

func (*IntColumn) Len

func (c *IntColumn) Len() int

func (*IntColumn) MinMaxInt

func (c *IntColumn) MinMaxInt() (int64, int64, int)

MinMaxInt returns (min, max, count) of non-null values without boxing.

func (*IntColumn) Slice

func (c *IntColumn) Slice(start, end int) Column

func (*IntColumn) SumInt

func (c *IntColumn) SumInt() (int64, int)

SumInt returns (sum, count) of non-null values without boxing.

type Kind

type Kind int

Kind represents which data type a Value holds. Using an integer enum (rather than strings) makes comparisons O(1).

const (
	// KindNull represents a missing or undefined value — equivalent to
	// Python's None, pandas' NaN, or SQL's NULL.
	// Null is the zero value of Kind, so a zero-initialized Value is null.
	KindNull Kind = iota

	// KindInt represents a 64-bit signed integer.
	// We always use int64 (not int) so behavior is identical on 32-bit and
	// 64-bit systems — important for reproducibility.
	KindInt

	// KindFloat represents a 64-bit IEEE 754 floating-point number.
	// We use float64 to match Go's default float literal type and pandas'
	// numpy.float64 default.
	KindFloat

	// KindString represents a UTF-8 string. Go strings are immutable byte
	// sequences, so storing them in a Value is cheap (just a pointer + length).
	KindString

	// KindBool represents a boolean true/false value.
	KindBool

	// KindDateTime represents a date-time value (time.Time).
	KindDateTime

	// KindDecimal represents an exact decimal number using scaled integer arithmetic.
	// Avoids floating-point rounding errors — ideal for financial and scientific data.
	KindDecimal
)

func (Kind) String

func (k Kind) String() string

String returns a human-readable name for the Kind — used in error messages and dtype display (mimicking pandas' dtype attribute).

type StringColumn

type StringColumn struct {
	// contains filtered or unexported fields
}

StringColumn stores string values without per-cell boxing.

func (*StringColumn) Dtype

func (c *StringColumn) Dtype() Kind

func (*StringColumn) Get

func (c *StringColumn) Get(i int) Value

func (*StringColumn) IsNull

func (c *StringColumn) IsNull(i int) bool

func (*StringColumn) Len

func (c *StringColumn) Len() int

func (*StringColumn) RawAt

func (c *StringColumn) RawAt(i int) string

RawAt returns the raw string at i; caller must verify IsNull first.

func (*StringColumn) Slice

func (c *StringColumn) Slice(start, end int) Column

type Value

type Value struct {
	Kind Kind
	// contains filtered or unexported fields
}

Value is a single data cell — the atom of goframe.

Memory layout:

Kind    int      (8 bytes)
intVal  int64    (8 bytes)
fltVal  float64  (8 bytes)
strVal  string   (16 bytes: pointer + length)
boolVal bool     (1 byte, padded to 8)
timeVal time.Time (24 bytes)
currVal Currency  (int64 + string header = ~32 bytes)
                  ─────────────────────
                  ~104 bytes per Value

At ~104 bytes, a Value is much larger than a raw int64 (8 bytes) and also larger than an interface{} header (16 bytes), although an interface{} holding a boxed value typically needs a separate heap allocation on top of that header. For a column of 1 million integers, a []Value uses ~104 MB, versus ~16 MB of headers plus heap allocations for []interface{}. In exchange, our approach avoids GC pressure from millions of tiny heap objects.

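Production note: A real high-performance library would use columnar storage (e.g., Apache Arrow) where each column is a single typed []float64 or []int64 array. The Column implementations in this package follow the same idea with one native-type slice per column; Value keeps this simpler struct layout for clarity.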

func Bool

func Bool(v bool) Value

Bool wraps a bool in a Value.

func DateTime

func DateTime(v time.Time) Value

DateTime wraps a time.Time in a Value.

func Dec

func Dec(v Decimal) Value

Dec wraps a Decimal in a Value.

func Float

func Float(v float64) Value

Float wraps a float64 in a Value. Note: NaN floats are valid but operations should handle them carefully.

func Int

func Int(v int64) Value

Int wraps an int64 in a Value.

func Null

func Null() Value

Null returns a null Value, representing a missing data point. In pandas: pd.NA, np.nan, or None in a column.

func Str

func Str(v string) Value

Str wraps a string in a Value. We name it Str (not String) to avoid confusion with the String method required by fmt.Stringer.

func (Value) AsBool

func (v Value) AsBool() (bool, bool)

AsBool returns the bool value and true if Kind == KindBool.

func (Value) AsDateTime

func (v Value) AsDateTime() (time.Time, bool)

AsDateTime returns the time value and true if Kind == KindDateTime.

func (Value) AsDecimal

func (v Value) AsDecimal() (Decimal, bool)

AsDecimal returns the Decimal value and true if Kind == KindDecimal.

func (Value) AsFloat

func (v Value) AsFloat() (float64, bool)

AsFloat returns the float value and true if Kind == KindFloat.

func (Value) AsInt

func (v Value) AsInt() (int64, bool)

AsInt returns the integer value and true if Kind == KindInt. Returns (0, false) otherwise — never panics.
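The accessor pattern follows Go's comma-ok idiom. The sketch below uses a hypothetical reduced `value` type to show how a caller might branch on the second result instead of risking a panic.

```go
package main

import "fmt"

type kind int

const (
	kindNull kind = iota
	kindInt
	kindString
)

// value is a hypothetical reduced tagged union.
type value struct {
	kind   kind
	intVal int64
	strVal string
}

// asInt returns (0, false) unless the value holds an int.
func (v value) asInt() (int64, bool) {
	if v.kind != kindInt {
		return 0, false
	}
	return v.intVal, true
}

func main() {
	vals := []value{
		{kind: kindInt, intVal: 7},
		{kind: kindString, strVal: "x"},
		{}, // zero value is null
	}
	for _, v := range vals {
		if n, ok := v.asInt(); ok {
			fmt.Println("int:", n)
		} else {
			fmt.Println("not an int")
		}
	}
}
```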

func (Value) AsString

func (v Value) AsString() (string, bool)

AsString returns the string value and true if Kind == KindString.

func (Value) Equal

func (v Value) Equal(other Value) bool

Equal returns true if two Values are equal.

Null == Null returns TRUE in goframe (unlike SQL's NULL != NULL semantics). This also differs from pandas, where pd.NA == pd.NA evaluates to pd.NA (ambiguous); for practical filtering we treat null == null as true.

Float NaN != NaN always (IEEE 754 standard). This is intentional — if you want NaN-aware equality, use EqualNaN.

func (Value) IsNull

func (v Value) IsNull() bool

IsNull returns true if this Value represents missing data.

func (Value) LessThan

func (v Value) LessThan(other Value) bool

LessThan returns true if v < other for orderable types. Panics if the types are incomparable (e.g., string vs int). Used internally by sorting operations.

func (Value) String

func (v Value) String() string

String returns a human-readable representation of the Value. This implements the fmt.Stringer interface, so fmt.Println(v) works.

func (Value) ToFloat64

func (v Value) ToFloat64() (float64, error)

ToFloat64 converts the Value to float64. This is used internally by numeric aggregation functions so they can operate on mixed int/float columns without branching on every element.

Conversion rules:

  • KindInt → float64 conversion (exact for magnitudes up to 2^53; larger int64 values may be rounded)
  • KindFloat → no-op
  • KindBool → 0.0 or 1.0 (matches pandas behavior)
  • KindString → parse as decimal number; error if malformed
  • KindNull → math.NaN() (so aggregations can use NaN-aware math)
