Documentation ¶
Overview ¶
Decimal provides exact decimal arithmetic without external dependencies.
Why Not float64? ¶
float64 cannot represent most decimal fractions exactly:
0.1 + 0.2 = 0.30000000000000004 (IEEE 754 artifact)
Decimal stores the value as a scaled int64:
15.99 → {value: 1599, scale: 2}
0.001 → {value: 1, scale: 3}
100 → {value: 100, scale: 0}
Addition, subtraction, and multiplication are exact as long as the result fits in the underlying int64. Division is intentionally omitted to avoid unbounded scale growth.
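To make the scaled-integer idea concrete, here is a minimal sketch. The `dec` type and the `add`, `mul`, and `rescale` helpers are hypothetical stand-ins written for this illustration, not the package's actual Decimal implementation, and they do not check for int64 overflow.

```go
package main

import "fmt"

// dec is a hypothetical stand-in for Decimal: an unscaled integer plus
// the number of digits after the decimal point.
type dec struct {
	value int64
	scale int
}

// rescale expresses d at a larger (or equal) scale by appending zeros.
// Overflow is not checked in this sketch.
func rescale(d dec, scale int) dec {
	for d.scale < scale {
		d.value *= 10
		d.scale++
	}
	return d
}

// add aligns both operands to the larger scale, then adds the integers.
func add(a, b dec) dec {
	s := a.scale
	if b.scale > s {
		s = b.scale
	}
	a, b = rescale(a, s), rescale(b, s)
	return dec{value: a.value + b.value, scale: s}
}

// mul multiplies the unscaled integers and adds the scales.
func mul(a, b dec) dec {
	return dec{value: a.value * b.value, scale: a.scale + b.scale}
}

// floatSum adds 0.1 and 0.2 as float64 variables, so the addition
// happens in IEEE 754 arithmetic at run time.
func floatSum() float64 {
	x, y := 0.1, 0.2
	return x + y
}

func main() {
	fmt.Println(floatSum() == 0.3)              // false
	fmt.Println(add(dec{1, 1}, dec{2, 1}))      // {3 1}, i.e. 0.3
	fmt.Println(add(dec{1599, 2}, dec{1, 3}))   // {15991 3}, i.e. 15.991
	fmt.Println(mul(dec{1599, 2}, dec{100, 0})) // {159900 2}, i.e. 1599.00
}
```

Division has no equivalent finite rule: 1 / 3 would need an infinite scale, which is consistent with the decision to omit it.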
Package types defines the core data types used throughout goframe.
Design Philosophy ¶
In pandas (Python), every value in a Series can be any Python object, and Python's dynamic typing handles everything automatically. Go is statically typed, so we need an explicit "union type": a single Go type that can hold an int, float, string, bool, date-time, decimal, or null value.
We solve this with the Value type: a tagged union (also called a discriminated union or sum type). Each Value knows what type it holds via the Kind field, and the actual data lives in one of the concrete fields.
Why Not interface{}? ¶
We could store everything as `interface{}` (or `any`), but that has downsides:
- Type assertions everywhere make code messy
- No compile-time safety about what kinds of values exist
- Harder to implement fast type-specific operations (e.g., numeric sum)
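Our tagged union gives us a closed set of supported types, so every switch over Kind has a known, finite list of cases to handle. Note that the Go compiler does not check switch statements for exhaustiveness, so a missing case is caught by a default branch at run time or by a linter, not at compile time.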
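The tagged-union pattern can be sketched as follows. The lowercase `kind` and `value` types and the `describe` function are hypothetical and cover only a subset of the kinds; the real Value has unexported fields whose names are not shown in this documentation.

```go
package main

import "fmt"

// kind is a hypothetical tag identifying which field of value is live.
type kind int

const (
	kindNull kind = iota // zero value, so value{} is null
	kindInt
	kindFloat
	kindString
	kindBool
)

// value is a hypothetical tagged union: one tag plus one field per kind.
type value struct {
	kind kind
	i    int64
	f    float64
	s    string
	b    bool
}

// describe switches on the tag and reads only the matching field.
func describe(v value) string {
	switch v.kind {
	case kindNull:
		return "null"
	case kindInt:
		return fmt.Sprintf("int(%d)", v.i)
	case kindFloat:
		return fmt.Sprintf("float(%g)", v.f)
	case kindString:
		return fmt.Sprintf("string(%q)", v.s)
	case kindBool:
		return fmt.Sprintf("bool(%t)", v.b)
	default:
		// Go does not verify that a switch covers every constant,
		// so unknown tags are surfaced here at run time.
		return fmt.Sprintf("unknown kind %d", int(v.kind))
	}
}

func main() {
	fmt.Println(describe(value{}))                           // null
	fmt.Println(describe(value{kind: kindInt, i: 42}))       // int(42)
	fmt.Println(describe(value{kind: kindString, s: "abc"})) // string("abc")
}
```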
Index ¶
- type BoolColumn
- type Column
- type DateTimeColumn
- type Decimal
- func (d Decimal) Add(other Decimal) Decimal
- func (d Decimal) Cmp(other Decimal) int
- func (d Decimal) Equal(other Decimal) bool
- func (d Decimal) LessThan(other Decimal) bool
- func (d Decimal) Mul(other Decimal) Decimal
- func (d Decimal) String() string
- func (d Decimal) Sub(other Decimal) Decimal
- func (d Decimal) ToFloat64() float64
- type DecimalColumn
- type FloatColumn
- func (c *FloatColumn) Dtype() Kind
- func (c *FloatColumn) Get(i int) Value
- func (c *FloatColumn) IsNull(i int) bool
- func (c *FloatColumn) Len() int
- func (c *FloatColumn) MinMaxFloat() (float64, float64, int)
- func (c *FloatColumn) Slice(start, end int) Column
- func (c *FloatColumn) SumFloat() (float64, int)
- type GenericColumn
- type Index
- type IntColumn
- type Kind
- type StringColumn
- type Value
- func (v Value) AsBool() (bool, bool)
- func (v Value) AsDateTime() (time.Time, bool)
- func (v Value) AsDecimal() (Decimal, bool)
- func (v Value) AsFloat() (float64, bool)
- func (v Value) AsInt() (int64, bool)
- func (v Value) AsString() (string, bool)
- func (v Value) Equal(other Value) bool
- func (v Value) IsNull() bool
- func (v Value) LessThan(other Value) bool
- func (v Value) String() string
- func (v Value) ToFloat64() (float64, error)
Types ¶
type BoolColumn ¶
type BoolColumn struct {
// contains filtered or unexported fields
}
BoolColumn stores bool values without per-cell boxing.
func (*BoolColumn) Dtype ¶
func (c *BoolColumn) Dtype() Kind
func (*BoolColumn) Get ¶
func (c *BoolColumn) Get(i int) Value
func (*BoolColumn) IsNull ¶
func (c *BoolColumn) IsNull(i int) bool
func (*BoolColumn) Len ¶
func (c *BoolColumn) Len() int
func (*BoolColumn) RawAt ¶
func (c *BoolColumn) RawAt(i int) bool
RawAt returns the raw bool at i; caller must verify IsNull first.
func (*BoolColumn) Slice ¶
func (c *BoolColumn) Slice(start, end int) Column
type Column ¶
type Column interface {
Len() int
Get(i int) Value
IsNull(i int) bool
Dtype() Kind
Slice(start, end int) Column
}
Column is the internal typed storage interface for a Series. Each implementation stores a single native-type slice, eliminating the per-cell boxing overhead of []Value.
Value is still the public API type — Get boxes only when called. Internal aggregations bypass Get entirely via typed fast-path methods.
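As an illustration of typed storage that boxes only on Get, here is a minimal sketch. The `cell`, `column`, and `intColumn` names are hypothetical simplifications; the package's real Column implementations have unexported fields and may differ in how they track nulls.

```go
package main

import "fmt"

// cell is a hypothetical, simplified boxed value (stand-in for Value).
type cell struct {
	null bool
	i    int64
}

// column is a cut-down, hypothetical version of the Column interface.
type column interface {
	Len() int
	Get(i int) cell
	IsNull(i int) bool
	Slice(start, end int) column
}

// intColumn keeps one native slice; nulls is nil when nothing is null.
type intColumn struct {
	data  []int64
	nulls []bool
}

func (c *intColumn) Len() int { return len(c.data) }

func (c *intColumn) IsNull(i int) bool { return c.nulls != nil && c.nulls[i] }

// Get is the only place where a cell is constructed (boxed).
func (c *intColumn) Get(i int) cell {
	if c.IsNull(i) {
		return cell{null: true}
	}
	return cell{i: c.data[i]}
}

// Slice shares the backing arrays instead of copying them.
func (c *intColumn) Slice(start, end int) column {
	out := &intColumn{data: c.data[start:end]}
	if c.nulls != nil {
		out.nulls = c.nulls[start:end]
	}
	return out
}

func main() {
	var col column = &intColumn{
		data:  []int64{10, 20, 30},
		nulls: []bool{false, true, false},
	}
	tail := col.Slice(1, 3)
	fmt.Println(col.Len(), tail.Len(), tail.IsNull(0), tail.Get(1).i) // 3 2 true 30
}
```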
func NewColumn ¶
NewColumn creates the most memory-efficient Column for the given values. Homogeneous (single-type) columns get typed storage; mixed columns fall back to GenericColumn which stores []Value directly.
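One way such a constructor might decide between typed and generic storage is to look for a single non-null kind shared by all values. The sketch below is an assumption about that decision, not the actual NewColumn logic; `kind`, `value`, and `commonKind` are hypothetical names, and the sketch assumes nulls are allowed inside typed columns.

```go
package main

import "fmt"

type kind int

const (
	kindNull kind = iota
	kindInt
	kindFloat
	kindString
)

// value is a hypothetical cell carrying only its tag for this sketch.
type value struct {
	kind kind
}

// commonKind returns the single non-null kind shared by all values.
// ok is false when kinds are mixed or when every value is null, which
// are the cases where a generic fallback column would be used.
func commonKind(vals []value) (k kind, ok bool) {
	found := kindNull
	for _, v := range vals {
		if v.kind == kindNull {
			continue // nulls do not decide the column type
		}
		if found == kindNull {
			found = v.kind
		} else if found != v.kind {
			return kindNull, false
		}
	}
	return found, found != kindNull
}

func main() {
	ints := []value{{kind: kindInt}, {kind: kindNull}, {kind: kindInt}}
	mixed := []value{{kind: kindInt}, {kind: kindString}}
	fmt.Println(commonKind(ints))  // 1 true
	fmt.Println(commonKind(mixed)) // 0 false
}
```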
func NewFloatColumn ¶
NewFloatColumn creates a typed FloatColumn directly from []float64 without boxing.
func NewIntColumn ¶
NewIntColumn creates a typed IntColumn directly from []int64 without boxing.
func NewStringColumn ¶
NewStringColumn creates a typed StringColumn directly from []string without boxing.
type DateTimeColumn ¶
type DateTimeColumn struct {
// contains filtered or unexported fields
}
DateTimeColumn stores time.Time values without per-cell boxing.
func (*DateTimeColumn) Dtype ¶
func (c *DateTimeColumn) Dtype() Kind
func (*DateTimeColumn) Get ¶
func (c *DateTimeColumn) Get(i int) Value
func (*DateTimeColumn) IsNull ¶
func (c *DateTimeColumn) IsNull(i int) bool
func (*DateTimeColumn) Len ¶
func (c *DateTimeColumn) Len() int
func (*DateTimeColumn) Slice ¶
func (c *DateTimeColumn) Slice(start, end int) Column
type Decimal ¶
type Decimal struct {
// contains filtered or unexported fields
}
Decimal is an exact decimal number backed by a scaled int64.
func NewDecimal ¶
NewDecimal creates a Decimal from an unscaled integer and a scale.
NewDecimal(1599, 2) → 15.99
NewDecimal(100, 0) → 100
NewDecimal(1, 3) → 0.001
func ParseDecimal ¶
ParseDecimal parses a decimal string ("15.99", "-3.5", "100") into a Decimal. Returns an error if the string is not a valid decimal number.
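A parser along these lines could be sketched as below. This is not the package's ParseDecimal; `dec` and `parse` are hypothetical, and details such as whether ".5" or "+1.5" are accepted are choices made for this sketch only.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// dec is a hypothetical stand-in for Decimal.
type dec struct {
	value int64
	scale int
}

// parse splits the input at the decimal point, validates the fraction,
// and parses the concatenated digits as the unscaled integer.
func parse(s string) (dec, error) {
	intPart, fracPart, hasDot := strings.Cut(s, ".")
	if hasDot && fracPart == "" {
		return dec{}, fmt.Errorf("invalid decimal %q: no digits after the point", s)
	}
	// The fraction must be plain digits; a sign is only legal up front.
	for _, r := range fracPart {
		if r < '0' || r > '9' {
			return dec{}, fmt.Errorf("invalid decimal %q: bad fraction", s)
		}
	}
	v, err := strconv.ParseInt(intPart+fracPart, 10, 64)
	if err != nil {
		return dec{}, fmt.Errorf("invalid decimal %q: %w", s, err)
	}
	return dec{value: v, scale: len(fracPart)}, nil
}

func main() {
	for _, s := range []string{"15.99", "-3.5", "100", "abc"} {
		d, err := parse(s)
		fmt.Println(s, d, err != nil)
	}
}
```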
func (Decimal) Equal ¶
Equal returns true if d == other (value-equality, ignoring trailing zeros).
NewDecimal(150, 1).Equal(NewDecimal(1500, 2)) → true (both = 15.0)
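Value equality that ignores trailing zeros can be obtained by bringing both operands to a common scale before comparing. The sketch below shows that idea with hypothetical `dec`, `cmp`, and `equal` names; it is not the package's implementation and does not guard against overflow while rescaling.

```go
package main

import "fmt"

// dec is a hypothetical stand-in for Decimal.
type dec struct {
	value int64
	scale int
}

// cmp returns -1, 0, or +1 after aligning both operands to the larger
// scale. a and b are copies, so the callers' values are not modified.
func cmp(a, b dec) int {
	for a.scale < b.scale {
		a.value *= 10
		a.scale++
	}
	for b.scale < a.scale {
		b.value *= 10
		b.scale++
	}
	switch {
	case a.value < b.value:
		return -1
	case a.value > b.value:
		return 1
	}
	return 0
}

// equal reports value equality regardless of how the scale is written.
func equal(a, b dec) bool { return cmp(a, b) == 0 }

func main() {
	fmt.Println(equal(dec{150, 1}, dec{1500, 2})) // true: 15.0 == 15.00
	fmt.Println(cmp(dec{1599, 2}, dec{16, 0}))    // -1: 15.99 < 16
}
```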
type DecimalColumn ¶
type DecimalColumn struct {
// contains filtered or unexported fields
}
DecimalColumn stores Decimal values without per-cell boxing.
func (*DecimalColumn) Dtype ¶
func (c *DecimalColumn) Dtype() Kind
func (*DecimalColumn) Get ¶
func (c *DecimalColumn) Get(i int) Value
func (*DecimalColumn) IsNull ¶
func (c *DecimalColumn) IsNull(i int) bool
func (*DecimalColumn) Len ¶
func (c *DecimalColumn) Len() int
func (*DecimalColumn) Slice ¶
func (c *DecimalColumn) Slice(start, end int) Column
type FloatColumn ¶
type FloatColumn struct {
// contains filtered or unexported fields
}
FloatColumn stores float64 values without per-cell boxing.
func (*FloatColumn) Dtype ¶
func (c *FloatColumn) Dtype() Kind
func (*FloatColumn) Get ¶
func (c *FloatColumn) Get(i int) Value
func (*FloatColumn) IsNull ¶
func (c *FloatColumn) IsNull(i int) bool
func (*FloatColumn) Len ¶
func (c *FloatColumn) Len() int
func (*FloatColumn) MinMaxFloat ¶
func (c *FloatColumn) MinMaxFloat() (float64, float64, int)
MinMaxFloat returns (min, max, count) of non-null, non-NaN values.
func (*FloatColumn) Slice ¶
func (c *FloatColumn) Slice(start, end int) Column
func (*FloatColumn) SumFloat ¶
func (c *FloatColumn) SumFloat() (float64, int)
SumFloat returns (sum, count) of non-null, non-NaN values without boxing.
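The typed fast path described for SumFloat and MinMaxFloat can be sketched as a plain loop over the native slice. The `floatColumn` type, its fields, and the behavior on an empty column (zeros with count 0) are assumptions made for this sketch, not the package's actual code.

```go
package main

import (
	"fmt"
	"math"
)

// floatColumn is a hypothetical typed column; nulls is nil when no
// entry is null.
type floatColumn struct {
	data  []float64
	nulls []bool
}

// skip reports whether entry i is null or NaN.
func (c *floatColumn) skip(i int) bool {
	return (c.nulls != nil && c.nulls[i]) || math.IsNaN(c.data[i])
}

// sumFloat adds the usable entries directly, without boxing.
func (c *floatColumn) sumFloat() (sum float64, count int) {
	for i, x := range c.data {
		if c.skip(i) {
			continue
		}
		sum += x
		count++
	}
	return sum, count
}

// minMaxFloat tracks the extremes of the usable entries.
func (c *floatColumn) minMaxFloat() (lo, hi float64, count int) {
	for i, x := range c.data {
		if c.skip(i) {
			continue
		}
		if count == 0 || x < lo {
			lo = x
		}
		if count == 0 || x > hi {
			hi = x
		}
		count++
	}
	return lo, hi, count
}

// sample builds a column holding 1.5, NaN, a null slot, and 2.
func sample() *floatColumn {
	return &floatColumn{
		data:  []float64{1.5, math.NaN(), 4, 2},
		nulls: []bool{false, false, true, false},
	}
}

func main() {
	c := sample()
	fmt.Println(c.sumFloat())    // 3.5 2
	fmt.Println(c.minMaxFloat()) // 1.5 2 2
}
```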
type GenericColumn ¶
type GenericColumn struct {
// contains filtered or unexported fields
}
GenericColumn is the fallback for mixed-type or all-null columns. It stores []Value directly, preserving the original untyped behavior.
func (*GenericColumn) Dtype ¶
func (c *GenericColumn) Dtype() Kind
func (*GenericColumn) Get ¶
func (c *GenericColumn) Get(i int) Value
func (*GenericColumn) IsNull ¶
func (c *GenericColumn) IsNull(i int) bool
func (*GenericColumn) Len ¶
func (c *GenericColumn) Len() int
func (*GenericColumn) Slice ¶
func (c *GenericColumn) Slice(start, end int) Column
type Index ¶
type Index struct {
// contains filtered or unexported fields
}
Index holds ordered, labeled row identifiers.
Invariant: len(labels) == len of any Series/DataFrame using this Index.
Invariant: posMap[labels[i].String()] == i for all i (when labels are unique).
func NewIndex ¶
NewIndex creates an Index from a slice of Values.
If any labels are duplicated, the posMap is NOT populated — label-based lookup will return an error for non-unique indexes, just like pandas raises when you try df.loc["x"] on a DataFrame with duplicate "x" rows.
func NewRangeIndex ¶
NewRangeIndex creates a default 0..n-1 integer index, equivalent to pandas' RangeIndex(n).
This is the default index when you create a Series or DataFrame without specifying row labels — matching pandas' behavior.
func NewStringIndex ¶
NewStringIndex is a convenience constructor for string-labeled indexes. Equivalent to pd.Index(["a", "b", "c"]).
func (*Index) Label ¶
Label returns the label at position i. Panics if i is out of bounds — use bounds-checked access in public APIs.
func (*Index) Labels ¶
Labels returns a copy of all labels as a slice. We return a copy to preserve the Index's immutability invariant.
func (*Index) Locate ¶
Locate returns the integer position of the given label. Returns -1 and an error if the label is not found or the index is non-unique.
This is the underlying mechanism for df.loc[label] in pandas.
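The lookup behavior described for NewIndex and Locate can be sketched with a position map that is only populated when labels are unique. The `index` type below uses plain string labels and hypothetical names (`newIndex`, `locate`); the real Index stores Values and its error messages are not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// index is a hypothetical, string-only stand-in for Index.
type index struct {
	labels []string
	posMap map[string]int // nil when labels contain duplicates
}

// newIndex copies the labels and builds posMap unless a duplicate is
// found, in which case posMap stays nil.
func newIndex(labels []string) *index {
	idx := &index{labels: append([]string(nil), labels...)}
	m := make(map[string]int, len(labels))
	for i, l := range labels {
		if _, dup := m[l]; dup {
			return idx
		}
		m[l] = i
	}
	idx.posMap = m
	return idx
}

// locate returns the position of label, or -1 and an error when the
// label is missing or the index is non-unique.
func (idx *index) locate(label string) (int, error) {
	if idx.posMap == nil {
		return -1, errors.New("index is not unique")
	}
	pos, ok := idx.posMap[label]
	if !ok {
		return -1, fmt.Errorf("label %q not found", label)
	}
	return pos, nil
}

func main() {
	idx := newIndex([]string{"a", "b", "c"})
	fmt.Println(idx.locate("b")) // 1 <nil>

	dup := newIndex([]string{"x", "x"})
	pos, err := dup.locate("x")
	fmt.Println(pos, err != nil) // -1 true
}
```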
type IntColumn ¶
type IntColumn struct {
// contains filtered or unexported fields
}
IntColumn stores int64 values without per-cell boxing. nulls is nil when the column has no null values (common case).
type Kind ¶
type Kind int
Kind represents which data type a Value holds. Using an integer enum (rather than strings) makes comparisons O(1).
const (
	// KindNull represents a missing or undefined value, equivalent to
	// Python's None, pandas' NaN, or SQL's NULL.
	// Null is the zero value of Kind, so a zero-initialized Value is null.
	KindNull Kind = iota
	// KindInt represents a 64-bit signed integer.
	// We always use int64 (not int) so behavior is identical on 32-bit and
	// 64-bit systems, which is important for reproducibility.
	KindInt
	// KindFloat represents a 64-bit IEEE 754 floating-point number.
	// We use float64 to match Go's default float literal type and pandas'
	// numpy.float64 default.
	KindFloat
	// KindString represents a UTF-8 string. Go strings are immutable byte
	// slices, so storing them in a Value is cheap (just a pointer + length).
	KindString
	// KindBool represents a boolean true/false value.
	KindBool
	// KindDateTime represents a date-time value (time.Time).
	KindDateTime
	// KindDecimal represents an exact decimal number using scaled integer arithmetic.
	// Avoids floating-point rounding errors; ideal for financial and scientific data.
	KindDecimal
)
type StringColumn ¶
type StringColumn struct {
// contains filtered or unexported fields
}
StringColumn stores string values without per-cell boxing.
func (*StringColumn) Dtype ¶
func (c *StringColumn) Dtype() Kind
func (*StringColumn) Get ¶
func (c *StringColumn) Get(i int) Value
func (*StringColumn) IsNull ¶
func (c *StringColumn) IsNull(i int) bool
func (*StringColumn) Len ¶
func (c *StringColumn) Len() int
func (*StringColumn) RawAt ¶
func (c *StringColumn) RawAt(i int) string
RawAt returns the raw string at i; caller must verify IsNull first.
func (*StringColumn) Slice ¶
func (c *StringColumn) Slice(start, end int) Column
type Value ¶
type Value struct {
Kind Kind
// contains filtered or unexported fields
}
Value is a single data cell — the atom of goframe.
Memory layout:
Kind int (8 bytes)
intVal int64 (8 bytes)
fltVal float64 (8 bytes)
strVal string (16 bytes: pointer + length)
boolVal bool (1 byte, padded to 8)
timeVal time.Time (24 bytes)
currVal Currency (int64 + string header = ~32 bytes)
─────────────────────
~104 bytes per Value
This is larger than a raw int64 (8 bytes) and also larger than an interface{} holding a boxed value (typically a 16-byte header plus a separate heap allocation). For a column of 1 million integers, the layout above needs about 104 MB, versus roughly 16 MB of interface{} headers plus the heap allocations behind them. The trade-off is that each Value is stored inline, which avoids GC pressure from millions of tiny heap objects.
Production note: A real high-performance library would use columnar storage (e.g., Apache Arrow) where each column is a single typed []float64 or []int64 array. We use this approach for clarity.
func Float ¶
Float wraps a float64 in a Value. Note: NaN floats are valid but operations should handle them carefully.
func Null ¶
func Null() Value
Null returns a null Value, representing a missing data point. In pandas: pd.NA, np.nan, or None in a column.
func Str ¶
Str wraps a string in a Value. We name it Str (not String) to avoid shadowing the Stringer interface.
func (Value) AsDateTime ¶
AsDateTime returns the time value and true if Kind == KindDateTime.
func (Value) AsInt ¶
AsInt returns the integer value and true if Kind == KindInt. Returns (0, false) otherwise — never panics.
func (Value) Equal ¶
Equal returns true if two Values are equal.
Null == Null returns TRUE in goframe (unlike SQL's NULL != NULL semantics). This also differs from pandas, where pd.NA == pd.NA evaluates to pd.NA (ambiguous); for practical filtering we treat null == null as true.
Float NaN != NaN always (IEEE 754 standard). This is intentional — if you want NaN-aware equality, use EqualNaN.
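The two rules above (null equals null, NaN never equals NaN) can be sketched as follows. The `value` type here holds only a null flag and a float, and `equalNaN` is a guess at what a NaN-aware variant might do; neither is taken from the package source.

```go
package main

import (
	"fmt"
	"math"
)

// value is a hypothetical cell that is either null or a float64.
type value struct {
	null bool
	f    float64
}

// fromFloat and nan are small hypothetical constructors for the demo.
func fromFloat(f float64) value { return value{f: f} }
func nan() value                { return value{f: math.NaN()} }

// equal treats null == null as true and leaves NaN != NaN, because the
// float comparison follows IEEE 754.
func equal(a, b value) bool {
	if a.null || b.null {
		return a.null && b.null
	}
	return a.f == b.f
}

// equalNaN is an assumed NaN-aware variant: two NaNs compare equal.
func equalNaN(a, b value) bool {
	if !a.null && !b.null && math.IsNaN(a.f) && math.IsNaN(b.f) {
		return true
	}
	return equal(a, b)
}

func main() {
	null := value{null: true}
	fmt.Println(equal(null, null))           // true
	fmt.Println(equal(null, fromFloat(0)))   // false
	fmt.Println(equal(nan(), nan()))         // false
	fmt.Println(equalNaN(nan(), nan()))      // true
	fmt.Println(equal(fromFloat(1.5), fromFloat(1.5))) // true
}
```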
func (Value) LessThan ¶
LessThan returns true if v < other for orderable types. Panics if the types are incomparable (e.g., string vs int). Used internally by sorting operations.
func (Value) String ¶
String returns a human-readable representation of the Value. This implements the fmt.Stringer interface, so fmt.Println(v) works.
func (Value) ToFloat64 ¶
ToFloat64 converts the Value to float64. This is used internally by numeric aggregation functions so they can operate on mixed int/float columns without branching on every element.
Conversion rules:
- KindInt → direct conversion (exact for magnitudes up to 2^53; larger int64 values are rounded to the nearest float64)
- KindFloat → no-op
- KindBool → 0.0 or 1.0 (matches pandas behavior)
- KindString → parse as decimal number; error if malformed
- KindNull → math.NaN() (so aggregations can use NaN-aware math)
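The conversion rules above could be written as a single switch, sketched below with hypothetical `kind`, `value`, and `toFloat64` names. The string case uses strconv.ParseFloat as a stand-in; note that ParseFloat also accepts inputs such as "inf" or hexadecimal floats, so the package's own parsing may be stricter.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

type kind int

const (
	kindNull kind = iota
	kindInt
	kindFloat
	kindString
	kindBool
)

// value is a hypothetical tagged union used only for this sketch.
type value struct {
	kind kind
	i    int64
	f    float64
	s    string
	b    bool
}

// toFloat64 applies one conversion rule per kind.
func toFloat64(v value) (float64, error) {
	switch v.kind {
	case kindNull:
		return math.NaN(), nil
	case kindInt:
		return float64(v.i), nil // rounds when |i| exceeds 2^53
	case kindFloat:
		return v.f, nil
	case kindBool:
		if v.b {
			return 1, nil
		}
		return 0, nil
	case kindString:
		return strconv.ParseFloat(v.s, 64)
	default:
		return 0, fmt.Errorf("cannot convert kind %d to float64", int(v.kind))
	}
}

func main() {
	fmt.Println(toFloat64(value{kind: kindInt, i: 3}))        // 3 <nil>
	fmt.Println(toFloat64(value{kind: kindBool, b: true}))    // 1 <nil>
	fmt.Println(toFloat64(value{kind: kindString, s: "2.5"})) // 2.5 <nil>
	_, err := toFloat64(value{kind: kindString, s: "x"})
	fmt.Println(err != nil) // true
}
```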