dataframe

v0.1.0-alpha
Published: Jun 24, 2022 License: MIT Imports: 14 Imported by: 0

README

dataframe-go

Dataframes are used for statistics, machine learning, and data manipulation/exploration. This package is based on rocketlaunchr/dataframe-go and rewritten with Go 1.18 generics. It is still a work in progress; the remaining rocketlaunchr/dataframe-go features are planned for future releases. Contributions are welcome.

1. Installation and usage

go get -u github.com/tradeoforigin/dataframe-go
import "github.com/tradeoforigin/dataframe-go"

2. Series

Series is a generic struct that can store values of any type. It also satisfies the SeriesAny interface, which lets a DataFrame hold series of different types.

s := dataframe.NewSeries("weight", nil, 115.5, 93.1)
fmt.Println(s.Table())

Output:

+-----+---------+
|     | WEIGHT  |
+-----+---------+
| 0:  |  115.5  |
| 1:  |  93.1   |
+-----+---------+
| 2X1 | FLOAT64 |
+-----+---------+

Series with type definition:

s := dataframe.NewSeries[float64]("weight", nil, 115, 93.1)
fmt.Println(s.Table())

Output:

+-----+---------+
|     | WEIGHT  |
+-----+---------+
| 0:  |   115   |
| 1:  |  93.1   |
+-----+---------+
| 2X1 | FLOAT64 |
+-----+---------+

You can also define series of your own type:

type Dog struct {
    name string
}
s := dataframe.NewSeries("dogs", nil,
    Dog{"Abby"},
    Dog{"Agas"},
)
fmt.Println(s.Table())

Output:

+-----+----------+
|     |   DOGS   |
+-----+----------+
| 0:  |  {Abby}  |
| 1:  |  {Agas}  |
+-----+----------+
| 2X1 | MAIN DOG |
+-----+----------+

Or series of any type:

s := dataframe.NewSeries[any]("numbers", nil, 10, "ten", 10.0)
fmt.Println(s.Table())

Output:

+-----+---------+
|     | NUMBERS |
+-----+---------+
| 0:  |   10    |
| 1:  |   ten   |
| 2:  |   10    |
+-----+---------+
| 3X1 |   ANY   |
+-----+---------+

2.1. Series manipulation

Series provides a few functions for data manipulation:

  1. s.Value(row int, options ...Options) T returns the value of a particular row.
  2. s.Prepend(val []T, options ...Options) adds one or more values to the beginning of the series.
  3. s.Append(val []T, options ...Options) int adds one or more values to the end of the series.
  4. s.Insert(row int, val []T, options ...Options) inserts values at an arbitrary row in the series. All existing values from that row onwards are shifted down.
  5. s.Remove(row int, options ...Options) deletes the value of a particular row.
  6. s.Reset(options ...Options) clears all data contained in the Series.
  7. s.Update(row int, val T, options ...Options) updates the value of a particular row.

Example:

s := dataframe.NewSeries[float64]("numbers", nil, 1, 2, 3) // [1, 2, 3]
s.Append([]float64{0, 0})    // [1, 2, 3, 0, 0]
s.Prepend([]float64{0, 0})   // [0, 0, 1, 2, 3, 0, 0]
s.Insert(2, []float64{-1})   // [0, 0, -1, 1, 2, 3, 0, 0]
s.Update(-1, -1)             // [0, 0, -1, 1, 2, 3, 0, -1]
s.Remove(0)                  // [0, -1, 1, 2, 3, 0, -1]
fmt.Println(s.Table())

Output:

+-----+---------+
|     | NUMBERS |
+-----+---------+
| 0:  |    0    |
| 1:  |   -1    |
| 2:  |    1    |
| 3:  |    2    |
| 4:  |    3    |
| 5:  |    0    |
| 6:  |   -1    |
+-----+---------+
| 7X1 | FLOAT64 |
+-----+---------+

2.2. Fill values randomly

A series can be filled with random values:

s := dataframe.NewSeries("rand", nil, math.NaN(), math.NaN(), math.NaN())
s.FillRand(dataframe.RandFillerFloat64())

You can also define your own RandFiller as a function of type dataframe.RandFn[T any].
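A custom filler only has to match the documented signature `RandFn[T any] func() T`. The sketch below mirrors that type locally so it runs without the package; `randIntFiller` and its parameters are hypothetical names chosen for illustration, not part of the library.

```go
package main

import (
	"fmt"
	"math/rand"
)

// RandFn mirrors the documented dataframe.RandFn[T any] signature.
type RandFn[T any] func() T

// randIntFiller is a hypothetical helper (not part of the package). It
// returns a filler that yields integers in the half-open range [lo, hi).
// hi must be greater than lo.
func randIntFiller(seed int64, lo, hi int) RandFn[int] {
	rng := rand.New(rand.NewSource(seed))
	return func() int {
		return lo + rng.Intn(hi-lo)
	}
}

func main() {
	fill := randIntFiller(1, 10, 20)
	values := make([]int, 5)
	for i := range values {
		values[i] = fill()
	}
	fmt.Println(values) // five integers, each between 10 and 19
}
```

With the real package, a function of this shape would presumably be passed to s.FillRand on a series of the matching element type, in place of dataframe.RandFillerFloat64().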

2.3. Sorting

To sort a series, first provide a CompareFn[T any] as the series' less-than function:

ctx := context.Background()
s := dataframe.NewSeries("sorted", nil, 0, 2, 1, 4, 3, 6, 5, 10, 9, 8, 7)
s.SetIsLessThanFunc(dataframe.IsLessThanFunc[int])
s.Sort(ctx) // for descending order: s.Sort(ctx, dataframe.SortOptions{Desc: true})
fmt.Println(s.Table())

Output:

+------+--------+
|      | SORTED |
+------+--------+
|  0:  |   0    |
|  1:  |   1    |
|  2:  |   2    |
|  3:  |   3    |
|  4:  |   4    |
|  5:  |   5    |
|  6:  |   6    |
|  7:  |   7    |
|  8:  |   8    |
|  9:  |   9    |
| 10:  |   10   |
+------+--------+
| 11X1 |  INT   |
+------+--------+

2.4. Values iterator

The values iterator is used to iterate over series data. The iterator accepts the following options:

  1. InitialRow - the iterator starts at this row. It can be a negative value for indexing from the end of the series.
  2. Step - the iteration step. It can be a negative value to iterate backwards.
  3. DontLock - if true is passed, then the series is not locked by the iterator.

s := dataframe.NewSeries("iterate", nil, 1, 2, 3)
iterator := s.Iterator()
for iterator.Next() {
    fmt.Println(iterator.Index, "->", iterator.Value)
}

Output:

0 -> 1
1 -> 2
2 -> 3

2.5. Apply and Filter

You can apply a function to modify the values of the series. You can also filter series data by returning DROP or KEEP for each value.

Apply:

s := dataframe.NewSeries("apply", nil, 1., 2., 3.) // *dataframe.Series[float64]
    
applyFn := func (val float64, row, nRows int) float64 {
    return val / 2
}
_, err := s.Apply(ctx, applyFn, dataframe.ApplyOptions { InPlace: true })
if err != nil {
    panic(err)
}
fmt.Println(s.Table())

Output:

+-----+---------+
|     |  APPLY  |
+-----+---------+
| 0:  |   0.5   |
| 1:  |    1    |
| 2:  |   1.5   |
+-----+---------+
| 3X1 | FLOAT64 |
+-----+---------+

Filter:

s := dataframe.NewSeries("filter", nil, 1., math.NaN(), 3.)
    
filterFn := func (val float64, row, nRows int) (dataframe.FilterAction, error) {
    if math.IsNaN(val) {
        return dataframe.DROP, nil
    }
    return dataframe.KEEP, nil
}
_, err := s.Filter(ctx, filterFn, dataframe.FilterOptions { InPlace: true })
if err != nil {
    panic(err)
}
fmt.Println(s.Table())

Output:

+-----+---------+
|     | FILTER  |
+-----+---------+
| 0:  |    1    |
| 1:  |    3    |
+-----+---------+
| 2X1 | FLOAT64 |
+-----+---------+

2.6. Copy and Equality

You can create a copy of a series and compare two series for equality.

s1 := dataframe.NewSeries[float64]("s1", nil, 1, 2, 3, 4)
s2 := s1.Copy() // copy of series s1
eq, err := s1.IsEqual(ctx, s2) // returns true, nil
// The lines below return false, nil:
// s2.Rename("s2")
// eq, err = s1.IsEqual(ctx, s2, dataframe.IsEqualOptions{CheckName: true})

3. DataFrame

A DataFrame is a container for Series of any kind. You can think of a DataFrame as an Excel spreadsheet.

x := dataframe.NewSeries("x", nil, 1., 2., 3.)
y := dataframe.NewSeries("y", nil, 1., 2., 3.)
df := dataframe.NewDataFrame(x, y)
fmt.Println(df.Table())

Output:

+-----+---------+---------+
|     |    X    |    Y    |
+-----+---------+---------+
| 0:  |    1    |    1    |
| 1:  |    2    |    2    |
| 2:  |    3    |    3    |
+-----+---------+---------+
| 3X2 | FLOAT64 | FLOAT64 |
+-----+---------+---------+

3.1. DataFrame manipulation

DataFrame provides functions for data manipulation, similar to those of Series:

  1. df.Row(row int, options ...Options) map[string]any returns the series' values for a particular row.
  2. df.Prepend(vals any, options ...Options) inserts a row at the beginning.
  3. df.Append(vals any, options ...Options) inserts a row at the end.
  4. df.Insert(row int, vals any, options ...Options) adds a row to a particular position.
  5. df.Remove(row int, options ...Options) deletes a row.
  6. df.UpdateRow(row int, vals any, options ...Options) will update an entire row.
  7. df.Update(row int, col any, val any, options ...Options) is used to update a specific entry. col can be the name of the series or the column number.
  8. df.ReorderColumns(newOrder []string, options ...Options) error reorders the columns based on an ordered list of column names. The length of newOrder must match the number of columns in the Dataframe. The column names in newOrder must be unique.
  9. df.RemoveSeries(seriesName string, options ...Options) error will remove a Series from the Dataframe.
  10. df.AddSeries(s SeriesAny, colN *int, options ...Options) error will add a Series to the end of the DataFrame, unless a position is set by colN.
  11. df.Swap(row1, row2 int, options ...Options) is used to swap 2 values based on their row position.

In many cases the values should be provided as map[string]any, map[int]any or []any.

s1 := dataframe.NewSeries[float64]("a", nil, 1, 2, 3, 4)
s2 := dataframe.NewSeries[float64]("b", nil, 1, 2, 3, 4)
df := dataframe.NewDataFrame(s1, s2)
df.Append(map[string]any{
    "a": []float64{0, 0},
    "b": []float64{0, 0},
})
df.Prepend(map[string]any{
    "a": []float64{0, 0},
    "b": []float64{0, 0},
})
df.Insert(2, map[string]any{
    "a": -1.0,
    "b": -1.0,
})
df.Update(-1, "a", -1.0)
fmt.Println(df.Table())

Output:

+-----+---------+---------+
|     |    A    |    B    |
+-----+---------+---------+
| 0:  |    0    |    0    |
| 1:  |    0    |    0    |
| 2:  |   -1    |   -1    |
| 3:  |    1    |    1    |
| 4:  |    2    |    2    |
| 5:  |    3    |    3    |
| 6:  |    4    |    4    |
| 7:  |    0    |    0    |
| 8:  |   -1    |    0    |
+-----+---------+---------+
| 9X2 | FLOAT64 | FLOAT64 |
+-----+---------+---------+

3.2. Fill values randomly

You can fill all series of a DataFrame at once with a RandFiller:

s1 := dataframe.NewSeries("a", nil, math.NaN(), math.NaN(), math.NaN())
s2 := dataframe.NewSeries("b", nil, math.NaN(), math.NaN(), math.NaN())
df := dataframe.NewDataFrame(s1, s2)
df.FillRand(func() any {
    return rand.Float64()
})

3.3. Sorting

To sort a DataFrame, provide a CompareFn[T any] for all of the series by calling SetIsLessThanFunc() on each:

s1 := dataframe.NewSeries("a", nil, 0, 2, 1, 4, 3, 6, 5, 10, 9, 8, 7)
s2 := dataframe.NewSeries("b", nil, 0, 2, 1, 4, 3, 6, 5, 10, 9, 8, 7)
s1.SetIsLessThanFunc(dataframe.IsLessThanFunc[int])
s2.SetIsLessThanFunc(dataframe.IsLessThanFunc[int])
df := dataframe.NewDataFrame(s1, s2)
    
df.Sort(ctx, []dataframe.SortKey {
    { Key: "a" }, // Desc: true
    { Key: "b" }, // Desc: true
})
fmt.Println(df.Table())

Output:

+------+-----+-----+
|      |  A  |  B  |
+------+-----+-----+
|  0:  |  0  |  0  |
|  1:  |  1  |  1  |
|  2:  |  2  |  2  |
|  3:  |  3  |  3  |
|  4:  |  4  |  4  |
|  5:  |  5  |  5  |
|  6:  |  6  |  6  |
|  7:  |  7  |  7  |
|  8:  |  8  |  8  |
|  9:  |  9  |  9  |
| 10:  | 10  | 10  |
+------+-----+-----+
| 11X2 | INT | INT |
+------+-----+-----+

3.4. Values iterator

The values iterator is used to iterate over dataframe rows. The iterator accepts the following options:

  1. InitialRow - the iterator starts at this row. It can be a negative value for indexing from the end of the dataframe.
  2. Step - the iteration step. It can be a negative value to iterate backwards.
  3. DontLock - if true is passed, then the dataframe is not locked by the iterator.

s1 := dataframe.NewSeries("a", nil, 1, 2, 3)
s2 := dataframe.NewSeries("b", nil, 1, 2, 3)
df := dataframe.NewDataFrame(s1, s2)
var iterator = df.Iterator()
for iterator.Next() {
    fmt.Println(iterator.Index, iterator.Value)
}

Output:

0 map[a:1 b:1]
1 map[a:2 b:2]
2 map[a:3 b:3]

3.5. Apply and Filter

You can apply a function to modify the rows of the dataframe. You can also filter the rows of the dataframe by returning DROP or KEEP for each row.

Apply:

y1 := dataframe.NewSeries[float64]("y1", &dataframe.SeriesInit{Size: 24})
y2 := dataframe.NewSeries[float64]("y2", &dataframe.SeriesInit{Size: 24})
    
df := dataframe.NewDataFrame(y1, y2)

fn := func (vals map[string]any, row, nRows int) map[string]any {
    x := float64(row + 1)
    y := math.Sin(2 * math.Pi * x / 24)
    if y == 1 || y == -1 {
        return map[string]any{
            "y1": y,
            "y2": y,
        }
    }
    // We can also update just one column
    return map[string]any{
        "y1": y,
    }
}
_, err := df.Apply(ctx, fn, dataframe.ApplyOptions { InPlace: true })
if err != nil {
    panic(err)
}
fmt.Println(df.Table())

Output:

+------+------------------------+---------+
|      |           Y1           |   Y2    |
+------+------------------------+---------+
|  0:  |  0.25881904510252074   |   NaN   |
|  1:  |  0.49999999999999994   |   NaN   |
|  2:  |   0.7071067811865475   |   NaN   |
|  3:  |   0.8660254037844386   |   NaN   |
|  4:  |   0.9659258262890683   |   NaN   |
|  5:  |           1            |    1    |
|  6:  |   0.9659258262890683   |   NaN   |
|  7:  |   0.8660254037844388   |   NaN   |
|  8:  |   0.7071067811865476   |   NaN   |
|  9:  |  0.49999999999999994   |   NaN   |
| 10:  |   0.258819045102521    |   NaN   |
| 11:  | 1.2246467991473515e-16 |   NaN   |
| 12:  |  -0.2588190451025208   |   NaN   |
| 13:  |  -0.4999999999999998   |   NaN   |
| 14:  |  -0.7071067811865471   |   NaN   |
| 15:  |  -0.8660254037844384   |   NaN   |
| 16:  |  -0.9659258262890683   |   NaN   |
| 17:  |           -1           |   -1    |
| 18:  |  -0.9659258262890684   |   NaN   |
| 19:  |  -0.8660254037844386   |   NaN   |
| 20:  |  -0.7071067811865477   |   NaN   |
| 21:  |  -0.5000000000000004   |   NaN   |
| 22:  |  -0.2588190451025215   |   NaN   |
| 23:  | -2.449293598294703e-16 |   NaN   |
+------+------------------------+---------+
| 24X2 |        FLOAT64         | FLOAT64 |
+------+------------------------+---------+

Filter:

s := dataframe.NewSeries("s", nil, 1, 2, 3, 4, 5)
df := dataframe.NewDataFrame(s)
    
fn := func (vals map[string]any, row, nRows int) (dataframe.FilterAction, error) {
    if row % 2 != 0 {
        return dataframe.DROP, nil
    }
    return dataframe.KEEP, nil
}
_, err := df.Filter(ctx, fn, dataframe.FilterOptions { InPlace: true })
if err != nil {
    panic(err)
}
fmt.Println(df.Table())

Output:

+-----+-----+
|     |  S  |
+-----+-----+
| 0:  |  1  |
| 1:  |  3  |
| 2:  |  5  |
+-----+-----+
| 3X1 | INT |
+-----+-----+

3.6. Copy and Equality

You can create a copy of the dataframe and compare two different dataframes.

s := dataframe.NewSeries[float64]("s", nil, 1, 2, 3, 4)
df1 := dataframe.NewDataFrame(s)
df2 := df1.Copy() // copy of dataframe df1
eq, err := df1.IsEqual(ctx, df2) // returns true, nil

3.7. Import dataframe from CSV

A dataframe can be imported directly from CSV:

csvString := `
A,B,C,D
0.0,0.0,0.02,0
0.0,1.6739,0.04,0
0.0,1.6739,0.06,0
0.0,1.673738,0.06,0
0.0,1.6736,0.06,0
0.0,1.673456,0.08,0
0.0,1.67302752,0.08,0
0.0,1.6726333184,0.08,0
1.6681,0.0,0.02,1`
reader := strings.NewReader(csvString)
df, err := csv.Load(ctx, reader, map[string]csv.ConverterAny {
    "A": csv.Float64,
    "B": csv.Float64,
    "C": csv.Float64,
    "D": csv.Float64,
})
if err != nil {
    panic(err)
}
fmt.Println(df.Table())

Output:

+-----+---------+---------+--------------+---------+
|     |    D    |    A    |      B       |    C    |
+-----+---------+---------+--------------+---------+
| 0:  |    0    |    0    |      0       |  0.02   |
| 1:  |    0    |    0    |    1.6739    |  0.04   |
| 2:  |    0    |    0    |    1.6739    |  0.06   |
| 3:  |    0    |    0    |   1.673738   |  0.06   |
| 4:  |    0    |    0    |    1.6736    |  0.06   |
| 5:  |    0    |    0    |   1.673456   |  0.08   |
| 6:  |    0    |    0    |  1.67302752  |  0.08   |
| 7:  |    0    |    0    | 1.6726333184 |  0.08   |
| 8:  |    1    | 1.6681  |      0       |  0.02   |
+-----+---------+---------+--------------+---------+
| 9X4 | FLOAT64 | FLOAT64 |   FLOAT64    | FLOAT64 |
+-----+---------+---------+--------------+---------+

You can also define a custom converter to fit your needs.
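The exact converter type expected by csv.Load is not shown in this README, so the following is only a standard-library sketch of the idea: each column name maps to a function that turns the raw cell text into a typed value. The names `converter`, `parseColumns`, `toFloat`, and `toFlag` are hypothetical and do not come from the package.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// converter is a hypothetical per-column conversion function.
type converter func(cell string) (any, error)

// toFloat converts a cell to float64.
func toFloat(cell string) (any, error) {
	v, err := strconv.ParseFloat(cell, 64)
	return v, err
}

// toFlag is a custom converter that turns "0"/"1" (and other forms
// accepted by strconv.ParseBool) into a bool.
func toFlag(cell string) (any, error) {
	v, err := strconv.ParseBool(cell)
	return v, err
}

// parseColumns reads CSV text with a header row and converts every cell
// with the converter registered for its column name.
func parseColumns(text string, conv map[string]converter) (map[string][]any, error) {
	records, err := csv.NewReader(strings.NewReader(text)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, errors.New("missing header row")
	}
	header := records[0]
	cols := make(map[string][]any, len(header))
	for _, rec := range records[1:] {
		for i, cell := range rec {
			name := header[i]
			fn, ok := conv[name]
			if !ok {
				return nil, fmt.Errorf("no converter for column %q", name)
			}
			v, err := fn(cell)
			if err != nil {
				return nil, fmt.Errorf("column %q: %w", name, err)
			}
			cols[name] = append(cols[name], v)
		}
	}
	return cols, nil
}

func main() {
	text := "A,D\n0.0,0\n1.6681,1\n"
	cols, err := parseColumns(text, map[string]converter{"A": toFloat, "D": toFlag})
	if err != nil {
		panic(err)
	}
	fmt.Println(cols["A"]) // [0 1.6681]
	fmt.Println(cols["D"]) // [false true]
}
```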

To export a dataframe to CSV you can use:

s1 := dataframe.NewSeries("str", nil, "one", "one,two", "one,two,three")
s2 := dataframe.NewSeries("num", nil, 1, 12, 123)
df := dataframe.NewDataFrame(s1, s2)
f, err := os.OpenFile("data/export.csv", os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0600)
if err != nil {
    panic(err)
}
defer f.Close() // close the file when the surrounding function returns
if err := csv.Export(ctx, f, df); err != nil {
    panic(err)
}

3.8. Math functions and fakers

There is no need to create series from string expressions. Math functions for series can be covered by df.Apply or s.Apply, and fakers can be covered by custom RandFillers. Dedicated math functions and fakers may be added in the future.

Documentation

Variables

var DontLock = dontLock

DontLock is a shortcut for Options { DontLock: true }.

Functions

func DefaultOptions

func DefaultOptions[T any](o ...T) T

DefaultOptions is a helper function to resolve variadic options.
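The documentation does not spell out how the variadic options are resolved. One plausible reading, sketched below under that assumption, is "use the first option if one was passed, otherwise the zero value". `firstOrZero` and `describe` are hypothetical stand-ins, and the real helper may apply non-zero defaults (IteratorOptions, for instance, documents a default Step of 1).

```go
package main

import "fmt"

// Options mirrors the documented dataframe.Options struct.
type Options struct {
	DontLock bool
}

// firstOrZero is a hypothetical stand-in for DefaultOptions: it returns
// the first variadic argument, or the zero value of T when none is given.
func firstOrZero[T any](o ...T) T {
	if len(o) > 0 {
		return o[0]
	}
	var zero T
	return zero
}

// describe shows how a method taking `options ...Options` might resolve them.
func describe(options ...Options) string {
	opts := firstOrZero(options...)
	if opts.DontLock {
		return "caller holds the lock"
	}
	return "operation locks internally"
}

func main() {
	fmt.Println(describe())                        // operation locks internally
	fmt.Println(describe(Options{DontLock: true})) // caller holds the lock
}
```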

func DefaultValueFormatter

func DefaultValueFormatter(v interface{}) string

DefaultValueFormatter will return a string representation of the data in a particular row.

func IsEqualDefaultFunc

func IsEqualDefaultFunc[T any](f1, f2 T) bool

IsEqualDefaultFunc provides comparison for any type

func IsEqualFunc

func IsEqualFunc[T comparable](f1, f2 T) bool

IsEqualFunc provides basic comparison for comparable types

func IsEqualPtrFunc

func IsEqualPtrFunc[T comparable](f1, f2 *T) bool

IsEqualPtrFunc provides comparison for pointers of comparable types

func IsLessThanFunc

func IsLessThanFunc[T constraints.Ordered](f1, f2 T) bool

IsLessThanFunc provides (less than) comparison for Ordered types

func IsLessThanPtrFunc

func IsLessThanPtrFunc[T constraints.Ordered](f1, f2 *T) bool

IsLessThanPtrFunc provides (less than) comparison for pointers of Ordered types

Types

type ApplyDataFrameFn

type ApplyDataFrameFn func(vals map[string]any, row, nRows int) map[string]any

ApplyDataFrameFn is used by the Apply function when used with DataFrames. vals contains the values for the current row, keyed by the name of each Series. The returned map must only contain the values you intend to update, again keyed by Series name. If nil is returned, the existing values for the row are unchanged.
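The partial-update contract can be illustrated without the package. In this sketch, rows are plain maps and `applyRows` is a hypothetical helper that merges whatever the function returns into each row; it is an assumption about the behavior described above, not the library's implementation.

```go
package main

import "fmt"

// ApplyDataFrameFn mirrors the documented function signature.
type ApplyDataFrameFn func(vals map[string]any, row, nRows int) map[string]any

// applyRows is a hypothetical helper: it merges the returned map into
// each row, so columns missing from the result keep their old values.
func applyRows(rows []map[string]any, fn ApplyDataFrameFn) {
	for i, row := range rows {
		// Ranging over a nil map does nothing, so a nil result
		// leaves the row unchanged.
		for name, v := range fn(row, i, len(rows)) {
			row[name] = v
		}
	}
}

// doubleXOnEvenRows updates only column "x", and only on even rows.
func doubleXOnEvenRows(vals map[string]any, row, nRows int) map[string]any {
	if row%2 != 0 {
		return nil
	}
	x, ok := vals["x"].(float64)
	if !ok {
		return nil
	}
	return map[string]any{"x": 2 * x}
}

func main() {
	rows := []map[string]any{
		{"x": 1.0, "y": 10.0},
		{"x": 2.0, "y": 20.0},
		{"x": 3.0, "y": 30.0},
	}
	applyRows(rows, doubleXOnEvenRows)
	for _, row := range rows {
		fmt.Println(row)
	}
	// map[x:2 y:10]
	// map[x:2 y:20]
	// map[x:6 y:30]
}
```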

type ApplyOptions

type ApplyOptions = FilterOptions

ApplyOptions defines the optional parameters for Apply(...) on a Series or DataFrame.

Defaults:

ApplyOptions { InPlace: false, DontLock: false }

Properties:

  • `InPlace` - Apply affects current Series/DataFrame and no new one is returned
  • `DontLock` - if set to true, then operation is performed without locking RWMutex

type ApplySeriesFn

type ApplySeriesFn[T any] func(val T, row, nRows int) T

ApplySeriesFn is used by the Apply function when used with Series. val contains the value of the current row. The returned value is the updated value.
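As a rough illustration of the two modes selected by `InPlace`, the sketch below applies an ApplySeriesFn to a plain slice. `apply` and `half` are hypothetical names; the real Apply works on a *Series[T], takes a context, and returns an error as well.

```go
package main

import "fmt"

// ApplySeriesFn mirrors the documented function signature.
type ApplySeriesFn[T any] func(val T, row, nRows int) T

// apply is a hypothetical helper. With inPlace set it overwrites values;
// otherwise it writes the results to a new slice and leaves values alone.
func apply[T any](values []T, fn ApplySeriesFn[T], inPlace bool) []T {
	out := values
	if !inPlace {
		out = make([]T, len(values))
	}
	for i, v := range values {
		out[i] = fn(v, i, len(values))
	}
	return out
}

// half divides every value by two.
func half(val float64, row, nRows int) float64 {
	return val / 2
}

func main() {
	values := []float64{1, 2, 3}

	halved := apply[float64](values, half, false)
	fmt.Println(halved, values) // [0.5 1 1.5] [1 2 3]

	apply[float64](values, half, true)
	fmt.Println(values) // [0.5 1 1.5]
}
```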

type CompareFn

type CompareFn[T any] func(T, T) bool

CompareFn is the type of a function that compares two values of the same type.
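For element types that are not ordered, such as the Dog struct from the README, a less-than function has to be written by hand. The sketch below mirrors CompareFn locally and sorts a plain slice with the standard library; `dogLess` and `sortWith` are hypothetical names, and the package's own sorting may work differently.

```go
package main

import (
	"fmt"
	"sort"
)

// CompareFn mirrors the documented dataframe.CompareFn[T any] signature.
type CompareFn[T any] func(T, T) bool

// Dog is the custom element type used in the README example.
type Dog struct {
	name string
}

// dogLess orders dogs alphabetically by name.
func dogLess(a, b Dog) bool {
	return a.name < b.name
}

// sortWith is a hypothetical helper that sorts values in place using
// less. When desc is true the comparison is reversed.
func sortWith[T any](values []T, less CompareFn[T], desc bool) {
	sort.SliceStable(values, func(i, j int) bool {
		if desc {
			return less(values[j], values[i])
		}
		return less(values[i], values[j])
	})
}

func main() {
	dogs := []Dog{{"Bella"}, {"Agas"}, {"Abby"}}
	sortWith[Dog](dogs, dogLess, false)
	fmt.Println(dogs) // [{Abby} {Agas} {Bella}]
}
```

With the real package, a function such as dogLess would be registered through s.SetIsLessThanFunc before calling s.Sort, as in the sorting example of the README.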

type DataFrame

type DataFrame struct {
	Series []SeriesAny
	// contains filtered or unexported fields
}

DataFrame allows you to handle numerous series of data conveniently.

func ApplyDataFrame

func ApplyDataFrame(ctx context.Context, df *DataFrame, fn ApplyDataFrameFn, options ...ApplyOptions) (*DataFrame, error)

ApplyDataFrame applies function to DataFrame. If ApplyOptions are set as `ApplyOptions { InPlace: true }` then dataframe is modified, otherwise new dataframe is returned.

func FilterDataFrame

func FilterDataFrame(ctx context.Context, df *DataFrame, fn FilterDataFrameFn, options ...FilterOptions) (*DataFrame, error)

FilterDataFrame applies filter function to DataFrame. If FilterOptions are set as `FilterOptions { InPlace: true }` then dataframe is modified, otherwise new dataframe is returned.

func NewDataFrame

func NewDataFrame(se ...SeriesAny) *DataFrame

NewDataFrame creates a dataframe from passed series.

Example:

	x := NewSeries[float64]("x", nil, 1, 2, 3)
	y := NewSeries("y", nil, 1., 2., 3.)
	df := NewDataFrame(x, y)

func (*DataFrame) AddSeries

func (df *DataFrame) AddSeries(s SeriesAny, colN *int, options ...Options) error

AddSeries will add a Series to the end of the DataFrame, unless a position is set by colN.

func (*DataFrame) Append

func (df *DataFrame) Append(vals any, options ...Options)

Append inserts a row at the end.

func (*DataFrame) Apply

func (df *DataFrame) Apply(ctx context.Context, fn ApplyDataFrameFn, options ...ApplyOptions) (*DataFrame, error)

Apply applies function to DataFrame. If ApplyOptions are set as `ApplyOptions { InPlace: true }` then dataframe is modified, otherwise new dataframe is returned.

func (*DataFrame) Copy

func (df *DataFrame) Copy(options ...RangeOptions) *DataFrame

Copy will create a new copy of the Dataframe. It is recommended that you lock the Dataframe before attempting to Copy.

func (*DataFrame) FillRand

func (df *DataFrame) FillRand(rnd RandFn[any])

FillRand will randomly fill all the Series in the Dataframe.

func (*DataFrame) Filter

func (df *DataFrame) Filter(ctx context.Context, fn FilterDataFrameFn, options ...FilterOptions) (*DataFrame, error)

Filter applies filter function to DataFrame. If FilterOptions are set as `FilterOptions { InPlace: true }` then dataframe is modified, otherwise new dataframe is returned.

func (*DataFrame) Insert

func (df *DataFrame) Insert(row int, vals any, options ...Options)

Insert adds a row to a particular position.

func (*DataFrame) IsEqual

func (df *DataFrame) IsEqual(ctx context.Context, df2 *DataFrame, options ...IsEqualOptions) (bool, error)

IsEqual returns true if df2's values are equal to df.

func (*DataFrame) Iterator

func (s *DataFrame) Iterator(options ...IteratorOptions) Iterator[map[string]any]

Iterator will return an Iterator that can be used to iterate through all the rows.

func (*DataFrame) Lock

func (df *DataFrame) Lock(deep ...bool)

Lock will lock the Dataframe allowing you to directly manipulate the underlying Series with confidence.

func (*DataFrame) MustNameToColumn

func (df *DataFrame) MustNameToColumn(seriesName string, options ...Options) int

MustNameToColumn returns the index of the series based on the name. The starting index is 0. If seriesName doesn't exist it panics.

func (*DataFrame) NRows

func (df *DataFrame) NRows(options ...Options) int

NRows returns the number of rows of data. Each series must contain the same number of rows.

func (*DataFrame) NameToColumn

func (df *DataFrame) NameToColumn(seriesName string, options ...Options) (int, error)

NameToColumn returns the index of the series based on the name. The starting index is 0.

func (*DataFrame) Names

func (df *DataFrame) Names(options ...Options) []string

Names will return a list of all the series names.

func (*DataFrame) Prepend

func (df *DataFrame) Prepend(vals any, options ...Options)

Prepend inserts a row at the beginning.

func (*DataFrame) RLock

func (df *DataFrame) RLock(deep ...bool)

RLock will read-lock the Dataframe, allowing you to directly read the underlying Series with confidence.

func (*DataFrame) RUnlock

func (df *DataFrame) RUnlock(deep ...bool)

RUnlock will unlock the Dataframe that was previously read-locked.

func (*DataFrame) Remove

func (df *DataFrame) Remove(row int, options ...Options)

Remove deletes a row.

func (*DataFrame) RemoveSeries

func (df *DataFrame) RemoveSeries(seriesName string, options ...Options) error

RemoveSeries will remove a Series from the Dataframe.

func (*DataFrame) ReorderColumns

func (df *DataFrame) ReorderColumns(newOrder []string, options ...Options) error

ReorderColumns reorders the columns based on an ordered list of column names. The length of newOrder must match the number of columns in the Dataframe. The column names in newOrder must be unique.

func (*DataFrame) Row

func (df *DataFrame) Row(row int, options ...Options) map[string]any

Row returns the series' values for a particular row.

func (*DataFrame) Sort

func (df *DataFrame) Sort(ctx context.Context, keys []SortKey, options ...SortOptions) (completed bool)

Sort is used to sort the Dataframe according to different keys. It will return true if sorting was completed or false when the context is canceled.

func (*DataFrame) String

func (df *DataFrame) String() string

String implements the fmt.Stringer interface. It does not lock the DataFrame.

func (*DataFrame) Swap

func (df *DataFrame) Swap(row1, row2 int, options ...Options)

Swap is used to swap 2 values based on their row position.

func (*DataFrame) Table

func (df *DataFrame) Table(options ...TableOptions) string

Table will produce the DataFrame in a table.

func (*DataFrame) Unlock

func (df *DataFrame) Unlock(deep ...bool)

Unlock will unlock the Dataframe that was previously locked.

func (*DataFrame) Update

func (df *DataFrame) Update(row int, col any, val any, options ...Options)

Update is used to update a specific entry. col can be the name of the series or the column number.

func (*DataFrame) UpdateRow

func (df *DataFrame) UpdateRow(row int, vals any, options ...Options)

UpdateRow will update an entire row.

type FilterAction

type FilterAction int

FilterAction is the return value of FilterSeriesFn and FilterDataFrameFn.

const (
	// DROP is used to signify that a row must be dropped.
	DROP FilterAction = 0

	// KEEP is used to signify that a row must be kept.
	KEEP FilterAction = 1

	// CHOOSE is used to signify that a row must be kept.
	CHOOSE FilterAction = 1
)

type FilterDataFrameFn

type FilterDataFrameFn func(vals map[string]any, row, nRows int) (FilterAction, error)

FilterDataFrameFn is used by the Filter function to determine which rows are selected. vals contains the values for the current row, keyed by the name of each Series. If the function returns DROP, then the row is removed. If KEEP or CHOOSE is returned, the row is kept.

type FilterOptions

type FilterOptions struct {
	InPlace, DontLock bool
}

FilterOptions defines the optional parameters for Filter(...) on a Series or DataFrame.

Defaults:

FilterOptions { InPlace: false, DontLock: false }

Properties:

  • `InPlace` - Filter affects current Series/DataFrame and no new one is returned
  • `DontLock` - if set to true, then operation is performed without locking RWMutex

type FilterSeriesFn

type FilterSeriesFn[T any] func(val T, row, nRows int) (FilterAction, error)

FilterSeriesFn is used by the Filter function to determine which rows are selected. val contains the value of the current row. If the function returns DROP, then the row is removed. If KEEP or CHOOSE is chosen, the row is kept.
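The DROP/KEEP contract can be sketched on a plain slice. The types below mirror the documented ones; `filter` is a hypothetical helper, and stopping at the first error is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"math"
)

// FilterAction and its constants mirror the documented values.
type FilterAction int

const (
	DROP FilterAction = 0
	KEEP FilterAction = 1
)

// FilterSeriesFn mirrors the documented function signature.
type FilterSeriesFn[T any] func(val T, row, nRows int) (FilterAction, error)

// filter is a hypothetical helper that returns a new slice containing
// only the values for which fn returns KEEP. It stops at the first error.
func filter[T any](values []T, fn FilterSeriesFn[T]) ([]T, error) {
	kept := make([]T, 0, len(values))
	for i, v := range values {
		action, err := fn(v, i, len(values))
		if err != nil {
			return nil, err
		}
		if action == KEEP {
			kept = append(kept, v)
		}
	}
	return kept, nil
}

// dropNaN drops NaN values and keeps everything else.
func dropNaN(val float64, row, nRows int) (FilterAction, error) {
	if math.IsNaN(val) {
		return DROP, nil
	}
	return KEEP, nil
}

func main() {
	kept, err := filter[float64]([]float64{1, math.NaN(), 3}, dropNaN)
	if err != nil {
		panic(err)
	}
	fmt.Println(kept) // [1 3]
}
```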

type IsEqualOptions

type IsEqualOptions struct {
	CheckName, DontLock bool
}

IsEqualOptions defines the optional parameters for IsEqual(...) on a Series or DataFrame.

Defaults:

IsEqualOptions { CheckName: false, DontLock: false }

Properties:

  • `CheckName` - if set to true, the names must also match for the comparison to succeed
  • `DontLock` - if set to true, then operation is performed without locking RWMutex

type Iterator

type Iterator[T any] struct {
	Index, Total int

	Value T
	// contains filtered or unexported fields
}

Iterator is a structure for iterating over Series or DataFrames. Each call to `Next()` fills in a new Index and Value for as long as `notDone` is true.

func NewIterator

func NewIterator[T any](iterator IteratorFn[T]) Iterator[T]

NewIterator creates Iterator instance with iterator function of type `IteratorFn[T any]`. Iterator function is called with `iterator.Next()`

func (*Iterator[T]) Next

func (it *Iterator[T]) Next() bool

Next advances the iterator by calling the iterator function. It returns true if there is a next value to read.

type IteratorFn

type IteratorFn[T any] func() (int, T, int, bool)

IteratorFn returns the current row, the value for that row, the total number of elements, and a "not done" flag.
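To make the four return values concrete, here is a self-contained sketch of an iterator built on a function with that signature. The types mirror the documented ones, but `sliceIterator` and the way it interprets a negative initial row or step are assumptions for illustration, not the package's code.

```go
package main

import "fmt"

// IteratorFn mirrors the documented signature:
// current row, value, total number of elements, "not done" flag.
type IteratorFn[T any] func() (int, T, int, bool)

// Iterator mirrors the documented exported fields.
type Iterator[T any] struct {
	Index, Total int
	Value        T
	fn           IteratorFn[T]
}

// Next calls the iterator function and stores its result. It returns
// false once the function reports that iteration is finished.
func (it *Iterator[T]) Next() bool {
	row, val, total, notDone := it.fn()
	if !notDone {
		return false
	}
	it.Index, it.Value, it.Total = row, val, total
	return true
}

// sliceIterator is a hypothetical constructor. A negative initialRow
// counts from the end and a negative step walks backwards.
func sliceIterator[T any](values []T, initialRow, step int) *Iterator[T] {
	row := initialRow
	if row < 0 {
		row += len(values)
	}
	fn := func() (int, T, int, bool) {
		if step == 0 || row < 0 || row >= len(values) {
			var zero T
			return 0, zero, len(values), false
		}
		cur := row
		row += step
		return cur, values[cur], len(values), true
	}
	return &Iterator[T]{fn: fn}
}

func main() {
	it := sliceIterator([]string{"a", "b", "c", "d"}, -1, -2)
	for it.Next() {
		fmt.Println(it.Index, "->", it.Value)
	}
	// 3 -> d
	// 1 -> b
}
```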

type IteratorOptions

type IteratorOptions struct {
	InitialRow, Step int
	DontLock         bool
}

IteratorOptions defines the optional parameters for Iterator(...) on a Series or DataFrame.

Defaults:

IteratorOptions {
	InitialRow: 0,
	Step: 1,
	DontLock: false
}

Properties:

  • `InitialRow` - if set then iterator will start at this row.
  • `Step` - iteration step. A negative value causes backward iteration
  • `DontLock` - if set to true, then operation is performed without locking RWMutex

type Options

type Options struct {
	DontLock bool
}

Options is used to perform an operation with DontLock. Note that, by default, all operations on series or dataframes are performed with the RWMutex locked.

Defaults:

Options { DontLock: false }

Properties:

  • `DontLock` - if set to true, then operation is performed without locking RWMutex

type RandFn

type RandFn[T any] func() T

func RandFillerFloat64

func RandFillerFloat64(probNil ...float64) RandFn[float64]

RandFillerFloat64 is a helper function to fill the data of a *Series[float64] randomly. probNil is an optional parameter that sets the probability of returning a NaN value.

type RangeOptions

type RangeOptions struct {
	Start int
	End   *int
}

RangeOptions defines the optional parameters for functions that need a range, such as Copy(...), Apply(...), Filter(...), etc.

Note that DataFrame and Series call Limits(length) on the RangeOptions they receive. If `r.End == nil`, end is set to -1. Negative values index from the end. For example, Range(0, -1) is the same as Range(0, len(arr) - 1).

Defaults:

RangeOptions { Start: 0, End: nil }

Properties:

  • `Start` - Defines start row/index for iteration/copy
  • `End` - Defines where iteration/copy should end
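The negative-index rule can be made concrete with a small sketch. `resolve` below is a hypothetical re-implementation of what Limits might do, written only from the description above: the end index is treated as inclusive (because Range(0, -1) equals Range(0, len(arr) - 1)), and the error conditions are assumptions.

```go
package main

import "fmt"

// RangeOptions mirrors the documented struct.
type RangeOptions struct {
	Start int
	End   *int
}

// resolve is a hypothetical stand-in for Limits. It maps negative
// indices to positions counted from the end and validates the result.
func resolve(r RangeOptions, length int) (start, end int, err error) {
	start, end = r.Start, -1 // a nil End means "up to the last element"
	if r.End != nil {
		end = *r.End
	}
	if start < 0 {
		start += length
	}
	if end < 0 {
		end += length
	}
	if start < 0 || end >= length || start > end {
		return 0, 0, fmt.Errorf("resolved range [%d, %d] does not fit length %d", start, end, length)
	}
	return start, end, nil
}

func main() {
	end := -2
	s, e, err := resolve(RangeOptions{Start: 1, End: &end}, 5)
	fmt.Println(s, e, err) // 1 3 <nil>

	s, e, err = resolve(RangeOptions{}, 5)
	fmt.Println(s, e, err) // 0 4 <nil>

	_, _, err = resolve(RangeOptions{Start: 7}, 5)
	fmt.Println(err != nil) // true
}
```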

func Range

func Range(r ...int) RangeOptions

Range is helper function for creating RangeOptions.

Example:

	r1 := Range(0, 10) // Equivalent to RangeOptions { Start: 0, End: &[]int { 10 }[0]}
	r2 := Range(10) // Equivalent to RangeOptions { Start: 10 }

func (RangeOptions) Limits

func (r RangeOptions) Limits(length int) (int, int, error)

Limits sets start and end index.

type Series

type Series[T any] struct {

	// Values is exported to better improve interoperability with the gonum package.
	//
	// See: https://godoc.org/gonum.org/v1/gonum
	//
	// WARNING: Do not modify directly.
	Values []T

	sync.RWMutex
	// contains filtered or unexported fields
}

func ApplySeries

func ApplySeries[T any](ctx context.Context, s *Series[T], fn ApplySeriesFn[T], options ...ApplyOptions) (*Series[T], error)

ApplySeries applies a function to the series. If ApplyOptions are set as `ApplyOptions { InPlace: true }` then the series is modified, otherwise a new series is returned.

func FilterSeries

func FilterSeries[T any](ctx context.Context, s *Series[T], fn FilterSeriesFn[T], options ...FilterOptions) (*Series[T], error)

FilterSeries applies filter function to series. If FilterOptions are set as `FilterOptions { InPlace: true }` then series is modified, otherwise new series is returned.

func GetSeries

func GetSeries[T any, U int | string](df *DataFrame, name U) *Series[T]

GetSeries helps get series of `DataFrame` as a series of concrete type

func NewSeries

func NewSeries[T any](name string, init *SeriesInit, vals ...T) *Series[T]

NewSeries creates a series of type T with the given name. The size of the series can be preallocated by passing `init`. The series can also be filled with the data passed as vals.

Example:

	x := NewSeries[float64]("x", nil, 1, 2, 3)
	y := NewSeries("y", nil, 1., 2., 3.)

func (*Series[T]) Append

func (s *Series[T]) Append(val []T, options ...Options) int

Append is used to set a value to the end of the series.

func (*Series[T]) AppendAny

func (s *Series[T]) AppendAny(val any, options ...Options) int

AppendAny is used to set a value to the end of the series.

func (*Series[T]) Apply

func (s *Series[T]) Apply(ctx context.Context, fn ApplySeriesFn[T], options ...ApplyOptions) (*Series[T], error)

Apply applies a function to the series. If ApplyOptions are set as `ApplyOptions { InPlace: true }` then the series is modified, otherwise a new series is returned.

func (*Series[T]) Copy

func (s *Series[T]) Copy(options ...RangeOptions) *Series[T]

Copy will create a new copy of the series. It is recommended that you lock the Series before attempting to Copy.

func (*Series[T]) CopyAny

func (s *Series[T]) CopyAny(options ...RangeOptions) SeriesAny

CopyAny will create a new copy of the series. It is recommended that you lock the Series before attempting to Copy.

func (*Series[T]) FillRand

func (s *Series[T]) FillRand(rnd RandFn[T])

FillRand will fill a Series with random data.

func (*Series[T]) FillRandAny

func (s *Series[T]) FillRandAny(rnd RandFn[any])

FillRandAny will fill a Series with random data.

func (*Series[T]) Filter

func (s *Series[T]) Filter(ctx context.Context, fn FilterSeriesFn[T], options ...FilterOptions) (*Series[T], error)

Filter applies the given filter function to the series. If `FilterOptions { InPlace: true }` is passed, the series is modified in place; otherwise a new series is returned.

func (*Series[T]) Insert

func (s *Series[T]) Insert(row int, val []T, options ...Options)

Insert is used to set a value at an arbitrary row in the series. All existing values from that row onwards are shifted by 1.
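The shifting behaviour can be sketched on a plain slice. `insertAt` is a hypothetical helper written for illustration and makes no claim about how the package implements `Insert`; it assumes `row` is between 0 and the current length.

```go
package main

import "fmt"

// insertAt is a hypothetical helper, not part of the dataframe package.
// It returns a new slice with newVals placed at position row; every
// existing value from row onwards moves towards the end.
func insertAt[T any](vals []T, row int, newVals ...T) []T {
	out := make([]T, 0, len(vals)+len(newVals))
	out = append(out, vals[:row]...)
	out = append(out, newVals...)
	out = append(out, vals[row:]...)
	return out
}

func main() {
	names := []string{"a", "b", "c"}
	fmt.Println(insertAt(names, 1, "x")) // "b" and "c" move one row down
}
```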

func (*Series[T]) InsertAny

func (s *Series[T]) InsertAny(row int, val any, options ...Options)

InsertAny is used to set a value at an arbitrary row in the series. All existing values from that row onwards are shifted by 1.

func (*Series[T]) IsEqual

func (s *Series[T]) IsEqual(ctx context.Context, s2 *Series[T], options ...IsEqualOptions) (bool, error)

IsEqual returns true if s2's values are equal to s.

func (*Series[T]) IsEqualAny

func (s *Series[T]) IsEqualAny(ctx context.Context, s2 SeriesAny, options ...IsEqualOptions) (bool, error)

IsEqualAny returns true if s2's values are equal to s.

func (*Series[T]) IsEqualAnyFunc

func (s *Series[T]) IsEqualAnyFunc(a, b any) bool

IsEqualAnyFunc returns true if a is equal to b.

func (*Series[T]) IsEqualFunc

func (s *Series[T]) IsEqualFunc(a, b T) bool

IsEqualFunc returns true if a is equal to b.

func (*Series[T]) IsLessThanAnyFunc

func (s *Series[T]) IsLessThanAnyFunc(a, b any) bool

IsLessThanAnyFunc returns true if a is less than b.

func (*Series[T]) IsLessThanFunc

func (s *Series[T]) IsLessThanFunc(a, b T) bool

IsLessThanFunc returns true if a is less than b.

func (*Series[T]) Iterator

func (s *Series[T]) Iterator(options ...IteratorOptions) Iterator[T]

Iterator will return an iterator that can be used to iterate through all the values.

func (*Series[T]) IteratorAny

func (s *Series[T]) IteratorAny(options ...IteratorOptions) Iterator[any]

IteratorAny will return an iterator that can be used to iterate through all the values.

func (*Series[T]) NRows

func (s *Series[T]) NRows(options ...Options) int

NRows returns how many rows the series contains.

func (*Series[T]) Name

func (s *Series[T]) Name(options ...Options) string

Name returns the series name.

func (*Series[T]) Prepend

func (s *Series[T]) Prepend(val []T, options ...Options)

Prepend is used to add values to the beginning of the series.

func (*Series[T]) PrependAny

func (s *Series[T]) PrependAny(val any, options ...Options)

PrependAny is used to set a value to the beginning of the series.

func (*Series[T]) Remove

func (s *Series[T]) Remove(row int, options ...Options)

Remove is used to delete the value of a particular row.

func (*Series[T]) Rename

func (s *Series[T]) Rename(n string, options ...Options)

Rename renames the series.

func (*Series[T]) Reset

func (s *Series[T]) Reset(options ...Options)

Reset is used to clear all data contained in the Series.

func (*Series[T]) SetIsEqualAnyFunc

func (s *Series[T]) SetIsEqualAnyFunc(f CompareFn[any])

SetIsEqualAnyFunc sets a function which can be used to determine if 2 values in the series are equal.

func (*Series[T]) SetIsEqualFunc

func (s *Series[T]) SetIsEqualFunc(f CompareFn[T])

SetIsEqualFunc sets a function which can be used to determine if 2 values in the series are equal.

func (*Series[T]) SetIsLessThanAnyFunc

func (s *Series[T]) SetIsLessThanAnyFunc(f CompareFn[any])

SetIsLessThanAnyFunc sets a function which can be used to determine if a value is less than another in the series.

func (*Series[T]) SetIsLessThanFunc

func (s *Series[T]) SetIsLessThanFunc(f CompareFn[T])

SetIsLessThanFunc sets a function which can be used to determine if a value is less than another in the series.
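A custom less-than function matters most for types that have no natural ordering, such as a struct. The sketch below uses the standard `sort` package on a plain slice; `sortBy` is a hypothetical helper, and the assumption that a comparison function has the shape `func(a, b T) bool` is made for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// Dog is a user-defined type with no built-in ordering.
type Dog struct {
	name string
}

// sortBy is a hypothetical helper, not part of the dataframe package.
// It sorts vals in place using a caller-supplied less-than function.
func sortBy[T any](vals []T, less func(a, b T) bool) {
	sort.Slice(vals, func(i, j int) bool { return less(vals[i], vals[j]) })
}

func main() {
	dogs := []Dog{{"Agas"}, {"Abby"}}
	// Order dogs alphabetically by name.
	sortBy(dogs, func(a, b Dog) bool { return a.name < b.name })
	fmt.Println(dogs)
}
```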

func (*Series[T]) SetValueToStringFormatter

func (s *Series[T]) SetValueToStringFormatter(f ValueToStringFormatter)

SetValueToStringFormatter is used to set a function to convert the value of a particular row to a string representation.

func (*Series[T]) Sort

func (s *Series[T]) Sort(ctx context.Context, options ...SortOptions) (completed bool)

Sort will sort the series. It will return true if sorting was completed or false when the context is canceled.

func (*Series[T]) String

func (s *Series[T]) String() string

String implements the fmt.Stringer interface. It does not lock the Series.

func (*Series[T]) Swap

func (s *Series[T]) Swap(row1, row2 int, options ...Options)

Swap is used to swap 2 values based on their row position.

func (*Series[T]) Table

func (s *Series[T]) Table(options ...TableOptions) string

Table will produce the Series in a table.

func (*Series[T]) Type

func (s *Series[T]) Type() string

Type returns the type of the series as a string value.

func (*Series[T]) Update

func (s *Series[T]) Update(row int, val T, options ...Options)

Update is used to update the value of a particular row.

func (*Series[T]) UpdateAny

func (s *Series[T]) UpdateAny(row int, val any, options ...Options)

UpdateAny is used to update the value of a particular row.

func (*Series[T]) Value

func (s *Series[T]) Value(row int, options ...Options) T

Value returns the value of a particular row.

func (*Series[T]) ValueAny

func (s *Series[T]) ValueAny(row int, options ...Options) any

ValueAny returns the value of a particular row.

func (*Series[T]) ValueString

func (s *Series[T]) ValueString(row int, options ...Options) string

ValueString returns a string representation of a particular row. The string representation is defined by the function set in SetValueToStringFormatter.

type SeriesAny

type SeriesAny interface {

	// Name returns the series name.
	Name(options ...Options) string

	// Rename renames the series.
	Rename(n string, options ...Options)

	// Type returns the type of the series as a string value.
	Type() string

	// NRows returns how many rows the series contains.
	NRows(options ...Options) int

	// ValueAny returns the value of a particular row.
	ValueAny(row int, options ...Options) any

	// ValueString returns a string representation of a
	// particular row. The string representation is defined
	// by the function set in SetValueToStringFormatter.
	// By default, a nil value is returned as "NaN".
	ValueString(row int, options ...Options) string

	// PrependAny is used to set a value to the beginning of the
	// series.
	PrependAny(val any, options ...Options)

	// AppendAny is used to set a value to the end of the series.
	AppendAny(val any, options ...Options) int

	// InsertAny is used to set a value at an arbitrary row in
	// the series. All existing values from that row onwards
	// are shifted by 1.
	InsertAny(row int, val any, options ...Options)

	// Remove is used to delete the value of a particular row.
	Remove(row int, options ...Options)

	// Reset is used to clear all data contained in the Series.
	Reset(options ...Options)

	// UpdateAny is used to update the value of a particular row.
	UpdateAny(row int, val any, options ...Options)

	// IteratorAny will return an iterator that can be used to iterate through all the values.
	IteratorAny(options ...IteratorOptions) Iterator[any]

	// SetValueToStringFormatter is used to set a function
	// to convert the value of a particular row to a string
	// representation.
	SetValueToStringFormatter(f ValueToStringFormatter)

	// Swap is used to swap 2 values based on their row position.
	Swap(row1, row2 int, options ...Options)

	// IsEqualAnyFunc returns true if a is equal to b.
	IsEqualAnyFunc(a, b any) bool

	// IsLessThanAnyFunc returns true if a is less than b.
	IsLessThanAnyFunc(a, b any) bool

	// SetIsEqualAnyFunc sets a function which can be used to determine
	// if 2 values in the series are equal.
	SetIsEqualAnyFunc(f CompareFn[any])

	// SetIsLessThanAnyFunc sets a function which can be used to determine
	// if a value is less than another in the series.
	SetIsLessThanAnyFunc(f CompareFn[any])

	// Sort will sort the series.
	// It will return true if sorting was completed or false when the context is canceled.
	Sort(ctx context.Context, options ...SortOptions) (completed bool)

	// CopyAny will create a new copy of the series.
	// It is recommended that you lock the Series before attempting
	// to Copy.
	CopyAny(options ...RangeOptions) SeriesAny

	// Table will produce the Series in a table.
	Table(options ...TableOptions) string

	// String implements the fmt.Stringer interface. It does not lock the Series.
	String() string

	// FillRandAny will fill a Series with random data.
	FillRandAny(rnd RandFn[any])

	// IsEqualAny returns true if s2's values are equal to s.
	IsEqualAny(ctx context.Context, s2 SeriesAny, options ...IsEqualOptions) (bool, error)

	// RWMutex Lock
	Lock()

	// RWMutex Unlock
	Unlock()

	// RWMutex RLock
	RLock()

	// RWMutex RUnlock
	RUnlock()
	// contains filtered or unexported methods
}

type SeriesInit

type SeriesInit struct {
	// Prefill the series with nil ("NaN") or default value with
	// Size number of rows.
	Size int

	// How much memory to preallocate.
	// If you know the size of the series in advance,
	// it is better to preallocate the capacity of the
	// underlying slice.
	Capacity int
}

SeriesInit is used to configure the series when it is initialized.
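The distinction between Size and Capacity mirrors the length and capacity of a Go slice. The following sketch shows that analogy with a hypothetical `newValues` helper; it is not the package's initialization code.

```go
package main

import "fmt"

// newValues is a hypothetical helper, not part of the dataframe package.
// size is the number of prefilled rows (zero values here) and capacity is
// the amount of memory reserved up front for later appends.
func newValues[T any](size, capacity int) []T {
	if capacity < size {
		capacity = size
	}
	return make([]T, size, capacity)
}

func main() {
	v := newValues[float64](2, 10)
	fmt.Println(len(v), cap(v)) // 2 rows exist, room for 10 is reserved

	v = append(v, 1.5) // fits in the reserved memory, no reallocation
	fmt.Println(len(v), cap(v))
}
```

Reserving capacity in advance avoids repeated reallocation of the underlying slice when the final number of rows is known.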

type SortKey

type SortKey struct {

	// Key can be an int (position of series) or string (name of series).
	Key any

	// Desc can be set to sort in descending order.
	Desc bool
	// contains filtered or unexported fields
}

SortKey is the key used to sort a DataFrame.

type SortOptions

type SortOptions struct {
	Stable, Desc, DontLock bool
}

SortOptions defines the optional parameters for Sort(...) on a Series or DataFrame.

Defaults:

SortOptions {
	Stable: false,
	Desc: false,
	DontLock: false,
}

Properties:

  • `Desc` - if true, then values will be sorted in descending order
  • `DontLock` - if set to true, then operation is performed without locking RWMutex

type TableOptions

type TableOptions struct {
	Series   []any
	Range    RangeOptions
	DontLock bool
}

TableOptions defines the optional parameters for Table(...) on a Series or DataFrame.

Defaults:

TableOptions {
	Series: nil,
	Range: RangeOptions { Start: 0, End: nil },
	DontLock: false,
}

Properties:

  • `Series` - each element is an int or string and indicates which series the table should contain. Affects only DataFrame
  • `Range` - specifies range for displayed table
  • `DontLock` - if set to true, then operation is performed without locking RWMutex

type ValueToStringFormatter

type ValueToStringFormatter func(val any) string

ValueToStringFormatter is used to convert a value into a string.
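A formatter of this shape could look like the sketch below. The local `formatter` type copies the `func(val any) string` signature shown above so the example is self-contained; `twoDecimals` is a hypothetical formatter, and rendering nil as "NaN" follows the default described for ValueString.

```go
package main

import "fmt"

// formatter has the same signature as ValueToStringFormatter; it is
// redeclared locally so this sketch does not depend on the package.
type formatter func(val any) string

// twoDecimals is a hypothetical formatter: floats are printed with two
// decimal places, nil is printed as "NaN", anything else uses fmt.Sprint.
func twoDecimals(val any) string {
	switch v := val.(type) {
	case nil:
		return "NaN"
	case float64:
		return fmt.Sprintf("%.2f", v)
	default:
		return fmt.Sprint(v)
	}
}

func main() {
	var f formatter = twoDecimals
	fmt.Println(f(93.1), f(nil), f("ten"))
}
```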

Directories

Path Synopsis
csv
