README

datatable

Go package providing a column-centric data structure for aggregating data. Inspired by the R-based data.table package (https://github.com/Rdatatable/data.table/wiki).

Installation

Simply run

go get -u github.com/iand/datatable

Documentation is at http://godoc.org/github.com/iand/datatable

Author

Note that this package was initially developed for Avocet and is released here with their permission.

License

This is free and unencumbered software released into the public domain. For more information, see http://unlicense.org/ or the accompanying UNLICENSE file.

Documentation

Overview

    Package datatable provides a column-centric data structure for aggregating data. See https://github.com/Rdatatable/data.table/wiki for inspiration.

    Variables

    var ErrInvalidColumnLength = errors.New("invalid column length")
    var ErrMismatchedColumnTypes = errors.New("mismatched column types")
    var ErrWrongNumberOfColumns = errors.New("wrong number of columns in data")

    Types

    type Aggregator

    type Aggregator interface {
    	Aggregate(rg RowGroup) float64
    }

    func Count

    func Count() Aggregator

      Count returns an Aggregator that finds the count of numeric values in a group of rows.

      func DifferenceOfSums

      func DifferenceOfSums(a, b string) Aggregator

      func Max

      func Max(name string) Aggregator

        Max returns an Aggregator that finds the maximum value of a numeric column in a group of rows.

        func Mean

        func Mean(name string) Aggregator

          Mean returns an Aggregator that finds the mean value of a numeric column in a group of rows.

          func Min

          func Min(name string) Aggregator

            Min returns an Aggregator that finds the minimum value of a numeric column in a group of rows.

            func RatioOfSums

            func RatioOfSums(a, b string) Aggregator

            func Sum

            func Sum(name string) Aggregator

              Sum returns an Aggregator that sums a numeric column in a group of rows.

              func Variance

              func Variance(name string) Aggregator

                Variance returns an Aggregator that finds the variance of a numeric column in a group of rows.

                type AggregatorFunc

                type AggregatorFunc func(rg RowGroup) float64

                  AggregatorFunc adapts a function to an Aggregator interface

                  func (AggregatorFunc) Aggregate

                  func (fn AggregatorFunc) Aggregate(rg RowGroup) float64
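The adapter lets a closure serve as an Aggregator without declaring a new named type. The sketch below does not import the package; it uses local stand-in types instead. The two-method RowGroup, the sliceGroup implementation and SumOfSquares are hypothetical, and the "call Next, then read the value" iteration order is an assumption about how a RowGroup is meant to be walked.

```go
package main

import "fmt"

// RowGroup is a simplified local stand-in for the package's RowGroup
// interface; only the two methods used below are modelled.
type RowGroup interface {
	Next() bool
	FloatValue(name string) (float64, bool)
}

// Aggregator mirrors the documented interface.
type Aggregator interface {
	Aggregate(rg RowGroup) float64
}

// AggregatorFunc adapts a plain function to the Aggregator interface.
type AggregatorFunc func(rg RowGroup) float64

// Aggregate calls the wrapped function.
func (fn AggregatorFunc) Aggregate(rg RowGroup) float64 { return fn(rg) }

// sliceGroup is a hypothetical in-memory RowGroup over one named column.
type sliceGroup struct {
	name string
	vals []float64
	pos  int // number of rows visited so far
}

// Next advances to the next row and reports whether one exists.
func (g *sliceGroup) Next() bool {
	if g.pos >= len(g.vals) {
		return false
	}
	g.pos++
	return true
}

// FloatValue returns the current row's value for the named column.
func (g *sliceGroup) FloatValue(name string) (float64, bool) {
	if name != g.name || g.pos == 0 {
		return 0, false
	}
	return g.vals[g.pos-1], true
}

// SumOfSquares is a hypothetical custom aggregator built from a closure.
func SumOfSquares(name string) Aggregator {
	return AggregatorFunc(func(rg RowGroup) float64 {
		total := 0.0
		for rg.Next() {
			if v, ok := rg.FloatValue(name); ok {
				total += v * v
			}
		}
		return total
	})
}

func main() {
	g := &sliceGroup{name: "x", vals: []float64{1, 2, 3}}
	fmt.Println(SumOfSquares("x").Aggregate(g)) // 14
}
```

The same shape applies to CalculatorFunc, GrouperFunc and MatcherFunc, each of which wraps a function with the matching signature.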

                  type Calculator

                  type Calculator interface {
                  	Calculate(row RowRef) float64
                  }

                    A Calculator performs a calculation on a single row of numeric data.

                    func Constant

                    func Constant(v float64) Calculator

                      Constant returns a Calculator that always returns the constant value v

                      func Zero

                      func Zero() Calculator

                        Zero returns a Calculator that always returns zero

                        type CalculatorFunc

                        type CalculatorFunc func(row RowRef) float64

                          CalculatorFunc adapts a function to a Calculator interface

                          func (CalculatorFunc) Calculate

                          func (fn CalculatorFunc) Calculate(row RowRef) float64

                          type DataTable

                          type DataTable struct {
                          	// contains filtered or unexported fields
                          }

                            DataTable is a column-centric table of data. Columns can be either numeric (float64) or text (string). A DataTable is not safe for concurrent use.

                            func (*DataTable) AddColumn

                            func (dt *DataTable) AddColumn(name string, values []float64) error

                              AddColumn adds a column of float64 data. The length of the column must equal the length of any other columns already present in the table.

                              func (*DataTable) AddStringColumn

                              func (dt *DataTable) AddStringColumn(name string, values []string) error

                                AddStringColumn adds a column of string data. The length of the column must equal the length of any other columns already present in the table.

                                func (*DataTable) Aggregate

                                func (dt *DataTable) Aggregate(colName string, a Aggregator)

                                  Aggregate appends a new numeric column to the table whose values will be populated by executing the aggregator a against each group of rows that share the same key column values. Each row in a group will be assigned the same value. Rows are evaluated in the table's current sort order as specified by its keys.
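Because setting keys sorts the table, rows that share key values sit next to each other, so grouping can be done in one pass. The following is a minimal sketch of that idea on bare slices, not the package's implementation; aggregateByKey and sum are hypothetical names, and a single text key column is assumed.

```go
package main

import "fmt"

// aggregateByKey returns a column in which every row holds the aggregate
// of its group. keys must already be sorted so equal keys are adjacent.
func aggregateByKey(keys []string, vals []float64, agg func([]float64) float64) []float64 {
	out := make([]float64, len(keys))
	for start := 0; start < len(keys); {
		// Find the end of the run of rows sharing keys[start].
		end := start + 1
		for end < len(keys) && keys[end] == keys[start] {
			end++
		}
		// Compute once per group, then assign to each row in the group.
		v := agg(vals[start:end])
		for i := start; i < end; i++ {
			out[i] = v
		}
		start = end
	}
	return out
}

// sum adds up a slice of values.
func sum(xs []float64) float64 {
	total := 0.0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	keys := []string{"a", "a", "b", "c", "c"}
	vals := []float64{1, 2, 10, 3, 4}
	fmt.Println(aggregateByKey(keys, vals, sum)) // [3 3 10 7 7]
}
```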

                                  func (*DataTable) AggregateIndex

                                  func (dt *DataTable) AggregateIndex(colName string, a Aggregator, indices []int)

                                    AggregateIndex appends a new numeric column to the table whose values will be populated by executing the aggregator a against each group of rows that share the same key column values and are present in indices. Each row in a group will be assigned the same value. Rows are evaluated in the order they appear in indices. Rows not present in indices will be assigned a NaN value in the new column.

                                    func (*DataTable) AggregateIndexFill

                                    func (dt *DataTable) AggregateIndexFill(col []float64, a Aggregator, indices []int)

                                      AggregateIndexFill populates col with values found by executing the aggregator a against each group of rows that share the same key column values and are present in indices. col must be the same length as the data table.

                                      func (*DataTable) AggregateWhere

                                      func (dt *DataTable) AggregateWhere(colName string, a Aggregator, m Matcher)

                                        AggregateWhere appends a new numeric column to the table whose values will be populated by executing the aggregator a against each group of rows that share the same key column values and match m. Each row in a group will be assigned the same value. Rows are evaluated in the table's current sort order as specified by its keys. Rows not matched by m will be assigned a NaN value in the new column.

                                        func (*DataTable) Append

                                        func (dt *DataTable) Append(dt2 *DataTable) error

                                          Append appends the rows of dt2 to the data table. An error is returned if the tables share a column name with differing types (numeric vs text). Columns present in dt but not in dt2 will be expanded to the correct length with either NaN or the empty string. Columns present in dt2 but not in dt will be pre-filled with NaN or empty strings before dt2's data is appended. The data table remains sorted according to its keys after the append.

                                          func (*DataTable) AppendRow

                                          func (dt *DataTable) AppendRow(row []interface{}) error

                                            AppendRow appends the data in row to the data table.

                                            func (*DataTable) Apply

                                            func (dt *DataTable) Apply(g Grouper)

                                              Apply executes the grouper function g against each group of rows that share the same key column values. Rows are evaluated in the table's current sort order as specified by its keys.

                                              func (*DataTable) ApplyIndex

                                              func (dt *DataTable) ApplyIndex(g Grouper, indices []int)

                                                ApplyIndex executes the grouper function g against each group of rows that share the same key column values and are present in indices. Rows are evaluated in the order they appear in indices.

                                                func (*DataTable) ApplyWhere

                                                func (dt *DataTable) ApplyWhere(g Grouper, m Matcher)

                                                  ApplyWhere executes the grouper function g against each group of rows that share the same key column values and match m. Rows are evaluated in the table's current sort order as specified by its keys.

                                                  func (*DataTable) CSV

                                                  func (dt *DataTable) CSV(w io.Writer) error

                                                    CSV writes the datatable as CSV

                                                    func (*DataTable) Calc

                                                    func (dt *DataTable) Calc(colName string, c Calculator)

                                                      Calc appends a new numeric column to the table whose values will be populated by executing the calculator c against each row of data. Rows are evaluated in the table's current sort order as specified by its keys.

                                                      func (*DataTable) CalcIndex

                                                      func (dt *DataTable) CalcIndex(colName string, c Calculator, indices []int)

                                                        CalcIndex appends a new numeric column to the table whose values will be populated by executing the calculator c against each row of data whose index is contained in indices. Rows are evaluated in the order they appear in indices. Rows not present in indices will be assigned a NaN value in the new column.

                                                        func (*DataTable) CalcIndexFill

                                                        func (dt *DataTable) CalcIndexFill(col []float64, c Calculator, indices []int)

                                                        func (*DataTable) CalcWhere

                                                        func (dt *DataTable) CalcWhere(colName string, c Calculator, m Matcher)

                                                          CalcWhere appends a new numeric column to the table whose values will be populated by executing the calculator c against each row of data that matches m. Rows are evaluated in the table's current sort order as specified by its keys. Rows not matched by m will be assigned a NaN value in the new column.

                                                          func (*DataTable) Clone

                                                          func (dt *DataTable) Clone() *DataTable

                                                            Clone returns a new data table containing copies of the columns contained in dt. The returned data table will have no keys set.

                                                            func (*DataTable) CloneEmpty

                                                            func (dt *DataTable) CloneEmpty() *DataTable

                                                              CloneEmpty creates an identical but empty data table with no keys set.

                                                              func (*DataTable) CountWhere

                                                              func (dt *DataTable) CountWhere(m Matcher) int

                                                                CountWhere counts the number of rows that match m. Rows are evaluated in the table's current sort order as specified by its keys.

                                                                func (*DataTable) Equal

                                                                func (dt *DataTable) Equal(i, j int) bool

                                                                  Equal compares two rows and returns whether they contain the same values. If the table has keys specified then only those columns will be used in the comparison, in the order specified by the keys. Otherwise all columns are compared in the order they were added to the table.

                                                                  func (*DataTable) KeyNames

                                                                  func (dt *DataTable) KeyNames() []string

                                                                  func (*DataTable) Len

                                                                  func (dt *DataTable) Len() int

                                                                    Len returns the number of rows in the data table

                                                                    func (*DataTable) Less

                                                                    func (dt *DataTable) Less(i, j int) bool

                                                                      Less compares two rows and returns whether the row with index i should sort before the row at index j. If the table has keys specified then only those columns will be used in the comparison, in the order specified by the keys. Otherwise all columns are compared in the order they were added to the table.

                                                                      func (*DataTable) Matches

                                                                      func (dt *DataTable) Matches(m Matcher) []int

                                                                      func (*DataTable) N

                                                                      func (dt *DataTable) N() int

                                                                        N returns the number of columns in the data table

                                                                        func (*DataTable) Names

                                                                        func (dt *DataTable) Names() []string

                                                                          Names returns a slice of the column names in the data table in the order the columns were added to the table.

                                                                          func (*DataTable) ParseRow

                                                                          func (dt *DataTable) ParseRow(values ...string) error

                                                                            ParseRow attempts to append a row of data by parsing values as either float64 or string depending on the existing type of the relevant column. Values are processed in the order that columns were added to the table.
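As a rough illustration of type-directed parsing, the sketch below parses each value according to the type of its column. Everything here is hypothetical: the column struct, parseRow and the choice to validate the whole row before appending anything are assumptions, and only the error message is taken from the documented ErrWrongNumberOfColumns variable.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// errWrongNumberOfColumns mirrors the message of the documented variable.
var errWrongNumberOfColumns = errors.New("wrong number of columns in data")

// column is a hypothetical column that is either numeric or text.
type column struct {
	numeric bool
	floats  []float64
	strs    []string
}

// parseRow appends one row, parsing each value by its column's type.
// Values are taken in column order.
func parseRow(cols []*column, values ...string) error {
	if len(values) != len(cols) {
		return errWrongNumberOfColumns
	}
	// Parse everything first so a bad value leaves the columns untouched.
	parsed := make([]float64, len(cols))
	for i, c := range cols {
		if !c.numeric {
			continue
		}
		f, err := strconv.ParseFloat(values[i], 64)
		if err != nil {
			return fmt.Errorf("column %d: %w", i, err)
		}
		parsed[i] = f
	}
	for i, c := range cols {
		if c.numeric {
			c.floats = append(c.floats, parsed[i])
		} else {
			c.strs = append(c.strs, values[i])
		}
	}
	return nil
}

func main() {
	cols := []*column{{numeric: false}, {numeric: true}}
	fmt.Println(parseRow(cols, "Oslo", "5.5")) // <nil>
	fmt.Println(parseRow(cols, "Bern"))        // wrong number of columns in data
	fmt.Println(cols[0].strs, cols[1].floats)  // [Oslo] [5.5]
}
```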

                                                                            func (*DataTable) RawRows

                                                                            func (dt *DataTable) RawRows(headers bool) [][]interface{}

                                                                              RawRows returns all the rows in the datatable. If headers is true then the first row returned will contain the column names. Values in each row are in the order the column was added to the table.

                                                                              func (*DataTable) Reduce

                                                                              func (dt *DataTable) Reduce(a Aggregator) float64

                                                                                Reduce returns the value obtained by executing the aggregator a against each row in the datatable.

                                                                                func (*DataTable) RemoveColumn

                                                                                func (dt *DataTable) RemoveColumn(name string) error

                                                                                  RemoveColumn removes a column of any type from the data table.

                                                                                  func (*DataTable) RemoveRows

                                                                                  func (dt *DataTable) RemoveRows(m Matcher)

                                                                                    RemoveRows removes any rows that match m without altering their order.

                                                                                    func (*DataTable) Row

                                                                                    func (dt *DataTable) Row(n int) ([]interface{}, bool)

                                                                                      Row returns a single row of data as a slice, or an empty slice and false if the row number exceeds the bounds of the table. The returned slice contains one value per column in the order the columns were added to the table.

                                                                                      func (*DataTable) RowMap

                                                                                      func (dt *DataTable) RowMap(n int) (RowMap, bool)

                                                                                        RowMap returns a single row of data as a map, or an empty map and false if the row number exceeds the bounds of the table. The keys in the returned map correspond to the names of the columns.

                                                                                        func (*DataTable) RowRef

                                                                                        func (dt *DataTable) RowRef(n int) (RowRef, bool)

                                                                                        func (*DataTable) Rows

                                                                                        func (dt *DataTable) Rows() RowGroup

                                                                                        func (*DataTable) RowsWhere

                                                                                        func (dt *DataTable) RowsWhere(m Matcher) RowGroup

                                                                                        func (*DataTable) Select

                                                                                        func (dt *DataTable) Select(names []string) (*DataTable, error)

                                                                                          Select returns a new data table containing copies of the columns specified in names. The returned data table will have no keys set.

                                                                                          func (*DataTable) SelectIndex

                                                                                          func (dt *DataTable) SelectIndex(names []string, indices []int) (*DataTable, error)

                                                                                            SelectIndex returns a new data table containing copies of the columns specified in names where the rows are in indices. The returned data table will have no keys set.

                                                                                            func (*DataTable) SelectWhere

                                                                                            func (dt *DataTable) SelectWhere(names []string, m Matcher) (*DataTable, error)

                                                                                              SelectWhere returns a new data table containing copies of the columns specified in names where the rows match m. The returned data table will have no keys set.

                                                                                              func (*DataTable) SetFloatValue

                                                                                              func (dt *DataTable) SetFloatValue(name string, row int, v float64) error

                                                                                              func (*DataTable) SetKeys

                                                                                              func (dt *DataTable) SetKeys(keys ...string) error

                                                                                                SetKeys assigns a set of column names to be used as keys when sorting or aggregating. Setting keys sorts the table immediately by the specified keys.

                                                                                                func (*DataTable) Swap

                                                                                                func (dt *DataTable) Swap(i, j int)

                                                                                                  Swap exchanges the data in one row of the table for the data in another row.

                                                                                                  func (*DataTable) Unique

                                                                                                  func (dt *DataTable) Unique() *DataTable

                                                                                                    Unique returns a new data table containing only the unique rows from dt. The returned data table will contain the same number of columns in the same order as dt and will have no keys set.

                                                                                                    type Grouper

                                                                                                    type Grouper interface {
                                                                                                    	Group(rg RowGroup)
                                                                                                    }

                                                                                                      A Grouper performs an action given a group of rows.

                                                                                                      type GrouperFunc

                                                                                                      type GrouperFunc func(rg RowGroup)

                                                                                                        GrouperFunc adapts a function to a Grouper interface

                                                                                                        func (GrouperFunc) Group

                                                                                                        func (fn GrouperFunc) Group(rg RowGroup)

                                                                                                        type Matcher

                                                                                                        type Matcher interface {
                                                                                                        	Match(row RowRef) bool
                                                                                                        }

                                                                                                          A Matcher tests a single row of data to determine whether it matches a particular set of criteria.

                                                                                                          func CloselyEqual

                                                                                                          func CloselyEqual(name string, v float64, e float64) Matcher

                                                                                                            CloselyEqual returns a Matcher that tests whether the named column is equal to v within the range +/- e

                                                                                                            func GreaterThan

                                                                                                            func GreaterThan(name string, v float64) Matcher

                                                                                                              GreaterThan returns a Matcher that tests whether the named column is greater than v or not

                                                                                                              func IsEqualString

                                                                                                              func IsEqualString(col string, val string) Matcher

                                                                                                                IsEqualString returns a Matcher that tests whether the named column is equal to the given string

                                                                                                                func IsInf

                                                                                                                func IsInf(name string) Matcher

                                                                                                                  IsInf returns a Matcher that tests whether the named column is infinite (either positive or negative infinity will return true).

                                                                                                                  func IsNan

                                                                                                                  func IsNan(name string) Matcher

                                                                                                                    IsNan returns a Matcher that tests whether the named column is NaN or not

                                                                                                                    func IsZero

                                                                                                                    func IsZero(name string) Matcher

                                                                                                                      IsZero returns a Matcher that tests whether the named column is zero or not

                                                                                                                      func LessThan

                                                                                                                      func LessThan(name string, v float64) Matcher

                                                                                                                        LessThan returns a Matcher that tests whether the named column is less than v or not

                                                                                                                        func MultiColumnMatcher

                                                                                                                        func MultiColumnMatcher(m map[string]string) Matcher

                                                                                                                          MultiColumnMatcher returns a Matcher that tests whether a row matches the names and values in the map m.

                                                                                                                          func Not

                                                                                                                          func Not(m Matcher) Matcher

                                                                                                                            Not returns a Matcher that inverts the value of the supplied matcher

                                                                                                                            func NumericColumnMatcher

                                                                                                                            func NumericColumnMatcher(name string, fn func(float64) bool) Matcher

                                                                                                                              NumericColumnMatcher returns a Matcher that tests the value of a single column in a row of data against the numeric function fn.

                                                                                                                              func StringColumnMatcher

                                                                                                                              func StringColumnMatcher(name string, fn func(string) bool) Matcher

                                                                                                                                StringColumnMatcher returns a Matcher that tests the value of a single column in a row of data against the string function fn.

                                                                                                                                type MatcherFunc

                                                                                                                                type MatcherFunc func(row RowRef) bool

                                                                                                                                  MatcherFunc adapts a function to a Matcher interface

                                                                                                                                  func (MatcherFunc) Match

                                                                                                                                  func (fn MatcherFunc) Match(row RowRef) bool
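Matchers compose naturally because each one is a predicate on a row. The following sketch reimplements two of the documented matchers on a local stand-in row type to show the idea; it is not the package's code. RowRef as a map, the lower-case closelyEqual and not helpers, and the inclusive tolerance boundary are all assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// RowRef is a simplified local stand-in: one row as a map of numeric values.
type RowRef map[string]float64

// Matcher mirrors the documented interface.
type Matcher interface {
	Match(row RowRef) bool
}

// MatcherFunc adapts a plain function to the Matcher interface.
type MatcherFunc func(row RowRef) bool

// Match calls the wrapped function.
func (fn MatcherFunc) Match(row RowRef) bool { return fn(row) }

// closelyEqual matches rows whose named column lies within v +/- e.
// Treating the boundary as inclusive is an assumption.
func closelyEqual(name string, v, e float64) Matcher {
	return MatcherFunc(func(row RowRef) bool {
		x, ok := row[name]
		return ok && math.Abs(x-v) <= e
	})
}

// not inverts the result of another matcher.
func not(m Matcher) Matcher {
	return MatcherFunc(func(row RowRef) bool { return !m.Match(row) })
}

func main() {
	row := RowRef{"price": 10.004}
	near := closelyEqual("price", 10, 0.01)
	fmt.Println(near.Match(row), not(near).Match(row)) // true false
}
```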

                                                                                                                                  type MatchingRowGroup

                                                                                                                                  type MatchingRowGroup struct {
                                                                                                                                  	// contains filtered or unexported fields
                                                                                                                                  }

                                                                                                                                  func (*MatchingRowGroup) FloatValue

                                                                                                                                  func (m *MatchingRowGroup) FloatValue(name string) (float64, bool)

                                                                                                                                  func (*MatchingRowGroup) Next

                                                                                                                                  func (m *MatchingRowGroup) Next() bool

                                                                                                                                  func (*MatchingRowGroup) Reset

                                                                                                                                  func (m *MatchingRowGroup) Reset()

                                                                                                                                  func (*MatchingRowGroup) RowIndex

                                                                                                                                  func (m *MatchingRowGroup) RowIndex() int

                                                                                                                                  func (*MatchingRowGroup) StringValue

                                                                                                                                  func (m *MatchingRowGroup) StringValue(name string) (string, bool)

                                                                                                                                  func (*MatchingRowGroup) Value

                                                                                                                                  func (m *MatchingRowGroup) Value(name string) (interface{}, bool)

                                                                                                                                  type RowGroup

                                                                                                                                  type RowGroup interface {
                                                                                                                                  	Valuer
                                                                                                                                  	Reset()
                                                                                                                                  	RowIndex() int
                                                                                                                                  	Next() bool
                                                                                                                                  }
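A typical way to consume a RowGroup is to call Reset, advance with Next, and read columns through the embedded Valuer. The sketch below restates the two interfaces as documented and pairs them with `sliceGroup`, a hypothetical in-memory implementation written only for this example. It assumes the boolean result of FloatValue reports whether the column exists and holds a float64; the package does not document that here. `sumColumn` is likewise an illustrative helper, not a package function.

```go
package main

import "fmt"

// Valuer and RowGroup restate the documented interfaces.
type Valuer interface {
	Value(name string) (interface{}, bool)
	FloatValue(name string) (float64, bool)
	StringValue(name string) (string, bool)
}

type RowGroup interface {
	Valuer
	Reset()
	RowIndex() int
	Next() bool
}

// sliceGroup is a hypothetical RowGroup backed by a slice of maps.
type sliceGroup struct {
	rows []map[string]interface{}
	pos  int // index of the current row; -1 before the first Next
}

func newSliceGroup(rows []map[string]interface{}) *sliceGroup {
	return &sliceGroup{rows: rows, pos: -1}
}

func (g *sliceGroup) Reset()        { g.pos = -1 }
func (g *sliceGroup) RowIndex() int { return g.pos }

func (g *sliceGroup) Next() bool {
	if g.pos+1 >= len(g.rows) {
		return false
	}
	g.pos++
	return true
}

func (g *sliceGroup) Value(name string) (interface{}, bool) {
	if g.pos < 0 || g.pos >= len(g.rows) {
		return nil, false
	}
	v, ok := g.rows[g.pos][name]
	return v, ok
}

func (g *sliceGroup) FloatValue(name string) (float64, bool) {
	v, _ := g.Value(name)
	f, ok := v.(float64)
	return f, ok
}

func (g *sliceGroup) StringValue(name string) (string, bool) {
	v, _ := g.Value(name)
	s, ok := v.(string)
	return s, ok
}

// sumColumn walks any RowGroup from the start and totals one column,
// skipping rows where the column is missing or not a float64.
func sumColumn(rg RowGroup, name string) float64 {
	rg.Reset()
	total := 0.0
	for rg.Next() {
		if f, ok := rg.FloatValue(name); ok {
			total += f
		}
	}
	return total
}

func main() {
	g := newSliceGroup([]map[string]interface{}{
		{"spend": 1.5},
		{"spend": 2.5},
		{"spend": "n/a"}, // skipped: not a float64
	})
	fmt.Println(sumColumn(g, "spend"))
}
```

Calling Reset first makes the helper safe to use on a group that has already been partly or fully iterated.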

                                                                                                                                  type RowMap

                                                                                                                                  type RowMap map[string]interface{}

                                                                                                                                  func (RowMap) FloatValue

                                                                                                                                  func (r RowMap) FloatValue(name string) (float64, bool)

                                                                                                                                  func (RowMap) StringValue

                                                                                                                                  func (r RowMap) StringValue(name string) (string, bool)

                                                                                                                                  func (RowMap) Value

                                                                                                                                  func (r RowMap) Value(name string) (interface{}, bool)
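Because RowMap is a plain map type, its three accessors could plausibly be written with map lookups and type assertions. The bodies below are an assumption made for illustration, not the package's source. In particular, this sketch performs no numeric conversion, whereas the real FloatValue may accept other numeric types.

```go
package main

import "fmt"

// RowMap mirrors the documented type; the method bodies are assumed.
type RowMap map[string]interface{}

// Value reports the raw value and whether the column is present.
func (r RowMap) Value(name string) (interface{}, bool) {
	v, ok := r[name]
	return v, ok
}

// FloatValue succeeds only when the column holds a float64.
// A missing key yields a nil interface, so the assertion fails cleanly.
func (r RowMap) FloatValue(name string) (float64, bool) {
	f, ok := r[name].(float64)
	return f, ok
}

// StringValue succeeds only when the column holds a string.
func (r RowMap) StringValue(name string) (string, bool) {
	s, ok := r[name].(string)
	return s, ok
}

func main() {
	row := RowMap{"campaign": "spring", "spend": 12.5}

	if spend, ok := row.FloatValue("spend"); ok {
		fmt.Println("spend:", spend)
	}
	if _, ok := row.FloatValue("campaign"); !ok {
		fmt.Println("campaign is not numeric")
	}
}
```

Under these assumptions a RowMap satisfies Valuer, which makes it a convenient way to build a single row by hand, for example in tests.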

                                                                                                                                  type RowRef

                                                                                                                                  type RowRef struct {
                                                                                                                                  	// contains filtered or unexported fields
                                                                                                                                  }

                                                                                                                                  func (*RowRef) FloatValue

                                                                                                                                  func (r *RowRef) FloatValue(name string) (float64, bool)

                                                                                                                                  func (*RowRef) StringValue

                                                                                                                                  func (r *RowRef) StringValue(name string) (string, bool)

                                                                                                                                  func (*RowRef) Value

                                                                                                                                  func (r *RowRef) Value(name string) (interface{}, bool)

                                                                                                                                  type StaticRowGroup

                                                                                                                                  type StaticRowGroup struct {
                                                                                                                                  	// contains filtered or unexported fields
                                                                                                                                  }

                                                                                                                                  func (*StaticRowGroup) FloatValue

                                                                                                                                  func (r *StaticRowGroup) FloatValue(name string) (float64, bool)

                                                                                                                                  func (*StaticRowGroup) Next

                                                                                                                                  func (r *StaticRowGroup) Next() bool

                                                                                                                                  func (*StaticRowGroup) Reset

                                                                                                                                  func (r *StaticRowGroup) Reset()

                                                                                                                                  func (*StaticRowGroup) RowIndex

                                                                                                                                  func (r *StaticRowGroup) RowIndex() int

                                                                                                                                    RowIndex returns the datatable index of the current row in the row group. Calling it before the first call to Next is an error and causes a panic.

                                                                                                                                    func (*StaticRowGroup) StringValue

                                                                                                                                    func (r *StaticRowGroup) StringValue(name string) (string, bool)

                                                                                                                                    func (*StaticRowGroup) Value

                                                                                                                                    func (r *StaticRowGroup) Value(name string) (interface{}, bool)

                                                                                                                                    func (*StaticRowGroup) Where

                                                                                                                                    func (r *StaticRowGroup) Where(m Matcher) *StaticRowGroup

                                                                                                                                      Where applies a matcher to the rows in this row group, returning a new row group that contains only the rows that matched. It does not affect the current position of r's iteration.

                                                                                                                                      type Valuer

                                                                                                                                      type Valuer interface {
                                                                                                                                      	Value(name string) (interface{}, bool)
                                                                                                                                      	FloatValue(name string) (float64, bool)
                                                                                                                                      	StringValue(name string) (string, bool)
                                                                                                                                      }

                                                                                                                                        A Valuer retrieves the value of a named column in a particular context.

                                                                                                                                        Source Files