Documentation ¶
Index ¶
- Constants
- func EqualFrequencyBinning(fileRows [][]string, binOpts map[string]BinOpt) ([][]string, error)
- func EqualFrequencyBinningForOne(fileRows [][]string, feature string, binNum int) ([][]string, error)
- func EqualWidthBinning(fileRows [][]string, binOpts map[string]BinOpt) ([][]string, error)
- func EqualWidthBinningForOne(fileRows [][]string, feature string, binNum int) ([][]string, error)
- func FeatureSelect(fileRows [][]string, features []string) [][]string
- func FillByAverage(fileRows [][]string, feature string) ([][]string, error)
- func FillByMedian(fileRows [][]string, feature string) ([][]string, error)
- func FillBySpecified(fileRows [][]string, feature, value string) ([][]string, error)
- func FillMissingValues(fileRows [][]string, fillOpts map[string]FillOpt) ([][]string, error)
- func FilterByThreshold(fileRows [][]string, filOpt map[string]FilterOpt) ([][]string, error)
- func FilterByThresholdForOne(fileRows [][]string, feature string, opt FilterOpt) ([][]string, error)
- func PrintFileRows(fileRows [][]string)
- func ReplaceIntervalsBySpecifiedValues(fileRows [][]string, repOpt map[string]ReplaceOpt) ([][]string, error)
- func ReplaceIntervalsForOne(fileRows [][]string, feature string, opt ReplaceOpt) ([][]string, error)
- type BinOpt
- type FillOpt
- type FilterOpt
- type IntervalValue
- type ReplaceOpt
- type StatisticsInfo
Constants ¶
const ( FillMissingValuesByAverage = "average" FillMissingValuesByMedian = "median" FillMissingValuesBySpecifiedValue = "specified" )
const ( FilterByGreater = "g" FilterByGreaterOrEqual = "ge" FilterByEqual = "e" FilterByNotEqual = "ne" FilterByLess = "l" FilterByLessOrEqual = "le" )
Variables ¶
This section is empty.
Functions ¶
func EqualFrequencyBinning ¶
EqualFrequencyBinning equal frequency binning for selected feature list fileRows is sample content, the first row contains just names of features binOpts are binning options, map from feature name to BinOpt
func EqualFrequencyBinningForOne ¶
func EqualFrequencyBinningForOne(fileRows [][]string, feature string, binNum int) ([][]string, error)
EqualFrequencyBinningForOne binning for one selected feature fileRows is sample content, the first row contains just names of features feature is selected feature to be processed binNum is target number of bins return new samples, value of selected feature is a discrete number among [0,1,...,bins-1]
func EqualWidthBinning ¶
EqualWidthBinning equal width binning for selected feature list fileRows is sample content, the first row contains just names of features binOpts are binning options, map from feature name to BinOpt
func EqualWidthBinningForOne ¶
EqualWidthBinningForOne binning for one selected feature fileRows is sample content, the first row contains just names of features feature is selected feature to be processed binNum is target number of bins return new samples, value of selected feature is a discrete number among [0,1,...,bins-1]
func FeatureSelect ¶
FeatureSelect filter samples with selected features fileRows is sample content, the first row contains just names of features features is target feature list to be selected
func FillByAverage ¶
FillByAverage fills missing values of target features by average fileRows is sample content, the first row contains just names of features
func FillByMedian ¶
FillByMedian fills missing values of target features by median fileRows is sample content, the first row contains just names of features
func FillBySpecified ¶
FillBySpecified fills missing values of target features by specified value fileRows is sample content, the first row contains just names of features
func FillMissingValues ¶
FillMissingValues fills missing values of target features by average, median or specified value support filling several features at one time, return processed samples fileRows is sample content, the first row contains just names of features fillOpts are filling options for selected features
func FilterByThreshold ¶
FilterByThreshold filter samples by given threshold and method support several features filter at one time, return processed samples fileRows is sample content, the first row contains just names of features filOpt are filter options for selected features
func FilterByThresholdForOne ¶
func FilterByThresholdForOne(fileRows [][]string, feature string, opt FilterOpt) ([][]string, error)
FilterByThresholdForOne filter by threshold for one feature
func PrintFileRows ¶
func PrintFileRows(fileRows [][]string)
PrintFileRows print sample content in human-readable format
func ReplaceIntervalsBySpecifiedValues ¶
func ReplaceIntervalsBySpecifiedValues(fileRows [][]string, repOpt map[string]ReplaceOpt) ([][]string, error)
ReplaceIntervalsBySpecifiedValues replace interval values by specified values support several features replacement at one time, return processed samples fileRows is sample content, the first row contains just names of features repOpt are replacement options for selected features
func ReplaceIntervalsForOne ¶
func ReplaceIntervalsForOne(fileRows [][]string, feature string, opt ReplaceOpt) ([][]string, error)
ReplaceIntervalsForOne replace intervals by specified values for one feature
Types ¶
type FilterOpt ¶
type FilterOpt struct { Threshold float64 FilterMethod string // "g", "ge", "e", "ne", "l" or "le" }
filter condition
type IntervalValue ¶
replace sample value in [left, right) with specified value
type ReplaceOpt ¶
type ReplaceOpt struct {
Intervals []IntervalValue
}
specify several intervals to replace
type StatisticsInfo ¶
type StatisticsInfo struct { Maximum float64 // largest value Minimum float64 // smallest value Mean float64 // average value Mode []string // values appear most frequently, a set may have multiple modes, or no mode(all values appear with same frequency) Median float64 // median value StandDev float64 // standard deviation }
func FeatureStatistics ¶
func FeatureStatistics(fileRows [][]string, feature string, valueIsStr bool) (StatisticsInfo, error)
FeatureStatistics computes stats info for given feature fileRows is sample content, the first row contains just names of features valueIsStr is true if sample value type of the feature is 'string'