sqlset

package
v0.0.0-...-dbea759
Published: Jan 23, 2018 License: MIT Imports: 5 Imported by: 0

Documentation

Overview

Package sqlset provides implementations of set.Set that use SQL databases as backends.

The set uses 2 database tables:

  • One for storing discrete values
  • One for the samples

Samples are stored in the samples table, with their discrete values stored as references to rows in the discrete value table.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Adapter

type Adapter interface {
	ColumnName(string) (string, error)

	CreateDiscreteValuesTable(ctx context.Context) error
	CreateSampleTable(ctx context.Context, discreteFeatureColumns, continuousFeatureColumns []string) error

	AddDiscreteValues(context.Context, []string) (int, error)
	ListDiscreteValues(ctx context.Context) (map[int]string, error)

	AddSamples(ctx context.Context, rawSamples []map[string]interface{}, discreteFeatureColumns, continuousFeatureColumns []string) (int, error)
	ListSamples(ctx context.Context, criteria []*FeatureCriterion, discreteFeatureColumns, continuousFeatureColumns []string) ([]map[string]interface{}, error)
	IterateOnSamples(ctx context.Context, criteria []*FeatureCriterion, discreteFeatureColumns, continuousFeatureColumns []string, lambda func(int, map[string]interface{}) (bool, error)) error
	CountSamples(context.Context, []*FeatureCriterion) (int, error)

	ListSampleDiscreteFeatureValues(context.Context, string, []*FeatureCriterion) ([]int, error)
	ListSampleContinuousFeatureValues(context.Context, string, []*FeatureCriterion) ([]float64, error)
	CountSampleDiscreteFeatureValues(context.Context, string, []*FeatureCriterion) (map[int]int, error)
	CountSampleContinuousFeatureValues(context.Context, string, []*FeatureCriterion) (map[float64]int, error)
}

Adapter is an interface providing the methods needed to implement a Set with a database backend.

ColumnName takes a feature name and returns the column name for that feature, or an error.

CreateDiscreteValuesTable should create a table containing the different values discrete features can take in the samples of the working sets.

CreateSampleTable should create a table for the samples, using foreign keys to the discrete value table for discrete features and a suitable float64 representation for continuous ones. It should also generate an id column.

AddDiscreteValues should add to the discrete value table the given discrete values, and return an error if any cannot be added.

ListDiscreteValues should return a map of integer to string that relates numeric ids of the discrete values to their string values, or an error.

AddSamples should add a sample to the samples table for each rawSample received. A rawSample here is a map of column name to an interface{} holding either the numeric id of a discrete feature value or a float64 for a continuous feature value. Only the given discrete and continuous feature columns should be considered when adding samples, and NULL should be used for any of those columns that has no value in the rawSample. The number of samples added or an error must be returned.
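The NULL rule above can be sketched as follows. This is a minimal illustration, not the code of either bundled adapter: rowArgs is a hypothetical helper that turns one rawSample into the positional arguments for an INSERT, relying on the fact that database/sql sends a nil argument as NULL. Keys of the rawSample that are not among the given columns are ignored.

```go
package main

import "fmt"

// rowArgs is a hypothetical helper: it returns one argument per requested
// column, discrete columns first, then continuous ones. A column missing
// from the raw sample yields nil, which database/sql stores as NULL.
func rowArgs(raw map[string]interface{}, discreteCols, continuousCols []string) []interface{} {
	args := make([]interface{}, 0, len(discreteCols)+len(continuousCols))
	for _, col := range discreteCols {
		args = append(args, raw[col]) // missing key -> nil
	}
	for _, col := range continuousCols {
		args = append(args, raw[col])
	}
	return args
}

func main() {
	raw := map[string]interface{}{
		"species":      2,   // numeric id of a discrete value
		"petal_length": 4.7, // continuous value
		"ignored":      9,   // not a requested column, so it is skipped
	}
	args := rowArgs(raw, []string{"species", "color"}, []string{"petal_length"})
	fmt.Println(args)
}
```

The column names used here (species, color, petal_length) are made up for the example.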

ListSamples should provide a slice of rawSamples as described above satisfying the given feature criteria and specifying the values for the given discrete and continuous feature columns, or an error.

IterateOnSamples is similar to ListSamples, but takes an additional lambda to iterate on the samples rather than returning them all. This method should call the lambda for every sample satisfying the criteria. The lambda takes an index for the sample (0, 1, 2, ...) and a raw sample and returns a boolean and an error; iteration continues only while the boolean is true and the error is nil. This method should return an error if the samples cannot be traversed, or any error the lambda returns.
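The lambda contract can be illustrated with an in-memory stand-in. The iterate function below is hypothetical and works over a slice instead of a database cursor. It also assumes that a lambda returning false with a nil error ends the iteration without an error, which the description above does not state explicitly.

```go
package main

import (
	"errors"
	"fmt"
)

// iterate is a hypothetical in-memory counterpart of IterateOnSamples.
// It calls lambda with a running index and each raw sample, and stops as
// soon as lambda returns false or a non-nil error.
func iterate(samples []map[string]interface{}, lambda func(int, map[string]interface{}) (bool, error)) error {
	for i, s := range samples {
		ok, err := lambda(i, s)
		if err != nil {
			return err // propagate the lambda's error
		}
		if !ok {
			return nil // assumed: early stop is not an error
		}
	}
	return nil
}

func main() {
	samples := []map[string]interface{}{
		{"species": 1, "petal_length": 1.4},
		{"species": 2, "petal_length": 4.7},
		{"species": 3, "petal_length": 6.0},
	}

	visited := 0
	err := iterate(samples, func(i int, s map[string]interface{}) (bool, error) {
		visited++
		return i < 1, nil // request a stop once the second sample is seen
	})
	fmt.Println(visited, err)

	err = iterate(samples, func(int, map[string]interface{}) (bool, error) {
		return false, errors.New("lambda failed")
	})
	fmt.Println(err)
}
```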

CountSamples should return the number of samples in the samples table that satisfy the given feature criteria or an error if they cannot be counted.

ListSampleDiscreteFeatureValues takes a discrete feature column name and a slice of feature criteria and should return a slice with the numeric IDs for the different values for the given feature column name on samples satisfying the given criteria, or an error.

ListSampleContinuousFeatureValues takes a continuous feature column name and a slice of feature criteria and should return a slice with the different values for the given feature column name on samples satisfying the given criteria, or an error.

CountSampleDiscreteFeatureValues takes a discrete feature column name and a slice of feature criteria and should return a map from the numeric ID of each discrete value of that column to the number of times it appears among the samples satisfying the given criteria, or an error.

CountSampleContinuousFeatureValues takes a continuous feature column name and a slice of feature criteria and should return a map from each continuous value of that column to the number of times it appears among the samples satisfying the given criteria, or an error.

type ColumnNameFunc

type ColumnNameFunc func(string) (string, error)

ColumnNameFunc is a function that takes the name of a feature and returns a column name for it, or an error if the name could not be transformed.
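As an illustration, a function with this signature might look like the sketch below. The name snakeColumnName and its rules (lowercase the name, replace anything that is not a letter or digit with an underscore, reject blank names) are assumptions made for the example. The bundled adapters may transform names differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
)

// ColumnNameFunc mirrors the type declared in sqlset.
type ColumnNameFunc func(string) (string, error)

// snakeColumnName is a hypothetical ColumnNameFunc. It lowercases the
// feature name and replaces every rune that is not a letter or a digit
// with '_'. Blank names cannot be transformed and produce an error.
func snakeColumnName(feature string) (string, error) {
	if strings.TrimSpace(feature) == "" {
		return "", errors.New("empty feature name")
	}
	var b strings.Builder
	for _, r := range strings.ToLower(feature) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(r)
		} else {
			b.WriteRune('_')
		}
	}
	return b.String(), nil
}

func main() {
	var cnf ColumnNameFunc = snakeColumnName
	col, err := cnf("Petal Length")
	fmt.Println(col, err)
}
```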

type FeatureCriterion

type FeatureCriterion struct {
	/*
		FeatureColumn is the column name for the feature.Feature
		the criterion is applying the restriction to.
	*/
	FeatureColumn string
	/*
		DiscreteFeature defines whether the feature criterion applies
		to a discrete feature
	*/
	DiscreteFeature bool
	/*
		Operator is a string representing the
		comparison against the value in the criterion
		that is applied to samples. It must be one of
		the following: "=", "<", ">", "<=" or ">=".
		The semantics are the result from reading
		the criterion as Feature Operator Value
	*/
	Operator string
	/*
		Value is the value against which a comparison
		is applied to samples. It should be either an
		integer for discrete features or a float64 for
		continuous features.
	*/
	Value interface{}
}

FeatureCriterion is used to represent a feature.Criterion on SQL DB-backed sets. It should be easily translatable to a condition in the WHERE clause of an SQL SELECT statement on a samples table.
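A translation of this kind could be sketched as follows. The whereClause helper is hypothetical and is not part of sqlset. It assumes PostgreSQL-style $n placeholders, joins all criteria with AND, and trusts column names to be the output of a ColumnNameFunc. Only the five operators listed in the struct documentation are accepted.

```go
package main

import (
	"fmt"
	"strings"
)

// FeatureCriterion mirrors the struct declared in sqlset.
type FeatureCriterion struct {
	FeatureColumn   string
	DiscreteFeature bool
	Operator        string
	Value           interface{}
}

// allowedOperators lists the operators the FeatureCriterion docs permit.
var allowedOperators = map[string]bool{
	"=": true, "<": true, ">": true, "<=": true, ">=": true,
}

// whereClause is a hypothetical helper. It returns a WHERE clause with one
// "column operator $n" condition per criterion, plus the values to bind.
// An empty criteria slice yields an empty clause (no restriction).
func whereClause(criteria []*FeatureCriterion) (string, []interface{}, error) {
	if len(criteria) == 0 {
		return "", nil, nil
	}
	conds := make([]string, 0, len(criteria))
	args := make([]interface{}, 0, len(criteria))
	for i, c := range criteria {
		if !allowedOperators[c.Operator] {
			return "", nil, fmt.Errorf("unsupported operator %q", c.Operator)
		}
		conds = append(conds, fmt.Sprintf("%s %s $%d", c.FeatureColumn, c.Operator, i+1))
		args = append(args, c.Value)
	}
	return "WHERE " + strings.Join(conds, " AND "), args, nil
}

func main() {
	clause, args, err := whereClause([]*FeatureCriterion{
		{FeatureColumn: "species", DiscreteFeature: true, Operator: "=", Value: 3},
		{FeatureColumn: "petal_length", Operator: ">=", Value: 1.5},
	})
	fmt.Println(clause, args, err)
}
```

Binding values through placeholders rather than formatting them into the SQL string keeps the operator whitelist as the only text that reaches the statement from a criterion.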

func NewFeatureCriteria

func NewFeatureCriteria(fc feature.Criterion, cnf ColumnNameFunc, dictionary map[string]int) ([]*FeatureCriterion, error)

NewFeatureCriteria takes a feature.Criterion, a ColumnNameFunc and a map of string to int containing a dictionary for converting discrete string values into their integer representations and returns a slice of FeatureCriterion equivalent to the given feature.Criterion or an error.

An error will be returned if the ColumnNameFunc cannot provide a name for the feature of the feature criterion, or if the given feature.Criterion is a feature.DiscreteCriterion and its value has no representation defined in the given dictionary.

For a feature.Criterion that is neither a feature.DiscreteCriterion nor a feature.ContinuousCriterion, it returns an empty slice and no error. In other words, it is interpreted as an undefined feature criterion, which imposes no conditions on samples.
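The dictionary argument maps discrete string values to integers, which is the inverse of the map[int]string that Adapter.ListDiscreteValues returns. One plausible way to obtain it, shown here as an assumption rather than as what Create or Open actually do, is to invert that map. The invert helper is hypothetical.

```go
package main

import "fmt"

// invert is a hypothetical helper that turns an id -> string map, such as
// the one ListDiscreteValues returns, into a string -> id dictionary.
// It fails if two ids share the same string, since the dictionary would
// then be ambiguous.
func invert(values map[int]string) (map[string]int, error) {
	dict := make(map[string]int, len(values))
	for id, s := range values {
		if prev, dup := dict[s]; dup {
			return nil, fmt.Errorf("value %q has ids %d and %d", s, prev, id)
		}
		dict[s] = id
	}
	return dict, nil
}

func main() {
	dict, err := invert(map[int]string{1: "setosa", 2: "versicolor"})
	fmt.Println(dict["versicolor"], err)
}
```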

type Sample

type Sample struct {
	/*
		Values is a map of string column names to interface{}.
		Specifically, the value must be
		* nil for an undefined value for any feature the column
		  is representing or
		* an int for the value of a discrete feature the column
		  is representing or
		* a float64 for the value of a continuous feature the
		  column is representing
	*/
	Values map[string]interface{}
	/*
		DiscreteFeatureValues is a map of int to strings
		that holds the relation of int representations on
		the Sample's Values map to their string
		representations
	*/
	DiscreteFeatureValues map[int]string
	/*
		FeatureNamesColumns is a map that translates the name
		of a feature to the column representing it on the database.
		This column is also the string value that acts as key for
		the feature value on the Sample's Values map.
	*/
	FeatureNamesColumns map[string]string
}

Sample is an implementation of botanic.Sample optimized to represent samples belonging to Set.

func (*Sample) ValueFor

func (s *Sample) ValueFor(f feature.Feature) (interface{}, error)

ValueFor takes a feature and returns the value for the feature according to the sample, or nil if it is undefined. For a continuous feature the value is the one available in the Values map under the name of the column corresponding to the feature's name, whereas for a discrete feature that value is used as a key in the DiscreteFeatureValues dictionary to obtain its string representation.
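The lookup described above can be sketched with a local copy of the struct. The valueFor function is hypothetical: it takes a feature name and a discrete flag in place of a feature.Feature, and its error cases (unknown feature, non-int id, id missing from the dictionary) are assumptions about behavior the description does not spell out.

```go
package main

import "fmt"

// Sample mirrors the struct declared in sqlset.
type Sample struct {
	Values                map[string]interface{}
	DiscreteFeatureValues map[int]string
	FeatureNamesColumns   map[string]string
}

// valueFor is a hypothetical stand-in for (*Sample).ValueFor.
func valueFor(s *Sample, featureName string, discrete bool) (interface{}, error) {
	col, ok := s.FeatureNamesColumns[featureName]
	if !ok {
		return nil, fmt.Errorf("no column known for feature %q", featureName)
	}
	v := s.Values[col]
	if v == nil {
		return nil, nil // undefined value
	}
	if !discrete {
		return v, nil // continuous: the stored float64 is the value
	}
	id, ok := v.(int)
	if !ok {
		return nil, fmt.Errorf("column %q: expected an int id, got %T", col, v)
	}
	str, ok := s.DiscreteFeatureValues[id]
	if !ok {
		return nil, fmt.Errorf("no string value for id %d", id)
	}
	return str, nil // discrete: the id is resolved to its string
}

func main() {
	s := &Sample{
		Values:                map[string]interface{}{"species": 2, "petal_length": 4.7},
		DiscreteFeatureValues: map[int]string{1: "setosa", 2: "versicolor"},
		FeatureNamesColumns: map[string]string{
			"Species": "species", "Petal Length": "petal_length", "Color": "color",
		},
	}
	fmt.Println(valueFor(s, "Species", true))
	fmt.Println(valueFor(s, "Petal Length", false))
	fmt.Println(valueFor(s, "Color", true))
}
```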

type Set

type Set interface {
	set.Set
	Write(context.Context, []set.Sample) (int, error)
	Read(context.Context) (<-chan set.Sample, <-chan error)
}

Set is a set.Set to which samples can be added.

Its AddSample takes a set.Sample and adds it to the set, returning an error if any errors occur or nil otherwise.

func Create

func Create(ctx context.Context, dbAdapter Adapter, features []feature.Feature) (Set, error)

Create takes an Adapter and a slice of feature.Feature and returns a Set backed by the given adapter or an error.

This function will ensure that the samples and discrete value tables are created on the database, and that the discrete value table has all the values for the discrete features on the features slice.

func Open

func Open(ctx context.Context, dbAdapter Adapter, features []feature.Feature) (Set, error)

Open takes an Adapter to a db backend and a slice of feature.Feature and returns a Set backed by the given adapter or an error if no set is available through the given adapter.

This function expects the adapter to have the samples and discrete value tables already created, and the discrete value table initialized with all the values of the discrete features in the features slice.

Directories

Path Synopsis
Package pgadapter provides an implementation of the Adapter interface in the sqlset package that works over a PostgreSQL database.
Package sqlite3adapter provides an implementation of the Adapter interface in the sqlset package that works over a SQLite3 database.
