Documentation ¶
Overview ¶
mapreduce is used to run calculations on data that is available across several remote nodes. Each node that has some data of interest will run calculations against it.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type AlgFetcherMap ¶
AlgFetcherMap implements AlgorithmFetcher.
type AlgorithmFetcher ¶
AlgorithmFetcher is used to store and fetch algorithms based on name and meta information.
type Executor ¶
type Executor struct {
// contains filtered or unexported fields
}
Executor is used to apply algorithms to data. It maps and then returns the reduced data from the given algorithm.
An Executor has to be created with NewExecutor().
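The map-then-reduce flow described above can be sketched as follows. This is a minimal illustration, not the package's implementation: it assumes an algorithm pairs a mapping step with a reducing step, and the names `mapFn`, `reduceFn`, `run`, `firstLetter`, `sum` and `countByFirstLetter` are all hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// mapFn and reduceFn are hypothetical stand-ins for the mapping and
// reducing halves of an algorithm.
type mapFn func(value []byte) (key string, output []byte, err error)
type reduceFn func(values [][]byte) ([][]byte, error)

// run maps every record, groups the outputs by key, and then reduces each
// group until a single value remains.
func run(records [][]byte, m mapFn, r reduceFn) (map[string][]byte, error) {
	groups := make(map[string][][]byte)
	for _, rec := range records {
		key, out, err := m(rec)
		if err != nil {
			return nil, err
		}
		if len(key) == 0 {
			continue // an empty key filters the record out
		}
		groups[key] = append(groups[key], out)
	}

	result := make(map[string][]byte, len(groups))
	for key, vals := range groups {
		for {
			var err error
			vals, err = r(vals)
			if err != nil {
				return nil, err
			}
			if len(vals) == 0 {
				return nil, fmt.Errorf("reducer returned no values for key %q", key)
			}
			if len(vals) == 1 {
				break
			}
		}
		result[key] = vals[0]
	}
	return result, nil
}

// firstLetter keys a word by its first byte (ASCII input assumed) and
// emits the count "1". Blank input is filtered out.
func firstLetter(value []byte) (string, []byte, error) {
	w := strings.TrimSpace(string(value))
	if w == "" {
		return "", nil, nil
	}
	return w[:1], []byte("1"), nil
}

// sum adds decimal-encoded integers and returns a single total.
func sum(values [][]byte) ([][]byte, error) {
	total := 0
	for _, v := range values {
		n, err := strconv.Atoi(string(v))
		if err != nil {
			return nil, err
		}
		total += n
	}
	return [][]byte{[]byte(strconv.Itoa(total))}, nil
}

// countByFirstLetter wires the two steps together for plain strings.
func countByFirstLetter(words ...string) (map[string]string, error) {
	records := make([][]byte, len(words))
	for i, w := range words {
		records[i] = []byte(w)
	}
	res, err := run(records, firstLetter, sum)
	if err != nil {
		return nil, err
	}
	out := make(map[string]string, len(res))
	for k, v := range res {
		out[k] = string(v)
	}
	return out, nil
}

func main() {
	counts, err := countByFirstLetter("apple", "avocado", "", "banana")
	if err != nil {
		panic(err)
	}
	fmt.Println(counts["a"], counts["b"])
}
```

Grouping by key before reducing keeps each key's values independent, which is why the result is a map from key to a single reduced value.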
func NewExecutor ¶
func NewExecutor(algFetcher AlgorithmFetcher, fs FileSystem) *Executor
NewExecutor returns a new Executor.
type FileSystem ¶
type FileSystem interface {
	// Files returns a set of files that matches the given route and meta information. A non-nil
	// error will exit the operation.
	Files(route string, ctx context.Context, meta []byte) (files map[string][]string, err error)

	// Reader returns a reader to read data from the given file.
	Reader(file string, ctx context.Context, meta []byte) (reader func() (data []byte, err error), err error)
}
FileSystem is used to store data local to the node.
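For tests or experiments, an in-memory implementation can stand in for a real FileSystem. The sketch below rests on two assumptions the documentation does not confirm: that the values in the map returned by Files are node identifiers, and that the reader function signals the end of a file with `io.EOF`. The names `memFS`, `newDemoFS` and `readAll` are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
)

// FileSystem mirrors the interface documented above.
type FileSystem interface {
	Files(route string, ctx context.Context, meta []byte) (files map[string][]string, err error)
	Reader(file string, ctx context.Context, meta []byte) (reader func() (data []byte, err error), err error)
}

// memFS is a hypothetical in-memory FileSystem.
type memFS struct {
	nodeID  string              // reported for every file (assumed meaning of the map values)
	routes  map[string][]string // route -> file names
	records map[string][][]byte // file name -> records
}

// Compile-time check that memFS satisfies FileSystem.
var _ FileSystem = (*memFS)(nil)

func (m *memFS) Files(route string, ctx context.Context, meta []byte) (map[string][]string, error) {
	names, ok := m.routes[route]
	if !ok {
		return nil, fmt.Errorf("unknown route %q", route)
	}
	files := make(map[string][]string, len(names))
	for _, name := range names {
		files[name] = []string{m.nodeID}
	}
	return files, nil
}

func (m *memFS) Reader(file string, ctx context.Context, meta []byte) (func() ([]byte, error), error) {
	recs, ok := m.records[file]
	if !ok {
		return nil, fmt.Errorf("unknown file %q", file)
	}
	i := 0
	return func() ([]byte, error) {
		if err := ctx.Err(); err != nil {
			return nil, err // stop reading once the context is done
		}
		if i >= len(recs) {
			return nil, io.EOF // assumed end-of-file signal
		}
		rec := recs[i]
		i++
		return rec, nil
	}, nil
}

// newDemoFS returns a memFS with one route, one file and two records.
func newDemoFS() *memFS {
	return &memFS{
		nodeID:  "node-a",
		routes:  map[string][]string{"logs": {"a.log"}},
		records: map[string][][]byte{"a.log": {[]byte("alpha"), []byte("beta")}},
	}
}

// readAll drains the reader for a file and returns its records as strings.
func readAll(fs FileSystem, file string) ([]string, error) {
	next, err := fs.Reader(file, context.Background(), nil)
	if err != nil {
		return nil, err
	}
	var out []string
	for {
		data, err := next()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, string(data))
	}
}

func main() {
	fs := newDemoFS()
	files, err := fs.Files("logs", context.Background(), nil)
	if err != nil {
		panic(err)
	}
	for name, nodes := range files {
		recs, err := readAll(fs, name)
		if err != nil {
			panic(err)
		}
		fmt.Println(name, nodes, recs)
	}
}
```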
type Log ¶
type Log interface {
Printf(format string, v ...interface{})
}
Log is used to write debug logs.
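Because Log only requires a Printf method, the standard library's `*log.Logger` already satisfies it. The sketch below redeclares the interface locally to stay self-contained and adds a hypothetical `captureLog` type that keeps lines in memory, which may be handy in tests.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// Log mirrors the interface documented above.
type Log interface {
	Printf(format string, v ...interface{})
}

// captureLog is a hypothetical logger that stores formatted lines in memory.
type captureLog struct {
	lines []string
}

func (c *captureLog) Printf(format string, v ...interface{}) {
	c.lines = append(c.lines, fmt.Sprintf(format, v...))
}

// Compile-time checks that both types satisfy Log.
var (
	_ Log = (*log.Logger)(nil)
	_ Log = (*captureLog)(nil)
)

func main() {
	// A standard library logger writing to stderr.
	var l Log = log.New(os.Stderr, "mapreduce: ", log.LstdFlags)
	l.Printf("processing %d files", 3)

	// An in-memory logger whose output can be inspected afterwards.
	c := &captureLog{}
	l = c
	l.Printf("node %s finished", "n1")
	fmt.Println(c.lines[0])
}
```

Either value could then be passed to WithLogger when constructing a MapReduce.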
type MapReduce ¶
type MapReduce struct {
// contains filtered or unexported fields
}
MapReduce is used to invoke a Map/Reduce algorithm across data on various remote nodes.
It should be created with New().
func New ¶
func New(fs FileSystem, network Network, algFetcher AlgorithmFetcher, opts ...MapReduceOption) MapReduce
New returns a new MapReduce.
func (MapReduce) Calculate ¶
func (r MapReduce) Calculate(route, algName string, ctx context.Context, meta []byte) (finalResult map[string][]byte, err error)
Calculate runs the given algorithm for the files returned from FileSystem for the given route and meta information. It uses the Network to run the calculations across the remote nodes that report having the given data.
type MapReduceOption ¶
type MapReduceOption func(*MapReduce)
MapReduceOption is used to configure a new MapReduce.
func WithLogger ¶
func WithLogger(l Log) MapReduceOption
WithLogger is used to set the given logger.
type Mapper ¶
type Mapper interface {
	// Map maps data to keys. It filters out the value if the
	// returned key has a length of 0. A non-nil error will abort
	// the operation.
	Map(value []byte) (key string, output []byte, err error)
}
Mapper maps data ([]byte) to a key (string).
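As an illustration of the Map contract, the sketch below keys log lines of the assumed form `LEVEL message` by their level. Blank lines return an empty key so that they are filtered out, and malformed lines return an error. The `levelMapper` type, the record format and the `mapAll` helper are hypothetical; `mapAll` only imitates how a caller might apply the documented filtering rule.

```go
package main

import (
	"bytes"
	"fmt"
)

// Mapper mirrors the interface documented above.
type Mapper interface {
	Map(value []byte) (key string, output []byte, err error)
}

// levelMapper is a hypothetical Mapper for "LEVEL message" records.
type levelMapper struct{}

func (levelMapper) Map(value []byte) (string, []byte, error) {
	line := bytes.TrimSpace(value)
	if len(line) == 0 {
		return "", nil, nil // empty key: the value is filtered out
	}
	i := bytes.IndexByte(line, ' ')
	if i < 0 {
		return "", nil, fmt.Errorf("malformed record %q", line)
	}
	return string(line[:i]), line[i+1:], nil
}

// mapAll applies m to every value and groups the outputs by key,
// skipping values whose key is empty and stopping at the first error.
func mapAll(m Mapper, values []string) (map[string][]string, error) {
	groups := make(map[string][]string)
	for _, v := range values {
		key, out, err := m.Map([]byte(v))
		if err != nil {
			return nil, err
		}
		if len(key) == 0 {
			continue
		}
		groups[key] = append(groups[key], string(out))
	}
	return groups, nil
}

func main() {
	groups, err := mapAll(levelMapper{}, []string{
		"ERROR disk full",
		"   ",
		"INFO started",
		"ERROR timeout",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(groups["ERROR"], groups["INFO"])
}
```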
type Network ¶
type Network interface {
	// Execute is invoked to run calculations for a file (file) on a remote node (nodeID) with the given algorithm
	// (algName). Any necessary information can be encoded into meta. The context (ctx) is used for lifecycle
	// management.
	Execute(file, algName, nodeID string, ctx context.Context, meta []byte) (result map[string][]byte, err error)
}
Network is used to execute commands on remote nodes.
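A real Network would speak some transport protocol, but for local testing a loopback implementation that dispatches to in-process handlers may be enough. The sketch below is one such stand-in. The `handler` type, `loopbackNetwork`, `newDemoNetwork` and `demoExecute` are hypothetical names, and the echo result is made up for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// Network mirrors the interface documented above.
type Network interface {
	Execute(file, algName, nodeID string, ctx context.Context, meta []byte) (result map[string][]byte, err error)
}

// handler is a hypothetical function that plays the role of one remote node.
type handler func(ctx context.Context, file, algName string, meta []byte) (map[string][]byte, error)

// loopbackNetwork is a hypothetical Network that calls handlers in-process
// instead of contacting remote nodes.
type loopbackNetwork struct {
	nodes map[string]handler
}

// Compile-time check that loopbackNetwork satisfies Network.
var _ Network = loopbackNetwork{}

func (n loopbackNetwork) Execute(file, algName, nodeID string, ctx context.Context, meta []byte) (map[string][]byte, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // honour cancellation before doing any work
	}
	h, ok := n.nodes[nodeID]
	if !ok {
		return nil, fmt.Errorf("unknown node %q", nodeID)
	}
	return h(ctx, file, algName, meta)
}

// newDemoNetwork returns a network with a single node that echoes its input.
func newDemoNetwork() loopbackNetwork {
	return loopbackNetwork{nodes: map[string]handler{
		"node-a": func(ctx context.Context, file, algName string, meta []byte) (map[string][]byte, error) {
			return map[string][]byte{"echo": []byte(algName + ":" + file)}, nil
		},
	}}
}

// demoExecute runs a fixed request against the given node and returns
// the echoed value as a string.
func demoExecute(nodeID string) (string, error) {
	res, err := newDemoNetwork().Execute("a.log", "count", nodeID, context.Background(), nil)
	if err != nil {
		return "", err
	}
	return string(res["echo"]), nil
}

func main() {
	out, err := demoExecute("node-a")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```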
type ReduceFunc ¶
ReduceFunc wraps a function into a Reducer.
type Reducer ¶
type Reducer interface {
	// Reduce is called with marshalled data either from a mapper or
	// a reducer. It will be invoked until a slice of length 1
	// or a non-nil error is returned. Reduce is expected to know how
	// to marshal and unmarshal the given data.
	Reduce(value [][]byte) (reduced [][]byte, err error)
}
Reducer reduces a slice of data points into a smaller set.
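The repeated-invocation contract can be illustrated with a reducer that only shrinks its input part of the way on each call. In the sketch below, the hypothetical `pairSum` adds adjacent pairs of decimal-encoded integers, and the hypothetical `reduceAll` driver imitates the documented behaviour by calling Reduce until one element remains. The decimal encoding is an assumption made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Reducer mirrors the interface documented above.
type Reducer interface {
	Reduce(value [][]byte) (reduced [][]byte, err error)
}

// pairSum is a hypothetical Reducer that sums adjacent pairs of
// decimal-encoded integers, roughly halving the slice on each call.
type pairSum struct{}

func (pairSum) Reduce(value [][]byte) ([][]byte, error) {
	if len(value) == 0 {
		return nil, errors.New("nothing to reduce")
	}
	reduced := make([][]byte, 0, (len(value)+1)/2)
	for i := 0; i < len(value); i += 2 {
		sum, err := strconv.Atoi(string(value[i]))
		if err != nil {
			return nil, err
		}
		if i+1 < len(value) {
			n, err := strconv.Atoi(string(value[i+1]))
			if err != nil {
				return nil, err
			}
			sum += n
		}
		reduced = append(reduced, []byte(strconv.Itoa(sum)))
	}
	return reduced, nil
}

// reduceAll calls Reduce until a slice of length 1 or an error is returned.
func reduceAll(r Reducer, values [][]byte) ([]byte, error) {
	for {
		var err error
		values, err = r.Reduce(values)
		if err != nil {
			return nil, err
		}
		if len(values) == 0 {
			return nil, errors.New("reducer returned no values")
		}
		if len(values) == 1 {
			return values[0], nil
		}
	}
}

// sumStrings reduces the given decimal strings to their total.
func sumStrings(nums ...string) (string, error) {
	values := make([][]byte, len(nums))
	for i, n := range nums {
		values[i] = []byte(n)
	}
	out, err := reduceAll(pairSum{}, values)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	total, err := sumStrings("1", "2", "3", "4", "5")
	if err != nil {
		panic(err)
	}
	fmt.Println(total)
}
```

With five inputs, the slice shrinks from five elements to three, then two, then one, so Reduce is called three times before the driver stops.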