MapReduce
MapReduce is a programming model for processing large-scale data sets in a distributed fashion across a cluster of machines. The core idea behind MapReduce is to map your data sets into a collection of {key, value} pairs, and then to reduce over all pairs that share the same key.
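The map-then-reduce idea above can be illustrated with a small single-process word count. This is only a sketch of the programming model, not this project's plugin API: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical, and a real cluster would spread the map and reduce calls across workers.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(document: str) -> Iterator[Tuple[str, int]]:
    """Map step (hypothetical): emit a (word, 1) pair for every word."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(key: str, values: List[int]) -> int:
    """Reduce step (hypothetical): sum all values emitted for one key."""
    return sum(values)


def run_mapreduce(
    inputs: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Run map, group by key, then reduce, all in one process."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)  # group values that share a key
    return {key: reducer(key, values) for key, values in groups.items()}


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "the fox"]
    print(run_mapreduce(docs, map_words, reduce_counts))
```

The grouping step in the middle is what a distributed implementation usually calls the shuffle: it guarantees that each reduce call sees every value for its key.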
Get Started
Prerequisites
Installation
Clone
Setup
# build the plugin
cd apps && make build
# build the binary
cd .. && make build
# start the master service
./mapreduce-master-service --conf=conf/master_conf.json --level=info
# start the worker-1 service
./mapreduce-worker-service --conf=conf/worker_1_conf.json --level=info
# start the worker-2 service
./mapreduce-worker-service --conf=conf/worker_2_conf.json --level=info
# start the worker-3 service
./mapreduce-worker-service --conf=conf/worker_3_conf.json --level=info
Example
# add task
curl -XPOST -d \
'{"task": {"inputs": ["/path/to/input-1.txt", "/path/to/input-2.txt", "/path/to/input-3.txt"]}}' \
http://localhost:18180/v1/task
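The same request can be issued from a script. The sketch below mirrors the curl call using only the Python standard library; the endpoint and payload shape are taken from the example above, while the helper name `build_task_request` is hypothetical and the response format is not documented here, so it is printed as-is.

```python
import json
import urllib.request
from typing import Iterable


def build_task_request(
    inputs: Iterable[str], base_url: str = "http://localhost:18180"
) -> urllib.request.Request:
    """Build (but do not send) the POST request that adds a task."""
    body = json.dumps({"task": {"inputs": list(inputs)}}).encode("utf-8")
    # No Content-Type is set explicitly, so urllib falls back to the same
    # form-urlencoded default that `curl -d` uses.
    return urllib.request.Request(f"{base_url}/v1/task", data=body, method="POST")


if __name__ == "__main__":
    request = build_task_request(
        ["/path/to/input-1.txt", "/path/to/input-2.txt", "/path/to/input-3.txt"]
    )
    # Requires the master service to be running on localhost:18180.
    with urllib.request.urlopen(request) as response:
        print(response.status, response.read().decode("utf-8"))
```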
Documentation
API Design
Reference
Contributing
Step 1
Step 2
Step 3
FAQ
Support
License
- This project is licensed under the MIT License - see the MIT license for details.